Search Results (59)

Search Parameters:
Keywords = future pyramidal network

21 pages, 13698 KB  
Article
Edge-Oriented Adaptive Multi-Task Network for Modulation and Signal Type Classification
by Peixin Zhao and Chengqun Wang
Future Internet 2026, 18(6), 275; https://doi.org/10.3390/fi18060275 - 22 May 2026
Viewed by 242
Abstract
Modulation and signal classification are two highly correlated core tasks in wireless communications and are the core foundation of intelligent spectrum management in Future Internet and 6G networks. Although their objectives differ, the two tasks often share a substantial amount of underlying information in the feature space. However, focusing solely on their commonalities while neglecting their intrinsic differences may lead to suboptimal model performance. Therefore, by taking into account both the correlation and inherent differences between the two tasks, we propose TAMTNet, a task-adaptive multi-task network for edge deployment in Future Internet. TAMTNet introduces Extremely Efficient Spatial Pyramid (EESP) into the shared layer to efficiently extract multi-scale temporal information. In addition, a multi-gate mixture-of-experts (MMoE) mechanism is employed after the shared layer to enhance the modeling capability of task-specific features. Furthermore, to address the difficulty of deploying deep models on resource-constrained edge devices, a joint lightweight framework combining quantization-aware training and knowledge distillation is proposed, which significantly reduces model complexity while maintaining performance. Extensive experiments conducted on the simulation and real-world over-the-air transmission datasets demonstrate that the TAMTNet model achieves excellent performance on both modulation and signal classification tasks across a wide range of signal-to-noise ratios and radio transmit gain conditions. Meanwhile, the low-bitwidth lightweight models are able to maintain classification performance comparable to the full-precision model while significantly reducing model storage and computational complexity. Full article

25 pages, 5720 KB  
Article
MuRDE-FPN: Precise UAV Localization Using Enhanced Feature Pyramid Network
by Monika Kisieliūtė and Ignas Daugėla
Drones 2026, 10(3), 162; https://doi.org/10.3390/drones10030162 - 27 Feb 2026
Viewed by 876
Abstract
Unmanned aerial vehicles (UAVs) require reliable autonomous positioning independent of external satellite navigation signals, motivating the development of a vision-based, end-to-end finding point in map (FPI) framework. This study introduces MuRDE-FPN, an enhanced feature pyramid network (FPN) designed for precise UAV localization, building upon a lightweight one-stream transformer-based (OS-PCPVT) backbone. MuRDE-FPN integrates efficient channel attention (ECA) for adaptive channel recalibration and features two novel components: a multi-receptive deformable enhancement (MuRDE) that utilizes deformable convolutions with varying kernel sizes to refine the semantically rich final feature layer, and a feature alignment module (FAM) for cross-level fusion. Evaluated on the UL14 dataset and a new, more diverse UAV-Sat dataset, MuRDE-FPN consistently outperformed four state-of-the-art FPI methods (FPI, WAMF-FPI, OS-FPI, DCD-FPI). It achieved a relative distance score of 84.26 on UL14 and 63.74 on UAV-Sat datasets, demonstrating improved localization. Ablation studies confirmed the cumulative benefits of ECA, MuRDE, and FAM. These findings highlight the effectiveness of custom FPN designs and targeted feature enhancements for precise cross-view positioning, with MuRDE-FPN providing a robust solution and the UAV-Sat dataset offering a new benchmark for evaluation. Future efforts will address computational efficiency and performance across varying data quality environments. Full article

23 pages, 1672 KB  
Review
Field-Evolved Resistance to Bt Cry Toxins in Lepidopteran Pests: Insights into Multilayered Regulatory Mechanisms and Next-Generation Management Strategies
by Junfei Xie, Wenfeng He, Min Qiu, Jiaxin Lin, Haoran Shu, Jintao Wang and Leilei Liu
Toxins 2026, 18(2), 60; https://doi.org/10.3390/toxins18020060 - 25 Jan 2026
Cited by 1 | Viewed by 1782
Abstract
Bt Cry toxins remain the cornerstone of transgenic crop protection against Lepidopteran pests, yet field-evolved resistance, particularly in invasive species such as Spodoptera frugiperda and Helicoverpa armigera, can threaten their long-term efficacy. This review presents a comprehensive and unified mechanistic framework that synthesizes current understanding of Bt Cry toxin modes of action and the complex, multilayered regulatory mechanisms of field-evolved resistance. Beyond the classical pore-formation model, emerging evidence highlights signal transduction cascades, immune evasion via suppression of Toll/IMD pathways, and tripartite toxin–host–microbiota interactions that can dynamically modulate protoxin activation and receptor accessibility. Resistance arises from target-site alterations (e.g., ABCC2/ABCC3, Cadherin mutations), altered midgut protease profiles, enhanced immune regeneration, and microbiota-mediated detoxification, orchestrated by transcription factor networks (GATA, FoxA, FTZ-F1), constitutive MAPK hyperactivation (especially MAP4K4-driven cascades), along with preliminary emerging findings on non-coding RNA involvement. Countermeasures now integrate synergistic Cry/Vip pyramiding, CRISPR/Cas9-validated receptor knockouts revealing functional redundancy, Domain III chimerization (e.g., Cry1A.105), phage-assisted continuous evolution (PACE), and the emerging application of AlphaFold3 for structure-guided rational redesign of resistance-breaking variants. Future sustainability hinges on system-level integration of single-cell transcriptomics, midgut-specific CRISPR screens, microbiome engineering, and AI-accelerated protein design to preempt resistance trajectories and secure Bt biotechnology within integrated resistance and pest management frameworks. Full article

18 pages, 4631 KB  
Article
Semantic Segmentation of Rice Fields in Sub-Meter Satellite Imagery Using an HRNet-CA-Enhanced DeepLabV3+ Framework
by Yifan Shao, Pan Pan, Hongxin Zhao, Jiale Li, Guoping Yu, Guomin Zhou and Jianhua Zhang
Remote Sens. 2025, 17(14), 2404; https://doi.org/10.3390/rs17142404 - 11 Jul 2025
Cited by 3 | Viewed by 2165
Abstract
Accurate monitoring of rice-planting areas underpins food security and evidence-based farm management. Recent work has advanced along three complementary lines: multi-source data fusion (to mitigate cloud and spectral confusion), temporal feature extraction (to exploit phenology), and deep-network architecture optimization. However, even the best fusion- and time-series-based approaches still struggle to preserve fine spatial details in sub-meter scenes. Targeting this gap, we propose an HRNet-CA-enhanced DeepLabV3+ that retains the original model’s strengths while resolving its two key weaknesses: (i) detail loss caused by repeated down-sampling and feature-pyramid compression and (ii) boundary blurring due to insufficient multi-scale information fusion. The Xception backbone is replaced with a High-Resolution Network (HRNet) to maintain full-resolution feature streams through multi-resolution parallel convolutions and cross-scale interactions. A coordinate attention (CA) block is embedded in the decoder to strengthen spatially explicit context and sharpen class boundaries. We built a rice dataset of 23,295 images (11,295 rice + 12,000 non-rice) through preprocessing and manual labeling and benchmarked the proposed model against classical segmentation networks. Our approach boosts boundary segmentation accuracy to 92.28% mIoU and raises texture-level discrimination to 95.93% F1, without extra inference latency. Although this study focuses on architecture optimization, the HRNet-CA backbone is readily compatible with future multi-source fusion and time-series modules, offering a unified path toward operational paddy mapping in fragmented sub-meter landscapes. Full article

21 pages, 4072 KB  
Article
ST-YOLOv8: Small-Target Ship Detection in SAR Images Targeting Specific Marine Environments
by Fei Gao, Yang Tian, Yongliang Wu and Yunxia Zhang
Appl. Sci. 2025, 15(12), 6666; https://doi.org/10.3390/app15126666 - 13 Jun 2025
Cited by 3 | Viewed by 2116
Abstract
Synthetic Aperture Radar (SAR) image ship detection faces challenges such as distinguishing ships from other terrains and structures, especially in specific complex marine environments. The motivation behind this work is to enhance detection accuracy while minimizing false positives, which is crucial for applications like defense vessel monitoring and civilian search and rescue operations. To achieve this goal, we propose several architectural improvements to You Only Look Once version 8 Nano (YOLOv8n) and present Small Target-YOLOv8 (ST-YOLOv8), a novel lightweight SAR ship detection model based on the enhanced YOLOv8n framework. The C2f module in the backbone’s transition sections is replaced by the Conv_Online Reparameterized Convolution (C_OREPA) module, reducing convolutional complexity and improving efficiency. The Atrous Spatial Pyramid Pooling (ASPP) module is added to the end of the backbone to extract finer features from smaller and more complex ship targets. In the neck network, the Shuffle Attention (SA) module is employed before each upsampling step to improve upsampling quality. Additionally, we replace the Complete Intersection over Union (C-IoU) loss function with the Wise Intersection over Union (W-IoU) loss function, which enhances bounding box precision. We conducted ablation experiments on two widely used multimodal SAR datasets. The proposed model significantly outperforms the YOLOv8n baseline, achieving 94.1% accuracy, 82% recall, and 87.6% F1 score on the SAR Ship Detection Dataset (SSDD), and 92.7% accuracy, 84.5% recall, and 88.1% F1 score on the SAR Ship Dataset_v0 dataset (SSDv0). Furthermore, the ST-YOLOv8 model outperforms several state-of-the-art multi-scale ship detection algorithms on both datasets. In summary, the ST-YOLOv8 model, by integrating advanced neural network architectures and optimization techniques, significantly improves detection accuracy and reduces false detection rates.
This makes it highly suitable for complex backgrounds and multi-scale ship detection. Future work will focus on lightweight model optimization for deployment on mobile platforms to broaden its applicability across different scenarios. Full article

27 pages, 20364 KB  
Article
A Comparative Study of Lesion-Centered and Severity-Based Approaches to Diabetic Retinopathy Classification: Improving Interpretability and Performance
by Gang-Min Park, Ji-Hoon Moon and Ho-Gil Jung
Biomedicines 2025, 13(6), 1446; https://doi.org/10.3390/biomedicines13061446 - 12 Jun 2025
Cited by 1 | Viewed by 1553
Abstract
Background: Despite advances in artificial intelligence (AI) for Diabetic Retinopathy (DR) classification, traditional severity-based approaches often lack interpretability and fail to capture specific lesion-centered characteristics. To address these limitations, we constructed the National Medical Center (NMC) dataset, independently annotated by medical professionals with detailed labels of major DR lesions, including retinal hemorrhages, microaneurysms, and exudates. Methods: This study explores four critical research questions. First, we assess the analytical advantages of lesion-centered labeling compared to traditional severity-based labeling. Second, we investigate the potential complementarity between these labeling approaches through integration experiments. Third, we analyze how various model architectures and classification strategies perform under different labeling schemes. Finally, we evaluate decision-making differences between labeling methods using visualization techniques. We benchmarked the lesion-centered NMC dataset against the severity-based public Asia Pacific Tele-Ophthalmology Society (APTOS) dataset, conducting experiments with EfficientNet—a convolutional neural network architecture—and diverse classification strategies. Results: Our results demonstrate that binary classification effectively identifies severe non-proliferative Diabetic Retinopathy (Severe NPDR) exhibiting complex lesion patterns, while relationship-based learning enhances performance for underrepresented classes. Transfer learning from NMC to APTOS notably improved severity classification, achieving performance gains of 15.2% in mild cases and 66.3% in severe cases through feature fusion using Bidirectional Feature Pyramid Network (BiFPN) and Feature Pyramid Network (FPN). Visualization results confirmed that lesion-centered models focus more precisely on pathological features. 
Conclusions: Our findings highlight the benefits of integrating lesion-centered and severity-based information to enhance both accuracy and interpretability in DR classification. Future research directions include spatial lesion mapping and the development of clinically grounded learning methodologies. Full article
(This article belongs to the Section Endocrinology and Metabolism Research)

27 pages, 14505 KB  
Article
RSWD-YOLO: A Walnut Detection Method Based on UAV Remote Sensing Images
by Yansong Wang, Xuanxi Yang, Haoyu Wang, Huihua Wang, Zaiqing Chen and Lijun Yun
Horticulturae 2025, 11(4), 419; https://doi.org/10.3390/horticulturae11040419 - 14 Apr 2025
Cited by 5 | Viewed by 2396
Abstract
Accurate walnut yield prediction is crucial for the development of the walnut industry. Traditional manual counting methods are limited by labor and time costs, leading to inaccurate walnut quantity assessments. In this paper, we propose a walnut detection method based on unmanned aerial vehicle (UAV) remote sensing imagery to improve walnut yield prediction accuracy. Based on the YOLOv11 network, we propose several improvements to enhance the multi-scale object detection capability while achieving a more lightweight model structure. Specifically, we reconstruct the feature fusion network with a hierarchical scale-based feature pyramid structure and implement lightweight improvements to the feature extraction component. These modifications result in the RSWD-YOLO network (RSWD stands for remote sensing walnut detection; YOLO, short for ‘You Only Look Once’, is the abbreviation used for a series of object detection algorithms), which is specifically designed for walnut detection. Furthermore, to optimize the detection performance under hardware resource constraints, we apply knowledge distillation to RSWD-YOLO, thereby further improving the detection accuracy. Through model deployment and testing on small edge devices, we demonstrate the feasibility of our proposed method. The detection algorithm achieves 86.1% mean Average Precision on the walnut dataset while maintaining operational functionality on small edge devices. The experimental results demonstrate that our proposed UAV remote sensing-based walnut detection method has significant practical application value and can provide valuable insights for future research in related fields. Full article
(This article belongs to the Section Postharvest Biology, Quality, Safety, and Technology)

25 pages, 7750 KB  
Article
Pyramidal Predictive Network V2: An Improved Predictive Architecture and Training Strategies for Future Perception Prediction
by Chaofan Ling, Junpei Zhong, Weihua Li, Ran Dong and Mingjun Dai
Big Data Cogn. Comput. 2025, 9(4), 79; https://doi.org/10.3390/bdcc9040079 - 28 Mar 2025
Viewed by 1280
Abstract
In this paper, we propose an improved version of the Pyramidal Predictive Network (PPNV2), a theoretical framework inspired by predictive coding, which addresses the limitations of its predecessor (PPNV1) in the task of future perception prediction. While PPNV1 employed a temporal pyramid architecture and demonstrated promising results, its innate signal processing led to aliasing in the prediction, restricting its application in robotic navigation. We analyze the signal dissemination and characteristic artifacts of PPNV1 and introduce architectural enhancements and training strategies to mitigate these issues. The improved architecture focuses on optimizing information dissemination and reducing aliasing in neural networks. We redesign the downsampling and upsampling components to enable the network to construct images more effectively from low-frequency-input Fourier features, replacing the simple concatenation of different inputs in the previous version. Furthermore, we refine the training strategies to alleviate input inconsistency during training and testing phases. The enhanced model exhibits increased interpretability, stronger prediction accuracy, and improved quality of predictions. The proposed PPNV2 offers a more robust and efficient approach to future video-frame prediction, overcoming the limitations of its predecessor and expanding its potential applications in various robotic domains, including pedestrian prediction, vehicle prediction, and navigation. Full article

16 pages, 1401 KB  
Article
Open-Loop Wavefront Reconstruction with Pyramidal Sensors Using Convolutional Neural Networks
by Saúl Pérez-Fernández, Alejandro Buendía-Roca, Carlos González-Gutiérrez, Francisco García-Riesgo, Javier Rodríguez-Rodríguez, Santiago Iglesias-Alvarez, Julia Fernández-Díaz and Francisco Javier Iglesias-Rodríguez
Mathematics 2025, 13(7), 1028; https://doi.org/10.3390/math13071028 - 21 Mar 2025
Cited by 2 | Viewed by 1310
Abstract
Neural networks have significantly advanced adaptive optics systems for telescopes in recent years. Future adaptive optics systems, especially for extremely large telescopes, are expected to predominantly employ pyramid wavefront sensors, which offer good sensitivity but suffer from a non-linear response under certain conditions. This non-linearity limits the performance of traditional linear reconstruction methods, such as matrix–vector multiplication, leading to suboptimal performance. Convolutional Neural Networks offer a promising alternative, as they can model complex non-linear relationships and extract spatial patterns from sensor images. While CNN-based reconstruction has shown success in closed-loop systems, this study investigates their application in open-loop wavefront reconstruction. A custom network architecture and training strategy are developed, using realistic training data from end-to-end atmospheric turbulence simulations. CNNs are trained to reconstruct Zernike polynomial coefficients representing optical aberrations, enabling a tomographic estimation of turbulence. The proposed approach demonstrates significant improvements over conventional open-loop methods, underscoring the potential of CNNs to enhance wavefront reconstruction in next-generation AO systems. Full article

23 pages, 12690 KB  
Article
MSS-YOLO: Multi-Scale Edge-Enhanced Lightweight Network for Personnel Detection and Location in Coal Mines
by Wenjuan Yang, Yanqun Wang, Xuhui Zhang, Le Zhu, Tenghui Wang, Yunkai Chi and Jie Jiang
Appl. Sci. 2025, 15(6), 3238; https://doi.org/10.3390/app15063238 - 16 Mar 2025
Cited by 10 | Viewed by 2450
Abstract
As a critical task in underground coal mining, personnel identification and positioning in fully mechanized mining faces are essential for safety. Yet, complex environmental factors—such as narrow tunnels, heavy dust, and uneven lighting—pose significant challenges to accurate detection. In this paper, we propose a personnel detection network, MSS-YOLO, for fully mechanized mining faces based on YOLOv8. By designing a Multi-Scale Edge Enhancement (MSEE) module and fusing it with the C2f module, the performance of the network for personnel feature extraction under high-dust or long-distance conditions is effectively enhanced. Meanwhile, by designing a Spatial Pyramid Shared Conv (SPSC) module, the redundancy of the model is reduced, which effectively compensates for the problem of the max pooling being prone to losing the characteristics of the personnel at long distances. Finally, the lightweight Shared Convolutional Detection Head (SCDH) ensures real-time detection under limited computational resources. The experimental results show that compared to Faster-RCNN, SSD, YOLOv5s6, YOLOv7-tiny, YOLOv8n, and YOLOv11n, MSS-YOLO achieves AP50 improvements of 4.464%, 10.484%, 3.751%, 4.433%, 3.655%, and 2.188%, respectively, while reducing the inference time by 50.4 ms, 11.9 ms, 3.7 ms, 2.0 ms, 1.2 ms, and 2.3 ms. In addition, MSS-YOLO is combined with the SGBM binocular stereo vision matching algorithm to provide a personnel 3D spatial position solution by using disparity results. The personnel location results show that in the measurement range of 10 m, the position errors in the x-, y-, and z-directions are within 0.170 m, 0.160 m, and 0.200 m, respectively, which proves that MSS-YOLO is able to accurately detect underground personnel in real time and can meet the underground personnel detection and localization requirements. The current limitations lie in the reliance on a calibrated binocular camera and the performance degradation beyond 15 m. 
Future work will focus on multi-sensor fusion and adaptive distance scaling to enhance practical deployment. Full article

23 pages, 3564 KB  
Article
Serious Game Design for Teaching University Students to Address Complexity Issues in the Healthcare Logistics System: Lessons from an Emergency Department Case Study
by Yan Sun and Chen Zhang
Systems 2025, 13(3), 197; https://doi.org/10.3390/systems13030197 - 12 Mar 2025
Cited by 1 | Viewed by 3756
Abstract
As pioneers in this field, our role in shaping the future of serious games in healthcare logistics is crucial. Digital media design significantly influences the quality of gaming simulation studies in healthcare. The leading challenge scholars face is introducing innovative and valuable features to university students. The data–simulation–gaming pyramid could serve as a blueprint for outlining how interactive simulations could be conducted. A participatory design process is important in serious game development. More recently, the literature has illustrated the contribution of extended reality. However, researchers have not explored this research framework in detail. This paper traces the participatory design process of serious games using an emergency logistics case study in Stockholm, Sweden. It underscores the importance of choosing the correct narratives and game mechanics to support the implementation of serious games using extended reality for the demonstration of non-technical skills. The research findings are threefold. (1) The participatory design process helps to place focus on the implementing philosophy that values health equality in networked hospitals. (2) Further analysis reveals that gamification could turn everyday tasks in the emergency department, which represents a stressful workplace in a hospital, into a spectrum of learning experiences for in-demand skills, including situational awareness, leadership, communication, and ethical thinking. (3) A closer inspection of the reality-changing methods shows new requirements to shorten patient queues before and after the (implementation of the) strengthened waiting time guarantee proposal in 2024. There is abundant room for principals in healthcare institutions to implement reality-changing methods to foster collaboration at the departmental, cross-departmental, and cross-institutional levels. Full article
(This article belongs to the Special Issue Innovative Systems Approaches to Healthcare Systems)

22 pages, 6129 KB  
Article
A Novel Machine Vision-Based Collision Risk Warning Method for Unsignalized Intersections on Arterial Roads
by Zhongbin Luo, Yanqiu Bi, Qing Ye, Yong Li and Shaofei Wang
Electronics 2025, 14(6), 1098; https://doi.org/10.3390/electronics14061098 - 11 Mar 2025
Cited by 2 | Viewed by 2102
Abstract
To address the critical need for collision risk warning at unsignalized intersections, this study proposes an advanced predictive system combining YOLOv8 for object detection, Deep SORT for tracking, and Bi-LSTM networks for trajectory prediction. To adapt YOLOv8 for complex intersection scenarios, several architectural enhancements were incorporated. The RepLayer module replaced the original C2f module in the backbone, integrating large-kernel depthwise separable convolution to better capture contextual information in cluttered environments. The GIoU loss function was introduced to improve bounding box regression accuracy, mitigating the issues related to missed or incorrect detections due to occlusion and overlapping objects. Furthermore, a Global Attention Mechanism (GAM) was implemented in the neck network to better learn both location and semantic information, while the ReContext gradient composition feature pyramid replaced the traditional FPN, enabling more effective multi-scale object detection. Additionally, the CSPNet structure in the neck was substituted with Res-CSP, enhancing feature fusion flexibility and improving detection performance in complex traffic conditions. For tracking, the Deep SORT algorithm was optimized with enhanced appearance feature extraction, reducing the identity switches caused by occlusions and ensuring the stable tracking of vehicles, pedestrians, and non-motorized vehicles. The Bi-LSTM model was employed for trajectory prediction, capturing long-range dependencies to provide accurate forecasting of future positions. The collision risk was quantified using the predictive collision risk area (PCRA) method, categorizing risks into three levels (danger, warning, and caution) based on the predicted overlaps in trajectories. In the experimental setup, the dataset used for training the model consisted of 30,000 images annotated with bounding boxes around vehicles, pedestrians, and non-motorized vehicles. 
Data augmentation techniques such as Mosaic, Random_perspective, Mixup, HSV adjustments, Flipud, and Fliplr were applied to enrich the dataset and improve model robustness. In real-world testing, the system was deployed as part of the G310 highway safety project, where it achieved a mean Average Precision (mAP) of over 90% for object detection. Over a one-month period, 120 warning events involving vehicles, pedestrians, and non-motorized vehicles were recorded. Manual verification of the warnings indicated a prediction accuracy of 97%, demonstrating the system’s reliability in identifying potential collisions and issuing timely warnings. This approach represents a significant advancement in enhancing safety at unsignalized intersections in urban traffic environments. Full article
(This article belongs to the Special Issue Computer Vision and Image Processing in Machine Learning)

20 pages, 26272 KB  
Article
Cascade DeepLab Net: A Method for Accurate Extraction of Fragmented Cultivated Land in Mountainous Areas Based on a Cascaded Network
by Man Li, Renru Wang, Ana Dai, Weitao Yuan, Guangbin Yang, Lijun Xie, Weili Zhao and Linglin Zhao
Agriculture 2025, 15(3), 348; https://doi.org/10.3390/agriculture15030348 - 6 Feb 2025
Cited by 4 | Viewed by 1697
Abstract
Approximately 24% of the global land area consists of mountainous regions, with 10% of the population relying on these areas for their cultivated land. Accurate statistics and monitoring of cultivated land in mountainous regions are crucial for ensuring food security, creating scientific land use policies, and protecting the ecological environment. However, the fragmented nature of cultivated land in these complex terrains challenges the effectiveness of existing extraction methods. To address this issue, this study proposed a cascaded network based on an improved semantic segmentation model (DeepLabV3+), called Cascade DeepLab Net, specifically designed to improve the accuracy in the scenario of fragmented land features. This method aims to accurately extract cultivated land from remote sensing images. This model enhances the accuracy of cultivated land extraction in complex terrains by incorporating the Style-based Recalibration Module (SRM), Spatial Attention Module (SAM), and Refinement Module (RM). The experimental results using high-resolution satellite images of mountainous areas in southern China show that the improved model achieved an overall accuracy (OA) of 92.33% and an Intersection over Union (IoU) of 82.51%, marking a significant improvement over models such as U-shaped Network (UNet), Pyramid Scene Parsing Network (PSPNet), and DeepLabV3+. This method enhances the efficiency and accuracy of monitoring cultivated land in mountainous areas and offers a scientific basis for policy formulation and resource management, aiding in ecological protection and sustainable development. Additionally, this study presents new ideas and methods for future applications of cultivated land monitoring in other complex terrain regions.
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
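The abstract above reports overall accuracy (OA) and Intersection over Union (IoU). As a minimal sketch of how these two metrics are commonly defined for a binary mask (1 = cultivated land, 0 = background), the function below counts the confusion-matrix cells over flattened label sequences. The function name, the label encoding, and the sample values are assumptions for illustration and are not taken from the paper.

```python
def overall_accuracy_and_iou(pred, truth):
    """Return (OA, IoU) for the positive class of two equal-length 0/1 label sequences.

    Name and 0/1 encoding are assumptions for illustration.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and the same length")

    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1

    # OA: share of all pixels labelled correctly.
    oa = (tp + tn) / (tp + fp + fn + tn)
    # IoU: overlap of predicted and true positives over their union.
    denom = tp + fp + fn
    iou = tp / denom if denom else 0.0
    return oa, iou


# Hypothetical five-pixel example.
oa, iou = overall_accuracy_and_iou([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

IoU ignores true negatives, so on imagery dominated by background it is typically lower than OA, which is consistent with the gap between the two figures quoted in the abstract.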

24 pages, 12871 KB  
Article
OW-YOLO: An Improved YOLOv8s Lightweight Detection Method for Obstructed Walnuts
by Haoyu Wang, Lijun Yun, Chenggui Yang, Mingjie Wu, Yansong Wang and Zaiqing Chen
Agriculture 2025, 15(2), 159; https://doi.org/10.3390/agriculture15020159 - 13 Jan 2025
Cited by 16 | Viewed by 2905
Abstract
Walnut detection in mountainous and hilly regions often faces significant challenges due to obstructions, which adversely affect model performance. To address this issue, we collected a dataset comprising 2379 walnut images from these regions, with detailed annotations for both obstructed and non-obstructed walnuts. Based on this dataset, we propose OW-YOLO, a lightweight object detection model specifically designed for detecting small, obstructed walnuts. The model’s backbone was restructured with the integration of the DWR-DRB (Dilated Weighted Residual-Dilated Residual Block) module. To enhance efficiency and multi-scale feature fusion, we incorporated the HSFPN (High-Level Screening Feature Pyramid Network) and redesigned the detection head by replacing the original head with the more efficient LADH detection head while removing the head processing 32 × 32 feature maps. These improvements effectively reduced model complexity and significantly enhanced detection accuracy for obstructed walnuts. Experiments were conducted using the PyTorch framework on an NVIDIA GeForce RTX 4060 Ti GPU. The results demonstrate that OW-YOLO outperforms other models, achieving an mAP@0.5 (mean average precision) of 83.6%, mAP@[0.5:0.95] of 53.7%, and an F1 score of 77.9%. Additionally, the model’s parameter count decreased by 49.2%, weight file size was reduced by 48.1%, and computational load dropped by 37.3%, effectively mitigating the impact of obstruction on detection accuracy. These findings provide robust support for the future development of walnut agriculture and lay a solid foundation for the broader adoption of intelligent agriculture.
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
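The abstract above reports an F1 score alongside mAP. For readers unfamiliar with the metric, the sketch below shows the standard definition of F1 as the harmonic mean of precision and recall, computed from detection counts. The function name and the sample counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives.

    Name and inputs are assumptions for illustration.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    # Harmonic mean: penalizes a large gap between precision and recall.
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical counts: 8 correct detections, 2 false alarms, 2 missed objects.
example_f1 = f1_score(tp=8, fp=2, fn=2)
```

Because the harmonic mean is dominated by the smaller of the two values, a detector that misses many obstructed objects cannot offset a low recall with high precision.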

19 pages, 5434 KB  
Article
A Classifier Model Using Fine-Tuned Convolutional Neural Network and Transfer Learning Approaches for Prostate Cancer Detection
by Murat Sarıateş and Erdal Özbay
Appl. Sci. 2025, 15(1), 225; https://doi.org/10.3390/app15010225 - 30 Dec 2024
Cited by 12 | Viewed by 2396
Abstract
Background: Accurate and reliable classification models play a major role in clinical decision-making processes for prostate cancer (PCa) diagnosis. However, existing methods often demonstrate limited performance, particularly when applied to small datasets and binary classification problems. Objectives: This study aims to design a fine-tuned deep learning (DL) model capable of classifying PCa MRI images with high accuracy and to evaluate its performance by comparing it with various DL architectures. Methods: In this study, a basic convolutional neural network (CNN) model was developed and subsequently optimized using techniques such as L2 regularization, Tanh activation, dropout, and early stopping to enhance its performance. Additionally, a pyramid-type CNN architecture was designed to simultaneously evaluate both fine details and broader structures by combining low- and high-resolution information through feature maps extracted from different CNN layers. This approach enabled the model to learn complex features more effectively. For performance comparison, the developed fine-tuned enhanced pyramid network (FT-EPN) model was benchmarked against models such as Vgg16, Vgg19, Resnet50, InceptionV3, Densenet121, and Xception, which were trained using transfer learning (TL) techniques. It was also compared to next-generation models such as vision transformer (ViT) and MaxViT-v2. Results: The developed fine-tuned model achieved an accuracy rate of 96.77%, outperforming pre-trained TL models and next-generation models like ViT and MaxViT-v2. Among the TL models, Vgg19 achieved the highest accuracy rate at 92.74%. In comparison, ViT achieved an accuracy of 93.55%, while MaxViT-v2 achieved an accuracy of 95.16%. Conclusions: This study presents an optimized FT-EPN model to enhance the performance of DL models for PCa classification, offering a reference solution for future research. 
This model provides significant advantages in terms of classification accuracy and simplicity and has been evaluated as an effective solution in clinical applications.
