Search Results (70)

Search Parameters:
Keywords = bottom-up feature fusion

23 pages, 3481 KiB  
Article
Research on Adaptive Identification Technology for Rolling Bearing Performance Degradation Based on Vibration–Temperature Fusion
by Zhenghui Li, Lixia Ying, Liwei Zhan, Shi Zhuo, Hui Li and Xiaofeng Bai
Sensors 2025, 25(15), 4707; https://doi.org/10.3390/s25154707 - 30 Jul 2025
Viewed by 245
Abstract
To address the issue of low accuracy in identifying the transition states of rolling bearing performance degradation when relying solely on vibration signals, this study proposed a vibration–temperature fusion-based adaptive method for bearing performance degradation assessments. First, a multidimensional time–frequency feature set was constructed by integrating vibration acceleration and temperature signals. Second, a novel composite sensitivity index (CSI) was introduced, incorporating the trend persistence, monotonicity, and signal complexity to perform preliminary feature screening. Mutual information clustering and regularized entropy weight optimization were then combined to reselect highly sensitive parameters from the initially screened features. Subsequently, an adaptive feature fusion method based on auto-associative kernel regression (AFF-AAKR) was introduced to compress the data in the spatial dimension while enhancing the degradation trend characterization capability of the health indicator (HI) through a temporal residual analysis. Furthermore, the entropy weight method was employed to quantify the information entropy differences between the vibration and temperature signals, enabling dynamic weight allocation to construct a comprehensive HI. Finally, a dual-criteria adaptive bottom-up merging algorithm (DC-ABUM) was proposed, which achieves bearing life-stage identification through error threshold constraints and the adaptive optimization of segmentation quantities. The experimental results demonstrated that the proposed method outperformed traditional vibration-based life-stage identification approaches. Full article
(This article belongs to the Special Issue Fault Diagnosis Based on Sensing and Control Systems)
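The paper's DC-ABUM algorithm builds on classic bottom-up time-series segmentation: start from many fine segments and greedily merge the cheapest adjacent pair until every remaining merge would exceed an error threshold. A minimal sketch of that base algorithm (the function names and the least-squares merge cost are illustrative, not taken from the paper):

```python
import numpy as np

def segment_cost(y):
    """Sum of squared residuals of a least-squares line fit to y."""
    if len(y) < 3:
        return 0.0
    x = np.arange(len(y))
    coef = np.polyfit(x, y, 1)
    return float(np.sum((np.polyval(coef, x) - y) ** 2))

def bottom_up_segment(y, max_error, init_size=4):
    """Classic bottom-up segmentation: begin with fine segments and
    repeatedly merge the adjacent pair with the lowest merge cost,
    stopping when every candidate merge would exceed max_error."""
    bounds = list(range(0, len(y), init_size)) + [len(y)]
    segs = [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]
    while len(segs) > 1:
        # cost of merging each adjacent pair into one segment
        costs = [segment_cost(y[segs[i][0]:segs[i + 1][1]])
                 for i in range(len(segs) - 1)]
        i = int(np.argmin(costs))
        if costs[i] > max_error:
            break
        segs[i] = (segs[i][0], segs[i + 1][1])
        del segs[i + 1]
    return segs
```

On a synthetic health indicator with a flat regime followed by a degradation ramp, greedy merging stops at the regime change, leaving one segment per life stage.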

20 pages, 28899 KiB  
Article
MSDP-Net: A Multi-Scale Domain Perception Network for HRRP Target Recognition
by Hongxu Li, Xiaodi Li, Zihan Xu, Xinfei Jin and Fulin Su
Remote Sens. 2025, 17(15), 2601; https://doi.org/10.3390/rs17152601 - 26 Jul 2025
Viewed by 329
Abstract
High-resolution range profile (HRRP) recognition serves as a foundational task in radar automatic target recognition (RATR), enabling robust classification under all-day and all-weather conditions. However, existing approaches often struggle to simultaneously capture the multi-scale spatial dependencies and global spectral relationships inherent in HRRP signals, limiting their effectiveness in complex scenarios. To address these limitations, we propose a novel multi-scale domain perception network tailored for HRRP-based target recognition, called MSDP-Net. MSDP-Net introduces a hybrid spatial–spectral representation learning strategy through a multiple-domain perception HRRP (DP-HRRP) encoder, which integrates multi-head convolutions to extract spatial features across diverse receptive fields, and frequency-aware filtering to enhance critical spectral components. To further enhance feature fusion, we design a hierarchical scale fusion (HSF) branch that employs stacked semantically enhanced scale fusion (SESF) blocks to progressively aggregate information from fine to coarse scales in a bottom-up manner. This architecture enables MSDP-Net to effectively model complex scattering patterns and aspect-dependent variations. Extensive experiments on both simulated and measured datasets demonstrate the superiority of MSDP-Net, achieving 80.75% accuracy on the simulated dataset and 94.42% on the measured dataset, highlighting its robustness and practical applicability. Full article
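The two ingredients of the DP-HRRP encoder described above — convolutions with diverse receptive fields and frequency-aware filtering — can be sketched in a much-simplified form (box filters and a hard low-pass cutoff stand in for the learned multi-head kernels and spectral filters):

```python
import numpy as np

def multi_scale_features(profile, kernel_sizes=(3, 7, 15)):
    """Convolve one HRRP profile with smoothing kernels of several
    widths (diverse receptive fields) and stack the results."""
    feats = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k          # one box filter per "head"
        feats.append(np.convolve(profile, kernel, mode="same"))
    return np.stack(feats)               # shape: (num_scales, range_bins)

def spectral_filter(profile, keep=8):
    """Frequency-aware filtering, crudely: retain only the `keep`
    lowest-frequency bins of the profile's spectrum."""
    spec = np.fft.rfft(profile)
    spec[keep:] = 0
    return np.fft.irfft(spec, n=len(profile))
```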

41 pages, 2824 KiB  
Review
Assessing Milk Authenticity Using Protein and Peptide Biomarkers: A Decade of Progress in Species Differentiation and Fraud Detection
by Achilleas Karamoutsios, Pelagia Lekka, Chrysoula Chrysa Voidarou, Marilena Dasenaki, Nikolaos S. Thomaidis, Ioannis Skoufos and Athina Tzora
Foods 2025, 14(15), 2588; https://doi.org/10.3390/foods14152588 - 23 Jul 2025
Viewed by 655
Abstract
Milk is a nutritionally rich food and a frequent target of economically motivated adulteration, particularly through substitution with lower-cost milk types. Over the past decade, significant progress has been made in the authentication of milk using advanced proteomic and chemometric approaches, with a focus on the discovery and application of protein and peptide biomarkers for species differentiation and fraud detection. Recent innovations in both top-down and bottom-up proteomics have markedly improved the sensitivity and specificity of detecting key molecular targets, including caseins and whey proteins. Peptide-based methods are especially valuable in processed dairy products due to their thermal stability and resilience to harsh treatment, although their species specificity may be limited when sequences are conserved across related species. Robust chemometric approaches are increasingly integrated with proteomic pipelines to handle high-dimensional datasets and enhance classification performance. Multivariate techniques, such as principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA), are frequently employed to extract discriminatory features and model adulteration scenarios. Despite these advances, key challenges persist, including the lack of standardized protocols, variability in sample preparation, and the need for broader validation across breeds, geographies, and production systems. Future progress will depend on the convergence of high-resolution proteomics with multi-omics integration, structured data fusion, and machine learning frameworks, enabling scalable, specific, and robust solutions for milk authentication in increasingly complex food systems. Full article
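PCA, one of the multivariate techniques named above, projects a high-dimensional peptide-intensity matrix onto a few discriminatory components. A minimal SVD-based sketch (the toy two-species data below is illustrative, not from any dataset in the review):

```python
import numpy as np

def pca(X, n_components=2):
    """Principal component analysis via SVD: center the data, then
    project onto the top right singular vectors."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T        # sample coordinates
    explained = (S ** 2) / np.sum(S ** 2)    # variance ratio per component
    return scores, explained[:n_components]
```

With two groups of samples differing in which peptide markers are intense, the first component separates the species and carries most of the variance.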

20 pages, 2516 KiB  
Article
Visual Attention Fusion Network (VAFNet): Bridging Bottom-Up and Top-Down Features in Infrared and Visible Image Fusion
by Yaochen Liu, Yunke Wang and Zixuan Jing
Symmetry 2025, 17(7), 1104; https://doi.org/10.3390/sym17071104 - 9 Jul 2025
Viewed by 208
Abstract
Infrared and visible image fusion aims to integrate useful information from the source image to obtain a fused image that not only has excellent visual perception but also promotes the performance of the subsequent object detection task. However, due to the asymmetry between image fusion and object detection tasks, obtaining superior visual effects while facilitating object detection tasks remains challenging in real-world applications. Addressing this issue, we propose a novel visual attention fusion network for infrared and visible image fusion (VAFNet), which can bridge bottom-up and top-down features to achieve high-quality visual perception while improving the performance of object detection tasks. The core idea is that bottom-up visual attention is utilized to extract multi-layer bottom-up features for ensuring superior visual perception, while top-down visual attention determines object attention signals related to object detection tasks. Then, a bidirectional attention integration mechanism is designed to naturally integrate two forms of attention into the fused image. Experiments on public and collection datasets demonstrate that VAFNet not only outperforms seven state-of-the-art (SOTA) fusion methods in qualitative and quantitative evaluation but also has advantages in facilitating object detection tasks. Full article
(This article belongs to the Special Issue Symmetry in Next-Generation Intelligent Information Technologies)

27 pages, 12000 KiB  
Article
Multi-Model Synergistic Satellite-Derived Bathymetry Fusion Approach Based on Mamba Coral Reef Habitat Classification
by Xuechun Zhang, Yi Ma, Feifei Zhang, Zhongwei Li and Jingyu Zhang
Remote Sens. 2025, 17(13), 2134; https://doi.org/10.3390/rs17132134 - 21 Jun 2025
Viewed by 388
Abstract
As fundamental geophysical information, the high-precision detection of shallow water bathymetry is critical data support for the utilization of island resources and coral reef protection delimitation. In recent years, the combination of active and passive remote sensing technologies has led to a revolutionary breakthrough in satellite-derived bathymetry (SDB). Optical SDB extracts bathymetry by quantifying light–water–bottom interactions. Therefore, the apparent differences in the reflectance of different bottom types in specific wavelength bands are a core component of SDB. In this study, refined classification was performed for complex seafloor sediment and geomorphic features in coral reef habitats. A multi-model synergistic SDB fusion approach constrained by coral reef habitat classification based on the deep learning framework Mamba was constructed. The dual error of the global single model was suppressed by exploiting sediment and geomorphic partitions, as well as the accuracy complementarity of different models. Based on multispectral remote sensing imagery Sentinel-2 and the Ice, Cloud, and Land Elevation Satellite-2 (ICESat-2) active spaceborne lidar bathymetry data, wide-range and high-accuracy coral reef habitat classification results and bathymetry information were obtained for the Yuya Shoal (0–23 m) and Niihau Island (0–40 m). The results showed that the overall Mean Absolute Errors (MAEs) in the two study areas were 0.2 m and 0.5 m and the Mean Absolute Percentage Errors (MAPEs) were 9.77% and 6.47%, respectively. And R2 reached 0.98 in both areas. The estimated error of the SDB fusion strategy based on coral reef habitat classification was reduced by more than 90% compared with classical SDB models and a single machine learning method, thereby improving the capability of SDB in complex geomorphic ocean areas. Full article
(This article belongs to the Section Remote Sensing in Geology, Geomorphology and Hydrology)
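The partitioned-fusion idea — fit a separate bathymetry model per habitat class and stitch the per-class predictions into one map — together with the MAE/MAPE error measures can be sketched as follows (the linear per-class models are placeholders for the actual SDB models):

```python
import numpy as np

def fuse_by_class(band_ratio, habitat, models):
    """Route each pixel to the regression model of its habitat class
    and stitch the per-class predictions into one bathymetry map.
    models: {class_id: (a, b)} with depth = a * band_ratio + b."""
    depth = np.empty_like(band_ratio)
    for cls, (a, b) in models.items():
        mask = habitat == cls
        depth[mask] = a * band_ratio[mask] + b
    return depth

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true))) * 100
```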

20 pages, 1482 KiB  
Article
Research on Person Pose Estimation Based on Parameter Inverted Pyramid and High-Dimensional Feature Enhancement
by Guofeng Ma and Qianyi Zhang
Symmetry 2025, 17(6), 941; https://doi.org/10.3390/sym17060941 - 13 Jun 2025
Viewed by 690
Abstract
Heating, Ventilation and Air Conditioning (HVAC) systems are significant carbon emitters in buildings, and precise regulation is crucial for achieving carbon neutrality. Computer vision-based occupant behavior prediction provides vital data for demand-driven control strategies. Real-time multi-person pose estimation faces challenges in balancing speed and accuracy, especially in complex environments. Traditional top-down methods become computationally expensive as the number of people increases, while bottom-up methods struggle with key point mismatches in dense crowds. This paper introduces the Efficient-RTMO model, which leverages the Parameter Inverted Image Pyramid (PIIP) with hierarchical multi-scale symmetry for lightweight processing of high-resolution images and a deeper network for low-resolution images. This approach reduces computational complexity, particularly in dense crowd scenarios, and incorporates a dynamic sparse connectivity mechanism via the star-shaped dynamic feed-forward network (StarFFN). By optimizing the symmetry structure, it improves inference efficiency and ensures effective feature fusion. Experimental results on the COCO dataset show that Efficient-RTMO outperforms the baseline RTMO model, achieving more than 2× speed improvement and a 0.3 AP increase. Ablation studies confirm that PIIP and StarFFN enhance robustness against occlusions and scale variations, demonstrating their synergistic effectiveness. Full article

17 pages, 9764 KiB  
Article
Depth Estimation of an Underwater Moving Source Based on the Acoustic Interference Pattern Stream
by Lintai Rong, Bo Lei, Tiantian Gu and Zhaoyang He
Electronics 2025, 14(11), 2228; https://doi.org/10.3390/electronics14112228 - 30 May 2025
Viewed by 420
Abstract
For a bottom-moored vertical line array in deep ocean, the underwater maneuvering source will produce interference patterns in both grazing angle–distance (vertical-time record, VTR) and frequency–grazing angle (wideband beamforming output) domains, respectively, and the interference period is modulated by the source depth. Based on these characteristics, an interference feature fusion (IFF) method is proposed in the space–time–frequency domain for source depth estimation, in which the principal interference mode of the VTR is extracted adaptively and the depth ambiguity function is constructed by fusing the ambiguity sequence, mapped by wideband beamforming intensity, and the principal interference mode, which can achieve the long-term depth estimation and recognition of underwater sources without requiring environmental information. Theoretical analysis and simulation results indicate that the IFF can suppress the false peaks generated by the generalized Fourier transform (GFT) method, and the depth estimation error of the IFF for a single source is reduced by at least 47% compared to GFT. In addition, the IFF is proven to be effective at separating the depth of multiple adjacent sources (with the average estimation error reduced by 28%) and exhibits a high degree of robustness within the fluctuating acoustic channel (with the average estimation error reduced by 12%). Full article

20 pages, 3875 KiB  
Article
A Bottom-Up Multi-Feature Fusion Algorithm for Individual Tree Segmentation in Dense Rubber Tree Plantations Using Unmanned Aerial Vehicle–Light Detecting and Ranging
by Zhipeng Zeng, Junpeng Miao, Xiao Huang, Peng Chen, Ping Zhou, Junxiang Tan and Xiangjun Wang
Plants 2025, 14(11), 1640; https://doi.org/10.3390/plants14111640 - 27 May 2025
Viewed by 465
Abstract
Accurate individual tree segmentation (ITS) in dense rubber plantations is a challenging task due to overlapping canopies, indistinct tree apexes, and intricate branch structures. To address these challenges, we propose a bottom-up, multi-feature fusion method for segmenting rubber trees using UAV-LiDAR point clouds. Our approach first involves performing a trunk extraction based on branch-point density variations and neighborhood directional features, which allows for the precise separation of trunks from overlapping canopies. Next, we introduce a multi-feature fusion strategy that replaces single-threshold constraints, integrating geometric, directional, and density attributes to classify core canopy points, boundary points, and overlapping regions. Disputed points are then iteratively assigned to adjacent trees based on neighborhood growth angle consistency, enhancing the robustness of the segmentation. Experiments conducted in rubber plantations with varying canopy closure (low, medium, and high) show accuracies of 0.97, 0.98, and 0.95. Additionally, the crown width and canopy projection area derived from the segmented individual tree point clouds are highly consistent with ground truth data, with R2 values exceeding 0.98 and 0.97, respectively. The proposed method provides a reliable foundation for 3D tree modeling and biomass estimation in structurally complex plantations, advancing precision forestry and ecosystem assessment by overcoming the critical limitations of existing ITS approaches in high-closure tropical agroforests. Full article
(This article belongs to the Special Issue Advances in Artificial Intelligence for Plant Research)
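The final step, assigning disputed points to adjacent trees, can be illustrated with a far simpler rule — nearest trunk in the horizontal plane — standing in for the paper's neighborhood growth-angle consistency criterion:

```python
import numpy as np

def assign_to_trunks(points, trunks):
    """Assign each canopy point (x, y, z) to the nearest extracted
    trunk position (x, y). A simplified stand-in for the paper's
    angle-consistency assignment of disputed points."""
    # squared horizontal distance from every point to every trunk
    d2 = ((points[:, None, :2] - trunks[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)            # tree index per point
```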

18 pages, 5574 KiB  
Article
An Intelligent Method for Real-Time Surface Monitoring of Rock Drillability at the Well Bottom Based on Logging and Drilling Data Fusion
by Dexin Ma, Hongbo Yang, Zhi Yang, Junbo Liu, Hui Zhang, Chengkai Weng, Haifei Lv, Kunhong Lv, Yuting Zhou and Cheng Qin
Processes 2025, 13(3), 668; https://doi.org/10.3390/pr13030668 - 27 Feb 2025
Viewed by 860
Abstract
The accurate prediction and monitoring of rock drillability are essential for geomechanical modeling and optimizing drilling parameters. Traditional methods often rely on laboratory core experiments and well logging data to evaluate rock drillability. However, these methods can only obtain core samples and sonic logging data in drilled wells. To enable the real-time monitoring of bottom-hole rock drillability during drilling, we propose the following novel approach: data fusion and a CNN-GBDT framework for surface-based real-time monitoring. The specific process involves using 1D-CNN convolution to extract deep features from historical wells’ drilling data and sonic log data. These deep features are then fused with the original features and passed to the GBDT framework’s machine learning model for training. To validate the effectiveness of this method, this study conducted a case analysis on two wells in the Missan Oil Fields. CNN-GBDT models based on XGBoost, LightGBM, and CatBoost were established and compared with physical methods. The results indicate that the CNN-GBDT model centered on LightGBM achieved a mean square error (MSE) of 0.026, which was one-tenth of the MSE of 0.282 of the physical evaluation method. Furthermore, the effectiveness of the proposed CNN-GBDT framework for monitoring rock drillability suggests potential applications in monitoring other bottom-hole parameters. Full article
(This article belongs to the Special Issue Oil and Gas Drilling Processes: Control and Optimization)
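The core fusion step — convolutional features concatenated with the original features before the GBDT stage — reduces to a few lines; global average pooling over valid-mode convolutions is a deliberate simplification of the 1D-CNN described above:

```python
import numpy as np

def conv1d_features(signal, kernels):
    """Extract 'deep' features by 1D convolution (valid mode) followed
    by global average pooling: one scalar per kernel."""
    return np.array([np.convolve(signal, k, mode="valid").mean()
                     for k in kernels])

def fuse_features(signal, kernels):
    """Concatenate the raw drilling/logging features with the
    convolutional features; the fused vector would then feed a
    GBDT regressor (XGBoost, LightGBM, CatBoost, ...)."""
    return np.concatenate([signal, conv1d_features(signal, kernels)])
```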

31 pages, 6413 KiB  
Article
Noise-to-Convex: A Hierarchical Framework for SAR Oriented Object Detection via Scattering Keypoint Feature Fusion and Convex Contour Refinement
by Shuoyang Liu, Ming Tong, Bokun He, Jiu Jiang and Chu He
Electronics 2025, 14(3), 569; https://doi.org/10.3390/electronics14030569 - 31 Jan 2025
Cited by 1 | Viewed by 756
Abstract
Oriented object detection has become a hot topic in SAR image interpretation. Due to the unique imaging mechanism, SAR objects are represented as clusters of scattering points surrounded by coherent speckle noise, leading to blurred outlines and increased false alarms in complex scenes. To address these challenges, we propose a novel noise-to-convex detection paradigm with a hierarchical framework based on the scattering-keypoint-guided diffusion detection transformer (SKG-DDT), which consists of three levels. At the bottom level, the strong-scattering-region generation (SSRG) module constructs the spatial distribution of strong scattering regions via a diffusion model, enabling the direct identification of approximate object regions. At the middle level, the scattering-keypoint feature fusion (SKFF) module dynamically locates scattering keypoints across multiple scales, capturing their spatial and structural relationships with the attention mechanism. Finally, the convex contour prediction (CCP) module at the top level refines the object outline by predicting fine-grained convex contours. Furthermore, we unify the three-level framework into an end-to-end pipeline via a detection transformer. The proposed method was comprehensively evaluated on three public SAR datasets, including HRSID, RSDD-SAR, and SAR-Aircraft-v1.0. The experimental results demonstrate that the proposed method attains an AP50 of 86.5%, 92.7%, and 89.2% on these three datasets, respectively, which is an increase of 0.7%, 0.6%, and 1.0% compared to the existing state-of-the-art method. These results indicate that our approach outperforms existing algorithms across multiple object categories and diverse scenes. Full article
(This article belongs to the Section Artificial Intelligence)

16 pages, 1222 KiB  
Article
Infrared Small Target Detection Algorithm Based on Improved Dense Nested U-Net Network
by Xinyue Du, Ke Cheng, Jin Zhang, Yuanyu Wang, Fan Yang, Wei Zhou and Yu Lin
Sensors 2025, 25(3), 814; https://doi.org/10.3390/s25030814 - 29 Jan 2025
Cited by 1 | Viewed by 1771
Abstract
Infrared weak and small target detection technology has attracted much attention in recent years and is crucial in the application fields of early warning, monitoring, medical diagnostics, and anti-UAV detection. With the advancement of deep learning, CNN-based methods have achieved promising results in general-purpose target detection due to their powerful modeling capabilities; however, CNN-based methods cannot be directly applied to infrared small targets due to the disappearance of deep targets caused by multiple downsampling operations. To address these problems, we propose an improved dense nesting and attention infrared small target detection method based on U-Net, called IDNA-UNet. A dense nested interaction module (DNIM) is designed as a feature extraction module to achieve level-by-level feature fusion and retain small targets’ features and detailed positioning information. To integrate low-level features into deeper high-level features, we designed a bottom-up feature pyramid fusion module, which can further retain high-level semantic information and target detail information. In addition, a more suitable scale and position sensitive (SLS) loss is applied to each prediction scale to help the detector locate the target more accurately and distinguish different scales of the target. With our IDNA-UNet, the contextual information of small targets can be well incorporated and fully exploited by repetitive fusion and enhancement. Compared with existing methods, IDNA-UNet has achieved significant advantages in the intersection over union (IoU), detection probability (Pd), and false alarm rate (Fa) of infrared small target detection. Full article
(This article belongs to the Special Issue Computer Vision Sensing and Pattern Recognition)
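A bottom-up feature pyramid fusion pass can be sketched as path aggregation: each finer level is pooled down and added into the next coarser one, so detail information flows upward (2x2 average pooling stands in for whatever learned downsampling the module uses):

```python
import numpy as np

def bottom_up_fuse(pyramid):
    """Bottom-up path aggregation over a feature pyramid ordered from
    finest to coarsest: downsample each fused level by 2x2 average
    pooling and add it into the next coarser level."""
    fused = [pyramid[0]]
    for coarser in pyramid[1:]:
        fine = fused[-1]
        h, w = fine.shape
        pooled = fine.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        fused.append(coarser + pooled)
    return fused
```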

16 pages, 8462 KiB  
Article
Wavelet-Based, Blur-Aware Decoupled Network for Video Deblurring
by Hua Wang, Pornntiwa Pawara and Rapeeporn Chamchong
Appl. Sci. 2025, 15(3), 1311; https://doi.org/10.3390/app15031311 - 27 Jan 2025
Viewed by 1111
Abstract
Video deblurring faces a fundamental challenge, as blur degradation comprehensively affects frames by not only causing detail loss but also severely distorting structural information. This dual degradation across low- and high-frequency domains makes it challenging for existing methods to simultaneously restore both structural and detailed information through a unified approach. To address this issue, we propose a wavelet-based, blur-aware decoupled network (WBDNet) that innovatively decouples structure reconstruction from detail enhancement. Our method decomposes features into multiple frequency bands and employs specialized restoration strategies for different frequency domains. In the low-frequency domain, we construct a multi-scale feature pyramid with optical flow alignment. This enables accurate structure reconstruction through bottom-up progressive feature fusion. For high-frequency components, we combine deformable convolution with a blur-aware attention mechanism. This allows us to precisely extract and merge sharp details from multiple frames. Extensive experiments on benchmark datasets demonstrate the superior performance of our method, particularly in preserving structural integrity and detail fidelity. Full article
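The wavelet decomposition underlying WBDNet splits a signal into a low-frequency (structure) band and a high-frequency (detail) band. A one-level 1D Haar sketch with perfect reconstruction — the network itself operates on 2D frames with learned processing per band, so this only illustrates the band split:

```python
import numpy as np

def haar_split(row):
    """One level of the 1D Haar wavelet transform: averages of
    adjacent pairs give the low band, differences the high band."""
    even, odd = row[0::2], row[1::2]
    low = (even + odd) / np.sqrt(2)     # approximation coefficients
    high = (even - odd) / np.sqrt(2)    # detail coefficients
    return low, high

def haar_merge(low, high):
    """Inverse of haar_split: perfect reconstruction."""
    row = np.empty(2 * len(low))
    row[0::2] = (low + high) / np.sqrt(2)
    row[1::2] = (low - high) / np.sqrt(2)
    return row
```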

18 pages, 3618 KiB  
Article
EBFA-6D: End-to-End Transparent Object 6D Pose Estimation Based on a Boundary Feature Augmented Mechanism
by Xinbei Jiang, Zichen Zhu, Tianhan Gao and Nan Guo
Sensors 2024, 24(23), 7584; https://doi.org/10.3390/s24237584 - 27 Nov 2024
Viewed by 1019
Abstract
Transparent objects, commonly encountered in everyday environments, present significant challenges for 6D pose estimation due to their unique optical properties. The lack of inherent texture and color complicates traditional vision methods, while the transparency prevents depth sensors from accurately capturing geometric details. We propose EBFA-6D, a novel end-to-end 6D pose estimation framework that directly predicts the 6D poses of transparent objects from a single RGB image. To overcome the challenges introduced by transparency, we leverage the high contrast at object boundaries inherent to transparent objects by proposing a boundary feature augmented mechanism. We further conduct a bottom-up feature fusion to enhance the location capability of EBFA-6D. EBFA-6D is evaluated on the ClearPose dataset, outperforming the existing methods in accuracy while achieving an inference speed near real-time. The results demonstrate that EBFA-6D provides an efficient and effective solution for accurate 6D pose estimation of transparent objects. Full article
(This article belongs to the Section Sensors and Robotics)
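The boundary cue EBFA-6D exploits — high contrast at transparent object silhouettes — can be approximated by a plain gradient-magnitude map; central differences here, whereas the paper's boundary feature augmented mechanism is learned, not hand-crafted:

```python
import numpy as np

def boundary_map(img):
    """Simple boundary response: gradient magnitude via central
    differences. High values mark the high-contrast silhouette
    pixels that transparent objects exhibit."""
    gy, gx = np.gradient(img.astype(float))   # vertical, horizontal
    return np.hypot(gx, gy)
```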

18 pages, 18674 KiB  
Article
An Improved Instance Segmentation Method for Complex Elements of Farm UAV Aerial Survey Images
by Feixiang Lv, Taihong Zhang, Yunjie Zhao, Zhixin Yao and Xinyu Cao
Sensors 2024, 24(18), 5990; https://doi.org/10.3390/s24185990 - 15 Sep 2024
Cited by 2 | Viewed by 1359
Abstract
Farm aerial survey layers can assist in unmanned farm operations, such as planning paths and early warnings. To address the inefficiencies and high costs associated with traditional layer construction, this study proposes a high-precision instance segmentation algorithm based on SparseInst. Considering the structural characteristics of farm elements, this study introduces a multi-scale attention module (MSA) that leverages the properties of atrous convolution to expand the sensory field. It enhances spatial and channel feature weights, effectively improving segmentation accuracy for large-scale and complex targets in the farm through three parallel dense connections. A bottom-up aggregation path is added to the feature pyramid fusion network, enhancing the model’s ability to perceive complex targets such as mechanized trails in farms. Coordinate attention blocks (CAs) are incorporated into the neck to capture richer contextual semantic information, enhancing farm aerial imagery scene recognition accuracy. To assess the proposed method, we compare it against existing mainstream object segmentation models, including the Mask R-CNN, Cascade–Mask, SOLOv2, and Condinst algorithms. The experimental results show that the improved model proposed in this study can be adapted to segment various complex targets in farms. The accuracy of the improved SparseInst model greatly exceeds that of Mask R-CNN and Cascade–Mask and is 10.8 and 12.8 percentage points better than the average accuracy of SOLOv2 and Condinst, respectively, with the smallest number of model parameters. The results show that the model can be used for real-time segmentation of targets under complex farm conditions. Full article
(This article belongs to the Section Intelligent Sensors)
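Atrous (dilated) convolution, which the MSA module uses to expand the sensory field, spaces the kernel taps by the dilation rate, so a length-k kernel covers (k - 1) * dilation + 1 samples without extra parameters. A 1D sketch:

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """1D atrous convolution (valid mode): tap spacing equals the
    dilation rate, enlarging the receptive field at no parameter cost."""
    k = len(kernel)
    span = (k - 1) * dilation + 1
    return np.array([np.dot(x[i:i + span:dilation], kernel)
                     for i in range(len(x) - span + 1)])
```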

19 pages, 4082 KiB  
Article
Real-Time Detection and Counting of Wheat Spikes Based on Improved YOLOv10
by Sitong Guan, Yiming Lin, Guoyu Lin, Peisen Su, Siluo Huang, Xianyong Meng, Pingzeng Liu and Jun Yan
Agronomy 2024, 14(9), 1936; https://doi.org/10.3390/agronomy14091936 - 28 Aug 2024
Cited by 20 | Viewed by 4051
Abstract
Wheat is one of the most crucial food crops globally, with its yield directly impacting global food security. The accurate detection and counting of wheat spikes is essential for monitoring wheat growth, predicting yield, and managing fields. However, the current methods face challenges, such as spike size variation, shading, weed interference, and dense distribution. Conventional machine learning approaches have partially addressed these challenges, yet they are hampered by limited detection accuracy, complexities in feature extraction, and poor robustness under complex field conditions. In this paper, we propose an improved YOLOv10 algorithm that significantly enhances the model’s feature extraction and detection capabilities. This is achieved by introducing a bidirectional feature pyramid network (BiFPN), a separated and enhancement attention module (SEAM), and a global context network (GCNet). BiFPN leverages both top-down and bottom-up bidirectional paths to achieve multi-scale feature fusion, improving performance in detecting targets of various scales. SEAM enhances feature representation quality and model performance in complex environments by separately augmenting the attention mechanism for channel and spatial features. GCNet captures long-range dependencies in the image through the global context block, enabling the model to process complex information more accurately. The experimental results demonstrate that our method achieved a precision of 93.69%, a recall of 91.70%, and a mean average precision (mAP) of 95.10% in wheat spike detection, outperforming the benchmark YOLOv10 model by 2.02% in precision, 2.92% in recall, and 1.56% in mAP. Additionally, the coefficient of determination (R2) between the detected and manually counted wheat spikes was 0.96, with a mean absolute error (MAE) of 3.57 and a root-mean-square error (RMSE) of 4.09, indicating strong correlation and high accuracy. The improved YOLOv10 algorithm effectively solves the difficult problem of wheat spike detection under complex field conditions, providing strong support for agricultural production and research. Full article
(This article belongs to the Section Precision and Digital Agriculture)
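The evaluation quantities reported above — detection precision/recall and the count-agreement measures MAE, RMSE, and R² — are standard; a compact sketch of their computation (the toy counts below are illustrative, not the paper's data):

```python
import numpy as np

def precision_recall(tp, fp, fn):
    """Detection precision and recall from match counts."""
    return tp / (tp + fp), tp / (tp + fn)

def count_metrics(y_true, y_pred):
    """Agreement between detected and manually counted spikes:
    MAE, RMSE, and the coefficient of determination R^2."""
    y_true = np.asarray(y_true, float)
    y_pred = np.asarray(y_pred, float)
    err = y_pred - y_true
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    ss_res = (err ** 2).sum()
    ss_tot = ((y_true - y_true.mean()) ** 2).sum()
    return mae, rmse, 1 - ss_res / ss_tot
```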