Search Results (27)

Search Parameters:
Keywords = SOLOv2

19 pages, 2933 KB  
Article
Image-Based Detection of Chinese Bayberry (Myrica rubra) Maturity Using Cascaded Instance Segmentation and Multi-Feature Regression
by Hao Zheng, Li Sun, Yue Wang, Han Yang and Shuwen Zhang
Horticulturae 2025, 11(10), 1166; https://doi.org/10.3390/horticulturae11101166 - 1 Oct 2025
Viewed by 490
Abstract
The accurate assessment of Chinese bayberry (Myrica rubra) maturity is critical for intelligent harvesting. This study proposes a novel cascaded framework combining instance segmentation and multi-feature regression for accurate maturity detection. First, a lightweight SOLOv2-Light network is employed to segment each fruit individually, which significantly reduces computational costs with only a marginal drop in accuracy. Then, a multi-feature extraction network is developed to fuse deep semantic, color (LAB space), and multi-scale texture features, enhanced by a channel attention mechanism for adaptive weighting. The maturity ground truth is defined using the a*/b* ratio measured by a colorimeter, which correlates strongly with anthocyanin accumulation and visual ripeness. Experimental results demonstrate that the proposed method achieves a mask mAP of 0.788 on the instance segmentation task, outperforming Mask R-CNN and YOLACT. For maturity prediction, a mean absolute error of 3.946% is attained, a significant improvement over the baseline. When the data are discretized into three maturity categories, the overall accuracy reaches 95.51%, surpassing YOLOX-s and Faster R-CNN by a considerable margin while reducing processing time by approximately 46%. The modular design facilitates easy adaptation to new varieties. This research provides a robust and efficient solution for in-field bayberry maturity detection, offering substantial value for the development of automated harvesting systems. Full article
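The abstract above defines the maturity ground truth as the a*/b* chromaticity ratio in CIELAB color space, later discretized into three maturity categories. A minimal sketch of that labeling step is shown below; the threshold values and function names are illustrative assumptions, not taken from the paper.

```python
# Sketch of a*/b*-based maturity labeling (hypothetical thresholds, not the paper's).

def maturity_ratio(a_star: float, b_star: float) -> float:
    """a*/b* ratio; it rises as green (negative a*) turns to red (positive a*)."""
    if b_star == 0:
        raise ValueError("b* must be nonzero")
    return a_star / b_star

def maturity_class(ratio: float, low: float = 0.5, high: float = 1.5) -> str:
    """Bin a continuous a*/b* ratio into three categories (illustrative cut-offs)."""
    if ratio < low:
        return "immature"
    elif ratio < high:
        return "turning"
    return "mature"

# Example: a deep-red berry with a* = 45, b* = 20
r = maturity_ratio(45.0, 20.0)  # 2.25
print(maturity_class(r))        # mature
```

In a real pipeline the a* and b* values would come from a colorimeter reading (for ground truth) or a calibrated RGB→LAB conversion of the segmented fruit region.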

19 pages, 4802 KB  
Article
Enhanced SOLOv2: An Effective Instance Segmentation Algorithm for Densely Overlapping Silkworms
by Jianying Yuan, Hao Li, Chen Cheng, Zugui Liu, Sidong Wu and Dequan Guo
Sensors 2025, 25(18), 5703; https://doi.org/10.3390/s25185703 - 12 Sep 2025
Viewed by 790
Abstract
Silkworm instance segmentation is crucial for individual silkworm behavior analysis and health monitoring in intelligent sericulture, as the segmentation accuracy directly influences the reliability of subsequent biological parameter estimation. In real farming environments, silkworms often exhibit high density and severe mutual occlusion, posing significant challenges for traditional instance segmentation algorithms. To address these issues, this paper proposes an enhanced SOLOv2 algorithm. Specifically, (1) in the backbone network, Linear Deformable Convolution (LDC) is incorporated to strengthen the geometric feature modeling of curved silkworms, a Haar Wavelet Downsampling (HWD) module is designed to better preserve details of partially visible targets, and an Edge-Augmented Multi-attention Fusion Network (EAMF-Net) is constructed to improve boundary discrimination in overlapping regions. (2) In the mask branch, Dynamic Upsampling (Dysample), Adaptive Spatial Feature Fusion (ASFF), and the Simple Attention Module (SimAM) are integrated to refine the quality of segmentation masks. Experiments conducted on a self-built high-density silkworm dataset demonstrate that the proposed method achieves an Average Precision (AP) of 85.1%, with significant improvements over the baseline model in small- (APs: +10.2%), medium- (APm: +4.0%), and large-target (APl: +2.0%) segmentation accuracy. This effectively advances precision in dense silkworm segmentation scenarios. Full article
(This article belongs to the Special Issue Vision Sensors for Object Detection and Tracking)

12 pages, 2172 KB  
Article
Instance Segmentation Method for Insulators in Complex Backgrounds Based on Improved SOLOv2
by Ze Chen, Yangpeng Ji, Xiaodong Du, Shaokang Zhao, Zhenfei Huo and Xia Fang
Sensors 2025, 25(17), 5318; https://doi.org/10.3390/s25175318 - 27 Aug 2025
Viewed by 879
Abstract
To precisely delineate the contours of insulators in complex transmission line images obtained from Unmanned Aerial Vehicle (UAV) inspections, and thereby facilitate subsequent defect analysis, this study proposes an instance segmentation framework based on an enhanced SOLOv2 model. The proposed framework integrates a preprocessed edge channel, generated through the Non-Subsampled Contourlet Transform (NSCT), which augments the model’s capability to accurately capture insulator edges. Moreover, the network’s input resolution is increased to 1200 × 1600, permitting more detailed edge extraction. Instead of the original ResNet + FPN architecture, an improved HRNet is utilized as the backbone to effectively harness multi-scale feature information, thereby enhancing the model’s overall efficacy. To accommodate the larger input size, the network’s channel count is reduced while its depth is increased, ensuring an adequate receptive field without substantially escalating the parameter count. Additionally, a Convolutional Block Attention Module (CBAM) is incorporated to refine mask quality and improve detection precision. Furthermore, to bolster the model’s robustness and minimize annotation demands, a virtual dataset is created using the fourth-generation Unreal Engine (UE4). Empirical results reveal that the proposed framework exhibits superior performance, with AP0.50 (90.21%), AP0.75 (83.34%), and AP[0.50:0.95] (67.26%) on a test set of images supplied by the power grid. This framework surpasses existing methodologies and contributes significantly to the advancement of intelligent transmission line inspection. Full article
(This article belongs to the Special Issue Recent Trends and Advances in Intelligent Fault Diagnostics)

25 pages, 9564 KB  
Article
Semantic-Aware Cross-Modal Transfer for UAV-LiDAR Individual Tree Segmentation
by Fuyang Zhou, Haiqing He, Ting Chen, Tao Zhang, Minglu Yang, Ye Yuan and Jiahao Liu
Remote Sens. 2025, 17(16), 2805; https://doi.org/10.3390/rs17162805 - 13 Aug 2025
Viewed by 1640
Abstract
Cross-modal semantic segmentation of individual tree LiDAR point clouds is critical for accurately characterizing tree attributes, quantifying ecological interactions, and estimating carbon storage. However, in forest environments, this task faces key challenges such as high annotation costs and poor cross-domain generalization. To address these issues, this study proposes a cross-modal semantic transfer framework tailored for individual tree point cloud segmentation in forested scenes. Leveraging co-registered UAV-acquired RGB imagery and LiDAR data, we construct a technical pipeline of “2D semantic inference—3D spatial mapping—cross-modal fusion” to enable annotation-free semantic parsing of 3D individual trees. Specifically, we first introduce a novel Multi-Source Feature Fusion Network (MSFFNet) to achieve accurate instance-level segmentation of individual trees in the 2D image domain. Subsequently, we develop a hierarchical two-stage registration strategy to effectively align dense matched point clouds (MPC) generated from UAV imagery with LiDAR point clouds. On this basis, we propose a probabilistic cross-modal semantic transfer model that builds a semantic probability field through multi-view projection and the expectation–maximization algorithm. By integrating geometric features and semantic confidence, the model establishes semantic correspondences between 2D pixels and 3D points, thereby achieving spatially consistent semantic label mapping. This facilitates the transfer of semantic annotations from the 2D image domain to the 3D point cloud domain. The proposed method is evaluated on two forest datasets. The results demonstrate that the proposed individual tree instance segmentation approach achieves the highest performance, with an IoU of 87.60%, compared to state-of-the-art methods such as Mask R-CNN, SOLOv2, and Mask2Former. Furthermore, the cross-modal semantic label transfer framework significantly outperforms existing mainstream methods in individual tree point cloud semantic segmentation across complex forest scenarios. Full article

24 pages, 7057 KB  
Article
Construction and Enhancement of a Rural Road Instance Segmentation Dataset Based on an Improved StyleGAN2-ADA
by Zhixin Yao, Renna Xi, Taihong Zhang, Yunjie Zhao, Yongqiang Tian and Wenjing Hou
Sensors 2025, 25(8), 2477; https://doi.org/10.3390/s25082477 - 15 Apr 2025
Cited by 2 | Viewed by 806
Abstract
With the advancement of agricultural automation, the demand for road recognition and understanding in agricultural machinery autonomous driving systems has significantly increased. To address the scarcity of instance segmentation data for rural roads and unstructured rural scenes, particularly the lack of support for high-resolution and fine-grained classification, a 20-class instance segmentation dataset was constructed, comprising 10,062 independently annotated instances. An improved StyleGAN2-ADA data augmentation method was proposed to generate higher-quality image data. This method incorporates a decoupled mapping network (DMN) to reduce the coupling degree of latent codes in W-space and integrates the advantages of convolutional networks and transformers by designing a convolutional coupling transfer block (CCTB). The core cross-shaped window self-attention mechanism in the CCTB enhances the network’s ability to capture complex contextual information and spatial layouts. Ablation experiments comparing the improved and original StyleGAN2-ADA networks demonstrate significant improvements, with the inception score (IS) increasing from 42.38 to 77.31 and the Fréchet inception distance (FID) decreasing from 25.09 to 12.42, indicating a notable enhancement in data generation quality and authenticity. To verify the effect of data augmentation on model performance, Mask R-CNN, SOLOv2, YOLOv8n, and OneFormer were evaluated on both the original and the augmented datasets; the resulting performance differences further confirm the effectiveness of the improved modules. Full article
(This article belongs to the Section Sensing and Imaging)

17 pages, 4587 KB  
Article
Improved YOLOv8-Based Segmentation Method for Strawberry Leaf and Powdery Mildew Lesions in Natural Backgrounds
by Mingzhou Chen, Wei Zou, Xiangjie Niu, Pengfei Fan, Haowei Liu, Cuiling Li and Changyuan Zhai
Agronomy 2025, 15(3), 525; https://doi.org/10.3390/agronomy15030525 - 21 Feb 2025
Cited by 2 | Viewed by 2353
Abstract
This study addresses the challenge of segmenting strawberry leaves and lesions in natural backgrounds, which is critical for accurate disease severity assessment and automated dosing. Focusing on strawberry powdery mildew, we propose an enhanced YOLOv8-based segmentation method for leaf and lesion detection. Four instance segmentation models (SOLOv2, YOLACT, YOLOv7-seg, and YOLOv8-seg) were compared, using YOLOv8-seg as the baseline. To improve performance, SCDown and PSA modules were integrated into the backbone to reduce redundancy, decrease computational load, and enhance detection of small objects and complex backgrounds. In the neck, the C2f module was replaced with the C2fCIB module, and the SimAM attention mechanism was incorporated to improve target differentiation and reduce noise interference. The loss function combined CIOU with MPDIOU to enhance adaptability in challenging scenarios. Ablation experiments demonstrated a segmentation accuracy of 92%, recall of 85.2%, and mean average precision (mAP) of 90.4%, surpassing the YOLOv8-seg baseline by 4%, 2.9%, and 4%, respectively. Compared to SOLOv2, YOLACT, and YOLOv7-seg, the improved model’s mAP increased by 14.8%, 5.8%, and 3.9%, respectively. The improved model reduces missed detections and enhances target localization, providing theoretical support for subsequent applications in intelligent, dosage-based disease management. Full article
(This article belongs to the Section Precision and Digital Agriculture)

22 pages, 26214 KB  
Article
SwinInsSeg: An Improved SOLOv2 Model Using the Swin Transformer and a Multi-Kernel Attention Module for Ship Instance Segmentation
by Rabi Sharma, Muhammad Saqib, Chin-Teng Lin and Michael Blumenstein
Mathematics 2025, 13(1), 165; https://doi.org/10.3390/math13010165 - 5 Jan 2025
Cited by 1 | Viewed by 2692
Abstract
Maritime surveillance is essential for ensuring security in the complex marine environment. This study presents SwinInsSeg, an instance segmentation model that combines the Swin transformer and a lightweight multi-kernel attention (MKA) module to segment ships accurately and efficiently in maritime surveillance. Current models have limitations in segmenting multiscale ships and achieving accurate segmentation boundaries. SwinInsSeg addresses these limitations by identifying ships of various sizes and capturing finer details, including both small and large ships, through the MKA module, which emphasizes important information at different processing stages. Performance evaluations on the MariBoats and ShipInsSeg datasets show that SwinInsSeg outperforms YOLACT, SOLO, and SOLOv2, achieving mask average precision scores of 50.6% and 52.0%, respectively. These results demonstrate SwinInsSeg’s superior capability in segmenting ship instances with improved accuracy. Full article
(This article belongs to the Special Issue New Advances and Applications in Image Processing and Computer Vision)

22 pages, 5767 KB  
Article
Radar Signal Sorting Method with Mimetic Image Mapping Based on Antenna Scan Pattern via SOLOv2 Network
by Tao Chen, Xiaoqi Guo and Jinxin Li
Remote Sens. 2024, 16(24), 4639; https://doi.org/10.3390/rs16244639 - 11 Dec 2024
Cited by 3 | Viewed by 2196
Abstract
The traditional radar signal sorting method relies heavily on manual experience and adapts poorly to new scenarios. To address these problems, and considering the differences in received power caused by radar beam scanning under long-term observation, an end-to-end signal sorting method based on the SOLOv2 instance segmentation network and an antenna scan pattern (ASP) is proposed in this letter. Firstly, interleaved pulse sequences of multiple radar signals with various inter-pulse modulation types, scan patterns, and gain patterns are simulated; a mimetic image mapping is constructed to visualize the interleaved pulse sequences as mimetic point graphs; and the index relationship between pulses and pixel points is recorded. Subsequently, the SOLOv2 instance segmentation network segments the mimetic point graph at the pixel level, thereby clustering the discrete pixel points in the image. Finally, based on the index relationship recorded during the construction of the mimetic image mapping, the clustering results of points in the image are traced back to clusters of pulses, achieving end-to-end intelligent radar signal sorting. Simulation experiments verified that, compared with YOLOv8-based, U-Net-based, and traditional signal sorting methods, the sorting accuracy of the proposed method increased by 9.26%, 11.17%, and 24.55%, respectively, in the scenario of five signals with a 30% missing pulse ratio (MPR), and by 13.33%, 18.88%, and 23.94% in the scenario of five signals with a 30% spurious pulse ratio (SPR). The results show that by introducing a stable parameter, namely the ASP, the proposed method can achieve signal sorting with highly overlapping parameters and adapt to non-ideal conditions with measurement errors, missing pulses, and spurious pulses. Full article
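The key bookkeeping step in the abstract above is the recorded index relationship between pulses and pixel points, which lets pixel-level clusters from the segmentation network be traced back to clusters of pulses. A minimal sketch of that traceback is shown below; the data layout and function names are illustrative assumptions, not from the paper.

```python
# Sketch of pulse-to-pixel index traceback (illustrative data layout).
from collections import defaultdict

def build_index(pulse_to_pixel):
    """Invert a {pulse_id: (row, col)} mapping to {pixel: [pulse_ids]}."""
    pixel_to_pulses = defaultdict(list)
    for pulse_id, pixel in pulse_to_pixel.items():
        pixel_to_pulses[pixel].append(pulse_id)
    return pixel_to_pulses

def trace_back(mask_of_pixel, pixel_to_pulses):
    """Group pulses by the instance their pixel was assigned to.

    mask_of_pixel: {(row, col): instance_id} produced by the segmenter.
    Returns {instance_id: sorted list of pulse ids}.
    """
    clusters = defaultdict(list)
    for pixel, pulses in pixel_to_pulses.items():
        instance = mask_of_pixel.get(pixel)
        if instance is not None:
            clusters[instance].extend(pulses)
    return {k: sorted(v) for k, v in clusters.items()}

# Toy example: four pulses from two emitters; pulses 0 and 3 share a pixel.
pulse_to_pixel = {0: (1, 1), 1: (1, 2), 2: (5, 5), 3: (1, 1)}
mask = {(1, 1): "A", (1, 2): "A", (5, 5): "B"}
print(trace_back(mask, build_index(pulse_to_pixel)))
# {'A': [0, 1, 3], 'B': [2]}
```

Note that several pulses can land on the same pixel of the mimetic point graph, so the index must map each pixel to a list of pulses rather than a single pulse.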

26 pages, 100117 KB  
Article
Enhanced Atrous Spatial Pyramid Pooling Feature Fusion for Small Ship Instance Segmentation
by Rabi Sharma, Muhammad Saqib, C. T. Lin and Michael Blumenstein
J. Imaging 2024, 10(12), 299; https://doi.org/10.3390/jimaging10120299 - 21 Nov 2024
Cited by 3 | Viewed by 3135
Abstract
In the maritime environment, the instance segmentation of small ships is crucial. Small ships are characterized by their limited appearance detail, smaller size, and distant locations in marine scenes. However, existing instance segmentation algorithms often fail to detect and segment them, resulting in inaccurate ship segmentation. To address this, we propose a novel solution called enhanced Atrous Spatial Pyramid Pooling (ASPP) feature fusion for small ship instance segmentation. The enhanced ASPP feature fusion module focuses on small objects by refining them and fusing important features. The framework consistently outperforms state-of-the-art models, including Mask R-CNN, Cascade Mask R-CNN, YOLACT, SOLO, and SOLOv2, on three diverse datasets, achieving average precision (mask AP) scores of 75.8% on ShipSG, 69.5% on ShipInsSeg, and 54.5% on MariBoats. Full article
(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)

25 pages, 9183 KB  
Article
A High-Accuracy Contour Segmentation and Reconstruction of a Dense Cluster of Mushrooms Based on Improved SOLOv2
by Shuzhen Yang, Jingmin Zhang and Jin Yuan
Agriculture 2024, 14(9), 1646; https://doi.org/10.3390/agriculture14091646 - 20 Sep 2024
Cited by 9 | Viewed by 2091
Abstract
This study addresses challenges related to imprecise edge segmentation and low center point accuracy, particularly when mushrooms are heavily occluded or deformed within dense clusters. A high-precision mushroom contour segmentation algorithm is proposed that builds upon the improved SOLOv2, along with a contour reconstruction method using instance segmentation masks. The enhanced segmentation algorithm, PR-SOLOv2, incorporates the PointRend module during the up-sampling stage, introducing fine features and enhancing segmentation details. This addresses the difficulty of accurately segmenting densely overlapping mushrooms. Furthermore, a contour reconstruction method based on the PR-SOLOv2 instance segmentation mask is presented. This approach accurately segments mushrooms, extracts individual mushroom masks and their contour data, and classifies reconstruction contours based on average curvature and length. Regular contours are fitted using least-squares ellipses, while irregular ones are reconstructed by extracting the longest sub-contour from the original irregular contour based on its corners. Experimental results demonstrate strong generalization and superior performance in contour segmentation and reconstruction, particularly for densely clustered mushrooms in complex environments. The proposed approach achieves a 93.04% segmentation accuracy and a 98.13% successful segmentation rate, surpassing Mask R-CNN and YOLACT by approximately 10%. The center point positioning accuracy of mushrooms is 0.3%. This method better meets the high positioning requirements for efficient and non-destructive picking of densely clustered mushrooms. Full article

18 pages, 18674 KB  
Article
An Improved Instance Segmentation Method for Complex Elements of Farm UAV Aerial Survey Images
by Feixiang Lv, Taihong Zhang, Yunjie Zhao, Zhixin Yao and Xinyu Cao
Sensors 2024, 24(18), 5990; https://doi.org/10.3390/s24185990 - 15 Sep 2024
Cited by 2 | Viewed by 1921
Abstract
Farm aerial survey layers can assist in unmanned farm operations, such as planning paths and early warnings. To address the inefficiencies and high costs associated with traditional layer construction, this study proposes a high-precision instance segmentation algorithm based on SparseInst. Considering the structural characteristics of farm elements, this study introduces a multi-scale attention module (MSA) that leverages the properties of atrous convolution to expand the sensory field. It enhances spatial and channel feature weights, effectively improving segmentation accuracy for large-scale and complex targets in the farm through three parallel dense connections. A bottom-up aggregation path is added to the feature pyramid fusion network, enhancing the model’s ability to perceive complex targets such as mechanized trails in farms. Coordinate attention blocks (CAs) are incorporated into the neck to capture richer contextual semantic information, enhancing farm aerial imagery scene recognition accuracy. To assess the proposed method, we compare it against existing mainstream object segmentation models, including the Mask R-CNN, Cascade–Mask, SOLOv2, and Condinst algorithms. The experimental results show that the improved model proposed in this study can be adapted to segment various complex targets in farms. The accuracy of the improved SparseInst model greatly exceeds that of Mask R-CNN and Cascade–Mask and is 10.8 and 12.8 percentage points better than the average accuracy of SOLOv2 and Condinst, respectively, with the smallest number of model parameters. The results show that the model can be used for real-time segmentation of targets under complex farm conditions. Full article
(This article belongs to the Section Intelligent Sensors)

15 pages, 9901 KB  
Article
Segmentation Method of Zanthoxylum bungeanum Cluster Based on Improved Mask R-CNN
by Zhiyong Zhang, Shuo Wang, Chen Wang, Li Wang, Yanqing Zhang and Haiyan Song
Agriculture 2024, 14(9), 1585; https://doi.org/10.3390/agriculture14091585 - 12 Sep 2024
Cited by 4 | Viewed by 1325
Abstract
The precise segmentation of Zanthoxylum bungeanum clusters is crucial for developing picking robots. An improved Mask R-CNN model was proposed in this study for the segmentation of Zanthoxylum bungeanum clusters in natural environments. Firstly, the Swin-Transformer network was introduced into the model’s backbone as the feature extraction network to enhance the model’s feature extraction capabilities. Then, the SK attention mechanism was utilized to fuse the detailed information into the mask branch from the low-level feature map of the feature pyramid network (FPN), aiming to supplement the image detail features. Finally, the distance intersection over union (DIOU) loss function was adopted to replace the original bounding box loss function of Mask R-CNN. The model was trained and tested based on a self-constructed Zanthoxylum bungeanum cluster dataset. Experiments showed that the improved Mask R-CNN model achieved 84.0% and 77.2% in detection mAP50 (box) and segmentation mAP50 (mask), respectively, representing 5.8% and 4.6% improvements over the baseline Mask R-CNN model. In comparison to conventional instance segmentation models, such as YOLACT, Mask Scoring R-CNN, and SOLOv2, the improved Mask R-CNN model also exhibited higher segmentation precision. This study can provide valuable technology support for the development of Zanthoxylum bungeanum picking robots. Full article
(This article belongs to the Section Agricultural Technology)

22 pages, 18614 KB  
Article
Visual Localization Method for Unmanned Aerial Vehicles in Urban Scenes Based on Shape and Spatial Relationship Matching of Buildings
by Yu Liu, Jing Bai and Fangde Sun
Remote Sens. 2024, 16(16), 3065; https://doi.org/10.3390/rs16163065 - 20 Aug 2024
Cited by 4 | Viewed by 2293
Abstract
In urban scenes, buildings are usually dense and exhibit similar shapes. Thus, existing autonomous unmanned aerial vehicle (UAV) localization schemes based on map matching, especially the semantic shape matching (SSM) method, cannot capture the uniqueness of buildings and may result in matching failure. To solve this problem, we propose a new method to locate UAVs via shape and spatial relationship matching (SSRM) of buildings in urban scenes as an alternative to UAV localization via image matching. SSRM first extracts individual buildings from UAV images using the SOLOv2 instance segmentation algorithm. Then, these individual buildings are subsequently matched with vector e-map data (stored in .shp format) based on their shape and spatial relationship to determine their actual latitude and longitude. Control points are generated according to the matched buildings, and finally, the UAV position is determined. SSRM can efficiently realize high-precision UAV localization in urban scenes. Under the verification of actual data, SSRM achieves localization errors of 7.38 m and 11.92 m in downtown and suburb areas, respectively, with better localization performance than the radiation-variation insensitive feature transform (RIFT), channel features of the oriented gradient (CFOG), and SSM algorithms. Moreover, the SSRM algorithm exhibits a smaller localization error in areas with higher building density. Full article

13 pages, 2169 KB  
Article
Road Scene Instance Segmentation Based on Improved SOLOv2
by Qing Yang, Jiansheng Peng, Dunhua Chen and Hongyu Zhang
Electronics 2023, 12(19), 4169; https://doi.org/10.3390/electronics12194169 - 8 Oct 2023
Cited by 8 | Viewed by 3068
Abstract
Road instance segmentation is vital for autonomous driving, yet the current algorithms struggle in complex city environments, with issues like poor small object segmentation, low-quality mask edge contours, slow processing, and limited model adaptability. This paper introduces an enhanced instance segmentation method based on SOLOv2. It integrates the Bottleneck Transformer (BoT) module into VoVNetV2, replacing the standard convolutions with ghost convolutions. Additionally, it replaces ResNet with an improved VoVNetV2 backbone to enhance the feature extraction and segmentation speed. Furthermore, the algorithm employs Feature Pyramid Grids (FPGs) instead of Feature Pyramid Networks (FPNs) to introduce multi-directional lateral connections for better feature fusion. Lastly, it incorporates a Convolutional Block Attention Module (CBAM) into the detection head for refined features by considering the attention weight coefficients in both the channel and spatial dimensions. The experimental results demonstrate the algorithm’s effectiveness, achieving a 27.6% mAP on Cityscapes, a 4.2% improvement over SOLOv2. It also attains a segmentation speed of 8.9 FPS, a 1.7 FPS increase over SOLOv2, confirming its practicality for real-world engineering applications. Full article
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images)

15 pages, 3063 KB  
Article
Detection of Respiratory Rate of Dairy Cows Based on Infrared Thermography and Deep Learning
by Kaixuan Zhao, Yijie Duan, Junliang Chen, Qianwen Li, Xing Hong, Ruihong Zhang and Meijia Wang
Agriculture 2023, 13(10), 1939; https://doi.org/10.3390/agriculture13101939 - 4 Oct 2023
Cited by 15 | Viewed by 4059
Abstract
The respiratory status of dairy cows can reflect their heat stress and health conditions. It is widely used in the precision farming of dairy cows. To realize intelligent monitoring of cow respiratory status, a system based on infrared thermography was constructed. First, the YOLO v8 model was used to detect and track the nose of cows in thermal images. Three instance segmentation models, Mask2Former, Mask R-CNN, and SOLOv2, were used to segment the nostrils from the nose area. Second, the hash algorithm was used to extract the temperature of each pixel in the nostril area of a cow to obtain the temperature change curve. Finally, the sliding window approach was used to detect the peaks of the filtered temperature curve to obtain the respiratory rate of cows. In total, 81 infrared thermography videos were used to test the system, and the results showed that the AP50 of nose detection reached 98.6%, and the AP50 of nostril segmentation reached 75.71%. The accuracy of the respiratory rate was 94.58%, and the correlation coefficient R was 0.95. Combining infrared thermography technology with deep learning models can improve the accuracy and usability of the respiratory monitoring system for dairy cows. Full article
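The final step described in the abstract above — detecting peaks in the filtered nostril-temperature curve with a sliding window to obtain the respiratory rate — can be sketched as follows. The window size, sampling rate, and breaths-per-minute conversion below are illustrative assumptions, not the paper's parameters.

```python
# Sketch of sliding-window peak counting on a nostril-temperature curve.

def count_peaks(signal, half_window=2):
    """Return indices where a sample is the strict maximum of its window."""
    peaks = []
    for i in range(half_window, len(signal) - half_window):
        window = signal[i - half_window : i + half_window + 1]
        if signal[i] == max(window) and window.count(signal[i]) == 1:
            peaks.append(i)
    return peaks

def respiratory_rate(signal, fps, half_window=2):
    """Breaths per minute, assuming one temperature peak per breath cycle."""
    n_peaks = len(count_peaks(signal, half_window))
    duration_min = len(signal) / fps / 60.0
    return n_peaks / duration_min if duration_min > 0 else 0.0

# Toy temperature trace (in °C) with three exhalation peaks.
trace = [30.0, 30.5, 31.2, 30.6, 30.1, 30.4, 31.3, 30.7,
         30.2, 30.5, 31.1, 30.6, 30.0]
print(len(count_peaks(trace)))  # 3
```

In practice the curve would first be smoothed (the abstract mentions a filtered temperature curve) so that sensor noise does not produce spurious local maxima.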
