Search Results (19)

Search Parameters:
Keywords = varifocal loss

15 pages, 2497 KiB  
Article
The Research on an Improved YOLOX-Based Algorithm for Small-Object Road Vehicle Detection
by Zhixun Liu and Zhenyou Zhang
Electronics 2025, 14(11), 2179; https://doi.org/10.3390/electronics14112179 - 27 May 2025
Cited by 1 | Viewed by 478
Abstract
To address the challenges of missed detections and false positives caused by dense vehicle distribution, occlusions, and small object sizes in complex traffic scenarios, this paper proposes an improved YOLOX-based vehicle detection algorithm with three key innovations. First, we design a novel Wavelet-Enhanced Convolution (WEC) module that expands the receptive field to enhance the model’s global perception capability. Building upon this foundation, we integrate the SimAM attention mechanism, which improves feature saturation by adaptively fusing semantic features across different channels and spatial locations, thereby strengthening the network’s multi-scale generalization ability. Furthermore, we develop a Varifocal Intersection over Union (VIoU) bounding-box regression loss function that optimizes convergence in multi-scale feature learning while enhancing global feature extraction capabilities. The experimental results on the VisDrone dataset demonstrate that our improved model achieves performance gains of 0.9% mAP and 1.8% mAP75 compared to the baseline version, effectively improving vehicle detection accuracy. Full article

16 pages, 3776 KiB  
Article
MDA-DETR: Enhancing Offending Animal Detection with Multi-Channel Attention and Multi-Scale Feature Aggregation
by Haiyan Zhang, Huiqi Li, Guodong Sun and Feng Yang
Animals 2025, 15(2), 259; https://doi.org/10.3390/ani15020259 - 17 Jan 2025
Cited by 1 | Viewed by 1173
Abstract
Conflicts between humans and animals in agricultural and settlement areas have recently increased, resulting in significant resource loss and risks to human and animal lives. This growing issue presents a global challenge. This paper addresses the detection and identification of offending animals, particularly in obscured or blurry nighttime images. This article introduces Multi-Channel Coordinated Attention and Multi-Dimension Feature Aggregation (MDA-DETR). It integrates multi-scale features for enhanced detection accuracy, employing a Multi-Channel Coordinated Attention (MCCA) mechanism to incorporate location, semantic, and long-range dependency information and a Multi-Dimension Feature Aggregation Module (DFAM) for cross-scale feature aggregation. Additionally, the VariFocal Loss function is utilized to assign pixel weights, enhancing detail focus and maintaining accuracy. In the dataset section, this article uses a dataset from the Northeast China Tiger and Leopard National Park, which includes images of six common offending animal species. In the comprehensive experiments on the dataset, the mAP50 index of MDA-DETR was 1.3%, 0.6%, 0.3%, 3%, 1.1%, and 0.5% higher than RT-DETR-r18, yolov8n, yolov9-C, DETR, Deformable-detr, and DCA-yolov8, respectively, indicating that MDA-DETR is superior to other advanced methods. Full article
(This article belongs to the Special Issue Animal–Computer Interaction: Advances and Opportunities)
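VariFocal Loss, used here to reweight pixels, down-weights easy negatives asymmetrically while keeping full binary cross-entropy on positives. A minimal NumPy sketch of the published formulation (the function name, the default α and γ, and the toy scores are illustrative, not from this paper):

```python
import numpy as np

def varifocal_loss(p, q, alpha=0.75, gamma=2.0, eps=1e-12):
    """Varifocal Loss on predicted scores p and targets q.

    q > 0 marks a positive (q is the IoU-aware target score); q == 0 a
    negative. Positives use plain BCE weighted by q; negatives are
    down-weighted by alpha * p**gamma so easy negatives contribute little.
    """
    p = np.clip(p, eps, 1 - eps)
    bce = -(q * np.log(p) + (1 - q) * np.log(1 - p))
    weight = np.where(q > 0, q, alpha * p ** gamma)
    return weight * bce

# A confident easy negative (low p) is damped far more than a hard one.
easy_neg = varifocal_loss(np.array([0.05]), np.array([0.0]))
hard_neg = varifocal_loss(np.array([0.9]), np.array([0.0]))
```

The asymmetry is the point: negatives are focally damped, positives are not, so rare high-quality positives keep their full gradient.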

22 pages, 10887 KiB  
Article
Research on a UAV-View Object-Detection Method Based on YOLOv7-Tiny
by Yuyang Miao, Xihan Wang, Ning Zhang, Kai Wang, Lianhe Shao and Quanli Gao
Appl. Sci. 2024, 14(24), 11929; https://doi.org/10.3390/app142411929 - 20 Dec 2024
Cited by 1 | Viewed by 1229
Abstract
To address the issues of missed and false detections caused by small object sizes, dense object distribution, and complex scenes in drone aerial images, this study proposes a drone-view object-detection algorithm based on YOLOv7-tiny with a Partial_C_Detect detection head. The algorithm’s performance in handling object occlusion and multi-scale detection is enhanced by introducing the VarifocalLoss loss function and improving the feature fusion network to BiFPN. Furthermore, incorporating the novel Partial_C_Detect detection head and Adaptive Kernel Convolution (AKConv) improves the detection capabilities for small and dynamically changing objects. In addition, introducing the Dilated Weighted Residual (DWR) attention module optimizes the information processing flow, enhancing the algorithm’s ability to capture key information, especially in complex backgrounds. These enhancements collectively enable the model to balance high detection accuracy and computational efficiency, making it well-suited for resource-constrained UAV platforms. Experiments conducted on the VisDrone2019 dataset show that the improved algorithm achieves a mAP@0.5 of 38.2%, with a model size of 29.01 MB and a computational complexity of 16.2 G. Compared to the original YOLOv7-tiny algorithm, the mAP@0.5 improves by 2.9%, and the algorithm performs better in other key performance metrics, demonstrating its adaptability and robustness in drone aerial image object-detection tasks. Full article
(This article belongs to the Special Issue Advanced Pattern Recognition & Computer Vision)
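BiFPN's characteristic operation is fast normalized fusion: each incoming feature map gets a learned non-negative scalar weight, normalized so the weights sum to roughly one. A NumPy sketch of that published formulation (the weights here are hand-picked stand-ins for learned parameters):

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """BiFPN-style fusion: scale each input map by a non-negative weight,
    normalized so the weights sum to ~1 (the eps avoids division by zero)."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # ReLU keeps w >= 0
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, features))

a = np.ones((4, 4))
b = np.full((4, 4), 3.0)
fused = fast_normalized_fusion([a, b], weights=[1.0, 1.0])  # ~elementwise mean
```

Compared with softmax-normalized fusion, this form is cheaper and, per the EfficientDet authors, trains just as well.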

16 pages, 6553 KiB  
Article
Cucumber Leaf Segmentation Based on Bilayer Convolutional Network
by Tingting Qian, Yangxin Liu, Shenglian Lu, Linyi Li, Xiuguo Zheng, Qingqing Ju, Yiyang Li, Chun Xie and Guo Li
Agronomy 2024, 14(11), 2664; https://doi.org/10.3390/agronomy14112664 - 12 Nov 2024
Cited by 1 | Viewed by 1437
Abstract
When monitoring crop growth using top-down images of the plant canopies, leaves in agricultural fields appear very dense and significantly overlap each other. Moreover, the image can be affected by external conditions such as background environment and light intensity, impacting the effectiveness of image segmentation. To address the challenge of segmenting dense and overlapping plant leaves under natural lighting conditions, this study employed a Bilayer Convolutional Network (BCNet) method for accurate leaf segmentation across various lighting environments. The major contributions of this study are as follows: (1) Utilized Fully Convolutional One-Stage object detection (FCOS) for plant leaf detection, incorporating ResNet-50 with the Convolutional Block Attention Module (CBAM) and Feature Pyramid Network (FPN) to enhance Region of Interest (RoI) feature extraction from canopy top-view images. (2) Extracted the sub-region of the RoI based on the position of the detection box, using this region as input for the BCNet, ensuring precise segmentation. (3) Performed instance segmentation of canopy top-view images using BCNet, improving segmentation accuracy. (4) Applied the Varifocal Loss function to improve the classification loss in FCOS, leading to better performance metrics. The experimental results on cucumber canopy top-view images captured in glass greenhouse and plastic greenhouse environments show that our method is highly effective. For cucumber leaves at different growth stages and under various lighting conditions, the Precision, Recall and Average Precision (AP) metrics for object recognition are 97%, 94% and 96.57%, respectively. For instance segmentation, the corresponding metrics are 87%, 83% and 84.71%. Our algorithm outperforms commonly used deep learning algorithms such as Faster R-CNN, Mask R-CNN, YOLOv4 and PANet, showcasing its superior capability in complex agricultural settings. The results of this study demonstrate the potential of our method for accurate recognition and segmentation of highly overlapping leaves in diverse agricultural environments, significantly contributing to the application of deep learning algorithms in smart agriculture. Full article
(This article belongs to the Special Issue AI, Sensors and Robotics for Smart Agriculture—2nd Edition)

17 pages, 7240 KiB  
Article
YOLO-BFRV: An Efficient Model for Detecting Printed Circuit Board Defects
by Jiaxin Liu, Bingyu Kang, Chao Liu, Xunhui Peng and Yan Bai
Sensors 2024, 24(18), 6055; https://doi.org/10.3390/s24186055 - 19 Sep 2024
Cited by 3 | Viewed by 2584
Abstract
The small area of a printed circuit board (PCB) results in densely distributed defects, leading to a lower detection accuracy, which subsequently impacts the safety and stability of the circuit board. This paper proposes a new YOLO-BFRV network model based on the improved YOLOv8 framework to identify PCB defects more efficiently and accurately. First, a bidirectional feature pyramid network (BIFPN) is introduced to expand the receptive field of each feature level and enrich the semantic information to improve the feature extraction capability. Second, the YOLOv8 backbone network is refined into a lightweight FasterNet network, reducing the computational load while improving the detection accuracy of minor defects. Subsequently, the high-speed re-parameterized detection head (RepHead) reduces inference complexity and boosts the detection speed without compromising accuracy. Finally, the VarifocalLoss is employed to enhance the detection accuracy for densely distributed PCB defects. The experimental results demonstrate that the improved model increases the mAP by 4.12% compared to the benchmark YOLOv8s model, boosts the detection speed by 45.89%, and reduces the GFLOPs by 82.53%, further confirming the superiority of the algorithm presented in this paper. Full article
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)
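The internals of RepHead are not given in the abstract, but the basic re-parameterization step this family of heads builds on is folding a BatchNorm into the preceding convolution, so inference runs a single fused layer. A NumPy sketch for a 1×1 convolution treated as a matrix multiply (all values are toy numbers):

```python
import numpy as np

def fuse_conv_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN(y) = gamma * (y - mean) / sqrt(var + eps) + beta into the
    conv y = w @ x + b, yielding one equivalent weight and bias."""
    std = np.sqrt(var + eps)
    w_fused = w * (gamma / std)[:, None]
    b_fused = gamma * (b - mean) / std + beta
    return w_fused, b_fused

w = np.array([[1.0, 2.0], [0.5, -1.0]])
b = np.array([0.1, 0.2])
gamma, beta = np.array([1.5, 0.5]), np.array([0.0, 1.0])
mean, var = np.array([0.3, -0.2]), np.array([1.0, 4.0])
x = np.array([0.7, -1.2])

# Unfused reference: conv followed by BatchNorm.
reference = gamma * ((w @ x + b) - mean) / np.sqrt(var + 1e-5) + beta
w_f, b_f = fuse_conv_bn(w, b, gamma, beta, mean, var)
```

The fused layer is mathematically identical at inference time, which is where re-parameterized heads get their speedup without losing accuracy.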

15 pages, 5989 KiB  
Article
Instance Segmentation of Lentinus edodes Images Based on YOLOv5seg-BotNet
by Xingmei Xu, Xiangyu Su, Lei Zhou, Helong Yu and Jian Zhang
Agronomy 2024, 14(8), 1808; https://doi.org/10.3390/agronomy14081808 - 16 Aug 2024
Viewed by 1351
Abstract
The shape and quantity of Lentinus edodes (commonly known as shiitake) fruiting bodies significantly affect their quality and yield, so accurate and rapid segmentation of these fruiting bodies is crucial for quality grading and yield prediction. This study proposed YOLOv5seg-BotNet, a model for the instance segmentation of Lentinus edodes, and investigated its application in the mushroom industry. First, the backbone network was replaced with BoTNet, and the spatial convolutions in the local backbone network were replaced with global self-attention modules to enhance the feature extraction ability. Subsequently, PANet was adopted to effectively manage and integrate Lentinus edodes images in complex backgrounds at various scales. Finally, the Varifocal Loss function was employed to adjust the weights of different samples, addressing the issues of missed segmentation and mis-segmentation. The enhanced model demonstrated improvements in precision, recall, Mask_AP, F1-Score, and FPS, achieving 97.58%, 95.74%, 95.90%, 96.65%, and 32.86 frames per second, respectively. These values represent increases of 2.37%, 4.55%, 4.56%, 3.50%, and 2.61% over the original model. The model achieved dual improvements in segmentation accuracy and speed, exhibiting excellent detection and segmentation performance on Lentinus edodes fruiting bodies. This study provides technical fundamentals for the future application of image detection and decision-making processes in evaluating mushroom production, including quality grading and intelligent harvesting. Full article

22 pages, 4810 KiB  
Article
Ship Target Detection in Optical Remote Sensing Images Based on E2YOLOX-VFL
by Qichang Zhao, Yiquan Wu and Yubin Yuan
Remote Sens. 2024, 16(2), 340; https://doi.org/10.3390/rs16020340 - 15 Jan 2024
Cited by 12 | Viewed by 2809
Abstract
In this research, E2YOLOX-VFL is proposed as a novel approach to address the challenges of optical image multi-scale ship detection and recognition in complex maritime and land backgrounds. Firstly, the typical anchor-free network YOLOX is utilized as the baseline network for ship detection. Secondly, the Efficient Channel Attention module is incorporated into the YOLOX Backbone network to enhance the model’s capability to extract information from objects of different scales, such as large, medium, and small, thus improving ship detection performance in complex backgrounds. Thirdly, we propose the Efficient Force-IoU (EFIoU) Loss function as a replacement for the Intersection over Union (IoU) Loss, addressing the issue whereby IoU Loss only considers the intersection and union between the ground truth boxes and the predicted boxes, without taking into account the size and position of targets. This also considers the disadvantageous effects of low-quality samples, resulting in inaccuracies in measuring target similarity, and improves the regression performance of the algorithm. Fourthly, the confidence loss function is improved. Specifically, Varifocal Loss is employed instead of CE Loss, effectively handling the positive and negative sample imbalance, challenging samples, and class imbalance, enhancing the overall detection performance of the model. Then, we propose Balanced Gaussian NMS (BG-NMS) to solve the problem of missed detection caused by the occlusion of dense targets. Finally, the E2YOLOX-VFL algorithm is tested on the HRSC2016 dataset, achieving a 9.28% improvement in mAP compared to the baseline YOLOX algorithm. Moreover, the detection performance using BG-NMS is also analyzed, and the experimental results validate the effectiveness of the E2YOLOX-VFL algorithm. Full article
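The abstract does not give the formula for Balanced Gaussian NMS; the standard Gaussian soft-NMS it presumably extends decays, rather than deletes, the scores of boxes that overlap the current top detection. A NumPy sketch (box coordinates and scores are made-up values):

```python
import numpy as np

def iou_one_to_many(box, boxes):
    # Boxes are [x1, y1, x2, y2]; returns IoU of `box` against each row.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def gaussian_soft_nms(boxes, scores, sigma=0.5, score_thr=0.001):
    """Keep the top-scoring box, then decay overlapping scores by
    exp(-IoU**2 / sigma) instead of discarding them outright, which helps
    retain occluded, densely packed targets."""
    scores = scores.astype(float).copy()
    idx = np.arange(len(scores))
    keep = []
    while idx.size:
        top = idx[np.argmax(scores[idx])]
        keep.append(top)
        idx = idx[idx != top]
        if idx.size:
            decay = np.exp(-iou_one_to_many(boxes[top], boxes[idx]) ** 2 / sigma)
            scores[idx] *= decay
            idx = idx[scores[idx] > score_thr]
    return keep, scores

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
keep, final = gaussian_soft_nms(boxes, scores)
```

The heavily overlapping second box survives with a reduced score instead of being suppressed, which is exactly the behaviour wanted for dense ship targets.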

17 pages, 10763 KiB  
Article
YOLO-Crater Model for Small Crater Detection
by Lingli Mu, Lina Xian, Lihong Li, Gang Liu, Mi Chen and Wei Zhang
Remote Sens. 2023, 15(20), 5040; https://doi.org/10.3390/rs15205040 - 20 Oct 2023
Cited by 15 | Viewed by 4660
Abstract
Craters are the most prominent geomorphological features on the surface of celestial bodies and play a crucial role in studying the formation and evolution of celestial bodies, as well as in landing and planning for surface exploration. Currently, the main automatic crater detection models and datasets focus on the detection of large and medium craters. In this paper, we created 23 small lunar crater datasets for model training based on the Chang’E-2 (CE-2) DOM, DEM, Slope, and integrated data with 7 kinds of visualization stretching methods. Then, we proposed the YOLO-Crater model for Lunar and Martian small crater detection, adopting EIoU and VariFocal loss to address the crater sample imbalance problem and introducing a CBAM attention mechanism to mitigate interference from the complex extraterrestrial environment. The results show that the accuracy (P = 87.86%, R = 66.04%, and F1 = 75.41%) of the Lunar YOLO-Crater model based on the DOM-MMS (Maximum-Minimum Stretching) dataset is the highest, outperforming the YOLOX model. The Martian YOLO-Crater, trained on the Martian dataset from the 2022 GeoAI Martian Challenge, achieves good performance with P = 88.37%, R = 69.25%, and F1 = 77.65%. This indicates that the YOLO-Crater model has strong transferability and generalization capability and can be applied to detect small craters on the Moon and other celestial bodies. Full article
(This article belongs to the Special Issue Laser and Optical Remote Sensing for Planetary Exploration)
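The DOM-MMS dataset above is the DOM imagery rendered with maximum-minimum stretching, which is presumably a linear rescale of the raster onto the display range. A one-function NumPy sketch of that reading (the sample elevation values are invented):

```python
import numpy as np

def minmax_stretch(img, out_max=255):
    """Linearly rescale pixel values to [0, out_max] (maximum-minimum
    stretch); a constant image maps to all zeros."""
    img = img.astype(float)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)
    return (img - lo) / (hi - lo) * out_max

dem = np.array([[100.0, 150.0], [200.0, 300.0]])
stretched = minmax_stretch(dem)
```

Stretches like this only change the visualization, not the underlying data, which is why the paper can compare seven of them as alternative training inputs.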

19 pages, 11706 KiB  
Article
SE-YOLOv7 Landslide Detection Algorithm Based on Attention Mechanism and Improved Loss Function
by Qing Liu, Tingting Wu, Yahong Deng and Zhiheng Liu
Land 2023, 12(8), 1522; https://doi.org/10.3390/land12081522 - 31 Jul 2023
Cited by 18 | Viewed by 3350
Abstract
With the continuous development of computer vision technology, more and more landslide identification and detection tasks have begun to shift from manual visual interpretation to automatic computer identification, and automatic landslide detection methods based on remote sensing satellite images and deep learning have gradually been developed. However, most existing algorithms suffer from low precision and weak generalization in landslide detection. Based on the Google Earth Engine platform, this study selected landslide image data from 24 study areas in China and established the DN landslide sample dataset, which contains a total of 1440 landslide samples. The original YOLOv7 algorithm was improved and optimized by applying the SE (squeeze-and-excitation) attention mechanism and the VariFocal loss function to construct the SE-YOLOv7 model and realize the automatic detection of landslides in remote sensing images. The experimental results show that the mAP, Precision, Recall, and F1-Score of the improved SE-YOLOv7 model for landslide identification are 91.15%, 93.35%, 94.54%, and 93.94%, respectively. At the same time, a field investigation and verification study in Qianyang County, Baoji City, Shaanxi Province, compared against the detection results of SE-YOLOv7, showed that the improved SE-YOLOv7 can locate landslides more accurately, delineate the landslide extent more precisely, and produce fewer missed detections. The research results show that the algorithm has strong detection accuracy for many types of landslide image data, providing a technical reference for future research on landslide detection based on remote sensing images. Full article
(This article belongs to the Special Issue Remote Sensing Application in Landslide Detection and Assessment)
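The SE (squeeze-and-excitation) mechanism applied here pools each channel to a scalar, passes that vector through a small two-layer bottleneck, and rescales the channels by the resulting sigmoid gates. A NumPy sketch with random matrices standing in for the learned weights (shapes and the reduction ratio are illustrative):

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Squeeze-and-Excitation on a (C, H, W) tensor: global-average-pool
    each channel, apply a two-layer bottleneck (ReLU then sigmoid), and
    rescale the channels by the resulting gates in (0, 1)."""
    z = x.mean(axis=(1, 2))                 # squeeze: (C,)
    s = np.maximum(w1 @ z, 0.0)             # excitation layer 1 + ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ s)))     # layer 2 + sigmoid: (C,)
    return x * s[:, None, None]             # channel-wise rescale

C, r = 4, 2  # channels and reduction ratio (toy values)
rng = np.random.default_rng(0)
x = rng.standard_normal((C, 8, 8))
w1 = rng.standard_normal((C // r, C))       # stand-in learned weights
w2 = rng.standard_normal((C, C // r))
y = squeeze_excite(x, w1, w2)
```

Because the gates stay in (0, 1), SE can only attenuate channels; the network learns which channels to keep near full strength.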

18 pages, 6803 KiB  
Article
IO-YOLOv5: Improved Pig Detection under Various Illuminations and Heavy Occlusion
by Jiajun Lai, Yun Liang, Yingjie Kuang, Zhannan Xie, Hongyuan He, Yuxin Zhuo, Zekai Huang, Shijie Zhu and Zenghang Huang
Agriculture 2023, 13(7), 1349; https://doi.org/10.3390/agriculture13071349 - 4 Jul 2023
Cited by 10 | Viewed by 2270
Abstract
Accurate detection and counting of live pigs are integral to scientific breeding and production in intelligent agriculture. However, existing pig counting methods are challenged by heavy occlusion and varying illumination conditions. To overcome these challenges, we proposed IO-YOLOv5 (Illumination-Occlusion YOLOv5), an improved network that expands on the YOLOv5 framework with three key contributions. Firstly, we introduced the Simple Attention Receptive Field Block (SARFB) module to expand the receptive field and give greater weight to important features at different levels. The Ghost Spatial Pyramid Pooling Fast Cross Stage Partial Connections (GSPPFC) module was also introduced to enhance model feature reuse and information flow. Secondly, we optimized the loss function by using Varifocal Loss to improve the model’s learning ability on high-quality and challenging samples. Thirdly, we proposed a public dataset consisting of 1270 images and 15,672 pig labels. Experiments demonstrated that IO-YOLOv5 achieved a mean average precision (mAP) of 90.8% and a precision of 86.4%, surpassing the baseline model by 2.2% and 3.7% respectively. By using a model ensemble and test time augmentation, we further improved the mAP to 92.6%, which is a 4% improvement over the baseline model. Extensive experiments showed that IO-YOLOv5 exhibits excellent performance in pig recognition, particularly under heavy occlusion and various illuminations. These results provide a strong foundation for pig recognition in complex breeding environments. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

19 pages, 11500 KiB  
Article
Exploiting the Potential of Overlapping Cropping for Real-World Pedestrian and Vehicle Detection with Gigapixel-Level Images
by Chunlei Wang, Wenquan Feng, Binghao Liu, Xinyang Ling and Yifan Yang
Appl. Sci. 2023, 13(6), 3637; https://doi.org/10.3390/app13063637 - 13 Mar 2023
Cited by 4 | Viewed by 2571
Abstract
Pedestrian and vehicle detection is widely used in intelligent assisted driving, pedestrian counting, drone aerial photography, and other applications. Recently, with the development of gigacameras, gigapixel-level images have emerged. The large field of view and high resolution provide global and local information, which enables object detection in real-world scenarios. Although existing pedestrian and vehicle detection algorithms have achieved remarkable success for standard images, their methods are not suitable for ultra-high-resolution images. In order to improve the performance of existing pedestrian and vehicle detectors in real-world scenarios, we used a sliding window to crop the original images to solve this problem. When fusing the sub-images, we proposed a midline method to reduce the cropped objects that NMS could not eliminate. At the same time, we used varifocal loss to solve the imbalance between positive and negative samples caused by the high resolution. We also found that pedestrians and vehicles were separable in size and comprised more than one target type. As a result, we improved the detector performance with single-class object detection for pedestrians and vehicles, respectively. At the same time, we provided many useful strategies to improve the detector. The experimental results demonstrated that our method could improve the performance of real-world pedestrian and vehicle detection. Full article
(This article belongs to the Special Issue Intelligent Analysis and Image Recognition)
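The overlapping sliding-window step described above can be sketched as a pure-Python tiler: the overlap ensures an object cut by one tile boundary appears whole in a neighbour, and the last tile in each row and column is shifted back so it stays inside the image. Tile size and overlap below are illustrative; the paper's midline fusion of the per-tile detections is not reproduced here.

```python
def tile_image(width, height, tile, overlap):
    """Return (x, y) origins of overlapping tile-sized crops covering a
    width x height image (assumes tile <= width and tile <= height)."""
    stride = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, stride))
    if xs[-1] + tile < width:           # shift a final column flush right
        xs.append(width - tile)
    ys = list(range(0, max(height - tile, 0) + 1, stride))
    if ys[-1] + tile < height:          # shift a final row flush bottom
        ys.append(height - tile)
    return [(x, y) for y in ys for x in xs]

tiles = tile_image(1000, 600, tile=512, overlap=128)
```

At gigapixel scale the tile grid gets large, which is why duplicate suppression across tile seams (the paper's midline rule plus NMS) matters so much.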

12 pages, 2895 KiB  
Article
Mid-Infrared Continuous Varifocal Metalens with Adjustable Intensity Based on Phase Change Materials
by Liangde Shao, Kongsi Zhou, Fangfang Zhao, Yixiao Gao, Bingxia Wang and Xiang Shen
Photonics 2022, 9(12), 959; https://doi.org/10.3390/photonics9120959 - 9 Dec 2022
Cited by 3 | Viewed by 2713
Abstract
Metalenses can greatly reduce the complexity of imaging systems due to their small size and light weight and also provide a platform for the realization of multifunctional imaging devices. Achieving dynamic focus length tunability is highly important for metalens research. In this paper, based on single-crystal Ge and a new low-loss phase change material Ge2Sb2Se5 (GSSe), a tunable metalens formed by a double-layer metasurface composite was realized in the mid-infrared band. The first-layer metasurface formed by Ge nanopillars combines propagation and the geometric phase (equivalent to a half-wave plate function) to produce single- or multiple-polarization-dependent foci. The second-layer metasurface formed by GSSe nanopillars provides a tunable propagation phase, and the double-layer metalens can achieve the tunability of the focus length depending on the different crystalline fractions of GSSe. The focal length varies from 62.91 to 67.13 μm under right circularly polarized light incidence and from 33.84 to 36.66 μm under left circularly polarized light incidence. Despite the difference in the crystallographic fraction, the metalens’s focusing efficiency is maintained basically around 59% and 48% when zooming under RCP and LCP wave excitation. Meanwhile, the incident wave’s ellipticity can be changed to alter the relative intensity ratios of the bifocals from 0.03 to 4.26. This continuous varifocal metalens with adjustable intensity may have potential in practical applications such as optical tomography, multiple imaging, and systems of optical communication. Full article
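The focal lengths reported above follow from the phase profile the metasurface imposes. For a normally incident plane wave, the standard hyperbolic lens profile is φ(r) = (2π/λ)(f − √(r² + f²)). A NumPy sketch; the wavelength is an assumed mid-infrared value for illustration (the abstract does not state the design wavelength), while f is chosen within the focal-length range the paper reports:

```python
import numpy as np

def lens_phase(r, f, wavelength):
    # Phase (radians) a metalens must impart at radius r to focus a
    # normally incident plane wave at focal length f:
    #   phi(r) = (2*pi / lambda) * (f - sqrt(r**2 + f**2))
    return 2 * np.pi / wavelength * (f - np.sqrt(r ** 2 + f ** 2))

wavelength = 4e-6              # assumed mid-IR wavelength, 4 um
f = 65e-6                      # ~65 um, within the reported focal range
r = np.linspace(0, 20e-6, 5)   # radial sample points across the aperture
phase = lens_phase(r, f, wavelength)
```

Tuning the crystalline fraction of the GSSe layer shifts the effective phase delay, which in this picture is equivalent to selecting a different f.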

21 pages, 2726 KiB  
Article
An Improved YOLOX Model and Domain Transfer Strategy for Nighttime Pedestrian and Vehicle Detection
by Kefu Yi, Kai Luo, Tuo Chen and Rongdong Hu
Appl. Sci. 2022, 12(23), 12476; https://doi.org/10.3390/app122312476 - 6 Dec 2022
Cited by 16 | Viewed by 3914
Abstract
Aimed at the vehicle/pedestrian visual sensing task under low-light conditions and the problems of small, dense objects and line-of-sight occlusion, a nighttime vehicle/pedestrian detection method was proposed. First, a vehicle/pedestrian detection algorithm was designed based on You Only Look Once X (YOLOX). The model structure was re-parameterized and lightened, and a coordinate-based attention mechanism was introduced into the backbone network to enhance the feature extraction efficiency of vehicle/pedestrian targets. A feature-scale fusion detection branch was added to the feature pyramid, while a loss function was designed, which combines Complete Intersection Over Union (CIoU) for target localization and Varifocal Loss for confidence prediction to improve the feature extraction ability for small, dense, and low-illumination targets. In addition, in order to further improve the detection accuracy of the algorithm under low-light conditions, a training strategy based on data domain transfer was proposed, which fuses the larger-scale daylight dataset with the smaller-scale nighttime dataset after low-illumination degrading. After low-light enhancement, training and testing were performed accordingly. The experimental results show that, compared with the original YOLOX model, the improved algorithm trained by the proposed data domain transfer strategy achieved better performance, and the mean Average Precision (mAP) increased by 5.9% to 82.4%. This research provided effective technical support for autonomous driving safety at night. Full article
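The localization term adopted here, Complete IoU, augments plain IoU with a centre-distance penalty and an aspect-ratio consistency penalty. A NumPy sketch of the published formula (the box coordinates are toy values):

```python
import numpy as np

def ciou(box_a, box_b):
    """Complete IoU between two [x1, y1, x2, y2] boxes:
    IoU - rho^2/c^2 - alpha*v (centre-distance and aspect-ratio terms)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter = max(0, min(ax2, bx2) - max(ax1, bx1)) * \
            max(0, min(ay2, by2) - max(ay1, by1))
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)
    # squared distance between box centres
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4
    # squared diagonal of the smallest enclosing box
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + \
         (max(ay2, by2) - min(ay1, by1)) ** 2
    # aspect-ratio consistency term
    v = 4 / np.pi ** 2 * (np.arctan((ax2 - ax1) / (ay2 - ay1))
                          - np.arctan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / (1 - iou + v + 1e-12)
    return iou - rho2 / c2 - alpha * v

same = ciou([0, 0, 10, 10], [0, 0, 10, 10])     # identical boxes -> 1
shifted = ciou([0, 0, 10, 10], [5, 5, 15, 15])  # penalized below plain IoU
```

The extra penalties give non-zero gradients even when boxes barely overlap, which is what makes CIoU attractive for small, dense targets.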

20 pages, 85581 KiB  
Article
Multi-Scale Object Detection Model for Autonomous Ship Navigation in Maritime Environment
by Zeyuan Shao, Hongguang Lyu, Yong Yin, Tao Cheng, Xiaowei Gao, Wenjun Zhang, Qianfeng Jing, Yanjie Zhao and Lunping Zhang
J. Mar. Sci. Eng. 2022, 10(11), 1783; https://doi.org/10.3390/jmse10111783 - 19 Nov 2022
Cited by 27 | Viewed by 5761
Abstract
Accurate detection of sea-surface objects is vital for the safe navigation of autonomous ships. With the continuous development of artificial intelligence, electro-optical (EO) sensors such as video cameras are used to supplement marine radar to improve the detection of objects that produce weak radar signals and small sizes. In this study, we propose an enhanced convolutional neural network (CNN) named VarifocalNet * that improves object detection in harsh maritime environments. Specifically, the feature representation and learning ability of the VarifocalNet model are improved by using a deformable convolution module, redesigning the loss function, introducing a soft non-maximum suppression algorithm, and incorporating multi-scale prediction methods. These strategies improve the accuracy and reliability of our CNN-based detection results under complex sea conditions, such as in turbulent waves, sea fog, and water reflection. Experimental results under different maritime conditions show that our method significantly outperforms similar methods (such as SSD, YOLOv3, RetinaNet, Faster R-CNN, Cascade R-CNN) in terms of the detection accuracy and robustness for small objects. The maritime obstacle detection results were obtained under harsh imaging conditions to demonstrate the performance of our network model. Full article
(This article belongs to the Special Issue Application of Advanced Technologies in Maritime Safety)

25 pages, 11902 KiB  
Article
A Universal Landslide Detection Method in Optical Remote Sensing Images Based on Improved YOLOX
by Heyi Hou, Mingxia Chen, Yongbo Tie and Weile Li
Remote Sens. 2022, 14(19), 4939; https://doi.org/10.3390/rs14194939 - 3 Oct 2022
Cited by 44 | Viewed by 5232
Abstract
Using deep learning-based object detection algorithms for landslide hazard detection is popular and effective. However, most existing algorithms are designed for landslides in a specific geographical range. To address the poor detection of complex mixed landslides, this paper constructs a set of landslide detection models, YOLOX-Pro, based on an improved YOLOX (You Only Look Once) target detection model. VariFocal loss replaces the binary cross-entropy in the original classification loss function to handle the uneven distribution of landslide samples and improve detection recall, and a coordinate attention (CA) mechanism is added to enhance detection accuracy. Firstly, 1200 historical landslide optical remote sensing images in thirty-eight areas of China were extracted from Google Earth to create a mixed sample set for landslide detection. Next, three attention mechanisms were compared to form the YOLOX-Pro model. Then, we tested the performance of YOLOX-Pro by comparing it with four models: YOLOX, YOLOv5, Faster R-CNN, and Single Shot MultiBox Detector (SSD). The results show that YOLOX-Pro(m) detects complex and small landslides significantly more accurately than the other models, with an average precision (AP0.75) of 51.5%, APsmall of 36.50%, and ARsmall of 49.50%. In addition, optical remote sensing images of a 12.32 km2 area of group-occurring landslides located in Mibei village, Longchuan County, Guangdong, China, and 750 Unmanned Aerial Vehicle (UAV) images collected from the Internet were also used for landslide detection. The research results prove that the proposed method has strong generalization and good detection performance for many types of landslides, providing a technical reference for the broad application of landslide detection using UAVs. Full article
