Search Results (22)

Search Parameters:
Keywords = varifocal loss

20 pages, 4069 KB  
Article
VFR-Net: Varifocal Fine-Grained Refinement Network for 3D Object Detection
by Yuto Sakai, Tomoyasu Shimada, Xiangbo Kong and Hiroyuki Tomiyama
Appl. Sci. 2026, 16(2), 911; https://doi.org/10.3390/app16020911 - 15 Jan 2026
Abstract
High-precision 3D object detection is pivotal for autonomous driving. However, voxel-based two-stage detectors still struggle with small and non-rigid objects due to the misalignment between classification confidence and localization accuracy, and the loss of fine-grained spatial context during feature flattening. To address these issues, we propose the Varifocal Fine-grained Refinement Network (VFR-Net). We introduce Varifocal Loss (VFL) to learn IoU-aware scores for prioritizing high-quality proposals, and a Fine-Grained Refinement Attention (FGRA) Module to capture local geometric details via self-attention before flattening. Extensive experiments on the KITTI and ONCE datasets demonstrate that VFR-Net consistently outperforms the Voxel R-CNN baseline, improving the overall mAP by +1.12% on KITTI and +2.63% on ONCE. Specifically, it achieves AP gains of +1.81% and +1.28% for pedestrians and cyclists on KITTI (averaged over Easy/Moderate/Hard), and +6.53% and +1.50% on ONCE (Overall). Full article
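
Since Varifocal Loss (VFL) recurs throughout these search results, a brief reference may help: VFL trains the classification branch to predict an IoU-aware score, weighting positives by their target IoU and down-weighting negatives focal-style. Below is a minimal PyTorch sketch following the original VarifocalNet formulation; each paper in this list adapts it to its own detection head, and the tensor names here are illustrative.

```python
import torch
import torch.nn.functional as F

def varifocal_loss(pred_logits, target_score, alpha=0.75, gamma=2.0):
    """Varifocal Loss, minimal sketch of the VarifocalNet formulation.

    pred_logits:  raw classification logits (any shape).
    target_score: IoU-aware targets in [0, 1]; the IoU between the
                  predicted box and its matched ground truth for
                  positives, 0 for negatives.
    """
    p = pred_logits.sigmoid()
    # Positives keep their full BCE term scaled by the target IoU q;
    # negatives are down-weighted asymmetrically by alpha * p^gamma.
    weight = torch.where(target_score > 0,
                         target_score,
                         alpha * p.detach().pow(gamma))
    loss = F.binary_cross_entropy_with_logits(
        pred_logits, target_score, reduction='none')
    return (loss * weight).sum()
```

Unlike focal loss, the weighting is asymmetric: hard negatives are suppressed while high-IoU positives are emphasized, which is what aligns classification confidence with localization quality.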

27 pages, 9435 KB  
Article
Research on an Intelligent Grading Method for Beef Freshness in Complex Backgrounds Based on the DEVA-ConvNeXt Model
by Xiuling Yu, Yifu Xu, Chenxiao Qu, Senyue Guo, Shuo Jiang, Linqiang Chen and Yang Zhou
Foods 2025, 14(24), 4178; https://doi.org/10.3390/foods14244178 - 5 Dec 2025
Viewed by 435
Abstract
This paper presents a novel DEVA-ConvNeXt model to address challenges in beef freshness grading, including data collection difficulties, complex backgrounds, and model accuracy issues. The Alpha-Background Generation Shift (ABG-Shift) technology enables rapid generation of beef image datasets with complex backgrounds. By incorporating the Dynamic Non-Local Coordinate Attention (DNLC) and Enhanced Depthwise Convolution (EDW) modules, the model enhances feature extraction in complex environments. Additionally, Varifocal Loss (VFL) accelerates key feature learning, reducing training time and improving convergence speed. Experimental results show that DEVA-ConvNeXt outperforms models like ResNet101 and ShuffleNet V2 in terms of overall performance. Compared to the baseline model ConvNeXt, it achieves significant improvements in recognition Accuracy (94.8%, a 6.2% increase), Precision (94.8%, a 5.4% increase), Recall (94.6%, a 5.9% increase), and F1 score (94.7%, a 6.0% increase). Furthermore, real-world deployment and testing on embedded devices confirm the feasibility of this method in terms of accuracy and speed, providing valuable technical support for beef freshness grading and equipment design. Full article
(This article belongs to the Section Food Engineering and Technology)

20 pages, 37629 KB  
Article
Design of a Modified Moiré Varifocal Metalens Based on Fresnel Principles
by Di Chang, Shuiping Sun, Lieshan Zhang and Xueyan Li
Photonics 2025, 12(9), 888; https://doi.org/10.3390/photonics12090888 - 3 Sep 2025
Cited by 1 | Viewed by 961
Abstract
This paper proposes a Fresnel-based Modified Moiré Varifocal Metalens (MMVL) addressing the inherent defocus at 0° rotation and significant focal quality degradation during varifocal operation in Traditional Moiré Varifocal Metalenses (TMVLs). The transmission function of the Fresnel-modified Moiré metalens combines a static term with a dynamic term, allowing the MMVLs to effectively overcome these limitations. Meanwhile, to minimize energy losses arising from polarization conversion and diffraction between the two metalenses, the nano-units on the metalenses are optimized by Particle Swarm Optimization (PSO) with FDTD simulations, maximizing the polarization conversion efficiency and transmittance. The simulation results demonstrate superior focal quality and stability in the MMVL throughout full rotational cycles, with super-diffraction-limited focusing maintained across all varifocal states. MMVLs have advantages in robustness; under axial distance variation (d = 0–20d₀, 0–3 μm), they maintain on-axis focus without deviation; with centering error (p = 0–10p₀, 0–3 μm), they sustain a clear focus at >36% efficiency. These results confirm that MMVLs have enhanced tolerance to manufacturing/assembly errors compared to TMVLs, delivering significantly stabilized optical performance. This advancement enables new possibilities for integrated micro-optics and optical tweezer applications. Full article
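
For context on the varifocal mechanism being modified, here is a sketch of the standard Moiré principle, not the paper's exact MMVL design: a traditional Moiré metalens cascades two elements carrying conjugate angular phase profiles, so a mutual rotation by α yields a Fresnel-lens phase whose focal length scales as 1/α:

```latex
\varphi_{1,2}(r,\theta) = \pm\, a\, r^{2}\, \theta
\quad\Longrightarrow\quad
\varphi_{\mathrm{tot}}(r) = a\, r^{2}\, \alpha
\quad\Longrightarrow\quad
f(\alpha) = \frac{\pi}{\lambda\, a\, \alpha}
```

At α = 0 the joint phase vanishes and the focal length diverges, which is the inherent defocus noted in the abstract; adding a static Fresnel term to the transmission function guarantees a finite focus at zero rotation. Sign and rounding conventions vary across designs, so this is background rather than the paper's exact formulation.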

15 pages, 2497 KB  
Article
The Research on an Improved YOLOX-Based Algorithm for Small-Object Road Vehicle Detection
by Zhixun Liu and Zhenyou Zhang
Electronics 2025, 14(11), 2179; https://doi.org/10.3390/electronics14112179 - 27 May 2025
Cited by 1 | Viewed by 1135
Abstract
To address the challenges of missed detections and false positives caused by dense vehicle distribution, occlusions, and small object sizes in complex traffic scenarios, this paper proposes an improved YOLOX-based vehicle detection algorithm with three key innovations. First, we design a novel Wavelet-Enhanced Convolution (WEC) module that expands the receptive field to enhance the model’s global perception capability. Building upon this foundation, we integrate the SimAM attention mechanism, which improves feature saturation by adaptively fusing semantic features across different channels and spatial locations, thereby strengthening the network’s multi-scale generalization ability. Furthermore, we develop a Varifocal Intersection over Union (VIoU) bounding-box regression loss function that optimizes convergence in multi-scale feature learning while enhancing global feature extraction capabilities. The experimental results on the VisDrone dataset demonstrate that our improved model achieves performance gains of 0.9% mAP and 1.8% mAP75 compared to the baseline version, effectively improving vehicle detection accuracy. Full article
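
SimAM itself is a published, parameter-free attention module; the paper's WEC module and VIoU loss are its own contributions and are not reproduced here, but a minimal sketch of the standard SimAM formulation is:

```python
import torch

def simam(x, e_lambda=1e-4):
    """SimAM parameter-free attention, minimal sketch of the published
    formulation. x: feature map of shape (B, C, H, W)."""
    b, c, h, w = x.shape
    n = h * w - 1
    # Squared deviation of each activation from its channel mean.
    d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)
    # Per-channel variance estimate.
    v = d.sum(dim=(2, 3), keepdim=True) / n
    # Inverse neuron energy: distinctive activations get larger values.
    e_inv = d / (4 * (v + e_lambda)) + 0.5
    # Reweight the map; no learnable parameters are involved.
    return x * torch.sigmoid(e_inv)
```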

16 pages, 3776 KB  
Article
MDA-DETR: Enhancing Offending Animal Detection with Multi-Channel Attention and Multi-Scale Feature Aggregation
by Haiyan Zhang, Huiqi Li, Guodong Sun and Feng Yang
Animals 2025, 15(2), 259; https://doi.org/10.3390/ani15020259 - 17 Jan 2025
Cited by 5 | Viewed by 2224
Abstract
Conflicts between humans and animals in agricultural and settlement areas have recently increased, resulting in significant resource loss and risks to human and animal lives. This growing issue presents a global challenge. This paper addresses the detection and identification of offending animals, particularly in obscured or blurry nighttime images. It introduces Multi-Channel Coordinated Attention and Multi-Dimension Feature Aggregation (MDA-DETR), which integrates multi-scale features for enhanced detection accuracy, employing a Multi-Channel Coordinated Attention (MCCA) mechanism to incorporate location, semantic, and long-range dependency information and a Multi-Dimension Feature Aggregation Module (DFAM) for cross-scale feature aggregation. Additionally, the VariFocal Loss function is utilized to assign pixel weights, enhancing detail focus and maintaining accuracy. The study uses a dataset from the Northeast China Tiger and Leopard National Park, which includes images of six common offending animal species. In comprehensive experiments on this dataset, the mAP50 of MDA-DETR was 1.3%, 0.6%, 0.3%, 3%, 1.1%, and 0.5% higher than that of RT-DETR-r18, YOLOv8n, YOLOv9-C, DETR, Deformable-DETR, and DCA-YOLOv8, respectively, indicating that MDA-DETR is superior to other advanced methods. Full article
(This article belongs to the Special Issue Animal–Computer Interaction: Advances and Opportunities)

22 pages, 10887 KB  
Article
Research on a UAV-View Object-Detection Method Based on YOLOv7-Tiny
by Yuyang Miao, Xihan Wang, Ning Zhang, Kai Wang, Lianhe Shao and Quanli Gao
Appl. Sci. 2024, 14(24), 11929; https://doi.org/10.3390/app142411929 - 20 Dec 2024
Cited by 6 | Viewed by 1982
Abstract
To address the issues of missed and false detections caused by small object sizes, dense object distribution, and complex scenes in drone aerial images, this study proposes a drone-view object-detection algorithm based on YOLOv7-tiny with a Partial_C_Detect detection head. The algorithm’s performance in handling object occlusion and multi-scale detection is enhanced by introducing the VarifocalLoss loss function and improving the feature fusion network to BiFPN. Furthermore, incorporating the novel Partial_C_Detect detection head and Adaptive Kernel Convolution (AKConv) improves the detection capabilities for small and dynamically changing objects. In addition, introducing the Dilated Weighted Residual (DWR) attention module optimizes the information processing flow, enhancing the algorithm’s ability to capture key information, especially in complex backgrounds. These enhancements collectively enable the model to balance high detection accuracy and computational efficiency, making it well-suited for resource-constrained UAV platforms. Experiments conducted on the VisDrone2019 dataset show that the improved algorithm achieves a mAP@0.5 of 38.2%, with a model size of 29.01 MB and a computational complexity of 16.2 G. Compared to the original YOLOv7-tiny algorithm, the mAP@0.5 improves by 2.9%, and the algorithm performs better in other key performance metrics, demonstrating its adaptability and robustness in drone aerial image object-detection tasks. Full article
(This article belongs to the Special Issue Advanced Pattern Recognition & Computer Vision)
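
Both this entry and the YOLO-BFRV entry below adopt BiFPN, whose core idea is fusing multi-scale features with learned, normalized, non-negative weights instead of plain summation. A minimal sketch of that fast normalized fusion, assuming the input maps have already been resized to a common shape (the class name is illustrative):

```python
import torch
import torch.nn as nn

class FastNormalizedFusion(nn.Module):
    """BiFPN-style weighted feature fusion (EfficientDet), minimal sketch.

    One non-negative scalar weight is learned per input feature map and
    normalized before blending; inputs must share the same shape.
    """
    def __init__(self, num_inputs, eps=1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, inputs):
        w = torch.relu(self.weights)        # keep fusion weights >= 0
        w = w / (w.sum() + self.eps)        # fast normalization (no softmax)
        return sum(wi * xi for wi, xi in zip(w, inputs))
```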

16 pages, 6553 KB  
Article
Cucumber Leaf Segmentation Based on Bilayer Convolutional Network
by Tingting Qian, Yangxin Liu, Shenglian Lu, Linyi Li, Xiuguo Zheng, Qingqing Ju, Yiyang Li, Chun Xie and Guo Li
Agronomy 2024, 14(11), 2664; https://doi.org/10.3390/agronomy14112664 - 12 Nov 2024
Cited by 2 | Viewed by 2062
Abstract
When monitoring crop growth using top-down images of the plant canopies, leaves in agricultural fields appear very dense and significantly overlap each other. Moreover, the image can be affected by external conditions such as background environment and light intensity, impacting the effectiveness of image segmentation. To address the challenge of segmenting dense and overlapping plant leaves under natural lighting conditions, this study employed a Bilayer Convolutional Network (BCNet) method for accurate leaf segmentation across various lighting environments. The major contributions of this study are as follows: (1) Utilized the Fully Convolutional One-Stage object detector (FCOS) for plant leaf detection, incorporating ResNet-50 with the Convolutional Block Attention Module (CBAM) and Feature Pyramid Network (FPN) to enhance Region of Interest (RoI) feature extraction from canopy top-view images. (2) Extracted the sub-region of the RoI based on the position of the detection box, using this region as input for the BCNet, ensuring precise segmentation. (3) Performed instance segmentation of canopy top-view images using BCNet, improving segmentation accuracy. (4) Applied the Varifocal Loss function to improve the classification loss in FCOS, leading to better performance metrics. The experimental results on cucumber canopy top-view images captured in glass greenhouse and plastic greenhouse environments show that our method is highly effective. For cucumber leaves at different growth stages and under various lighting conditions, the Precision, Recall and Average Precision (AP) metrics for object recognition are 97%, 94% and 96.57%, respectively. For instance segmentation, the Precision, Recall and Average Precision (AP) metrics are 87%, 83% and 84.71%, respectively. Our algorithm outperforms commonly used deep learning algorithms such as Faster R-CNN, Mask R-CNN, YOLOv4 and PANet, showcasing its superior capability in complex agricultural settings. The results of this study demonstrate the potential of our method for accurate recognition and segmentation of highly overlapping leaves in diverse agricultural environments, significantly contributing to the application of deep learning algorithms in smart agriculture. Full article
(This article belongs to the Special Issue AI, Sensors and Robotics for Smart Agriculture—2nd Edition)

17 pages, 7240 KB  
Article
YOLO-BFRV: An Efficient Model for Detecting Printed Circuit Board Defects
by Jiaxin Liu, Bingyu Kang, Chao Liu, Xunhui Peng and Yan Bai
Sensors 2024, 24(18), 6055; https://doi.org/10.3390/s24186055 - 19 Sep 2024
Cited by 7 | Viewed by 4251
Abstract
The small area of a printed circuit board (PCB) results in densely distributed defects, leading to a lower detection accuracy, which subsequently impacts the safety and stability of the circuit board. This paper proposes a new YOLO-BFRV network model based on the improved YOLOv8 framework to identify PCB defects more efficiently and accurately. First, a bidirectional feature pyramid network (BIFPN) is introduced to expand the receptive field of each feature level and enrich the semantic information to improve the feature extraction capability. Second, the YOLOv8 backbone network is refined into a lightweight FasterNet network, reducing the computational load while improving the detection accuracy of minor defects. Subsequently, the high-speed re-parameterized detection head (RepHead) reduces inference complexity and boosts the detection speed without compromising accuracy. Finally, the VarifocalLoss is employed to enhance the detection accuracy for densely distributed PCB defects. The experimental results demonstrate that the improved model increases the mAP by 4.12% compared to the benchmark YOLOv8s model, boosts the detection speed by 45.89%, and reduces the GFLOPs by 82.53%, further confirming the superiority of the algorithm presented in this paper. Full article
(This article belongs to the Topic Applications in Image Analysis and Pattern Recognition)

15 pages, 5989 KB  
Article
Instance Segmentation of Lentinus edodes Images Based on YOLOv5seg-BotNet
by Xingmei Xu, Xiangyu Su, Lei Zhou, Helong Yu and Jian Zhang
Agronomy 2024, 14(8), 1808; https://doi.org/10.3390/agronomy14081808 - 16 Aug 2024
Cited by 2 | Viewed by 1695
Abstract
The shape and quantity of Lentinus edodes (commonly known as shiitake) fruiting bodies significantly affect their quality and yield. Accurate and rapid segmentation of these fruiting bodies is crucial for quality grading and yield prediction. This study proposed YOLOv5seg-BotNet, a model for the instance segmentation of Lentinus edodes, to support its application in the mushroom industry. First, the backbone network was replaced with BoTNet, substituting global self-attention modules for the spatial convolutions in the backbone to enhance the feature extraction ability. Subsequently, PANet was adopted to effectively manage and integrate Lentinus edodes images in complex backgrounds at various scales. Finally, the Varifocal Loss function was employed to adjust the weights of different samples, addressing the issues of missed segmentation and mis-segmentation. The enhanced model improved the precision, recall, Mask_AP, F1-Score, and FPS to 97.58%, 95.74%, 95.90%, 96.65%, and 32.86 frames per second, respectively, increases of 2.37%, 4.55%, 4.56%, 3.50%, and 2.61% over the original model. The model achieved dual improvements in segmentation accuracy and speed, exhibiting excellent detection and segmentation performance on Lentinus edodes fruiting bodies. This study provides a technical foundation for future applications of image-based detection and decision-making in mushroom production, including quality grading and intelligent harvesting. Full article
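
The BoTNet substitution mentioned above replaces spatial convolutions with global self-attention over the feature map. A minimal sketch of such a block, treating each pixel as a token (the relative position encodings of the original BoTNet are omitted, and channels must be divisible by num_heads):

```python
import torch
import torch.nn as nn

class MHSA2d(nn.Module):
    """Global multi-head self-attention over a 2D feature map; a minimal
    sketch in the spirit of a BoTNet block, without position encodings."""
    def __init__(self, channels, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x):                    # x: (B, C, H, W)
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)   # (B, H*W, C): one token per pixel
        out, _ = self.attn(seq, seq, seq)    # all-pairs attention across pixels
        return out.transpose(1, 2).reshape(b, c, h, w)
```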

22 pages, 4810 KB  
Article
Ship Target Detection in Optical Remote Sensing Images Based on E2YOLOX-VFL
by Qichang Zhao, Yiquan Wu and Yubin Yuan
Remote Sens. 2024, 16(2), 340; https://doi.org/10.3390/rs16020340 - 15 Jan 2024
Cited by 14 | Viewed by 4817
Abstract
In this research, E2YOLOX-VFL is proposed as a novel approach to address the challenges of optical image multi-scale ship detection and recognition in complex maritime and land backgrounds. Firstly, the typical anchor-free network YOLOX is utilized as the baseline network for ship detection. Secondly, the Efficient Channel Attention module is incorporated into the YOLOX Backbone network to enhance the model's capability to extract information from objects of different scales, such as large, medium, and small, thus improving ship detection performance in complex backgrounds. Thirdly, we propose the Efficient Force-IoU (EFIoU) Loss as a replacement for the Intersection over Union (IoU) Loss, addressing the issue whereby IoU Loss considers only the intersection and union between the ground-truth and predicted boxes, ignoring the size and position of targets; EFIoU also mitigates the adverse effect of low-quality samples on similarity measurement, improving the regression performance of the algorithm. Fourthly, the confidence loss is improved: Varifocal Loss is employed instead of CE Loss, effectively handling positive-negative sample imbalance, challenging samples, and class imbalance, enhancing the overall detection performance of the model. Then, we propose Balanced Gaussian NMS (BG-NMS) to solve the problem of missed detection caused by the occlusion of dense targets. Finally, the E2YOLOX-VFL algorithm is tested on the HRSC2016 dataset, achieving a 9.28% improvement in mAP compared to the baseline YOLOX algorithm. The detection performance with BG-NMS is also analyzed, and the experimental results validate the effectiveness of the E2YOLOX-VFL algorithm. Full article
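
BG-NMS is the paper's own method and is not reproduced here, but it builds on Gaussian Soft-NMS, which decays the scores of overlapping boxes instead of deleting them outright. A minimal sketch of the standard algorithm:

```python
import torch
from torchvision.ops import box_iou

def gaussian_soft_nms(boxes, scores, sigma=0.5, score_thr=0.001):
    """Standard Gaussian Soft-NMS, minimal sketch.

    boxes: (N, 4) tensor of (x1, y1, x2, y2); scores: (N,).
    Returns kept indices in order of selection.
    """
    boxes, scores = boxes.clone(), scores.clone()
    idxs = torch.arange(len(scores))
    keep = []
    while scores.numel() > 0:
        top = scores.argmax()
        keep.append(int(idxs[top]))
        ious = box_iou(boxes[top:top + 1], boxes).squeeze(0)
        # Decay remaining scores by a Gaussian of their overlap with the
        # selected box instead of hard-suppressing them.
        scores = scores * torch.exp(-(ious ** 2) / sigma)
        mask = scores > score_thr
        mask[top] = False
        boxes, scores, idxs = boxes[mask], scores[mask], idxs[mask]
    return keep
```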

17 pages, 10763 KB  
Article
YOLO-Crater Model for Small Crater Detection
by Lingli Mu, Lina Xian, Lihong Li, Gang Liu, Mi Chen and Wei Zhang
Remote Sens. 2023, 15(20), 5040; https://doi.org/10.3390/rs15205040 - 20 Oct 2023
Cited by 20 | Viewed by 5786
Abstract
Craters are the most prominent geomorphological features on the surface of celestial bodies and play a crucial role in studying the formation and evolution of celestial bodies, as well as in landing and planning for surface exploration. Currently, the main automatic crater detection models and datasets focus on the detection of large and medium craters. In this paper, we created 23 small lunar crater datasets for model training based on the Chang'E-2 (CE-2) DOM, DEM, Slope, and integrated data with seven visualization stretching methods. Then, we proposed the YOLO-Crater model for lunar and Martian small crater detection, replacing the original loss functions with EIoU and VariFocal Loss to address the crater sample imbalance problem and introducing a CBAM attention mechanism to mitigate interference from the complex extraterrestrial environment. The results show that the accuracy (P = 87.86%, R = 66.04%, and F1 = 75.41%) of the lunar YOLO-Crater model based on the DOM-MMS (Maximum-Minimum Stretching) dataset is the highest, exceeding that of the YOLOX model. The Martian YOLO-Crater, trained on the Martian dataset from the 2022 GeoAI Martian Challenge, achieves good performance with P = 88.37%, R = 69.25%, and F1 = 77.65%. This indicates that the YOLO-Crater model has strong transferability and generalization capability and can be applied to detect small craters on the Moon and other celestial bodies. Full article
(This article belongs to the Special Issue Laser and Optical Remote Sensing for Planetary Exploration)
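
EIoU is a published regression loss; for reference, it extends IoU loss with separate penalties on center distance, width, and height. Symbols follow the original EIoU paper rather than this entry's notation: ρ is the Euclidean distance, and c, C_w, C_h are the diagonal, width, and height of the smallest box enclosing both the prediction and the ground truth:

```latex
\mathcal{L}_{\mathrm{EIoU}}
  = 1 - \mathrm{IoU}
  + \frac{\rho^{2}\!\left(\mathbf{b},\mathbf{b}^{gt}\right)}{c^{2}}
  + \frac{\rho^{2}\!\left(w,w^{gt}\right)}{C_{w}^{2}}
  + \frac{\rho^{2}\!\left(h,h^{gt}\right)}{C_{h}^{2}}
```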

19 pages, 11706 KB  
Article
SE-YOLOv7 Landslide Detection Algorithm Based on Attention Mechanism and Improved Loss Function
by Qing Liu, Tingting Wu, Yahong Deng and Zhiheng Liu
Land 2023, 12(8), 1522; https://doi.org/10.3390/land12081522 - 31 Jul 2023
Cited by 22 | Viewed by 3920
Abstract
With the continuous development of computer vision technology, landslide identification tasks have increasingly shifted from manual visual interpretation to automatic computer identification, and automatic landslide detection methods based on remote sensing satellite images and deep learning have gradually developed. However, most existing algorithms suffer from low precision and weak generalization in landslide detection. Based on the Google Earth Engine platform, this study selected landslide image data from 24 study areas in China and established the DN landslide sample dataset, which contains a total of 1440 landslide samples. The original YOLOv7 model was improved by applying the SE (squeeze-and-excitation) attention mechanism and the VariFocal loss function to construct the SE-YOLOv7 model for the automatic detection of landslides in remote sensing images. The experimental results show that the mAP, Precision, Recall, and F1-Score of the improved SE-YOLOv7 model for landslide identification are 91.15%, 93.35%, 94.54%, and 93.94%, respectively. A field investigation in Qianyang County, Baoji City, Shaanxi Province further verified that the improved SE-YOLOv7 locates landslides more accurately, delineates the landslide extent more precisely, and produces fewer missed detections. These results show that the model achieves strong detection accuracy across many types of landslide image data, providing a technical reference for future research on landslide detection based on remote sensing images. Full article
(This article belongs to the Special Issue Remote Sensing Application in Landslide Detection and Assessment)
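
The SE mechanism named in the model is the standard squeeze-and-excitation block; a minimal PyTorch sketch follows (where exactly the paper inserts it into YOLOv7 is not reproduced here):

```python
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention, minimal sketch."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)   # squeeze: global spatial context
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                     # per-channel gates in (0, 1)
        )

    def forward(self, x):                     # x: (B, C, H, W)
        b, c, _, _ = x.shape
        gate = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * gate                       # excite: reweight channels
```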

18 pages, 6803 KB  
Article
IO-YOLOv5: Improved Pig Detection under Various Illuminations and Heavy Occlusion
by Jiajun Lai, Yun Liang, Yingjie Kuang, Zhannan Xie, Hongyuan He, Yuxin Zhuo, Zekai Huang, Shijie Zhu and Zenghang Huang
Agriculture 2023, 13(7), 1349; https://doi.org/10.3390/agriculture13071349 - 4 Jul 2023
Cited by 12 | Viewed by 2630
Abstract
Accurate detection and counting of live pigs are integral to scientific breeding and production in intelligent agriculture. However, existing pig counting methods are challenged by heavy occlusion and varying illumination conditions. To overcome these challenges, we proposed IO-YOLOv5 (Illumination-Occlusion YOLOv5), an improved network that expands on the YOLOv5 framework with three key contributions. Firstly, we introduced the Simple Attention Receptive Field Block (SARFB) module to expand the receptive field and give greater weight to important features at different levels. The Ghost Spatial Pyramid Pooling Fast Cross Stage Partial Connections (GSPPFC) module was also introduced to enhance model feature reuse and information flow. Secondly, we optimized the loss function by using Varifocal Loss to improve the model’s learning ability on high-quality and challenging samples. Thirdly, we proposed a public dataset consisting of 1270 images and 15,672 pig labels. Experiments demonstrated that IO-YOLOv5 achieved a mean average precision (mAP) of 90.8% and a precision of 86.4%, surpassing the baseline model by 2.2% and 3.7% respectively. By using a model ensemble and test time augmentation, we further improved the mAP to 92.6%, which is a 4% improvement over the baseline model. Extensive experiments showed that IO-YOLOv5 exhibits excellent performance in pig recognition, particularly under heavy occlusion and various illuminations. These results provide a strong foundation for pig recognition in complex breeding environments. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

19 pages, 11500 KB  
Article
Exploiting the Potential of Overlapping Cropping for Real-World Pedestrian and Vehicle Detection with Gigapixel-Level Images
by Chunlei Wang, Wenquan Feng, Binghao Liu, Xinyang Ling and Yifan Yang
Appl. Sci. 2023, 13(6), 3637; https://doi.org/10.3390/app13063637 - 13 Mar 2023
Cited by 5 | Viewed by 3149
Abstract
Pedestrian and vehicle detection is widely used in intelligent assisted driving, pedestrian counting, drone aerial photography, and other applications. Recently, with the development of gigacameras, gigapixel-level images have emerged. Their large field of view and high resolution provide both global and local information, which enables object detection in real-world scenarios. Although existing pedestrian and vehicle detection algorithms have achieved remarkable success on standard images, they are not suitable for ultra-high-resolution images. To improve the performance of existing detectors in such scenarios, we used a sliding window to crop the original images into overlapping sub-images. When fusing detections from the sub-images, we proposed a midline method to suppress the duplicate detections of objects cut by crop boundaries that NMS could not eliminate. At the same time, we used varifocal loss to address the imbalance between positive and negative samples caused by the high resolution. We also found that pedestrians and vehicles were separable by size and comprised more than one target type; as a result, we improved detector performance by training single-class detectors for pedestrians and vehicles, respectively. We additionally provide several practical strategies for improving the detector. The experimental results demonstrate that our method improves the performance of real-world pedestrian and vehicle detection. Full article
(This article belongs to the Special Issue Intelligent Analysis and Image Recognition)
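
The overlapping sliding-window cropping described above reduces to computing tile origins that cover the image with a fixed overlap; the paper's midline fusion rule is its own contribution and is not reproduced. A minimal sketch for one axis, with hypothetical pixel values in the usage example:

```python
def tile_origins(size, tile, overlap):
    """Top-left offsets of overlapping crops along one image axis (pixels)."""
    stride = tile - overlap
    xs = list(range(0, max(size - tile, 0) + 1, stride))
    if xs[-1] + tile < size:
        xs.append(size - tile)   # final tile is snapped to the image edge
    return xs

# e.g. a 10000-px-wide strip cut into 2048-px tiles with 256-px overlap
print(tile_origins(10000, 2048, 256))   # [0, 1792, 3584, 5376, 7168, 7952]
```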

12 pages, 2895 KB  
Article
Mid-Infrared Continuous Varifocal Metalens with Adjustable Intensity Based on Phase Change Materials
by Liangde Shao, Kongsi Zhou, Fangfang Zhao, Yixiao Gao, Bingxia Wang and Xiang Shen
Photonics 2022, 9(12), 959; https://doi.org/10.3390/photonics9120959 - 9 Dec 2022
Cited by 5 | Viewed by 3328
Abstract
Metalenses can greatly reduce the complexity of imaging systems due to their small size and light weight and also provide a platform for the realization of multifunctional imaging devices. Achieving dynamic focus length tunability is highly important for metalens research. In this paper, [...] Read more.
Metalenses can greatly reduce the complexity of imaging systems due to their small size and light weight and also provide a platform for the realization of multifunctional imaging devices. Achieving dynamic focal-length tunability is highly important for metalens research. In this paper, based on single-crystal Ge and a new low-loss phase change material Ge₂Sb₂Se₅ (GSSe), a tunable metalens formed by a double-layer metasurface composite was realized in the mid-infrared band. The first-layer metasurface, formed by Ge nanopillars, combines the propagation phase and the geometric phase (equivalent to a half-wave plate function) to produce single or multiple polarization-dependent foci. The second-layer metasurface, formed by GSSe nanopillars, provides a tunable propagation phase, so the double-layer metalens can tune its focal length according to the crystalline fraction of the GSSe. The focal length varies from 62.91 to 67.13 μm under right circularly polarized (RCP) light incidence and from 33.84 to 36.66 μm under left circularly polarized (LCP) light incidence. Across the different crystalline fractions, the metalens's focusing efficiency is maintained at around 59% and 48% when zooming under RCP and LCP excitation, respectively. Meanwhile, the ellipticity of the incident wave can be changed to alter the relative intensity ratio of the bifocals from 0.03 to 4.26. This continuous varifocal metalens with adjustable intensity may have potential in practical applications such as optical tomography, multiple imaging, and optical communication systems. Full article
