Search Results (7)

Search Parameters:
Keywords = CFnet

24 pages, 8939 KiB  
Article
YOLOv7-GCA: A Lightweight and High-Performance Model for Pepper Disease Detection
by Xuejun Yue, Haifeng Li, Qingkui Song, Fanguo Zeng, Jianyu Zheng, Ziyu Ding, Gaobi Kang, Yulin Cai, Yongda Lin, Xiaowan Xu and Chaoran Yu
Agronomy 2024, 14(3), 618; https://doi.org/10.3390/agronomy14030618 - 19 Mar 2024
Cited by 7 | Viewed by 2069 | Correction
Abstract
Existing deep learning models for monitoring and preventing pepper diseases struggle to identify diseases accurately under inter-crop occlusion and complex backgrounds. To address this issue, we propose YOLOv7-GCA, a modified model based on YOLOv7 for pepper disease detection that effectively overcomes these challenges. The model introduces three key enhancements. Firstly, the lightweight GhostNetV2 is used as the feature extraction network to improve detection speed. Secondly, a cascading fusion network (CFNet) replaces the original feature fusion network, improving the model's expressiveness in complex backgrounds and enabling multi-scale feature extraction and fusion. Finally, the Convolutional Block Attention Module (CBAM) is introduced to focus on important image features and improve the model's accuracy and robustness. The collected images were processed and augmented to construct a dataset of 1259 images covering four types of pepper diseases: anthracnose, bacterial diseases, umbilical rot, and viral diseases; experiments were then carried out on this dataset. The results demonstrate that YOLOv7-GCA reduces the parameter count by 34.3% compared with the original YOLOv7 while improving mAP by 13.4% and detection speed by 124 frames/s. The model size was also reduced from 74.8 MB to 46.9 MB, which facilitates deployment on mobile devices. Compared with seven other mainstream detection models, YOLOv7-GCA achieved a balance between speed, model size, and accuracy, making it a high-performance, lightweight pepper disease detection solution that can provide accurate and timely diagnoses for farmers and researchers.
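The CBAM module cited here follows a standard design: channel attention followed by spatial attention. A minimal PyTorch sketch of that canonical block (the generic Woo et al. formulation, not the authors' exact configuration):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention: shared MLP over global avg- and max-pooled features."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return torch.sigmoid(avg + mx)

class SpatialAttention(nn.Module):
    """Spatial attention: 7x7 conv over channel-wise avg and max maps."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)
        mx, _ = torch.max(x, dim=1, keepdim=True)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """Apply channel attention, then spatial attention, to a feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        x = x * self.ca(x)
        return x * self.sa(x)
```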

16 pages, 5241 KiB  
Article
YOLOv8-CB: Dense Pedestrian Detection Algorithm Based on In-Vehicle Camera
by Qiuli Liu, Haixiong Ye, Shiming Wang and Zhe Xu
Electronics 2024, 13(1), 236; https://doi.org/10.3390/electronics13010236 - 4 Jan 2024
Cited by 36 | Viewed by 7898
Abstract
Recently, the field of vehicle-mounted visual intelligence has seen a surge of interest in pedestrian detection. Existing algorithms for dense pedestrian detection at intersections face challenges such as high computational cost, complex models that are difficult to deploy, and suboptimal detection accuracy for small and heavily occluded pedestrians. To address these issues, this paper proposes YOLOv8-CB, an improved lightweight multi-scale pedestrian detection algorithm. The algorithm introduces a lightweight cascade fusion network (CFNet) and a CBAM attention module to improve the representation of multi-scale feature semantics and location information, and it adds a bidirectional weighted feature fusion path (BiFPN) to fuse more effective features and improve pedestrian detection performance. Experiments verify that, compared with the YOLOv8n algorithm, the improved model increases accuracy by 2.4% while reducing the parameter count by 6.45% and the computational load by 6.74%; the inference time for a single image is 10.8 ms. YOLOv8-CB thus offers higher detection accuracy in a lighter model for multi-scale pedestrian detection in complex scenes such as streets and intersections, presenting a valuable approach for on-device pedestrian detection with limited computational resources.
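The BiFPN path referenced above fuses multi-scale features with learnable, normalized per-input weights. A minimal sketch of one fast-normalized fusion node under that standard formulation (shapes and names are illustrative, not the authors' code):

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """BiFPN fast normalized fusion: out = sum(w_i * x_i) / (sum(w_i) + eps),
    with the learned weights kept non-negative via ReLU."""
    def __init__(self, num_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, inputs):
        w = torch.relu(self.weights)
        w = w / (w.sum() + self.eps)
        return sum(wi * x for wi, x in zip(w, inputs))

# Usage: fuse two same-shape feature maps (e.g., a lateral and an upsampled map).
fuse = WeightedFusion(num_inputs=2)
p4 = torch.randn(1, 64, 40, 40)
p5_up = torch.randn(1, 64, 40, 40)
out = fuse([p4, p5_up])
```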

15 pages, 10516 KiB  
Article
Iterative Network for Disparity Prediction with Infrared and Visible Light Images Based on Common Features
by Ziang Zhang, Li Li, Weiqi Jin and Zanxi Qu
Sensors 2024, 24(1), 196; https://doi.org/10.3390/s24010196 - 28 Dec 2023
Viewed by 1349
Abstract
In recent years, the range of applications that utilize multiband imaging has expanded significantly. However, with traditional systems it is difficult to exploit multichannel heterogeneous images for their spectral complementarity or to obtain accurate depth predictions. In this study, we investigate CFNet, an iterative prediction network for disparity estimation from infrared and visible light images based on common features. CFNet consists of a common feature extraction subnetwork, a context subnetwork, a multimodal information acquisition subnetwork, and a cascaded convolutional gated recurrent subnetwork. It leverages the advantages of dual-band (infrared and visible light) imaging, considering semantic information, geometric structure, and local matching details within images to accurately predict the disparity between heterogeneous image pairs. Compared with other publicly available networks, CFNet performs better on recognized evaluation metrics and in visual inspection, offering an effective technical approach for practical heterogeneous image disparity prediction.
(This article belongs to the Section Sensing and Imaging)
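The cascaded convolutional gated recurrent subnetwork iteratively refines the disparity estimate; a ConvGRU cell is the usual building block for such refinement. A minimal sketch under that assumption (not necessarily the paper's exact architecture):

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Convolutional GRU: the hidden state is a feature map updated per iteration."""
    def __init__(self, hidden_dim: int, input_dim: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        ch = hidden_dim + input_dim
        self.convz = nn.Conv2d(ch, hidden_dim, kernel_size, padding=pad)  # update gate
        self.convr = nn.Conv2d(ch, hidden_dim, kernel_size, padding=pad)  # reset gate
        self.convq = nn.Conv2d(ch, hidden_dim, kernel_size, padding=pad)  # candidate state

    def forward(self, h, x):
        hx = torch.cat([h, x], dim=1)
        z = torch.sigmoid(self.convz(hx))
        r = torch.sigmoid(self.convr(hx))
        q = torch.tanh(self.convq(torch.cat([r * h, x], dim=1)))
        return (1 - z) * h + z * q  # refined hidden state, decoded to a disparity update
```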

16 pages, 5810 KiB  
Article
An Underwater Dense Small Object Detection Model Based on YOLOv5-CFDSDSE
by Jingyang Wang, Yujia Li, Junkai Wang and Ying Li
Electronics 2023, 12(15), 3231; https://doi.org/10.3390/electronics12153231 - 26 Jul 2023
Cited by 10 | Viewed by 2924
Abstract
Underwater target detection is a key technology for exploring and developing the ocean. Because underwater targets are often dense, mutually occluded, and affected by light, detection objects are frequently unclear, so underwater target detection faces unique challenges. To improve performance, this paper proposes a new target detection model, YOLOv5-CFDSDSE, based on YOLOv5s. In this model, the CFnet structure (an efficient fusion of the C3 and FasterNet structures) optimizes the YOLOv5 network, improving accuracy while reducing the parameter count. Dyhead is then adopted to achieve better scale, spatial, and task awareness. In addition, a small object detection (SD) layer is added to combine feature information from different scales effectively, retain more detailed information, and improve small-object detection. Finally, the squeeze-and-excitation (SE) attention mechanism is introduced to enhance the model's feature extraction ability. Comparison and ablation experiments on the self-made underwater small object dataset URPC_UODD show that the proposed model outperforms the original YOLOv5s and other baseline models on the underwater dense small object detection task while using fewer parameters than YOLOv5s, making YOLOv5-CFDSDSE an innovative solution for underwater target detection.
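The squeeze-and-excitation (SE) mechanism named above is a standard channel-attention block. A minimal PyTorch sketch of the canonical SE design (the reduction ratio and integration point are assumptions, not the authors' settings):

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: global average pool ('squeeze'), then a two-layer
    bottleneck MLP with sigmoid ('excitation') that rescales each channel."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))           # squeeze: (B, C)
        w = self.fc(s).view(b, c, 1, 1)  # excitation: per-channel weights
        return x * w
```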

16 pages, 4970 KiB  
Article
Improved Fully Convolutional Siamese Networks for Visual Object Tracking Based on Response Behaviour Analysis
by Xianyun Huang, Songxiao Cao, Chenguang Dong, Tao Song and Zhipeng Xu
Sensors 2022, 22(17), 6550; https://doi.org/10.3390/s22176550 - 30 Aug 2022
Cited by 2 | Viewed by 2202
Abstract
Siamese networks have recently attracted significant attention in the visual tracking community due to their balance of accuracy and speed. However, because the appearance model is not updated while the target's appearance changes, tracking drift occurs regularly, particularly in background clutter. To address this problem, this paper proposes an improved fully convolutional Siamese tracker based on response behaviour analysis (SiamFC-RBA). Firstly, the SiamFC response map is normalised to an 8-bit grey image, and isohypse contours representing candidate target regions are generated by thresholding. Secondly, the dynamic behaviour of the contours is analysed to check whether distractors are approaching the tracked target. Finally, a peak switching strategy determines the real tracking position among the candidates. Extensive experiments on visual tracking benchmarks including OTB100, GOT-10k, and LaSOT showed that the proposed tracker outperformed compared trackers such as DaSiamRPN, SiamRPN, SiamFC, CSK, CFNet, and Staple, achieving state-of-the-art performance. In addition, the response behaviour analysis module was embedded into DiMP, and the experimental results showed that the proposed architecture improved that tracker's performance as well.
(This article belongs to the Special Issue Sensor Systems for Gesture Recognition II)
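The response-map preprocessing described above (normalization to an 8-bit grey image, then isohypse contours by thresholding) can be sketched with OpenCV as follows; the threshold level and the toy response map are illustrative assumptions:

```python
import cv2
import numpy as np

def candidate_contours(response: np.ndarray, level: int = 200):
    """Normalize a raw response map to 8-bit grey, then extract the isohypse
    contours at the given grey level as candidate target regions."""
    grey = cv2.normalize(response, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    _, mask = cv2.threshold(grey, level, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return grey, contours

# A second blob approaching the main peak across frames would signal a distractor.
resp = np.random.rand(17, 17).astype(np.float32)  # placeholder response map
grey, contours = candidate_contours(resp)
print(len(contours), "candidate region(s)")
```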

16 pages, 2118 KiB  
Article
Dietary Nutritional Information Autonomous Perception Method Based on Machine Vision in Smart Homes
by Hongyang Li and Guanci Yang
Entropy 2022, 24(7), 868; https://doi.org/10.3390/e24070868 - 24 Jun 2022
Cited by 10 | Viewed by 2328
Abstract
To automatically perceive users' dietary nutritional information in the smart home environment, this paper proposes a machine vision-based autonomous perception method for smart homes. Firstly, we propose a YOLOv5-based food-recognition algorithm that monitors the user's dietary intake through a social robot. Secondly, to obtain the nutritional composition of the user's dietary intake, we calibrate the weight of food ingredients and design a method for calculating nutritional composition; we then propose a dietary nutritional information autonomous perception method based on machine vision (DNPM) that supports the quantitative analysis of nutritional composition. Finally, the proposed algorithm is tested on CFNet-34, a self-expanded dataset built on the Chinese food dataset ChineseFoodNet. The average recognition accuracy of the YOLOv5-based food-recognition algorithm is 89.7%, showing good accuracy and robustness. In performance tests of the autonomous perception system in smart homes, the average nutritional composition perception accuracy was 90.1%, the response time was less than 6 ms, and the speed was higher than 18 fps, showing excellent robustness and nutritional composition perception performance.
(This article belongs to the Special Issue Information Theory-Based Deep Learning Tools for Computer Vision)
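The nutritional-composition calculation pairs each recognized food with a calibrated weight and a per-100 g nutrient table. A minimal sketch of that bookkeeping (the food names and nutrient values are placeholders, not the paper's calibration data):

```python
# Per-100 g nutrient table (illustrative values, not the paper's data).
NUTRIENTS_PER_100G = {
    "rice": {"kcal": 130, "protein_g": 2.7, "fat_g": 0.3},
    "tofu": {"kcal": 76,  "protein_g": 8.1, "fat_g": 4.8},
}

def meal_nutrition(detections):
    """Sum nutrients over (food, weight_g) pairs from the recognition stage."""
    totals = {}
    for food, weight_g in detections:
        for nutrient, per100 in NUTRIENTS_PER_100G[food].items():
            totals[nutrient] = totals.get(nutrient, 0.0) + per100 * weight_g / 100.0
    return totals

print(meal_nutrition([("rice", 150), ("tofu", 80)]))
```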

21 pages, 37252 KiB  
Article
CFNet: LiDAR-Camera Registration Using Calibration Flow Network
by Xudong Lv, Shuo Wang and Dong Ye
Sensors 2021, 21(23), 8112; https://doi.org/10.3390/s21238112 - 4 Dec 2021
Cited by 39 | Viewed by 5618
Abstract
As an essential procedure in data fusion, LiDAR-camera calibration is critical for autonomous vehicles and robot navigation. Most calibration methods require laborious manual work, complicated environmental settings, and specific calibration targets, while targetless methods rely on complex optimization workflows that are time-consuming and require prior information. Convolutional neural networks (CNNs) can regress the six degrees of freedom (6-DOF) extrinsic parameters from raw LiDAR and image data, but existing CNN-based methods only learn representations of the projected LiDAR and image and ignore the correspondences at different locations; their performance is unsatisfactory and worse than that of non-CNN methods. In this paper, we propose CFNet, a novel CNN-based LiDAR-camera extrinsic calibration algorithm. We first introduce a correlation layer to provide explicit matching capability. We then define a calibration flow that describes the deviation of the initial projection from the ground truth: instead of directly predicting the extrinsic parameters, CFNet predicts this calibration flow. The efficient Perspective-n-Point (EPnP) algorithm within a RANdom SAmple Consensus (RANSAC) scheme then estimates the extrinsic parameters from the 2D–3D correspondences constructed by the calibration flow. Because it takes geometric information into account, our method outperforms state-of-the-art CNN-based methods on the KITTI datasets; we also demonstrate its flexibility on the KITTI-360 datasets.
(This article belongs to the Section Remote Sensors)
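The final pose-recovery step, EPnP inside a RANSAC scheme over the 2D–3D correspondences built from the calibration flow, maps directly onto OpenCV's solvePnPRansac; in this sketch the correspondences and intrinsics are random placeholders standing in for the calibration-flow output:

```python
import cv2
import numpy as np

# Placeholder 2D-3D correspondences; in CFNet these come from the calibration flow.
pts3d = np.random.rand(100, 3).astype(np.float32) * 10.0
pts2d = np.random.rand(100, 2).astype(np.float32) * 500.0
K = np.array([[718.0,   0.0, 607.0],
              [  0.0, 718.0, 185.0],
              [  0.0,   0.0,   1.0]], dtype=np.float32)  # KITTI-like intrinsics

# Estimate the 6-DOF extrinsics with EPnP inside a RANSAC loop.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts3d, pts2d, K, distCoeffs=None,
    flags=cv2.SOLVEPNP_EPNP, reprojectionError=3.0, iterationsCount=100)

if ok:
    R, _ = cv2.Rodrigues(rvec)  # rotation matrix from the Rodrigues vector
    print("R:\n", R, "\nt:", tvec.ravel())
```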