Article

Study on Real-Time Detection of Lightweight Tomato Plant Height Under Improved YOLOv5 and Visual Features

1 Zhongshan Polytechnic Institute, Zhongshan 528400, China
2 Zhongshan Technical Secondary School, Zhongshan 528458, China
3 College of Engineering, South China Agricultural University, Guangzhou 510642, China
4 Meizhou Municipal Agricultural Comprehensive Service Center, Meizhou 514021, China
* Author to whom correspondence should be addressed.
Processes 2024, 12(12), 2622; https://doi.org/10.3390/pr12122622
Submission received: 18 October 2024 / Revised: 9 November 2024 / Accepted: 19 November 2024 / Published: 21 November 2024

Abstract

Tomato plants are cultivated densely, and the main stem is easily lost against a background with little color contrast. The semi-enclosed planting space and fast growth cycle further constrain detection technology, so the accuracy and real-time performance of plant height detection are of great practical significance. To this end, we improve YOLOv5 and propose a lightweight real-time plant height detection method that combines the visual features of tomato main stems. We improved the backbone, neck, head, and activation functions of YOLOv5: CSPDarknet53-s serves as the backbone structure, and a focus structure is introduced to reduce the number of GE modules. We replaced all CSP2_X structures in the neck and head with GE modules, embedded interactive multi-head attention, and replaced YOLOv5's framework activation function and attention activation function. Visual features such as the color of the tomato main stem are defined in the preprocessed image and input into the improved YOLOv5, and plant height detection is completed through effective feature map fusion, main stem framing, and scale conversion. The experimental results show that the linear deviation between the detected and actual plant height is always less than 3 cm, and detection reaches up to 67 frames per second, with superior timeliness, effectively achieving lightweight real-time detection.

1. Introduction

With the continuous progress of agricultural science and technology and the increasing demand for precision agriculture, the real-time monitoring and evaluation management of crop growth parameters has become particularly important [1]. As one of the key indicators to measure crop growth, plant height can reflect plant growth rate, health status, adaptability to the external environment, and possible yield trend [2]. Therefore, how to obtain plant height accurately and quickly has become the exploration direction of agriculture, forestry, ecology, and many other fields. By obtaining crop plant height, farmers and researchers can not only screen varieties with excellent traits, improve the genetic material of crops, optimize the growing environment according to the growth conditions of crops, and provide a basis for ecological protection and environmental governance, but also predict the yield trend of crops and provide support for resource allocation and planting management decisions in agricultural production [3,4].
At present, there are two main ways to obtain plant height: direct and indirect. With the continuous progress of advanced technologies such as computer vision, the means of obtaining plant height are constantly being updated and improved, and indirect nondestructive testing has gradually attracted extensive attention from scholars worldwide; many methods have been proposed, but problems remain [5,6]. For example, in the lettuce plant height detection method proposed by He Xingyao et al. [7], only the attention mechanism of YOLOv5 was improved, interference from the farmland environment was not suppressed, and errors arose during image processing of UAV oblique photography. The UAV method proposed by Wu Tingting et al. [8] for measuring wheat plant height ensured the accuracy of the acquired plant height phenotype, but the remote sensing operation was time-consuming and inefficient. In the UAV RGB remote sensing method for estimating sugarcane plant height proposed by Liang Yongjian et al. [9], the RGB model relies too heavily on the reflection information of the red, green, and blue bands, and illumination intensity and angle, as well as excessive plant density, affect the estimation accuracy. In prediction models built on near-infrared, Raman, and fluorescence spectra, Johannes et al. [10] found that different excitation light sources for fluorescence spectra and changes in the farmland environment all lead to spectral differences and fluctuations, preventing accurate measurement of wheat plant height. In the maize plant height estimation method based on fusing Sentinel-1 dual-polarization SAR data, backscattering coefficients, and depolarization parameters studied by Wang et al. [11], although inversion accuracy in the later crop growth period improved, the Sentinel-1 satellite has a long revisit period and cannot obtain continuous maize growth data in time [12], and surface factors interfere with the SAR signal, causing estimation errors.
Tomato, a widely grown vegetable crop, changes rapidly during growth, leaving little time for measurement, and is usually planted in a greenhouse, where the extremely limited operating space is unsuitable for drones [13]. At the same time, tomato planting is relatively dense: the main stems of different plants are easily confused with each other and with background elements such as leaves of similar color, which increases the difficulty of plant height detection. Therefore, based on the backbone network architecture of a lightweight YOLOv5, a lightweight real-time plant height detection method was designed by introducing the visual features of tomato appearance [14]. The lightweight improvements to various aspects of YOLOv5 allow it to meet the dual demands of plant height detection: speed and accuracy. Defining the visual features of tomato main stems, such as color, texture, and shape, locates the detection target more accurately in a complex background and retains useful information about the main stem, ensuring detection accuracy while improving real-time performance.

2. Improvement in YOLOv5 for Real-Time Detection of Plant Height of Lightweight Tomato

For the specific task of real-time tomato plant height detection, YOLOv5 has a structure that is more easily modified and optimized than the YOLOv8 and YOLOv11 versions and can better adapt to the task. YOLOv5 is also more lightweight in its computational resource requirements and better suited to the small model designed in this paper. YOLOv5 is improved in the backbone, neck, head, and other components in order to reduce the model scale and parameter count and to strengthen target recognition against a complex background, aiming to meet the lightweight, real-time detection requirements and to handle the density of tomato plant height detection targets.
The improved YOLOv5 model is shown in Figure 1; it is the main component used to realize the lightweight real-time detection of tomato plant height.
Specific improvement methods in various aspects of the model are described as follows:
(1)
Backbone: The backbone of YOLOv5 is replaced with CSPDarknet53-s [15], and the number of GE modules is reduced to achieve a lightweight structure and simplified computation. A Ghost module [16] and MobileNetV2 [17] are combined to build the ghost bottleneck structure of the GE module, which converts information between small and large dimensions to prevent information loss due to scale compression. To compensate for the structure's precision loss and make the feature map information more complete and rich, the GE module channels are extended and a focus structure [18] is introduced to slice the input image.
(2)
Neck and head: First, all CSP2_X structures in the neck and head are replaced with GE modules, and all standard convolutions are replaced with depthwise separable convolutions [19] to avoid parameter redundancy and reduce computation. Then, interactive multi-head attention is embedded in the neck [20]; after obtaining spatial attention and channel attention simultaneously, feature information is selected along both dimensions.
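To see why the depthwise separable replacement shrinks the model, the parameter counts of the two convolution types can be compared directly. This is an illustrative back-of-the-envelope sketch; the layer sizes below are hypothetical, not taken from the paper:

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a standard k x k convolution (bias terms omitted)."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution."""
    return c_in * k * k + c_in * c_out

# Example: a 3 x 3 layer with 64 input and 128 output channels
standard = conv_params(64, 128, 3)           # 73728 parameters
separable = dw_separable_params(64, 128, 3)  # 8768 parameters, roughly 8.4x fewer
```

The saving grows with the number of channels, which is why this substitution is a standard lightweighting step.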
(3)
Activation function: To deepen the information flow in the neural network, YOLOv5's framework activation function is replaced with the swish function [21], and the embedded attention activation function is replaced with the H-sigmoid function [22] to improve detection accuracy.
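The two replacement activations have simple closed forms: swish is x·sigmoid(x), and H-sigmoid is the piecewise-linear ReLU6(x + 3)/6 approximation of the sigmoid, which is cheap to evaluate in lightweight attention blocks. A minimal scalar sketch:

```python
import math

def swish(x):
    """Swish activation: x * sigmoid(x); smooth and non-monotonic near zero."""
    return x / (1.0 + math.exp(-x))

def h_sigmoid(x):
    """Hard sigmoid: ReLU6(x + 3) / 6, a piecewise-linear approximation
    of the sigmoid that avoids the exponential."""
    return min(max(x + 3.0, 0.0), 6.0) / 6.0
```

In a framework such as PyTorch these correspond to `nn.SiLU` and `nn.Hardsigmoid`, applied elementwise to feature maps.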

3. Lightweight Real-Time Detection of Tomato Plant Height

The improved YOLOv5 model plays the core role in real-time detection and is well suited to the lightweight real-time detection of tomato plant height, so it serves as the implementation body of the method. The main stem color, texture, shape, and other visual features captured from tomato images are input into the improved model, which captures higher-quality visual features and adapts better to the complex, dense background environment, reducing the influence of lighting conditions, occlusion, shooting angle, and other factors when identifying the main stem of tomato plants [23]; finally, through layer-by-layer inference, the plant height detection result is obtained.
The real-time detection process of lightweight tomato plant height based on improved YOLOv5 is described as follows:
(1)
Delineation of the visual characteristics of the main stems of tomato plants. Color, texture, and shape are selected as the visual characteristics of the main stem to avoid deviation caused by relying on a single feature.
(a) Color is captured by color moments, which do not require quantizing color features [24]; the dispersion σi and skew si of the color distribution are calculated as follows:
$$\sigma_i=\left[\frac{1}{N}\sum_{j=1}^{N}\left(p_{ij}-\frac{1}{N}\sum_{j=1}^{N}p_{ij}\right)^{2}\right]^{\frac{1}{2}},\qquad s_i=\left[\frac{1}{N}\sum_{j=1}^{N}\left(p_{ij}-\frac{1}{N}\sum_{j=1}^{N}p_{ij}\right)^{3}\right]^{\frac{1}{3}}$$
In the above formulas, p_ij stands for the i-th color value component of pixel j among the N pixels.
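The two moments above can be computed directly per color channel. A minimal sketch (the function name is illustrative; a signed cube root is used since the third central moment can be negative):

```python
import numpy as np

def color_moments(channel):
    """Dispersion (2nd moment) and skew (3rd moment) of one color channel,
    following the formulas above; `channel` holds the N values p_ij of
    color component i."""
    p = np.asarray(channel, dtype=float).ravel()
    n = p.size
    mean = p.sum() / n
    sigma = (np.sum((p - mean) ** 2) / n) ** 0.5
    third = np.sum((p - mean) ** 3) / n
    # signed cube root: the third central moment may be negative
    skew = np.sign(third) * np.abs(third) ** (1.0 / 3.0)
    return sigma, skew
```

A uniform channel yields (0, 0); a symmetric distribution yields zero skew.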
(b) Texture is captured by a gray-level co-occurrence matrix based on the gray-level spatial correlation matrix [25]. To reduce the amount of computation and remove the adjacent interval and direction information about image gray levels in the matrix, the texture of the main stem of a tomato plant is obtained through the following L-gray-level optimization formula:
$$R(g,h)=\begin{bmatrix}r_{00}&r_{01}&\cdots&r_{0,L-1}\\ r_{10}&r_{11}&\cdots&r_{1,L-1}\\ \vdots&\vdots&\ddots&\vdots\\ r_{L-1,0}&r_{L-1,1}&\cdots&r_{L-1,L-1}\end{bmatrix}_{L\times L}$$
In the above formula, (g, h) stands for a pair of gray levels, g, h = 0, …, L − 1, and matrix element r_gh is a counter that counts all pixel pairs in the image satisfying
$$G(x,y)=g,\qquad G(x+d_x,\,y+d_y)=h$$
where G(x, y) stands for the gray level of pixel (x, y), and d_x and d_y stand for the offsets in different directions.
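The counting rule above can be sketched as a direct loop over pixel pairs (a naive reference implementation for a single offset; production code would use a library routine such as scikit-image's `graycomatrix`):

```python
import numpy as np

def gray_cooccurrence(gray, dx, dy, levels):
    """Build the L x L gray-level co-occurrence matrix R(g, h): r_gh counts
    pixel pairs with G(x, y) = g and G(x + dx, y + dy) = h."""
    img = np.asarray(gray)
    rows, cols = img.shape
    r = np.zeros((levels, levels), dtype=int)
    for y in range(rows):
        for x in range(cols):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < cols and 0 <= y2 < rows:
                r[img[y, x], img[y2, x2]] += 1
    return r
```

For a 2 × 2 checkerboard with offset (1, 0), the only co-occurring pairs are (0, 1) and (1, 0).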
(c) Shape is mainly captured through curve fitting [26]. First, according to the vertical orientation of the main stem, key points are placed at equal vertical intervals along the contour curve. The spacing values Dleft and Dright of the left and right stem contours are solved as
$$D_{\mathrm{left}}=\frac{l}{n_{\mathrm{left}}},\qquad D_{\mathrm{right}}=\frac{l}{n_{\mathrm{right}}}$$
where l stands for the length of the major axis of the elliptic model used to fit the main stem, and nleft and nright stand for the numbers of key points on the left and right contours.
The following quadratic curves are fitted to the key points of the left and right contours (sampled at spacings Dleft and Dright, respectively) to obtain the fitted curve parameters, that is, the visual shape feature vector (α1, α2, α3, β1, β2, β3):
$$Y_1=\alpha_1X^2+\alpha_2X+\alpha_3,\qquad Y_2=\beta_1X^2+\beta_2X+\beta_3$$
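The quadratic fits can be obtained with an ordinary least-squares polynomial fit. A minimal sketch (the function name is illustrative; key points are assumed to be (X, Y) pairs sampled from the two contours):

```python
import numpy as np

def shape_feature_vector(left_pts, right_pts):
    """Fit Y = a1*X^2 + a2*X + a3 to the left contour key points and
    Y = b1*X^2 + b2*X + b3 to the right ones; the six coefficients form
    the shape feature vector (alpha1..alpha3, beta1..beta3)."""
    lx, ly = zip(*left_pts)
    rx, ry = zip(*right_pts)
    alphas = np.polyfit(lx, ly, 2)  # (alpha1, alpha2, alpha3)
    betas = np.polyfit(rx, ry, 2)   # (beta1, beta2, beta3)
    return np.concatenate([alphas, betas])
```

With three key points per side the quadratic passes through them exactly; with more points, `polyfit` returns the least-squares solution.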
(2)
The three types of visual features are input into the improved YOLOv5 model, and through step-by-step operation and inference in the backbone, neck, and head, effective feature maps with receptive fields of different sizes are obtained. The main stems of tomato plants are separated from the background using the feature pyramid network [27] and the three effective feature maps.
(3)
Using the obtained ratio B of the actual plant height H0 to the detected plant height H0′, the following scale conversion solves the actual tomato plant height Ht and completes the detection:
$$H_t=\frac{H_0}{H_0'}\times H_t'=B\times H_t'$$
In the above formula, Ht′ stands for the detected frame height; the height of the outer bounding box of the main stem is used.
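The conversion in step (3) is a single calibrated ratio. A minimal sketch (function and argument names are illustrative; the detected heights are in pixels, the actual heights in meters):

```python
def actual_plant_height(h0_actual, h0_detected, frame_height):
    """Scale conversion: B = H0 / H0', then Ht = B * Ht', where Ht' is the
    detected height of the main-stem bounding box in pixels."""
    b = h0_actual / h0_detected  # meters per pixel, from a calibration plant
    return b * frame_height
```

For example, if a calibration plant of 1.5 m spans 300 px, a 240 px bounding box corresponds to 1.2 m.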

4. Experiments

In the demonstration base of the Agricultural Extension Center of Zhongshan City, Guangdong Province, multiple images of tomato varieties planted in solar greenhouses were collected. The field environment of image collection and plant height detection is shown in Figure 2.
Images of tomato plants grown in solar greenhouses were collected using a Parrot Sequoia multispectral camera, which contains an RGB sensor; the collected images were integrated with the plant height detection data to prepare the experimental data set. The data set included 105 images. Labels were assigned based on the actual height of the tomato main stem, in classes of 0.1 m; the tallest main stem was 2.5 m and the shortest 0.3 m, and the labels were divided into 22 classes. The data were cleaned and preprocessed: images were checked for overexposure, and all images were resized to a resolution of 1920 × 1080 pixels for unified processing. Edge detection methods were used to extract the shape information related to plant height in the images, and the data were augmented. The augmented data set was divided into a 70% training set and a 30% test set. A representative image from the training set is shown in Figure 3.
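The 70/30 split described above can be sketched as a seeded shuffle-and-cut over the image paths (the seed value here is an arbitrary choice, not from the paper):

```python
import random

def split_dataset(image_paths, train_frac=0.7, seed=42):
    """Shuffle image paths and split them into training and test subsets,
    mirroring the 70%/30% division described above."""
    rng = random.Random(seed)
    items = list(image_paths)
    rng.shuffle(items)
    cut = int(len(items) * train_frac)
    return items[:cut], items[cut:]
```

Seeding the shuffle keeps the split reproducible across runs.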
After the data set was imported into the image processing software (OpenCV 4.5, Intel, Santa Clara, CA, USA) for filtering, enhancement, normalization, and other preprocessing steps, 1000 images were randomly selected as the training set and the rest as the test set by a random sampling program. The improved YOLOv5 model was trained on the processed tomato image training set and then applied to the real-time detection task on the test set, running on the PyTorch 2.1.1 and Python 3.10.0 development framework and trained and tested with an Intel Core i9 processor and an NVIDIA 3080 Ti graphics card (12 GB). The tomato plant height in the image set was measured quickly and accurately.
In the training stage, the initial training parameters of the stochastic gradient descent optimizer and YOLOv5 model for training were set, as shown in Table 1.
Bayesian optimization was used to tune the hyperparameters. Fifty candidate values were selected for each hyperparameter, combinations were sampled in the hyperparameter space, and search resources were allocated across the space to select the optimal hyperparameter combination. After 500 iterations, the best combination was obtained: a learning rate of 0.06, a momentum coefficient of 0.875, a weight decay of 0.00027, a batch size of 45, and 210 epochs.
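The search procedure can be sketched with a simple random search over a discrete hyperparameter space. This is a stand-in for the Bayesian optimization used in the paper (which models the objective to choose promising samples rather than sampling uniformly); all names and the toy objective below are illustrative:

```python
import random

def random_search(objective, space, n_trials=500, seed=0):
    """Randomly sample combinations from a discrete hyperparameter space and
    keep the best-scoring one (lower is better); a simplified stand-in for
    Bayesian optimisation."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        score = objective(cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

In practice the objective would be the validation loss of a training run with configuration `cfg`.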

4.1. Ablation Experiment of Improved YOLOv5 Algorithm

To investigate the performance of the improved YOLOv5 model, an ablation study was conducted from the perspectives of parameter count, computation amount, and pixel granularity accuracy [28] on YOLOv5; YOLOv5 with improved backbone (B-YOLOv5), improved neck (N-YOLOv5), or improved head (H-YOLOv5); the three variants obtained by combining two of the improvement strategies (BN-YOLOv5, NH-YOLOv5, BH-YOLOv5); and the model obtained by applying all three strategies (the improved YOLOv5 shown in Figure 1). The experimental results are shown in Figure 4. The reference accuracy was 0.8, the recall rate 0.7, mAP@0.5 was 0.65, and mAP@0.95 was 0.5. Pixel granularity accuracy was evaluated at @0.5 and @0.4:1, which respectively denote the average pixel granularity at an IoU (intersection over union) threshold of 0.5 and over the range 0.4~1 in steps of 0.06.
As shown in Figure 4, comparing the parameter counts and scales of the different YOLOv5 variants, the proposed improved YOLOv5 reduced the parameter count by 6.61 M and simplified the computation to 3.84 MB, while the @0.5 pixel granularity accuracy improved by 0.101 and the @0.4:1 accuracy by 0.172. This is because when the main stem color, texture, shape, and other visual features captured from tomato images are input into the improved YOLOv5 model, the model captures higher-quality visual features, giving it better detection performance.

4.2. Deviation and Speed Test of Plant Height Detection

For tomato crops in the study area, the UAV measurement method, the multi-spectral fusion measurement method, and the designed detection method were used to obtain plant height [29,30]. The real-time detection performance of the proposed method was analyzed by comparing the linear deviation of the measured plant height and the FPS (frames per second) level [31]. We selected one planting ditch in the study area; the plant height detection of the tomatoes in this single ditch is shown in Figure 5.
In Figure 5a, compared with the other methods, the linear deviation between the measured and actual plant height of the proposed method was the lowest, always less than 3 cm; however, due to the limitations of aerial photography height and insufficient light, the deviation for tomato plants under 1 m was the largest. In Figure 5b, the FPS trends of the different methods show that the detection speed of the proposed method was significantly faster, reaching about 67 frames per second, while the FPS of the UAV measurement method did not exceed 55 frames per second [32] and that of the multi-spectral fusion measurement method did not exceed 45 frames per second, showing poor real-time performance.
To further verify its effectiveness, the proposed method was compared with the YOLOv8, YOLOv9, and YOLOv11 methods; the results of real-time tomato plant height detection are shown in Figure 6.
In Figure 6a, the linear deviation between the measured and actual plant height of the proposed method was the lowest, always less than 2 cm. This is because the improved YOLOv5 model performs lightweight real-time computation on the data, so the resulting deviation is low, whereas the results of the YOLOv8, YOLOv9, and YOLOv11 methods were biased by their excessive computation. In Figure 6b, the detection speed of the proposed method was about 69 frames per second, much higher than that of the other three methods: the FPS of the YOLOv11 method did not exceed 50 frames per second, that of the YOLOv8 method did not exceed 45 frames per second, and that of the YOLOv9 method did not exceed 42 frames per second. The poor real-time performance of these existing methods indicates the accuracy and practicability of the proposed method.

5. Conclusions

Tomato is an important crop, and accurate monitoring of its growth state is essential for optimizing plant management and improving yield and quality [33]. Plant height is one of the key indicators of tomato growth, and measuring it accurately in real time is of great significance for agricultural production [34]. With the development of modern agriculture, automatic plant height detection methods that do not rely on manual measurement have become mainstream [35]. However, directly applying the YOLOv5 real-time object detection algorithm to plant height detection brings problems such as high complexity and large resource requirements [36]. Therefore, supported by the lightweight, computationally efficient improved YOLOv5 structure; the simplified calculation; the rich information provided by the main stem's color, texture, shape, and other visual features [37]; and the effective suppression of interference from imperfect collection methods and lighting conditions in real environments, most of the measured tomato plant heights are basically consistent with the actual height of the main stem [38], with good detection efficiency. However, under lighting conditions such as cloudy days or strong light, the visibility and feature recognizability of tomato plants in the images degrade, affecting plant height detection. In the future, optimizing tomato planting management can be further explored to improve the yield and quality of tomatoes.

Author Contributions

Data curation, L.L. and L.W.; funding acquisition, W.W.; project administration, W.W.; formal analysis, L.L., J.L. and C.Z.; investigation, W.W.; methodology, L.L., L.W. and C.Z.; software, L.W., P.X. and C.F.; supervision, W.W.; resources, L.W. and P.X.; validation, W.W.; writing—original draft, L.L.; writing—review and editing, J.L. and C.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by Guangdong Province universities in key areas of special science and technology service rural revitalization project (2023ZDZX4133), and Guangdong Province (Shenzhen) Digital and Intelligent Agricultural Service Industrial Park (FNXM012022020-1).

Data Availability Statement

The data presented in this study are available in the article.

Acknowledgments

The authors acknowledge the editors and reviewers for their constructive comments and all the support for this work.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Karunathilake, M.; Le, A.T.; Heo, S.; Chung, Y.S.; Mansoor, S. The Path to Smart Farming: Innovations and Opportunities in Precision Agriculture. Agriculture 2023, 13, 1593. [Google Scholar] [CrossRef]
  2. Chen, C.; Zhang, L.; Ni, K. Simulation of Pepper Plant Height, Fruit Growth and Yield Based on Logistic Model with Film Mulching. J. Yunnan Agric. Univ. 2023, 38, 1049–1058. [Google Scholar]
  3. Yin, C.; Wang, S.; Liu, H. Screening of High-yield and Good-taste Japonica Rice Varieties (Lines) in the Yellow River Basin of Henan Province and Their Characteristics. Jiangsu Agric. Sci. 2022, 50, 60–67. [Google Scholar]
  4. Cai, G.; Tao, J. Crop Growth Environment Control Based on Bacterial Foraging Optimized Multi-kernel Support Vector Machine. J. Univ. Jinan 2023, 37, 303–308. [Google Scholar]
  5. Yang, G.; Wang, J.; Nie, Z.; Yang, H.; Yu, S. A Lightweight YOLOv8 Tomato Detection Algorithm Combining Feature Enhancement and Attention. Agronomy 2023, 13, 1824. [Google Scholar] [CrossRef]
  6. Lin, S.; Xu, T.; Ge, Y.; Ma, J.; Sun, T.; Zhao, C. 3D Information Detection Method of facility tomato based on improved YOLOv5l. J. Chin. Agric. Mech. 2024, 45, 274–284. [Google Scholar]
  7. He, X.; Feng, T.; Liang, H.; Yuan, J. Lettuce Plant Height Detection Driven by Unmanned Aerial Vehicle Image Data. Electron. Meas. Technol. 2023, 46, 169–176. [Google Scholar]
  8. Wu, T.; Liu, X.; Nie, R.; Liu, J.; Wu, L.; Li, T. Unmanned Aerial Vehicle Measurement Method for Wheat Plant Height Based on Fine-grained Calibration. Trans. Chin. Soc. Agric. Mach. 2023, 54, 158–167. [Google Scholar]
  9. Liang, Y.; Wu, W.; Shi, Z.; Tang, L.; Song, X.; Yan, M.; Guo, Q.; Qin, C.; He, H.; Zhang, X. Estimation of Sugarcane Plant Height Based on Unmanned Aerial Vehicle RGB Remote Sensing. Crops 2023, 1, 226–232. [Google Scholar]
  10. Johannes, N.; Kaiser, L.; Longin, C.F.H.; Hitzmann, B. Prediction of Wheat Quality Parameters Combining Raman, Fluorescence, and Near-Infrared Spectroscopy (NIRS). Cereal Chem. 2022, 99, 830–842. [Google Scholar]
  11. Wang, Y.; Fang, S.; Zhao, L.; Huang, X. Estimation of Maize Plant Height in North China by Means of Backscattering Coefficient and Depolarization Parameters Using Sentinel-1 Dual-Pol SAR Data. Int. J. Remote Sens. 2022, 43, 1960–1982. [Google Scholar] [CrossRef]
  12. Busquier, M.; Valcarce-Diñeiro, R.; Lopez-Sanchez, J.M.; Plaza, J.; Sánchez, N.; Arias-Pérez, B. Fusion of Multi-Temporal PAZ and Sentinel-1 Data for Crop Classification. Remote Sens. 2021, 13, 3915. [Google Scholar] [CrossRef]
  13. Maham, G.; Rahimi, A.; Subramanian, S.; Smith, D.L. The environmental impacts of organic greenhouse tomato production based on the nitrogen-fixing plant (Azolla). J. Clean. Prod. 2020, 245, 118679. [Google Scholar] [CrossRef]
  14. Raja, R.; Nguyen, T.; Slaughter, C.; Fennimore, S.A. Real-time robotic weed knife control system for tomato and lettuce based on geometric appearance of plant labels. Biosyst. Eng. 2020, 194, 152–164. [Google Scholar] [CrossRef]
  15. Jin, M.; Li, Y.; Zhang, L.; Ma, Z. An Improved Lightweight Object Detection Algorithm Based on Attention Mechanism. Laser Optoelectron. Prog. 2023, 60, 385–392. [Google Scholar]
  16. Yu, Y.; Mo, Y.; Yan, J.; Xiong, C.; Dou, S.; Yang, R. Research on Citrus Disease Recognition Based on Improved ShuffleNet V2. J. Henan Agric. Sci. 2024, 53, 142–151. [Google Scholar]
  17. Peng, Y.; Li, S. A Model for Leaf Disease Recognition of Crops Based on Reparameterized MobileNetV2. Trans. Chin. Soc. Agric. Eng. 2023, 39, 132–140. [Google Scholar]
  18. Wang, Q.; Li, H.; Xie, L.; Xie, J.; Peng, S. Research on the Improvement of Vehicle Target Detection Algorithm Based on Lidar Point Cloud. Electron. Meas. Technol. 2023, 46, 120–126. [Google Scholar]
  19. Gu, T.; Sun, Y.; Lin, H. A Semantic Segmentation Network for Complex Background Characters Based on Lightweight UNet. J. South Cent. Univ. Natl. 2024, 43, 273–279. [Google Scholar]
  20. Zhang, T.; Guo, Q.; Li, Z.; Deng, L. MC-CA: Multi-modal Emotion Analysis Based on Modality Time Series Coupling and Interactive Multi-head Attention. J. Chongqing Univ. Posts Telecommun. 2023, 35, 680–687. [Google Scholar]
  21. Cheng, Q.; Li, J.; Du, J. Ship Target Detection Algorithm in Optical Remote Sensing Images Based on YOLOv5. Syst. Eng. Electron. 2023, 45, 1270–1276. [Google Scholar]
  22. Xu, H.; Yang, D.; Jiang, Q.; Lin, L. Improvement of Lightweight Vehicle Detection Network Based on SSD. Comput. Eng. Appl. 2022, 58, 209–217. [Google Scholar]
  23. Liu, C.; Feng, Q.; Sun, Y.; Li, Y.; Ru, M.; Xu, L. YOLACTFusion: An instance segmentation method for RGB-NIR multimodal image fusion based on an attention mechanism. Comput. Electron. Agric. 2023, 213, 108186. [Google Scholar] [CrossRef]
  24. Li, W.; Wang, Z.; Cui, L. Target Tracking Algorithm Based on Multi-feature Fusion and Improved SIFT. J. Zhengzhou Univ. 2024, 56, 40–46. [Google Scholar]
  25. Dai, G.; Tian, Z.; Fan, J.; Wang, C. Plant Leaf Disease Enhancement Recognition Method for Neural Network Structure Search. J. Northwest For. Univ. 2023, 38, 153–161. [Google Scholar]
  26. Liu, M.; Chen, J.; Chen, H. Building Point Cloud Extraction Method Based on Vector Angle Difference and Fitting Curve Fusion. Sci. Technol. Eng. 2023, 23, 15360–15369. [Google Scholar]
  27. Dang, J.; Tang, X.; Li, S. HA-FPN: Hierarchical Attention Feature Pyramid Network for Object Detection. Sensors 2023, 23, 4508. [Google Scholar] [CrossRef]
  28. Zuo, X.; Chu, J.; Shen, J.; Sun, J. Multi-Granularity Feature Aggregation with Self-Attention and Spatial Reasoning for Fine-Grained Crop Disease Classification. Agriculture 2022, 12, 1499. [Google Scholar] [CrossRef]
  29. Shahi, B.; Xu, Y.; Neupane, A.; Guo, W. Recent Advances in Crop Disease Detection Using UAV and Deep Learning Techniques. Remote Sens. 2023, 15, 2450. [Google Scholar] [CrossRef]
  30. Jimenez-Sierra, D.A.; Benítez-Restrepo, H.D.; Vargas-Cardona, H.D.; Chanussot, J. Graph-Based Data Fusion Applied to: Change Detection and Biomass Estimation in Rice Crops. Remote Sens. 2020, 12, 2683. [Google Scholar] [CrossRef]
  31. Zhang, X.; Li, X.; Zhang, B.; Zhou, J.; Tian, G.; Xiong, Y.; Gu, B. Automated robust crop-row detection in maize fields based on position clustering algorithm and shortest path method. Comput. Electron. Agric. 2018, 154, 165–175. [Google Scholar] [CrossRef]
  32. Lu, X.; Zhou, J.; Yang, R.; Yan, Z.; Lin, Y.; Jiao, J.; Liu, F. Automated Rice Phenology Stage Mapping Using UAV Images and Deep Learning. Drones 2023, 7, 83. [Google Scholar] [CrossRef]
  33. Lee, U.; Islam, M.; Kochi, N.; Tokuda, K.; Nakano, Y.; Naito, H.; Kawasaki, Y.; Ota, T.; Sugiyama, T.; Ahn, D.-H. An Automated, Clip-Type, Small Internet of Things Camera-Based Tomato Flower and Fruit Monitoring and Harvest Prediction System. Sensors 2022, 22, 2456. [Google Scholar] [CrossRef] [PubMed]
  34. Wang, S.; Hu, Z.; Chen, Y.; Wu, H.; Wang, Y.; Wu, F.; Gu, F. Integration of agricultural machinery and agronomy for mechanised peanut production using the vine for animal feed. Biosyst. Eng. 2022, 219, 135–152. [Google Scholar] [CrossRef]
  35. Li, J.; Zhou, Y.; Zhang, H.; Pan, D.; Gu, Y.; Luo, B. Maize plant height automatic reading of measurement scale based on improved YOLOv5 lightweight model. PeerJ Comput. Sci. 2024, 10, 2207. [Google Scholar] [CrossRef]
  36. Zhao, Z.; Wang, J.; Zhao, H. Research on Apple Recognition Algorithm in Complex Orchard Environment Based on Deep Learning. Sensors 2023, 23, 5425. [Google Scholar] [CrossRef]
  37. Zheng, X.; Rong, J.; Zhang, Z.; Yang, Y.; Li, W.; Yuan, T. Fruit growing direction recognition and nesting grasping strategies for tomato harvesting robots. J. Field Robot. 2023, 41, 300–313. [Google Scholar] [CrossRef]
  38. Wang, Y.; Liu, Q.; Yang, J.; Ren, G.; Wang, W.; Zhang, W.; Li, F. A Method for Tomato Plant Stem and Leaf Segmentation and Phenotypic Extraction Based on Skeleton Extraction and Supervoxel Clustering. Agronomy 2024, 14, 198. [Google Scholar] [CrossRef]
Figure 1. Improved YOLOv5 schematic.
Figure 2. Image acquisition and plant height detection site map.
Figure 3. Tomato image.
Figure 4. Schematic diagram of the improved YOLOv5 ablation experiment. (a) Parameter and computational quantities; (b) pixel granularity accuracy.
Figure 5. Schematic diagram of plant height detection performance of different methods. (a) Linearity deviation of the measured value; (b) FPS value.
Figure 6. Performance results of tomato plant height detection by different comparison methods. (a) Linearity deviation of the measured value; (b) FPS value.
Table 1. Hyperparameter setting based on model training.

Module                                  Argument              Numerical Value
Stochastic gradient descent optimizer   Learning rate         0.02
                                        Momentum coefficient  0.965
                                        Weight decay          0.0003
YOLOv5 model                            Batch size            48
                                        Epoch number          200

Leng, L.; Wang, L.; Lv, J.; Xie, P.; Zeng, C.; Wu, W.; Fan, C. Study on Real-Time Detection of Lightweight Tomato Plant Height Under Improved YOLOv5 and Visual Features. Processes 2024, 12, 2622. https://doi.org/10.3390/pr12122622
