Article

A Comparative Study of Vetiveria zizanioides Leaf Segmentation Techniques Using Visible, Infrared, and Thermal Camera Sensors in an Outdoor Environment

by Aryuanto Soetedjo 1,* and Evy Hendriarianti 2
1 Department of Electrical Engineering, National Institute of Technology (ITN), Malang 65145, Indonesia
2 Department of Environmental Engineering, National Institute of Technology (ITN), Malang 65145, Indonesia
* Author to whom correspondence should be addressed.
Appl. Syst. Innov. 2023, 6(1), 1; https://doi.org/10.3390/asi6010001
Submission received: 14 November 2022 / Revised: 13 December 2022 / Accepted: 16 December 2022 / Published: 20 December 2022

Abstract

A camera vision system is a fast and effective approach to monitoring leaves. It can be used to monitor plant growth, detect diseases, and conduct plant phenotyping. However, the outdoor environment in which plants grow makes such a system challenging to use. This paper addresses the problem of Vetiveria zizanioides leaf segmentation by comparing different camera types and segmentation techniques. Visible, no infrared filter (NoIR), and thermal cameras interfaced to an embedded device were used to capture plants during the day and at night. Several popular thresholding techniques and the K-Means algorithm were employed for leaf segmentation, and their performance was measured using Recall, Precision, and F1 score. The comparison results show that the visible camera achieved the best performance on daytime images, with the highest Recall of 0.934 using Triangle thresholding, the highest Precision of 0.751 using K-Means (K = 3), and the highest F1 score of 0.794 using Multi-Otsu thresholding. For nighttime images, the highest Recall of 0.990 was achieved by the thermal camera using Isodata and Otsu thresholding, while the highest Precision of 0.572 and the highest F1 score of 0.636 were both achieved by the NoIR camera using K-Means (K = 3). To compare the leaf segmentation performance of the thresholding techniques and the K-Means algorithm between our image dataset and a well-known plant image dataset, we also evaluated the methods on the Ara2012 image dataset. The results showed that K-Means (K = 3) achieved the best performance. The execution time of K-Means was about 3 s, longer than that of the thresholding techniques but still acceptable for a real-time plant monitoring system.

1. Introduction

Image processing approaches have been used in a range of applications. One of the challenging and interesting applications is plant monitoring, where camera systems are employed to capture information of plants for further tasks, such as leaf detection and counting [1,2,3,4,5], growth monitoring [6,7,8,9], leaf classification [10], disease detection [11,12,13], stress monitoring [14,15], and phenotyping [16,17,18].
A leaf segmentation technique is commonly used as the preliminary process, which extracts the leaf area from the background in the image. Once the leaf area is detected, image analysis of the leaf object can be carried out. The most common leaf segmentation techniques are thresholding-based approaches, watershed, random walker, K-Means, artificial neural networks (ANNs), and color-index-based approaches.
HSV color thresholding was used in [19] for rice plant segmentation as the initial stage of plant height measurement. Color thresholding using the CIELAB color model was used in [20] for vine leaf segmentation from real environment images. HSI color segmentation was employed in [9] to extract leafy vegetables using a Kinect sensor. In [21], thresholding using a new color model was used to measure the leaf area of lettuce plants. Otsu thresholding based on the hue component was used for leaf area measurement in [22]. A software tool for estimating the leaf area was developed in [23] using RGB color thresholding to extract the leaf from the background.
The watershed algorithm has been used for rosette plant leaf segmentation [16] with a Dice score greater than 90%, cotton leaf segmentation [24] with a correct rate of 98%, and vegetable leaf segmentation [25]. The random walker algorithm was used in an interactive tool for leaf annotation [26] with a Dice of 97%. The robust random walker algorithm was proposed in [27] for leaf segmentation under different conditions with an F-measure of 98%. The K-Means algorithm was used for extraction of paddy leaf images for detecting leaf disease [12], conducting tomato leaf segmentation [28] with an F1 score of 98%, and detecting defective regions in leaves [29]. Deep learning was used in [2] for leaf counting. Mask R-CNN was used in [10] for leaf segmentation with a misclassification error of 1% and classification against complex backgrounds. A convolutional neural network (CNN) was used to detect and recognize leaf diseases [13]. Leaf segmentation based on color indexes, including normalized difference index (NDI), excess green minus excess red index (ExGR), and color index of vegetation extraction (CIVE), was addressed in [30] with segmentation rates of 80.7%, 80.9%, and 81.4%, respectively.
Even though the previously described leaf segmentation techniques showed high performance, they used a visible (RGB) camera and thus worked only in the daytime. Furthermore, most of them were not real-time systems. Other cameras, such as infrared cameras, have also been employed. The NoIR camera, a standard camera without the infrared filter that therefore allows the infrared spectrum (around 880 nm) to reach the sensor, was used in [3,4,31,32]. A near-infrared (NIR) camera was used in [5] for leaf phenotyping. A stereo infrared camera working in the NIR spectrum (700 nm to 1400 nm) was used in [7] for leaf growth modeling. To the best of our knowledge, no previous works have used a thermal camera for leaf segmentation. The typical applications of a thermal camera in plant monitoring are plant canopy temperature measurement [33,34,35,36], plant water status monitoring [34,37], and leaf stomatal conductance measurement [14,38]. A technique related to leaf segmentation was proposed in [36], in which a visible camera is combined with a thermal camera. In that system, the visible camera was used for leaf segmentation in order to define the region for temperature measurement by the thermal camera.
The thresholding technique is a simple, effective method for leaf segmentation, thus it is suitable for implementation on an embedded device for real-time application. However, the algorithm is sensitive to lighting changes. Therefore, the segmentation performance under different lighting should be adequately evaluated.
In this paper, we present the results of an experiment on techniques for Vetiveria zizanioides leaf segmentation using three camera sensors: a visible camera (standard RGB camera), a non-infrared filter (NoIR) camera, and a thermal camera. We evaluated the performance of several popular image segmentation techniques implemented with the three cameras. The objective was to find the best solution for a low-cost embedded camera system for real-time implementation of leaf monitoring.
The main contributions of this study are as follows:
  • The image acquisition system employs low-cost camera systems suitable for real-time implementation.
  • The evaluated image segmentation techniques are fast-computation algorithms suitable for implementation on the embedded device.
  • The leaf segmentation performance of camera types and segmentation techniques was compared in an outdoor environment during the day and at night.
  • This is the first work to investigate the feasibility of using a thermal camera for leaf segmentation.
  • The image dataset used for evaluation comprised real images captured from the natural outdoor environment.
The rest of the paper is organized as follows: Section 2 presents the materials and methods. Section 3 presents the results and discussion. Section 4 concludes the paper.

2. Materials and Methods

2.1. Image Data Collection

The image datasets were prepared using images captured by the multi-camera system in the outdoor environment. The multi-camera system contains three camera sensors: visible, NoIR, and thermal, as shown in Figure 1. Each camera is connected to a Raspberry Pi 3 Model B+ for image processing and data storage. The visible camera, on the right side of the figure, uses a Sony IMX219 camera sensor with an image resolution of 8 megapixels. The NoIR camera system, on the left side, consists of a camera module with a 5-megapixel Omnivision 5647 sensor (without an infrared filter) and a pair of infrared LEDs. The thermal camera, in the center, is a Seek Thermal CompactPro with a resolution of 320 × 240 pixels; it is connected to the Raspberry Pi module via a USB interface and mounted above the visible and NoIR cameras. The image data were also sent to Google Drive every 5 min for easy data storage and access. The images were captured continuously for a whole day (day and night). The proposed camera system provides a low-cost solution: the prices of the Raspberry Pi module and the visible, NoIR, and thermal cameras are about USD 45, USD 7, USD 28, and USD 384, respectively.
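As an illustration of the acquisition loop, the following is a minimal sketch of periodic capture with a Raspberry Pi camera module using the picamera library. The 5-min interval follows the description above; the file naming scheme is a hypothetical placeholder, and the Seek Thermal CompactPro (which needs its own USB interface library) and the Google Drive upload are not shown.

```python
import time
from picamera import PiCamera

# Minimal periodic capture for the Raspberry Pi camera modules (visible/NoIR).
# The Seek Thermal CompactPro and the upload to Google Drive are not shown here.
camera = PiCamera()
camera.resolution = (3280, 2464)  # native resolution of the 8 MP Sony IMX219

while True:
    filename = time.strftime("visible_%Y-%m-%d_%H%M.jpg")  # hypothetical naming scheme
    camera.capture(filename)
    time.sleep(300)  # capture every 5 min, as described above
```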
The plant used for image data collection was Vetiveria zizanioides, planted in polybags and placed in an outdoor environment (a yard) during image collection. Photographs of the plants and the environment taken by the visible, NoIR, and thermal cameras are shown in Figure 2, Figure 3 and Figure 4, respectively. Due to the arrangement of the three cameras (as shown in Figure 1), the image viewpoint of each camera is different. Therefore, to make a fair comparison, we employed image warping to transform the images taken by the NoIR and thermal cameras to match the reference image taken by the visible camera. In addition, to balance the cameras' resolution and the computation time of the image processor, all images were resized to 600 × 400 pixels. It is noted that even though the resized thermal images are larger than the originals, this does not affect the performance significantly, for the following reason. A thermal camera captures images based on the thermal energy emitted by objects; thus, parts of objects appear in uniform color (as shown in Figure 4) rather than in the precise color captured by the visible camera. Therefore, resizing an image to a higher resolution does not change the detail of objects significantly. Figure 3a,b depict the original and transformed NoIR images, respectively. Figure 4a,b depict the original and transformed thermal images, respectively. The figures show that the viewpoints of the NoIR and thermal images are similar to that of the visible image, in the sense that the objects and their orientation in the images are almost the same. For instance, the leaves, polybags, and background objects appear in nearly the same position or area in Figure 2, Figure 3b and Figure 4b. Since the objects' appearances in the observed images (visible, NoIR, and thermal) are almost the same, we may evaluate the segmentation performance fairly.
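The viewpoint alignment and resizing described above can be sketched with OpenCV as follows. This is a minimal sketch assuming a perspective (homography) warp estimated from four manually picked corresponding points; the point coordinates and file names are hypothetical placeholders, not values from the experiment.

```python
import cv2
import numpy as np

# Four corresponding points picked manually in the thermal image and in the
# resized visible (reference) image; coordinates and file names are hypothetical.
src_pts = np.float32([[20, 15], [300, 10], [310, 230], [15, 225]])   # thermal image
dst_pts = np.float32([[50, 40], [560, 35], [570, 380], [45, 375]])   # visible image

visible = cv2.resize(cv2.imread("visible_1300.png"), (600, 400))     # common 600x400 size
thermal = cv2.imread("thermal_1300.png")

H = cv2.getPerspectiveTransform(src_pts, dst_pts)      # homography from the 4 point pairs
aligned = cv2.warpPerspective(thermal, H, (600, 400))  # thermal warped into the visible view
cv2.imwrite("thermal_1300_aligned.png", aligned)
```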

2.2. Image Segmentation Techniques

2.2.1. Thresholding Technique

Thresholding is a simple technique for separating objects from the background by introducing a threshold. Pixels are assigned to the foreground if their intensity is lower than the threshold; otherwise, they are set as background. There are about 40 thresholding algorithms, as investigated in [39]. In this work, we evaluated nine thresholding algorithms: Otsu, Multi-Otsu, Yen, Isodata, Li, Local, Minimum, Mean, and Triangle. The algorithms are briefly described in the following.
The cumulative distribution function (F) is defined as:
F(g) = \sum_{i=0}^{g} p(i),   (1)
where p(g) is the probability mass function, and g is the intensity value in the image (g = 0, ..., 255). The mean of foreground and background can be expressed as a function of threshold level T as:
m_f(T) = \sum_{g=0}^{T} g \, p(g),   (2)
m_b(T) = \sum_{g=T+1}^{255} g \, p(g).   (3)
The variance of foreground and background can be expressed as a function of threshold level T:
\sigma_f^2(T) = \sum_{g=0}^{T} \left( g - m_f(T) \right)^2 p(g),   (4)
\sigma_b^2(T) = \sum_{g=T+1}^{255} \left( g - m_b(T) \right)^2 p(g).   (5)
Otsu thresholding [40] finds an optimal threshold Topt by maximizing the between-class variance of foreground and background, and is given as [39]:
T_{opt} = \arg\max_{T} \left\{ \frac{F(T)\left(1 - F(T)\right)\left(m_f(T) - m_b(T)\right)^2}{F(T)\,\sigma_f^2(T) + \left(1 - F(T)\right)\sigma_b^2(T)} \right\}.   (6)
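To make the criterion in Equation (6) concrete, the following is a minimal NumPy sketch that evaluates it exhaustively over all candidate thresholds, using the quantities defined in Equations (1)–(5). It is an illustrative implementation of the formulas as written above, not the production code used in the experiments.

```python
import numpy as np

def otsu_threshold(gray):
    """Exhaustive search of the criterion in Equation (6)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()                     # probability mass function p(g)
    g = np.arange(256)
    best_t, best_score = 0, -np.inf
    for t in range(1, 255):
        F = p[:t + 1].sum()                   # cumulative distribution F(T), Eq. (1)
        if F <= 0.0 or F >= 1.0:
            continue
        m_f = (g[:t + 1] * p[:t + 1]).sum()   # Eq. (2)
        m_b = (g[t + 1:] * p[t + 1:]).sum()   # Eq. (3)
        s_f = (((g[:t + 1] - m_f) ** 2) * p[:t + 1]).sum()   # Eq. (4)
        s_b = (((g[t + 1:] - m_b) ** 2) * p[t + 1:]).sum()   # Eq. (5)
        denom = F * s_f + (1.0 - F) * s_b
        if denom == 0.0:
            continue
        score = F * (1.0 - F) * (m_f - m_b) ** 2 / denom      # Eq. (6)
        if score > best_score:
            best_score, best_t = score, t
    return best_t
```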
Multi-level Otsu (Multi-Otsu) thresholding [41,42] is an extension of Otsu thresholding, where M-1 optimal thresholds (Topt_1, …, Topt_M-1) are calculated for multi-class (M classes) separation.
Yen thresholding [43] finds an optimal threshold Topt by maximizing the entropic correlation, and is given as [39]:
T_{opt} = \arg\max_{T} \left\{ C_b(T) + C_f(T) \right\},   (7)
where
C_b(T) = -\log \left\{ \sum_{g=0}^{T} \left( \frac{p(g)}{F(T)} \right)^2 \right\},   (8)
C_f(T) = -\log \left\{ \sum_{g=T+1}^{255} \left( \frac{p(g)}{1 - F(T)} \right)^2 \right\}.   (9)
Isodata thresholding [44] finds an optimal threshold Topt, which is defined as [39]:
T_{opt} = \lim_{n \to \infty} \frac{m_f(T_n) + m_b(T_n)}{2}.   (10)
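A minimal sketch of the Isodata iteration in Equation (10) is shown below. It uses the conventional class-mean form of m_f and m_b (the mean intensity of the pixels on each side of the current threshold) and stops when the threshold no longer changes; the tolerance value is an assumption.

```python
import numpy as np

def isodata_threshold(gray, tol=0.5):
    """Iterate T_{n+1} = (m_f(T_n) + m_b(T_n)) / 2 until convergence (Equation (10))."""
    t = float(gray.mean())                 # initial guess: global mean intensity
    while True:
        fg = gray[gray <= t]               # pixels at or below the current threshold
        bg = gray[gray > t]                # pixels above the current threshold
        if fg.size == 0 or bg.size == 0:
            return t
        t_new = 0.5 * (fg.mean() + bg.mean())
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
```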
Li thresholding [45,46] finds an optimal threshold Topt, by minimizing the information theoretic distance, which is given as [39]:
T_{opt} = \arg\min_{T} \left[ \sum_{g=0}^{T} g \, p(g) \log \frac{g}{m_f(T)} + \sum_{g=T+1}^{255} g \, p(g) \log \frac{g}{m_b(T)} \right],   (11)
where
\sum_{g=0}^{T} g = \sum_{g=0}^{T} m_f(T) \quad \mathrm{and} \quad \sum_{g=T+1}^{255} g = \sum_{g=T+1}^{255} m_b(T).   (12)
In Local thresholding [47], an optimal threshold is calculated for each pixel by considering the mean value of its neighbors. The optimal threshold at pixel (x,y), T_{opt}(x,y), is defined as:
T_{opt}(x, y) = m_{w \times w}(x, y) - C,   (13)
where m_{w \times w}(x, y) is the local mean value over a w × w window centered at pixel (x,y), and C is a constant.
Minimum thresholding [48] finds an optimal threshold T_{opt} as the valley between the two maxima of the histogram, i.e., T_{opt} is chosen such that:
y_{T_{opt}-1} > y_{T_{opt}} \quad \mathrm{AND} \quad y_{T_{opt}} \le y_{T_{opt}+1},   (14)
where y_g is the number of pixels with intensity value g.
Mean thresholding [48] finds an optimal threshold Topt as the mean of the intensity value, which is calculated as the integer part of the following:
\frac{\sum_{g=0}^{255} g \, y_g}{\sum_{g=0}^{255} y_g}.   (15)
Triangle thresholding [49] finds an optimal threshold Topt in the histogram based on the Triangle method, as illustrated in Figure 5. The algorithm locates a point A by maximizing the distance between the triangle line and the histogram, then the optimal threshold Topt is defined by adding a fixed offset Ofs to A.
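All nine thresholding methods evaluated in this work are available in the scikit-image filters module, which is what the sketch below relies on. The choice of which side of the threshold is treated as leaf foreground, the Local thresholding window size and offset, and the use of the lowest Multi-Otsu threshold are assumptions for illustration, not the exact settings of the experiments.

```python
import cv2
from skimage import filters

def threshold_masks(gray):
    """Apply the evaluated global and local thresholding methods to a grayscale image."""
    masks = {}
    global_methods = {
        "Otsu": filters.threshold_otsu,
        "Yen": filters.threshold_yen,
        "Isodata": filters.threshold_isodata,
        "Li": filters.threshold_li,
        "Minimum": filters.threshold_minimum,
        "Mean": filters.threshold_mean,
        "Triangle": filters.threshold_triangle,
    }
    for name, fn in global_methods.items():
        masks[name] = gray <= fn(gray)          # assumption: foreground below the threshold
    # Multi-Otsu with three classes; assume the darkest class is the leaf.
    t_low, _ = filters.threshold_multiotsu(gray, classes=3)
    masks["Multi-Otsu"] = gray <= t_low
    # Local thresholding; window size and offset are illustrative values.
    masks["Local"] = gray <= filters.threshold_local(gray, block_size=35, offset=10)
    return masks

# Example usage on one resized 600x400 image (file name is hypothetical)
gray = cv2.imread("visible_1300.png", cv2.IMREAD_GRAYSCALE)
masks = threshold_masks(gray)
```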

2.2.2. K-Means Segmentation

The K-Means algorithm [50] is a machine learning technique used to cluster a dataset into K classes. The algorithm divides the dataset so that the distance between classes is maximized and the distance within each class is minimized [28]. The algorithm proceeds as follows (a minimal implementation sketch is given after the list):
  1. Set the number of classes (=K).
  2. Initialize K cluster centers randomly from the dataset.
  3. Calculate the distance of each pixel to each cluster center using a distance function.
  4. Assign each pixel to the nearest cluster, i.e., the one with the closest center.
  5. Recalculate each cluster center using the pixels belonging to that cluster.
  6. Repeat Steps 3 to 5 until the cluster centers no longer change.
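The sketch below clusters pixel intensities with OpenCV's built-in K-Means, following the steps above. Treating the darkest cluster as the leaf foreground is an assumption for illustration; the termination criteria and number of attempts are likewise illustrative.

```python
import cv2
import numpy as np

def kmeans_leaf_mask(gray, K=3):
    """Cluster pixel intensities into K groups and return a leaf/background mask."""
    data = gray.reshape(-1, 1).astype(np.float32)
    # Stop after 100 iterations or when centers move less than 0.2 (illustrative values).
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2)
    _, labels, centers = cv2.kmeans(data, K, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
    labels = labels.reshape(gray.shape)
    leaf_cluster = int(np.argmin(centers))  # assumption: darkest cluster = leaf
    return labels == leaf_cluster
```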

2.3. Performance Measurement of Leaf Segmentation

Leaf segmentation performance was measured using the standard metrics: Recall, Precision, and F1 score. The metrics are defined using true positives (TP), false negatives (FN), and false positives (FP). TP indicates that a leaf pixel was correctly extracted as leaf. FN indicates that a leaf pixel was not extracted. FP indicates that a non-leaf pixel was wrongly extracted as leaf. Recall, Precision, and F1 score are defined as follows:
\mathrm{Recall} = \frac{TP}{TP + FN},   (16)
\mathrm{Precision} = \frac{TP}{TP + FP},   (17)
\mathrm{F1\ score} = \frac{2 \times \mathrm{Recall} \times \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}}.   (18)
The summation of TP and FN in (16) represents the ground-truth leaf pixels. Thus, Recall represents the portion of ground-truth leaf pixels present in the extracted leaf pixels of the segmented image [1]. The summation of TP and FP in (17) represents the extracted leaf pixels. Thus, Precision represents the portion of extracted leaf pixels in the segmented image that match the ground-truth leaf pixels [1]. The F1 score combines Recall and Precision and represents their harmonic mean.
In addition to TP, FN, and FP, the true negatives (TN) were also computed to build the confusion table composed of TP, FN, FP, and TN. TN indicates that a non-leaf pixel was correctly identified as non-leaf.
According to (16)–(18), the F1 score does not consider TN, unlike Matthews's correlation coefficient or Cohen's kappa. However, the F1 score was selected for measuring performance rather than Matthews's correlation coefficient or Cohen's kappa because our objective is to extract the leaf (foreground) from the image; thus, we emphasize TP more than TN. Furthermore, the F1 score is commonly used for evaluating leaf segmentation performance, as described in Section 1.
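A minimal sketch of the pixel-wise metric computation in Equations (16)–(18), together with the normalized confusion values reported later in Tables 1–7, is given below; it assumes Boolean masks for the predicted segmentation and the ground truth.

```python
import numpy as np

def leaf_metrics(pred, gt):
    """Pixel-wise Recall, Precision, F1 (Equations (16)-(18)) and normalized confusion values."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.count_nonzero(pred & gt)
    fp = np.count_nonzero(pred & ~gt)
    fn = np.count_nonzero(~pred & gt)
    tn = np.count_nonzero(~pred & ~gt)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * recall * precision / (recall + precision) if (recall + precision) else 0.0
    total = tp + fp + fn + tn
    normalized = {"TP": tp / total, "FN": fn / total, "FP": fp / total, "TN": tn / total}
    return recall, precision, f1, normalized
```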

2.4. Image Dataset

The image dataset, collected as described in Section 2.1, consisted of images captured by visible, NoIR, and thermal cameras. There was a total of 672 images containing:
  • 133 images taken by the visible camera during the day (06:00–17:00 h);
  • 275 images taken by the NoIR camera during the day and at night (06:00–05:00 h (next day)); and
  • 264 images taken by the thermal camera during the day and at night (06:00–05:00 h (next day)).
It is noted here that all three cameras captured the same scene at the same time interval, where each scene contains four or five Vetiveria zizanioides plants. Data loss during transmission from the on-site camera modules to the cloud storage accounts for the difference in the number of images captured by the NoIR and thermal cameras. This loss stems from several problems, such as image acquisition errors in the camera modules, internet connection issues, and errors while accessing the Google Drive cloud storage. Since the visible camera only captured images during the day, the number of images it captured is about half that of the NoIR and thermal cameras.
Since the image data are time-stamped in intervals of 5 min, the image intensity during a whole day can be plotted, as shown in Figure 6. The blue, red, and green lines represent the intensity of visible, NoIR, and thermal images, respectively. It is clearly shown in the figure that at night, the intensity of the visible image is zero (no objects captured), and the image intensity varies during the day due to variations in sunlight (for visible and NoIR images) and temperature (for thermal images).
Since infrared LEDs are the light source for the NoIR camera, the camera can capture objects at night. However, due to the low power of the LEDs, only objects near the camera are captured. Therefore, the image intensity is low, as shown in Figure 6.
Figure 6 also shows that there is a significant difference in image intensity between daytime and nighttime images with the NoIR camera. This can be observed in Figure 7, which shows NoIR images at 13:00 h and 19:00 h. In the thermal image, the difference is insignificant, indicating that the temperature did not drop abruptly at night. This situation can be explained by observing Figure 8, which shows thermal images at 13:00 h and 19:00 h. In the images, brighter colors (orange and white) represent higher temperatures than dark colors (blue and purple). As shown in Figure 8, in the night image, some background parts are orange, indicating high temperature. This condition produces images with medium intensity as shown in Figure 6.
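The intensity curves in Figure 6 can be reproduced by averaging the gray level of each time-stamped image and plotting it against the capture time; a minimal sketch is given below. The file naming pattern is a hypothetical placeholder.

```python
import glob
import cv2
import matplotlib.pyplot as plt

# Mean gray level of each 5-min NoIR capture over one day (file pattern is hypothetical).
times, intensities = [], []
for path in sorted(glob.glob("noir/2022-06-01_*.png")):
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    hhmm = path.rsplit("_", 1)[-1][:4]                 # e.g. "1300" from the file name
    times.append(int(hhmm[:2]) + int(hhmm[2:4]) / 60.0)
    intensities.append(float(gray.mean()))

plt.plot(times, intensities, color="red", label="NoIR")
plt.xlabel("Time of day (h)")
plt.ylabel("Mean image intensity")
plt.legend()
plt.show()
```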

2.5. Ground-Truth Images

Ground-truth images were prepared manually by labeling the leaves using image editor software (Microsoft Paint). The ground truth for the visible and NoIR cameras was prepared from the visible and NoIR images, respectively. However, due to the different characteristics of a thermal camera, its ground truth is difficult to prepare from the thermal images; instead, we used the ground-truth images prepared from the visible images. This arrangement aligns with our objective of evaluating the feasibility of a thermal camera for leaf segmentation rather than leaf temperature measurement.
Examples of ground-truth images for visible and NoIR cameras are shown in Figure 9a,b, respectively. In the figures, green represents the leaf object, while other colors represent the non-leaf objects and background.

2.6. Method of Leaf Segmentation Performance

The segmentation algorithms were implemented on a Raspberry Pi 3 Model B+ powered by a Broadcom BCM2837B0 Cortex-A53 64-bit SoC @ 1.4 GHz, running the Raspberry Pi OS. The algorithms were written in Python using the OpenCV and scikit-image libraries.
The comparison of leaf segmentation performance is divided into five parts:
  • Comparison of camera types and segmentation algorithms.
  • Comparison of camera types only (performance measurement of algorithms is averaged or maximized).
  • Comparison of segmentation algorithms only (performance measurement of camera types is averaged or maximized).
  • Comparison of execution time.
  • Comparison of segmentation algorithms using a well-known public image dataset.

3. Results and Discussion

3.1. Segmentation Results

Examples of leaf segmentation results for visible, NoIR, and thermal images taken during the day are illustrated in Figure 10, Figure 11 and Figure 12, respectively. In each figure, panel (a) shows the original tested image, and panels (b–l) show the leaf segmentation results using Isodata, Li, Local, Mean, Minimum, Otsu, Multi-Otsu, Triangle, Yen, K-Means (K = 2), and K-Means (K = 3), respectively. Recall (Rec), Precision (Prec), and F1 score (F1) are shown at the bottom of each image. White and black in the figures represent the foreground (extracted leaf pixels) and background, respectively.
Figure 10 shows that, for the visible image depicted in Figure 10a, most of the algorithms achieved high Recall, except those in Figure 10d,h,l, which achieved lower Recall. Higher Recall is indicated by many white pixels within the leaf object. The lower Recall in Figure 10d was caused by Local thresholding using a local window to segment the image, resulting in some parts of the leaves not being extracted. Multi-Otsu thresholding (Figure 10h) and the K-Means algorithm with K = 3 (Figure 10l) use three classes in the segmentation, and they fail to extract some parts of the leaves when the color intensity is not uniform; thus, they achieved lower Recall.
Figure 10 also shows that most of the algorithms achieved medium Precision. This can be observed from the white pixels on non-leaf objects, which indicate false positives. As false positives increase, Precision decreases. The imbalance between Recall and Precision yields a medium F1 score for most algorithms. It is noted that Local thresholding had the worst F1 score.
Observing Figure 10, it is found that, due to the nature of the backgrounds, the wall (upper left of the image) and the concrete floor (lower right of the image and the white gap below the black fence) produce false positives with most algorithms, except Multi-Otsu and K-Means (K = 3). This suggests that bi-level segmentation techniques cannot separate these particular backgrounds from the foreground (leaf), whereas multi-level segmentation techniques such as Multi-Otsu and K-Means (K = 3) can classify them as background. However, the multi-level segmentation techniques tend to assign some parts of the leaves to the background, as shown in Figure 10h,l, thus producing lower Recall. From this discussion, it is clear that our image dataset poses a challenge for achieving a high F1 score. This challenge serves our aim of evaluating the thresholding techniques and camera types for a practical real-time leaf segmentation application. Similar results were obtained for the NoIR images shown in Figure 11. However, the performance on NoIR images was lower than on visible images, as can be expected from the higher color contrast of the visible image in Figure 10a compared with the NoIR image in Figure 11a.
The thermal images in Figure 12 appear different from the visible and NoIR images. Due to the thermal camera's resolution and characteristics, the extracted leaves are condensed rather than dispersed as in Figure 10 and Figure 11. Furthermore, the edges and thin parts of the leaves are not extracted. Therefore, the Recall, Precision, and F1 scores are lower compared with those of the visible and NoIR images. Figure 12 also shows an interesting phenomenon: the plants' polybags contribute to false positives, in contrast to the visible and NoIR cameras. It is also noted in Figure 12i that Triangle thresholding failed to extract the leaves.

3.2. Comparison of Leaf Segmentation Performance

3.2.1. Comparison of Camera Types and Segmentation Algorithms

Figure 13 shows the segmentation performance of the visible camera during the day. The figure shows that the segmentation performance (Recall, Precision, and F1 score) of the 11 algorithms with the visible camera varied. The highest Recall of 0.934, the highest Precision of 0.751, and the highest F1 score of 0.794 were achieved by Triangle thresholding, the K-Means algorithm (K = 3), and Multi-Otsu thresholding, respectively. Local thresholding, as expected from the example image discussed in Section 3.1, achieved the lowest performance.
Table 1 shows the confusion tables of the segmentation results using the visible camera, where TP, FN, FP, and TN for the 11 algorithms are expressed in normalized form. The normalization is calculated by dividing the numbers of TP, FN, FP, and TN by the total number of samples. The table shows that Triangle thresholding achieved the highest TP and lowest FN, which conforms with its highest Recall described previously. The Multi-Otsu and K-Means (K = 3) algorithms achieved the lowest FP, which resulted in their higher Precision, as explained previously. The table also shows that the highest TN was achieved by Multi-Otsu and K-Means (K = 3), suggesting that, for the given test images, non-leaf objects can be separated properly by multi-class segmentation.
Figure 14 and Table 2 show the segmentation performance and the normalized confusion tables of the segmentation results using the NoIR camera during the day, respectively. Figure 14 shows that most algorithms performed similarly with the NoIR camera (daytime); in particular, the Precision values were very close, except for Yen thresholding. Similar to the visible camera, Triangle thresholding achieved the highest Recall. It is notable that Multi-Otsu thresholding and K-Means (K = 3) achieved the highest F1 scores. This suggests that, by employing three classes/clusters, both methods can balance Recall and Precision, thus increasing the Precision of the NoIR camera (daytime). These results are confirmed by Table 2, where the normalized confusion tables show that Triangle thresholding achieved the highest TP and lowest FN, and Multi-Otsu and K-Means (K = 3) achieved the highest TN and lowest FP, similarly to Table 1.
Figure 15 and Table 3 show the segmentation performance and the normalized confusion tables of the segmentation results using the NoIR camera at night, respectively. Figure 15 shows that the performance of the algorithms with the NoIR camera (nighttime) is low. As can be seen in Figure 7b, the leaf objects only partially appear due to the low power of the infrared LEDs used in this experiment. This condition produced a lower TP for most algorithms, as shown in the normalized confusion tables in Table 3. However, the approach of using a NoIR camera for nighttime measurement is promising due to the low cost and small size of the infrared LEDs embedded in the camera module. Furthermore, the infrared illumination is not visible; thus, it does not disrupt the natural conditions of the plant environment at night.
It can be noted from Figure 15 that K-Means (K = 2) achieved high Recall but low Precision and F1 score. From the experimental results, the high Recall was caused by incorrect segmentation that produced a large white (foreground) region, thus yielding very high Recall and very low Precision. This result suggests that both Recall and Precision should be considered when evaluating segmentation performance.
Figure 16 and Table 4 show the segmentation performance and the normalized confusion tables of the segmentation results using the NoIR camera over a whole day, respectively. Figure 16 shows that, similar to the visible camera, the performance of the 11 algorithms with the NoIR camera (daytime and nighttime) also varied. The figure shows that Multi-Otsu and K-Means (K = 3) achieved the highest Precision; however, the values are lower than those of the visible camera. Table 4 shows that the TP values are lower and the FN values are higher compared with those in Table 1 (visible camera). These results are affected by the nighttime condition, where, due to the low power of the infrared LEDs, the leaf objects only partially appear in the images, as discussed previously.
Figure 17, Figure 18 and Figure 19 and Table 5, Table 6 and Table 7 show the segmentation performances and normalized confusion tables of the segmentation results using the thermal camera during the day, at night, and over a whole day, respectively. As shown in the figures, a significant difference between Recall and Precision was found for most algorithms. As seen in Figure 8 and the discussion in Section 3.1, thermal images tend to yield condensed segmented regions; thus, they achieve high Recall and low or medium Precision. The figures also show that Minimum and Triangle thresholding almost fail on nighttime images (Figure 18 and Table 6) and daytime images (Figure 17 and Table 5), respectively.

3.2.2. Comparison of Camera Types

To find the effectiveness of camera types in segmentation, we compared the performance of each camera based on the average and maximum values of the metrics of all segmentation algorithms. The comparisons of average and maximum Recall, Precision, and F1 score for each camera type are given in Table 8, Table 9 and Table 10, respectively. A comparison of daytime, nighttime, and the whole day was also conducted, and the results are given in the tables. For straightforward reading, the highest values in the tables are given in gray.
Table 8 shows that the visible camera achieved the best performance for average and maximum Recall on daytime images, with values of 0.794 and 0.934, respectively. The thermal camera achieved the best performance for average and maximum Recall on nighttime (whole day) images, with values of 0.831 (0.740) and 0.990 (0.926), respectively.
Table 9 shows that the visible camera achieved the best performance for average and maximum Precision on daytime images, with values of 0.663 and 0.751, respectively. The NoIR camera achieved the best performance for average and maximum Precision on whole day images, with the values of 0.443 and 0.575, respectively, and maximum Precision of 0.572 on nighttime images. The thermal camera achieved the best performance for average Precision on nighttime images with a value of 0.418.
Table 10 shows that the visible camera achieved the best performance for average and maximum F1 score on daytime images, with values of 0.584 and 0.794, respectively. The NoIR camera achieved the best performance for average and maximum F1 score on nighttime (whole day) images, with values of 0.333 (0.396) and 0.636 (0.599), respectively.
From Table 8, Table 9 and Table 10, we can conclude that the visible camera is the best choice for daytime images. When considering Recall only, the thermal camera is the best choice for nighttime and whole day images. When considering F1 score, the NoIR camera is the best choice for nighttime and whole day images.

3.2.3. Comparison of Segmentation Algorithms

To find the effectiveness of the segmentation algorithm, we compared the performance of each algorithm based on the average and maximum values of the metrics for the three cameras. The comparisons of average and maximum Recall, Precision, and F1 score for each algorithm are given in Table 11, Table 12 and Table 13, respectively.
Table 11 shows that on daytime images, the best performance for average Recall was achieved by Mean thresholding, with a value of 0.820, and for maximum Recall Triangle thresholding performed best, with a value of 0.934. K-Means (K = 3) achieved the best performance on nighttime and whole day images, with average Recall values of 0.963 and 0.876, respectively. The best performance on nighttime images for maximum Recall was achieved by Isodata and Otsu thresholding, with a value of 0.990. Yen thresholding achieved the best performance for maximum Recall on whole day images with a value of 0.926.
Table 12 shows that K-Means (K = 3) performed best in Precision for all conditions. The average Precision values for daytime, nighttime, and whole day images are 0.603, 0.539, and 0.534, respectively. The maximum Precisions for daytime, nighttime, and whole day images are 0.751, 0.572, and 0.575, respectively.
Table 13 shows that K-Means (K = 3) performed best on the F1 score for most conditions. The average F1 score values for daytime, nighttime, and whole day images are 0.585, 0.509, and 0.496, respectively. The maximum F1 score values for nighttime and whole day images are 0.636 and 0.599, respectively. The maximum F1 score of 0.794 for daytime images was achieved by Multi-Otsu.
From Table 11, Table 12 and Table 13, we can conclude that K-Means (K = 3) is the best choice when considering F1 score and Precision. In the case of Recall, there is no single algorithm that outperforms under all conditions.

3.2.4. Comparison of the Execution Time

The execution time of 11 segmentation algorithms is given in Table 14. Most of the thresholding algorithms, except Local thresholding, have a fast execution time of about 0.2 s. Since Local thresholding computes the thresholding throughout the local windows, the computation time is longer. The execution time of K-Means, an iterative approach, is longer than that of the thresholding approach; it took 2.463 s for two clusters (K = 2) and 3.483 s for three clusters (K = 3). With regard to its application to plant monitoring, the execution time is still acceptable for real-time implementation.
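The timings in Table 14 can be reproduced with a simple wall-clock benchmark like the sketch below, which times one thresholding method against K-Means (K = 3) on a single 600 × 400 image; the file name and the number of repetitions are placeholders.

```python
import time
import cv2
import numpy as np
from skimage import filters

gray = cv2.imread("visible_1300.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name

def bench(fn, repeats=10):
    """Average wall-clock time of fn() over several runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

data = gray.reshape(-1, 1).astype(np.float32)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2)

t_otsu = bench(lambda: filters.threshold_otsu(gray))
t_kmeans = bench(lambda: cv2.kmeans(data, 3, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS))
print(f"Otsu: {t_otsu:.3f} s, K-Means (K = 3): {t_kmeans:.3f} s")
```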

3.2.5. Comparison of Segmentation Algorithms Using Well-Known Image Dataset

To verify the performance of the segmentation techniques (thresholding and K-Means) used in this work for leaf segmentation, we evaluated the 11 segmentation algorithms using a well-known image dataset, namely Ara2012 [16,17,51], consisting of 120 Arabidopsis thaliana plant images. Examples of the tested images are shown in Figure 20.
Figure 21 and Table 15 show the segmentation performance and the normalized confusion tables of the segmentation results using the Ara2012 dataset. Most algorithms achieved high Recall, Precision, and F1 scores, higher than those obtained on the image dataset used in our experiments. These results show that our image dataset is more complicated than Ara2012. The normalized confusion tables in Table 15 show that FP is low and FN is very low, resulting in higher Recall and Precision. The TP and TN values of most algorithms were almost the same, with high values. Similar to the results in Section 3.2.1, Local thresholding showed the worst performance.
Figure 21 shows that K-Means (K = 3) achieved the highest F1 score of 0.917, with Recall and Precision values of 0.952 and 0.931, respectively. The second highest F1 score of 0.813 was achieved by Multi-Otsu, with Recall and Precision values of 0.894 and 0.845, respectively. Referring to Section 3.2.1 and Section 3.2.3, K-Means (K = 3) and Multi-Otsu also showed the best performance on our dataset, even though it is lower than on the Ara2012 dataset. This result implies that leaf segmentation using multiple clusters/classes provides a better solution for achieving high Recall, Precision, and F1 score. Unfortunately, due to the complicated backgrounds in our image dataset, this approach cannot achieve high performance there.
Furthermore, comparing our image dataset and the Ara2012 dataset reveals an interesting result. Comparing the F1 scores of the Otsu and Triangle thresholding techniques between the two datasets, Otsu thresholding performs better than Triangle thresholding on our dataset, whereas Triangle thresholding performs better than Otsu thresholding on the Ara2012 dataset. This result can be explained by the image histograms shown in Figure 22, where Figure 22a,b are the image histograms of our dataset and the Ara2012 dataset, respectively. It is clear from the figures that Triangle thresholding works better when the shape of the histogram is bi-modal, as shown in Figure 22b. The histograms also show evidence that our image dataset consists of images that are complicated for leaf segmentation, especially when considering bi-level segmentation.
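Histograms like those in Figure 22 can be generated with a few lines of code, which makes it easy to check the modality of a new dataset before choosing a bi-level thresholding method; the sketch below uses hypothetical file names.

```python
import cv2
import matplotlib.pyplot as plt

# Gray-level histograms of one sample image from each dataset (file names are hypothetical).
for path, label in [("our_dataset_sample.png", "Our dataset"),
                    ("ara2012_sample.png", "Ara2012")]:
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    plt.hist(gray.ravel(), bins=256, range=(0, 256), alpha=0.5, label=label)
plt.xlabel("Intensity")
plt.ylabel("Pixel count")
plt.legend()
plt.show()
```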

4. Conclusions

A comparative study of leaf segmentation methods for Vetiveria zizanioides was conducted. The objective was to find the best camera type and technique for leaf segmentation in an outdoor environment using an embedded camera system. Three segmentation metrics, Recall, Precision, and F1 score, were used to measure the segmentation performance. The results show that the visible camera performed best on daytime images, while the NoIR camera was suitable for nighttime images. The K-Means algorithm (K = 3) achieved the maximum Precision and F1 score in all conditions; meanwhile, no single segmentation algorithm showed superior Recall in all conditions.
Based on the results, a technique for combining camera types and segmentation techniques will be developed in future work for better performance of the leaf segmentation system.

Author Contributions

Conceptualization, A.S. and E.H.; methodology, A.S.; software, A.S.; validation, A.S. and E.H.; data curation, A.S. and E.H.; writing—original draft preparation, A.S.; writing—review and editing, A.S. and E.H.; project administration, E.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by a Research Grant from the Ministry of Education, Culture, Research, and Technology, Republic of Indonesia, Year 2022 (No.: SP DIPA-023.17.1.690523/2022).

Data Availability Statement

Not applicable.

Acknowledgments

The authors thank the MBKM team (Muhammad Suriansyah, M Rifki Abdilah, M Syahriel M, and Mohamad Khafil) for preparing the hardware sensors used for data collection.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Praveen Kumar, J.; Domnic, S. Image Based Leaf Segmentation and Counting in Rosette Plants. Inf. Process. Agric. 2019, 6, 233–246. [Google Scholar] [CrossRef]
  2. Buzzy, M.; Thesma, V.; Davoodi, M.; Velni, J.M. Real-Time Plant Leaf Counting Using Deep Object Detection Networks. Sensors 2020, 20, 6896. [Google Scholar] [CrossRef] [PubMed]
  3. Soetedjo, A.; Hendriarianti, E. Leaf Segmentation in Outdoor Environment Using A Low-Cost Infrared Camera. In Proceedings of the 2021 IEEE International Conference on Imaging Systems and Techniques (IST), New York, NY, USA (Virtual Conference), 24–26 August 2021. [Google Scholar] [CrossRef]
  4. Soetedjo, A.; Hendriarianti, E. Plant Leaf Detection and Counting in a Greenhouse during Day and Nighttime Using a Raspberry Pi NoIR Camera. Sensors 2021, 21, 6659. [Google Scholar] [CrossRef] [PubMed]
  5. Giuffrida, M.V.; Doerner, P.; Tsaftaris, S.A. Pheno-Deep Counter: A Unified and Versatile Deep Learning Architecture for Leaf Counting. Plant J. 2018, 96, 880–890. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Yeh, Y.H.; Lai, T.C.; Liu, T.Y.; Liu, C.C.; Chung, W.C.; Lin, T.T. An Automated Growth Measurement System for Leafy Vegetables. Biosyst. Eng. 2014, 117, 43–50. [Google Scholar] [CrossRef]
  7. Aksoy, E.E.; Abramov, A.; Wörgötter, F.; Scharr, H.; Fischbach, A.; Dellen, B. Modeling Leaf Growth of Rosette Plants Using Infrared Stereo Image Sequences. Comput. Electron. Agric. 2015, 110, 78–90. [Google Scholar] [CrossRef] [Green Version]
  8. Zhang, L.; Xu, Z.; Xu, D.; Ma, J.; Chen, Y.; Fu, Z. Growth Monitoring of Greenhouse Lettuce Based on a Convolutional Neural Network. Hortic. Res. 2020, 7, 124. [Google Scholar] [CrossRef]
  9. Hu, Y.; Wang, L.; Xiang, L.; Wu, Q.; Jiang, H. Automatic Non-Destructive Growth Measurement of Leafy Vegetables Based on Kinect. Sensors 2018, 18, 806. [Google Scholar] [CrossRef] [Green Version]
  10. Yang, K.; Zhong, W.; Li, F. Leaf Segmentation and Classification with a Complicated Background Using Deep Learning. Agronomy 2020, 10, 1721. [Google Scholar] [CrossRef]
  11. Singh, V.; Misra, A.K. Detection of Plant Leaf Diseases Using Image Segmentation and Soft Computing Techniques. Inf. Process. Agric. 2017, 4, 41–49. [Google Scholar] [CrossRef] [Green Version]
  12. Gayathri Devi, T.; Neelamegam, P.; Srinivasan, A. Plant Leaf Disease Detection Using K Means Segmentation. Int. J. Pure Appl. Math. 2018, 119, 3477–3483. [Google Scholar]
  13. Militante, S.V.; Gerardo, B.D.; Dionisio, N.V. Plant Leaf Detection and Disease Recognition Using Deep Learning. In Proceedings of the 2019 IEEE Eurasia Conference on IOT, Communication and Engineering (ECICE), Yunlin, Taiwan, 3–6 October 2019; pp. 579–582. [Google Scholar] [CrossRef]
  14. Stoll, M.; Jones, H.G. Thermal Imaging as a Viable Tool for Monitoring Plant Stress. J. Int. Sci. Vigne Vin 2007, 41, 77–84. [Google Scholar] [CrossRef]
  15. Pineda, M.; Barón, M.; Pérez-Bueno, M.L. Thermal Imaging for Plant Stress Detection and Phenotyping. Remote Sens. 2021, 13, 68. [Google Scholar] [CrossRef]
  16. Scharr, H.; Minervini, M.; French, A.P.; Klukas, C.; Kramer, D.M.; Liu, X.; Luengo, I.; Pape, J.M.; Polder, G.; Vukadinovic, D.; et al. Leaf Segmentation in Plant Phenotyping: A Collation Study. Mach. Vis. Appl. 2016, 27, 585–606. [Google Scholar] [CrossRef] [Green Version]
  17. Minervini, M.; Fischbach, A.; Scharr, H.; Tsaftaris, S.A. Finely-Grained Annotated Datasets for Image-Based Plant Phenotyping. Pattern Recognit. Lett. 2016, 81, 80–89. [Google Scholar] [CrossRef] [Green Version]
  18. Li, L.; Zhang, Q.; Huang, D. A Review of Imaging Techniques for Plant Phenotyping. Sensors 2014, 14, 20078–20111. [Google Scholar] [CrossRef]
  19. Constantino, K.P.; Gonzales, E.J.; Lazaro, L.M.; Serrano, E.C.; Samson, B.P. Towards an Automated Plant Height Measurement and Tiller Segmentation of Rice Crops Using Image Processing. In Mechatronics and Machine Vision in Practice 3; Springer: Cham, Switzerland, 2018; pp. 155–168. [Google Scholar] [CrossRef]
  20. Pereira, C.S.; Morais, R.; Reis, M.J.C.S. Pixel-Based Leaf Segmentation from Natural Vineyard Images Using Color Model and Threshold Techniques. In Image Analysis and Recognition; Campilho, A., Karray, F., ter Haar Romeny, B., Eds.; Springer: Cham, Switzerland, 2018; Volume 10882, pp. 96–106. [Google Scholar] [CrossRef]
  21. Valle, B.; Simonneau, T.; Boulord, R.; Sourd, F.; Frisson, T.; Ryckewaert, M.; Hamard, P.; Brichet, N.; Dauzat, M.; Christophe, A. PYM: A New, Affordable, Image-Based Method Using a Raspberry Pi to Phenotype Plant Leaf Area in a Wide Diversity of Environments. Plant Methods 2017, 13, 98. [Google Scholar] [CrossRef]
  22. Lin, K.; Wu, J.H.; Chen, J.; Si, H. Measurement of Plant Leaf Area Based on Computer Vision. In Proceedings of the 2014 Sixth International Conference on Measuring Technology and Mechatronics Automation (ICMTMA 2014), Zhangjiajie, China, 10–11 January 2014; Volume 2, pp. 401–405. [Google Scholar] [CrossRef]
  23. Easlon, H.M.; Bloom, A.J. Easy Leaf Area: Automated Digital Image Analysis for Rapid and Accurate Measurement of Leaf Area. Appl. Plant Sci. 2014, 2, 1400033. [Google Scholar] [CrossRef]
  24. Niu, C.; Li, H.; Niu, Y.; Zhou, Z.; Bu, Y.; Niu, C.; Li, H.; Niu, Y.; Zhou, Z.; Bu, Y.; et al. Segmentation of Cotton Leaves Based on Improved Watershed Algorithm. In CCTA 2015: Computer and Computing Technologies in Agriculture IX; Li, D., Li, Z., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 425–436. [Google Scholar] [CrossRef] [Green Version]
  25. Ci, D.; Cui, S.; Liang, F. Research of Statistical Method for the Number of Leaves in Plant Growth Cabinet. MATEC Web Conf. 2015, 31, 5–8. [Google Scholar] [CrossRef]
  26. Minervini, M.; Giuffrida, M.V.; Tsaftaris, S. An Interactive Tool for Semi-Automated Leaf Annotation. In Proceedings of the Computer Vision Problems in Plant Phenotyping (CVPPP), Swansea, UK, 10 September 2015; pp. 6.1–6.13. [Google Scholar] [CrossRef] [Green Version]
  27. Hu, J.; Chen, Z.; Zhang, R.; Yang, M.; Zhang, S. Robust Random Walk for Leaf Segmentation. IET Image Process. 2020, 14, 1180–1186. [Google Scholar] [CrossRef]
  28. Tian, K.; Li, J.; Zeng, J.; Evans, A.; Zhang, L. Segmentation of Tomato Leaf Images Based on Adaptive Clustering Number of K-Means Algorithm. Comput. Electron. Agric. 2019, 165, 104962. [Google Scholar] [CrossRef]
  29. Divya, P.; Anusudha, K. Segmentation of Defected Regions in Leaves Using K-Means and OTSU's Method. In Proceedings of the 4th International Conference on Electrical Energy Systems (ICEES 2018), Chennai, India, 7–9 February 2018; pp. 111–115. [Google Scholar] [CrossRef]
  30. Hamuda, E.; Glavin, M.; Jones, E. A Survey of Image Processing Techniques for Plant Extraction and Segmentation in the Field. Comput. Electron. Agric. 2016, 125, 184–199. [Google Scholar] [CrossRef]
  31. Dash, J.; Verma, S.; Dasmunshi, S.; Nigam, S. Plant Health Monitoring System Using Raspberry Pi. Int. J. Pure Appl. Math. 2018, 119, 955–959. [Google Scholar]
  32. Tovar, J.C.; Hoyer, J.S.; Lin, A.; Tielking, A.; Callen, S.T.; Elizabeth Castillo, S.; Miller, M.; Tessman, M.; Fahlgren, N.; Carrington, J.C.; et al. Raspberry Pi–Powered Imaging for Plant Phenotyping. Appl. Plant Sci. 2018, 6, e1031. [Google Scholar] [CrossRef] [PubMed]
  33. Martínez, J.; Egea, G.; Agüera, J.; Pérez-Ruiz, M. A Cost-Effective Canopy Temperature Measurement System for Precision Agriculture: A Case Study on Sugar Beet. Precis. Agric. 2017, 18, 95–110. [Google Scholar] [CrossRef]
  34. Noguera, M.; Millán, B.; Pérez-Paredes, J.J.; Ponce, J.M.; Aquino, A.; Andújar, J.M. A New Low-Cost Device Based on Thermal Infrared Sensors for Olive Tree Canopy Temperature Measurement and Water Status Monitoring. Remote Sens. 2020, 12, 723. [Google Scholar] [CrossRef] [Green Version]
  35. Kokin, E.; Pennar, M.; Palge, V.; Jürjenson, K. Strawberry Leaf Surface Temperature Dynamics Measured by Thermal Camera in Night Frost Conditions. Agron. Res. 2018, 16, 122–133. [Google Scholar] [CrossRef]
  36. Giménez-Gallego, J.; González-Teruel, J.D.; Soto-Valles, F.; Jiménez-Buendía, M.; Navarro-Hellín, H.; Torres-Sánchez, R. Intelligent Thermal Image-Based Sensor for Affordable Measurement of Crop Canopy Temperature. Comput. Electron. Agric. 2021, 188, 106319. [Google Scholar] [CrossRef]
  37. Zhou, Z.; Diverres, G.; Kang, C.; Thapa, S.; Karkee, M.; Zhang, Q.; Keller, M. Ground-Based Thermal Imaging for Assessing Crop Water Status in Grapevines over a Growing Season. Agronomy 2022, 12, 322. [Google Scholar] [CrossRef]
  38. Iseki, K.; Olaleye, O. A New Indicator of Leaf Stomatal Conductance Based on Thermal Imaging for Field Grown Cowpea. Plant Prod. Sci. 2020, 23, 136–147. [Google Scholar] [CrossRef] [Green Version]
  39. Sezgin, M.; Sankur, B. Survey over Image Thresholding Techniques and Quantitative Performance Evaluation. J. Electron. Imaging 2004, 13, 146–165. [Google Scholar] [CrossRef]
  40. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man Cybern. 1979, 9, 62–66. [Google Scholar] [CrossRef] [Green Version]
  41. Liao, P.S.; Chen, T.S.; Chung, P.C. A Fast Algorithm for Multilevel Thresholding. J. Inf. Sci. Eng. 2001, 17, 713–727. [Google Scholar] [CrossRef]
  42. Huang, D.-Y.; Lin, T.-W.; Hu, W.-C. Automatic Multilevel Thresholding Based on Two-Stage Otsu’s Method with Cluster Determination by Valley Estimation. Int. J. Innov. Comput. Inf. Control 2011, 7, 5631–5644. [Google Scholar]
  43. Yen, J.C.; Chang, F.J.; Chang, S. A New Criterion for Automatic Multilevel Thresholding. IEEE Trans. Image Process. 1995, 4, 370–378. [Google Scholar] [CrossRef]
  44. Ridler, T.W.; Calvard, S. Picture Thresholding Using an Iterative Selection Method. IEEE Trans. Syst. Man Cybern. 1978, 8, 630–632. [Google Scholar]
  45. Li, C.H.; Lee, C.K. Minimum Cross Entropy Thresholding. Pattern Recognit. 1993, 26, 617–625. [Google Scholar] [CrossRef]
  46. Li, C.H.; Tam, P.K.S. An Iterative Algorithm for Minimum Cross Entropy Thresholding. Pattern Recognit. Lett. 1998, 19, 771–776. [Google Scholar] [CrossRef]
  47. Gonzalez, R.C.; Wood, R.E. Digital Image Processing, 2nd ed.; Prentice-Hall Inc.: Hoboken, NJ, USA, 2002. [Google Scholar]
  48. Glasbey, C.A. An Analysis of Histogram-Based Thresholding Algorithms. CVGIP Graph. Model. Image Process. 1993, 55, 532–537. [Google Scholar] [CrossRef]
  49. Zack, G.W.; Rogers, W.E.; Latt, S.A. Automatic Measurement of Sister Chromatid Exchange Frequency. J. Histochem. Cytochem. 1977, 25, 741–753. [Google Scholar] [CrossRef]
  50. Macqueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability: Weather Modification, Berkeley, CA, USA, 1 January 1967; University of California Press: Oakland, CA, USA, 1967; Volume 1, pp. 281–297. [Google Scholar]
  51. Minervini, M.; Fischbach, A.; Scharr, H.; Tsaftaris, S.A. Plant Phenotyping Datasets. Available online: http://www.plant-phenotyping.org/datasets (accessed on 23 September 2021).
Figure 1. Hardware arrangement of multi-camera system.
Figure 2. Sample image from visible camera.
Figure 3. Sample images from NoIR camera: (a) original image; (b) transformed image.
Figure 4. Sample images from thermal camera: (a) original image; (b) transformed image.
Figure 5. Illustration of Triangle thresholding method [49].
Figure 6. Intensity of visible, NoIR, and thermal images during a whole day.
Figure 7. NoIR images taken at: (a) 13:00 h and (b) 19:00 h.
Figure 8. Thermal images taken at: (a) 13:00 h and (b) 19:00 h.
Figure 9. Sample ground-truth images: (a) visible camera and (b) NoIR camera.
Figure 10. Image segmentation results of visible images: (a) original tested image; (b) Isodata; (c) Li; (d) Local; (e) Mean; (f) Minimum; (g) Otsu; (h) Multi-Otsu; (i) Triangle; (j) Yen; (k) K-Means (K = 2); (l) K-Means (K = 3).
Figure 11. Image segmentation results of NoIR images: (a) original tested image; (b) Isodata; (c) Li; (d) Local; (e) Mean; (f) Minimum; (g) Otsu; (h) Multi-Otsu; (i) Triangle; (j) Yen; (k) K-Means (K = 2); (l) K-Means (K = 3).
Figure 12. Image segmentation results of thermal images: (a) original tested image; (b) Isodata; (c) Li; (d) Local; (e) Mean; (f) Minimum; (g) Otsu; (h) Multi-Otsu; (i) Triangle; (j) Yen; (k) K-Means (K = 2); (l) K-Means (K = 3).
Figure 13. Segmentation performance of visible camera (daytime).
Figure 14. Segmentation performance of NoIR camera (daytime).
Figure 15. Segmentation performance of NoIR camera (nighttime).
Figure 16. Segmentation performance of NoIR camera (daytime and nighttime).
Figure 17. Segmentation performance of thermal camera (daytime).
Figure 18. Segmentation performance of thermal camera (nighttime).
Figure 19. Segmentation performance of thermal camera (daytime and nighttime).
Figure 20. Sample images of Arabidopsis thaliana plants [16,17,51].
Figure 21. Segmentation performance of Arabidopsis thaliana plant.
Figure 22. Image histograms of (a) our dataset and (b) Ara2012 dataset.
Table 1. Normalized confusion tables of segmentation results using visible camera (daytime).

Algorithm           TP        FN        FP        TN
Isodata             22.27%    6.15%     17.40%    54.18%
Li                  25.52%    2.90%     18.52%    53.06%
Local               17.28%    11.14%    27.75%    43.83%
Mean                24.54%    3.88%     18.11%    53.47%
Min                 25.59%    2.83%     18.76%    52.82%
Otsu                22.22%    6.20%     17.39%    54.19%
Multi-Otsu          19.26%    9.16%     5.19%     66.39%
Triangle            26.48%    1.94%     24.96%    46.62%
Yen                 23.63%    4.79%     18.07%    53.51%
K-Means (K = 2)     21.39%    7.03%     17.75%    53.83%
K-Means (K = 3)     20.33%    8.09%     5.51%     66.07%
Table 2. Normalized confusion tables of segmentation results using NoIR camera (daytime).

Algorithm           TP        FN        FP        TN
Isodata             19.69%    5.25%     22.49%    52.56%
Li                  21.18%    3.77%     24.57%    50.48%
Local               13.21%    11.74%    34.46%    40.59%
Mean                20.54%    4.41%     23.51%    51.54%
Min                 21.08%    3.87%     24.52%    50.54%
Otsu                19.67%    5.28%     22.46%    52.59%
Multi-Otsu          14.68%    10.27%    11.77%    63.28%
Triangle            22.85%    2.10%     33.18%    41.87%
Yen                 12.15%    12.80%    16.11%    58.94%
K-Means (K = 2)     19.70%    5.25%     22.33%    52.72%
K-Means (K = 3)     14.85%    10.10%    11.96%    63.09%
Table 3. Normalized confusion tables of segmentation results using NoIR camera (nighttime).

Algorithm           TP        FN        FP        TN
Isodata             1.46%     15.68%    5.50%     77.35%
Li                  10.65%    6.50%     11.38%    71.48%
Local               7.21%     9.94%     11.18%    71.68%
Mean                10.69%    6.45%     11.39%    71.46%
Min                 0.31%     16.84%    4.90%     77.95%
Otsu                1.39%     15.75%    5.46%     77.39%
Multi-Otsu          8.95%     8.20%     17.26%    65.59%
Triangle            2.94%     14.21%    3.91%     78.95%
Yen                 4.05%     13.10%    6.67%     76.19%
K-Means (K = 2)     16.07%    1.08%     76.26%    6.60%
K-Means (K = 3)     9.15%     7.99%     7.07%     75.78%
Table 4. Normalized confusion tables of segmentation results using NoIR camera (daytime and nighttime).

Algorithm           TP        FN        FP        TN
Isodata             10.15%    10.71%    13.59%    65.54%
Li                  15.66%    5.20%     17.66%    61.47%
Local               10.07%    10.80%    22.27%    56.87%
Mean                15.38%    5.48%     17.17%    61.97%
Min                 10.20%    10.66%    14.24%    64.89%
Otsu                10.10%    10.76%    13.56%    65.58%
Multi-Otsu          11.68%    9.18%     14.65%    64.49%
Triangle            12.42%    8.44%     17.85%    61.28%
Yen                 7.91%     12.96%    11.16%    67.97%
K-Means (K = 2)     17.80%    3.06%     50.57%    28.57%
K-Means (K = 3)     11.86%    9.00%     9.40%     69.74%
Table 5. Normalized confusion tables of segmentation results using thermal camera (daytime).
Method            TP        FN        FP        TN
Isodata           22.54%    5.68%     41.08%    30.70%
Li                20.28%    7.93%     35.62%    36.16%
Local             15.43%    12.79%    39.49%    32.29%
Mean              21.34%    6.88%     36.88%    34.90%
Min               13.02%    15.20%    25.07%    46.72%
Otsu              22.59%    5.63%     41.20%    30.59%
Multi-Otsu        19.10%    9.12%     30.72%    41.06%
Triangle          0.43%     27.79%    4.90%     66.88%
Yen               23.97%    4.25%     45.54%    26.24%
K-Means (K = 2)   21.41%    7.19%     37.10%    34.30%
K-Means (K = 3)   16.68%    11.92%    24.62%    46.78%
Table 6. Normalized confusion tables of segmentation results using thermal camera (nighttime).
Method            TP        FN        FP        TN
Isodata           23.50%    0.23%     56.25%    20.03%
Li                23.36%    0.37%     52.75%    23.52%
Local             12.62%    11.11%    41.56%    34.71%
Mean              22.96%    0.77%     46.23%    30.04%
Min               0.41%     23.32%    5.49%     70.78%
Otsu              23.50%    0.23%     56.40%    19.87%
Multi-Otsu        23.07%    0.65%     47.71%    28.56%
Triangle          22.98%    0.75%     48.58%    27.69%
Yen               23.47%    0.26%     55.41%    20.86%
K-Means (K = 2)   23.46%    0.30%     53.76%    22.49%
K-Means (K = 3)   17.69%    6.06%     28.38%    47.86%
Table 7. Normalized confusion tables of segmentation results using thermal camera (daytime and nighttime).
Method            TP        FN        FP        TN
Isodata           23.02%    2.94%     48.72%    25.32%
Li                21.83%    4.12%     44.25%    29.79%
Local             14.01%    11.94%    40.54%    33.51%
Mean              22.16%    3.80%     41.59%    32.45%
Min               6.67%     19.29%    15.20%    58.84%
Otsu              23.05%    2.91%     48.86%    25.19%
Multi-Otsu        21.10%    4.85%     39.28%    34.76%
Triangle          11.79%    14.17%    26.91%    47.14%
Yen               23.72%    2.24%     50.51%    23.53%
K-Means (K = 2)   22.53%    3.43%     46.19%    27.86%
K-Means (K = 3)   17.23%    8.73%     26.67%    47.37%
Table 8. Average and maximum Recall for each camera type.
Camera Type       Average Recall                          Maximum Recall
                  Daytime   Nighttime   Whole Day         Daytime   Nighttime   Whole Day
Visible camera    0.794     NA          NA                0.934     NA          NA
NoIR camera       0.761     0.405       0.574             0.917     0.937       0.868
Thermal camera    0.647     0.831       0.740             0.863     0.990       0.926
Table 9. Average and maximum Precision for each camera type.
Camera Type       Average Precision                       Maximum Precision
                  Daytime   Nighttime   Whole Day         Daytime   Nighttime   Whole Day
Visible camera    0.663     NA          NA                0.751     NA          NA
NoIR camera       0.568     0.329       0.443             0.599     0.572       0.575
Thermal camera    0.419     0.418       0.418             0.493     0.505       0.494
Table 10. Average and maximum F1 score for each camera type.
Camera Type       Average F1 Score                        Maximum F1 Score
                  Daytime   Nighttime   Whole Day         Daytime   Nighttime   Whole Day
Visible camera    0.584     NA          NA                0.794     NA          NA
NoIR camera       0.466     0.333       0.396             0.560     0.636       0.599
Thermal camera    0.319     0.282       0.301             0.405     0.383       0.394
Table 11. Average and maximum Recall for each segmentation method.
Segmentation Method   Average Recall                          Maximum Recall
                      Daytime   Nighttime   Whole Day         Daytime   Nighttime   Whole Day
Isodata               0.795     0.538       0.662             0.814     0.990       0.903
Li                    0.828     0.803       0.796             0.897     0.984       0.862
Local                 0.667     0.275       0.476             0.847     0.532       0.539
Mean                  0.820     0.796       0.795             0.864     0.968       0.871
Min                   0.739     0.319       0.486             0.896     0.621       0.731
Otsu                  0.794     0.536       0.661             0.816     0.990       0.904
Multi-Otsu            0.659     0.747       0.697             0.698     0.972       0.836
Triangle              0.622     0.570       0.511             0.934     0.968       0.527
Yen                   0.732     0.613       0.645             0.863     0.989       0.926
K-Means (K = 2)       0.773     0.963       0.876             0.791     0.988       0.885
K-Means (K = 3)       0.643     0.639       0.622             0.717     0.744       0.676
Table 12. Average and maximum Precision for each segmentation method.
Segmentation Method   Average Precision                       Maximum Precision
                      Daytime   Nighttime   Whole Day         Daytime   Nighttime   Whole Day
Isodata               0.575     0.279       0.403             0.650     0.454       0.472
Li                    0.593     0.506       0.522             0.702     0.545       0.571
Local                 0.479     0.168       0.319             0.597     0.324       0.347
Mean                  0.592     0.520       0.532             0.689     0.546       0.570
Min                   0.538     0.277       0.366             0.699     0.545       0.571
Otsu                  0.575     0.276       0.401             0.649     0.454       0.471
Multi-Otsu            0.598     0.496       0.513             0.731     0.503       0.538
Triangle              0.411     0.299       0.290             0.655     0.480       0.332
Yen                   0.514     0.367       0.400             0.669     0.457       0.474
K-Means (K = 2)       0.570     0.379       0.456             0.630     0.462       0.477
K-Means (K = 3)       0.603     0.539       0.534             0.751     0.572       0.575
Table 13. Average and maximum F1 score for each segmentation method.
Segmentation Method   Average F1 Score                        Maximum F1 Score
                      Daytime   Nighttime   Whole Day         Daytime   Nighttime   Whole Day
Isodata               0.460     0.238       0.321             0.558     0.295       0.325
Li                    0.468     0.398       0.406             0.578     0.489       0.476
Local                 0.375     0.122       0.241             0.462     0.233       0.257
Mean                  0.469     0.410       0.413             0.574     0.489       0.478
Min                   0.434     0.252       0.308             0.574     0.489       0.476
Otsu                  0.460     0.235       0.319             0.558     0.294       0.324
Multi-Otsu            0.581     0.436       0.455             0.794     0.546       0.553
Triangle              0.314     0.205       0.208             0.508     0.319       0.244
Yen                   0.418     0.335       0.341             0.562     0.372       0.359
K-Means (K = 2)       0.459     0.240       0.324             0.546     0.302       0.332
K-Means (K = 3)       0.585     0.509       0.496             0.792     0.636       0.599
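Since K-Means with K = 3 yields the highest average F1 score among the compared methods, a minimal color-clustering sketch is shown below. It is not the authors' implementation: it assumes scikit-learn's KMeans, and it selects the leaf cluster as the one whose center has the largest excess-green value, which is only one plausible selection rule; the file name in the usage comment is hypothetical.

```python
# Minimal K-Means (K = 3) leaf-segmentation sketch (assumptions: scikit-learn
# is available; the leaf cluster is picked by the largest excess-green value).
import numpy as np
from sklearn.cluster import KMeans

def kmeans_leaf_mask(rgb, k=3, seed=0):
    """Cluster pixel colors with K-Means and return a binary leaf mask."""
    h, w, _ = rgb.shape
    pixels = rgb.reshape(-1, 3).astype(np.float32)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(pixels)
    centers = km.cluster_centers_                     # shape (k, 3), RGB order
    # Excess green (2G - R - B) as a simple vegetation cue (assumption).
    exg = 2 * centers[:, 1] - centers[:, 0] - centers[:, 2]
    leaf_cluster = int(np.argmax(exg))
    return (km.labels_ == leaf_cluster).reshape(h, w)

# Usage (hypothetical file name):
#   from skimage import io
#   mask = kmeans_leaf_mask(io.imread("vetiver_day_001.jpg"), k=3)
```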
Table 14. Execution time of segmentation algorithms.
Segmentation Algorithm   Execution Time (s)
Isodata                  0.174
Li                       0.210
Local                    4.992
Mean                     0.175
Minimum                  0.342
Otsu                     0.183
Multi-Otsu               0.205
Triangle                 0.167
Yen                      0.176
K-Means (K = 2)          2.463
K-Means (K = 3)          3.483
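The timings in Table 14 depend on the embedded hardware and image resolution used in the experiments. As an illustration only, the sketch below times several of the global thresholding functions available in scikit-image on a synthetic bimodal grayscale image; it is not the benchmark code behind Table 14, and absolute values will differ. Local and Multi-Otsu thresholding are omitted here because they have different call signatures.

```python
# Minimal timing sketch for global thresholding (assumes scikit-image).
import time
import numpy as np
from skimage import filters

# Synthetic bimodal grayscale image as a stand-in for a captured frame.
rng = np.random.default_rng(0)
gray = np.concatenate([
    rng.normal(0.3, 0.05, 480 * 640 // 2),
    rng.normal(0.7, 0.05, 480 * 640 // 2),
]).reshape(480, 640)

methods = {
    "Isodata": filters.threshold_isodata,
    "Li": filters.threshold_li,
    "Mean": filters.threshold_mean,
    "Minimum": filters.threshold_minimum,
    "Otsu": filters.threshold_otsu,
    "Triangle": filters.threshold_triangle,
    "Yen": filters.threshold_yen,
}

for name, threshold in methods.items():
    start = time.perf_counter()
    mask = gray > threshold(gray)   # binary mask: True = candidate leaf pixels
    elapsed = time.perf_counter() - start
    print(f"{name}: {elapsed:.3f} s, foreground fraction {mask.mean():.2f}")
```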
Table 15. Normalized confusion tables of segmentation results of Arabidopsis thaliana plant.
Method            TP        FN        FP        TN
Isodata           32.70%    1.05%     15.49%    50.76%
Li                33.02%    0.72%     17.81%    48.44%
Local             24.39%    9.36%     25.24%    41.02%
Mean              32.73%    1.01%     15.48%    50.78%
Min               31.87%    1.88%     12.95%    53.30%
Otsu              32.68%    1.07%     15.35%    50.90%
Multi-Otsu        30.35%    3.39%     9.27%     56.98%
Triangle          30.41%    3.34%     10.94%    55.31%
Yen               32.24%    1.51%     18.20%    48.05%
K-Means (K = 2)   32.82%    0.92%     14.26%    52.00%
K-Means (K = 3)   32.47%    1.28%     3.49%     62.76%