Predicting Sweet Pepper Yield Based on Fruit Counts at Multiple Ripeness Stages Monitored by an AI-Based System Mounted on a Pipe-Rail Trolley

Shimomoto, Kota; Shimazu, Mitsuyoshi; Matsuo, Takafumi; Kato, Syuji; Naito, Hiroki; Kashino, Masakazu; Ohta, Nozomu; Yoshida, Sota; Fukatsu, Tokihiro

doi:10.3390/horticulturae11070718

Predicting Sweet Pepper Yield Based on Fruit Counts at Multiple Ripeness Stages Monitored by an AI-Based System Mounted on a Pipe-Rail Trolley †

by Kota Shimomoto ¹,*, Mitsuyoshi Shimazu ¹, Takafumi Matsuo ², Syuji Kato ², Hiroki Naito ³, Masakazu Kashino ¹, Nozomu Ohta ¹, Sota Yoshida ¹ and Tokihiro Fukatsu ¹

¹ Institute of Agricultural Machinery, National Agriculture and Food Research Organization, Tsukuba 3050856, Japan

² Takahiko Agro-Business Co., Ltd., Kokonoe 8794802, Japan

³ Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 1138657, Japan

* Author to whom correspondence should be addressed.

† The findings presented in this manuscript were partially reported in the proceedings of the International Symposium on New Technologies for Sustainable Greenhouse Systems: GreenSys2023, Cancún (Mexico), 22–27 October 2023. Available online: https://www.ishs.org/ishs-article/1426_49.

Horticulturae 2025, 11(7), 718; https://doi.org/10.3390/horticulturae11070718

Submission received: 19 May 2025 / Revised: 15 June 2025 / Accepted: 19 June 2025 / Published: 20 June 2025

(This article belongs to the Special Issue Artificial Intelligence in Horticulture Production)

Abstract

In our previous study, we developed a monitoring system for automatically counting tomatoes produced in protected horticulture using deep learning–based object detection. In this study, we adapted the system for sweet peppers and developed a monitoring system tailored to this crop. We evaluated its fruit detection and counting performance in a large-scale commercial greenhouse. Furthermore, we investigated the relationship between fruit counts at different ripeness stages and the total yield in the cultivation area, and we assessed the accuracy when predicting the yield for the following week. The results confirmed that the system maintained a stable fruit detection performance throughout the trial, and that its outputs were reliable enough to indicate its potential to replace manual counting. In addition, the average number of fruits at the 1–40% and 41–80% ripeness stages across six planting rows showed a correlation with the total weekly yield in the entire 0.6 ha cultivation area the following week. A yield prediction model using average fruit counts at these two ripeness stages as explanatory variables achieved a WAPE of 21.35%, indicating that the monitoring system is effective for yield prediction.

Keywords:

fruit detection; maturity classification; continuous monitoring; harvest timing; large-scale greenhouse

1. Introduction

With the introduction of cutting-edge technologies, protected horticulture has advanced toward large-scale operations. In addition to environmental data, biometric measurements have become increasingly common. Takayama et al. [1] conducted large-scale measurements of photosynthetic function in commercial greenhouses using a chlorophyll fluorescence measurement robot.

With the emergence of object detection using deep learning, fruit detection has become an active area of research, leading to the development of numerous fruit detection models, including those for tomatoes [2,3,4]. Sensing robots that estimate the yield by counting fruits through fruit detection have also been studied. Afonso et al. [5] developed a robot based on Mask R-CNN to detect tomatoes. To reduce false detections of background fruits in the camera’s field of view, depth information was used for filtering, thereby improving detection accuracy. Similarly, Rong et al. [6] developed a robot that counted mini tomato clusters using depth information to achieve high accuracy.

In a previous study, we developed a fruit monitoring system for tomatoes [7]. This system enables simple yet highly accurate fruit object detection without the use of depth information by employing a unique lighting method called the focus illumination unit (FIU) [8]. Furthermore, Naito et al. [9] utilized this system to predict the fruit count, yield, and harvesting time for rows of crops in low-density cultivation and long-term multistage tomato cultivation, and they evaluated the accuracy of these predictions. In that study, it was demonstrated that the yield of a planting row could be predicted five days in advance with a weighted absolute percentage error (WAPE) of approximately 25% through automated counting. However, this work was conducted in a small facility and focused on the yield of a single planting row, rather than a large-scale greenhouse with over 100 rows. Moreover, because harvesting is conducted weekly in year-round greenhouse cultivation, it is necessary not only to predict the final (cumulative) yield but also to estimate the quantity of fruit to be harvested in a given week. Therefore, unlike yield prediction in vineyards or orchards—where the harvest occurs once a year, as in the studies by Nuske et al. [10] and Stein et al. [11]—it is necessary to assess fruit maturity. In the study by Naito et al. [9], colored fruits were classified into two categories: “turning” and “red.” To further improve yield prediction accuracy, it is necessary to classify fruit maturity into more detailed stages.

Most automatic fruit-counting technologies introduced in protected horticulture have primarily focused on tomatoes. However, like tomatoes, sweet peppers are widely cultivated in large-scale protected horticulture systems [12]. Therefore, monitoring techniques and harvesting robots have been developed for this crop [13,14,15,16,17,18]. Sweet pepper production is characterized by significant fluctuations in yield due to phenomena such as fruit flushes and flower drop [19,20,21]. Therefore, real-time monitoring of fruit conditions and the ability to predict the yield several weeks in advance are crucial from a marketing perspective. In addition to these yield-related characteristics, sweet pepper plants differ from tomatoes in that they cannot undergo vine lowering due to their rigid stems, making it difficult to adjust fruit height during the cultivation period. As the cultivation progresses, plants can reach heights of up to 4 m, requiring image capture at elevated positions. Furthermore, unlike tomatoes, sweet pepper fruits often exhibit non-uniform coloration during ripening, with color change progressing in a mottled pattern [22,23]. During ripening, the surface of sweet pepper fruits may also temporarily turn dark brown [24,25]. Recently, several studies have focused on fruit detection and maturity assessment for sweet peppers [13,14]. However, these studies have not extended their analyses to yield prediction based on the number of fruits at each maturity stage.

In this study, our primary objective was to investigate the relationship between the number of sweet pepper fruits (N_f) at different ripening stages and the yield across a wide cultivation area with more than 100 crop rows. To achieve this, we applied our tomato fruit monitoring system to sweet peppers and developed a scanning device mounted on a high-positioning work platform. This device was used over several months in a large-scale facility to monitor the N_f at different ripening stages. During this period, its performance in fruit detection and counting was also evaluated. Moreover, the prediction accuracy of a model using N_f at each ripeness stage as an explanatory variable was evaluated.

2. Materials and Methods

2.1. Plant Materials and Greenhouse Specifications

Sweet pepper (Capsicum annuum var. annuum ‘Nagano’; transplant date: 21 September 2022) was grown on rockwool cubes (Grodan Plantop, GRODAN Group, Roermond, The Netherlands; 100 mm [W] × 100 mm [H] × 65 mm [D]) placed on rockwool slabs (GROW BAGS, NDK Eco Agro Co., Ltd., Oita, Japan; 1.0 m [W] × 0.2 m [H] × 75 mm [D]) in a 3.0 ha commercial greenhouse (net cultivation area: 2.4 ha) at Ai-Sai Farm Kokonoe, located at 33°25′ N, 131°28′ E. A total of 108 sweet pepper plants, each trained with three stems, were grown in a cultivation lane approximately 50 m long, oriented along the northwest–southeast axis. The cultivation area was divided into four rooms, each covering an area of 0.6 ha, with each room containing more than 130 plant rows. This study focused on one of these rooms, which had 144 planting rows. Professional growers managed the sweet pepper plants, and routine crop maintenance operations—such as sucker pruning, deleafing (removal of old leaves), wiring, and spraying—were conducted weekly.

2.2. Sweet Pepper Fruit Monitoring System

The sweet pepper fruit monitoring system consisted of a scanning device, an automatic data transfer and processing system, and an automatic analysis program for fruit detection and maturity classification. Each component is described in detail in the following subsections.

2.2.1. Fruit-Scanning Device with Dual Cameras

The main body of the scanning device was constructed using aluminum pipe components (GFF-400B, SUS Corporation, Shizuoka, Japan) and measured 1.3 m (H) × 0.4 m (W) × 0.25 m (D) (Figure 1a). It was equipped with two shooting units. In the tomato monitoring system [7,9], only one side of the two planting rows flanking the pipe rail was imaged. Therefore, images were captured on both the outbound and return trips, and the camera direction was switched at the end of the outbound trip. In contrast, the sweet pepper monitoring system used two shooting units to scan both sides of the planting rows simultaneously, eliminating the need to capture images on both trips. This allowed high-speed movement and reduced the measurement time, increasing the number of rows monitored per unit time. Each shooting unit included two light sources (ZFS-155000-CW, JKL Components Corporation, Los Angeles, CA, USA; correlated color temperature: 6500 K) with shading plates and a USB camera module (ELP-USBFHD01M-L21-JP, Shenzhen Ailipu Technology Co., Ltd., Shenzhen, China; frame rate: 60 fps) connected to a tablet computer (Raytrek RT08WT, ThirdWave Corporation, Tokyo, Japan) (Figure 1a,b). Each light source was a flexible, tape-type LED fixture in which LEDs were densely arranged in two rows to ensure brightness uniformity, and all light sources were powered by a 12 V mobile battery (C86-JP, Huizhou RoyPow Technology Co., Ltd., Huizhou, China; 23,400 mAh). The illuminance, as measured by a light meter (T-10A, Konica Minolta, Inc., Tokyo, Japan), was 540 lx at 30 cm from the LED light source. The camera modules employed an electronic rolling shutter and featured auto-exposure control, which automatically adjusted the shutter speed in response to ambient light. The sensor sensitivity was regulated by automatic gain control. As the measurements were conducted at night under stable lighting conditions, neither the shutter speed nor the sensor sensitivity was manually fixed. Each module was equipped with a 2.1 mm fixed-focus lens. The two camera modules were fixed at the bottom of the device, facing the left and right sides. Following the findings of Kurtser and Edan [26], the cameras were angled at a zenith angle of 60°, capturing images diagonally from below to facilitate the detection of fruits occluded by leaves.

Our previous system for tomato fruit monitoring used a rail-guided vehicle as the moving unit. However, in the Venlo-type greenhouse (Figure 2a), sweet pepper fruits reached heights exceeding 4 m during the final stages of annual cultivation. For high-position measurements in greenhouses, a hanging robot was developed by Kano et al. [27]. In a harvesting robot for sweet peppers, Arad et al. [18] approached high-position fruits using a moving platform with an elevation mechanism. In our new system for sweet peppers, to monitor high-position fruits, the scanning device was mounted on a pipe-rail trolley used for harvesting and routine crop maintenance at elevated positions (Figure 2b,c). The device weighed 13.16 kg and could be easily carried and attached to the trolley by greenhouse staff. During measurements, the trolley traveled along a pipe rail between planting rows at a constant speed of approximately 0.24 m/s.

2.2.2. Automatic Data Transfer and Analysis Process

The data acquisition and analysis workflow of the system is outlined below. After scanning, the tablet PCs were removed from the scanning device and placed at a station equipped with a cooling fan, located in a greenhouse office adjacent to the cultivation area. At a designated time, software on the tablet PCs automatically initiated the generation of panoramic images from videos recorded by the scanning device. As in our previous research [8], a series of strip-shaped photographs were extracted along vertical lines in the images and stitched into a single panoramic image, which was saved in JPEG format. These panoramic images were then divided into 512 × 512-pixel squares, and the square images were uploaded to online storage at scheduled times. The widths of the strip-shaped photographs were manually adjusted according to the speed of the pipe-rail trolley. A computer located outside the greenhouse retrieved the square images, counted the fruits, and assessed their maturity levels. Fruit maturity classification was performed separately after detection, as described in Section 2.2.3.
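The strip extraction and tiling steps described above can be sketched as follows. This is a minimal illustration rather than the authors' software: it assumes the video frames are already decoded into NumPy arrays, takes each strip from the center column of the frame, and zero-pads the panorama so that it divides evenly into squares. The function names `stitch_strips` and `tile_panorama`, the strip position, and the padding rule are all assumptions made for this example.

```python
import numpy as np


def stitch_strips(frames, strip_width):
    """Build a panorama from one vertical strip per frame.

    The strip is cut around the center column (an assumption); in the paper,
    strip_width is tuned by hand to match the trolley speed.
    """
    strips = []
    for frame in frames:
        center = frame.shape[1] // 2
        start = center - strip_width // 2
        strips.append(frame[:, start:start + strip_width])
    return np.hstack(strips)


def tile_panorama(panorama, tile=512):
    """Split a panorama into tile x tile squares, zero-padding the edges."""
    height, width = panorama.shape[:2]
    pad_h = (-height) % tile
    pad_w = (-width) % tile
    pad = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (panorama.ndim - 2)
    padded = np.pad(panorama, pad, mode="constant")
    tiles = []
    for y in range(0, padded.shape[0], tile):
        for x in range(0, padded.shape[1], tile):
            tiles.append(padded[y:y + tile, x:x + tile])
    return tiles
```

A wider strip suits a faster trolley, because each frame then has to cover more of the row; the sketch leaves that choice to the caller through `strip_width`.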

2.2.3. Fruit Detection Model

Consistent with our previous studies [7,8,9], Mask R-CNN models built in a TensorFlow-Keras environment [28] were used for fruit detection. Mask R-CNN identifies and classifies objects in images using bounding boxes and performs pixel-level instance segmentation to distinguish objects of the same category [29,30]. The model was run in an environment equipped with a graphics card (GeForce® RTX 2080 Ti, NVIDIA Corporation, Santa Clara, CA, USA), a CPU (Intel® Core™ i9-10900K, 10 cores, 3.70 GHz, Intel Corporation, Santa Clara, CA, USA), 64 GB of RAM, and a 64-bit Windows 10 Pro OS.

Training images were obtained between 7 October and 18 November 2022 from a cultivation area within the same greenhouse but distinct from the areas described later. Videos were converted into images with a resolution of 512 × 512 pixels. Although the default input size in Mask R-CNN is set to 1024 × 1024, preliminary experiments showed no considerable difference in detection performance between 1024 × 1024 and 512 × 512 images. Therefore, in this study, an input size of 512 × 512 was adopted to reduce data transfer and the image processing time. The dataset consisted of 1042 images for training and 190 for validation, with polygon annotations added. A standard Mask R-CNN model with a ResNet101 backbone, pre-trained on the Microsoft Common Objects in Context (MS COCO) dataset [31], was fine-tuned using these annotated images. Data augmentation was performed using the Imgaug library [32], including horizontal flipping with a 50% probability, affine transformations of 45° and 90°, and scaling from 0.5 to 1.5 times, each with a 50% chance. The number of iterations in each epoch corresponded to the number of training images, and training was run for 500 epochs, following prior studies [8] and preliminary experiments, which revealed no notable signs of overfitting. Therefore, early stopping was not applied. A learning rate of 0.001, a learning momentum of 0.9, and a weight decay of 0.0001 were used. The intersection over union (IoU) threshold was 0.5. In this study, fruit detection was performed for a single category, without distinguishing between mature and immature fruits.

2.2.4. Fruit Maturity Classification

As in our previous study [9], Python 3.7.13 was used to convert RGB images to the HSL color model for masked objects, and fruit maturity was classified based on hue values. Following criteria from Jonauskaite et al. [33] and Naito et al. [9], pixel hue values in masked images were classified into red, turning, and green categories, as shown in Table 1.

The coloring rate of each detected fruit was calculated using the following formula:

R_{f} = \frac{Pix_{HR} + 0.25 \times Pix_{HT}}{Pix_{ALL}},    (1)

where R_f is the coloring rate for red, Pix_ALL is the total number of pixels in the masked image, Pix_HR is the number of pixels classified as red, and Pix_HT is the number of pixels classified as turning, based on hue values. To appropriately evaluate the maturation stage characterized by the onset of subtle coloration and the appearance of a brownish hue across a relatively wide area of the fruit surface, we applied a multiplication factor of 0.25 to Pix_HT as an adjustment. This factor was empirically determined through preliminary experiments. To minimize the impact of overexposure, pixels with brightness values over 210 were excluded from the coloring rate calculation, and masked images with an average brightness above 210 were removed from analysis, as in our previous study [9]. All image processing steps were conducted using the OpenCV library (version 3.4.2) in Python (version 3.7.13).
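As an illustration of Equation (1), the following sketch computes the coloring rate from a list of RGB pixels belonging to one fruit mask. It is not the authors' code: it uses the standard-library `colorsys` module instead of OpenCV, the hue ranges `RED_HUE` and `TURNING_HUE` are placeholders (the actual ranges are those of Table 1, which is not reproduced here), and it assumes that "brightness" means the HSL lightness channel on a 0 to 255 scale and that overexposed pixels are also left out of Pix_ALL.

```python
import colorsys

# Placeholder hue ranges in degrees (assumptions; Table 1 defines the real
# ones). Hues that wrap around 360 degrees are not handled in this sketch.
RED_HUE = (0.0, 20.0)
TURNING_HUE = (20.0, 60.0)
BRIGHTNESS_LIMIT = 210   # from the text, on a 0-255 scale
TURNING_WEIGHT = 0.25    # weighting factor in Equation (1)


def coloring_rate(pixels):
    """Return R_f for (R, G, B) mask pixels, or None if none are usable."""
    n_red = n_turning = n_all = 0
    for r, g, b in pixels:
        hue, lightness, _sat = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        if lightness * 255.0 > BRIGHTNESS_LIMIT:
            continue  # overexposed pixel, skipped entirely
        n_all += 1
        hue_deg = hue * 360.0
        if RED_HUE[0] <= hue_deg < RED_HUE[1]:
            n_red += 1
        elif TURNING_HUE[0] <= hue_deg < TURNING_HUE[1]:
            n_turning += 1
    if n_all == 0:
        return None
    return (n_red + TURNING_WEIGHT * n_turning) / n_all
```

With these placeholder ranges, a mask of two red pixels, one orange pixel, one green pixel, and one overexposed white pixel gives (2 + 0.25 × 1) / 4 = 0.5625.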

Fruit ripeness was defined based on the coloring rate as follows (Figure 3):

0%: Enlargement stage;
1–40%: Early maturation stage;
41–80%: Mid-to-late maturation stage;
81–100%: Ripe fruit.

The ripeness classification criteria were discussed with a team including producers, and it was assumed that fruits in the 81–100% category would be harvested within one week of measurement. The remaining 1–80% range was divided into early and mid-to-late maturation stages.
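The four ripeness classes listed above can be expressed as a simple lookup on the coloring rate. The sketch below is only one possible reading: the class boundaries are given in whole percentages, so it assumes that R_f is rounded to the nearest integer percentage before comparison. The function name `ripeness_stage` is hypothetical.

```python
def ripeness_stage(rate):
    """Map a coloring rate in [0, 1] to one of the four ripeness classes.

    Rounding to a whole percentage is an assumption made for this sketch.
    Note that Python's round() sends exact .5 cases to the even neighbor.
    """
    percent = round(rate * 100)
    if percent <= 0:
        return "enlargement"
    if percent <= 40:
        return "early maturation"
    if percent <= 80:
        return "mid-to-late maturation"
    return "ripe"
```

Counting the fruits per class over all tiles of a row then gives the N_f values used in the later analyses.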

2.3. On-Site Trials of Continuous Fruit Monitoring in a Commercial Sweet Pepper Greenhouse

We used our developed system to track the number of colored sweet pepper fruits grown across six rows of plants (divided into three sites with two rows each) within a designated 0.6-hectare area (comprising 144 rows) in the greenhouse (Figure 4). Among these sites, one (Site 2) was located near the center, while the other two (Sites 1 and 3) were placed equidistant from the center and the perimeter of the designated area. The initial week of monitoring, from 20 to 26 November 2022, was designated as Week 1. Measurements were conducted weekly, typically on Friday nights, and occasionally on Thursdays to accommodate other farming activities, from 25 November 2022 (Week 1) to 10 February 2023 (Week 12). Given that this study involved nighttime and long-term measurements, we limited the number of monitored rows to six in consideration of the associated workload, as monitoring six rows could be completed in approximately 30 min. In this study, “colored sweet pepper fruits” refers to fruits at the turning and fully ripe stages, excluding fruits at the 0% ripeness level (i.e., the enlargement stage with fully green fruits).

Through this monitoring, we conducted the following investigations: (1) fruit detection performance; (2) counting performance for colored sweet pepper fruits; (3) the relationship between the number of detected colored fruits and weekly yield; (4) prediction and accuracy verification of the weekly yield for the following week.

2.3.1. Fruit Detection Performance

The fruit detection performance of the Mask R-CNN model was evaluated using average precision (AP). The evaluation was conducted separately for the images obtained on each monitoring day to assess performance stability over time. We used all the square images (512 × 512 pixels) obtained from one side of Sites 1–3.
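For readers unfamiliar with the metric, AP can be sketched as the area under the precision–recall curve. The text does not state which AP variant was used, so the example below assumes all-point interpolation; it also assumes that each detection has already been marked as a true or false positive by matching it to a ground-truth fruit at the chosen IoU threshold. The function name `average_precision` is hypothetical.

```python
def average_precision(detections, n_ground_truth):
    """All-point interpolated AP.

    detections: list of (confidence, is_true_positive) pairs.
    n_ground_truth: number of annotated fruits in the evaluated images.
    """
    if n_ground_truth <= 0:
        raise ValueError("n_ground_truth must be positive")
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _score, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / n_ground_truth)
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas under the interpolated curve.
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap
```

Running this once per monitoring day, as in the evaluation described above, gives one AP value per week.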

2.3.2. Counting Performance for Colored Sweet Pepper Fruits

To evaluate counting performance, we selected 20 plants from the middle of a planting row in Site 2. A grower manually counted the colored fruits on these plants from Week 1 to Week 11. Counting was conducted either on the day of monitoring or the following day, depending on other cultivation tasks. Counting was not performed in Week 12. Manual counting of unripe green fruits was omitted due to their large numbers, which would have significantly increased the workload during the long-term study. The grower subjectively classified the counted fruits into three maturity classes—1–40%, 41–80%, and 81–100%—based on experience.

2.3.3. Relationship Between the Number of Detected Colored Fruits and Weekly Yield

To develop a yield prediction method, we investigated the relationship between the number of detected colored fruits and the weekly yield. Harvesting occurred two to three times per week. The total yield (i.e., the weight of harvested sweet pepper fruits) was measured for the entire 0.6-hectare cultivation area and for each of Sites 1–3 from Week 2 (27 November–3 December 2022) to Week 13 (12–18 February 2023). Greenhouse temperature and humidity data were obtained from the environmental control system (Priva Connext, Priva B.V., De Lier, Netherlands), and outdoor solar radiation data were obtained from the agrometeorological grid square dataset [34] for the greenhouse’s location.

2.3.4. Prediction and Accuracy Verification of Weekly Yield in the Following Week of Monitoring

As a first step toward developing a yield prediction framework, we focused on short-term forecasting and conducted weekly yield predictions for the week following monitoring using the automatically counted N_f by ripeness stage. Multiple linear regression analysis was employed, with the N_f at each ripeness stage serving as the explanatory variables and the weekly yield of the entire 0.6-hectare area in the following week as the response variable. The prediction model was defined as

y_{i+1} = \alpha_{1} x_{1,i} + \alpha_{2} x_{2,i} + \alpha_{3} x_{3,i} + C_{1},    (2)

where the explanatory variables x_1,i, x_2,i, and x_3,i represent the N_f at ripeness stages of 1–40%, 41–80%, and 81–100%, respectively, in week i; and coefficients α₁, α₂, and α₃, as well as constant C₁, were determined using the least squares method. The prediction accuracy was evaluated using the leave-one-out cross-validation method. For each target week, the model was trained using data from all weeks except the target week, and then used to predict the yield for that week. This process was repeated for all weeks, and the weighted absolute percentage error (WAPE) [9] was calculated to assess the prediction performance:

\mathrm{WAPE} = \frac{\sum_{i=2}^{n} |y_{i} - \hat{y}_{i}|}{\sum_{i=2}^{n} |y_{i}|},    (3)

where y_i and \hat{y}_i denote the observed and predicted yields in week i, and n indicates the total number of weeks. To examine the optimal combination of fruit counts across ripeness stages, WAPE was calculated using Equations (4)–(9):

y_{i+1} = \alpha_{4} x_{1,i} + \alpha_{5} x_{2,i} + C_{2},    (4)

y_{i+1} = \alpha_{6} x_{2,i} + \alpha_{7} x_{3,i} + C_{3},    (5)

y_{i+1} = \alpha_{8} x_{1,i} + \alpha_{9} x_{3,i} + C_{4},    (6)

y_{i+1} = \alpha_{10} x_{1,i} + C_{5},    (7)

y_{i+1} = \alpha_{11} x_{2,i} + C_{6},    (8)

y_{i+1} = \alpha_{12} x_{3,i} + C_{7}.    (9)

In addition, to better support the WAPE-based assessment, error distribution graphs based on residuals were plotted.
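The evaluation procedure of Equations (2)–(9) can be sketched as below, assuming NumPy is available. The code fits an ordinary least-squares model with a constant term, predicts each week from a model trained on all other weeks, and computes WAPE as in Equation (3). The function names `wape` and `loocv_predictions` are hypothetical, and the sketch is not the authors' implementation. Each row of `X` holds the fruit counts in week i and the matching entry of `y` holds the yield in week i + 1.

```python
import numpy as np


def wape(observed, predicted):
    """Weighted absolute percentage error, returned as a fraction."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(observed - predicted).sum() / np.abs(observed).sum()


def loocv_predictions(X, y):
    """Leave-one-out predictions from least-squares linear regression.

    X: (weeks, stages) fruit counts, or a 1-D array for a single stage.
    y: yield in the week after each row of X.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # The appended column of ones carries the constant term C.
    design = np.column_stack([X, np.ones(len(y))])
    predictions = np.empty(len(y))
    for k in range(len(y)):
        train = np.arange(len(y)) != k
        coef, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
        predictions[k] = design[k] @ coef
    return predictions
```

Selecting different columns of `X` reproduces the variable combinations of Equations (4)–(9); for example, `wape(y, loocv_predictions(X[:, :2], y))` would correspond to the model with the 1–40% and 41–80% stages, provided `X` is a NumPy array whose columns are ordered by ripeness stage.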

3. Results

3.1. Fruit Detection Performance

Figure 5a shows 512 × 512-pixel square images captured using the developed monitoring system. Owing to our unique lighting method, referred to as the focused illumination unit, the background crops appear almost completely black and are barely visible. Figure 5b presents the results of fruit detection using Mask R-CNN, with detected fruit areas highlighted in color and the remaining regions shown in grayscale. Figure 5c–f illustrate examples of maturity classification at the 0% (c), 1–40% (d), 41–80% (e), and 81–100% (f) maturity stages. Specifically, these include the original masked fruit images, image analysis results, and pie charts illustrating the coloring rate. The blue areas in the image analysis results indicate pixels with brightness values greater than 210, which were excluded from the coloring rate calculations. Figure 6 and Table 2 present the precision–recall curves and average precision (AP) values at IoU = 0.5, used to evaluate the fruit detection performance throughout all monitoring weeks. All precision–recall curves from Week 1 to Week 12 showed a consistent pattern, with relatively low precision levels. The AP values remained stable during the entire monitoring period, ranging from 71% to 77%.

3.2. Counting Performance for Colored Sweet Pepper Fruits

Figure 7 illustrates the relationship between N_f manually counted by the grower and those automatically counted by the monitoring system at each ripeness stage. The correlation coefficient (r) for the 1–40% ripeness stage was 0.96 (p < 0.01), with a regression slope of 0.80 (Figure 7a). For fruits at the 41–80% stage, r was 0.63 (p < 0.05), with a regression slope of 0.19 (Figure 7b). For fruits at the 81–100% stage, r was 0.92 (p < 0.01), with a regression slope of 0.87 (Figure 7c). For fruits at the 1–40% and 81–100% stages, the actual N_f counted manually ranged from 0 to 30. In contrast, for the 41–80% stage, the N_f remained below 10 throughout all weeks.
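The two statistics reported here, the correlation coefficient and the regression slope, can be computed as in the sketch below, assuming NumPy is available. It assumes that the automatic count is regressed on the manual count, so that a slope below 1 means the system counts fewer fruits than the grower. The function name `compare_counts` is hypothetical.

```python
import numpy as np


def compare_counts(manual, automatic):
    """Return (Pearson r, slope) for automatic counts regressed on manual counts."""
    manual = np.asarray(manual, dtype=float)
    automatic = np.asarray(automatic, dtype=float)
    r = np.corrcoef(manual, automatic)[0, 1]
    slope, _intercept = np.polyfit(manual, automatic, 1)
    return r, slope
```

A high r with a slope well below 1 indicates that the system tracks week-to-week changes while counting fewer fruits than the grower in absolute terms.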

3.3. Relation Between the Number of Detected Colored Fruits and the Weekly Yield

Figure 8a shows the weekly cumulative solar radiation and total weekly yield in the 0.6 ha cultivation room during the experimental period. Week 1 corresponded to 20–26 November 2022 and Week 13 to 12–18 February 2023. The weekly cumulative solar radiation gradually decreased until Week 5, then increased and peaked in Week 7. Although it declined again until Week 10, it rose once more in Week 11. Overall, cumulative solar radiation fluctuated in a zigzag pattern throughout the experimental period. The average daily temperature and humidity inside the greenhouse remained stable, at 18.87 °C (±0.52 °C S.D.) and 77.24% (±3.61% S.D.), respectively. The weekly total yield increased sharply after the start of measurements, peaking at 7044 kg in Week 4. It then declined rapidly, with a slight rebound in Week 8, but continued to decrease overall, reaching a minimum of 255 kg in Week 10. Subsequently, the yield recovered, exceeding 2000 kg in Weeks 12 and 13.

Figure 8b shows the time course of the average N_f in Sites 1–3 at each ripeness stage. Error bars represent the standard deviation. For fruits at the 1–40% ripeness stage (yellow markers), the count increased after the experiment began, peaking in Week 2. A high value of nearly 200 was maintained in Week 3. The count then dropped sharply to around 50 in Weeks 7 and 8, remaining below 50 through Week 10, before recovering to approximately 100 in Week 11. For fruits at the 41–80% ripeness stage (orange markers), the count was below 10 in Week 1 but rose to over 50 from Weeks 2 to 4. Although it declined thereafter, a small increase was observed in Week 7. The count then fell below 10 in Weeks 9 and 10, before recovering to approximately 50 in Weeks 11 and 12. For fruits at the 81–100% ripeness stage (red markers), the count was also below 10 in Week 1, similar to the 41–80% stage. It increased to 90 by Week 4, declined through Week 6, then rose again in Week 7 to a maximum of 134. This was followed by a decrease below 10 in Week 10, with a subsequent recovery in Weeks 11 and 12, similar to the other stages.

A comparison of ripeness stages revealed differences in both the timing and magnitude of fruit count changes. The average N_f across the experimental period also varied: 82 (±52 S.D.) for the 1–40% stage, 35 (±25 S.D.) for the 41–80% stage, and 55 (±37 S.D.) for the 81–100% stage. While overall temporal trends were similar across ripening stages, the timing of fluctuations varied slightly, and several distinct patterns emerged. First, the timing of the early peaks shifted by ripeness stage: Week 2 for 1–40%, Week 3 for 41–80%, and Week 4 for 81–100%. The 1–40% stage showed an increase in Week 6, while the 41–80% and 81–100% stages increased in Week 7. Notably, the increase in the 81–100% stage in Week 7 significantly exceeded its earlier peak, marking the highest value during the experimental period. Regarding the lowest fruit counts in the latter half of the experiment, the 1–40% stage reached its minimum in Week 8, while the 41–80% and 81–100% stages reached theirs in Week 10. In the recovery phase (Weeks 11 and 12), the 1–40% stage showed a sharp increase to around 100 fruits in Week 11, followed by a slight decline to about 50 in Week 12. In contrast, the 41–80% stage increased modestly in Week 11 and continued rising gradually in Week 12. The 81–100% stage exhibited a marked linear increase over Weeks 11 and 12.

Simple linear regression analysis was conducted to examine the relationship between the average N_f, automatically counted by the monitoring system for each ripeness stage, and the total weekly yield in the following week (Figure 9). For fruits at the 1–40% and 41–80% ripeness stages, strong correlations were observed, with r values of 0.88 and 0.94, respectively (p < 0.01) (Figure 9a,b). However, for fruits at the 81–100% ripeness stage, no significant correlation was found, with an r value of 0.48 and a p-value greater than 0.05 (Figure 9c).

Figure 10a shows the temporal trends in the average weekly total yield for the entire 0.6 ha cultivation area and for Sites 1–3. The average yield for the entire area was calculated by dividing the weekly total yield by 72, corresponding to the number of working aisles. This value represents the average yield per operation, where one operation refers to harvesting along two crop rows during a single entry into a working aisle. Error bars for Sites 1–3 represent standard deviations. Compared to the overall average, the average weekly total yield in Sites 1–3 was slightly higher during Weeks 2–4 and slightly lower during Weeks 5–7. Thereafter, the values for both followed nearly identical trends, with similar fluctuations throughout the experimental period. A regression analysis between the two datasets (Figure 10b) yielded an r value of 0.98 (p < 0.01), with a slope of 1.10.

3.4. Prediction and Accuracy Verification of Weekly Yield in the Following Week of Monitoring

Table 3 presents the WAPE results for each combination of explanatory variables (i.e., prediction formulas) used to estimate the weekly total yield in the following week. Combinations are listed in ascending order of WAPE, with smaller values indicating higher prediction accuracy. The combination of (‘1–40%’, ‘41–80%’) produced the highest prediction accuracy, with a WAPE of 21.35%. This was followed by the combination including all ripeness stages (‘1–40%’, ‘41–80%’, ‘81–100%’), which yielded the second-lowest WAPE. The third-best accuracy was achieved using only the (‘41–80%’) stage as the explanatory variable. The model with the lowest accuracy was the one using only the (‘81–100%’) stage, which resulted in the highest WAPE of 63.18%.

Figure 11 shows the error distribution graphs based on residuals for each combination of explanatory variables. In most combinations, the histogram of residuals showed the highest frequency near the center of the distribution and exhibited a well-balanced and consistent shape without noticeable outliers. For the combinations of (‘1–40%’, ‘41–80%’) and (‘1–40%’, ‘41–80%’, ‘81–100%’), residuals did not exceed 2000. In contrast, when only the (‘81–100%’) stage was used as the explanatory variable, residuals exceeding 4000 were observed symmetrically.

4. Discussion

The detection performance was relatively low, with an average precision (AP) of 0.71–0.77 (Table 2), which is believed to be due to leaf occlusion. In experiments conducted by Naito et al. [7], the fruit detection performance for tomatoes grown using low-node-order pinching and high-density planting methods (conditions with high occlusion) showed an AP of approximately 0.8, so the results of the present study were clearly lower. However, in terms of counting performance, the N_f counted automatically by the system correlated with the N_f counted manually by the grower at all ripeness stages (Figure 7). This suggests that the automatic counting system performs well enough to serve as a substitute for manual counting. The relatively low detection performance was attributed to low precision (Figure 6), likely resulting from false detections caused by the visual similarity between green fruits and surrounding leaves. As shown in Figure 5b, some leaves were incorrectly identified as green fruits. In the current method, fruits are detected regardless of color and then classified by ripeness based on hue. Therefore, given that the counting performance for colored fruits (the main focus of this study) was sufficiently high, it is presumed that most false detections occurred for green fruits, which were not included in the final count. These false detections likely contributed to the lower AP. Consequently, AP could potentially be improved by training and running inference with a model focused solely on colored fruits. Additionally, the use of newer instance segmentation models [35] may be worth exploring.

Regarding the stability of detection performance, AP remained consistent throughout the approximately three-month experimental period (Table 2). This stability can be largely attributed to the consistent appearance and condition of the crops, maintained through professional cultivation management by skilled staff in a high-tech commercial greenhouse. Despite changes in crop growth, fruit count, and fruit positioning over time, the developed system maintained a consistent level of fruit detection performance. This stability likely supports fruit-counting accuracy and may have a positive effect on subsequent yield prediction.

The regression slopes for the counting results were 0.80, 0.19, and 0.87 for the three ripeness stages, respectively (Figure 7), indicating that the system consistently underestimated actual fruit counts. This underestimation was likely due to occlusion, as only the surface of the fruit facing the camera is visible. As noted, visibility of the peppers is limited, and capturing the condition of all fruits (including coloration) using simple photography from outside the canopy is challenging. Hemming et al. [36] reported that increasing the number of cameras from one to five improves visibility. However, using multiple cameras increases both cost and analytical complexity. In particular, identifying the same fruit across different camera views adds difficulty and significantly increases the processing time. Even with five cameras, the maximum fruit detection rate under 50% occlusion was 90%, which may drop to 76% depending on the cultivation method and season. Kurtser and Edan [26] also found that combining multiple viewpoints can improve detection rates up to 85%. Conversely, Harel et al. [23] reported that fruit maturity classification using a single viewpoint can sometimes reach very high accuracy (up to 93%), although results tend to be unstable. They further noted that integrating more than three viewpoints can stabilize accuracy, albeit at the cost of higher expenses and longer processing times. Therefore, while accurately counting the absolute N_f may be limited, using a single, low-cost, and simple camera to capture the overall trend is considered a realistic and practical approach for large-scale measurement.
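The link between a regression slope below 1 and systematic undercounting can be illustrated with a short sketch. The counts below are hypothetical values chosen so that the system sees exactly 80% of the fruits; they are not data from this study, and fitting a line with a free intercept is an assumption, since the fitting details are not restated in this section.

```python
import numpy as np

# Hypothetical counts for one ripeness stage (NOT data from the study).
manual = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
# Simulate a system that detects 80% of the fruits the grower counts.
automatic = 0.8 * manual

# Ordinary least-squares line: automatic = slope * manual + intercept.
slope, intercept = np.polyfit(manual, automatic, deg=1)

# Pearson correlation between the two series.
r = np.corrcoef(manual, automatic)[0, 1]

print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.2f}")
```

In this idealized case the correlation is perfect even though one fruit in five is missed, which is why a high correlation and a slope below 1 can coexist: the trend is captured while the absolute count is underestimated.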

For fruits at the 41–80% ripeness stage, although a correlation was observed, the slope of the simple linear regression was extremely low at 0.19 compared with the other ripeness stages (Figure 7d). This may be attributed to the relatively small N_f at this stage and the lack of a sharp contrast between periods of high and low fruit counts, which made it more difficult to detect a clear correlation. Considering that 80% or more of the fruits in the 1–40% and 81–100% ripeness stages were successfully counted (Figure 7b,f) and that the overall recall was not particularly low (Figure 6), it is unlikely that fruits in the 41–80% ripeness stage were disproportionately missed. Similarly, it is unlikely that occlusion affected only this stage. Therefore, the automatic counting system, which relies on the visible surface captured by the camera, most likely misclassified some fruits at the 41–80% ripeness stage into other stages. Moreover, in cases where the actual N_f in the 81–100% ripeness stage was zero, the system still counted up to six fruits (Figure 7e). No red-colored objects other than fruits were present in the captured images, and no nonfruit objects were found to have been misclassified at this ripeness stage. Therefore, it is likely that fruits from the 1–40% and 41–80% stages were misclassified. In particular, fruits in the 41–80% ripeness stage, which typically have a higher proportion of red hues, were more likely to be misclassified. This misclassification was likely not due to errors in the hue-based classification algorithm itself but rather to occlusion. When part of a fruit is hidden and the visible surface captured by the camera shows a high proportion of red, the system may classify it as belonging to the 81–100% ripeness stage, even if it actually belongs to the 41–80% stage. In the study by Viveros Escamilla et al. [14], sweet pepper fruits at the mid-late stage (from 50% to 90% ripeness but no more) were most frequently misclassified as mature (over 90% ripeness).

For the reasons described above, if we aim to address the misclassification of ripening stages without increasing the number of cameras in future work, it may be effective to track individual fruits over time using a monocular camera. By consistently identifying the same fruit and monitoring its ripening progression, we may be able to estimate how many days have passed since the onset of color change. This temporal information could enable us to infer the ripening stage even when the opposite side of the fruit is not visible to the camera, thereby potentially reducing classification errors.

As shown in Figure 8, the total yield exceeded 7000 kg in week 4—marking a peak—and dropped to its lowest value in week 10, indicating dynamic fluctuations throughout the experimental period. This is considered to be a result of the characteristic “flush” phenomenon of sweet pepper crops [21]. The N_f at each ripeness stage generally followed a trend similar to that of the weekly total yield, with a one-week lead. However, slight differences in the timing of fluctuations were observed among the ripening stages. In the first half of the experiment, earlier peaks were recorded in weeks 2, 3, and 4 for the 1–40%, 41–80%, and 81–100% stages, respectively, reflecting the progression of ripening. This suggests a temporal shift in fruit counts among the stages due to maturation over time.

A spike in the N_f at the 81–100% ripeness stage was observed in week 7. This was presumed to result from operational constraints rather than crop conditions, as weeks 6 and 7 coincided with the year-end and New Year holidays, during which harvesting was limited to once per week. The harvest in week 6 (25–31 December) was conducted on 27 December, while the harvest in week 7 (1–7 January) was conducted on 5 January, resulting in a gap of nine days between harvests. Therefore, it is likely that some fruits at the 81–100% stage remained unharvested during the single harvest in week 7. In addition, during weeks 6 and 7, weekly cumulative solar radiation increased, potentially accelerating fruit ripening and resulting in a greater N_f at the 81–100% ripeness stage. Because this experiment was conducted during the winter season, environmental conditions aside from those mentioned above remained relatively stable, and their impact on yield was likely limited. However, it should be noted that in summer or during periods with greater fluctuations in solar radiation (and consequently in greenhouse temperature and humidity), environmental factors may have a more significant impact on yield.

In Figure 9, it is notable that the N_f at the 81–100% ripeness stage, whose fruits are more likely than those at other stages to be harvested in the following week, did not show a significant correlation with the total weekly yield in the following week. This may be because, aside from the irregular spike in week 7, the proportion of 81–100% ripeness stage fruits among those harvested in the following week was not as high as expected. One possible explanation is that, based on the results of the counting performance test, some fruits at the 41–80% ripeness stage may have been misclassified as 81–100%. This suggests that the actual N_f at the 81–100% ripeness stage might have been lower than shown in Figure 8b. Conversely, the N_f at the 41–80% ripeness stage may have been higher than indicated, not only because of the potential misclassification described above but also due to the low counting rate (approximately 20%, as shown in Figure 7d). Therefore, it is likely that the proportion of 41–80% ripeness stage fruits among those harvested in the following week was higher than that of fruits at other ripeness stages. This could explain why the N_f at the 41–80% ripeness stage showed the highest correlation with the total weekly yield in the following week (Figure 9b), indicating that this ripeness stage may have had the greatest influence on subsequent harvest volumes.

As mentioned above, the absolute number of fruits at each ripeness stage shown in Figure 8b may not be highly reliable due to possible misclassifications and limitations in counting performance. However, as shown in Figure 7, the fruit counts at all ripeness stages, as measured by the monitoring system, were significantly correlated with the actual number of fruits. Thus, it can be assumed that the system reliably captures trends in the increase or decrease in fruit counts at each stage. The correlation between the N_f at the 1–40% ripeness stage and the weekly total yield in the following week was lower than that at the 41–80% stage. This is likely because the probability of fruits in the 1–40% ripeness stage being harvested in the following week was lower than that of fruits in the 41–80% stage.

Notably, the fruit count monitoring results shown in Figure 9 are based on only six planting rows in Sites 1–3. These six rows accounted for less than 5% of the total 144 planting rows within the 0.6 ha cultivation area. Nevertheless, the monitored fruit counts at the 1–40% and 41–80% ripeness stages showed strong correlations with the weekly total yield in the following week, with r values of 0.88 and 0.94, respectively. This strong correlation can be attributed, in part, to the fact that, as shown in Figure 10, the yield of the monitored planting rows exhibited temporal fluctuations similar to those of the entire cultivation area throughout the experimental period. This suggests that the crops in this commercial greenhouse were cultivated under highly uniform conditions. However, in future studies, it will be important to assess the variability in yield across the entire cultivation area to determine the appropriate number of planted rows for accurate and representative monitoring.
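The one-week-ahead relationship discussed here pairs the fruit count of week t with the yield of week t + 1. A minimal sketch of that pairing follows; the function name `lagged_pearson` and both series are hypothetical and were constructed so that the lagged relationship is exactly linear. They are not measurements from this study.

```python
import numpy as np

def lagged_pearson(counts, yields, lag=1):
    """Pearson r between counts in week t and yield in week t + lag."""
    counts = np.asarray(counts, dtype=float)
    yields = np.asarray(yields, dtype=float)
    if len(counts) != len(yields):
        raise ValueError("series must have the same length")
    if lag < 1 or lag >= len(counts):
        raise ValueError("lag must be between 1 and len(series) - 1")
    x = counts[:-lag]   # weeks 1 .. n - lag
    y = yields[lag:]    # weeks 1 + lag .. n
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical weekly series (average fruit count per row, total yield in kg).
counts = [12, 18, 25, 20, 14, 10]
yields = [3000, 3400, 4600, 6000, 5000, 3800]

r_lag1 = lagged_pearson(counts, yields, lag=1)
print(f"r (one-week lead) = {r_lag1:.2f}")
```

Shifting the series shortens it by one week per unit of lag, so with a 12-week trial only 11 pairs are available for a one-week lead.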

Furthermore, Figure 10 suggests that the effect of the illuminance from the fruit monitoring system on physiological functions and sweet pepper production may be negligible. The illuminance was 540 lx at a distance of 30 cm from the LED. The device had a depth of 0.25 m and moved at a speed of about 0.24 m/s, resulting in a light exposure duration of approximately one second. Even if there were any physiological effects in a longer-term experiment, the number of measured specimens (measured rows) represented less than 5% of the entire 0.6 ha cultivation area, suggesting that the impact on overall production (yield) would be minimal.
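The two figures quoted in this paragraph follow from simple arithmetic, reproduced below as a check. The sketch assumes that a plant is lit only while the device is directly alongside it, so that the lit length equals the device depth; the variable names are illustrative.

```python
# Values stated in the text.
device_depth_m = 0.25          # depth of the scanning device
trolley_speed_m_per_s = 0.24   # approximate travel speed
monitored_rows = 6
total_rows = 144

# Time for the device to pass a fixed point (ASSUMPTION: lit length = device depth).
exposure_s = device_depth_m / trolley_speed_m_per_s

# Share of planting rows that were scanned.
monitored_share_pct = 100 * monitored_rows / total_rows

print(f"exposure ≈ {exposure_s:.2f} s")                 # ≈ 1.04 s
print(f"monitored rows ≈ {monitored_share_pct:.2f} %")  # ≈ 4.17 %
```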

Yield is influenced by both the number and size of fruits. In this system, we predicted yield without measuring fruit size. Because fruits are partly occluded by leaves, fruit size cannot be estimated accurately from pixel dimensions in frontal images alone. Therefore, future work should verify whether fruit size varies significantly. Incidentally, we confirmed that the fruit size distribution remained largely stable during the investigated period: fruits weighing over 160 g accounted for more than 80% of the total weekly yield in the 0.6 ha cultivation area, except from week 5 to week 7.

Regarding the prediction accuracy shown in Table 3, the combination of fruit counts from the 1–40% and 41–80% ripeness stages yielded the best performance. This result aligns with the findings in Figure 9, which indicate that these two ripeness stages are effective predictors of the yield in the following week. This combination outperformed the use of all ripeness stages as explanatory variables, likely because, as shown in Figure 9c, the fruit count at the 81–100% ripeness stage negatively affected the prediction. This is also consistent with the finding that using only the 81–100% stage as an explanatory variable resulted in the highest WAPE. Among the models using a single explanatory variable, the highest prediction accuracy was achieved with the 41–80% ripeness stage, which also aligns with the findings in Figure 9.
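For readers who want to reproduce this kind of comparison on their own data, the sketch below fits a multiple linear regression on two fruit-count variables and scores it with WAPE. It assumes the common definition WAPE = 100 × Σ|y − ŷ| / Σ|y|; the helper names and the small data set are hypothetical, and the sketch scores the fit on the same data it was trained on, which may differ from the evaluation procedure used in this study.

```python
import numpy as np

def wape(actual, predicted):
    """Weighted absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.abs(actual - predicted).sum() / np.abs(actual).sum()

def fit_linear(X, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 + ...; returns [b0, b1, ...]."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef

def predict(coef, X):
    X = np.asarray(X, dtype=float)
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Hypothetical weekly data: columns are average counts at 1-40% and 41-80%.
X = [[10, 4], [14, 3], [20, 9], [16, 5], [9, 6]]
# Hypothetical yield in the following week (kg), built as 500 + 100*x1 + 200*x2.
y = [2300, 2500, 4300, 3100, 2600]

coef = fit_linear(X, y)
print("coefficients:", np.round(coef, 1))
print(f"in-sample WAPE = {wape(y, predict(coef, X)):.2f} %")
```

Because WAPE divides the total absolute error by the total actual yield, weeks with large harvests weigh more heavily than weeks with small ones, which suits a crop with pronounced flushes.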

As Table 3 compares prediction models with different sets of explanatory variables, the number of variables used must also be considered. From this perspective, the fact that the combination of only 1–40% and 41–80% ripeness stage fruit counts outperformed the model using all ripeness stages suggests that it is more effective to focus on earlier ripening stages when predicting the yield in the following week, rather than simply increasing the number of variables. However, the effectiveness of a given ripeness stage is influenced by the fruit maturation speed. Therefore, it is important to note that the results may be affected by the seasonal and regional climatic conditions under which fruit monitoring and yield predictions are conducted. Previous studies related to fruit detection in protected horticulture [5,9,37] have typically classified fruit ripeness broadly—using categories such as “red” or “turning and red” to represent mature stages. In contrast, our approach classifies ripeness into three stages, enabling the identification of the stages most useful for yield prediction based on harvest timing. This approach proved effective; however, some misclassification was observed. It should be noted that increasing the resolution of ripeness classification while maintaining a high classification accuracy remains a challenge.

Although the WAPE of 21.35% reported in this study does not represent a high prediction accuracy, there is potential for improvement through the incorporation of additional parameters not included in the current model, such as weather conditions and maturation speed, as well as through further data accumulation, enhanced sensing technology, and consideration of the number of monitored rows. Nevertheless, given that the trial was conducted in a large-scale commercial greenhouse—where many factors can influence yield—and that only 4% of the planting rows were monitored, and considering that the prediction model used only simple fruit counts at each ripeness stage, the achieved accuracy was not necessarily low. Moreover, given the near-normal distribution of residuals and the absence of noticeable outliers in the error distribution graphs shown in Figure 11, the model’s predictions can be regarded as stable. These characteristics further support the reliability of the WAPE score as a meaningful performance metric. Taken together, these results are promising for future research. This study focused on the use of colored fruits for short-term yield prediction. However, for long-term prediction, the classification of the growth stages of immature fruits, as demonstrated by Moon et al. [13], would be necessary.

Our trolley-mounted scanning device required a person to ride and operate the trolley. While it would be desirable to mount the system on a mobile platform with an elevation function [18] or a suspended mobile unit [27], our device offers advantages such as significantly reduced initial costs and easy implementation without the need for rails or mapping for autonomous movement. Therefore, it may be a practical option for those seeking to adopt such systems without incurring high costs.

5. Conclusions

This study aimed to investigate the yield prediction performance based on the number of sweet pepper fruits at different ripeness stages monitored by our developed system, as well as its fruit detection and counting performance. After several months of trials in a large-scale commercial greenhouse (Ai-Sai Farm Kokonoe, located at 33°25′ N, 131°28′ E), a stable fruit detection performance was observed, and the counting performance for the three classes of colored fruits was found to be sufficient to substitute for manual counting by growers. Additionally, for fruits at the 1–40% and 41–80% ripeness stages, the average N_f measured in six crop rows—representing less than 5% of the entire cultivation area—showed a correlation with the total yield of the entire area in the week following the measurement. The yield prediction performance of a multiple linear regression model using the average fruit counts at these two ripeness stages as explanatory variables achieved a WAPE of 21.35%. These results indicate that our monitoring system performs well under actual sweet pepper production conditions and is useful for short-term yield prediction in large-scale cultivation areas.

Author Contributions

Conceptualization, K.S., M.S. and H.N.; methodology, K.S., M.S., T.M., S.K., H.N. and T.F.; software, K.S., H.N. and N.O.; validation, K.S., M.K. and S.Y.; formal analysis, K.S.; investigation, K.S., M.S., T.M. and S.K.; resources, K.S., T.M., S.K. and H.N.; data curation, K.S. and M.K.; writing—original draft preparation, K.S.; writing—review and editing, K.S., M.S., T.M., S.K., H.N., M.K., N.O., S.Y. and T.F.; visualization, K.S., M.K. and S.Y.; supervision, T.F.; project administration, K.S., M.S. and T.M.; funding acquisition, K.S., H.N. and T.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Ministry of Agriculture, Forestry, and Fisheries (MAFF) of Japan, as a commissioned study on “future agricultural production utilizing artificial intelligence”; and by the Japan Society for the Promotion of Science (JSPS) KAKENHI (grant no. JP22K14974).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

This article is a revised and expanded version of a paper entitled “Development of double-camera AI system for efficient monitoring of paprika fruits” [38], which was presented at the International Symposium on New Technologies for Sustainable Greenhouse Systems: GreenSys2023, Cancún, 22–27 October 2023. The authors thank Unseok Lee, Senior Scientist at the Research Center for Agricultural Robotics at NARO, for providing the annotation support tool.

Conflicts of Interest

The authors declare the following potential competing interests: H. Naito and T. Fukatsu applied for patent “plant-imaging device and plant-imaging method” (WO/2020/218323); H. Naito applied for patent “plant-body imaging device” (JP/2022/099767).

References

1. Takayama, K.; Hirota, R.; Takahashi, N.; Yamamoto, K.; Sakai, Y.; Okada, H.; Nishina, H.; Arima, S. Development of Chlorophyll Fluorescence Imaging Robot for Practical Use in Commercial Greenhouse. Acta Hortic. 2014, 1037, 671–676.
2. Sun, X. Enhanced Tomato Detection in Greenhouse Environments: A Lightweight Model Based on S-YOLO with High Accuracy. Front. Plant Sci. 2024, 15, 1451018.
3. Wang, S.; Xiang, J.; Chen, D.; Zhang, C. A Method for Detecting Tomato Maturity Based on Deep Learning. Appl. Sci. 2024, 14, 11111.
4. Zhao, J.; Bao, W.; Mo, L.; Li, Z.; Liu, Y.; Du, J. Design of Tomato Picking Robot Detection and Localization System Based on Deep Learning Neural Networks Algorithm of Yolov5. Sci. Rep. 2025, 15, 6180.
5. Afonso, M.; Fonteijn, H.; Fiorentin, F.S.; Lensink, D.; Mooij, M.; Faber, N.; Polder, G.; Wehrens, R. Tomato Fruit Detection and Counting in Greenhouses Using Deep Learning. Front. Plant Sci. 2020, 11, 571299.
6. Rong, J.; Zhou, H.; Zhang, F.; Yuan, T.; Wang, P. Tomato Cluster Detection and Counting Using Improved YOLOv5 Based on RGB-D Fusion. Comput. Electron. Agric. 2023, 207, 107741.
7. Naito, H.; Shimomoto, K.; Fukatsu, T.; Hosoi, F.; Ota, T. Interoperability Analysis of Tomato Fruit Detection Models for Images Taken at Different Facilities, Cultivation Methods, and Times of the Day. AgriEngineering 2024, 6, 1827–1846.
8. Shimomoto, K.; Naito, H.; Fukatsu, T.; Ota, T. Development of an AI-Based Fruit Monitoring System with a Focus Illumination Unit for Tomatoes Grown in Greenhouse Horticulture. Eng. Agric. Environ. Food 2025, 18, 23–31.
9. Naito, H.; Ota, T.; Shimomoto, K.; Hosoi, F.; Fukatsu, T. Accuracy Assessment of Tomato Harvest Working Time Predictions from Harvestable Fruit Counts in Panoramic Images. Agriculture 2024, 14, 2257.
10. Nuske, S.; Wilshusen, K.; Achar, S.; Yoder, L.; Narasimhan, S.; Singh, S. Automated Visual Yield Estimation in Vineyards. J. Field Robot. 2014, 31, 837–860.
11. Stein, M.; Bargoti, S.; Underwood, J. Image Based Mango Fruit Detection, Localisation and Yield Estimation Using Multiple View Geometry. Sensors 2016, 16, 1915.
12. Sabir, N.; Singh, B. Protected Cultivation of Vegetables in Global Arena: A Review. Indian J. Agric. 2013, 83, 123–135.
13. Moon, T.; Park, J.; Son, J.E. Prediction of the Fruit Development Stage of Sweet Pepper (Capsicum annuum var. annuum) by an Ensemble Model of Convolutional and Multilayer Perceptron. Biosyst. Eng. 2021, 210, 171–180.
14. Viveros Escamilla, L.D.; Gómez-Espinosa, A.; Escobedo Cabello, J.A.; Cantoral-Ceballos, J.A. Maturity Recognition and Fruit Counting for Sweet Peppers in Greenhouses Using Deep Learning Neural Networks. Agriculture 2024, 14, 331.
15. Hemming, J.; Bac, C.W.; van Tuijl, B.A.J.; Barth, R.; Bontsema, J.; Pekkeriet, E.J.; van Henten, E.J. A Robot for Harvesting Sweet-Pepper in Greenhouses. In Proceedings of the International Conference of Agricultural Engineering, Zurich, Switzerland, 6–10 July 2014.
16. Bac, C.W.; Hemming, J.; van Tuijl, B.A.J.; Barth, R.; Wais, E.; van Henten, E.J. Performance Evaluation of a Harvesting Robot for Sweet Pepper. J. Field Robot. 2017, 34, 1123–1139.
17. Lehnert, C.; McCool, C.; Sa, I.; Perez, T. A Sweet Pepper Harvesting Robot for Protected Cropping Environments. arXiv 2018, arXiv:1810.11920.
18. Arad, B.; Balendonck, J.; Barth, R.; Ben-Shahar, O.; Edan, Y.; Hellström, T.; Hemming, J.; Kurtser, P.; Ringdahl, O.; Tielen, T.; et al. Development of a Sweet Pepper Harvesting Robot. J. Field Robot. 2020, 37, 1027–1039.
19. Heuvelink, E.; Marcelis, L.F.M.; Körner, O. How to Reduce Yield Fluctuations in Sweet Pepper? Acta Hortic. 2004, 633, 349–355.
20. Heuvelink, E.; Marcelis, L.; Kierkels, T. Young Fruits Pull so Hard Flowers Above Abort. Wagening. Univ. Res. 2016, 5, 16–17.
21. Homma, M.; Watabe, T.; Ahn, D.-H.; Higashide, T. Dry Matter Production and Fruit Sink Strength Affect Fruit Set Ratio of Greenhouse Sweet Pepper. J. Am. Soc. Hortic. Sci. 2022, 147, 270–280.
22. Shinozaki, Y.; Nicolas, P.; Fernandez-Pozo, N.; Ma, Q.; Evanich, D.J.; Shi, Y.; Xu, Y.; Zheng, Y.; Snyder, S.I.; Martin, L.B.B.; et al. High-Resolution Spatiotemporal Transcriptome Mapping of Tomato Fruit Development and Ripening. Nat. Commun. 2018, 9, 364.
23. Harel, B.; van Essen, R.; Parmet, Y.; Edan, Y. Viewpoint Analysis for Maturity Classification of Sweet Peppers. Sensors 2020, 20, 3783.
24. Kasampalis, D.S.; Tsouvaltzis, P.; Ntouros, K.; Gertsis, A.; Gitas, I.; Siomos, A.S. The Use of Digital Imaging, Chlorophyll Fluorescence and Vis/NIR Spectroscopy in Assessing the Ripening Stage and Freshness Status of Bell Pepper Fruit. Comput. Electron. Agric. 2021, 187, 106265.
25. Wang, L.; Zhong, Y.; Liu, J.; Ma, R.; Miao, Y.; Chen, W.; Zheng, J.; Pang, X.; Wan, H. Pigment Biosynthesis and Molecular Genetics of Fruit Color in Pepper. Plants 2023, 12, 2156.
26. Kurtser, P.; Edan, Y. Statistical Models for Fruit Detectability: Spatial and Temporal Analyses of Sweet Peppers. Biosyst. Eng. 2018, 171, 272–289.
27. Kano, T.; Toda, S.; Unno, H.; Fujiuchi, N.; Nishina, H.; Takayama, K. Development of Hanging-Type Multiple Biological Information Imaging Robot for Growth Monitoring of Tomato Plants. Eco Eng. 2022, 34, 37–44.
28. Matterport/Mask_RCNN. Available online: https://github.com/matterport/Mask_RCNN (accessed on 26 April 2025).
29. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 386–397.
30. Wang, X.; Zhang, R.; Shen, C.; Kong, T.; Li, L. SOLO: A Simple Framework for Instance Segmentation. arXiv 2020, arXiv:1912.04488.
31. Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014.
32. Aleju/Imgaug. Available online: https://github.com/aleju/imgaug (accessed on 26 April 2025).
33. Jonauskaite, D.; Mohr, C.; Antonietti, J.-P.; Spiers, P.M.; Althaus, B.; Anil, S.; Dael, N. Most and Least Preferred Colours Differ According to Object Context: New Insights from an Unrestricted Colour Range. PLoS ONE 2016, 11, e0152194.
34. Ohno, H.; Sasaki, K.; Ohara, G.; Nakazono, K. Development of Grid Square Air Temperature and Precipitation Data Compiled from Observed, Forecasted, and Climatic Normal Data. Clim. Biosph. 2016, 16, 71–79.
35. Yue, X.; Qi, K.; Na, X.; Zhang, Y.; Liu, Y.; Liu, C. Improved YOLOv8-Seg Network for Instance Segmentation of Healthy and Diseased Tomato Plants in the Growth Stage. Agriculture 2023, 13, 1643.
36. Hemming, J.; Ruizendaal, J.; Hofstee, J.W.; van Henten, E.J. Fruit Detectability Analysis for Different Camera Positions in Sweet-Pepper. Sensors 2014, 14, 6032–6044.
37. Wang, X.; Vladislav, Z.; Viktor, O.; Wu, Z.; Zhao, M. Online Recognition and Yield Estimation of Tomato in Plant Factory Based on YOLOv3. Sci. Rep. 2022, 12, 8686.
38. Shimomoto, K.; Shimazu, M.; Matsuo, T.; Kato, S.; Naito, H.; Fukatsu, T. Development of double-camera AI system for efficient monitoring of paprika fruits. In Proceedings of the International Symposium on New Technologies for Sustainable Greenhouse Systems: GreenSys2023, Cancún, Mexico, 22–27 October 2023.

Figure 1. Fruit-scanning device with dual cameras: (a) overview and side view; (b) top view and cameras located at the bottom of the device.

Figure 2. (a) Sweet pepper harvesting using a pipe-rail trolley during the final stages of annual cultivation in a Venlo-type greenhouse. (b) Scanning device mounted on a pipe-rail trolley. (c) Nightly monitoring.

Figure 3. Sweet pepper fruits at different ripening stages. (a) Enlargement stage; (b) early maturation stage; (c) mid-to-late maturation stage; (d) ripe fruit.

Figure 4. Schematic top view of the 0.6 ha cultivation area in the commercial greenhouse, and the locations of Sites 1–3.

Figure 5. Square image taken by the monitoring system: (a) RGB image; (b) masked image. Examples of maturity classification at (c) 0%, (d) 1–40%, (e) 41–80%, and (f) 81–100% maturity stages (left: original image; middle: image analysis result; right: pie chart showing the coloration rate).

Figure 6. Precision–recall curves evaluating fruit detection performance across all monitoring weeks.

Figure 7. Time courses of manually (by the grower) and automatically (by the monitoring system) counted N_f by ripeness stage: (a) 1–40%; (c) 41–80%; (e) 81–100%. Relationships between manually (by the grower) and automatically (by the monitoring system) counted N_f by ripeness stage: (b) 1–40%; (d) 41–80%; (f) 81–100%.

Figure 8. (a) Time courses of weekly cumulative solar radiation and total weekly yield in the 0.6 ha cultivation area (144 plant rows). (b) Time courses of the average N_f in Sites 1–3 at each ripeness stage.

Figure 9. Relationships between the weekly total yield of the 0.6 ha cultivation area (144 plant rows) in the following week and the automatically counted average N_f by the monitoring system at each ripeness stage: (a) 1–40%; (b) 41–80%; (c) 81–100%.

Figure 10. (a) Time courses of average weekly total yield for the 0.6 ha cultivation area (144 plant rows) and Sites 1–3. (b) Relationship between the average weekly total yield for all rows and that for Sites 1–3.

Figure 11. Error distribution graphs based on residuals for each combination of explanatory variables.

Table 1. Classification of sweet pepper fruit maturity based on hue values.

Color	Hue Value
Red	0 ≤ hue < 30, 345 ≤ hue < 360
Turning	30 ≤ hue < 70
Green	70 ≤ hue < 160
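The hue ranges in Table 1 can be turned into a per-pixel colour class and then into a ripeness stage, as sketched below. Only the hue boundaries come from Table 1. The definition of the coloration rate (share of red and turning pixels among classified pixels), the handling of hues outside the listed ranges, the treatment of stage boundaries as continuous thresholds, and all function and parameter names are assumptions made for illustration, not the authors' implementation.

```python
def hue_class(hue_deg):
    """Map a hue angle in degrees [0, 360) to a Table 1 colour class."""
    if 0 <= hue_deg < 30 or 345 <= hue_deg < 360:
        return "red"
    if 30 <= hue_deg < 70:
        return "turning"
    if 70 <= hue_deg < 160:
        return "green"
    return None  # hue not covered by Table 1; ignored here (ASSUMPTION)

def ripeness_stage(pixel_hues, colored_classes=("red", "turning")):
    """Bin a fruit into a ripeness stage from the hues of its mask pixels.

    ASSUMPTION: coloration rate = colored pixels / classified pixels * 100,
    with "colored" meaning the classes listed in `colored_classes`.
    """
    classes = [c for c in map(hue_class, pixel_hues) if c is not None]
    if not classes:
        return None
    rate = 100.0 * sum(c in colored_classes for c in classes) / len(classes)
    if rate == 0:
        return "0%"
    if rate <= 40:
        return "1–40%"
    if rate <= 80:
        return "41–80%"
    return "81–100%"

# Example: a fruit whose visible surface is half red (hue 10) and half green (hue 100).
print(ripeness_stage([10] * 5 + [100] * 5))
```

Because the stage is computed only from visible pixels, hiding the green half of a partly ripened fruit raises the computed rate, which matches the occlusion-driven misclassification discussed in Section 4.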

Table 2. Average precision values for IoU = 0.5 across all monitoring weeks.

Week	AP^{IoU = 0.50}
1	0.71
2	0.71
3	0.73
4	0.77
5	0.72
6	0.76
7	0.76
8	0.74
9	0.73
10	0.71
11	0.74
12	0.73
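The stability of AP over the trial can be summarized directly from the values in Table 2. The short sketch below only restates those twelve values and computes their mean and range; the variable names are illustrative.

```python
from statistics import mean

# AP at IoU = 0.50 for monitoring weeks 1-12, copied from Table 2.
ap_by_week = {
    1: 0.71, 2: 0.71, 3: 0.73, 4: 0.77, 5: 0.72, 6: 0.76,
    7: 0.76, 8: 0.74, 9: 0.73, 10: 0.71, 11: 0.74, 12: 0.73,
}

mean_ap = mean(ap_by_week.values())
min_ap = min(ap_by_week.values())
max_ap = max(ap_by_week.values())

print(f"mean AP = {mean_ap:.2f}")              # 0.73
print(f"range   = {min_ap:.2f}-{max_ap:.2f}")  # 0.71-0.77
```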

Table 3. WAPE results for each combination of explanatory variables used to predict the total yield in the following week.

Variable Combination	Equation	WAPE (%)
(‘1–40%’, ‘41–80%’)	4	21.35
(‘1–40%’, ‘41–80%’, ‘81–100%’)	2	24.06
(‘41–80%’)	8	25.70
(‘41–80%’, ‘81–100%’)	5	28.70
(‘1–40%’, ‘81–100%’)	6	29.24
(‘1–40%’)	7	38.53
(‘81–100%’)	9	63.18


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
