Automated Detection of Branch Shaking Locations for Robotic Cherry Harvesting Using Machine Vision

Automation in cherry harvesting is essential to reduce the demand for seasonal labor for cherry picking and to reduce the cost of production. Mechanical shaking of tree branches is one of the widely studied and used techniques for harvesting small tree fruit crops like cherries. To automate the branch shaking operation, different methods of detecting branches and cherries in full foliage canopies of the cherry tree have been developed previously. The next step in this process is the localization of shaking positions in the detected tree branches for mechanical shaking. In this study, a method of locating shaking positions for automated cherry harvesting was developed based on branch and cherry pixel locations determined using RGB images and 3D camera images. First, branch and cherry regions were located in 2D RGB images. Depth information provided by a 3D camera was then mapped onto the RGB images using a standard stereo calibration method. The overall root mean square error in estimating the distance to desired shaking points was 0.064 m. Cherry trees trained in two different canopy architectures, Y-trellis and vertical trellis systems, were used in this study. Harvesting testing was carried out by shaking tree branches at the locations selected by the algorithm. For the Y-trellis system, a maximum fruit removal efficiency of 92.9% was achieved using up to five shaking events per branch. For the vertical trellis system, the maximum fruit removal efficiency was 86.6% with up to four shakings per branch. However, it was found that only three shakings per branch would achieve a fruit removal percentage of 92.3% and 86.4% in the Y-trellis and vertical trellis systems, respectively.


Introduction
Cost and availability of human labor are two major concerns for sweet cherry growers. As cherries are characterized by clusters of small fruit in random spatial locations and orientations in tree canopies, hand picking is highly labor intensive. Though labor-saving technology for sweet cherry harvesting is a critical need for the industry [1], no commercially adoptable sweet cherry harvesters have been available to growers so far. Advancement in mechanical sweet cherry harvesting would help the cherry industry to be more competitive and sustainable in the long term [2].
Over the past several decades, many studies have been conducted to develop mechanical and automated solutions for harvesting tree fruit crops [3][4][5][6]. One of the widely used harvesting techniques is the use of vibrational energy, which has been shown to be an effective way of harvesting various types of fruit crops, especially berries and cherries with smaller fruit growing in clusters [7][8][9]. In this harvesting method, fruit removal is accomplished by delivering vibrational energy at an appropriate frequency using trunk shakers, limb shakers, or canopy shakers [10][11][12][13], which generally detaches fruit at its abscission zone [7].
Robotics 2017, 6, 31
Peterson et al. (1999) [14] proposed a fully automated bulk apple harvester with an imaging system to guide a robotic arm. Automatic image segmentation was attempted using a color camera. It was reported that detection of apples was successful but the detection of branches was considerably more difficult. Therefore, identifying shaking positions on the branches was completed manually by clicking at the desired point on the tree images. Such manual operation provided the two-dimensional coordinates (x, y) of the shaking location, and the depth was determined physically by allowing the actuator to move until a limit switch was pressed when it came in contact with the branch. The development of a robust image processing system that can detect branches and a method to automatically locate shaking positions in the tree branches is the desirable next step towards fully automated tree fruit harvesting with a mechanical shaker. A previous study [15] developed a method of detecting cherry tree branches, which are often covered with leaves and fruit, using morphological properties of visible branch segments and reported a detection accuracy of 89% in a vertical planar architecture. A follow-up study [16] further improved the branch detection method by using the location of cherry clusters as a clue to detect heavily or completely occluded branches in high foliage density orchards trained in the Upright Fruiting Offshoots (UFO) architecture. The improved method reported a branch detection accuracy of 93% in a UFO orchard architecture with a Y-trellis training system. These studies showed promise for a machine vision system to automate shake-and-catch cherry harvesting systems. The next step in automated shake-and-catch harvesting is to automatically determine and locate the shaking positions in the detected tree branches for shaking off cherries.
There have been numerous studies on improving mechanical harvesting technology for sweet cherries to increase harvesting efficiency and quality of harvested fruit while keeping the harvest-induced damage to the tree at a minimum level. Some studies have indicated that higher fruit removal efficiency can be obtained by prolonged shaking at multiple shaking positions [17,18]. Zhou et al. (2013) [18] showed that fruit removal efficiency is affected by shaking frequency and duration, and reported 81% fruit removal with four intermittent shakings of 5 s each at a frequency of 18 Hz. Other studies have indicated that excitation position on the tree branches also influences the transfer of energy along the branch and consequently affects the fruit removal efficiency [19,20]. Zhou et al. (2014) [20] tested fruit removal efficiency at different excitation positions with a hand-held shaker by dividing Y-trellis cherry trees into four excitation zones. The results indicated that maximum fruit removal was achieved for the lowest shaking position followed by the highest shaking position. It was also reported that up to 97% fruit removal efficiency could be achieved if shaking was performed at both the top and bottom excitation zones. Peterson and Wolford (2001) [9] developed a mechanical cherry harvester with a branch impacting mechanism and catching conveyor system, which transported harvested cherries to a collection bin. They reported a potential for harvesting up to 85-92% of sweet cherries, but the harvester also caused damage to the tree and fruit through multiple impacts during harvesting. Another major drawback was the difficulty in seeing the branches and the subsequent difficulty in positioning the impacting mechanism on tree branches. Chen et al. (2012) [7] also reported that the limited visibility for the operator made it challenging to accurately aim the shaking mechanism at targeted branch locations. Larbi et al. (2015) [21] modified a cherry harvester by replacing the impacting mechanism with a continuous shaking mechanism in order to reduce the damage caused by the impact. In addition, the shaking mechanism was operated using a remote controller to improve the harvester's operability. With a remote controller in hand, the operator had more flexibility to move around to get a proper view and target the shaking mechanism accurately. With this added flexibility, the efficiency of hitting a target branch with an impactor was 93% and the average time required for such maneuvering was 19.9 s per position. The results indicated that positioning the shaking mechanism on the target branches was still problematic because the presence of a catching conveyor under the target branch prevented the operator from getting close enough for a better view. In addition, the operator's skill level affects the harvest rate and positioning time. Larbi and Karkee (2014) [22] showed that there was an operator variability of 8.3% in shaker positioning time and up to 16.1% in fruit removal rate.
The difficulty in positioning the shaking mechanism on target branches is due to the limited visibility of the branches. A machine vision system can be more effective in detecting tree branches in dense canopies and positioning the shakers at target locations [23]. Such an automated method will not only eliminate the visibility issues but also reduce the positioning time taken by operators. The detection of tree branches has been carried out in previous studies with satisfactory accuracy. However, to position shaking mechanisms at proper shaking locations, the shaking positions need to be identified in the detected branches. Therefore, this research focused on developing a method of automatically selecting shaking positions in tree branches for effective cherry harvesting. The shaking positions are selected considering the amount of cherries available for harvest and their relative location in the canopy.

Test Orchard
The study was conducted in the Washington State University experimental orchard in Prosser, Washington. Cherry trees trained in two different architectures were used in this research. The first cherry block was the Skeena variety, trained in the Upright Fruiting Offshoots (UFO) system with vertical limbs (Figure 1a). The vertical offshoots were tied to four trellis wires at 0.6 m, 1.08 m, 1.65 m and 2.2 m above the ground. The orchard had row spacing of 3 m with tree spacing of 1.8 m and approximate canopy height of 3.5 m. The second cherry block was of the Selah variety, trained in the UFO Y-trellis system with fruiting limbs oriented at about 55° to the horizontal surface (Figure 1b). The Y-trellis orchard had a tree spacing of 4.3 m × 1.7 m and tree height of approximately 3.5 m.

Image Acquisition
Image acquisition was completed at night with artificial illumination to avoid variation in natural lighting conditions. LED lights (Trilliant® 36 Light Emitting Diode, Grote, Madison, IN, USA) were used for illuminating the imaging region. Color images of cherry trees were acquired using a Bumblebee® XB3 (Point Grey Research Inc., Richmond, BC, Canada) camera, which is a stereo-vision device with three lenses. The RGB images captured by the central camera, with a focal length of 6 mm, a Horizontal Field of View (HFOV) of 43° and a resolution of 1280 × 960 pixels, were used for detecting branches and cherries. A time-of-flight (ToF) based 3D camera (PMD CamCube 3.0, PMD Technologies, Siegen, Germany) was used to capture the depth information. The camera system was mounted on an electric vehicle. The setup used for imaging in the Y-trellis system is shown in Figure 2. A detailed description of the imaging setup can be found in [15,16].

Co-Registration of Depth and RGB Images
RGB images were analyzed for detecting and reconstructing tree branches using various color and geometric features of tree branches and cherry clusters. A time-of-flight-based 3D camera was used along with an RGB camera to obtain depth information for the detected branches. The mapping of depth information from 3D camera images onto RGB images was carried out using a standard stereo camera calibration algorithm from the MATLAB camera calibration toolbox [24]. The resolution of the 3D camera was 200 × 200 pixels whereas the resolution of the RGB camera was 1280 × 960 pixels. To match the resolutions, 3D images were up-sampled by linear interpolation to 1280 × 960 pixels. The calibration algorithm provided the intrinsic camera parameters including focal length (F), principal point (C) and distortion coefficient (k) (skew coefficient for radial and tangential distortions). Using these intrinsic parameters, a stereo calibration was performed to determine the extrinsic camera parameters, which included the rotation (R) and translation (T) vectors of the 3D camera with respect to the RGB camera [24,25]. For more details on the co-registration process used in this work, please refer to [25].
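The two steps described above (up-sampling the depth image and transferring 3D-camera points into the RGB image) can be sketched as follows. This is a minimal illustration rather than the toolbox routine used in the study: it assumes an ideal pinhole model with lens distortion (k) ignored, takes F as a pair (fx, fy) in pixels, and the function names `upsample_bilinear` and `project_to_rgb` are hypothetical.

```python
import numpy as np

def upsample_bilinear(depth, out_h, out_w):
    """Resample a 2D depth map to (out_h, out_w) by bilinear interpolation."""
    in_h, in_w = depth.shape
    ys = np.linspace(0, in_h - 1, out_h)   # fractional source rows
    xs = np.linspace(0, in_w - 1, out_w)   # fractional source columns
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]                # vertical blend weights
    wx = (xs - x0)[None, :]                # horizontal blend weights
    top = depth[np.ix_(y0, x0)] * (1 - wx) + depth[np.ix_(y0, x1)] * wx
    bot = depth[np.ix_(y1, x0)] * (1 - wx) + depth[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

def project_to_rgb(points_3d, R, T, F, C):
    """Map (N, 3) points in the 3D-camera frame to (N, 2) RGB pixel coordinates.

    Assumes a distortion-free pinhole model: p = R X + T, then (u, v) = F * (x/z, y/z) + C.
    """
    p = np.asarray(points_3d, dtype=float) @ np.asarray(R).T + np.asarray(T)
    normalized = p[:, :2] / p[:, 2:3]
    return normalized * np.asarray(F) + np.asarray(C)
```

In practice the distortion terms returned by the calibration would be applied before the final scaling by F; they are omitted here only to keep the sketch short.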

Branch Detection and Reconstruction
Detection of cherry tree branches was carried out in color images acquired in the test orchards using the methods developed in [15,16] (Figure 3). The cherry tree branches were covered by dense foliage, which limited the visibility of branches in tree canopies. A branch detection method was implemented to estimate the location of branches in canopies using partially visible branch sections. The intermittently visible branch sections were segmented from the images to evaluate their morphological features including orientation, length and thickness. Using these features as clues, the branch sections belonging to a branch were identified, and a model equation was fitted to reconstruct the branch. A detailed explanation of this method can be found in [15]. This method detected most of the branches when a few sections were exposed to the camera. However, some branches were entirely or almost entirely occluded by leaves and/or cherry clusters. To detect such branches, a cherry location-based branch detection method [16] was implemented. In this method, cherry regions were segmented from images, and a series of cherry clusters growing close to each other in a particular direction were identified as potential fruit growing on a specific branch. Model equations were then fitted to represent the branches occluded by those cherry clusters. A detailed description of this method can be found in [16]. The final output of the branch detection methods was a set of mathematical equations representing all branches in the images.
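To illustrate what "fitting a model equation" to intermittently visible branch sections can look like, the sketch below pools the pixels of the segments assigned to one branch and fits a low-order polynomial giving the column as a function of the row. The polynomial form, its degree, and the name `fit_branch_equation` are assumptions made for illustration; the model actually used is described in [15,16].

```python
import numpy as np

def fit_branch_equation(segment_pixels, degree=1):
    """Fit col = f(row) through pixels pooled from the visible segments of one branch.

    segment_pixels: iterable of (row, col) pairs.
    Returns a callable np.poly1d (an assumed model form, not the source's).
    """
    pts = np.asarray(segment_pixels, dtype=float)
    rows, cols = pts[:, 0], pts[:, 1]
    # Rows are the independent variable because the branches are near vertical.
    return np.poly1d(np.polyfit(rows, cols, degree))
```

The fitted callable can then be evaluated at rows where the branch is occluded, which is how the later steps obtain a branch trajectory through hidden regions.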

Determining Shaking Locations in Tree Branches
The shaking position(s) in tree branches affect the fruit removal efficiency during shake-and-catch harvesting. Previous studies on optimizing shaking frequency, duration and position [18,20] provided guidance for estimating initial shaking locations. Studies have shown that multiple shaking locations would be essential for maximum fruit removal [18,20]. Based on the fruit distribution and dynamics of tree branches, shaking in both the upper canopy region and the lower canopy region has the potential to yield maximum fruit removal efficiency. Therefore, the initial localization of the shaking positions in this study was carried out in three specific canopy regions (referred to as primary shaking positions). If any cherries were left on the tree after shaking at those initial locations, new shaking locations were determined based on the location of the remaining cherries (referred to as secondary shaking positions). Details of the localization of the shaking positions are provided in the following sub-sections.

Determining Primary Shaking Positions
The inputs for locating shaking positions were branch and cherry regions obtained after image segmentation, and branch equations provided by the branch detection methods. First, the canopy region captured by an image was divided vertically into three zones: zone 1 (top), zone 2 (middle), and zone 3 (bottom) (Figure 4). For a given branch, it was determined whether there were any cherry clusters assigned to the branch section in each of the three canopy zones. A given zone was considered for shaking only if cherries were present in that zone. Figure 4 shows an example image, in which Branch 1 has cherries in Zone 1 only. In such a case, there was no need to find a shaking position in Zone 2 or Zone 3. Similarly, Branch 2 has cherries in Zone 3 only, requiring shaking at Zone 3 only. Branch 3 has no cherries, or only a negligible number, and therefore shaking would not be required in any zone.
Once cherries had been located in the three shaking zones, a shaking position was determined in each zone containing cherries to harvest them. To reduce the possible damage to cherries, it was desirable to avoid direct contact of the shaker with cherries. Therefore, first priority was given to visible sections of the branch within the shaking zone on which the desired cherry clusters were located. However, the branch section may or may not be visible in every situation due to the possibility of occlusion by fruit or foliage. The decision-making process for the different scenarios that may occur (Figure 5) is described in the following paragraphs.
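The zone test described above can be sketched as a simple lookup from cherry cluster positions to zone numbers. The sketch assumes three zones of equal height and uses each cluster's centroid row to assign it to a zone; neither detail is stated in the text, and the name `zones_with_cherries` is hypothetical.

```python
import numpy as np

def zones_with_cherries(cherry_rows, image_height, n_zones=3):
    """Return the sorted zone numbers (1 = top) that contain a cherry cluster.

    cherry_rows: centroid rows of the clusters assigned to one branch.
    Zones are assumed to be equal-height horizontal bands of the image.
    """
    rows = np.asarray(cherry_rows, dtype=float)
    zone_height = image_height / n_zones
    zones = np.clip((rows // zone_height).astype(int) + 1, 1, n_zones)
    return sorted(set(zones.tolist()))
```

For a 960-pixel-high image, a branch whose clusters sit at rows 700 and 900 yields only zone 3, so no shaking position would be searched for in zones 1 and 2, which mirrors the Branch 2 example in Figure 4.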

Scenario 1: Branch Section is Visible and Satisfies Branch Equation
In this case, cherry clusters were detected in the given shaking zone and a part of the branch segment to which the cherries belong was visible (white region in Figure 6b). The dotted line (Figure 6b) is the trajectory of the branch represented by the branch equation, which was determined through the branch segment detection and branch reconstruction method described above. As the branch equation passes through some part of the segmented branch region, some pixel coordinates of the overlapping regions will satisfy the branch equation. After the coordinates of the branch region satisfying the branch equation were identified, the median location (which divides the coordinates into two halves, one on each side of the median location) was selected as the shaking position for the corresponding branch in the given canopy zone.
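A minimal sketch of the median selection for this scenario is given below. It walks the branch equation row by row inside the zone, keeps the pixels that fall on the segmented branch mask, and returns the middle one. The row-wise sampling, the `branch_eq` callable (row to column), and the name `scenario1_position` are illustrative assumptions.

```python
import numpy as np

def scenario1_position(branch_mask, branch_eq, row_range):
    """Median pixel among branch-mask pixels that lie on the branch equation.

    branch_mask: 2D boolean array of segmented branch pixels.
    branch_eq:   callable mapping a row index to a column (assumed form).
    row_range:   (first_row, last_row_exclusive) of the canopy zone.
    Returns (row, col), or None if the curve never crosses the mask.
    """
    first, last = row_range
    hits = []
    for r in range(first, last):
        c = int(round(float(branch_eq(r))))
        if 0 <= c < branch_mask.shape[1] and branch_mask[r, c]:
            hits.append((r, c))
    if not hits:
        return None
    # hits are ordered along the branch, so the middle element is the median location
    return hits[len(hits) // 2]
```

A `None` result signals that the branch equation does not pass through any visible branch pixel in the zone, in which case one of the other scenarios applies.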

Scenario 2: Branch Section is Visible but Does Not Satisfy Branch Equation
In this scenario, branch segments were visible in the desired zone along with cherries; however, the branch equation did not pass through the visible branch section. In such cases, the branch section nearest to the branch equation was identified. Then the centroid of that branch region was selected as the desired shaking position for the given zone.
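For the case in which branch sections are visible but the branch equation misses them, the nearest-region rule can be sketched as follows. The sketch assumes the visible branch regions have already been separated (for example by connected-component labelling) and measures closeness as the horizontal pixel offset from the curve, which is a simplification of a true point-to-curve distance; `scenario2_position` is a hypothetical name.

```python
import numpy as np

def scenario2_position(branch_regions, branch_eq):
    """Centroid of the visible branch region nearest to the branch equation.

    branch_regions: list of (N, 2) arrays of (row, col) pixels, one per region.
    branch_eq:      callable mapping an array of rows to columns (e.g. np.poly1d).
    Returns (row, col) as floats, or None if no region is given.
    """
    best, best_dist = None, np.inf
    for region in branch_regions:
        px = np.asarray(region, dtype=float)
        # Smallest horizontal offset between this region and the curve (approximation).
        dist = np.min(np.abs(px[:, 1] - branch_eq(px[:, 0])))
        if dist < best_dist:
            best, best_dist = px, dist
    if best is None:
        return None
    return tuple(best.mean(axis=0))
```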

Scenario 3: Branch Sections Not Visible
The visibility of branch sections is not always guaranteed because of the presence of dense foliage and clusters of cherries. Figure 7 depicts one example with a completely occluded branch section. The branch equation, in this case, passed through the cluster of cherries, which occluded the branch section. In such cases, the shaking position was selected allowing a 2-3 cm (15 pixel) buffer zone for engaging the shaking mechanism below the largest cherry cluster in the zone.
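For the fully occluded case (Scenario 3), the buffer rule can be sketched as shown below: the largest cherry cluster in the zone is found, and the shaking point is placed on the branch equation 15 pixels below the lowest pixel of that cluster. Measuring cluster size by pixel count and taking "below" as increasing image row are assumptions, as is the name `scenario3_position`.

```python
import numpy as np

BUFFER_PX = 15  # corresponds to roughly 2-3 cm in the source images

def scenario3_position(cherry_clusters, branch_eq, buffer_px=BUFFER_PX):
    """Point on the branch curve buffer_px below the largest cherry cluster.

    cherry_clusters: list of pixel lists [(row, col), ...], one per cluster in the zone.
    branch_eq:       callable mapping a row index to a column (assumed form).
    Returns (row, col), or None if the zone has no clusters.
    Note: the result is not checked against the zone or image bounds.
    """
    if not cherry_clusters:
        return None
    largest = max(cherry_clusters, key=len)        # size taken as pixel count
    lowest_row = int(np.asarray(largest)[:, 0].max())
    row = lowest_row + buffer_px                   # image rows grow downward
    col = int(round(float(branch_eq(row))))
    return row, col
```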

Fruit Harvesting using Primary Shaking Positions
As discussed earlier, shaking positions for cherry harvesting were located on tree branches at different canopy heights. Tree canopies were divided into three zones, and a shaking location was estimated for each zone containing cherry clusters. The next step was a harvesting test in which tree branches were shaken with a hand-held shaker (Figure 8) at the locations identified by the algorithm. The shaker was operated at a frequency of 14 Hz-18 Hz. Intermittent shaking of branches was continued up to four times, with approximately 5 s of excitation each time. This shaking frequency and harvesting pattern were based on previous studies on mechanical cherry harvesting [18], which found that such a method led to the most efficient fruit removal. At most, three shaking locations were predicted per branch. However, previous studies [20] had indicated that shaking at the top and bottom zones would result in maximum fruit removal. Therefore, the top and bottom shaking locations were used first, and the mid zone was shaken only if cherries remained in that zone after harvesting at the top and bottom zones. The harvest test was performed by shaking tree branches in the following sequence: (i) harvesting at the shaking location in zone 1 (top zone); (ii) harvesting at the shaking location in zone 3 (bottom zone); and (iii) harvesting at the shaking location in zone 2 (mid zone), only if cherries remained after the previous shakings. This sequence was used to evaluate the additional fruit removal efficiency (percentage) achieved by shaking at the mid zone, which helps to determine the optimum number of shaking positions.
When tree branches are shaken at a location in the bottom area of the canopy, there is a high possibility that cherries in the higher areas of the canopy will also fall. If the cherry catching surface is at the bottom of the canopy, this fruit will have a greater drop height, increasing the chance of fruit damage (e.g., bruising). It is, therefore, essential to harvest the top area of the canopy first, with the catching surface close to the shaking location, to minimize drop height and thus potential damage.
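The shaking order described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the function name `run_primary_shaking` and the callbacks `shake` and `cherries_remain` are hypothetical stand-ins for the shaker control and the cherry detection step.

```python
from typing import Callable, Dict, List, Tuple

Position = Tuple[int, int]  # hypothetical (row, col) pixel coordinate

TOP_ZONE, MID_ZONE, BOTTOM_ZONE = 1, 2, 3  # zone numbering used in the text


def run_primary_shaking(
    positions: Dict[int, Position],
    shake: Callable[[Position], None],
    cherries_remain: Callable[[int], bool],
) -> List[int]:
    """Shake one branch in the order top, bottom, then mid (if needed).

    positions maps a zone number to its shaking position; zones without
    cherry clusters are simply absent. Returns the zones that were shaken.
    """
    shaken = []
    # Top zone first, then bottom zone.
    for zone in (TOP_ZONE, BOTTOM_ZONE):
        if zone in positions:
            shake(positions[zone])
            shaken.append(zone)
    # The mid zone is shaken only if cherries are still present there.
    if MID_ZONE in positions and cherries_remain(MID_ZONE):
        shake(positions[MID_ZONE])
        shaken.append(MID_ZONE)
    return shaken
```

For a branch with positions in all three zones, the call returns `[1, 3, 2]` when fruit remains in the mid zone after the first two shakings and `[1, 3]` otherwise.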
In ideal cases, up to three primary shaking positions would be sufficient to harvest all cherries on a tree branch. In some cases, however, energy transfer from the shaker to the branch could be inefficient because of undesirable branch properties, such as long and/or thin branches. Such limitations may result in cherries not being harvested completely by shaking at the primary shaking positions. In addition, some branches could be missed because of occlusions during automated branch detection in the images collected before harvesting started. In either case, secondary shaking locations have to be determined to harvest the remaining cherries. Hence, after shaking at the primary shaking positions was completed, a second image of the tree canopy was taken and branch and cherry detection was carried out again. If more cherries were detected in the canopy, new shaking positions (secondary positions) were determined based on the locations of the detected cherries. The method is described in the following section.

Harvesting with Secondary Shaking Positions
It was assumed that the remaining cherries on tree branches did not detach during the primary round of shaking because of ineffective energy transfer. Therefore, to maximize the energy transfer to the remaining cherries, secondary shaking positions were determined such that shaking would occur close to the remaining cherry clusters. The input to this method was again the segmented images containing branch and cherry regions, together with the equations of the detected branches. In this case, however, the whole field of view was used instead of dividing the image into zones. First, the intersection between the trajectory defined by the branch equation and the cherry regions was determined (Figure 9). The distance from each pixel coordinate of the branch path to the nearest cherry region was then estimated. This distance is shown as bars in the cherry distance profile (Figure 9c) along the branch path. For all branch coordinates that overlap a cherry region, the distance to the nearest cherry region is taken to be zero. In Figure 9b, point A lies within a cherry region, so its distance to the nearest cherry cluster is zero, as shown in the cherry distance profile (Figure 9c). To avoid direct impact on the cherries while shaking, the shaking position should be picked outside the cherry region. On the other hand, it should be close to the cherry cluster to transfer maximum energy to the cherries being harvested. Therefore, allowing a buffer zone of 2-3 cm (15 pixels), all coordinates that were 15 pixels away from the nearest cherry region were selected as potential shaking positions.
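The cherry distance profile and the buffer-based candidate selection described for the secondary shaking positions can be illustrated with a small, pure-Python sketch. The function names, the representation of the branch path and cherry region as pixel coordinates, and the tolerance `tol` around the 15-pixel buffer are assumptions made for illustration; the text does not specify how the distance was computed or how exactly the 15-pixel criterion was matched.

```python
import math

BUFFER_PX = 15  # buffer distance quoted in the text (about 2-3 cm)


def cherry_distance_profile(branch_path, cherry_pixels):
    """Distance in pixels from each branch-path pixel to the nearest cherry pixel.

    branch_path: list of (row, col) pixels along the fitted branch curve.
    cherry_pixels: set of (row, col) pixels labelled as cherry.
    Branch pixels that overlap the cherry region get a distance of zero.
    """
    if not cherry_pixels:
        return [math.inf] * len(branch_path)
    profile = []
    for r, c in branch_path:
        if (r, c) in cherry_pixels:
            profile.append(0.0)
        else:
            # Brute-force nearest-neighbour search (assumed, for clarity).
            profile.append(
                min(math.hypot(r - cr, c - cc) for cr, cc in cherry_pixels)
            )
    return profile


def candidate_positions(branch_path, profile, buffer_px=BUFFER_PX, tol=0.5):
    """Branch pixels whose nearest-cherry distance is within tol of buffer_px."""
    return [p for p, d in zip(branch_path, profile) if abs(d - buffer_px) <= tol]
```

For full-size images a distance transform would be far faster than the brute-force search used here; the sketch favors readability over speed.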

In Figure 10, potential shaking positions are shown as P1, P2, P3 and P4, each at a distance of 15 pixels (~3 cm) from the nearest cherry region. For each of these positions, the impact zone was defined as a square area of approximately 1 m² (500 × 500 pixels) centered on the shaking position. The area of the cherry region inside each impact zone was evaluated. The shaking position with the maximum cherry area inside its impact zone was picked as the first shaking position. If more cherries were left on the tree, the position whose impact zone contained the largest area of remaining cherries was picked next. This process was continued until all cherry regions were under an impact zone. After determining secondary shaking locations, further harvesting tests were conducted. The weight of cherries removed in each shaking was recorded.
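The impact-zone selection reads like a greedy coverage procedure, which can be sketched as follows. This is an illustrative reading under stated assumptions rather than the study's code: cherry area is counted as the number of cherry pixels inside the zone, covered cherries are treated as removed before the next position is chosen, and the names `in_zone` and `select_secondary_positions` are hypothetical. Both pixel scales quoted in the text (15 pixels for about 3 cm and 500 pixels for about 1 m) correspond to roughly 0.2 cm per pixel.

```python
IMPACT_HALF_PX = 250  # half the side of the 500 x 500 pixel impact zone


def in_zone(pixel, center, half=IMPACT_HALF_PX):
    """True if pixel lies inside the square impact zone centered on center."""
    return abs(pixel[0] - center[0]) <= half and abs(pixel[1] - center[1]) <= half


def select_secondary_positions(candidates, cherry_pixels, half=IMPACT_HALF_PX):
    """Greedily pick shaking positions until every cherry pixel is covered.

    candidates: list of (row, col) potential shaking positions.
    cherry_pixels: iterable of (row, col) pixels labelled as cherry.
    Returns the chosen positions in the order they would be shaken.
    """
    remaining = set(cherry_pixels)
    pool = list(candidates)
    chosen = []
    while remaining and pool:
        # Candidate whose impact zone holds the most remaining cherry pixels.
        best = max(pool, key=lambda c: sum(in_zone(p, c, half) for p in remaining))
        covered = {p for p in remaining if in_zone(p, best, half)}
        if not covered:
            break  # no candidate reaches the cherries that are left
        chosen.append(best)
        pool.remove(best)
        remaining -= covered
    return chosen
```

In the field, the remaining cherries would be re-detected from a new image after each shaking rather than inferred from the impact zone, so the sketch only mirrors the selection logic.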

Mapping 3D Depth Information onto RGB Images
The distance to shaking positions estimated using depth information mapped onto RGB images was compared with the reference distance measured using a laser unit (DLR 130K, Bosch, Stuttgart, Germany). A total of 138 shaking positions were used for evaluating the performance of the distance estimation method. As discussed in the methods section, the shaking positions were located on visible branch sections when possible. However, branch sections were not always visible, and the shaking position could be located on occluded regions as well. In such cases, distances to the shaking positions were estimated by linearly interpolating the depth information available for visible branch or cherry regions around the occluded shaking position. Overall, 32% of shaking positions were located on visible branch regions. The mean absolute error for distance estimation over the shaking positions within visible branch regions was 3.4 cm with a standard deviation of 3.4 cm (Table 1). The root mean square error for such cases was 4.8 cm. The distance to occluded branch regions was estimated with a mean absolute error of 5.2 cm, a standard deviation of 4.8 cm, and a root mean square error of 7.1 cm. For occluded branches, the reference measurement was taken after removing the occluding leaves by hand. As expected, the estimation error was higher for occluded branch regions because depth information was not directly available for those regions. Overall, the root mean square error for distance estimation was 6.4 cm. Figure 11 depicts errors in estimating the distance to all branch sections, with a solid line representing the mean error and dashed lines representing the upper and lower deviation lines.
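The error statistics reported here can be computed from paired estimated and reference distances as in the sketch below. The function names are hypothetical, and taking the standard deviation over the absolute errors is an assumption, since the text does not say which errors it was computed from. The second function pools the RMSE of the visible and occluded groups as a consistency check on the reported values.

```python
import math
import statistics


def error_summary(estimated, reference):
    """Mean error, mean absolute error, its standard deviation, and RMSE."""
    errors = [e - r for e, r in zip(estimated, reference)]
    abs_errors = [abs(x) for x in errors]
    return {
        "mean_error": statistics.fmean(errors),
        "mean_absolute_error": statistics.fmean(abs_errors),
        # Assumption: SD is taken over the absolute errors.
        "sd_absolute_error": statistics.stdev(abs_errors),
        "rmse": math.sqrt(statistics.fmean([x * x for x in errors])),
    }


def pooled_rmse(rmse_a, fraction_a, rmse_b):
    """RMSE of two groups combined, where group a holds fraction_a of the samples."""
    return math.sqrt(fraction_a * rmse_a**2 + (1.0 - fraction_a) * rmse_b**2)


# Visible regions: RMSE 4.8 cm on 32% of positions; occluded regions: 7.1 cm.
overall_cm = pooled_rmse(4.8, 0.32, 7.1)
```

With the rounded values from the text, `overall_cm` comes out at about 6.45 cm, which agrees with the reported overall RMSE of 6.4 cm to within the rounding of the inputs.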

Figure 11. Error in estimating distance to branch sections through mapping of 3D information onto RGB images. The solid line represents the mean whereas the shaded band represents the standard deviation region. Reference measurements were taken with a laser distance measure.
Table 1. Error in estimating distance to shaking positions from the 3D camera. Distance was estimated by mapping 3D depth information onto color images. Columns: Observations, Mean Error (m), Mean Absolute Error (m), RMSE (m).
It is desirable to minimize the distance error so that the shaking mechanism can be guided precisely to the desired shaking position. Further improvement in distance estimation accuracy may be possible by improving the stereo camera calibration process. The low resolution of the 3D camera is another factor that could affect the calibration accuracy.
Along with the accuracy of distance estimation, efficient harvesting also depends on the mechanical design of the shaker. Some level of error in distance measurement can be compensated for by designing shakers with a degree of tolerance. For example, the shaker could be built with a wider contact surface or as a V-shaped hook with a wider opening. The shakers used in previous studies have a shaking head around 30 cm (1 ft) wide, which allows proper contact with tree branches even if there is some offset in the positioning of the shaker. With such shaking head designs, the error of 6.4 cm (RMSE) could still be acceptable for successful cherry harvesting. Force sensors and limit switches can also be installed in the shaker to detect when it comes into contact with the branch.

Fruit Removal Efficiency
The decision-making algorithm for determining shaking positions on cherry tree branches generated multiple shaking positions on each branch based on the distribution of cherries on individual branches. The number of shaking positions required per branch varied according to the tree training system and the fruit load on the target branches. The progression of fruit removal efficiency with an increasing number of shaking positions was analyzed for the two training systems investigated. Shaking at the first position always yielded the largest fruit removal efficiency (percentage), which was 65.1% in the Y-trellis canopy system (Figure 12a) and 67.7% in the vertical trellis system (Figure 13a). Shaking at the second position on a branch also removed a significant amount of fruit, bringing the cumulative fruit removal efficiency to 85.5% and 83.3% for the Y-trellis system (Figure 12b) and vertical trellis system (Figure 13b), respectively. Shaking at the third position removed only 6.5% of the fruit in the Y-trellis system and 3.0% in the vertical trellis system, bringing the overall fruit removal to 92.3% and 86.4%, respectively.
As discussed in the methods section, the cherry location-based shaking decision was made if cherries were left on the tree after the third shaking. In the Y-trellis system, the fourth and fifth shakings yielded 0.54% and 0.02% of the fruit, bringing the maximum fruit removal efficiency to 92.9% (Figure 12). In the vertical trellis system, the fourth shaking yielded 0.2% fruit removal, bringing the maximum fruit removal to 86.6% (Figure 13). Shaking was not continued beyond the fourth shaking for the vertical trellis system, either because no more cherries were detected or because no fruit were removed by the fourth shaking. For most branches, there was no significant improvement in overall fruit removal beyond three shaking events per branch. The Y-trellis system also had a higher fruit removal efficiency (92.9%) than the vertical trellis system (86.6%); the lower efficiency in the vertical trellis system was primarily caused by undetected branches and cherries. Tree canopies in the vertical trellis block were not as well trained and pruned to maintain a two-dimensional structure as those in the Y-trellis block, which might be the primary reason for the lower harvesting efficiency. The results and field observations indicated that maintaining a good two-dimensional fruiting wall structure and proper spacing between branches may help improve overall harvesting efficiency.
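Since the weight of cherries removed in each shaking was recorded, the cumulative fruit removal efficiency discussed above can be derived as in the following sketch. It assumes that efficiency is the cumulative removed weight divided by the total fruit weight on the branch, with the total obtained by also weighing the fruit left after the last shaking; that definition, the function name, and the example weights are illustrative and not taken from the study.

```python
def cumulative_removal_efficiency(removed_kg, remaining_kg):
    """Cumulative fruit removal efficiency (%) after each shaking event.

    removed_kg: weights removed by successive shaking events.
    remaining_kg: weight of fruit still on the branch after the last event
                  (assumed to be hand-picked and weighed).
    """
    total = sum(removed_kg) + remaining_kg
    if total <= 0:
        raise ValueError("total fruit weight must be positive")
    efficiencies = []
    running = 0.0
    for weight in removed_kg:
        running += weight
        efficiencies.append(100.0 * running / total)
    return efficiencies


# Hypothetical branch: three shakings, 0.8 kg left on the branch afterwards.
example = cumulative_removal_efficiency([6.5, 2.0, 0.7], 0.8)
```

For these made-up weights the cumulative efficiency rises from 65% to 85% and then to 92%, the same diminishing-returns shape as the reported curves.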
Results also showed that 29.1% of branches in the Y-trellis architecture required only one shaking to remove all removable cherries from the branch. For the vertical trellis system, 47.4% of branches were harvested completely with a single shaking per branch (Figure 14). The efficiency of a shaking event in detaching fruit was observed to be affected by branch diameter. Branches in the vertical trellis system had a larger diameter than those in the Y-trellis system, which potentially facilitated the transfer of energy along the branch and led to a larger fruit removal percentage per shaking. In the Y-trellis architecture, 39.8% of branches required two shaking events, whereas 31.5% of branches required three or more shaking events for maximum fruit removal. For the vertical trellis system, 36.8% of branches required two shaking events and 15.8% of branches required three or more (Figure 14). These results show that the number of shaking events per branch should be decided dynamically, depending on the load and distribution of cherries present in the canopy at the beginning and after each shaking event.
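A dynamic decision on the number of shaking events could follow a loop like the one sketched below. The stopping rules mirror those reported for the field tests (no cherries detected, or no fruit removed by the last shaking), and the cap of five events matches the maximum used in this study; the function and callback names are hypothetical.

```python
def shake_until_done(detect_cherries, next_position, shake, max_events=5):
    """Shake a branch until no cherries are detected or a shake removes nothing.

    detect_cherries(): returns the cherry regions currently detected.
    next_position(cherries): returns the shaking position for those regions.
    shake(position): performs one shaking event, returns weight removed.
    Returns the number of shaking events performed.
    """
    events = 0
    while events < max_events:
        cherries = detect_cherries()
        if not cherries:
            break  # nothing left to harvest
        removed = shake(next_position(cherries))
        events += 1
        if removed <= 0:
            break  # further shaking is unlikely to help
    return events
```

A branch that sheds all of its fruit on the first shaking stops after one event, whereas a branch whose fruit does not respond also stops early instead of being shaken the maximum number of times.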
Previous research studies in mechanical cherry harvesting have generated a lot of knowledge on effective shaking methods for harvesting cherries. This research focused on developing a method for automatically determining shaking positions on tree branches based on canopy and cherry locations. To develop an automated harvester, this information on the location of shaking positions will be essential for guiding the shaking mechanism to target branches accurately and efficiently. The results of the harvesting tests also indicated that multiple shakings would be required on each branch for maximum fruit removal. With machine vision guided shakers, the automated harvester could be equipped with multiple independent shakers at different canopy positions, with a separate catching surface for each shaker. Such systems have the potential to improve the efficiency and speed of the harvester as well as the quality of harvested fruit by reducing the drop height.

Conclusions
For automated sweet cherry harvesting with a shake-and-catch system, the machine is required to decide on the number and location of shaking positions and to estimate their 3D location in the canopy. This research focused on developing a method for determining shaking positions on cherry tree branches detected by a machine vision system. The localization of shaking positions on tree branches included: (i) determination of the shaking position on each branch to harvest cherries; and (ii) estimation of the distance to the shaking position from the camera location by mapping depth information onto the color images. The root mean square error (RMSE) in estimating the distance to shaking positions, relative to reference measurements from a laser distance measure, was 6.4 cm. Distance estimation was more accurate (RMSE of 4.8 cm) when the shaking position was selected on visible branch regions than when it fell in occluded regions of the canopy, where distance was estimated using a linear interpolation method.
The mechanical shaking of tree branches at the shaking positions determined by the algorithm was carried out in Y-trellis and vertical trellis canopies. The first shaking event removed the largest amount of fruit from tree branches regardless of the tree architecture. The maximum fruit removal achieved with shaking at multiple positions was 92.9% for the Y-trellis system, which required as many as five shaking positions per branch in some cases. For the vertical trellis system, the maximum fruit removal efficiency was 86.6%, which took up to four shaking positions per branch. However, in most cases three shaking positions per branch were enough to harvest most of the cherries that could be removed by branch shaking. A single shaking event was sufficient for a larger share of branches in the vertical trellis system (47.4%) than in the Y-trellis system (29.1%). The results indicated that the relatively larger branch diameter in the vertical trellis system might increase the effectiveness of energy transfer along the branch and therefore lead to more efficient fruit removal. Overall, fruit removal in the vertical trellis system was lower than in the Y-trellis system because of undetected branches and cherries. Maintaining a good two-dimensional fruiting wall structure and spacing between branches may help to improve branch and cherry detection accuracy and the overall harvest efficiency.
