Apple Shape Detection Based on Geometric and Radiometric Features Using a LiDAR Laser Scanner

: Yield monitoring systems in fruit production mostly rely on color features, making the discrimination of fruits challenging due to varying light conditions. The implementation of geometric and radiometric features in three-dimensional space (3D) analysis can alleviate such di ﬃ culties improving the fruit detection. In this study, a light detection and range (LiDAR) system was used to scan apple trees before (T L ) and after defoliation (T D ) four times during seasonal tree growth. An apple detection method based on calibrated apparent backscattered reﬂectance intensity (R ToF ) and geometric features, capturing linearity (L) and curvature (C) derived from the LiDAR 3D point cloud, is proposed. The iterative discretion of apple class from leaves and woody parts was obtained at R ToF > 76.1%, L < 15.5%, and C > 73.2%. The position of fruit centers in T L and in T D was compared, showing a root mean square error (RMSE) of 5.7%. The diameter of apples estimated from the foliated trees was related to the reference values based on the perimeter of the fruits, revealing an adjusted coe ﬃ cient of determination (R 2adj ) of 0.95 and RMSE of 9.5% at DAFB 120 . When comparing the results obtained on foliated and defoliated tree’s data, the estimated number of fruit’s on foliated trees at DAFB 42 , DAFB 70 , DAFB 104 , and DAFB 120 88.6%, 85.4%, 88.5%, and 94.8% of the ground truth values, respectively. The algorithm resulted in maximum values of 88.2% precision, 91.0% recall, and 89.5 F1 score at DAFB 120 . The results point to the high capacity of LiDAR variables [R ToF , C, L] to localize fruit and estimate its size by means of remote sensing.


Introduction
In the quest of decreasing farming costs and increasing sustainability, leaf area and yield monitoring are considered as one of the most important steps when implementing precision agriculture technologies in orchards [1,2]. Monitoring of leaf area index and yield has been widely studied and applied in cereal crops such as rice [3], sorghum [4], wheat [5], maize [6]. Similarly, in orchards the monitoring of tree morphology and growth is requested, since the variation in the leaf area and resulting light interception influences the fruit-bearing capacity of the tree. Furthermore, the shading of fruit affects the fruit color, which is an important parameter of fruit quality [7]. However, the coinciding structures and the varying reflectance properties of the tree organs work as suspending factor for yield mapping systems, particularly making the automated fruit detection challenging.
The aim of this study is to propose a methodology for segmenting, localizing, and analyzing fruit on the tree, based on the assumption that apples show enhanced R ToF at 905 nm compared to foliage and woody parts. The main objectives are: (1) the classification of 3D points of apples based on reflectance and geometric features derived from LiDAR laser scanner; (2) the development of a fruit segmentation algorithm that allows measuring the fruit size in defoliated trees; and (3) the evaluation of the proposed technique on foliated apple trees at different growth stages.

Site Description
The study was carried out in the experimental station Field Lab for Digital Agriculture by Leibniz Institute of Agricultural Engineering and Bioeconomy (ATB) located in Marquardt, Germany (Latitude: 52.466 • N, Longitude: 12.958 • E) during the growing season 2019. The trees of Malus × domestica Borkh. 'Gala' on M9 rootstock were planted in a 0.37 ha orchard, 0.95 m within-row distance between trees, which have been trained as tall slender spindle (central leader) formation with 3.3 m maximum tree height. The trees were supported by utilizing four horizontally parallel wires (W 1 , W 2 , W 3, W 4 ) with 1 m equal distance from each other. Plants were drip irrigated, fertilized, and treated according to management in commercial orchards and the regional plant protection plan. Hand thinning resulted in 1-3 fruits per inflorescence.

Data Acquisition and Pre-Processing
A metal frame installed on a tractor was used to carry the sensors along the tree rows [37]. A mobile terrestrial LiDAR laser scanner (LMS-511, Sick AG, Waldkirch, Germany) was mounted vertically on the metal frame at 1.6 m above the ground level.
The LiDAR sensor was configured with a 0.1667 • angular resolution, 25 Hz scanning frequency and a scanning angle of 190 • . The sensor frame was driven along the rows on both sides of the trees with an average speed of 0.13 m s −1 . The measured objects, hit by the laser beam, were assumed as perfectly diffuse reflectors (Lambertian), excluding the influence of incidence angle [38,39]. Thus, the apparent signal was considered as an approximation of hemispherical reflectance. Board targets, coated with white barium sulphate (BaSO 4 , CAS Number: 7727-43-7, Merck, Germany) for maximum (R max ) and blackened urethane (S black, Avian Technologies, New London, NH, USA) for minimum (R min ) referencing were applied to calibrate the backscattered intensity (R ToF ) of the LiDAR, obtaining the R ToF [%] at 905 nm for each point in the 3D point cloud (Equation (1)).
The tractor movement was recorded by an inertial measurement unit (IMU) (MTi-G-710, XSENS, Enschede, the Netherlands), allowing the correction of errors due to the uneven ground surface. The root mean square error (RMSE) of orientation noted at 0.25 • for roll, pitch and yaw [40]. Furthermore, a RTK-GNSS (AgGPS 542, Trimble, Sunnyvale, CA, USA) was used for georeferencing each individual point of the 3D point cloud. The horizontal and vertical accuracy of the RTK-GNSS was ± 25 mm + 2 ppm and ± 37 mm + 2 ppm, respectively. The IMU was place 0.24 cm aside from the LiDAR sensor, while the receiver antenna of RTK-GNSS was mounted 0.30 cm above the laser scanner ( Figure 1). A multi-thread software was developed in Visual Studio (version 16.1, Microsoft, Redmond, WA, USA) to acquire the data [37].
The 3D point cloud data was processed in the Computer Vision Toolbox™ of Matlab (2017b, Mathworks, Natick, MA, USA). A sparse outlier removal was applied to each point cloud pair to reject the points that were above the average distance using CloudCompare (2.10, GPL software, Paris, France). Moreover, the random consensus was performed to filter the points that belonged to the ground [41]. According to Tsoulias et al. [37], the rigid translations and rotations were applied in each point of the point cloud, while the alignment of the pair tree sides was carried out with the iterative closest point algorithm, using a k-dimensional space to speed up the process.
Data were recorded during fruit growth starting when the end of the cell division stage of fruit was reached and the cultivar-typical red blush color [42] appeared and on subsequent three dates, i.e., 42, 72, 104, and 120 days after full bloom (DAFB 42 , DAFB 72 , DAFB 104 , DAFB 120 , respectively). On each measuring date two apple trees (n = 8) were initially scanned with leaves (T L ), while the measurement was repeated with the plants defoliated (T D ). After scanning, the total number of apples was counted on the tree. After harvest, fruit's diameter (D Manual ) was manually measured in the lab. All apples of each tree were categorized according to their growing position on the slender spindle. The wires (W) were used as borders to discriminate the apples between the ground and W 1 (∆W 1-G ); W 1 and W 2 (∆W 2-1 ); W 2 and W 3 (∆W 3-2 ); W 3 and W 4 (∆W 4-3 ).
Remote Sens. 2020, 12, x FOR PEER REVIEW 4 of 19 measurement was repeated with the plants defoliated (TD). After scanning, the total number of apples was counted on the tree. After harvest, fruit's diameter (DManual) was manually measured in the lab. All apples of each tree were categorized according to their growing position on the slender spindle. The wires (W) were used as borders to discriminate the apples between the ground and W1 (ΔW1-G); W1 and W2 (ΔW2-1); W2 and W3 (ΔW3-2); W3 and W4 (ΔW4-3).

Figure 1.
Flowchart showing the protocol for apple detection and sizing using defoliated trees (TD), starting with the threshold application in the registered point cloud pair, through the filtering and partitioning, until the counting of clusters, which were compared to reference data from manual measurements and applied as ground truth when analysing the foliated trees. The cells with shading and dashed frame point out the differences from apple detection in foliated trees.

Extraction of Radiometric and Geometric Features
The information of the 3D local neighbourhood of points P i = [x i , y i , z i ] was determined using the k-nearest neighbours within a radius (r) equal to the mean DManual of apples [43]. The total number of the P i data set (K) was used to estimate the average P ̃= 1 of the nearest neighbours. Thus, the covariance matrix (Cov) was built after mean centering by means of subtraction of P value from each P i of the nearest neighbours set (Equation (2)): The Cov was decomposed based on the singular value decomposition, producing the eigenvalues ( λ 1 , λ 2 , λ 3 ), which were sorted in ascending order λ 1 ≥ λ 2 ≥ λ 3 according to the variance captured, and the corresponding eigenvectors. The eigenvalues, which represent the highest orthogonal variance of the matrix, provided the points dispersion within k-nearest neighbours describing the local spatial structure in 3D. The eigenvalues were scaled between 0 and 100, allowing the comparison of different clusters. Thus, the local geometry of each Pi in the point cloud was analyzed by means of eigenvalues to illustrate the spatial structure considering its linearity (L, Figure 1. Flowchart showing the protocol for apple detection and sizing using defoliated trees (T D ), starting with the threshold application in the registered point cloud pair, through the filtering and partitioning, until the counting of clusters, which were compared to reference data from manual measurements and applied as ground truth when analysing the foliated trees. The cells with shading and dashed frame point out the differences from apple detection in foliated trees.

Extraction of Radiometric and Geometric Features
The information of the 3D local neighbourhood of points P i = [x i , y i , z i ] was determined using the k-nearest neighbours within a radius (r) equal to the mean D Manual of apples [43]. The total number of the P i data set (K) was used to estimate the average P = 1 K K i = 1 (P i ) of the nearest neighbours. Thus, the covariance matrix (Cov) was built after mean centering by means of subtraction of P value from each P i of the nearest neighbours set (Equation (2)): The Cov was decomposed based on the singular value decomposition, producing the eigenvalues (λ 1 , λ 2 , λ 3 ), which were sorted in ascending order λ 1 ≥ λ 2 ≥ λ 3 according to the variance captured, and the corresponding eigenvectors. The eigenvalues, which represent the highest orthogonal variance of the matrix, provided the points dispersion within k-nearest neighbours describing the local spatial structure in 3D. The eigenvalues were scaled between 0 and 100, allowing the comparison of different clusters. Thus, the local geometry of each P i in the point cloud was analyzed by means of eigenvalues to illustrate the spatial structure considering its linearity (L, Equation (3)) and curvature (C, Equation (4)): where L and C describe the variation of linearity and curvatures for all points along the direction of the corresponding eigenvalues, respectively [43,44]. More specifically, the closer the values of L or C to 100, the higher the likelihood for the shape of points to be linear or curved, respectively. Therefore, L was used to segment the foliage and woody parts of the tree (leaves, branches, stem), while C was applied to distinguish the apple points.

Apple Segmentation
The data set of T D at DAFB 120 (n = 2) was used to define the range of R ToF , C and L for the class of apple (R A , C A, L A ) and of woody parts (R W , C W , L W ), while the range of leaf class for the same features (R L , C L, L L ) was defined in the T L . The thresholding was performed for each class by employing the exploratory analysis of normal distribution using the probability density function to define the threshold that will distinguish the 3D points of apples from leaves and woody parts [33,45]. The value with the highest likelihood (mode) within the R A , C A , and L A classes was used as a threshold (R th , C th , and L th ). Points which fulfilled the criteria of L A ≤ L th , C th ≤ C A , and R th ≤ R A were segmented and categorized as apples. Figure 1 represents a flowchart of the protocol for apple detection and sizing using the defoliated trees (T D ).
Subsequently, a density-based scan algorithm (DBSCAN) [46] was applied to find the point sets, using the mean D Manual of apples that was found in each ∆W as a neighborhood search radius (ε) and the value 10 as a minimum number of neighbors. The value of 10 was applied resulting from manually run tests showing that less neighboring points result in random appearance of sets. The mean value of apparent reflectance intensity and of curvature in defoliated trees ( R and C) were calculated to classify the clusters. The maximum distance in x and y axes was considered as the diameter (D D ) of each point set recognized as an apple.
Following, the k-means clustering was applied to count and find the fruit center (M D ) in each cluster in T D . However, the shape variation of each apple and the clustering of more than one apple produced different C and R ToF . Consequently, the utilization of C th and R th values possibly exclude points at the edge of the fruit. Therefore, a sphere was placed in each individual apple cluster, with the coordinates of M D as the center and a radius equal with the estimated D D to include the filtered apple points. The k-means algorithm was reapplied to estimate the M D of each cluster based on the added 3D points, while the R and C were recalculated.
The fact that more than one apple developed per inflorescence led to close appearance of 2-3 fruits in the cloud, resulting in the creation of augmented clusters with more than one apple. Thus, the clusters of 3D points within the sphere were evaluated and partitioned based on their extracted features, creating sub-clusters and defining the number of M D . Therefore, the partitioning was performed for each cluster only on the condition that the three variables, D D , R, and C, were close to mean values of D Manual measured at DAFB 120 for each ∆W and R th , C th . The features of each sub-cluster were iteratively extracted and re-evaluated, while the partitioning was continued until the sub-cluster features stopped, while fulfilling the condition. Subsequently, the spheres were placed based on the M D and the D D of each sub-cluster. Those spheres, showing a calculated distance between the M D centers below the minimum value of D Manual in each ∆W, were merged. Therefore, their M D was redefined and their extracted features were recalculated. It should be mentioned that the M D clusters were considered as ground truth labels for the later evaluation of the method in T L .
Similarly, the L th , C th , R th were applied to the same trees before their defoliation to segment the apples and find their centers (M L ) ( Figure 2). The DBSCAN was used to group the segmented points, while the M L centers were defined by the k-means algorithm. Furthermore, the maximum distance in x and y axes was considered as the diameter (D L ) of each point set. The main partitioning condition remained the same using the generated features from each cluster. Thus, the features of each T L sub-cluster were iteratively extracted and re-evaluated until fulfilling the conditions. The number of partitions was used in the k-means algorithm to determine the centers of M L sub-clusters. The only difference between analysing T D and T L was the use of the sphere in T D to evaluate the eligibility of the cluster or the need to search for subclusters according to the estimated manually measured fruit diameter.
However, in T L , some of the apples were fully or partially covered by the leaves of the tree. In the latter case, the combination of DBSCAN and k-means algorithm overestimated the number of apple classes. Thus, apple classes producing lower or equal Euclidian distance between their M L centers compared to the minimum D Manual were combined and considered as one fruit.
Remote Sens. 2020, 12, x FOR PEER REVIEW 6 of 19 partitions was used in the k-means algorithm to determine the centers of ML sub-clusters. The only difference between analysing TD and TL was the use of the sphere in TD to evaluate the eligibility of the cluster or the need to search for subclusters according to the estimated manually measured fruit diameter. However, in TL, some of the apples were fully or partially covered by the leaves of the tree. In the latter case, the combination of DBSCAN and k-means algorithm overestimated the number of apple classes. Thus, apple classes producing lower or equal Euclidian distance between their ML centers compared to the minimum DManual were combined and considered as one fruit.

Evaluation
The calibration of the method was carried out considering data from TD and TL at DAFB120 (n = 2) to define the Rth, Cth and Lth that describe the apple class. Cross-validation, applying thresholds extracted at DAFB120, was performed to the TD and TL data sets of the three earlier measuring dates (n = 6). The fruit detection methodology was evaluated by calculating the accuracy, the precision, the recall, and the F1 score:

Evaluation
The calibration of the method was carried out considering data from T D and T L at DAFB 120 (n = 2) to define the R th , C th and L th that describe the apple class. Cross-validation, applying thresholds extracted at DAFB 120 , was performed to the T D and T L data sets of the three earlier measuring dates (n = 6). The fruit detection methodology was evaluated by calculating the accuracy, the precision, the recall, and the F1 score: The M D were considered as ground truth labels for the evaluation of M L . The M L clusters which coincide with the M D points >50% were considered as true positive (TP). The true negative (TN) refers to the situation when a cluster was correctly categorized as no fruit. Whereas the M L clusters which matched <50% were counted as false negative (FN). The clusters which reveal a diameter smaller than the minimum value of D Manual that was found in each ∆W were categorized as false positives (FP). N denotes the sum of TP, TN, FP and FN. The values were given as percentage.
A regression analysis was performed between the D Manual and the D D , and RMSE, mean absolute error (MAE), mean bias error (MBE) were calculated for each measuring date. The position of each M L was evaluated with the M D using the Euclidean distance, pairing the centers that revealed the minimum distance between them. In parallel, the difference between the fruit diameter in T D and T L was estimated, considering the D D as ground truth for D L .

Apple Segmentation
The definition of apple class was carried out on the data set of last day's measurements at DAFB 120 . The calibrated reflectance intensity R ToF varied in the 3D point cloud of the foliated trees ( Figure 3b). The R ToF values of apples (R A ) and woody parts (R W ) appeared above 70% and below 55%, respectively. The leaves (R L ) ranged between 40 and 80%. The C values of leaves (C L ) ranged between 35.2% and 70% (Figure 3c). The C values of apples (C A ) were observed above 50%, whereas the points of woody parts (C W ) did not exceed 60%. The L of leaves (L L ) ranged between 20% and 80%, while the woody parts (L W ) were scored above 60% (Figure 3d).
Remote Sens. 2020, 12, x FOR PEER REVIEW 7 of 19 matched <50% were counted as false negative (FN). The clusters which reveal a diameter smaller than the minimum value of DManual that was found in each ΔW were categorized as false positives (FP). N denotes the sum of TP, TN, FP and FN. The values were given as percentage.
A regression analysis was performed between the DManual and the DD, and RMSE, mean absolute error (MAE), mean bias error (MBE) were calculated for each measuring date. The position of each ML was evaluated with the MD using the Euclidean distance, pairing the centers that revealed the minimum distance between them. In parallel, the difference between the fruit diameter in TD and TL was estimated, considering the DD as ground truth for DL.

Apple Segmentation
The definition of apple class was carried out on the data set of last day's measurements at DAFB120. The calibrated reflectance intensity RToF varied in the 3D point cloud of the foliated trees ( Figure 3b). The RToF values of apples (RA) and woody parts (RW) appeared above 70% and below 55%, respectively. The leaves (RL) ranged between 40 and 80%. The C values of leaves (CL) ranged between 35.2% and 70% (Figure 3c). The C values of apples (CA) were observed above 50%, whereas the points of woody parts (CW) did not exceed 60%. The L of leaves (LL) ranged between 20% and 80%, while the woody parts (LW) were scored above 60% (Figure 3d). However, the occlusions of leaves and woody parts did not allow the acquisition of the full range of RToF, C, and L values of apples. Therefore, the trees were defoliated and scanned with the LiDAR system at DAFB120 (Figure 4) and data were used to readjust the thresholds. The RW values ranged between 0 and 60%, while the RA appeared from 42 to 90% (Figure 4b). The CA values of apples ranged between 56 and 98%, and the CW appeared between 0.1 and 55.2% (Figure 4c). The 3D points of the latter category revealed the highest values varying between 48.8 and 98.8%. The L values of apple points (LA) were more discretely ranging between 0.2 and 38.3%. (Figure 4d). However, the highest However, the occlusions of leaves and woody parts did not allow the acquisition of the full range of R ToF , C, and L values of apples. Therefore, the trees were defoliated and scanned with the LiDAR system at DAFB 120 ( Figure 4) and data were used to readjust the thresholds. The R W values ranged between 0 and 60%, while the R A appeared from 42 to 90% (Figure 4b). The C A values of apples ranged between 56 and 98%, and the C W appeared between 0.1 and 55.2% (Figure 4c). The 3D points of the latter category revealed the highest values varying between 48.8 and 98.8%. The L values of apple points (L A ) were more discretely ranging between 0.2 and 38.3%. (Figure 4d). However, the highest mean value was obtained from the R A class at 71.8%, whereas the mode value (76.1%) was utilized as R th for the 'Gala' trees.  As mentioned above, the RToF values of apples could overlap with the leaves or the woody parts, while some of the surfaces may reveal similar C with apples, producing falsely clusters from leaves or woody parts. The RW showed a mean value of 38.2% with standard deviation (SD) of 18.5%. In contrast, the RToF of leaves (RL) and of apples (RA) were marginally deviated (Figure 5a), while most of their values appeared from 50 to 70% and from 65 to 71.8%, respectively. More specifically, the mean value of RL was indicated at 58.8% with 12.1% SD, and 63.2% as the value with the highest likelihood (mode) within the class.   As mentioned above, the R ToF values of apples could overlap with the leaves or the woody parts, while some of the surfaces may reveal similar C with apples, producing falsely clusters from leaves or woody parts. The R W showed a mean value of 38.2% with standard deviation (SD) of 18.5%. In contrast, the R ToF of leaves (R L ) and of apples (R A ) were marginally deviated (Figure 5a), while most of their values appeared from 50 to 70% and from 65 to 71.8%, respectively. More specifically, the mean value of R L was indicated at 58.8% with 12.1% SD, and 63.2% as the value with the highest likelihood (mode) within the class.  As mentioned above, the RToF values of apples could overlap with the leaves or the woody parts, while some of the surfaces may reveal similar C with apples, producing falsely clusters from leaves or woody parts. The RW showed a mean value of 38.2% with standard deviation (SD) of 18.5%. In contrast, the RToF of leaves (RL) and of apples (RA) were marginally deviated (Figure 5a), while most of their values appeared from 50 to 70% and from 65 to 71.8%, respectively. More specifically, the mean value of RL was indicated at 58.8% with 12.1% SD, and 63.2% as the value with the highest likelihood (mode) within the class. A similar pattern of the probability density of R ToF was obtained in C (Figure 5b). The most frequent value within the C A was illustrated at 73.2% (C th ). The values of C of leaves (C L ) were partly coinciding with the C A , revealing a mean value at 60.7%, and 9.1% SD. The C W revealed a mean value of 30.1%, with 19.4% SD. The 3D points of L W depicted the highest values (Figure 5c), reaching a mean value of 60.2%, and 94.1% as mode value ( Figure 6).
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 19 The RA, CA and LA values were distinct from the residual classes, allowing the discrimination and localization of apples in the 3D point cloud. The apples were described with RA and CA values 76.1% and 73.2% ( Figure 6). Therefore, the Rth and Cth were combined and used as thresholds to distinguish the apple points from woody parts. Whereas the remaining points of the latter class were removed using the LW mode value of 15.5% as threshold (Lth) (Figure 6). When performing the DBSCAN algorithm, the clusters of filtered points of apples appeared in the 3D point cloud (Figure 7). Subsequently, the k-means partition method was applied to the points of each cluster to acquire the MD. A sphere of variable radius based on the DD of the cluster, enclosed the points of the cluster when placed at MD. After the localization of MD, the estimated fruit diameter, DD, was compared to the DManual for each ΔW during the growth stages ( Table 1). The DD was related to the DManual. In the upper sections, The R A , C A and L A values were distinct from the residual classes, allowing the discrimination and localization of apples in the 3D point cloud. The apples were described with R A and C A values 76.1% and 73.2% ( Figure 6). Therefore, the R th and C th were combined and used as thresholds to distinguish the apple points from woody parts. Whereas the remaining points of the latter class were removed using the L W mode value of 15.5% as threshold (L th ) ( Figure 6).
When performing the DBSCAN algorithm, the clusters of filtered points of apples appeared in the 3D point cloud (Figure 7). Subsequently, the k-means partition method was applied to the points of each cluster to acquire the M D . A sphere of variable radius based on the D D of the cluster, enclosed the points of the cluster when placed at M D .
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 19 The RA, CA and LA values were distinct from the residual classes, allowing the discrimination and localization of apples in the 3D point cloud. The apples were described with RA and CA values 76.1% and 73.2% ( Figure 6). Therefore, the Rth and Cth were combined and used as thresholds to distinguish the apple points from woody parts. Whereas the remaining points of the latter class were removed using the LW mode value of 15.5% as threshold (Lth) (Figure 6). When performing the DBSCAN algorithm, the clusters of filtered points of apples appeared in the 3D point cloud (Figure 7). Subsequently, the k-means partition method was applied to the points of each cluster to acquire the MD. A sphere of variable radius based on the DD of the cluster, enclosed the points of the cluster when placed at MD. After the localization of MD, the estimated fruit diameter, DD, was compared to the DManual for each ΔW during the growth stages ( Table 1). The DD was related to the DManual. In the upper sections, After the localization of M D , the estimated fruit diameter, D D , was compared to the D Manual for each ∆W during the growth stages ( Table 1). The D D was related to the D Manual . In the upper sections, ∆W 3-2 and ∆W 4-3 of the canopies, where the fruit grow more distinctly, the analysis of apple size resulted in an adjacent coefficient of determination (R 2 adj ) of 0.69 and 0.74 with RMSE = 7.2 and 6.6% at DAFB 42 , respectively. Generally, high measuring uncertainty was noticed on the first two measuring dates, when the fruit size was smaller, particularly in the lower sections of the trees with overlapping structures. More specifically, no relation was observed on the first measuring date in ∆W 1-G and ∆W 2-1 , presenting an overestimation of 6.5 mm and 7.5 mm, respectively. Similarly, at DAFB 72 in the same areas, the MBE points to an underestimation of fruit diameter of −10.7 mm and −9.11 mm. In contrast, the highest relation was revealed in ∆W 4-3 (R 2 adj = 0.95) and ∆W 3-2 (R 2 adj = 0.90), at DAFB 104 . Whereas, in the same stage, a less pronounced R 2 adj was observed in ∆W 1-G (R 2 adj = 0.55) and ∆W 2-1 (R 2 adj = 0.66) compared to the aforementioned upper zones of the trees. The measuring uncertainty was slightly enhanced again at DAFB 120 . The crop load of trees usually varies between the trees, affecting the fruit size. The high crop load of the trees analysed at DAFB 120 resulted in slightly smaller fruit size compared to trees with lower crop load sampled at DAFB 104 . Consequently, the method showed lowest measuring uncertainty in the biggest fruit at DAFB 104 considering the defoliated trees.

Evaluation
In foliated trees, T L , the same procedure was followed applying the same threshold values gained in the calibration on the data set obtained at DAFB 120 . For evaluating the performance of the segmentation process in foliated trees, T L , the data set of defoliated trees T D was employed as ground truth. For this purpose, the M L clusters found in T L were compared to the clusters, M D , found in T D .
The total number of the apple clusters in defoliated trees (n D ) was 42 in ∆W 1-G , 84 in ∆W 2-1 , 91 in ∆W 3-2 , and 51 in ∆W 4-3 considering all measuring dates. It can be assumed that the high measuring uncertainties in the lowest part of the canopy may have resulted from the lower number of fruits additionally to the inclusions. When utilizing the proposed methodology, a high percentage of apples were found considering the TP in the foliated trees with 88.6%, 85.4%, 88.5%, and 94.8% of the ground truth at DAFB 42 , DAFB 72 , DAFB 104 , and DAFB 120 , respectively (Table 2).
A less pronounced accuracy was noted when analyzing the foliated trees at DAFB 42 and DAFB 72 . In parallel with the fruit growth, the number of apple points in the 3D point cloud increased, enhancing the performance of the detection algorithm. More specifically, the precision ranged from 81.1 to 86.6% and the F1 between 83.3 and 87.5% at DAFB 104 , when the highest fruit size was captured. However, the highest performance was observed in ∆W 4-3 of DAFB 120 , revealing an 89.5% F1, 88.2% precision, and 91.0% recall, when many large fruits were found and least overlapping between tree's parts occurred.
The processing time of the algorithm was 13 sec per tree. The whole process carried out in 64-bit operating system with 16 GB RAM, an Intel ® Core i7-8750H processor (2.2 GHz). Table 2. Performance assessment of the localization algorithm on the segmented apple points in foliated trees (T L ) considering the apple in defoliated trees (T D ) as ground truth over four growth stages given in days after full bloom (DAFB) and in four tree height (W) starting from ground (G) till the maximum tree point (4). Manually counted fruit number (n Manual ), total number of detected fruits in T D (n D ) and true positives (TP) in T L with descriptive statistics are given. The differences of distance between the M L and M D (Figure 8) and the fruit diameter considering D D and D L were calculated for each measuring date ( Table 3). The worst-case scenario was illustrated at DAFB 42 with 19.9% RMSE (Figure 9a). Moreover, in the same growth stage, the highest difference between the D D and D L was revealed, presenting a mean difference of 22.3 mm with 0.3 mm SD. The measuring uncertainty was decreased at DAFB 72 . An even less pronounced RMSE was depicted at DAFB 104 , while the lowest difference between ground truth and estimated position of fruit center in foliated trees was found at DAFB 120 (Table 3). (a) (b) Figure 9. Representation of the (a) worst-and the (b) best-case scenario of the distance of the center of clusters in defoliated (MD) and foliated (ML) trees at DAFB42 and DAFB120. The red color depicts the apple points of the foliated trees (AL) while the blue color the points of the defoliated tree (AD). Table 3. Differences between the position of the centers of defoliated (MD) and foliated (ML) clusters as well as the fruit diameter estimated (DL) and ground truth (DD) in TL and in TD, respectively, regarding the standard deviation (SD) and root mean square error (RMSE) [%] during the fruit growth given in days after full bloom (DAFB).  (a) (b) Figure 9. Representation of the (a) worst-and the (b) best-case scenario of the distance of the center of clusters in defoliated (MD) and foliated (ML) trees at DAFB42 and DAFB120. The red color depicts the apple points of the foliated trees (AL) while the blue color the points of the defoliated tree (AD). Table 3. Differences between the position of the centers of defoliated (MD) and foliated (ML) clusters as well as the fruit diameter estimated (DL) and ground truth (DD) in TL and in TD, respectively, regarding the standard deviation (SD) and root mean square error (RMSE) [%] during the fruit growth given in days after full bloom (DAFB).

MD-ML [mm] SD [mm] RMSE [%] DD-DL [mm] SD [mm] RMSE [%]
DAFB42 0.    More specifically for DAFB 120 , the best-case scenario was depicted considering the position of apple center with a mean value of 0.1 mm difference and 5.7% RMSE (Figure 9b). Furthermore, the minimum difference between the D D and D L was analysed, reaching 8.7 mm with 0.1 mm SD and 9.5% RMSE.

Segmentation
The application of a LiDAR sensor system enabled the acquisition of 3D point cloud of apple trees, allowing the utilization of radiometric (R ToF ) and geometric features (C, L) as classifiers for fruit detection in the proposed algorithm. The apple class (R A , C A , L A ) was separated from leaves as well as delimited from the class capturing woody parts (R W , C W , L W ). The latter thresholds were defined after tree defoliation, whereas, the leaf class for the same features (R L , C L, L L ) was determined in T L . The discrete ranges ( Figure 5) of R A (42-90%), R L (50 to 70%) and R W (0-60%) is assumed to be caused by their different type of surface [47] and water content [48]. Furthermore, the spherical shape of apples resulted in higher C A values (56-98%) than C W (0.1-55.2%) due to the linear shape of woody parts, while a reciprocal opposite range was observed in the correspondent classes of linearity, L A (0.2-38.3%) and L W (60.2-94.1%). Moreover, the latter classes were partly coinciding with C L (40-78%) and L L (20-80%). The apples were segmented from the woody parts and later clustered using the DBSCAN. The application of sphere in each M D enabled high resolution in the determination of R th (76.1%), C th (73.2%) and L th (15.5%) at DAFB 120 . Similar value of R th (60%) was suggested by Gene-Mola et al. [25] for the segmentation of apples in 'Fuji' trees.
This served as calibration for the segmentation modes, whereas also the sphere radius improved the detection of the number of fruit. As was previously mentioned, the ε of DBSCAN was based on the D Manual of each ∆W at DAFB 120 . Thus, earlier growth stages would be influenced by the generation of augmented clusters. This was confirmed by the finding that the measuring uncertainty was elevated mainly in the DAFB 42 stage, when a lower D of the fruits was measured manually.
This overestimation of fruit size could result in under-and over-counting of the DBSCAN clustering at DAFB 42 and subsequent measuring dates, respectively, since the ε remained constant, and the fruit may be bigger or more small fruits are collected in one cluster. In Sun et al. [34], the wide range of size and shape of cotton boll similarily influenced the splitting operation of clusters. Furthermore, the shape of apples is not perfectly spherical producing an error in the estimation of D D . Wang et al. [26] combined a cascade detection and an ellipse fitting to recognize and estimate the size of mangoes, using RGB-D depth imaging. However, the not perfectly elliptical shape of fruits produced 4.9 mm and 4.3 mm RMSE in the detected height and width of fruit, respectively. Gongal et al. [49] measured the size of apples based on the 3D coordinates from a 3D camera and they obtained 30.9% MAE, while the MAE decreased to 15.2% when the pixel information from an RGB camera was considered. As was mentioned before, the crop load (number of fruit per tree) varied, consequently trees with higher crop load will produce fuits of smaller size as found at DAFB 120 compared to DAFB 104 . This affected the ratio of the total number of TP over the n D , decreasing it from 94.8% at DAFB 120 (calibration date) to 88.4% at DAFB 104 .
Nevertheless, the present results indicate that the data set of L, C and R ToF can enable feasible counting and sizing of apples from the 3D point cloud, allowing their discrimination even in areas with high apple density. Furthermore, the values of R ToF, C and L were described through the surface of apples with sphere segmentation, increasing the resolution in the classes, and subsequently the determination of the mode value.

Evaluation
The apple clusters in T D were used as ground truth labels. This derived from the assumption that the shape and the size of apples could be fully described due to the reduced occlusions. Mack et al. [50] employed random sample consensus algorithm to fit a 3D sphere model in grapevine berries, reconstructing the shape of berries with 98% precision and 94% recall. However, the sphere size (D D ) was also applied in T D as an additional step to evaluate the segmented region. In Rakun et al. [51], a reconstructed sphere to evaluate the 3D shape of segmented voxels was employed. Gene-Mola et al. [33] labelled manually the estimated locations of apples in the 3D point cloud, placing 3D rectangular bounding boxes on each apple. The reflectance and the sphericity derived from a LiDAR system were utilized as principal features in fruit detection, achieving 87.5% localization success and 0.85 F1. However, similar results have been reported from the implementation of 3D shape information with RGB vision systems, reaching a precision of 86.4% and recall 88.9% in fruit localization [35,36]. As a major perturbation factor, the woody parts in ∆ W1-G revealed similar R ToF and C values with the apples, enhancing the FP cases and subsequently the decline of accuracy in this region. The occlusions produced by branches and leaves hide partly or totally the fruit, resulting in undercounting of apples. Mendez et al. [52] used a LiDAR laser scanner to count the number of fruit in pruned orange trees (R 2 = 0.63), however, similar to the results of the present study, the correlation was significantly reduced (R 2 = 0.18) when the method was applied in unpruned trees due to occlusion by leaves.
The fruit size of the calibration data set reduced the accuracy of the method, when smaller fruit size occurred. The accuracy of the proposed methodology was reduced at the first two growth stages, resulting in the lowest percentage (65.8%) in ∆W 4-3 at DAFB 72 . Stajnko et al. [53] monitored the diameter of green apples several times during the vegetation period, combining thermal and RGB camera. The highest relation between the manually measured and detected fruit diameter (R 2 = 0.70) was observed at harvest stage, when the fruit reached their maximum size. Confirming, a less pronounced relation (R 2 = 0.68) was observed in apples of smaller size after fruit drop. Furthermore, the varying fruit size, appearing even at one stage, can enhance the visual complexity of the cluster and consequently influence the identified number of fruit [54].
However, evaluating the distance of M L and M D centers allowed to quantify the possible deviations ( Table 3). Some of the fruits were partly occluded by branches, creating smaller clusters in T L . This deviation was also visible in the high RMSE of fruit diameter considering D D and D L Summarizing, pomising results for in situ yield monitoring using a 2D LiDAR sensor system. Further research should be carried out to test the method in different apple cultivars having less spherical shape and varying surfaces properties in order to identify and tackle possible deviations of geometric and reflectance values.

Conclusions
The radiometric and geometric features of apples (R A , C A , L A ) showed higher values compared to wood and leaves compared to fruit elements. Concluding that they can be utilized to segment the apple points in the 3D point cloud. The likelihood of thresholds was enhanced by integrating a sphere to segment the apple clusters and determine the center position. This calibration was carried out on two defoliated trees at DAFB 120 , resulting in R th = 76.1%, C th = 73.2%, and L th = 15.5% that define the points of apple class.
The developed sphere methodology in defoliated trees was able to estimate the fruit diameter, D D , with low measuring uncertainty at different growth stages. The D D , showed the highest adjacent coefficient of determination (R 2 adj = 0.95) with D Manual in the upper parts of the canopy (∆W 4-3 ) at DAFB 104 , when the fruit showed hardly any occlusion with other tree organs and fruit size was at maximum considering all trees measured.
The evaluation of apple clusters in foliated trees over the clusters of defoliated trees depicted that the robustness is affected by the fruit size, but resulted in highest performance of 89.5% F1 with 91.0% recall and 88.2% precision. At DAFB 120 , the minimum difference between ground truth, D D , and estimated, D L , fruit diameter was 8.7 mm representing 9.5% RMSE, which encourages the further development of LiDAR-based fruit size analysis.
Author Contributions: N.T. developed the experimental plan, carried out the data analysis, programmed the codes applied, and wrote the text. D.S.P. contributed to the programming and added directions on the data analysis. G.X. supported the experimental planning, data analysis, and revision of the text. M.Z.-S. developed the experimental plan, added directions to the data analysis and interpretation, and revised the text. All authors have read and agreed to the published version of the manuscript.

Acknowledgments:
The authors are grateful that the publication of this article was funded by the Open Access Fund of the Leibniz Association. Search radius used in density-based scan algorithm λ 1 , λ 2 , λ 3 Eigenvalues calculated from the covariance matrix