An Autonomous Grape-Harvester Robot: Integrated System Architecture

Abstract: This work pursues the potential of extending “Industry 4.0” practices to farming toward achieving “Agriculture 4.0”. Our interest is in fruit harvesting, motivated by the shortage of seasonal labor. In particular, here we present an integrated system architecture of an Autonomous Robot for Grape harvesting (ARG). The overall system consists of three interdependent units: (1) an aerial unit, (2) a remote-control unit and (3) the ARG ground unit. Special attention is paid to the ARG; the latter is designed and built to carry out three viticultural operations, namely harvest, green harvest and defoliation. We present an overview of the multi-purpose overall system, the specific design of each unit of the system and the integration of all subsystems. In addition, the sensor-based perception architecture and the underlying vision system are analyzed. Due to its modular design, the proposed system can be extended to a variety of different crops and/or orchards.


Introduction
Agriculture technologies keep evolving in the recently introduced paradigm of Agriculture 4.0 [1]. The latter can be regarded as an extension of the Industry 4.0 paradigm to farming. Within Agriculture 4.0, emerging technologies such as robotics, Internet of Things (IoT), artificial intelligence and machine vision are combined with a common focus on sustainable crop management [2]. Furthermore, the advent of autonomous intelligent systems has led to the development of robust agricultural robots, namely agrobots [3][4][5]. Agrobots can better handle the variability of crops and hence reduce environmental impacts while increasing food supply and improving economic sustainability [2,3].
Research often focuses on interactive agrobots that can operate on a crop scale, especially with dexterous tactile skills. Toward this end, agrobots have been introduced for precision agriculture tasks such as weeding, harvesting, spraying, pruning, watering, etc. The design of a cherry tomato harvesting robot was presented in [6]; the robotic system consists of a stereo-vision unit, an end-effector, a manipulator, a fruit collector and a railed vehicle. The design concept of an autonomous kiwifruit-picking robot was reported in [7]; the robot follows instructions transmitted via a radio link and navigates autonomously by combining Global Positioning System (GPS) and machine vision; the system was equipped with four picking arms controlled by one processing core. In [8] a micro-dosing system for the precise application of herbicides was developed; the overall system includes an autonomous ground robot, a camera and a micro-dosing system. Another work [9] described an autonomous apple-picking robot; the main parts of the system are a traveling device, a vision system and a robotic arm with a gripper. The development of a strawberry-harvesting robot was presented in [10]; the system consists of a red-green-blue depth (RGB-D) camera, a gripper and a robotic arm, mounted on an autonomous wheeled robot. An autonomous weeding mobile platform, namely AgBot II, offers three different mechanical implements depending on the detected weed, namely an arrow hoe, a tine and a cutting tool [11]. Most of the mobile robots reported in the literature are task-specific. Task-specific agricultural robots are also commercially available, including Harvest CROO for strawberry harvesting [12], GUSS for orchard spraying [13] and Oz, Ted and Dino for weeding [14]. Commercial multi-purpose robotic platforms, designed for more than one particular agricultural operation, have also been reported, including Digital Farmhand [15], Farmdroid [16] and Husky [17].
Our special interest here is in grapes. It is well known that the quality of produced grapes is highly affected by viticultural practices such as defoliation, lateral shoot removal and green harvest [18,19]. Regarding high-value crops like wine grapes, canopy management, pre-harvest, post-harvest and harvest operations are considered to have a considerable effect on wine quality. Agrobots are expected to support viniculture by saving human labor while performing viticultural operations with the dexterity of an experienced worker. Toward this end, this work presents a crop-scale multi-purpose autonomous grape harvester robot, or ARG for short, designed to carry out the viticultural tasks of harvest, green harvest and defoliation in place of a skilled worker. All operations are designed to allow personalization depending on the user preferences and are performed by custom-made end-effectors. Moreover, harvest is carried out homogeneously, in the sense that only grapes of a similar degree of maturity are harvested. The overall system includes three interdependent units: (1) an aerial unit, (2) a remote-control unit and (3) the ARG ground unit. The use of heterogeneous multi-modal units provides opportunities for targeted vineyard management and control of the overall system. The aerial unit provides images of the vineyards. The remote-control unit uses the images to build the vineyard maps and define all possible navigation paths. It also allows the user to design an operation plan for the ARG ground unit to execute. The user can select among three different viticultural operations, personalize the selected operation and send the operation plan to the ARG ground unit. The ARG navigates in the vineyard corridors, collects sensory data that are displayed on the remote-control unit and performs the selected viticultural operation in areas specified by the user.
The objectives of this study are to describe the developed ARG harvester robot regarding (i) hardware design, referring to system architecture, interoperability and integration of hardware components, as well as (ii) software design referring to procedural flows, functionalities, integrated algorithms and personalization parameters. Due to its modular design, the proposed system can easily be adapted to similar agricultural operations regarding alternative crops.
The rest of the paper is structured as follows: Section 2 presents related work regarding viniculture agrobots. Section 3 details all system components, ARG design and workflows of the agricultural tasks. Section 4 describes ARG system integration. Discussion and future work are presented in Section 5. Finally, Section 6 concludes by summarizing the contributions of this work.

Related Work
Viniculture agrobots are rather scarce in the literature. The GRAPE project [20] is an autonomous ground robot with a robotic arm for plant health monitoring and targeted pheromone dispenser distribution; the robot operates by a user-friendly interface. In [21] a terrestrial robot is presented that can determine plant health, monitor the path in the vineyard and apply micronutrients to grapes. An earlier multi-purpose agrobot [22] was developed for harvesting, berry thinning, spraying and bagging; the system consisted of a manipulator, a visual sensor, a traveling device and alternating end-effectors. A cost-effective robot for crop monitoring tasks in mountain vineyards was presented in [23]. An updated version of the latter robot was reported in [24] able to navigate and carry out monitoring and harvesting tasks in steep slope vineyards. The design aspects of a semi-autonomous spraying robot were presented in [25]. The work in [26] describes an autonomous robot system for the automatic pruning of grapevines; a stereo-vision system extracts the three-dimensional (3D) model of the grape trees and a robotic arm carries out pruning. The VINBOT robot [27] was designed to optimize yield management and perform vineyard yield estimation. A multi-purpose robotic platform was developed for vineyard management using an autonomous vineyard scouting robot, namely VineRobot [28]. Table 1 includes the functionalities and basic features of all aforementioned related works regarding viniculture agrobots.
Electronics 2021, 10, 1056
The aim of Table 1 is to highlight the main gap in viniculture agrobots that this work aims to fill. The proposed agrobot can deal with three different viticultural operations, introducing personalization parameters for all tasks. As can be seen in Table 1, green harvest and defoliation have not been implemented before by any other agrobot, and none of the existing methods allows for the personalization of tasks. In the proposed system, for harvest, personalization refers to the degree of maturity toward homogeneously harvested grapes. For the green harvest, the user can select the percentage of grape bunches to be left in each vine tree, while for defoliation the user can define the percentage of leaves to be removed (see Section 3.2.1). At this point, it is worth mentioning that this work is the main concept of an ongoing project, namely Personalized Optimal Grape Harvest by Autonomous Robot (POGHAR) [29]. Additionally, this work combines hardware design, interoperability and integration of devices along with procedural flows for all three viticultural operations, whereas none of the related work covers similar content. All machine vision algorithms that support these operations are referenced; they have already been implemented and their accuracy has been demonstrated at the simulation level (see Section 3.2.1).

Materials and Methods
An overview of the proposed system is detailed in this section. More specifically, all units of the main system are analyzed. Particular importance is given to the ARG ground unit. The basic elements of ARG, which are the wheeled mobile robot and the manipulator, are described thoroughly; hardware, system design, interoperability of devices, procedural flows of all tasks, methodologies, machine vision algorithms and parametrization of agricultural tasks are presented.

System Overview
The overall system mainly consists of three units: (1) an aerial unit, (2) a remote-control unit and (3) the ARG ground unit. Figure 1 conceptually shows the system architecture, as well as the manner in which the three units interact with one another. Table 2 lists the bill of materials (BOM) used per unit.
The aerial unit is an octocopter drone with an RGB camera mounted on it. Aerial surveillance provides the ground unit with field observations for customized mission planning. More specifically, the drone flies on request over the vineyards under favorable weather conditions and acquires images. The latter are used to derive the 3D micro-structure maps of the vineyards in order to calculate robot navigation paths along grapevine corridors. Details regarding this step of the implementation can be found in [30]. The evaluation of the proposed algorithm for navigation path extraction reported an average accuracy, in terms of mean percentage error (MPE), of 0.99% for eight tested fields.
The remote-control unit is the information management and monitoring system presented in [31]. It is the communication channel between the human user and the ARG ground unit, receiving data from and transmitting data to the ARG ground unit. This unit enables the personalization of agricultural practices. Note that applied practices depend on the user's intention, as well as the grape variety. In short, adjustable grape management is enabled by the remote-control unit. Through this unit, the user designs the operation plan and transmits it to the ARG, while at the same time ARG transmits real-time sensory data notifying the user of both the environment and its functional status. Therefore, the user provides ARG with a personalized navigation plan that consists of routes selected among all possible vineyard navigation paths that have been extracted [30].
The ARG ground unit is the autonomous multi-purpose agrobot that works in the field. It comprises two main hardware components, namely a manipulator and a wheeled mobile robot. Each of these hardware components includes several specialized devices, as shown in Table 2. In particular, customized end-effectors, a 3D camera and artificial lighting are mounted on the manipulator, whereas a 3D camera, several sensor devices, artificial intelligence (AI) computing devices and system power batteries are mounted on the wheeled mobile robot. In-field sensory data, collected by the ARG ground unit, provide information regarding the robot's status, e.g., battery level and connectivity, as well as working environment information regarding humidity, temperature, live streaming, etc. Machine vision and data analysis algorithms process the aforementioned data toward sensible decision making.
The design architecture proposed here is in line with the technological requirements of ARG for supporting harvest, green harvest and defoliation, as identified in [32]. It should be mentioned that all hardware components included in Table 2 were selected after an extensive survey of the most up-to-date and efficient components that can meet the specifications set in [32]. For the ARG ground unit, a crucial limitation for the choice of the mounted hardware was the maximum allowable load, which could not exceed 65 kg. This constraint determined the selection of the particular manipulator. In what follows, the ARG ground unit, as well as its operation, are further detailed.

The ARG Ground Unit
The prototype hardware-assembled ARG ground unit is shown in Figure 2. The manipulator is mounted vertically on the left side of the wheeled mobile robot. On the right side of the wheeled mobile robot, there is an insulated box containing two Jetson TX2 boards, a battery (LiFePO4 50 Ah @ 24 V DC) and all necessary electronic devices. The box keeps the equipment protected from external conditions, e.g., from the dust that rises with the movement of the vehicle in the field. The above box placement allows the manipulator to work only on the left side of the vineyard corridor. Therefore, in case ARG needs to work on the right side, it has to navigate along the vineyard corridor in the opposite direction.
The data acquisition system consists of the following parts:
• Environmental sensors, which include two DHT22 temperature and humidity sensors as well as one LM35D temperature sensor. More specifically, one DHT22 is placed inside the box containing electronic circuits, batteries and connections, in order to monitor overheating due to malfunction, whereas the other DHT22 is placed externally on the robot vehicle for environmental measurements. Note that a DHT22 sensor measures humidity in the range from 0% to 100% with 2-5% accuracy and temperature in the range from -40 to +80 degrees Celsius with ±0.5 degrees accuracy. The LM35D sensor enhances the accuracy of external temperature measurements.
• A ZED Mini 3D camera mounted on the robotic arm. This is the main visual sensor of the system. ZED Mini provides a streaming video sequence that can be monitored from the remote-control unit. Frames are used for: (1) grape cluster and leaf detection [33], (2) grape stem detection [34], (3) harvest crate detection [35], (4) grapevine trunk detection [36], (5) ripeness estimation and yield time prediction [37] and (6) grape defect detection.
• Three auxiliary cameras mounted on a fixed base on the left side of the wheeled mobile robot. The high-resolution RGB and NIR cameras are placed on one side of the base, whereas the FLIR camera is placed on the other side. The two synchronized NIR and RGB cameras are placed at a fixed distance of 3.5 cm from one another. These cameras are used to capture images of the vineyard rows in order to calculate vegetation and temperature indices. Vegetation indices are used to characterize areas in terms of vegetation density, allowing the user to have an overview of the vineyard and, from there, locate possible working areas. More specifically, FLIR provides thermal images and NIR provides spectral images to determine the density of green vegetation. The FLIR camera is high-cost equipment and its use is therefore underexplored. However, studies reveal a correlation between FLIR thermal images and vegetation indices [38]. All measurements are displayed on the vineyard maps of the remote-control unit [29]. The user can consult equipotential measurement maps in order to drive the robot to areas of his/her choice, according to the values of indicators related to ripeness and/or vegetation [39].
• An ORBBEC Astra 3D camera, embedded on the front of the wheeled mobile robot and used for navigation. The camera recognizes harvest crates [35] as well as vine trunks [36] and uses them as markers. The ARG pauses in front of either a harvest crate or a vine trunk and carries out a specific agricultural task. In addition, the camera provides an RGB-D map of the field used for obstacle detection. Given the 65 cm width of the ARG ground unit and the 220 cm standard width of vineyard corridors, performing obstacle avoidance inside the corridors is difficult for the robot and dangerous for the crops and fruits. Therefore, in case the robot senses an obstacle, it stops navigation and informs the user of its exact location and status through the remote-control unit. In general, vineyards are considered semi-structured environments. The challenge for ARG is to move dynamically along the pre-defined vine corridors on uneven, heterogeneous or muddy soil at a fixed safe distance from the crop line. Obstacles inside the vineyard corridors are considered rare to non-existent and were therefore not addressed in the context of this work.
• A GPS sensor, to locate the ARG and display its position in real time on the vineyard maps through the remote-control unit. The GPS alone does not guarantee accurate location. For this reason, GPS is not used for localization purposes, but only for the approximate visualization of ARG on the computer interface. To ensure the safe operation of ARG and minimize damage risks to both ARG and crops, additional sensory information is used for ARG's localization, as explained next. The GPS sensor is mounted on top of the wheeled mobile robot, at its back.
• A fusion of four encoders for odometry, an inertial measurement unit (IMU) and a LiDAR for ARG localization [40]. A fusion of encoder data with IMU data results in an initial state estimation for ARG. Localization is further optimized by using the LiDAR. It is well known that multi-modal systems based on a combination of sensors provide more accurate and robust state estimation. Two algorithms use the LiDAR data to achieve optimal localization: (1) the iterative closest point (ICP) algorithm [41], which registers the 3D point cloud data of 16 laser beam layers and thus builds a map while simultaneously tracking the robot pose in six degrees of freedom (DoF) [42], and (2) a wall-following algorithm, based on the information of one LiDAR laser beam, which keeps the robot at a fixed distance from the working side.
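To illustrate how a vegetation index could be derived from the synchronized NIR and RGB frames mentioned above, the following minimal sketch computes the normalized difference vegetation index (NDVI). The text does not state which indices ARG actually computes, so the choice of NDVI, the function names `ndvi` and `ndvi_map`, and the toy reflectance values are assumptions; the sketch also assumes that the two images have already been co-registered, which is not trivial given the 3.5 cm offset between the cameras.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total > 0 else 0.0


def ndvi_map(nir_img, red_img):
    """Apply ndvi() pixel by pixel to two equally sized 2D lists."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_img, red_img)
    ]


# Hypothetical 2x2 reflectance patches (values in [0, 1]).
nir_patch = [[0.8, 0.6], [0.2, 0.0]]
red_patch = [[0.2, 0.2], [0.2, 0.0]]
index_map = ndvi_map(nir_patch, red_patch)
```

Dense green vegetation reflects strongly in the NIR band and weakly in the red band, so higher values in `index_map` would indicate denser canopy.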
In order to maximize its viewing angle and scanning area, the LiDAR is adjusted on the wheeled robot on an elevated aluminum base. Thus, the interference of the LiDAR with the robotic arm or the box is avoided. The IMU sensor is located inside the wheeled mobile robot.
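The wall-following behavior based on a single LiDAR beam can be pictured as a simple feedback loop. The sketch below is a proportional controller under stated assumptions, not the algorithm actually deployed on ARG: the function name, the gain, the saturation limit and the sign convention (crop row on the robot's left, positive command turns left) are all hypothetical. The target distance used in the example is the clearance obtained if the 65 cm wide robot were centered in a 220 cm corridor, i.e., (220 - 65) / 2 = 77.5 cm.

```python
def wall_follow_angular_velocity(measured_m, target_m, gain=1.5, max_rate=0.5):
    """Return an angular velocity command (rad/s) from one lateral range reading.

    Assumed convention: the crop row is on the robot's left and a positive
    command turns the robot left, i.e., toward the row.
    """
    error = measured_m - target_m  # > 0 means the robot is too far from the row
    command = gain * error
    # Saturate the command so that a noisy reading cannot cause a sharp turn.
    return max(-max_rate, min(max_rate, command))


# Hypothetical clearance for a robot centered in the corridor (metres).
target_distance = (2.20 - 0.65) / 2
steer_far = wall_follow_angular_velocity(0.90, target_distance)
steer_near = wall_follow_angular_velocity(0.60, target_distance)
```

In a ROS setup such a command would typically be published as the angular component of a velocity message, while the forward speed is set separately.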
The GPS, LiDAR, IMU, encoders and the ORBBEC Astra camera are embedded on the wheeled mobile robot, a Summit XL HL made by Robotnik, and are therefore powered by its LiFePO4 15 Ah @ 48 V DC battery. A LiFePO4 50 Ah @ 24 V DC battery supplies the manipulator and, through specialized voltage converters, outputs the various voltages needed to power all the remaining sensors, ensuring approximately 5 h of autonomy. The system operation has been developed in Robot Operating System (ROS).
In addition to sensors, the basic elements that control the system are the main board of the wheeled robot and the two NVIDIA Jetson TX2 boards. Tasks of the main elements are listed below.

• Main Board: The main board (motherboard) controls the wheeled robot regarding navigation and user commands. It handles the internal communication between the robot's built-in devices (IMU, GPS, LiDAR, encoders, ORBBEC Astra) and all additional devices connected via three available USB 2.0 ports (JACO 2 and RGB camera) or by wire (DHT22 and LM35D sensors). The main board collects all data from the linked sensors and runs the algorithms listed in Table 3. The board has its own power supply (LiFePO4 15 Ah @ 48 V DC battery), which supports all connected devices apart from JACO 2, which is powered by an additional power board (LiFePO4 50 Ah @ 24 V DC battery).
• NVIDIA Jetson TX2: The main task of these two processing boards is to ensure the high-level autonomy of the system, by running communication tasks and machine vision algorithms along with feature extraction toward decision making, as shown in Table 3. Each board provides one USB 3.0 port. Board 1 is connected to the ZED Mini, whereas board 2 is connected to the NIR camera; board 2 also controls the end-effector gripper. Both boards are connected via a Wi-Fi network with the remote-control unit, as described in [31]. Data are exchanged between the remote-control unit and ARG via a MongoDB database that runs on the host computer of the remote-control unit. All information is transmitted inside JavaScript Object Notation (JSON) packets as JSON arrays. Both boards and all linked devices are powered by the power board (LiFePO4 50 Ah @ 24 V DC battery).
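As an illustration of the JSON-array packets exchanged between the remote-control unit and ARG, the following sketch serializes a personalized operation plan with the Python standard library. The actual packet schema is described in [31] and is not reproduced here; every field name (`type`, `operation`, `routes`, `personalization`, `clusters_to_leave_pct`) and the function `build_operation_plan` are hypothetical, and the database write itself is omitted.

```python
import json

ALLOWED_OPERATIONS = {"harvest", "green_harvest", "defoliation"}


def build_operation_plan(operation, route_ids, personalization):
    """Serialize an operation plan as a JSON array holding one packet."""
    if operation not in ALLOWED_OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    packet = [
        {
            "type": "operation_plan",            # hypothetical field names
            "operation": operation,
            "routes": list(route_ids),           # selected navigation paths
            "personalization": dict(personalization),
        }
    ]
    return json.dumps(packet)


# Example: green harvest on two routes, leaving 70% of the clusters per vine.
plan_json = build_operation_plan(
    "green_harvest", [3, 4], {"clusters_to_leave_pct": 70}
)
decoded = json.loads(plan_json)
```

Rejecting unknown operations on the sender side keeps malformed plans from reaching the robot in this sketch.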

Manipulator
Depending on the agricultural operation, two different customized end-effectors (patent pending) are available and can be mounted on the 7-DoF robotic arm: one for harvest and green harvest and another for defoliation. The Kinova JACO 2 robotic arm is used due to its light weight and low power consumption. The modeling of JACO 2 was presented in [43]. The analytic kinematic model was calculated and implemented in ROS for cross-validation. The ZED Mini 3D camera is mounted on the manipulator, together with an artificial lighting system that eliminates the uncontrolled natural illumination or shadowing that affects the accuracy of machine vision algorithms and, additionally, enables working during the night. The placement of the ZED Mini camera on the end-effector enhances the sensing of the surroundings and improves target object identification and, ultimately, operation precision. Note that the Kinova JACO 2 comes with a standard wired controller with a three-axis joystick mounted on a support. This controller is used in emergency cases such as entanglement of the arm inside the branches.
The workflow of the manipulator for the harvesting task is presented in Figure 3. During harvest, the objective is to collect grape clusters of a specific degree of maturity. Note that the degree of maturity is a conventional concept that varies for every grape variety and depends on the desired quality of the produced wine [44]. Therefore, for an accurate estimation of the harvest time, it is essential to monitor the grape ripening level. The harvesting task is accomplished by three machine vision algorithms executed in sequence: grape cluster detection, ripeness estimation and grape stem detection. All the employed machine vision algorithms, as well as the AI computing device on which they run, are listed in Table 3. Table 3 also includes simulation performance results of the algorithms in terms of mean pixel intersection over union (IU) or mean average precision (mAP). For the ripeness estimation algorithm, the error is computed as the distance between the calculated and predicted ripeness level. Methodologies referenced in Table 3 have been developed in previous works toward the implementation of the functionalities of the three agricultural tasks of harvest, green harvest and defoliation. Therefore, the machine vision algorithms have already been shown to be of sufficient accuracy at the simulation level. In order to connect the proposed system to the implemented algorithms, a description of the model used and the task each algorithm is deployed for is also provided in Table 3. More details on the parameters of the algorithms can be found in the corresponding references. During harvesting, first, the model for the grape cluster detection is loaded. The model is supplied with frames acquired at 8 frames per second (fps) from the ZED Mini; it detects the grape clusters and starts the process from the nearest cluster. If no cluster is detected, the manipulator moves from the home position and covers a horizontal distance of 50 cm left and right, searching for clusters.
The selected scanning area is within a safe range according to the design specification, i.e., the opening angles of the robotic arm. This check is done twice and, if no cluster is detected, the robot moves to the next vine plant. In order to harvest a detected cluster, its ripeness is estimated. Only clusters of a similar degree of maturity are harvested. If the cluster is fully ripened, the algorithm defines its center of mass (CoMcluster) (x0, y0, z0), converts these coordinates from image points to space points and calculates the relative distance of the CoMcluster from preset reference points. These reference points are the edges of the cutting tool on the end-effector, which are visible in every frame. The manipulator moves towards the CoMcluster only on the X and Y axes, until the CoMcluster is placed at the center of the image, between the reference points. When the cluster is centered on both the X and Y axes, the manipulator approaches the cluster by moving along the Z axis and stops 2 cm from the target. From that distance, the stem of the cluster can be identified. At that point, information regarding the region of interest (ROI) is collected, such as extreme points (upper, left and right). The same process described for the grape cluster is repeated for the grape stem. After cutting the cluster from the stem, the manipulator returns to its home position and releases the cluster in a harvest crate placed underneath. Harvest crates are used as visual landmarks for navigation. More specifically, ARG carries out all tasks after stopping in parallel with the harvest crates placed in front of vine trunks, so that the home position of the manipulator is above the center of the harvest crate and a collected cluster can be placed directly inside the crate. This setup saves the time that would otherwise be needed for the additional visual identification of harvest crates, as well as for the navigation of the manipulator toward them.
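The two-stage approach described above (center on X and Y first, then advance along Z and stop 2 cm short) can be sketched as follows. This is a simplified illustration under stated assumptions: the conversion from image points to space points is modeled with a pinhole camera, which the text does not specify, and the function names, the intrinsic parameters `fx`, `fy`, `cx`, `cy` and the centering tolerance are hypothetical. Only the 2 cm stand-off comes from the text.

```python
def pixel_to_camera(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with known depth to camera-frame metres."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return x, y, depth_m


def next_motion(target_xyz, tol_m=0.005, standoff_m=0.02):
    """Return the next end-effector displacement (dx, dy, dz) in metres.

    Stage 1: move only on X and Y until the target is centered.
    Stage 2: advance along Z, stopping standoff_m short of the target.
    """
    x, y, z = target_xyz
    if abs(x) > tol_m or abs(y) > tol_m:
        return (x, y, 0.0)
    return (0.0, 0.0, max(0.0, z - standoff_m))


# Hypothetical intrinsics for a 1280x720 image; cluster 70 px right of center.
com = pixel_to_camera(710, 360, 0.5, fx=700.0, fy=700.0, cx=640.0, cy=360.0)
first_step = next_motion(com)                # lateral correction only
second_step = next_motion((0.0, 0.0, 0.5))   # centered: approach along Z
```

Keeping the lateral and depth motions separate, as the text describes, means the target stays in view while it is being centered and the arm only closes the distance once the alignment is settled.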
of its mass (CoMcluster) (x0,y0,z0), converts these coordinates from image points to space points and calculates the relative distance of the CoMcluster from preset reference points. These reference points are the edges of the cutting tool on the end-effector that are always evident in every frame. The manipulator moves towards the CoMcluster only on X and Y axes, until the CoMcluster is placed on the center of the image, between the reference points. When the cluster is centered on both X and Y axes, then the manipulator approaches the cluster by moving vertically on Z axis, and stops 2 cm from the target. From that distance, the stem of the cluster can be identified. At that point, information regarding the region of interest (ROI) is collected, such as extreme points (upper, left and right). The same process described for the grape cluster is repeated for the grape stem. After cutting the cluster from the stem, the manipulator returns to its home position and releases the cluster in a harvest crate placed underneath. Harvest crates are used as visual landmarks for navigation. More specifically ARG carries out all tasks after stopping in parallel with the harvest crates placed in front of vine trunks so as the home position of the manipulator is above the center of the harvest crate in order to straightforward place a collected cluster inside the crate. The aforementioned setup saves time from additional visual identification of harvest crates as well as from navigation of the manipulator toward it.  The workflow of the green harvesting task is illustrated in Figure 4. For green harvest and defoliation, vine trunks are used as visual landmarks for navigation. The ARG performs all tasks after stopping in parallel with the vine trunks. All removed leaves or clusters in those tasks are left to fall on the ground and outside any crate. The workflow of the green harvesting task is illustrated in Figure 4. 
In green harvest, a percentage of the detected grape clusters is removed to reduce the grape load and, thus, improve product quality [46]. The percentage is a requirement defined by the user on the personalization tab of the remote-control unit. First, the defective clusters (rotten, diseased, dry or damaged) are removed, and then the clusters located on the outer edges of the vine tree. Removed clusters are thrown on the ground. The workflow for the green harvest is similar to that of harvest. The main difference is that the ripeness detection algorithm is replaced with the defect detection algorithm. The "remove cluster" block in Figure 4 includes the grape cluster localization and stem detection as defined in the harvesting task (green blocks of Figure 3), the motion of the robotic arm and the cutting of the clusters by opening and closing the scissors, without returning the arm to its home position. In the beginning, all detected clusters are subjected to defect detection. Detected defective bunches are removed until the percentage selected by the user is achieved. If after this check the percentage of clusters to remove has not been achieved, then more distant clusters are removed until the user's requirement is met.
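The removal order described for green harvest can be illustrated with a short Python sketch. It is only one possible reading of the text: the `Cluster` fields, the use of a single `distance_from_trunk` value as the measure of how far out a cluster lies, and rounding the target count up are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    cluster_id: int
    defective: bool             # output of a defect detection step (assumed available)
    distance_from_trunk: float  # hypothetical measure of how far out the cluster lies


def select_clusters_to_remove(clusters, percentage):
    """Pick clusters to remove: defective ones first, then the most distant ones,
    until the user-defined percentage of detected clusters is reached."""
    target = math.ceil(len(clusters) * percentage / 100.0)
    ordered = sorted(
        clusters,
        key=lambda c: (not c.defective, -c.distance_from_trunk),
    )
    return ordered[:target]
```

For example, with five detected clusters, one of them defective, and a user setting of 40%, the sketch returns the defective cluster followed by the most distant healthy one.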
Both the harvesting and the green harvesting tasks are implemented by a customized end-effector consisting of a robotic gripper, a specialized cutting tool (scissors) and 3D-printed fingers, designed for cutting the grape cluster at the stem and holding it without harming it. For the defoliation task, the harvesting end-effector is replaced by a custom-made tool consisting of a robotic gripper and 3D-printed, specially shaped fingers with a rubber-coated internal surface, able to crumple and hold the leaf in order to remove it.
The workflow for the defoliation task is illustrated in Figure 5. Recall that defoliation is the removal of leaves from the base of the shoots to enhance the exposure of the grapes to the sun [47]. The defoliation task can be customized by the user from the remote-control unit. In particular, the user can define the percentage of the detected mass of leaves to remove. Note that defoliation calls for the removal of fewer leaves for white varieties to avoid loss of aromatic potential, and more leaves for red varieties to increase color and restrict plant aromas [48]. Thus, the ability of the proposed system to personalize the defoliation task is considered an important novelty.
During defoliation, the ARG locates the first trunk next to the start point where defoliation will begin and moves forward to the goal point in steps of a predefined length (90 cm). This length is computed from the working area that the robotic arm can cover according to its specifications, i.e., opening angles and applied torques, so that the ARG can move forward and apply defoliation in marginally overlapping areas along the vineyard row until the end of the selected path. The percentage of leaves to be removed is a specification defined by the user on the personalization tab of the remote-control unit. The leaf detection algorithm defines the area of leaves in an image frame. The resulting images are 224 × 224 pixels in size. Initially, the ROI where defoliation will be performed is defined; it is the area covering the lower half of the image (112 × 224 pixels).
Then, the ROI is divided into a grid of 4 rows and 8 columns, resulting in 32 equal square areas of 28 × 28 pixels. The area of leaves is calculated for each square, and the 32 squares are sorted from the fullest to the emptiest. From this sorted set, the leaf areas are summed until the α squares that together include 80% of the leaves in the ROI are identified. The remaining squares are excluded. This step excludes the squares with only a few leaves, so that the robotic arm is not sent to defoliate empty or almost empty areas. From the α selected squares, we then identify those with the most foliage, where defoliation must be performed to remove the percentage of leaves set by the user. To this end, the leaf areas of the α squares are summed from the fullest to the emptiest until we obtain the b squares that include the percentage (P%) of the total leaf area detected in the ROI. The CoM of each of the b squares is calculated and the robotic arm visits each CoM to remove the enclosed leaves. Every time the gripper closes, it moves back along the z-axis by 10 cm to detach the leaf from the stem. All decisions regarding the ROI size, grid size and distance values were made through extensive trial and error and depend on the position of the camera, which is fixed and remains stable every time it supplies the inference model with frames. The camera in question is the ZED Mini camera mounted on the robotic arm. With the arm in its home position, the camera is always located at a fixed distance from the foliage of the vine trees. This ensures consistent results. Figure 6 illustrates step-by-step the results of the proposed defoliation algorithm.
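The grid-based selection described above can be sketched with NumPy as follows. This is a minimal illustration rather than the code running on the ARG: it assumes the leaf detector yields a 224 × 224 boolean mask, it reads "the b squares that include P%" as the smallest prefix of the sorted squares whose summed leaf area reaches P% of the ROI total, and it caps b at α. The function name and its parameters are hypothetical.

```python
import numpy as np


def defoliation_targets(leaf_mask, percent_to_remove, cell=28, keep_fraction=0.80):
    """Return (x, y) pixel targets, fullest square first, for the squares to defoliate."""
    leaf_mask = np.asarray(leaf_mask, dtype=bool)
    top = leaf_mask.shape[0] // 2
    roi = leaf_mask[top:, :]  # lower half of the image (112 x 224)
    rows, cols = roi.shape[0] // cell, roi.shape[1] // cell  # 4 x 8 grid
    # Leaf area (pixel count) of every 28 x 28 square.
    areas = roi[:rows * cell, :cols * cell].reshape(rows, cell, cols, cell).sum(axis=(1, 3))
    total = int(areas.sum())
    if total == 0 or percent_to_remove <= 0:
        return []
    flat = areas.ravel()
    order = np.argsort(-flat, kind="stable")  # fullest to emptiest
    cumulative = np.cumsum(flat[order])
    # alpha: squares that together hold 80% of the leaves in the ROI.
    alpha = int(np.searchsorted(cumulative, keep_fraction * total)) + 1
    # b: squares that together hold P% of the leaves, never more than alpha.
    b = int(np.searchsorted(cumulative, percent_to_remove / 100.0 * total)) + 1
    b = min(b, alpha)
    targets = []
    for flat_idx in order[:b]:
        r, c = divmod(int(flat_idx), cols)
        block = roi[r * cell:(r + 1) * cell, c * cell:(c + 1) * cell]
        ys, xs = np.nonzero(block)
        if xs.size == 0:
            continue
        # Center of mass of the leaf pixels, in full-image pixel coordinates.
        targets.append((c * cell + float(xs.mean()), top + r * cell + float(ys.mean())))
    return targets
```

Each returned point would then be converted to a 3D target for the arm; that conversion depends on the camera calibration and is outside this sketch.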
All agricultural tasks described in the procedural flows allow for customization. The user can select certain parameters from the remote-control unit to customize each task depending on individual applied practices. Table 4 includes the parameters that the user can define for all three tasks. All parameters have been specified by a multidisciplinary team of experts including engineers, viticulturists and agronomists [32] and have been tested in the laboratory. At a simulation level, all methodologies and algorithms are sufficiently accurate. However, failures are expected in the field, since in real conditions the performance of the methodologies depends on the setting of the parameters; fine-tuning of both algorithms and parameters can only be done in the field. At this point, it should be highlighted that all agricultural tasks produce waste. Each wine producer manages waste differently; some throw the defective grape clusters on the ground; others collect them in harvest crates; others leave them on the vine tree. Leaves removed during defoliation are always left on the ground and recycled as fertilizer. As can be seen from Table 4, the proposed system allows the user to choose how to manage the waste generated by each agricultural task.
One challenge regarding the motion of the manipulator is to avoid collisions with both the equipment on the wheeled mobile robot and the natural environment. The vineyard is considered a semi-structured environment, but this has to do more with ARG navigation along vineyard corridors than with moving the manipulator. More specifically, branches, support wires and grown vegetation can potentially obstruct the movement of the manipulator. Therefore, to avoid collisions in this complex work environment, an algorithm runs continuously during all agricultural tasks. This algorithm is a torque-based collision avoidance strategy [49]. The forces exerted on each joint are checked at all times and a repulsive potential repels the robot away from obstacles, returning it to a safe position, i.e., the home position. The repulsive potential field is also active for self-collision, along with joint limits that define a priori the safe working space for the manipulator. When working in unstructured fields, robotic systems need robust schemes for safely interacting with the environment. The proposed scheme does not stand out for its robustness; nevertheless, it is an acceptable choice for moving safely in complex environments.
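As a rough illustration of the monitoring part only, the following Python sketch flags a possible contact when a measured joint torque deviates from its expected value by more than a threshold, and checks joint angles against a priori limits. It does not reproduce the strategy of [49] and omits the repulsive potential altogether; the function names, the expected-torque input and the threshold values are hypothetical.

```python
def contact_suspected(measured_torques, expected_torques, thresholds):
    """True if any joint torque deviates from its expected value beyond its threshold."""
    return any(
        abs(measured - expected) > limit
        for measured, expected, limit in zip(measured_torques, expected_torques, thresholds)
    )


def within_joint_limits(joint_angles, limits):
    """True if every joint angle lies inside its (lower, upper) limit pair."""
    return all(lower <= q <= upper for q, (lower, upper) in zip(joint_angles, limits))
```

In a control loop, a `True` result from `contact_suspected` or a `False` result from `within_joint_limits` would be the trigger for retreating to the home position.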

Wheeled Mobile Robot
The four-wheel mobile robot Summit XL HL by Robotnik was the base mobile robot used in this project. The Summit can carry a payload of up to 65 kg, which is enough for the manipulator together with all the embedded systems, and can navigate safely on uneven terrain. Two operating modes are available for the wheeled mobile robot, namely manual and autonomous. In manual mode, a Dualshock 4 controller by Sony provides control via a Bluetooth connection; this mode is used to navigate the ARG to the starting point of the navigation path and back to the base in case of emergency. Autonomous navigation is based on feedback from the wheel encoders, the IMU and, mainly, the LiDAR; the robot navigates on predetermined paths that the user selects on the vineyard maps displayed in the software application running on the remote-control unit. The selected paths are then transformed from GPS coordinates to robot poses and sent from the remote-control unit to the ARG. ARG is placed at the starting point of the path and then navigates from one pose to the next with the help of the LiDAR, using a wall-follower algorithm to keep a safe constant distance from the vine trees. The embedded vision system of Summit, i.e., the 3D ORBBEC Astra camera, is used to provide navigation cues that define the exact working area where ARG should stop and start working. More specifically, harvest crates [35] as well as vine trunks [36] are used as visual cues. Note that harvest crates are located in front of each vine tree in order to collect the harvested clusters. However, harvest crates are not stable landmarks. Therefore, a vine trunk detection algorithm provides additional information regarding the exact stopping points.
The aforementioned setup is functional and permits ARG to navigate inside the vineyard corridors, turn when it reaches a corridor end, return along the same corridor or one near it and stop in front of each vine tree on its way. The 3D camera can perceive obstacles, and the system responds by informing the user through the remote-control unit. Navigation is performed with a linear velocity of 0.5 m/s and an angular velocity of 0.2 rad/s. The flow regarding the movement of the wheeled mobile robot is shown in Figure 7.
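The idea of a wall follower that keeps a constant distance from the vine row can be sketched as a simple proportional controller. This is an illustrative assumption, not the controller used on the ARG: the gain `kp`, the choice of a proportional law, and the sign convention (vine row on the right-hand side, positive angular velocity turning left) are hypothetical. The velocity values 0.5 m/s and 0.2 rad/s are the ones reported in the text.

```python
LINEAR_VELOCITY = 0.5       # m/s, from the text
MAX_ANGULAR_VELOCITY = 0.2  # rad/s, from the text


def wall_follow_command(measured_distance, desired_distance, kp=1.0):
    """Return (linear, angular) velocities that steer toward the desired row distance.

    Assumes the vine row is on the robot's right and that a positive angular
    velocity turns the robot left, i.e., away from the row.
    """
    error = desired_distance - measured_distance  # > 0 means too close to the row
    angular = kp * error
    angular = max(-MAX_ANGULAR_VELOCITY, min(MAX_ANGULAR_VELOCITY, angular))
    return LINEAR_VELOCITY, angular
```

When the robot is exactly at the desired distance the command is straight ahead; large errors saturate at the angular velocity limit.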
All the tasks implemented in this work were developed in Python using the Kinetic distribution of ROS installed on Ubuntu 16.04. ROS serves as a communication channel between software and hardware components of the robot via ROS messages. The robot uses a linear-quadratic regulator (LQR) controller [50]. Five main packages run on ROS, namely vision perception system, state estimation (which includes localization and mapping), obstacle detection, task execution and navigation (Figure 8). The user personalizes the tasks on the interface of the remote-control unit and stores the task data in the database, through which the robot receives all necessary task data. The robot obtains the navigation operational plan, extracted from the vineyard maps, and moves relative to its current position toward the received goal. The current pose and final goal are described by three variables (x, y, θ); x and y describe the displacement of the robot from its current position along the X and Y axes, and θ denotes the offset angle.
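To make the relative goal representation concrete, the following sketch converts a goal (x, y, θ) given relative to the robot into a pose in a fixed map frame. It assumes that the displacement is expressed in the robot's own body frame and that angles are in radians; the function name is hypothetical and the snippet does not use ROS.

```python
import math


def relative_goal_to_map(current_pose, relative_goal):
    """Compose a map-frame pose (x, y, theta) with a goal given in the robot frame."""
    x, y, theta = current_pose
    dx, dy, dtheta = relative_goal
    goal_x = x + dx * math.cos(theta) - dy * math.sin(theta)
    goal_y = y + dx * math.sin(theta) + dy * math.cos(theta)
    # Wrap the resulting heading to the interval (-pi, pi].
    goal_theta = math.atan2(math.sin(theta + dtheta), math.cos(theta + dtheta))
    return goal_x, goal_y, goal_theta
```

For a robot at (1, 2) facing along the map's Y axis, a relative goal of one meter straight ahead maps to approximately (1, 3) with an unchanged heading.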
The task executor accepts the defined tasks regarding the motion and sends a task in the form of a string to the navigation package, which includes motion planning. The task executor also receives input from the visual perception system. Obstacle detection is performed using feedback from the ORBBEC Astra camera and LiDAR. State estimation includes the fusion of multiple sensors towards correcting the navigation trajectory. Figure 9 illustrates the integration and connectivity of the main structural elements of ARG. The different line colors represent the different ways of connecting the elements, according to the corresponding element tag. Sensors are represented by small green boxes, while main elements are represented by big blue boxes. The main elements communicate through a router over a local-area Ethernet connection. The tasks of the main elements are described briefly in the figure.

System Integration
The autonomous operation of ARG is defined by all previous workflows, which describe how the machine vision algorithms and the motion control of the wheeled robot and the manipulator are combined in order to implement the three agricultural tasks.

System integration enables the ARG to carry out the selected agricultural task continuously while navigating along the vineyard rows. The hardware and software architecture of ARG is presented in Figure 10. Surrounding boxes in green represent hardware modules, whereas inner boxes in deep blue represent software functions. The Ground Robot Node is the basic node that coordinates the system by synchronizing the information flow between all modules; it is responsible for the navigation tasks and performs image analysis. Navigation uses information from odometry as well as sensor streams and sends velocity commands to the wheeled robot. Image analysis includes machine-vision models as well as data processing algorithms. The Arm Motion Control Node is in charge of the motion of the robotic arm. The Subsystem Node calculates the target of the robotic arm, communicates with the wheeled mobile robot and performs image analysis; in addition, it links the robotic arm, the end-effector and the image analysis of the ZED Mini camera frames, and it handles sensory data from the camera, the robotic arm position status and the end-effector actions. Finally, the database refers to the MongoDB database where all data exchanged between the remote-control unit and ARG are stored.

Discussion
The engagement of agrobots promises operation cost-savings as well as reduction of both the required material resources and the yield losses in agriculture [3,51]. Furthermore, an imperative demand in agriculture is "manual dexterity", whose automation by machines can result in substantial benefits. For instance, in certain parts of the world such as Europe, shortages of seasonal workers unable to travel between regions have caused accumulation of fresh products as well as huge food losses. Therefore, automation of the harvest, as well as of other agricultural tasks that call for manual dexterity, is expected to have a massive impact.
The advent of technology is directly related to software/hardware systems, leading to the simultaneous worldwide adaptation of robotics to an extended range of applications. Therefore, technological progress concerning power autonomy, machine vision algorithms, intelligence modeling, autonomous navigation and precise manipulation is expected to improve the functionality of agrobots at all levels.
According to the above, agrobots are still in their early stages and further applications will emerge gradually as an extension of breakthrough technologies. Presentation of detailed design principles and system architecture of agrobots aims at addressing potential problems to be solved by researchers working in the field. Challenges, problems, different designs and approaches need to be highlighted to gradually achieve successful adoption of innovative techniques and transfer of knowledge to numerous tasks and different crops. Toward this end, the present work investigated the robotic design aspects of a multi-purpose vineyard agrobot, aiming at providing generic design guidelines.
A limitation of the proposed system is the need for a human controller of the remote-control unit. At this point, the user designs an operation plan for ARG by selecting navigation paths and specific agricultural operations. Future work includes the investigation of replacing the human user with an optimization model that will be trained to design the optimal operation plan considering sensory data, available paths, etc., combined with additional features such as weather reports. However, even if the human controller is replaced with intelligent algorithms, human presence is still considered necessary to monitor that the system responds properly and that neither the yield nor the ARG is compromised in any way.
Typically, robots are employed in predictable, structured industrial environments, e.g., assembly lines. However, a typical agricultural environment is neither predictable nor structured. Therefore, a novel agrobot design should be sought, especially regarding its "intelligence modeling". Note that an agrobot can be thought of as a cyber-physical system (CPS). The latter is defined as a mechanical device endowed with both sensing and reasoning capacities toward a certain degree of autonomy. Toward this end, future work also includes intelligence modeling. Based on their advantages, lattice computer (LC) models appear especially promising for supporting integrated software/hardware systems [52] such as the one presented in this work and will therefore be further investigated.
Another limitation of this work is the absence of an obstacle avoidance policy. Vineyard corridors are too narrow for ARG to perform avoidance maneuvers, considering its size and heavy load. If an obstacle is sensed in its way, the ARG stops and informs the user through the remote-control unit. The user then performs any necessary manual corrections to the ARG trajectory. This is time-consuming since the system needs to start over. Future work includes obstacle avoidance investigation, toward a more robust system.
An additional limitation of the proposed system is its power autonomy. At this point the system is powered by batteries, which limits the working time. Future work includes the investigation of alternative energy sources such as solar panels.
In the context of this work, overall system performance results are not reported. However, this work references results of individual methodologies that implement basic tasks of the proposed system, such as the remote-control unit function [31], the navigation route mapping [30], the kinematic analysis of the robotic arm [43] and machine vision algorithms for detection of grape clusters and leaves [33], stems [34], harvest crates [35], vine trunks [36] and grape ripeness level [37]. These results will be validated in real scenarios covering a predefined range of vineyards of three well-known Northern Greece wine producers as part of a national research program [29]. Future work will also include the application of the overall system to alternative crops to investigate possibilities of adaptability of the proposed system and definition of possible limitations.

Conclusions
This work has presented the system architecture, design aspects, development and integration of an autonomous agrobot for viticultural operations. The main characteristics of the proposed system are: (1) the crop scale application ability; (2) the multi-modal design based on three interdependent units (aerial, remote-control and ground unit); (3) the multi-purpose operational design allowing for harvest, green harvest and defoliation automation; note that green harvest and defoliation have not been addressed previously by existing robotic systems; (4) the innovative personalization ability for all agricultural tasks toward personalized vineyard practices.
The aim of this work is to present the hardware design of the aforementioned overall system combined with the procedural workflows for all the applied tasks. More specifically, this work includes: (i) hardware specifications such as bill of materials, interoperability of devices, system architecture and system integration, and (ii) software specifications such as procedural flows, list of integrated algorithms, simulation results and personalization parameters. The limitations of the proposed system are also discussed.
Future work includes in-lab and in-field application of the integrated system, evaluation of all subsystems and fine-tuning. The proposed system design can be adapted to similar agricultural operations regarding alternative crops. Therefore, future applications will be investigated to evaluate the adaptability of the proposed system and to broaden the scope of this research. Future work also includes automated mission configuration based on sensory data, intelligent modeling for decision making, obstacle avoidance investigation and sustainable power supply, toward complete autonomy of the proposed system.
Author Contributions: Conceptualization, G.A.P., V.G.K. and T.P.P.; investigation, E.V., K.T., A.N. and T.K.; writing-original draft preparation, E.V., K.T., A.N. and T.K.; writing-review and editing, E.V., G.A.P., V.G.K., S.M., S.K. and T.P.P.; visualization, G.A.P., V.G.K. and T.P.P.; supervision, G.A.P.; project administration, G.A.P. All authors have read and agreed to the published version of the manuscript.
Funding: This research has been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH-CREATE-INNOVATE (project code: T1EDK-00300).