Sensing Technology Survey for Obstacle Detection in Vegetation

Abstract: This study reviews obstacle detection technologies in vegetation for autonomous vehicles or robots. Autonomous vehicles used in agriculture and as lawn mowers face many environmental obstacles that are difficult for the vehicle sensors to recognize. This review provides information on choosing appropriate sensors to detect obstacles through vegetation, based on experiments carried out in different agricultural fields. The experimental setups from the literature consist of sensors placed in front of obstacles, including a thermal camera; red, green, blue (RGB) camera; 360° camera; light detection and ranging (LiDAR); and radar. These sensors were used either in combination or individually on agricultural vehicles to detect objects hidden inside the agricultural field. The thermal camera successfully detected hidden objects, such as barrels, human mannequins, and humans, as did LiDAR in one experiment. The RGB camera and stereo camera were less efficient at detecting hidden objects compared with protruding objects. Radar detects hidden objects easily but lacks resolution. Hyperspectral sensing systems can identify and classify objects, but they consume a lot of storage. To obtain clearer and more robust data of hidden objects in vegetation and extreme weather conditions, further experiments should be performed for various climatic conditions combining active and passive sensors.


Introduction
Autonomous guided vehicles (AGVs) are widely used for different applications, such as military tasks, disaster recovery and safety operations during natural calamities, astronomy, lawn mowing, and agriculture. To improve workflow, optimize functionality, and reduce manual labor, AGVs need precise route plans guided by advanced sensing technology. However, the biggest challenges for these systems today are sensing the surrounding environment and applying the detected information to control vehicle motion. The environmental problems for AGVs include navigating through off-road terrain while sensing a variety of ferrous and non-ferrous materials; coping with illumination effects such as underexposure, overexposure, and shadow; and working in bad weather conditions [1]. Detecting organic obstacles such as tall grass, tufts, and small bushes to find a traversable path is difficult for AGVs. In addition to organic obstacles, camouflaged animals in the field are endangered when AGVs are working, as demonstrated in previous research [2]. Furthermore, on bumpy and dirt roads, AGVs should constantly scan their surroundings to determine a traversable path based on the size of bumps (positive obstacles) and holes (negative obstacles) [1]. In general, AGVs should efficiently classify different obstacles (organic obstacles, positive and negative obstacles, hidden obstacles, etc.) to navigate through a field, which is currently a challenge. To help AGVs classify different obstacles using various sensors and avoid potential risks, this study aims to find the right sensors for AGVs by reviewing the latest technologies.
The advanced spatial sensing systems (including active and passive sensors) equipped on AGVs have the potential to detect all kinds of obstacles on the route. To navigate clearly through vegetation, it is necessary to find suitable sensors that work in all weather and light conditions, identify obstacles of various material composition within an acceptable processing time, and are efficient, safe, robust, and cost competitive. Sensor selection depends on the specific application and environmental conditions. The following section illustrates the importance of the electromagnetic spectrum for sensor selection.

Sensors and the Electromagnetic Spectrum
To detect obstacles through vegetation, it is important to know where vegetation lies in the electromagnetic spectrum. Because plants use light for photosynthesis, their remote sensing application range spans the ultraviolet (UV) (10-380 nm), visible (450-750 nm), and infrared (850-1700 nm) spectra [3].
Based on the electromagnetic spectrum range for detecting plants, active and passive sensors are used for detection. Passive sensors detect electromagnetic radiation or light reflected from the objects. They work well with the visible, infrared, thermal infrared, and even the microwave segment. Unlike passive sensors, active sensors have their own energy source: they emit pulsed energy and receive the reflected energy to detect objects. Active sensors work well with the radio wave segment, and typical sensors include LiDAR and radar. Along with these sensors, position estimation sensors such as accelerometers, gyroscopes, and the global positioning system (GPS) [4,5] are needed to track the vehicle's location, orientation, and velocity. The information from the position estimation sensors is further needed for synchronizing and registering subsequent frames from the imaging sensors. Tight integration of the position solution with object detection is needed if object persistence principles are used in the object detection algorithms. As shown in Figure 1, one of the passive sensors detects hidden objects in the field, such as rocks, and passes the data to the estimation sensor, which guides the vehicle in the right direction without damaging the vehicle. The sensors allow self-driving vehicles such as AGVs to detect obstacles, while generating large amounts of sensing data, including point or pixel-based data. Processing those data is a challenge.
The popular data-processing methods focus on point and pixelwise image classification, commonly referred to as semantic segmentation, which serves as a generic representation that allows for subsequent clustering, tracking, or further fusion with other modalities. Semantic segmentation can first be used to separate objects from the background [6,7]. Obstacle avoidance can then be performed using a 2D or 3D bounding box to describe object location, trajectory, and size. Vehicle motion behavior can further distinguish passable (traversable) obstacles from non-passable (non-traversable) obstacles.

Passive Sensors
Passive sensors measure the reflected solar electromagnetic energy from a surface in the presence of light [8,9]. These sensors do not have their own source of light and hence their performance is affected at night or by poor illumination conditions, except for thermal sensors. These sensors include cameras and computer vision technology that measures the distance of an object by receiving information about the position of the object [10]. Stereo cameras and RGB cameras detect protruding objects during the day. Thermal cameras work well in all light conditions. The following sections provide details on the passive sensors that may be used for obstacle detection under vegetation.

Stereo Cameras
There are two types of systems for passive sensing: monovision systems and stereovision systems. In the monovision system, one camera is used to estimate the distance based on reference points in the camera field of view. The drawbacks of the monovision method lie in estimating distances, recognizing the detected objects, and the complexity of the algorithm needed to classify the object categories while matching the real dimensions of the objects in different positions [11]. The stereovision system uses stereoscopic ranging techniques to calculate distance. This system is effective for depth sensing, using two cameras as one to compute the distance with high accuracy [10]. The depth estimate is constrained by the baseline distance between the two cameras. A short baseline results in limited depth accuracy, whereas wide-baseline cameras provide better depth accuracy but result in partial loss of spatial data and frequent occlusions [12].
A stereo camera generates information on depth and 2D color image. Zaarane et al.'s [8] method starts with capturing the scenes using both cameras. A vehicle detection algorithm (obstacle detection) is used in one image and a stereo matching algorithm is used on the other to match the detected vehicle (obstacle) [8]. The horizontal centroids of both objects are used to calculate and measure the distance and detect the depth of the obstacle.
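The centroid-based distance calculation can be illustrated with the standard pinhole relation for a rectified stereo pair, Z = f·B/d, where d is the difference between the horizontal centroids. The following is a minimal sketch under that assumption, not the implementation of [8]; the function names and the numeric values (focal length, baseline, centroid positions) are hypothetical.

```python
def stereo_distance(focal_length_px, baseline_m, x_left_px, x_right_px):
    """Distance (m) to an object from its horizontal centroids in a rectified pair."""
    # Disparity: horizontal shift of the same object between the two images.
    disparity_px = x_left_px - x_right_px
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for an object in front of the cameras")
    return focal_length_px * baseline_m / disparity_px


def depth_uncertainty(focal_length_px, baseline_m, distance_m, disparity_error_px=1.0):
    """First-order depth error (m) caused by a given disparity error (px)."""
    return distance_m ** 2 / (focal_length_px * baseline_m) * disparity_error_px


if __name__ == "__main__":
    # Hypothetical camera: 700 px focal length, 12 cm baseline.
    z = stereo_distance(700.0, 0.12, 420.0, 400.0)
    print("distance (m):", z)
    print("depth error, 12 cm baseline (m):", depth_uncertainty(700.0, 0.12, z))
    print("depth error, 24 cm baseline (m):", depth_uncertainty(700.0, 0.24, z))
```

The second function shows why a wider baseline improves depth accuracy: for the same disparity error, the depth error shrinks in proportion to the baseline.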
Protruding objects and visually camouflaged animals can be identified with both the depth information and the 2D imaging data. In this way, depth-aware algorithms can be created based on the different perceptible characteristics (e.g., color, texture, shape, etc.) of the object and the depth data. The drawback of using a stereo camera is that the image quality and the detection of obstacles are badly affected by illumination and weather conditions.

RGB Cameras
An RGB camera captures the visible segment of the electromagnetic spectrum to provide information on the identified object's color, texture, and shape in high resolution at a low cost. For RGB cameras, non-protruding (non-exposed) objects in tall grasses are noticeable to some extent. The performance of the camera, however, is affected by bad weather conditions (e.g., fog, snow, and rain) and illumination conditions such as low light, night conditions, and shadows. An RGB camera only provides 2D image data and is unable to provide depth information of objects in 3D space. To compensate for the loss of depth information and obtain the positions of surrounding obstacles, the technologies of visual simultaneous localization and mapping (SLAM) [13,14] and structure from motion (SFM) [15,16] have been developed [17]. SLAM and SFM use multi-view geometry to estimate the motions (rotation and translation) and reconstruct the unknown surrounding environment. However, their performance is limited in large scenes or during quick movements.

Thermal Infrared Cameras
All objects at a temperature above absolute zero emit infrared (or thermal) radiation. Infrared radiation lies within the wavelength spectrum of 0.7-1000 µm. The mid-wavelength and long-wavelength regions of the infrared spectrum are often referred to as thermal infrared (TIR). The detector of a TIR camera is sensitive to either mid-wave infrared (MWIR) (3-5 µm) or long-wave infrared (LWIR) (7-14 µm) radiation. Objects in the temperature range of approximately 190-1000 Kelvin emit radiation in this spectral range [18]. Thermal cameras detect the electromagnetic radiation emitted by the target objects to form thermal images, which illustrate the heat and not the visible light of objects. They continuously provide daytime and nighttime thermal image data for passive terrain perception under any light, as well as in foggy conditions. Rankin et al. [19] introduced a TIR camera to provide imagery over the entire 24-h cycle.
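As a rough illustration of why the MWIR and LWIR bands suit objects in this temperature range, Wien's displacement law gives the wavelength at which an ideal blackbody emits most strongly. The sketch below assumes ideal blackbody behavior, which real vegetation, soil, and animals only approximate; the function names are hypothetical, and the band limits are the ones quoted above. A blackbody emits over a broad spectrum around the peak, so a peak outside a band does not mean the object is invisible in that band.

```python
WIEN_B_UM_K = 2897.771955  # Wien displacement constant in micrometre-kelvin


def peak_wavelength_um(temperature_k):
    """Wavelength (µm) of peak blackbody emission at the given temperature (K)."""
    if temperature_k <= 0:
        raise ValueError("temperature must be above absolute zero")
    return WIEN_B_UM_K / temperature_k


def tir_band(wavelength_um):
    """Name the thermal infrared band containing a wavelength, using the limits in the text."""
    if 3.0 <= wavelength_um <= 5.0:
        return "MWIR"
    if 7.0 <= wavelength_um <= 14.0:
        return "LWIR"
    return "outside MWIR/LWIR"


if __name__ == "__main__":
    for temperature_k in (190.0, 300.0, 700.0, 1000.0):
        peak = peak_wavelength_um(temperature_k)
        print(temperature_k, "K ->", round(peak, 2), "µm,", tir_band(peak))
```

Objects near ambient temperature (about 300 K) peak inside the LWIR band, which is consistent with LWIR cameras being used to image people and terrain.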
An emissivity signature is a potential way to distinguish vegetation from objects or materials such as soil or rock. Each object has a specific emissivity for each spectral band at a certain temperature, and thus objects cluster in different regions of the infrared color space according to their emissivity, making it possible to separate them into several classes. Vegetation and soil/rock materials have sufficiently different emissivity in both the broad MWIR and LWIR bands, which is illustrated in a study conducted by researchers from the California Institute of Technology, Jet Propulsion Laboratory (JPL) [19].

Hyperspectral Sensing
Hyperspectral sensing is the technology of obtaining information about the chemical composition of an object from its emitted energy. The process involves dividing the electromagnetic spectrum into several narrow bands to read the spectral signatures of the materials in the generated image. This makes it easy to identify objects in a scene [20]. To detect a targeted object, its spectral signature is obtained and compared against the image using spectral signature matching algorithms as part of the hyperspectral image analysis.
Kwon et al. [21] provided the overall idea of using hyperspectral sensing technology for obstacle detection in military applications. Three hyperspectral sensors with different operating spectral ranges, a dual band hyperspectral imager (DBHSI), an acousto-optical tunable filter imager called SECOTS, and a visible to near-infrared spectral imager, SOC-700, were used to provide spectral data to the hyperspectral detection algorithms. DBHSI operates at the mid- and long-wave infrared bands and collects 128 bands of images simultaneously with a dual-color focal plane array to obtain hyperspectral images in two separate infrared spectral regions. SECOTS and SOC-700 are small, portable hyperspectral imagers operating at the visible to near-infrared bands. Newly developed hyperspectral anomaly and target detection algorithms were applied to the generated hyperspectral images to detect objects such as military vehicles, barbed wire, and a chain-link fence [21]. This hyperspectral sensing system is expected to help unmanned guided vehicles navigate safely in an unknown area.
It is possible to detect camouflaged animals using this system. However, it requires more data storage, and identifying objects is computationally complex because of hyperspectral band selection. Gomez [20] found that taking advantage of sensors such as radar and LiDAR, in addition to hyperspectral imaging, is advisable for developing remote sensing program strategies to produce specific application products.
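One widely used way to match a pixel spectrum against reference signatures is the spectral angle mapper, which treats each spectrum as a vector and compares directions rather than magnitudes. The sketch below is an illustration of that general technique, not the algorithm used in the cited studies; the reference library and its reflectance values are invented for the example.

```python
import math


def spectral_angle(spectrum_a, spectrum_b):
    """Angle (radians) between two spectra; 0 means identical spectral shape."""
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(a * b for a, b in zip(spectrum_a, spectrum_b))
    norm_a = math.sqrt(sum(a * a for a in spectrum_a))
    norm_b = math.sqrt(sum(b * b for b in spectrum_b))
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)


def classify_pixel(pixel_spectrum, library):
    """Return the library label whose signature has the smallest spectral angle."""
    return min(library, key=lambda label: spectral_angle(pixel_spectrum, library[label]))


if __name__ == "__main__":
    # Hypothetical four-band reflectance signatures (values are illustrative only).
    library = {
        "vegetation": [0.05, 0.08, 0.45, 0.50],
        "rock": [0.20, 0.25, 0.30, 0.32],
    }
    bright_leaf = [0.10, 0.16, 0.90, 1.00]  # same shape as vegetation, twice as bright
    print(classify_pixel(bright_leaf, library))
```

Because the angle ignores overall brightness, a sunlit and a shaded pixel of the same material map to the same class, which is one reason angle-based matching is popular for outdoor scenes.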

Active Sensors
Active sensing methods measure the distance of objects by sending pulse signals to a target and receiving the signal bounced back, which are generally based on computing the time of flight (ToF) of laser, ultrasound, or radio signals of the electromagnetic spectrum to measure and search for objects [10]. LiDAR and radar are both active range sensors that provide distance measurements useful for detecting obstacles based on geometry, whereas passive camera sensors (e.g., color, thermal) provide visual clues useful for discriminating object classes.
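The ToF principle reduces to a single relation: the pulse travels to the target and back, so the range is half the round-trip time multiplied by the propagation speed. The following minimal sketch assumes propagation in air at a constant speed; the function name and the example timings are hypothetical.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0  # laser and radio signals
SPEED_OF_SOUND_M_S = 343.0          # ultrasound in air at about 20 °C (assumed)


def tof_distance(propagation_speed_m_s, round_trip_time_s):
    """Range (m) to a target from the round-trip time of an emitted pulse."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # Divide by two because the pulse covers the distance twice.
    return propagation_speed_m_s * round_trip_time_s / 2.0


if __name__ == "__main__":
    print("laser, 200 ns round trip (m):", tof_distance(SPEED_OF_LIGHT_M_S, 200e-9))
    print("ultrasound, 10 ms round trip (m):", tof_distance(SPEED_OF_SOUND_M_S, 0.01))
```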

LiDAR Sensors
LiDAR sensors use the ToF of reflected laser pulses to measure distances and detect objects [22]. A LiDAR sensor emits billions of pulses per second up to 360° in all directions, thus generating a 3D matrix for the surrounding environment. Depending on the specs of different products, a LiDAR sensor generates up to millions of distance measurement points, as well as information on the position, shape, and movement of objects in seconds.
Although the benefits of LiDAR sensors are obvious, the main problems for obstacle detection and recognition in agricultural environments are (1) point cloud classification and (2) multimodal fusion. Point cloud classification deals with the issue of discriminating 3D point structures based on their shapes and neighborhoods for various applications, such as vehicle identification and tracking [23], pedestrian-vehicle near-miss detection [24], and background filtering [25]. Kragh [26] proposed two methods for point classification of LiDAR-acquired 3D point clouds, which address sparsity and local point neighborhoods and were used for consistent feature extraction across entire point clouds. One method, based on a traditional processing pipeline, outperformed a generic 3D feature descriptor designed for dense point clouds. The other method used a 2D range image representation and performed semantic segmentation [27] in 2D with deep learning. Together, the two methods showed that sparsity in LiDAR-acquired point clouds can be addressed intelligently by utilizing the known sample patterns. A combination of multiple representations may therefore accumulate the benefits and potentially provide increased accuracy and robustness [26]. To use LiDAR effectively for sensing obstacles in vegetation, it can be combined with stereo cameras so that the point cloud data are analyzed together with 2D images.
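A 2D range image can be obtained from a 3D point cloud by binning each point by its azimuth and elevation angle and storing the range in the corresponding cell. The sketch below illustrates this general idea only and is not the method of [26]; the image size and the vertical field of view are hypothetical parameters that would be set from the scanner's known sample pattern.

```python
import math


def to_range_image(points, rows=32, cols=360, elev_min_deg=-30.0, elev_max_deg=10.0):
    """Project (x, y, z) points into a rows x cols grid of ranges (None = no return)."""
    image = [[None] * cols for _ in range(rows)]
    for x, y, z in points:
        point_range = math.sqrt(x * x + y * y + z * z)
        if point_range == 0.0:
            continue  # a point at the sensor origin has no defined direction
        azimuth_deg = math.degrees(math.atan2(y, x)) % 360.0
        elevation_deg = math.degrees(math.asin(z / point_range))
        if not (elev_min_deg <= elevation_deg <= elev_max_deg):
            continue  # outside the assumed vertical field of view
        col = min(int(azimuth_deg / 360.0 * cols), cols - 1)
        row_fraction = (elev_max_deg - elevation_deg) / (elev_max_deg - elev_min_deg)
        row = min(int(row_fraction * rows), rows - 1)
        # Keep the nearest return when several points fall into one cell.
        if image[row][col] is None or point_range < image[row][col]:
            image[row][col] = point_range
    return image


if __name__ == "__main__":
    cloud = [(10.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 0.0, 10.0)]
    image = to_range_image(cloud)
    filled = sum(value is not None for row in image for value in row)
    print("filled cells:", filled)
```

Once the cloud is in image form, 2D segmentation methods developed for camera images can be applied to it, which is the motivation for the range image representation described above.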
Multimodal fusion can increase classification robustness and confidence. It addresses the question of how LiDAR technology can work with other sensing modalities in agricultural environments. Kragh [26] proposed and evaluated methods for sensor fusion between LiDAR and other sensors, incorporating spatial, temporal, and multimodal relationships to increase detection accuracy and thus enhance the safety of self-driving vehicles. The method consists of a self-supervised classification system using LiDAR to continuously supervise a visual classifier of traversability. LiDAR and camera data are then fused at the decision level with deep learning on range images. Larson and Trivedi [28] put forward a LiDAR-based method that uses geometric features of the outline of concave obstacles, which are sent to a support vector machine (SVM) classifier to detect obstacles. However, the laser can be reflected repeatedly inside a pit, resulting in the loss of observational information, and thus the suggestion is to combine LiDAR with other sensors, such as thermal cameras, for better concave obstacle detection.

Radar
Radar emits radio waves toward a target area and monitors reflections from the objects within the area, generating position and distance data for the objects [29]. Object detection in vegetation using radar has been studied for remote sensing of trees and crops [30]. The detection performance of radar depends on two critical factors: the penetration depth of radar waves through the vegetation and the angular (or spatial) resolution of the radar system. Penetration depth is the depth at which the signal strength of the radar drops to 1/e (37%) [30] of its original value. The wavelengths used for remote sensing in vegetation range from 70 cm (~0.5 GHz) to 1 cm (~30 GHz) [30]. Microwave attenuation through vegetation generally increases with frequency [18]. The effects used for obstacle classification by radar can be summarized as attenuation, backscatter, phase variation, and depolarization [31]. Radar has been used successfully for perception on autonomous vehicles in the presence of dust and fog [32].
Convex obstacles such as trees, slopes, and hills can be detected easily by radars, as they have a strong penetrating ability and can work well in bad weather conditions. Jing et al. [33] used a Doppler-feature-based method to find the height of obstacles and classify convex obstacles. Because the accuracy is low, the millimeter wave radar is expected to be fused with other sensors.
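The 1/e definition of penetration depth corresponds to an exponential decay of signal strength with depth, and wavelength and frequency are related through the speed of light. The sketch below illustrates both relations under the assumption of a homogeneous vegetation layer; the function names and the 2 m penetration depth are hypothetical values chosen for the example.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def frequency_ghz(wavelength_cm):
    """Frequency (GHz) of a radio wave with the given free-space wavelength (cm)."""
    return SPEED_OF_LIGHT_M_S / (wavelength_cm / 100.0) / 1e9


def remaining_fraction(depth_m, penetration_depth_m):
    """Fraction of the original signal strength left after travelling depth_m."""
    if penetration_depth_m <= 0:
        raise ValueError("penetration depth must be positive")
    return math.exp(-depth_m / penetration_depth_m)


if __name__ == "__main__":
    print("70 cm wavelength (GHz):", round(frequency_ghz(70.0), 3))
    print("1 cm wavelength (GHz):", round(frequency_ghz(1.0), 3))
    # Hypothetical canopy with a 2 m penetration depth.
    for depth_m in (1.0, 2.0, 4.0):
        print(depth_m, "m ->", round(remaining_fraction(depth_m, 2.0), 3))
```

At a depth equal to the penetration depth the remaining fraction is 1/e, about 37%, which matches the definition given above.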
One experiment by Gusland et al. [34] used radar with a pulse length of 250 µs, 0.4-GHz bandwidth, and four horizontally polarized transmit antennas. The obstacles included in this experiment were a small rock, large rock, concrete support, tree stub, and paint can. The experiment confirmed system functioning and cross-range resolution improvement due to the multiple-input, multiple-output configuration, and the initial results indicate that the system is capable of detecting obstacles hidden in vegetation. The contrast between relevant obstacles and vegetation clutter is of critical importance, defined as the maximum reflected power of the resolution cell containing the obstacle compared with the maximum of the vegetation surrounding it [34].

Comparison of Sensors
Table 1 summarizes the advantages and disadvantages of each sensor and the potential use of each sensor to detect obstacles. RGB cameras are low-cost compared with other sensors. The disadvantage of this sensor is that it cannot sense the depth of the object or work well in bad weather. Comparatively, stereo cameras can detect and sense obstacle depth, as well as identify camouflaged animals and protruding objects. A key ability of thermal cameras is that their sensing data are not affected by visual camouflage or illumination effects, although their performance is affected by the ambient temperature [26]. Another drawback of thermal cameras is the low resolution and loss of range data when the camera is in motion, as well as when there is texture difference.
Hyperspectral imaging sensors provide better resolution and acquire images across several narrow spectral bands ranging from the visible region to mid-infrared region of the electromagnetic spectrum [35][36][37][38][39][40]. Thus, object identification and classification become easy. Although equipped with a wide spectral band, sensor performance is affected by variation in illumination, and the system is not robust with respect to the environment [41].
LiDAR provides more accurate depth information for a longer range while capturing the data horizontally up to 360°. The drawback of LiDAR is in recognizing the objects due to the lack of visual and thermal information [42]. Radar also lacks good resolution due to the sparsity of its data, thereby making object recognition a challenging task.
Thus, both active and passive sensors have certain limitations. The main inconvenience of using active sensors is the potential confusion of echoes from subsequent emitted pulses resulting in sparse data, and the accuracy range of distance for these systems is usually bounded between 1 and 4 m [10]. Although active sensors have certain disadvantages, their vision characteristics are not affected by climatic conditions. Comparatively, passive sensors have dense data but are affected by climatic conditions. Hence, fusing active and passive sensors can be an effective way to compensate for drawbacks of each type.

Fusion of Sensors Applications
The fusion of sensors is performed at two levels: low-level and high-level. Low-level fusion combines raw data of different sensors, and high-level fusion involves combining data from the sensors to classify objects into different categories [42]. At low-level fusion, the outputs of different sensors are fused together, and at high-level fusion, the outputs of sensors are categorized as obstacles detected by sensors. Cameras provide dense texture and semantic information about the scene, but have difficulty directly measuring the shape and location of a detected object [43]. LiDAR provides an accurate distance measurement of an object; however, precise point cloud segmentation for object detection involves computational complexity due to the sparsity in horizontal and vertical resolution of the scanning points. Radar provides object-level speed and location via range and range-rate but does not provide an accurate shape of the objects. Thus, passive sensors are used for sensing the appearance of the environment and the objects, whereas active sensors are used for geometric sensing. There have been several approaches made to combine different sensors to detect obstacles hidden under vegetation. The following sections identify popular fusion methods from recent research.
No single sensor can detect objects reliably on its own in all weather conditions. Active sensors such as radar and LiDAR, and passive sensors such as RGB cameras, stereo cameras, thermal cameras, and hyperspectral sensors, have different pros and cons concerning illumination, weather, working range, and resolution, and thus a combination of these sensors is needed to accommodate various working environments [30].

RGB and Infrared Camera
Microsoft's Kinect sensor uses structured-light and ToF-based RGB-D cameras to detect obstacles [44]. It is a typical RGB and infrared combination camera. Because these cameras emit their own light, they are more robust in low-light environments. However, they are not suitable for outdoor environments because the laser speckle is washed out under strong light. ToF depth cameras emit infrared light and measure its return from the observed object to obtain depth. They provide accurate depth measurements but have low resolution and significant acquisition noise [45].
Nissimov et al. [46] used the sensor to detect obstacles in a greenhouse robotic application. The RGB camera in the sensor takes images, and the infrared laser emitter projects infrared dots that are captured by the infrared camera to sense the depth of the scene. Thus, the sensor uses color and depth information to detect obstacles. The gradient of the image is calculated, and based on a threshold value of the gradient, the object is classified as either traversable or non-traversable. Further, to improve detection, a local binary pattern (LBP) texture analysis was conducted for the pixels near the detected objects. The paper confirmed that with the texture analysis, the system can detect unidentified objects near the detected objects. However, the drawback of this system is that it cannot be operated in all lighting conditions. The sensor requires the returning beams of the infrared emitter to be easily detectable within the image of the infrared camera, and thus the beams must be significantly stronger than the ambient light [46].
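The gradient-threshold step can be sketched on a small depth image: pixels where depth changes sharply between neighbors are flagged as non-traversable. This is a minimal illustration, not the implementation of [46]; the central-difference gradient, the border handling, and the threshold value are assumptions made for the example.

```python
def non_traversable_mask(depth, threshold):
    """Return a grid of booleans: True where the depth gradient magnitude exceeds threshold."""
    rows, cols = len(depth), len(depth[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Central differences, with neighbors clamped at the image border.
            grad_x = depth[r][min(c + 1, cols - 1)] - depth[r][max(c - 1, 0)]
            grad_y = depth[min(r + 1, rows - 1)][c] - depth[max(r - 1, 0)][c]
            magnitude = (grad_x ** 2 + grad_y ** 2) ** 0.5
            mask[r][c] = magnitude > threshold
    return mask


if __name__ == "__main__":
    # Hypothetical depth image (m): flat floor at 2 m with an object edge at 1 m.
    depth = [
        [2.0, 2.0, 2.0, 1.0, 1.0],
        [2.0, 2.0, 2.0, 1.0, 1.0],
    ]
    for row in non_traversable_mask(depth, threshold=0.5):
        print(row)
```

Only the pixels on either side of the depth step are flagged, so a texture analysis of the neighboring pixels, as described above, is a natural follow-up for recovering the full extent of the object.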

Radar and Stereo Camera
Reina et al. [47] fused a radar and a stereo camera. Radar is a good sensor for range measurement, and stereo cameras provide clear resolution for detected objects. Thus, combining these sensors results in improved 3D localization of obstacles up to 30 m. Radar is used to obtain 2D points of the detected obstacles, which are then augmented with stereo camera data to obtain the obstacle information. The intensity range of the radar beam determines the obstacle detection performance. The detected obstacle range and contour are used to select a sub-cloud in the stereo-generated 3D point cloud. The sub-cloud is the volume of interest around a given radar-labeled obstacle in the stereo-generated 3D point cloud. This helps to obtain stereoscopic 3D geometric and color information of the detected object.
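The sub-cloud selection can be sketched as a spatial filter: stereo points that lie close to a radar-labeled obstacle position in the ground plane are kept, and the rest are discarded. The sketch below assumes both sensors are already expressed in a common vehicle frame and uses a vertical cylinder as the volume of interest; these choices, the function name, and the radius are hypothetical and are not taken from [47].

```python
def extract_sub_cloud(stereo_cloud, radar_xy, radius_m):
    """Keep stereo points (x, y, z) within radius_m of the radar detection in the x-y plane."""
    radar_x, radar_y = radar_xy
    radius_sq = radius_m ** 2
    return [
        point
        for point in stereo_cloud
        if (point[0] - radar_x) ** 2 + (point[1] - radar_y) ** 2 <= radius_sq
    ]


if __name__ == "__main__":
    # Hypothetical stereo point cloud in the vehicle frame (m).
    stereo_cloud = [(10.0, 0.0, 0.5), (10.3, 0.2, 1.0), (20.0, 5.0, 0.2)]
    radar_detection = (10.0, 0.0)  # 2D obstacle position reported by the radar
    print(extract_sub_cloud(stereo_cloud, radar_detection, radius_m=0.5))
```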
Jha, Lohdi, and Chakravarty [48] discussed the fusion of radar and stereo cameras, giving real-time information on the environment for navigation. The research used the 76.5-GHz millimeter wave radar to provide information on the range and azimuth of the detected objects. The camera output data were collected and processed framewise in real time by the YOLOv3 algorithm [49]. Thus, the weights of the trained YOLOv3 model were used to detect and identify objects, which were mapped with radar data to find the distance and angle of the objects for vehicle navigation. However, the study does not clearly explain the functionality of the fusion for detecting hidden objects; rather, it provides good results for object detection and identification.

LiDAR and Camera
Some researchers used self-supervised systems in which one device is used to monitor the other device (stereo-radar, RGB-radar, and RGB-LiDAR [42]) and improve the detection performance. The difference between such a system and sensor fusion is that fusion combines the sensor data to provide more accurate data, rather than having one sensor supervise the other. According to Kragh and Underwood [42], fusing LiDAR points with camera images assumes perfect calibration between the camera and the LiDAR sensor. The calibration of sensors is challenging and involves much computational effort.
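The role of calibration can be seen in the projection of a LiDAR point into a camera image: the extrinsic calibration (rotation and translation) moves the point into the camera frame, and the intrinsic parameters map it to a pixel. The sketch below assumes an ideal pinhole camera without lens distortion and a camera frame with the z axis pointing forward; all names and numeric values are hypothetical.

```python
def project_lidar_point(point_lidar, rotation, translation, fx, fy, cx, cy):
    """Project a LiDAR point (x, y, z) to pixel (u, v); None if behind the camera."""
    # Extrinsic calibration: p_camera = R * p_lidar + t.
    point_camera = [
        sum(rotation[i][j] * point_lidar[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    x_cam, y_cam, z_cam = point_camera
    if z_cam <= 0:
        return None  # the point is not in front of the image plane
    # Intrinsic calibration: perspective division, then focal length and principal point.
    return (fx * x_cam / z_cam + cx, fy * y_cam / z_cam + cy)


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    no_offset = [0.0, 0.0, 0.0]
    # Hypothetical 640 x 480 camera with a 500 px focal length.
    print(project_lidar_point((1.0, 0.0, 5.0), identity, no_offset, 500.0, 500.0, 320.0, 240.0))
```

A small error in the rotation shifts the projected pixel by an amount that grows with distance, which is why fusion at the pixel level is sensitive to calibration quality.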
Semantic segmentation using LiDAR and a camera captures objects that are not easily described by bounding boxes. In the LiDAR and camera fusion shown in Figure 2, visual information from color cameras supports environmental sensing, and 3D LiDAR information serves to distinguish flat, traversable ground areas from non-traversable elements (including trees and other obstacles in the path) [42]. The appearance- and geometry-based detection processes are combined using a conditional random field (CRF) [50], which predicts objects based on the current data provided by the LiDAR and camera. The results from Kragh and Underwood [42] show that, for a two-class classification problem (ground and nonground), LiDAR distinguishes ground and nonground structures without the aid of the camera. To classify objects into more classes (ground, sky, vegetation, and object), both devices are needed to complement each other and improve performance using the CRF [42].
Vehicle and pedestrian detection use a common fusion process of camera and LiDAR, where LiDAR data generate the regions of interest (ROIs) on the image by extrinsic calibration and then an image-based segmentation and detection method detects vehicles and pedestrians [51][52][53][54].
Fu et al. [43] used a deep fusion architecture based on a convolutional neural network to fuse LiDAR data with RGB images and complete the depth map of the environment. This method can deal with sensor failure for AGVs, improving the robustness of the perception system. Starr and Lattimer [55] combined a thermal camera with a 3D LiDAR to obtain good results for low-visibility detection. In their work, the two sensors detect objects separately, and the detections are then combined using an evidential fusion method (Dempster-Shafer theory [56]). Similarly, Zhang [57] first proposed a two-step method of calibration between a 3D LiDAR and a thermal camera. The fusion algorithms between these two sensors are not limited to low-level fusion but can be extended to high-level fusion.
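Evidential fusion of two independent detections can be illustrated with Dempster's rule of combination, in which each sensor assigns belief mass to "obstacle", "free", or "unknown" (either). The sketch below is a generic two-hypothesis example and does not reproduce the formulation of [55]; the mass values for the thermal camera and LiDAR are invented for illustration.

```python
OBSTACLE = frozenset({"obstacle"})
FREE = frozenset({"free"})
UNKNOWN = OBSTACLE | FREE  # mass not committed to either hypothesis


def combine_masses(masses_a, masses_b):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for set_a, mass_a in masses_a.items():
        for set_b, mass_b in masses_b.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b  # the two sensors contradict each other
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    # Renormalize so the remaining masses sum to one.
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


if __name__ == "__main__":
    # Hypothetical per-cell evidence from each sensor.
    thermal = {OBSTACLE: 0.6, UNKNOWN: 0.4}
    lidar = {OBSTACLE: 0.5, FREE: 0.2, UNKNOWN: 0.3}
    fused = combine_masses(thermal, lidar)
    for subset, mass in fused.items():
        print(sorted(subset), round(mass, 3))
```

When both sensors lean toward "obstacle", the fused mass for that hypothesis is higher than either sensor's alone, and the explicit "unknown" mass lets a degraded sensor contribute little instead of a wrong answer.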

LiDAR and Radar
Another study [58] used low-power, ultra-wideband radar sensors in combination with higher-resolution range imaging devices (such as LiDAR and stereovision) for AGVs. This combination of LiDAR and radar was effective at sensing sparse vegetation but less effective at sensing dense vegetation. This suggests treating dense vegetation as a non-traversable path and sparse vegetation as a traversable path, which is an advantage when detecting obstacles through vegetation [59].
Kwon [59] presents interesting research on detecting partly blocked pedestrians using LiDAR and radar. The method considers the occluded depth projection of the object to determine whether a blocked object is present. According to this study, radar detects a partially blocked object more easily than LiDAR because of its Doppler (change in frequency) pattern. A partially blocked pedestrian is therefore detected by combining the occluded-depth LiDAR data, matched against a human characteristic curve, with the radar Doppler distribution of a pedestrian. This sensor fusion scheme is useful in an AGV for detecting hidden objects and preventing collisions [32]. Similarly, LiDAR point-cloud classification based on object characteristics could be tested on occluded, hidden objects. Table 2 summarizes the sensor fusion used for different applications and shows areas for potential research. For example, for detecting objects in sparse vegetation or partially hidden objects, the LiDAR and radar combination works better. Radar and stereo camera are suitable for range measurement, object detection, and location tracking.
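To make the Doppler cue discussed above concrete, the short sketch below evaluates the standard two-way Doppler relation f_d = 2 v f_c / c for a monostatic radar. The 77 GHz carrier and the 1.5 m/s walking speed are assumptions for illustration and are not taken from [58] or [59]; the function name doppler_shift_hz is likewise hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def doppler_shift_hz(radial_speed_mps, carrier_hz):
    """Two-way Doppler shift of a monostatic radar: f_d = 2 * v * f_c / c."""
    return 2.0 * radial_speed_mps * carrier_hz / SPEED_OF_LIGHT


# Assumed carrier frequency (typical automotive band, not from the cited work).
CARRIER_HZ = 77e9

# Assumed radial speeds: a walking pedestrian versus motionless vegetation.
pedestrian_shift = doppler_shift_hz(1.5, CARRIER_HZ)
static_shift = doppler_shift_hz(0.0, CARRIER_HZ)

print(f"pedestrian: {pedestrian_shift:.1f} Hz, static clutter: {static_shift:.1f} Hz")
```

Under these assumptions, a moving pedestrian produces a shift of several hundred hertz while stationary vegetation produces none, which is why a Doppler signature can separate a partly occluded person from the clutter that hides them from LiDAR.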

Multimodal Sensors
Kragh [26] established a flexible vehicle-mounted sensor platform in Denmark (Figure 3), which recorded imaging and position data from a moving vehicle using an RGB camera (Logitech HD Pro C920), thermal camera (FLIR A65, 13 mm), stereo camera (Multisense S21 CMV2000), LiDAR (Velodyne HDL-32E), radar (Delphi ESR), and two position estimation sensors, GPS (Trimble BD982 GNSS) and IMU (Vectornav VN-100). The platform illustrated in Figure 3 includes seven sensors; the thermal and stereo cameras are linked to a frame grabber and provide image data to the controller via Ethernet. The platform collected real-time data from all the sensors, and the data were processed offline.
One experiment tested the detection of a human, a barrel, and a mannequin hidden in the field using this multimodal sensing platform. The experiment was conducted on an open lawn with high grass, and the objects were a partly hidden barrel, a lying child mannequin, and a sitting human (Figure 4). The RGB camera (leftmost column of Figure 4) was able to detect the sitting human but not the hidden human. Similarly, the stereo camera (column next to the RGB images) detected only the sitting human and not the barrel. The LiDAR sensor (rightmost column) was more reliable and returned reflections from both the sitting human and the protruding barrel, but not from the lying child mannequin. Among all the sensors, the thermal camera (column next to the stereo camera images) achieved robust detection performance for all three objects. However, the thermal camera is affected by warm weather, in which the temperatures of the objects and their surroundings are similar, creating sensing issues [26]. This study [26] thus indicates that a combination of sensors is required for effective detection of all hidden or partly hidden objects in the field.
Figure 4. Barrel, child mannequin, and human detection experiment result [26].
The experiments and research studies show that using multiple sensors is difficult because it requires fusing data from various sensors, which adds to computational complexity. The data set must be expanded to include inclement weather conditions to provide a thorough evaluation for complicated agricultural environments. For future work, experiments should be carried out on moving obstacles or organic objects (such as animal bodies).

Conclusions and Recommendations
This study is a preliminary exploration of state-of-the-art obstacle detection in vegetation, introducing a series of sensors and technologies and discussing their attributes in various environments. Using multiple sensors adds to computational complexity; thus, we found that fusing two sensors would yield effective results while keeping the computational burden reasonable. As discussed, active sensors are effective for geometric sensing and accurate ranging of obstacles, and passive sensors provide a high-resolution appearance of the environment and objects. Therefore, fusing data from an active and a passive sensor was found to detect hidden obstacles in vegetation effectively. Sensor selection should be performed according to the application and environment. Radar is an affordable active sensor that penetrates vegetation and can be fused with a stereo camera. The second-most effective option is LiDAR in combination with radar or a stereo camera, which is effective in detecting objects. Thermal cameras effectively detect hidden objects; however, a combination of these sensors that shares their advantages is worth exploring for detecting obstacles in complicated vegetation environments. In addition, because the vehicles carrying the sensors are moving, stable and reliable detection from the sensors on a self-driving vehicle should be studied as well. Improving working conditions in the vegetation object detection sector, including policy support and technological advancements, as well as integrating other services, is critical for further deploying these sensor technologies in the real world.