Research Status and Prospects on Plant Canopy Structure Measurement Using Visual Sensors Based on Three-Dimensional Reconstruction

Abstract: Three-dimensional (3D) plant canopy structure analysis is an important part of plant phenotype studies. To promote the development of plant canopy structure measurement based on 3D reconstruction, we review the latest research progress in using visual sensors to measure 3D plant canopy structure from four aspects: the principles of 3D plant measurement technologies, the corresponding instruments and specifications of different visual sensors, the methods of plant canopy structure extraction based on 3D reconstruction, and the conclusions and prospects of plant canopy measurement technology. For the current research phase of 3D plant canopy structure measurement, the leading algorithms for each step of measurement based on 3D reconstruction are introduced. Finally, future prospects for a standard phenotypical analytical method, rapid reconstruction, and precision optimization are described.


Introduction
With the rapid development of plant phenotyping technology, phenotype identification has become a key process for improving plant yield, and analyzing plant phenotypes with intelligent equipment is one of the main ways to achieve smart agriculture [1]. Digital and visual research on three-dimensional (3D) plant canopy structures is an important part of plant phenotypical studies. With improvements in computer processing capabilities and reductions in the size of 3D data measurement devices, studies on 3D plant canopy structure measurement and reconstruction have begun to increase exponentially [2]. This paper introduces five common visual techniques for 3D plant canopy data measurement, together with their corresponding instrument models and parameters and their advantages and disadvantages. These technologies are binocular stereo vision, multi-view vision, time of flight (ToF), light detection and ranging (LiDAR), and structured light. Following this, the general process of 3D reconstruction and structure index extraction for plant canopies is summarized. The accuracy and correlation of the structure indices of plant canopies reconstructed with different visual devices are evaluated, and the common algorithms for plant 3D point cloud processing are reviewed. Then, the technical defects are outlined, including the lack of matching between reconstructed 3D plant structure data and physiological data, the low reconstruction accuracy, and the high device costs. Finally, the development trends in 3D plant canopy reconstruction technology and structure measurement are described.

Binocular Stereo Vision Technology and Equipment
Binocular vision uses two cameras at different positions to image the same object, which produces a difference in the coordinates of corresponding features between the two stereo images. This difference is called binocular disparity, and the distance from the object to the camera can be calculated from it. Disparity-based distance measurement is applied to calculate depth information [2]. The principle of the method is shown in Figure 1. Once the depth z is known, x and y can be calculated from z and the image plane coordinates.
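The relationship between disparity and depth can be illustrated with a minimal sketch. It assumes an ideal, rectified pinhole stereo pair, which the text above does not state explicitly; the function name, the focal length, the baseline, and the principal point are hypothetical values chosen for illustration only.

```python
def disparity_to_point(u, v, disparity_px, focal_px, baseline_m, cx, cy):
    """Back-project a left-image pixel to camera coordinates (x, y, z) in meters.

    Assumes a rectified pinhole stereo pair (an assumption of this sketch).
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    z = focal_px * baseline_m / disparity_px   # depth from similar triangles
    x = (u - cx) * z / focal_px                # lateral offset from the optical axis
    y = (v - cy) * z / focal_px                # vertical offset from the optical axis
    return x, y, z


# Hypothetical camera: 700 px focal length, 12 cm baseline, principal point (320, 240).
x, y, z = disparity_to_point(u=400, v=300, disparity_px=35.0,
                             focal_px=700.0, baseline_m=0.12, cx=320.0, cy=240.0)
print(f"x={x:.3f} m, y={y:.3f} m, z={z:.3f} m")
```

Because depth is inversely proportional to disparity, a fixed disparity error of one pixel causes a larger depth error for distant objects than for close ones.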
The main process of binocular vision reconstruction includes image collection, camera calibration, feature extraction, stereo matching, and 3D reconstruction. Camera calibration is a key step for obtaining stereo vision data with binocular cameras. Its main purpose is to estimate the parameters of the lens and image sensor of a camera, and to use these parameters to measure the size of an object in world units or to determine the relative location between the camera and the object. The main camera calibration methods include the Tsai method [3], the Faugeras-Toscani method [4], Martins' two-plane method [5], the Pollastri method [6], the Caprile-Torre method [7], and Zhang Zhengyou's method [8]. These are traditional calibration methods that obtain the camera parameters by using a highly accurate calibration target to establish the correspondence between space points and image points. In addition, there are self-calibration technologies and calibration techniques based on active vision [9]. Andersen et al. [10] used Zhang Zhengyou's camera calibration method to calibrate the internal parameters of a binocular camera, and then obtained the depth data of wheat using a stereo matching method with simulated annealing.
Stereo matching, or disparity estimation, is the process of finding the pixels in multiple views that correspond to the same 3D point in the scene. The disparity map records the apparent pixel difference or motion between a pair of stereo images. Calculating the disparity map is both the most challenging and the most important part of binocular stereo vision technology. Various algorithms can be used to calculate pixel disparity. According to the optimization theory used, they can be divided into global, local, and iterative methods; according to the image elements that are matched, they can be divided into region matching, feature matching, and phase matching. Malekabadi [11] used an algorithm based on local methods (ABLM) and an algorithm based on global methods (ABGM) to obtain disparity images, which can provide plant shape data. Two stereo matching algorithms, 3D minimum spanning tree (3DMST) [12] and semi-global block matching (SGBM), are state-of-the-art and widely used. Bao [13] designed a high-throughput field analysis system that measures plant height using the 3DMST stereo matching technique. Baweja [14] coupled deep convolutional neural networks and SGBM stereo matching to count stalks and measure stalk width. Dandrifosse [15] used SGBM stereo matching to extract wheat structure features, including height, leaf area, and leaf angles, with two nadir cameras in field conditions; the results showed that the 3D point cloud produced by the stereo camera can be used to measure plant height and other morphological characteristics, although some errors were noted.
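To make the idea of a local (region matching) method concrete, the following is a minimal sketch of block matching on a single rectified scanline using the sum of absolute differences (SAD). It is not the ABLM, SGBM, or 3DMST algorithm cited above; the function name, window size, search range, and toy intensity values are all assumptions of this sketch.

```python
def match_scanline(left_row, right_row, window=3, max_disparity=4):
    """Estimate a disparity for each pixel of one rectified scanline.

    For a pixel at x in the left row, candidates at x - d in the right row are
    compared with a SAD cost over a small window; the lowest cost wins.
    """
    half = window // 2
    n = len(left_row)
    disparities = [0] * n
    for x in range(half, n - half):
        best_cost, best_d = None, 0
        for d in range(max_disparity + 1):
            if x - d - half < 0:          # window would leave the right image
                break
            cost = sum(abs(left_row[x + k] - right_row[x - d + k])
                       for k in range(-half, half + 1))
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities


# Toy data: the left row is the right row shifted by two pixels.
right_row = [10, 50, 20, 90, 30, 70, 40, 80, 60, 15, 95, 5]
left_row = [0, 0] + right_row[:-2]
disparities = match_scanline(left_row, right_row)
print(disparities)
```

Local methods of this kind are fast but struggle in homogeneous regions, where many windows have nearly identical costs; global and semi-global methods add smoothness constraints to resolve such ambiguities.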
The parameters of typical binocular cameras are shown in Table 1. Binocular stereo vision is simple and inexpensive, and it requires neither auxiliary equipment (such as a specific light source) nor special projection [16]. Stereo vision technology also has limitations: it is affected by changes in scene lighting and requires a high-performance computing system to run the stereo matching algorithm; its measurement accuracy depends on the baseline length, with a longer baseline relative to the distance to the measured object giving higher accuracy; it cannot acquire high-quality data, although the data remain useful for interpretation in robotics and computer vision [2]; robust disparity estimation is difficult in areas of homogeneous color or occlusion [16]; and a stereo camera may not capture the actual boundary of a smooth, curved surface, which is called the false boundary problem and affects the correctness of feature matching in active stereo vision. To solve the false boundary problem, one effective approach is to use dynamic and exploratory sensing, and another is to move the cameras farther away from the surface [17].

Multi-View Vision Technology
Multi-view vision technology is an imaging method that captures pictures of objects from different perspectives with calibrated cameras. The feature points found in overlapping images are used to calculate the shooting positions. Its main applications include structure-from-motion (SfM) and multi-view stereo (MVS) technology. There are two main ways to acquire multi-view data: using multiple cameras to obtain 3D data, and rotating the cameras or the objects to obtain 3D data (including depth information). The 3D reconstruction processes of multi-view stereo vision and binocular vision are similar; the biggest difference is that SfM uses redundant overlapping images to obtain the camera position parameters, whereas binocular vision uses a traditional calibration method before matching and 3D reconstruction. Although the images produced by multi-view vision are more accurate, calibration and synchronization, mainly of the camera locations, are more complicated than for a binocular camera.
SfM and MVS are applied in sequence: SfM is used to determine the camera poses, calibrate the intrinsic parameters, and start feature matching, and MVS is then used to reconstruct the dense 3D scene. Structure-from-motion (SfM) is a range imaging technology that estimates a 3D structure from a series of 2D images captured at different locations in a scene; its pipelines include incremental, global, and hybrid structures. It exploits highly redundant image features and matches the 3D positions of features based on the scale-invariant feature transform (SIFT) algorithm (or the SURF or ORB algorithms). After the camera poses are estimated and the point cloud is extracted (using Bundler), MVS is used to reconstruct a complete 3D object model from a set of images taken from known camera locations after camera calibration [18]. MVS matches each pixel using the epipolar geometry constraint, which checks whether candidate matches are consistent with a common epipolar geometry (for example, the clustering views for multi-view stereo (CMVS) and patch-based multi-view stereo (PMVS2) algorithms). Some open source software for MVS is shown in Table 2. SfM generally produces sparse point clouds, and MVS photogrammetry algorithms are used to increase the point density by several orders of magnitude. As a result, the combined workflow is more correctly referred to as 'SfM-MVS' [19]. The steps of point cloud formation based on SfM-MVS generally include feature detection, keypoint correspondence, identification of geometrically consistent matches, structure from motion, scaling and georeferencing, refinement of parameter values, and multi-view stereo image matching. Some typical commercial integrated software packages for implementing SfM-MVS are shown in Table 3.
Scaling and georeferencing are special steps for aerial mapping. The output of the SfM stage is a sparse, unscaled 3D point cloud in arbitrary units, along with camera models and poses, so the correct scale, orientation, and absolute position must be established from known coordinates. Three methods can be used to enable accurate scaling and georeferencing of the imagery. The first uses a minimum of three ground control points (GCPs) with XYZ coordinates to scale and georeference the SfM-derived point cloud [20]. The second measures orientation with an inertial measurement unit (IMU) [21], and the third uses known camera positions derived from RTK-GPS measurements [22]. For small-scale plant measurement without unmanned aerial systems (UAS), by contrast, the metric scaling factor is derived from the known size of a geometrical feature in the point cloud: the raw point cloud is multiplied by a scale factor equal to the ratio of the feature's length in millimeters to its length in the coordinate system of the raw point cloud, so an individual scale factor is determined for every point cloud [23].
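The scale-factor approach for small-scale measurement can be sketched as follows. The sketch assumes that the two end points of the reference feature have already been identified in the raw point cloud; the function name and the sample coordinates are hypothetical.

```python
import math


def scale_point_cloud(points, ref_a, ref_b, known_length_mm):
    """Scale an unscaled SfM point cloud to millimeters.

    ref_a and ref_b are the end points of a reference feature in raw
    point-cloud units; known_length_mm is its measured real-world length.
    """
    raw_length = math.dist(ref_a, ref_b)
    if raw_length == 0:
        raise ValueError("reference points must be distinct")
    scale = known_length_mm / raw_length          # mm per raw unit
    scaled = [tuple(scale * c for c in p) for p in points]
    return scale, scaled


# Hypothetical raw cloud in which a 100 mm marker spans 0.5 raw units.
raw_points = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.25, 1.0)]
scale, scaled_points = scale_point_cloud(raw_points, raw_points[0], raw_points[1], 100.0)
print(scale, scaled_points[2])
```

Because each reconstruction has its own arbitrary units, the factor must be recomputed for every point cloud rather than reused across reconstructions.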
SfM can be applied to large-scale plant measurement. Unmanned aerial systems are necessary auxiliary equipment for large-scale experimental field measurement based on SfM-MVS. Images are acquired autonomously according to preset UAS parameters and camera settings, point cloud data are then generated by commercial 3D scene modeling software, and plant height, density, and other traits are calculated after point cloud processing. For example, Malambo [20] used a DJI® Phantom 3 to acquire images. Six or more portable GCPs were placed uniformly in the field and measured with a Trimble GeoXH GPS system for scaling and georeferencing; 100 readings were taken per point and differentially post-processed using Trimble's Pathfinder Office software to achieve centimeter accuracy (<10 cm). Pix4Dcapture software based on SfM was used to generate a point cloud, which was then processed to obtain maize height. SfM can also be applied to small-scale plant measurement. Rose [23] used Pix4Dmapper, which is based on SfM-MVS, to reconstruct single tomato plants and extracted the main stem height and convex hull from the 3D point clouds.

Time of Flight Technology
Time of Flight (ToF) is a high-precision ranging method. ToF cameras and LiDAR (light detection and ranging) scanning are both based on ToF technology. The imaging principles of ToF can be divided into pulsed-wave (PW-iToF) and continuous-wave (CW-iToF) modulation [24]. The ToF imaging principle is shown in Figure 2. CW-iToF emits near-infrared (NIR) light through a light-emitting diode (LED), and the light is reflected by the scene back to the sensor. Each pixel on the sensor samples the amount of light reflected by the scene four times at equal intervals per cycle (m0, m1, m2, and m3). The phase difference, offset, and amplitude are obtained by comparing the phase of the received modulation with that of the transmitted signal, and the target depth is calculated from these three quantities. PW-iToF uses a transmitting module to emit a laser pulse (Tpulse); at the same time, a shutter pulse with the same duration as Tpulse is activated by the transfer gate (TX1). When the reflected laser light hits the detector, the charges are collected. After the first shutter pulse ends, the second shutter pulse is activated by the transfer gate (TX2). The charge is integrated in the corresponding storage nodes of the two shutters, and the target depth is calculated from the accumulated charge [24].
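The four-sample CW-iToF calculation can be sketched as below. The sketch assumes that the four samples are taken at phase offsets of 0°, 90°, 180°, and 270° and follow the model m_k = offset + amplitude · cos(phase + k · 90°); sensors differ in their sign and ordering conventions, so the formula should be treated as one common convention rather than the definition used in [24]. The function name and the simulated values are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def cw_tof_depth(m0, m1, m2, m3, mod_freq_hz):
    """Recover depth, phase, amplitude, and offset from four phase samples.

    Assumed sample model: m_k = offset + amplitude * cos(phase + k * pi / 2).
    """
    phase = math.atan2(m3 - m1, m0 - m2) % (2 * math.pi)
    amplitude = math.hypot(m3 - m1, m0 - m2) / 2
    offset = (m0 + m1 + m2 + m3) / 4
    depth = C * phase / (4 * math.pi * mod_freq_hz)   # light travels the path twice
    return depth, phase, amplitude, offset


# Simulate a target at 3 m with a 20 MHz modulation frequency.
true_depth, freq = 3.0, 20e6
true_phase = 4 * math.pi * freq * true_depth / C
samples = [100.0 + 40.0 * math.cos(true_phase + k * math.pi / 2) for k in range(4)]
depth, phase, amplitude, offset = cw_tof_depth(*samples, mod_freq_hz=freq)
print(f"depth={depth:.4f} m, amplitude={amplitude:.1f}, offset={offset:.1f}")
```

The amplitude is often used as a per-pixel confidence measure, since low amplitudes indicate weak returns and therefore noisy phase estimates.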

Time of Flight Cameras
Time of Flight cameras are part of a broader class of scannerless LiDAR, in which the entire scene is captured with each laser pulse, as opposed to point-by-point with a laser beam, as in scanning LiDAR systems [25]. Typical cameras using ToF technology include the SR-4000, CamCube, and Kinect V2, whose structural parameters are shown in Table 4. An important issue for ToF cameras is the wrapping effect, in which the distances to objects that differ by 360° in phase are indistinguishable. Using multiple modulation frequencies or lowering the modulation frequency can solve the issue by increasing the unambiguous metric range [26]. Hu et al. [27] proposed an automatic system for non-destructive leaflet growth measurement based on a Kinect V2, which uses a turntable to obtain a multi-view 3D point cloud of the plant under test. Yang Si et al. [28] used a Kinect V2 to obtain the 3D point cloud depth data of vegetables in seedling trays. Vázquez-Arellano [29] estimated the stem positions of maize plants from point clouds, calculated the height of individual plants, and generated a plant height profile of the rows using a Kinect V2 camera in a greenhouse. Bao [30] used a Kinect V2 to obtain 3D point cloud data under field conditions and developed a point cloud processing pipeline to estimate plant height, leaf angle, plant orientation, and stem diameter across multiple growth stages. Liu [31] proposed a branch 3D skeleton extraction method based on an SR4000 to reconstruct a 3D skeleton model of the branches of apple trees, and carried out an experiment in the Fruit Tree Experimental Park. Skeletonization is the process of calculating a thin version of a shape to simplify and emphasize the geometrical and topological properties of that shape, such as length, direction, or branching, which are useful for the estimation of phenotypic traits. Hu [32] used the SR4000 camera to acquire a plant's 3D spatial data and construct a 3D model of poplar seedling leaves, then calculated leaf width, leaf length, leaf area, and leaf angle based on the 3D models. A key advantage of ToF cameras is that only a single viewpoint is used to compute depth. This allows robustness to occlusions and shadows and preservation of sharp depth edges [33].
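The wrapping effect mentioned above can be made concrete with a short sketch. It assumes the standard single-frequency relation for continuous-wave ToF, in which the unambiguous range equals c / (2f); the function names and the chosen frequencies are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s


def unambiguous_range_m(mod_freq_hz):
    """Largest distance measurable before the phase wraps past 360 degrees."""
    return C / (2.0 * mod_freq_hz)


def wrapped_distance_m(true_distance_m, mod_freq_hz):
    """Distance a single-frequency camera would report for a given target."""
    return true_distance_m % unambiguous_range_m(mod_freq_hz)


range_30mhz = unambiguous_range_m(30e6)
range_10mhz = unambiguous_range_m(10e6)

# Two targets one full unambiguous range apart give the same reading.
near = wrapped_distance_m(1.0, 30e6)
far = wrapped_distance_m(1.0 + range_30mhz, 30e6)
print(f"30 MHz range: {range_30mhz:.3f} m, 10 MHz range: {range_10mhz:.3f} m")
print(f"near reading: {near:.3f} m, far reading: {far:.3f} m")
```

Lowering the modulation frequency from 30 MHz to 10 MHz triples the unambiguous range in this sketch, at the cost of coarser depth resolution for the same phase noise.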
The main disadvantages of ToF cameras are low resolution, the inability to operate under strong sunlight, interference from other ToF cameras, and a short measurement distance.

LiDAR Scanning Equipment Based on ToF
Light detection and ranging (LiDAR) was developed in the early 1970s to monitor the earth [34]. LiDAR can be divided into aerial and terrestrial LiDAR. Because aerial LiDAR scanning is mainly used for 3D data measurement of glaciers, forests, and land, its effective resolution is low for plant phenotypical analysis, so terrestrial LiDAR scanning is mainly used in 3D plant scanning. Terrestrial LiDAR (T-LiDAR) scanners can be divided into phase-shift T-LiDAR and pulse-wave T-LiDAR. Phase-shift T-LiDAR estimates the travel time from the phase shift between the continuously emitted and the received laser beam, making it ideal for high-precision measurement of relatively close scenes. Pulse-wave (time-of-flight) T-LiDAR estimates the distance from the time between emitting and receiving laser pulses, which is suitable for scenes at large distances. The specifications of some low-cost T-LiDAR devices for plant canopy measurement are shown in Table 5. LiDAR can be used for canopy measurement. Garrido [35] used a portable LMS 111 LiDAR to reconstruct the 3D structure of maize under greenhouse conditions, supporting the aim of developing georeferenced 3D plant reconstruction. Yuan [38] developed a detection system that measures tree canopy structure with a UTM30LX LiDAR; the height and weight of an artificial tree could be obtained by the system. Qiu [39] used a Velodyne HDL64E-S3 LiDAR to obtain depth-band histograms and horizontal point density, and used the data to recognize and compute the morphological phenotype parameters (row spacing and plant height) of maize plants in an experimental field. Jin [40] used a FARO Focus 3D X 330 HDR LiDAR to obtain maize point cloud data, and realized stem-leaf segmentation and phenotypic trait extraction in an experiment carried out in the Botany Garden.

Structured Light Technology and Equipment
Structured light is an active imaging technology. A projector projects a series of light sequences, or patterns consisting of many stripes at once or of arbitrary fringes, onto the object, and the pattern is deformed by the object's surface. A camera then images the object from another direction and extracts the deformation of the stripe shape and stripe width to obtain depth data. The method is shown in Figure 3. A structured light 3D scanner has several advantages: it can produce highly accurate results, its resolution is typically high, the captured images can reliably determine the dimensions of the object, and it is often fast, since 3D imaging can occur practically as fast as an image can be taken. Structured light imaging systems have a better measurement coverage area than other 3D imaging techniques, as long as the distance is fixed. This is particularly useful for larger parts that need multiple scans, further saving time and creating efficiencies in production [33]. A major drawback of sequential projection techniques is their inability to acquire a 3D object in dynamic motion or a live subject such as human body parts. Another limitation is that the reflected pattern is sensitive to optical interference from the environment, so the technique is best suited to indoor use. The general process of 3D reconstruction based on structured light is as follows: (1) camera and projector calibration, where projector calibration includes intensity calibration, which establishes the relationship between the actual intensity of the projected pattern and the image pixel value, and geometric calibration, which establishes the relationship between points in 3D space and the projector [41]; (2) projecting patterns and finding correspondences to estimate the parameter matrix between pixels and points in 3D space; (3) obtaining a 3D point cloud based on the parameter matrix of the structured light camera; and (4) carrying out 3D reconstruction.
Chené et al. [42] used a Kinect V1 to measure leaf curvature, morphology, and orientation. Azzari et al. [43] used a Kinect V1 to obtain the point cloud data of a plant, and then constructed the canopy structure of the plant to obtain the plant diameter and height. Nguyen et al. [44] used a combination of structured light and multiple cameras to extract plant (cabbage, cucumber, tomato) height, leaf area, and total shaded area. Syed et al. [45] used a RealSense SR300 to obtain the color and depth data of plants (pepper, tomato, cucumber, and lettuce); the key characteristics of the seedlings were obtained through a series of algorithms, and the processing speed was fast. Vit [46] compared the Kinect II, Orbbec Astra, Intel RealSense SR300, and Intel D435 sensors, and experiments showed that the Intel D435 sensor provided the best accuracy for measuring the average diameter of maize stems. Liu [47] proposed a recognition algorithm for citrus fruit based on RealSense. The method used depth point cloud data obtained from a RealSense F200 at a close-shot range of 160 mm, together with different geometric features of the citrus fruit and leaves, to recognize fruits from an intersection curve cut by a depth sphere. Milella [48] used the RealSense R200 depth camera to construct an in-field high-throughput grapevine phenotyping platform that can estimate canopy volume and detect grape bunches under field conditions. The specifications of some structured light depth cameras are shown in Table 6.

Comparison of Main Measurement Technologies
Table 7 summarizes the technological differences between devices based on stereo vision, SfM, Time of Flight, LiDAR scanning, and structured light in six aspects. The number of plus and minus signs indicates the degree of advantage or disadvantage.

3D Plant Data Acquisition
Plant 3D data are mainly represented as depth maps [52,53], polygon meshes [54], voxels [55][56][57][58], and 3D point clouds [44]. These data types are illustrated in Figure 5. A depth map is a 2D picture in which each pixel value records the distance from the camera viewpoint to the surface of the occluding object. A polygon mesh, also called an unstructured mesh, is a collection of vertices and polygons that represents polyhedral shapes in 3D computer graphics; it consists of a series of convex polygon vertices and convex polygon surfaces [59]. Polygon meshes are intended to represent 3D object models in a way that is easy to render. A voxel [60], short for volume element and analogous to a pixel in 2D space, is the smallest unit of digital data in a partition of 3D space. Voxelization is a standardized representation method used in the field of 3D imaging. A point cloud is a set of points in a certain coordinate system that can include 3D coordinates, color, size values, segmentation results, and so on.
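The relationship between a point cloud and a voxel representation can be sketched as follows. This is a minimal illustration assuming a regular axis-aligned grid anchored at the origin; the function names, the voxel size, and the sample points are hypothetical.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group 3D points by the integer index (i, j, k) of the voxel containing them."""
    voxels = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        voxels[key].append((x, y, z))
    return dict(voxels)


def voxel_centroids(voxels):
    """Replace each occupied voxel by the centroid of its points (downsampling)."""
    return {key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for key, pts in voxels.items()}


cloud = [(0.1, 0.1, 0.1), (0.3, 0.2, 0.4), (1.2, 0.1, 0.1), (-0.2, 0.0, 0.0)]
voxels = voxelize(cloud, voxel_size=0.5)
centroids = voxel_centroids(voxels)
print(sorted(voxels))
```

Keeping one centroid per occupied voxel is a common way to downsample dense clouds to a roughly uniform density before further processing.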

3D Plant Canopy Point Clouds Preprocessing
Modeling with point cloud data is fast and captures finer details than polygon meshes and voxels, which is valuable for agricultural crop monitoring. However, point clouds cannot be used directly in 3D applications. They must first be processed to remove wrongly assigned points, where a pixel is not matched to its actual corresponding object, and non-interest points, which belong to the background rather than the target object. In general, 3D point cloud preprocessing includes background subtraction, outlier removal, and denoising [61]. At present, many open source resources are available for point cloud processing.

Background Subtraction
To obtain only the plant canopy, the plant point cloud must be separated from the ground, weeds, and other background after the plant 3D point cloud data are acquired. When 3D point clouds are acquired with active imaging technology (ToF, structured light, and so on) without color data, detection of geometric shapes can be applied to remove the background. When passive imaging technology is used (binocular stereo vision, multi-view vision, SfM, and so on), color thresholding or clustering on color data can be applied to remove the background.
Bao [13] used the Random Sample Consensus (RANSAC) algorithm to fit a plane, and subtracted the background according to a distance threshold between each data point and the fitted plane. Klodt [62] used dense stereo reconstruction to analyze grapevine phenotypes, and backgrounds were segmented with respect to color and depth information. However, low-level geometric shape features cannot handle all types of meshes. Deep Convolutional Neural Networks (CNNs) can solve this problem and provide a highly accurate way to label the background, using many geometric features to train a labeling model [63].
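Plane fitting with RANSAC for ground removal can be sketched as below. This is a generic illustration, not the implementation used in [13]; the function names, the distance threshold, the iteration count, the fixed random seed, and the noise-free toy scene are all assumptions of this sketch.

```python
import math
import random


def plane_from_points(p, q, r):
    """Return (unit normal, d) of the plane n . x + d = 0, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points, threshold, iterations=200, seed=0):
    """Find the plane supported by the most points within `threshold`."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [i for i, p in enumerate(points)
                   if abs(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers


# Toy scene: a flat 10 x 10 ground grid followed by 20 elevated "plant" points.
ground = [(float(i), float(j), 0.0) for i in range(10) for j in range(10)]
plant = [(4.5, 4.5, 0.5 + 0.05 * k) for k in range(20)]
scene = ground + plant
plane, inliers = ransac_plane(scene, threshold=0.01)
inlier_set = set(inliers)
remaining = [p for i, p in enumerate(scene) if i not in inlier_set]
print(len(inliers), len(remaining))
```

In real field data the ground is uneven, so the threshold must be loose enough to absorb soil roughness yet tight enough not to remove the lowest leaves, which is why color cues are often combined with the geometric fit.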
Background subtraction has an important application in robotic weeding. Plant recognition for automated weeding based on 3D sensors includes preprocessing, ground detection, plant extraction refinement, and plant detection and localization. Gai [64] used a Kinect V2 to obtain broccoli point clouds, and RANSAC was used to remove the ground. Afterwards, 2D color information was used to compensate for rough ground errors, and clustering was applied to remove weed point clouds; the result after ground removal with RANSAC is shown in Figure 6. Andújar [65] used a Kinect V2 for volumetric reconstruction of corn, and canonical discriminant analysis (CDA) was used to predict the system's weed classification from weed height.

Outlier Removal and Plant Point Clouds Noise Reduction
An outlier is a data point that differs significantly from other observations. Noisy data contain a large amount of additional meaningless information, which arises from various physical measurement processes and limitations of the acquisition technology [66]; such data may be corrupted or distorted, or may have a low signal-to-noise ratio. In addition, matching ambiguities and image imperfections produced by lens distortion or sensor noise lead to outliers and noise in point cloud data. Outlier detection approaches are classified into distribution-based [67], depth-based [68], clustering [69], distance-based [70], and density-based approaches [71]. The moving least-squares (MLS) method is generally used to deal with noise; it iteratively projects points onto weighted least-squares fits of their neighborhoods, so that the newly sampled points lie closer to an underlying surface [72].
Wu et al. [73] used a statistical outlier removal filtering algorithm to denoise the point cloud; for each point, the algorithm calculates the mean distance to its K neighboring points using a K-nearest-neighbor search and removes points whose mean distance is too large. Yuan et al. [38] used statistical outlier removal to remove outlier points around peanut point clouds. Wolff [74] designed a new algorithm that removes noisy points and outliers from each per-view point cloud by checking whether points are consistent with the surface implied by the other input views. Xia [75] combined two characteristic parameters, the average distance of neighboring points and the number of points in the neighborhood, to remove outlier noise, and used a bilateral filtering algorithm to remove small noise in the point cloud of tomato plants. After performing point-wise Gaussian noise reduction, Zhou et al. [76] used a grid optimization method to optimize the point cloud data and an average distance method to remove redundant boundary points, thus obtaining a more realistic leaf blade structure. Hu et al. [27] first used the multi-view interference elimination (MIE) algorithm to reduce layers and then used the moving least squares (MLS) algorithm to reduce the remaining local noise.
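Statistical outlier removal as described above can be sketched in a few lines. The sketch uses a brute-force neighbor search and assumes a common thresholding convention (global mean of the per-point mean distances plus a multiple of their standard deviation); the exact threshold used in [73] is not given in the text, and the function name and parameters are hypothetical.

```python
import math
import statistics


def statistical_outlier_removal(points, k=4, std_ratio=1.0):
    """Drop points whose mean distance to their k nearest neighbors is too large."""
    mean_dists = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_dists.append(sum(dists[:k]) / k)
    # Assumed convention: keep points within mean + std_ratio * std.
    limit = statistics.mean(mean_dists) + std_ratio * statistics.pstdev(mean_dists)
    return [p for p, d in zip(points, mean_dists) if d <= limit]


# Toy cloud: a compact 3 x 3 x 3 grid plus one far-away stray point.
grid = [(float(i), float(j), float(k))
        for i in range(3) for j in range(3) for k in range(3)]
stray = (20.0, 20.0, 20.0)
filtered = statistical_outlier_removal(grid + [stray], k=4, std_ratio=1.0)
print(len(filtered))
```

The brute-force search is quadratic in the number of points; for real plant clouds a spatial index such as a k-d tree would normally replace it.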

Plant Point Clouds Registration
To measure the complete data model of a plant, the points obtained from various perspectives must be combined into a unified coordinate system to form a complete point cloud, so the point clouds need to be registered. The purpose of registration is to transform the coordinates of the source point cloud (the initial point cloud) into those of the target point cloud (the point cloud formed after the motion of the target object), and to obtain a rotation translation matrix (RT matrix) that represents the position transformation between the source point cloud and the target point cloud. Point cloud registration can be divided into rough registration and precise registration. Rough registration uses the rotation axis center coordinates and a rotation matrix to apply a rigid transformation to the point clouds. Precise registration aligns two sets of 3D measurements through geometric optimization. The iterative closest point (ICP) algorithm [77], the Gaussian mixture models (GMM) algorithm [78], and the thin plate spline robust point matching (TPS-RPM) algorithm [79] are generally used for precise registration. ICP is the most classic and the simplest: it iteratively calculates the distances between corresponding points in the source and target point clouds, constructs a rotation translation matrix to transform the source point cloud, and calculates the mean squared error after the transformation to determine whether a defined threshold has been met. Jia [80] performed rough registration of plant point clouds from six perspectives based on sample consensus initial alignment (SAC_IA). Precise registration then starts from the known initial transformation matrix and obtains a more accurate solution through the ICP algorithm. The principle of the ICP algorithm is shown in Figure 7.
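The ICP loop described above can be sketched as follows. To stay self-contained without a linear algebra library, the sketch works on 2D points, where the optimal rotation has a closed form; a 3D implementation would typically estimate the rotation with a singular value decomposition instead. The function names, the tolerance, and the toy data are assumptions of this sketch, and a good initial alignment (as provided by rough registration) is assumed.

```python
import math


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    csx, csy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    cdx, cdy = sum(p[0] for p in dst) / n, sum(p[1] for p in dst) / n
    s_dot = s_cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay, bx, by = sx - csx, sy - csy, dx - cdx, dy - cdy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, cdx - (c * csx - s * csy), cdy - (s * csx + c * csy)


def transform(points, theta, tx, ty):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(source, target, max_iter=50, tol=1e-12):
    """Point-to-point ICP: match nearest points, solve, transform, check the MSE."""
    current, total_theta, total_t, mse = list(source), 0.0, (0.0, 0.0), math.inf
    for _ in range(max_iter):
        matches = [min(target, key=lambda q: math.dist(p, q)) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        current = transform(current, theta, tx, ty)
        c, s = math.cos(theta), math.sin(theta)
        total_t = (c * total_t[0] - s * total_t[1] + tx,
                   s * total_t[0] + c * total_t[1] + ty)
        total_theta += theta
        mse = sum(math.dist(p, q) ** 2 for p, q in zip(current, matches)) / len(current)
        if mse < tol:          # stop once the defined threshold is met
            break
    return total_theta, total_t, current, mse


target = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (4.0, 4.0), (1.0, 3.0), (0.0, 5.0)]
source = transform(target, math.radians(3.0), 0.2, -0.1)   # slightly misaligned copy
angle, shift, aligned, mse = icp_2d(source, target)
print(f"recovered rotation: {math.degrees(angle):.3f} degrees, mse: {mse:.2e}")
```

ICP converges only to a local optimum, which is why a rough registration step such as SAC_IA is applied first to bring the clouds close enough for nearest-point matching to be meaningful.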

Plant Point Clouds Surface Reconstruction
According to the principle used to reconstruct the surface, 3D point cloud surface reconstruction can be divided into surface reconstruction based on Delaunay triangulation [81], region-growing surface reconstruction, and implicit surface reconstruction [82]. Delaunay triangulation and its improved variants [83][84][85] can satisfy the consistency requirements of the point cloud data topology, but the accuracy of the surface reconstruction depends entirely on the density and quality of the point cloud. Region-growing surface reconstruction can quickly triangulate the original point cloud by projecting the 3D points onto a certain normal plane and then triangulating the projected points in the plane to obtain the connection relationships between points. After the plane area is triangulated, a triangular mesh surface is formed, and a surface model is then obtained according to the connection relationships [83]. Implicit surface reconstruction segments the data into regions for local fitting and then combines these local approximations using blending functions [86]; it has better noise immunity and smoothness, but retaining the sharp features of the surface is difficult. Implicit surface reconstruction includes the radial basis function (RBF) algorithm [87], the point set surface (PSS) algorithm [88], the unified implicit multi-level partition of unity (MPU) algorithm [89], the Poisson algorithm [90], the algebraic point set surface (APSS) algorithm [91], and others.
Jay [92] used Delaunay triangulation to reconstruct the surface of cabbage and calculate the leaf area. Poisson surface reconstruction is often used for plant point cloud surface reconstruction; the approximate surface is obtained by performing optimal interpolation on the point cloud data. Martinez [93,94] used the Poisson algorithm in MeshLab to perform foliar reconstruction of cauliflower leaves. Starting from the Poisson reconstruction, Hu [95] searched, for each vertex of the Poisson surface, for the closest point in the dense point cloud; the resulting distance was compared with a distance threshold to decide whether to remove the vertex, which smoothed the reconstructed cucumber, eggplant, and green pepper surfaces. Poisson surface reconstruction cannot be used for complex plants or plant canopies, so Michael [96] proposed refining the boundary of each leaf patch using the level-set method, and demonstrated the effectiveness of the approach for surface smoothing of wheat and rice leaves after reconstructing 3D point clouds of plants and scenes from multiple color input images. The reconstruction results based on Delaunay triangulation, the implicit surface reconstruction algorithm, and the Poisson algorithm are shown in Figure 8.
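Once a leaf surface has been triangulated, the leaf area follows by summing the triangle areas. The sketch below assumes that the triangulation (for example, from a Delaunay-based method) is already available as vertex indices; it does not perform the triangulation itself, and the function names and the toy mesh are hypothetical.

```python
import math


def triangle_area(a, b, c):
    """Area of a 3D triangle from half the norm of the edge cross product."""
    ux, uy, uz = b[0] - a[0], b[1] - a[1], b[2] - a[2]
    vx, vy, vz = c[0] - a[0], c[1] - a[1], c[2] - a[2]
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def mesh_area(vertices, triangles):
    """Total surface area of a triangle mesh given as index triples."""
    return sum(triangle_area(vertices[i], vertices[j], vertices[k])
               for i, j, k in triangles)


# Toy "leaf": a tilted rectangle (1 by sqrt(2)) split into two triangles.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.0, 1.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
leaf_area = mesh_area(vertices, triangles)
print(f"leaf area: {leaf_area:.4f}")
```

The area is computed in the squared units of the vertex coordinates, so an unscaled reconstruction must be converted to metric units before the result is interpreted as a physical leaf area.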

Plant Canopy Segmentation
Plant canopy studies focus on canopy architecture, leaf angle distribution, leaf morphology, leaf number, leaf size, and so on, so the plant leaf point cloud must be segmented before morphological analysis. Segmentation is the most difficult and most important step in plant phenotypic analysis, because the organs of different plant species differ in kind and shape, so different plants call for specific segmentation methods. The three main families of range segmentation algorithms are edge-based, surface-based, and scanline-based segmentation [98]. Surface-based methods use local surface properties as a similarity measure and merge points that are spatially close and have similar surface properties. Surface-based segmentation is common for plant canopies, and its key step is obtaining features for clustering or classification. The spectral clustering algorithm [99] can segment plant stems and leaves in cases where the cluster centers and spreads do not adequately describe the clusters, but the number of clusters must be given as input. Point Feature Histograms (PFH) [100] give a richer description of a point's neighborhood for computing features. The seed region-growing algorithm [101] is also common for segmentation: it examines the features of the neighbors of the initial seed points and decides whether each neighbor should be added to the region, so the selection of the initial seed points is important for the segmentation result.
Paulus [102] proposed a new approach to stem and leaf segmentation that adapts the PFH descriptor into Surface Feature Histograms (SFH) to improve the distinction between organs; the new descriptors were used as features for machine learning to achieve automatic classification. Hu [27] used the pot point data to construct a pot shape feature that defines a plane Sm, and segmented the plant leaves according to whether a point's projection falls on plane Sm. Li [103] selected a suitable seed point feature in the K-nearest neighborhood for clustering to generate coarse planar facets, then grew facet regions from multiple coarse facets according to facet adjacency and coplanarity to accomplish leaf segmentation. Dey [104] used saliency features [105] and color data to obtain a 12-dimensional feature vector for each point, then used an SVM to classify the point clouds of grapes, branches, and leaves according to these features. Gélard [106] decomposed 3D point clouds into super-voxels and used an improved region-growing approach to segment merged leaves.
Surface fitting, which fits planes or flexible surfaces to the points, benefits plant canopy segmentation. The non-uniform rational B-splines (NURBS) [107] algorithm is the general method for fitting plant leaf surfaces. Hu et al. [32] proposed a method based on the angle between two adjacent normal vectors to remove redundant points and used NURBS to fit the plant leaf. Santos [108] used a single handheld camera to obtain dense 3D point clouds with MVS technology; sunflower stems and leaves were segmented with the spectral clustering algorithm, and the leaf surface was estimated using NURBS.
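The seed region-growing idea can be sketched in a few lines. This is a simplified illustration, assuming that each point already has a unit normal, that similarity is judged by the angle between the normals of neighboring points, and that seeds are simply taken in index order. The names `region_growing`, `radius`, and `max_angle_deg` are hypothetical, and the brute-force neighbor search is only suitable for small clouds.

```python
import math
from collections import deque

def angle_between(n1, n2):
    """Unsigned angle in degrees between two unit normals (sign-insensitive)."""
    dot = abs(sum(a * b for a, b in zip(n1, n2)))
    return math.degrees(math.acos(min(1.0, dot)))

def region_growing(points, normals, radius, max_angle_deg):
    """Assign a region label to every point by growing from unlabeled seeds.

    A neighbor joins the current region when it is within `radius` of a
    point already in the region and their normals differ by at most
    `max_angle_deg`. Seed choice here is just index order (an assumption).
    """
    labels = [-1] * len(points)
    region = 0
    for seed in range(len(points)):
        if labels[seed] != -1:
            continue
        labels[seed] = region
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            for j in range(len(points)):
                if labels[j] != -1:
                    continue
                close = math.dist(points[i], points[j]) <= radius
                similar = angle_between(normals[i], normals[j]) <= max_angle_deg
                if close and similar:
                    labels[j] = region
                    queue.append(j)
        region += 1
    return labels

# Hypothetical data: a horizontal patch touching a vertical patch.
pts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1)]
nrm = [(0, 0, 1)] * 4 + [(1, 0, 0)] * 2
print(region_growing(pts, nrm, radius=1.1, max_angle_deg=10.0))
```

Because the two patches are spatially adjacent, only the normal test keeps them apart, which is why the feature used for similarity matters as much as the seed choice.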

Plant Canopy Structure Parameters Extraction
The plant structure index is used to characterize growth quality, structural parameters, covering area, and so on. It can be divided into the plant group canopy level [109], the individual plant level [110], and the plant organ level [111]. The plant canopy plays important functional roles in cycling materials and energy through photosynthesis and transpiration, maintaining plant microclimates, and providing habitats for various taxa [112]. This paper focuses only on the plant group canopy level, which includes leaf inclination angles, leaf area density, plant area density, etc.

Leaf Inclination Angles
The skeleton, also called the symmetry axis, is a useful structure-based object descriptor. Extracting object skeletons directly from natural images can deliver important information about the presence and size of objects. Skeleton segmentation [113] is often applied to leaf angle measurement, because skeletonization exposes the geometrical and topological properties of a shape. Bao [30] performed skeleton segmentation for maize and kept the skeleton nodes that satisfy a suitable point-to-stem distance; the leaf angle was computed using PCA and approximated by the first eigenvector of the filtered nodes. The skeleton segmentation result is shown in (a) of Figure 9. Because the leaf angle is stable, that is, it does not change when zooming in or out, the projected leaf can be used to calculate the leaf angle. Biskup [114] used the projected leaf ROI (region of interest) for plane fitting to build a planar surface model, which was obtained with the RANSAC algorithm and by analyzing the covariance matrix of the outlier-free point cloud; the leaf angle was taken as the dihedral angle between two planes. The details are shown in (b) of Figure 9.
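A plane-based leaf angle estimate in the spirit of the RANSAC approach can be sketched as below. This is a simplified illustration, not the procedure of [114]: it assumes the second plane of the dihedral angle is the horizontal plane, it skips the covariance-based refinement, and the names `ransac_plane`, `inclination_deg`, `threshold`, and `iterations` are hypothetical.

```python
import math
import random

def plane_normal(p, q, r):
    """Unit normal of the plane through three points, or None if collinear."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(c * c for c in n))
    if length < 1e-12:
        return None
    return tuple(c / length for c in n)

def ransac_plane(points, threshold, iterations=200, seed=0):
    """Return (normal, anchor_point) of the sampled plane with most inliers."""
    rng = random.Random(seed)
    best, best_count = None, -1
    for _ in range(iterations):
        p, q, r = rng.sample(points, 3)
        n = plane_normal(p, q, r)
        if n is None:
            continue  # degenerate sample, try again
        # Count points whose distance to the candidate plane is small.
        count = sum(
            abs(sum(n[i] * (x[i] - p[i]) for i in range(3))) <= threshold
            for x in points
        )
        if count > best_count:
            best, best_count = (n, p), count
    return best

def inclination_deg(normal):
    """Dihedral angle between the fitted plane and the horizontal plane."""
    return math.degrees(math.acos(min(1.0, abs(normal[2]))))

# Hypothetical leaf patch on the plane z = x, plus three outliers.
leaf = [(float(x), float(y), float(x)) for x in range(5) for y in range(4)]
leaf += [(0.0, 0.0, 5.0), (2.0, 1.0, -3.0), (4.0, 3.0, 9.0)]
normal, _ = ransac_plane(leaf, threshold=0.01)
print(round(inclination_deg(normal), 3))
```

The outliers stand in for noise or points from neighboring organs; the consensus step is what keeps them from tilting the fitted plane.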

Leaf Area Density (LAD)
Leaf area density (LAD) is defined as the one-sided leaf area per unit of horizontal layer volume [115]. The leaf area index (LAI), defined as the leaf area per unit ground area, is calculated by integrating the LAD over the canopy height. To compute LAD, the leaf area and plant volume are calculated for each voxel layer, which is obtained by converting the point cloud into a voxel-based three-dimensional model.
For the direct calculation of LAD, Hosoi [116] proposed the voxel-based canopy profiling (VCP) method to estimate tree LAD. Data for each horizontal layer of the canopy were collected with optimally inclined laser beams and converted into a voxel-based three-dimensional model; LAD and LAI were then computed by counting the beam-contact frequency in each layer using a point-quadrat method.
For the measurement of plant volume, alpha shape volume estimation has been used [117]. This algorithm estimates the concave hull around the point cloud and computes the volume from it. Paulus [102] used alpha shape volume estimation to compute the volume and accurately describe concave wheat ears from segmented point clouds; the details are shown in (a) of Figure 10. Hu [27] proposed a tetrahedron-based method to calculate plant volume: tetrahedrons were constructed from the down-sampled point cloud, with the distance between any two points required to be smaller than the maximum tetrahedron edge length, and the plant volume was calculated from the space enclosed by the tetrahedrons. When the plant is reconstructed as a voxel grid or octree, the volume can be estimated by adding up the volumes of all the voxels covering the plant; the details are shown in (b) of Figure 10. Chalidabhongse [118] reconstructed mangoes in 3D with the space carving method: each voxel of the voxel space was projected onto all image views, and the retained voxels approximated the object volume.
When leaves are fitted with NURBS, the leaf area is calculated as the sum of the partial areas of the fitted surface mesh. Santos [119] and Hu [32] used NURBS to calculate the leaf area of mint and poplar, and the results were very accurate. Obtaining the area of the whole plant is relatively simple because no segmentation is needed: Bao [13] converted point clouds into a triangle mesh, reconstructed the surface with PCL, and approximated the plant surface area by the sum of the areas of all triangles in the mesh. When the plant is reconstructed as a voxel grid or octree, a sequential cluster connecting algorithm and subsequent refinement steps are needed to segment the leaves, after which the voxel grid or octree is converted into a point cloud for piecewise fitting of leaf planes [120]. Scharr [55] used volume carving to reconstruct maize in 3D and calculated leaf area with a sequence of segmentation algorithms. In addition, the marching cubes algorithm [121] can be used to calculate the area of a voxel or octree model by fitting a mesh surface.
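As a worked illustration of the integration step, the LAI of a canopy of height $H$ divided into $K$ horizontal layers of thickness $\Delta z$ can be approximated by a sum. The layer values in the example are hypothetical and are not taken from the cited studies.

```latex
\mathrm{LAI} = \int_{0}^{H} \mathrm{LAD}(z)\,\mathrm{d}z \;\approx\; \sum_{k=1}^{K} \mathrm{LAD}_k\,\Delta z
% Hypothetical example: three layers of 0.5 m with LAD = 0.4, 1.2, 0.8 m^2/m^3
\mathrm{LAI} \approx (0.4 + 1.2 + 0.8)\times 0.5 = 1.2
```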
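The contact-frequency calculation can be sketched as follows. This assumes the commonly cited form of the estimate, in which the LAD of a horizontal layer is the sum of the per-voxel-layer contact frequencies n_I / (n_I + n_P) divided by the layer thickness and multiplied by a correction factor for beam inclination and leaf orientation. The correction factor is left as a parameter, the counts are hypothetical, and the function names are illustrative rather than taken from [116].

```python
def lad_from_contacts(intercepted, passed, layer_thickness, correction=1.0):
    """LAD (m^2/m^3) of one horizontal layer from beam counts.

    intercepted[k] and passed[k] are the numbers of beams intercepted in,
    and passing through, the k-th voxel layer inside the horizontal layer.
    `correction` stands in for the beam-angle and leaf-orientation factor
    (an assumption; its value must come from the measurement setup).
    """
    contact_sum = 0.0
    for n_i, n_p in zip(intercepted, passed):
        if n_i + n_p > 0:  # skip voxel layers no beam reached
            contact_sum += n_i / (n_i + n_p)
    return correction * contact_sum / layer_thickness

def lai_from_lad(lad_values, layer_thickness):
    """LAI as the sum of layer LAD values times the layer thickness (m)."""
    return sum(lad * layer_thickness for lad in lad_values)

# Hypothetical counts for two 0.5 m layers, each holding two voxel layers.
lower = lad_from_contacts([10, 30], [90, 70], layer_thickness=0.5)
upper = lad_from_contacts([5, 15], [95, 85], layer_thickness=0.5)
print(lower, upper, lai_from_lad([lower, upper], layer_thickness=0.5))
```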
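The voxel-counting volume estimate can be illustrated with a short sketch. The function name `voxel_volume` and the voxel size are assumptions for illustration; the estimate counts every voxel that contains at least one point, so it depends strongly on the chosen voxel size and on point density.

```python
import math

def voxel_volume(points, voxel_size):
    """Estimate volume as (number of occupied voxels) x (voxel volume)."""
    occupied = {
        # Integer grid index of the voxel containing each point.
        tuple(math.floor(c / voxel_size) for c in point)
        for point in points
    }
    return len(occupied) * voxel_size ** 3

# Hypothetical points: the first two share a voxel, so three voxels are hit.
cloud = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.4), (0.6, 0.1, 0.1), (-0.1, 0.1, 0.1)]
print(voxel_volume(cloud, voxel_size=0.5))
```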
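Summing triangle areas over a mesh, as described for the whole-plant surface area, can be sketched as below. The mesh layout (a vertex list plus index triples) and the function names are assumptions for illustration; they do not reproduce the PCL data structures.

```python
import math

def triangle_area(a, b, c):
    """Area of a 3D triangle from half the norm of the edge cross product."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(x * x for x in cross))

def mesh_area(vertices, triangles):
    """Total surface area of a mesh given vertices and index triples."""
    return sum(
        triangle_area(vertices[i], vertices[j], vertices[k])
        for i, j, k in triangles
    )

# Hypothetical mesh: a unit square split into two triangles.
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2), (0, 2, 3)]
print(mesh_area(square, faces))
```

For a leaf mesh the result is a one-sided area only if the mesh represents a single surface layer rather than a closed solid.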

Plant Area Density (PAD)
Plant area density (PAD) is defined as the canopy area per unit of ground area. The device used to generate the point data therefore needs a broad survey range, so handheld laser scanners and airborne laser scanner (ALS) remote sensing are often used. Because broad-scale plant area measurement produces large quantities of data, point cloud segmentation and reconstruction are complex and difficult, so PAD is estimated with the VCP [116] method by converting point clouds into a voxel-based three-dimensional model. Song [122] used an airborne laser scanner to estimate tree PAD, which was computed with the VCP method. Tables 9 and 10 show the 3D reconstruction of plants and the analysis of the structure index using single and multiple measurement methods.
[116] R²: 0.818 [122]
Note: R², coefficient of determination, the ratio of the regression sum of squares to the total sum of squares, an index of how well the trend line fits; MAPE, mean absolute percentage error.

Poor Standardization of Algorithms
Different kinds of plants vary greatly in appearance, and reconstruction and segmentation methods are designed for specific plants only; moreover, different algorithms may be applied to the same plant in different environments. Each stage of the workflow (3D plant data acquisition, point cloud processing, 3D plant reconstruction, plant segmentation, and plant canopy structure parameter extraction) has multiple processing algorithms, and there are no optimal criteria on which to build standards and specifications (such as labeling, naming, formatting, and integrity constraints). The resulting problems include large differences in format and accuracy, incomplete supporting data, data redundancy, and low data use. In the study of 3D canopy structure, the data at the plant organ level, the individual plant level, and the group canopy level are independent of each other, and the matching with plant physiological data (such as canopy photosynthesis data) needs to be standardized [125]. For example, Delaunay triangulation, region-based growth surface reconstruction, and implicit surface reconstruction can all be used for plant reconstruction, yet they produce different results.

3D Reconstruction Operation Is Slow
Data processing speed depends on the number of input points, which can make processing time-consuming for large plants. When plant phenotypes are analyzed on a large scale, 3D reconstruction takes longer and is less efficient because of the large number of objects to be analyzed. The quality of 3D reconstruction from multi-view images is related to the number of images: the more images, the better the reconstruction, but the amount of computation also increases considerably [126], which makes the reconstruction process time-consuming. In addition to faster hardware, software algorithms are required to speed up the calculation.
3D reconstruction speed is directly related to the size of the point cloud, and coarse and fine reconstruction also take different amounts of time. Marton [127] used a triangulation method for fast surface reconstruction: an urban scene with 65,646 points needed 8.983 seconds, and the reconstruction of radiohead with 329,741 points took 17.857 seconds. Although surface reconstruction itself takes little time, generating a dense and complete 3D point cloud from multiple images takes much longer. The CMPMVS software ran for around 182 minutes on 66 input images, whereas Lou [128] used an improved SfM method that ran for 15 minutes to produce the final 3D point cloud from the same images.
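The timings quoted above can be put on a common scale with a few lines of arithmetic. The helper name is illustrative, and the comparison ignores differences in hardware and data characteristics, which the text does not report.

```python
def points_per_second(n_points, seconds):
    """Throughput of a surface reconstruction run."""
    return n_points / seconds

# Figures quoted for the triangulation method of [127].
urban = points_per_second(65_646, 8.983)
radiohead = points_per_second(329_741, 17.857)

# Dense point cloud generation on the same 66 images: 182 min versus 15 min.
speedup = 182 / 15

print(f"urban scene: {urban:.0f} points/s")
print(f"radiohead:   {radiohead:.0f} points/s")
print(f"improved SfM speed-up over CMPMVS: {speedup:.1f}x")
```

Both surface reconstruction runs finish in seconds, while the image-based point cloud generation takes minutes to hours, which is the contrast the paragraph draws.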

Plant 3D Reconstruction Is Inaccurate
Currently, plant analysis and reconstruction technology extracts phenotypes at a single moment and does not monitor growth dynamics. Monitoring growth dynamics requires a noninvasive time-lapse imaging system that supports accurate reconstruction of plant architecture, yet most depth cameras and other devices provide only rough approximations of size and often lack high spatial or temporal resolution [129]. In addition, occlusion within the plant canopy causes problems such as voids or holes, untextured areas, and blurred images in the final 3D models of some plants. Therefore, occlusion should be avoided as much as possible during image collection. Multi-view stereo reconstruction with multiple cooperating devices, such as a laser scanner and a ToF camera, achieves high accuracy when reconstructing occluded leaves and fruit plants, but rapid multi-view registration remains difficult, which hinders high-throughput 3D phenotypic analysis.
Models proposed thus far are still limited in their application because of their sensitivity to outdoor illumination conditions and the inherent difficulty of modeling complex plant shapes using only radiometric information. With the same materials and methods, reconstruction performance also differs greatly between plants and between imaging environments. In the 3D stereo model, the reconstruction errors for corn, sunflower, black nightshade, and tomato are 5.7%, 4.6%, 5.2%, and 4.7% in LCA (leaf cover area), respectively [123]. This accuracy meets the demands of precision agriculture practice, but reconstruction accuracy still needs to improve for fine phenotypic analysis and texture research.
The capture of plant 3D data is easily affected by light intensity, blurred edges, wind, etc., which lead to data loss or low quality and affect the separation of plants from the background. When the plant structure cannot be completely reconstructed, the reconstruction accuracy is reduced. Although structured light and ToF cameras used indoors offer high measuring speed and strong robustness for a motionless plant, their major weakness is the high noise in the 3D data, which is a challenge for plant segmentation. For individual plant organ segmentation, there is no unified, standard method; methods vary largely with plant morphology. Existing methods based on machine learning can achieve good results, but they require manual participation and cannot provide automatic segmentation.

High Equipment Collection Cost
The current limitation of broad-scale plant detection is that it relies on a relatively expensive robotic platform and positioning system. The commercial prospects of a scout robot are better, because the robot can perform its task while navigating once the data processing is automated. Because LiDAR [130], light field cameras [131], high-precision ToF cameras, and other instruments are expensive, they are suitable only for laboratory research, large-scale facilities, and agricultural sites. They are currently at the pilot stage; manual operation is often needed, and wider adoption is limited by funding [132]. Although SfM photogrammetry costs less, generating more detailed models increases the time required and the cost. For broad-scale plant detection on large farms or in forests, airborne carriers such as unmanned aerial vehicles (UAVs) or farm helicopters are necessary, which adds extra cost.

Establishing a Standard System of 3D Plant Canopy Structure Data
Future research should automate the manual estimation steps, for example by setting the point density parameter automatically in order to avoid manual trimming. More research is also needed on estimating the leaf area index (LAI). High-throughput phenotyping for large greenhouses and open fields (if the measurements are performed on cloudy days or days with low sunlight intensity) is a future application for the analysis system. Phenotypical analysts have introduced the canopy structure index into various agricultural professional models to match plant physiological data and improve the international universality of these models.
Because plant characteristics differ significantly across scales, it is possible to take the plant species as the unit, refine it on multiple scales such as organ, individual, and population, and consider the top-level design principles of 3D structure analysis of plant canopies. These principles include related terminology categories, detection schemes, technical standards, technical methods, models for obtaining and using relevant data, and the procedures for representing and verifying the relationships between the various data.

Speeding Up the 3D Plant Canopy Structure Reconstruction
Across the different methods used to study plant phenotypes, the effects of image preprocessing and scaling on image registration accuracy can be studied [133] to reduce lighting interference, background interference, image distortion, and other problems, and thus to improve the matching quality of plant reconstruction and the robustness of the algorithm. If distributed computing is combined with computer cluster computing [134], the reconstruction algorithm can be sped up, and distributed optimization of the algorithm could also improve the calculation accuracy and reduce the calculation time. Clustering algorithms are mainly applied in point cloud processing for background subtraction and outlier removal, as well as in surface feature-based segmentation.
In the construction of the collection device platform, the UAV is a type of remote sensing platform that is unmanned and reusable.After being equipped with a 3D canopy shape collection device, the UAV could provide rapid collection, flexible movement, and convenient control.Especially with the miniaturization of the 3D shape collection device, UAVs can acquire visible or near-infrared images, 3D point cloud images, multispectral images, and remote sensing images with high spatial resolution at any time.It is possible to construct a 4D space-time scene of farmland based on UAV remote sensing images through real-time data collection to achieve cross-fusion of time series and spatial images [135].

Improving the Accuracy of the 3D Structure Index of Canopy Reconstruction
3D plant canopy structure measurement technology can be embedded in phenotypical analysis tools. Sensor fusion technology can be used to quantify 3D canopy structure and single leaf shape features by integrating multiple features to improve the accuracy of the structure index. The color, depth, and infrared data included in the image can be combined to improve the integrity of the plant phenotypical data and the quality of the 3D reconstruction. Using multiple devices together to obtain point clouds from multiple views can reduce noise and improve reconstruction accuracy.
Optimizing the segmentation algorithm so that it supports a wider range of plant species with less parameter tuning is important for improving the accuracy of plant structure index extraction. Neural networks can be used for classification during segmentation. Deep learning on point clouds is still at the forefront of research. Multi-view convolutional neural networks (CNNs) render the 3D point cloud into 2D images and then apply 2D convolutional networks to classify them; this enables shape classification but not 3D tasks such as point classification and shape completion [129]. Feature-based deep neural networks (DNNs) first convert the 3D data into a vector by extracting traditional shape features and then use a fully connected network to classify the shape, but they are constrained by the representation power of the extracted features [63]. Qi [136] proposed a novel deep neural network called PointNet, which can perform point classification or semantic segmentation with a 1080X GPU. In conclusion, integrating the local and global features extracted by deep learning models with the spatial representation of the point clouds will be useful for designing a high-performance model for plant canopy segmentation, but at present the segmentation quality is low because point clouds are irregular and sparse. Promising solutions are improving the resolution of multi-scale point clouds, developing deep learning architectures like those used for RGB images, and improving the processing of raw point clouds based on zero-shot learning [137].

Figure 1. Binocular stereo vision principle. x1 and x2 are the image coordinates and can be read directly from the image planes; camera calibration yields f (focal length) and b (baseline). The depth z of the object is calculated by the principle of similar triangles, which gives $z = \frac{f\,b}{d}$, with disparity $d = x_1 - x_2$.
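The similar-triangles relation in this caption can be sketched as a short function. It assumes the standard rectified-stereo form z = f b / (x1 - x2); the function name, the units (focal length in pixels, baseline in metres), and the sample values are illustrative assumptions.

```python
def stereo_depth(x1, x2, focal_length, baseline):
    """Depth of a point from its image coordinates in two rectified views.

    x1, x2: horizontal image coordinates in the left and right images.
    focal_length: f, in the same units as x1 and x2 (for example pixels).
    baseline: b, distance between the two camera centres.
    """
    disparity = x1 - x2
    if disparity == 0:
        raise ValueError("zero disparity: the point is at infinite depth")
    return focal_length * baseline / disparity

# Hypothetical values: f = 800 px, b = 0.1 m, disparity = 20 px.
print(stereo_depth(120.0, 100.0, focal_length=800.0, baseline=0.1))
```

Depth is inversely proportional to disparity, so a fixed matching error of one pixel costs more depth accuracy for distant points than for near ones.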

Figure 2. Principle of time of flight image collection: (a) distance measurement based on continuous-wave modulation; (b) distance measurement based on pulsed modulation.

Figure 3. The principle of structured light.

Advantages: 1) No external light; 2) Single viewpoint to compute depth. Disadvantages: 1) Poor depth resolution; 2) Does not work in bright light; 3) Short distance measurement.
LiDAR scanning technology. Advantages: 1) Fast image collection; 2) Can work at night; 3) Can work in severe weather (rain, snow, fog, etc.) for advanced laser scanning; 4) Works over long distances (more than 100 m). Disadvantages: 1) Poor edge detection (3D point clouds of edges of plant organs like leaves, for instance, are blurry); 2) Needs warm-up time; 3) Need for movement to obtain the depth data of

Plant Canopy Structure Measurement Based on 3D Reconstruction
The main steps of plant canopy structure measurement based on 3D reconstruction are 3D plant data acquisition, point cloud processing, 3D plant reconstruction, plant segmentation, and plant canopy structure parameter extraction. The process is shown in Figure 4.

Figure 4. Flow chart of plant canopy structure measurement based on 3D reconstruction.
3D point cloud data can be obtained by a visual sensor based on binocular stereo vision technology, multi-view vision technology, SfM technology, ToF technology, and so on.The details of the technical principle and camera specifications are shown in Section 2.

Figure 7. Iterative closest point (ICP) algorithm: registration of point clouds A and B.

Figure 9. (a) Skeleton segments that contain both stems and leaves [30]; (b) 3D reconstruction of a soybean leaf consisting of three leaflets. Black lines: normal vectors to fitted plane; red contour: projected region of interest (ROI) used for plane fitting [114].

Figure 10. (a) A description of the concave wheat ears with segmented point clouds [102]; (b) The triangulation results of three different sized plants; the triangle vertices extracted from the triangular mesh were used as the points to construct tetrahedrons, which can be used to calculate volume [27].
N.A. indicates that data were not found; RGB is the abbreviation of red, green, and blue.

Table 2. Open source software for multi-view stereo technology (MVS).

Table 3. Commercial software for 3D scene modeling utilizing SfM-MVS.

Table 4. Depth camera comparison based on time of flight (ToF).
N.A. indicates that data were not found.

Table 6. Depth camera comparison based on structured light.

Table 7. Summary of the advantages and disadvantages of each technology.

Table 8 introduces some functions of open source point cloud processing libraries and open source software.

Table 8. Introduction of open source libraries and software for point cloud processing.

Table 9. Examples of RMSE for plant canopy 3D structure parameters measurement.

Table 10. Examples of MAPE and R² for plant canopy 3D structure parameters measurement.