Advanced Machine Learning in Point Spectroscopy, RGB- and Hyperspectral-Imaging for Automatic Discriminations of Crops and Weeds: A Review

Abstract: Crop productivity is readily reduced by competition from weeds. It is particularly important to control weeds early to prevent yield losses. Limited herbicide choices and increasing costs of weed management are threatening the profitability of crops. Smart agriculture can use intelligent technology to accurately measure the distribution of weeds in the field and perform weed control tasks in selected areas, which can not only improve the effectiveness of pesticides, but also increase the economic benefits of agricultural products. The most important requirement for an automatic system that removes weeds within crop rows is reliable sensing technology that accurately differentiates weeds from crops at specific locations in the field. In recent years, there have been many significant achievements in the differentiation of crops and weeds. These studies are related to the development of rapid and non-destructive sensors, as well as the analysis methods for the data obtained. This paper presents a review of the use of three sensing methods, namely spectroscopy, color imaging, and hyperspectral imaging, in the discrimination of crops and weeds. Several machine learning algorithms have been employed for data analysis, such as convolutional neural network (CNN), artificial neural network (ANN), and support vector machine (SVM). Successful applications include weed detection in grain crops (such as maize, wheat, and soybean), vegetable crops (such as tomato, lettuce, and radish), and fiber crops (such as cotton) with unsupervised or supervised learning. This review gives a brief introduction to the proposed sensing and machine learning methods, then provides an overview of instructive examples of these techniques for weed/crop discrimination. The discussion describes the recent progress made in the development of automated technology for accurate plant identification as well as the challenges and future prospects. It is believed that this review is of great significance to those who study automatic plant care in crops using intelligent technology.


Introduction
Effective weed management is particularly important for smart agriculture. Owing to their prolific seed production and seed longevity, invasive weeds usually grow quickly and spread rapidly over the entire field, competing with crops for environmental resources such as physical space, nutrients, sunlight, and water [1]. Weeds often emerge earlier than crops or have a larger initial size (for example, weeds growing from roots) and lack natural enemies, which adversely affects yield at all stages of crop growth [2]. To avoid crop yield reduction, herbicides are widely used for weed control in all agricultural areas. The herbicide market keeps growing worldwide, and global herbicide sales are expected to reach $31.5 billion by 2020 [3]. However, the current status of weed detection, the technique of using both spatial and spectral information in the discrimination step has been proposed. By combining spectroscopy and machine vision, hyperspectral imaging provides richer spatial and spectral information, and it demonstrates a stronger ability to distinguish between crops and weeds [20]. For instance, Zhang, Staab, Slaughter, Giles, and Downey [11] developed a ground-based weeding robot equipped with a hyperspectral sensor (384-810 nm) and a pulsed-jet thermal micro-dosing system that can automatically remove intra-row weeds (including Solanum nigrum L. and Amaranthus retroflexus L.) in organic tomato crops in real time with an accuracy as high as 95.8%.
Over the years, the rapid development of new technologies in smart agriculture based on non-imaging spectroscopy [21], RGB imaging [22], or hyperspectral imaging [23] has encouraged extensive research on automatic weed detection. To date, no published review has focused on these techniques for differentiating crops from weeds. This paper gives a brief introduction to the necessity of weed management and the limitations of existing commercial methods, followed by an overview of proposed weed control methods. The second section describes the three available techniques: spectroscopy, RGB imaging, and hyperspectral imaging. The advanced machine learning algorithms for plant recognition are introduced in the third section. The fourth section emphasizes studies that couple these non-imaging and imaging sensing options with machine learning to discriminate crops from weeds. A discussion of these sensing techniques for robotic identification follows, together with the challenges and future prospects of the mentioned methods for real-time weed control.

Point Spectroscopy, RGB-, and Hyperspectral-Imaging
Three sensing techniques are mainly presented in this study. Spectroscopy is used to obtain spectral information in a wide spectral range, in which vibrations of specific frequencies that match the transition energy of bonds or groups can be detected. Visible/infrared (VIS/IR) spectroscopy is a non-destructive sensing technology that can quickly determine the properties of objects based on the spectral information in the VIS or IR spectral range without sample pretreatment [24]. The VIS region mainly contains spectral information related to color features in the range of 380 to 780 nm. The IR regions are named according to their distance from the VIS region: the spectrum in the range of 780-2500 nm is the near infrared (NIR) spectrum [25], the spectrum in the 2500-25,000 nm region is the mid-infrared (MIR) spectrum, and the spectrum in the 25,000-300,000 nm region is the far infrared (FIR) spectrum [26]. NIR and MIR spectra have higher energy than the FIR spectrum, which facilitates analysis and extraction of fingerprint information related to chemical compositions [27]. The NIR spectrum can be used to activate overtones or harmonic vibrations, while the MIR spectrum is mainly related to the fundamental vibration and rotational vibration structures [28,29]. NIR spectroscopy can be used to analyze the stretching and bending of chemical bonds including S-H, C-H, O-H, and N-H [30]. The MIR spectrum provides feature information related to chemical functional groups [31,32].
RGB imaging refers to the use of an RGB digital camera equipped with three color filters to capture the image of a scene, mimicking the way the normal human eye perceives color. In RGB imaging, the most common method of acquiring color images is to use the Bayer filter designed by Bryce Bayer in 1976 [33]. The Bayer filter consists of a mosaic of red (R), green (G), and blue (B) filters (Figure 1), which are embedded on the grid of the image sensor of a charge-coupled device (CCD) camera or a digital single-lens reflex (DSLR) camera [34]. The original RGB color image is generated by merging the recorded broadband information for R, G, and B light. The information from these three broad bands is potentially less sensitive than full-wavelength spectra to specific changes in the spectral response when comparing crops to weeds. Besides the spectral features, other features including visual textures, biological morphology, and spatial contexts could also be useful for plant detection [4]. Texture features represent attributes of the spatial arrangement of the gray levels of image pixels, which provide measures such as coarseness, smoothness, and regularity. Biological morphology refers to the shape and structure of different parts of a plant. For rows of crops in the field, their spatial contexts or location information can enhance the accuracy of discrimination.
Hyperspectral imaging (also called imaging spectroscopy) combines the main features of imaging and spectroscopy to collect spectral information over the full wavelength range for each pixel of the acquired image [35]. When VIS/IR spectroscopy is integrated with imaging technology, the data obtained become an image with a three-dimensional (3-D) structure, which contains one spectral dimension and two spatial dimensions [36]. Hyperspectral images can therefore provide both the spectral features and the image features of plants. There are three common methods for acquiring full-wavelength hyperspectral images: line scanning (pushbroom), area scanning (tunable filter), and point scanning (whiskbroom) [37]. A few feature variables selected from the full-wavelength region can be used to develop a simplified multispectral system that indicates specific characteristics of the object of interest [38]. Multispectral imaging, with the advantages of lighter hardware and faster computation, is becoming the successor of hyperspectral technology [39].
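The 3-D structure described above, and its reduction to a multispectral subset, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the cube is a small synthetic nested list indexed as [row][column][band], the band centers are hypothetical, and the function names do not come from any reviewed study.

```python
# Minimal sketch: a hyperspectral cube as nested lists indexed [row][col][band].
# All sizes, wavelengths, and values are illustrative assumptions.

def make_cube(n_rows, n_cols, wavelengths):
    """Build a synthetic cube whose values encode (row, col, band) for easy checking."""
    return [[[r * 100 + c * 10 + b for b in range(len(wavelengths))]
             for c in range(n_cols)]
            for r in range(n_rows)]

def pixel_spectrum(cube, row, col):
    """Return the full spectrum (one value per band) of a single pixel."""
    return list(cube[row][col])

def band_image(cube, band):
    """Return the 2-D spatial image recorded at one band."""
    return [[pixel[band] for pixel in row] for row in cube]

def select_bands(cube, wavelengths, wanted_nm):
    """Reduce the cube to the bands closest to the requested wavelengths."""
    idx = [min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - w))
           for w in wanted_nm]
    reduced = [[[pixel[i] for i in idx] for pixel in row] for row in cube]
    return reduced, [wavelengths[i] for i in idx]

wavelengths = [400, 500, 600, 700, 800, 900]  # nm, hypothetical band centers
cube = make_cube(2, 3, wavelengths)
spectrum = pixel_spectrum(cube, 1, 2)          # spectral dimension of one pixel
first_band = band_image(cube, 0)               # two spatial dimensions at one band
multispectral, kept = select_bands(cube, wavelengths, [672, 757, 897])
```

Indexing by pixel gives a spectrum, indexing by band gives an image, and keeping only a few bands yields the kind of reduced data a multispectral system would record.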
Smart Cities 2020, 3

Machine Learning Algorithms
Machine learning can help discover the rules and patterns that exist in large amounts of data to assist with decision-making. Machine learning can be divided into two types: unsupervised learning and supervised learning. Unsupervised learning, such as cluster analysis (CA), explores undetected patterns in unlabeled data sets without human supervision [40]. The most widely used machine learning algorithms in smart agriculture are supervised learning methods such as discriminant analysis (DA) [41]. Supervised learning learns a function that maps an input to an output based on example input-output pairs, which means that the algorithm learns a target function from attached class labels [42]. Its defining characteristic is that the behavior of the algorithm is determined by the training data. In formal notation, each instance is a feature vector x(j) with a class label y(j), forming a pair (x(j), y(j)). An instance can be described using a set of variables or features that take different forms, such as nominal (enumeration), numeric, or binary (e.g., 0 or 1). The set of n examples {(x(j), y(j)); j = 1, ..., n} is called the training data. After learning, the model is verified using validation data. The trained model can then classify the test data based on what it has learned. Since it is easier to train a system by showing examples of expected input-output behavior than to program it manually by predicting the expected response to all possible inputs, supervised learning algorithms such as convolutional neural network (CNN), artificial neural network (ANN), support vector machine (SVM), and soft independent modelling of class analogy (SIMCA) are becoming a popular alternative for developing practical software used in weed detection for precision agriculture robots [43,44].
A variety of sensing techniques can be coupled with machine learning. Figure 2 describes a schematic of a general framework for crop/weed classification based on spectroscopy, color imaging, and hyperspectral imaging. Detailed applications of these techniques are given in the following section.
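The training, validation, and test steps described in this section can be sketched in a few lines. The nearest-centroid rule used here is a deliberately simple stand-in chosen for illustration, not one of the algorithms named in the reviewed studies, and the two-feature examples are made-up values.

```python
# Minimal sketch of supervised learning on labeled pairs (x(j), y(j)).
# The nearest-centroid rule and the toy data are illustrative assumptions,
# not a method taken from the reviewed studies.

def train_centroids(pairs):
    """Learn one mean feature vector (centroid) per class label."""
    sums, counts = {}, {}
    for x, y in pairs:
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, value in enumerate(x):
            acc[k] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist(centroids[y]))

def accuracy(centroids, pairs):
    """Fraction of pairs whose predicted label matches the true label."""
    hits = sum(1 for x, y in pairs if predict(centroids, x) == y)
    return hits / len(pairs)

# Hypothetical two-feature examples (e.g. reflectance at two bands).
training = [([0.10, 0.60], "crop"), ([0.12, 0.64], "crop"),
            ([0.30, 0.40], "weed"), ([0.34, 0.36], "weed")]
validation = [([0.11, 0.62], "crop"), ([0.32, 0.38], "weed")]
test = [([0.09, 0.58], "crop"), ([0.35, 0.41], "weed")]

model = train_centroids(training)                    # learn from training data
val_acc = accuracy(model, validation)                # verify on validation data
test_labels = [predict(model, x) for x, _ in test]   # classify unseen test data
```

The three data sets play the separate roles described in the text: only the training pairs shape the model, while the validation and test sets are used to check it.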

Applications for Weed/Crop Discrimination
The concept of weed control has attracted much attention, and in the past few years many researchers have investigated the feasibility of non-imaging spectroscopy, color imaging, and hyperspectral imaging for rapid discrimination of weeds from crops. This section provides an overview of recent progress in these methods and lists the related applications for each.

Point Spectroscopy
Point spectroscopy obtains spectral information based on the interaction between electromagnetic radiation and the target in specific spectral ranges [25]. Spectroradiometers and spectrophotometers are typical high spectral resolution systems that lack spatial resolution and are used for point measurements rather than for imaging. Spectral resolution refers to the spectral measurement bandwidth. These high spectral resolution systems have a narrow measurement bandwidth and can resolve finer spectral characteristics. Although some systems can measure the reflected light with high spatial resolution from a small target area, they cannot be used for on-site monitoring. Since spectral reflectance is closely related to spectral absorption, it is necessary to mention the absorption of components in plants [45,46]. Most plant leaves contain the same set of absorbing components, but their concentrations vary greatly among different plants, which causes the magnitude of spectral absorption to differ. This spectral difference can be used to distinguish crops and weeds. The spectral reflectance is also affected by the cell structure and the physical structure of the plant surface [47]. Structural differences between plants therefore influence the spectral reflectance and help to identify different plants. This reflectance feature is most closely related to the spectra in the NIR region. In addition, plants excited by higher-energy light such as ultraviolet (UV) light also emit spectral fluorescence [13]. Specifically, chlorophyll fluorescence is closely related to the rate of plant photosynthesis and can be used as an important indicator of green plant stress [48].
Many studies have applied VIS/NIR/MIR spectroscopy to the identification of plant species. Based on VIS/NIR spectroscopy (325-1075 nm), back propagation artificial neural network (BPANN) and radial basis function neural network (RBFNN) showed a strong ability for weed identification in soybean fields [49-51]. Broadleaf weeds among wheat and chickpea crops were successfully classified by general discriminant analysis (GDA) using the VIS/NIR spectra (700-1200 nm) with the highest classification accuracy of about 95% [52]. SVM, ANN, and decision tree (DT) models developed using VIS spectra (350-760 nm) obtained higher accuracies than those using VIS/NIR spectra (350-2500 nm) for crop/weed separation [53]. Based on a canonical discriminant analysis (CDA) model, the NIR spectra (1580-2320 nm) were more powerful than the red-edge (RD)/NIR spectra (700-1100 nm) for distinguishing nightshade weeds from tomato plants [54]. Moreover, Panneton, et al. [55] and Panneton, et al. [56] investigated the feasibility of UV-induced fluorescence spectroscopy (400 to 755 nm) for discriminating corn from dicot and monocot weeds. The RD fluorescence spectra (670 to 755 nm) were feasible for weed discrimination, resulting in an accuracy of about 85% [55]. Based on spectral bands of blue-green (BG) fluorescence (400-490 nm), linear discriminant analysis (LDA) discriminated dicot weeds from the corn crop with a classification error of less than 5.2% [56]. In addition to VIS/NIR spectroscopy, MIR spectroscopy coupled with unsupervised learning (such as CA) and supervised learning (such as DA) was successfully employed to distinguish weeds such as groundsel, barnyard grass, and blackgrass from cereal crops (such as maize, barley, and wheat) and vegetable crops (such as sugar beet and rocket salad), with 100% correct classification [57,58].
Since continuous narrow-band data sets contain much redundant spectral information, a recent goal of spectral sensing is to determine the combination of a few feature spectra that is most useful for plant classification. Spectral variables associated with features of higher discriminative potential are selected to develop simplified algorithms. A dedicated sensor with the selected spectral bands can then be developed to map weed infestations for site-specific application of herbicides. Based on several feature wavelengths selected from 350 to 2500 nm by principal component analysis (PCA), Bayesian discriminant analysis was effectively used to classify 5 weeds such as barnyard grass and green foxtail from 2 seedling cabbages [59]. In another study, the simplified CDA model developed using feature variables (672, 757, 897, 1117, and 1722 nm) selected by uninformative variable elimination (UVE) and successive projection algorithm (SPA) showed higher recognition accuracy (98.99%) than that (90.91%) of partial least square discriminant analysis (PLSDA) for discrimination of weeds from winter rape [60]. One hundred percent classification accuracy was achieved using 3 feature wavelengths (385, 415, and 435 nm) for distinguishing spine-greens from the fiber crop (cotton), and 5 wavelengths (375, 465, 585, 705, and 1035 nm) for distinguishing barnyard-grass from the rice crop [61]. SIMCA models using feature wavelengths in VIS (640, 676, and 730 nm) and NIR (1078, 1435, 1490, and 1615 nm) regions classified three weed species (water-hemp, kochia, and lamb's-quarters) with over 90% accuracy [44]. Compared to random forests (RF), higher accuracy (97%) was obtained by SIMCA using four VIS/NIR variables (500-550, 650-750, 1300-1450, and 1800-1900 nm) to differentiate sugarcane from weeds [62]. This good classification result was attributed to the different constituents or concentrations of compounds (such as anthocyanin, chlorophyll, and moisture) in these crops and weeds, which caused a significant difference in spectral features (Figure 3).
SVM using 3 wavelengths (635, 685, and 785 nm) in the RD region classified a broad-leaved weed (silver beet) from narrow-leaved corn plants with a high accuracy of 97% [63]. Researchers found that the RD region was highly important for separation of crops and broadleaf weeds [21]. Nevertheless, Shirzadifar, Bajwa, Mireei, Howatt and Nowatzki [44] demonstrated that the SIMCA model using characteristic wavelengths in the NIR region (1078, 1435, 1490, and 1615 nm) achieved 100% accuracy, which was more effective than the model using the wavelengths (640, 676, and 730 nm) in the RD region for discrimination of different weeds (including kochia, water-hemp, and lamb's-quarters).
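The idea of keeping only the most discriminative wavelengths can be illustrated with a simple ranking sketch. It uses a two-class Fisher score as a stand-in criterion; this is not the PCA, UVE, or SPA procedure used in the cited studies, and the reflectance values and wavelength list are invented for illustration.

```python
# Minimal sketch: rank wavelengths by a two-class Fisher score
# (squared mean difference divided by the summed class variances).
# The scoring rule and the reflectance values are illustrative assumptions.

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def fisher_scores(crop_spectra, weed_spectra):
    """Return one separability score per wavelength (higher = more discriminative)."""
    n_bands = len(crop_spectra[0])
    scores = []
    for b in range(n_bands):
        crop = [s[b] for s in crop_spectra]
        weed = [s[b] for s in weed_spectra]
        spread = variance(crop) + variance(weed)
        gap = (mean(crop) - mean(weed)) ** 2
        scores.append(gap / spread if spread > 0 else float("inf"))
    return scores

def top_bands(wavelengths, scores, k):
    """Pick the k wavelengths with the highest scores, returned in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(wavelengths[i] for i in ranked[:k])

wavelengths = [550, 685, 785, 1435]  # nm, hypothetical measurement bands
crop_spectra = [[0.12, 0.05, 0.50, 0.30],
                [0.14, 0.06, 0.54, 0.32],
                [0.13, 0.04, 0.52, 0.28]]
weed_spectra = [[0.13, 0.09, 0.40, 0.20],
                [0.12, 0.10, 0.42, 0.22],
                [0.14, 0.08, 0.44, 0.18]]

scores = fisher_scores(crop_spectra, weed_spectra)
selected = top_bands(wavelengths, scores, k=2)
```

A band where the two class means coincide scores near zero and is discarded, while bands with a large mean gap relative to within-class spread are retained for a simplified model.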
Overall, non-imaging spectroscopy combined with machine learning models (such as SVM, LDA, and SIMCA) has shown great potential for discrimination of crops including cereals (such as maize, barley, and wheat), vegetables (such as sugar beet and rocket salad), and fibers (such as cotton) from all kinds of weeds. The models developed using reduced feature variables showed equivalent accuracy to full-wavelength models. Models using NIR or RD spectra appeared to perform better than those using other spectra in the VIS region. Considering that these promising feature spectra have been determined in recent studies, further work is required to use more effective machine learning algorithms to establish multispectral systems that improve the speed and accuracy of plant species identification. The accuracies of the developed algorithms were affected by many factors. Further studies should be carried out to assess the impact of specific factors (such as plant species, spectral types, and machine learning methods) on detection accuracy. The applications of non-imaging spectroscopy are listed in Table 1.
Abbreviations used in Table 1: SIMCA, soft independent modelling of class analogy; LDA, linear discriminant analysis; GDA, general discriminant analysis; RF, random forest; CA, cluster analyses; DA, discriminant analysis; PLSDA, partial least square discriminant analysis; SVM, support vector machines; NN, neural network.

RGB Imaging
The use of RGB imaging to acquire image data for plant species discrimination is well established. The RGB systems of color cameras equipped with broadband filters are typical high spatial resolution systems with limited spectral resolution. Spatial resolution refers to the area over which an individual measurement can be made. Imaging systems with high spatial resolution can have pixel resolutions on the order of a few millimeters or even smaller [65]. Although only broadband reflectance images are acquired, they are sufficient to effectively distinguish plants from the background, such as soil, based on different spectral characteristics [66]. After the acquired broadband reflectance image is segmented, features representing the morphology or texture of the plant canopy and plant leaves can be extracted to differentiate plants.
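The plant/background segmentation step mentioned above can be sketched with a simple color threshold. This is only an illustration under stated assumptions: it thresholds hue and saturation with Python's standard `colorsys` module, and the threshold values and pixel colors are hypothetical rather than taken from any cited study.

```python
import colorsys

# Minimal sketch: separate green vegetation from soil by thresholding hue
# and saturation. The thresholds and pixel values are illustrative assumptions.

def is_vegetation(rgb, hue_range=(0.17, 0.45), min_saturation=0.25):
    """Return True when an 8-bit (R, G, B) pixel falls in a green hue band."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, _ = colorsys.rgb_to_hsv(r, g, b)
    return hue_range[0] <= h <= hue_range[1] and s >= min_saturation

def vegetation_mask(image):
    """Map an image (rows of RGB tuples) to a binary mask: 1 = plant, 0 = background."""
    return [[1 if is_vegetation(px) else 0 for px in row] for row in image]

# Hypothetical 2 x 3 image: green leaf pixels next to brown soil pixels.
image = [[(60, 140, 50), (120, 90, 60), (70, 160, 40)],
         [(110, 80, 55), (50, 130, 45), (130, 100, 70)]]
mask = vegetation_mask(image)
```

Fixed thresholds like these are fragile under changing illumination or leaf color, so in practice they would need to be tuned or replaced by a learned segmentation step.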
RGB imaging and machine learning algorithms have been widely utilized for classification between crops and weeds. CNN, a deep learning method, performs well in object detection and automatic feature engineering under uncontrolled illumination. CNN models trained on plant RGB images successfully identified sugar beet plants among weeds and were suitable for online operation in the field [67]. The classification results of CNN show that the recognition rate of weeds in soybean crops reached 91.96% [68-71]. Higher identification accuracy (92.89%) was obtained by CNN when the random initialization of the CNN weights was replaced by k-means unsupervised feature learning as a pre-training process [72]. The best CNN-based crop/weed identifier achieved a high accuracy of 99.29% on classification of tomato and cotton from common weeds [73]. CNN and k-FLBPCM (filtered local binary patterns with contour masks and coefficient k) models were shown to identify crop and weed species of similar morphologies, such as canola and wild radish. In this study, based on both models, these weeds were effectively classified from barley crops at four different growth stages with accuracies up to 99% [74]. However, the k-FLBPCM model trained using images of large leaves collected in the fourth growth stage could accurately identify the smaller leaves of plants in the second and third growth stages, which CNN could not do.
Besides CNN and k-FLBPCM, other algorithms such as RF, SVM, and ANN have also been investigated by a number of researchers to distinguish between species. For example, RF distinguished cotton from intra-row weeds with the highest accuracy of 85.83% [75]. In another study, RF achieved higher performance, yielding 94.5% accuracy for classification of weeds (such as bindweeds, lamb's quarters, and crabgrass) in the early growth stage of a maize field [76]. When ANN was considered, 95.1% of crop plants were correctly detected [77]. Nevertheless, SVM achieved higher performance (accuracy of 96.67%) than ANN for weed detection in sugar beet fields [43]. SVM also differentiated maize from mixtures of different weed species with an accuracy of 96.74% [78]. Besides supervised learning methods, an unsupervised clustering algorithm was successfully applied to detect weeds in sugarcane and rice fields, yielding an overall accuracy of more than 94% [79,80]. Without any prior knowledge of the species present in the field, a naive Bayesian classifier and a Gaussian mixture clustering algorithm discriminated 85% of the weeds of multiple species [81]. Two weed species were discriminated with an overall accuracy of 98.40% based on the Bayes classifier, with an execution time of about 35 ms per image [82]. In addition, an automated image classification system differentiated crops and weeds in sugarcane fields with 92.9% accuracy at a processing time of 20 ms [83]. An algorithm using the hue, saturation, value (HSV) color space demonstrated very good classification performance and recognized 98.91% of cauliflower plants. However, the misclassification rate increased when the color of the plant leaves changed due to disease or very sunny conditions [84]. A Hough transform algorithm achieved over 90% accuracy for crop/inter-row weed discrimination [85]. Wavelet texture features were able to distinguish weeds among the crop with a correct detection rate of 96%, while, at most, 4% of sugar beets were incorrectly classified as weeds [86].
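Accuracy figures such as those quoted above are ratios computed from counts of correct and incorrect classifications. The following sketch shows that arithmetic for a two-class crop/weed problem; the counts are made-up numbers chosen only to make the calculation easy to follow, and the function and key names are hypothetical.

```python
# Minimal sketch: accuracy and per-class rates from a 2 x 2 confusion table.
# The counts below are made-up numbers chosen only to illustrate the arithmetic.

def classification_rates(weed_as_weed, weed_as_crop, crop_as_crop, crop_as_weed):
    """Compute overall accuracy, weed detection rate, and crop misclassification rate."""
    total = weed_as_weed + weed_as_crop + crop_as_crop + crop_as_weed
    return {
        "overall_accuracy": (weed_as_weed + crop_as_crop) / total,
        "weed_detection_rate": weed_as_weed / (weed_as_weed + weed_as_crop),
        "crop_misclassified_as_weed": crop_as_weed / (crop_as_crop + crop_as_weed),
    }

rates = classification_rates(weed_as_weed=96, weed_as_crop=4,
                             crop_as_crop=192, crop_as_weed=8)
```

Reporting the per-class rates alongside overall accuracy matters for weeding, because a crop plant misclassified as a weed may be destroyed while a missed weed is merely left in place.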
The abovementioned studies demonstrate that RGB imaging is an efficient tool to classify crop plants and weeds. Among the machine learning tools, the most commonly used is CNN. A major advantage of CNN is that it can automatically extract features and classify plant images with high accuracy. CNN and k-FLBPCM showed the highest capacity for discrimination of crops (including barley, maize, wheat, and sugar beet) from weeds, followed by other conventional models including SVM, ANN, and RF. CNN requires a large number of images at each stage of plant growth for effective feature learning. The k-FLBPCM method works better if the edges of crop and weed leaves are accurately extracted. Further research is needed to bring the promising methods into practical applications and to improve classification accuracy during real-time detection. Also, more studies should be conducted to further validate the performance of CNN and k-FLBPCM on other crops such as sunflower and blueberry. The applications of RGB imaging are listed in Table 2.

Hyperspectral Imaging
Hyperspectral imaging obtains the images of continuous narrow wavebands and generates the spectrum for each pixel in the image [90]. The extensively used imaging spectrometer is described as line scanning device. Each time a line from the target area is projected onto the imaging array through the diffractive optics, then a series of line images are arranged in sequence to form an entire target image. The system can capture a 3D image (including spatial dimensions of 2D and the vertical dimension with spectral data) of a moving scene along a specific travel speed and direction, as all spectral data in the same row are captured simultaneously. Many researchers have demonstrated the use of hyperspectral imaging systems to identify plant species in the field. For example, the line-imaging spectroscopy has recently been used for precision differentiations of crops (such as soybean, wheat, and cabbage) from various weed species such as black nightshade and pigweed [91][92][93]. The imaging spectrometer (660-1060 nm) and bilinear methods showed classification performances of about 90% for crop-weed discrimination [16]. The hyperspectral sensor distinguished cotton plants with an average false detection rate of only 15% [94]. The presence of early season pitted morning glory in soybean was detected with the classification accuracy of at least 87% [95]. Based on PLSDA, the total accuracy of 85 % was obtained for detection of annual grasses and broadleaf weeds in wheat fields [96]. In another study, SVM classifier yielded higher accuracy (91%) for mapping infestation of musk thistle in the wheat crops [97]. RF differentiated weeds from maize and cotton with the overall accuracy as high as 95.9% [98,99]. Then, one-class classifiers including the self-organizing map (SOM) and mixture of Gaussians (MOG) discriminated the crop and weed species with 100% accuracy [20]. 
Overall, machine learning coupled with line-scan hyperspectral imaging successfully achieved ground-level plant species discrimination.
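As an illustration of the line-scan acquisition described above, the following Python sketch stacks successive line frames into a hyperspectral cube and then reads back a pixel spectrum and a single-band image. The function names (`assemble_hypercube`, `pixel_spectrum`, `band_image`), the nested-list layout, and the toy intensity values are assumptions made for this example; they are not taken from any of the cited systems.

```python
def assemble_hypercube(line_frames):
    """Stack line-scan frames into a cube indexed as cube[row][col][band].

    Each frame is one scanned line: a list of pixels, where every pixel is a
    list of band intensities recorded simultaneously for that line.
    """
    if not line_frames:
        raise ValueError("at least one line frame is required")
    width = len(line_frames[0])
    bands = len(line_frames[0][0])
    for frame in line_frames:
        if len(frame) != width or any(len(px) != bands for px in frame):
            raise ValueError("all frames must share the same width and band count")
    # Copy the data so the cube does not alias the input frames.
    return [[list(px) for px in frame] for frame in line_frames]


def pixel_spectrum(cube, row, col):
    """Return the full spectrum recorded at one spatial position."""
    return cube[row][col]


def band_image(cube, band):
    """Return the 2D image (rows x cols) for a single waveband."""
    return [[px[band] for px in row] for row in cube]


# Hypothetical data: 3 scanned lines, 2 pixels per line, 3 bands per pixel.
frames = [
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    [[0.2, 0.3, 0.4], [0.5, 0.6, 0.7]],
    [[0.3, 0.4, 0.5], [0.6, 0.7, 0.8]],
]
cube = assemble_hypercube(frames)
```

In this layout the number of rows grows with the distance travelled, which reflects why the travel speed of the platform and the line acquisition rate are coupled in a line-scan system.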
A typical limitation of studies using line-scan hyperspectral machine vision to identify plant species is that the spectral data were recorded from crops grown in a single season. However, plants on large farms are exposed to a range of uncontrolled environmental conditions. Seasonal differences in factors such as irrigation and solar irradiance can affect plant optical properties, because foliage reflectance is related to environmental factors [100]. To investigate the influence of seasonal effects on model performance, Zhang et al. [101] developed a VIS/NIR hyperspectral weed mapping system (Figure 4) for multi-season identification of tomato and weed species. The Bayesian classifier performed well in each season, with cross-validation species classification accuracy over 92%, and ultimately achieved a cross-season recognition accuracy of 95.8%. Subsequently, Bayesian classifiers were applied to VIS/NIR hyperspectral images (384-810 nm) of tomato grown under various sunlight intensities [102]. Plant species exposed to higher solar irradiance (92.3% accuracy) were more easily distinguished than those under low solar irradiance (87.5% accuracy) [103]. The results of an outdoor test showed that this line-scanning hyperspectral system, combined with thermal micro-dosing of oils, can be translated into practical applications for weed control [11]. Bayesian classifiers identified two weed species within early-growth tomatoes with an overall accuracy of 95.9%.
Although the automatic system described above could detect field weeds in real time, the speed of the tractor was constrained by the line-imaging hyperspectral sensing platform, because scanning an image with this kind of camera is time-consuming [11]. Unlike line-scan imaging, area-scan hyperspectral imaging can simultaneously capture spatial and spectral information within a single integration time of the detector array.
This method is more practical when the required set of wavelengths is well defined before data acquisition, the number of wavelengths is limited, and the filter is already available. A snapshot area-scanning hyperspectral camera improves the transfer rate of image frames from the camera to the computer and can greatly reduce the overall processing time. The incident light is split after entering through the common aperture. Each stream is directed into a different filter and projected onto a separate area of the image plane. Gao et al. [98] investigated the feasibility of a snapshot mosaic hyperspectral camera with 25 bands for classifying weeds in maize crops. Red/NIR wavelengths such as 677, 764, and 871 nm appeared frequently among the important features. The crops were distinguished from weeds with very high precision (94%) using an RF model. This result supports the prospect of further application of this area-scanning camera in the field to implement site-specific weed management. These improvements can alleviate the travel speed limitation imposed by line-scan hyperspectral weed sensing systems. However, the influence of redundant image information should be further reduced so that the travel speed reaches a commercially acceptable level.
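To make the idea of a snapshot mosaic sensor more concrete, the sketch below separates a single raw frame into per-band images. It assumes one possible layout in which the 25 filters form a repeating 5 x 5 per-pixel pattern on the sensor. That layout, the function name `demosaic`, and the absence of any band-to-wavelength mapping or calibration are assumptions of this example and do not describe the camera used in [98].

```python
def demosaic(raw, pattern=5):
    """Split a raw mosaic frame into pattern*pattern band images.

    raw is a 2D list (rows x cols) of sensor values. The pixel at
    (r, c) is assumed to sit behind filter number
    (r % pattern) * pattern + (c % pattern).
    Returns cube[band][row][col], each band image being
    (rows // pattern) x (cols // pattern) pixels.
    """
    rows, cols = len(raw), len(raw[0])
    if rows % pattern or cols % pattern:
        raise ValueError("frame size must be a multiple of the pattern size")
    cube = []
    for dr in range(pattern):
        for dc in range(pattern):
            # Take every pattern-th pixel, starting at offset (dr, dc).
            band = [
                [raw[r][c] for c in range(dc, cols, pattern)]
                for r in range(dr, rows, pattern)
            ]
            cube.append(band)
    return cube


# Hypothetical 10 x 10 raw frame whose values encode the filter index,
# so each recovered band image should be constant.
raw_frame = [[(r % 5) * 5 + (c % 5) for c in range(10)] for r in range(10)]
band_cube = demosaic(raw_frame, pattern=5)
```

Because every band is sampled in the same exposure, all 25 band images come from one frame; the price is a spatial resolution reduced by the pattern size in each direction.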
Rather than adopting many fixed-wavelength optical filters, a tunable filter coupled with a camera can be driven to a specific wavelength for each image, thereby producing an image cube containing images at selected feature wavelengths. The tunable filter is a diffractive device. When the filter is driven at different frequencies to adjust the passband, the material behaves as a variable-wavelength transmission grating. Such tunable filter-based hyperspectral imagers have already been used for plant detection and classification by many researchers [104,105]. A tunable filter also makes it possible to select the feature spectra for automatic weed detection [94]. To improve the speed of real-time detection, a smaller set of characteristic bands (21 wavelengths), selected by stepwise discriminant analysis (SDA) from hundreds of VIS/NIR hyperspectral bands, was used to design a multispectral machine vision system [106]. Bayesian classification models distinguished lettuce plants from weeds (such as groundsel and sowthistle) with an average accuracy of 90.3%. SDA using feature wavelengths in the blue (420-460 nm), green (560-580 nm), red (620-650 nm), and NIR (700-740 nm) regions discriminated wheat from wild oat and canary grass with 100% accuracy [107]. To optimize the wavebands for plant species classification, an RF model using selected images in the shortwave-infrared region obtained high accuracies (93.8% to 100%) for classifying two pigweeds against three soybean varieties [108]. Compared with RF, SVM with six VIS/NIR feature variables (415, 561, 687, 705, 735, and 1007 nm) selected by SPA achieved higher performance for recognizing weeds in rice crops [109]. Based on eight feature spectral bands, SVM achieved performance similar to LDA for discriminating crops and weeds [110]. An SVM model developed using four spectral images in the VIS/NIR domain discriminated weeds between and within crop rows (maize and sugar beet) with an accuracy of 89% [111].
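The band-reduction step discussed above can be illustrated with a short Python sketch. Note that it does not implement SDA or SPA. As a simpler stand-in, it ranks each waveband independently by a Fisher-type ratio (squared difference of class means divided by the sum of class variances) and keeps the top-ranked bands. The function names, the wavelengths, and the reflectance values are hypothetical and serve only to show the mechanics.

```python
from statistics import mean, pvariance


def fisher_score(crop_values, weed_values):
    """Separation of two classes on a single band (higher is better)."""
    between = (mean(crop_values) - mean(weed_values)) ** 2
    within = pvariance(crop_values) + pvariance(weed_values)
    if within == 0:
        # Degenerate case: no spread within either class.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_bands(crop_spectra, weed_spectra, wavelengths, keep):
    """Return the `keep` wavelengths with the highest Fisher scores."""
    scored = []
    for i, wavelength in enumerate(wavelengths):
        crop = [spectrum[i] for spectrum in crop_spectra]
        weed = [spectrum[i] for spectrum in weed_spectra]
        scored.append((fisher_score(crop, weed), wavelength))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [wavelength for _, wavelength in scored[:keep]]


# Hypothetical reflectance spectra sampled at four wavelengths (nm).
wavelengths_nm = [480, 550, 670, 720]
crop_spectra = [
    [0.10, 0.30, 0.08, 0.50],
    [0.12, 0.32, 0.09, 0.52],
    [0.11, 0.31, 0.07, 0.48],
]
weed_spectra = [
    [0.11, 0.20, 0.08, 0.35],
    [0.10, 0.22, 0.09, 0.33],
    [0.12, 0.21, 0.07, 0.37],
]
selected = rank_bands(crop_spectra, weed_spectra, wavelengths_nm, keep=2)
```

Unlike this per-band ranking, stepwise methods such as SDA evaluate bands jointly, so they can discard a band that is individually discriminative but redundant with one already selected.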
Deep learning algorithms such as ANN and CNN are becoming more popular than traditional machine learning methods (such as SDA, LDA, RF, and SVM) for plant identification. Eddy et al. [112] developed an ANN model with reduced wavelengths that discriminated weeds (including wild oats and redroot pigweed) from different crops (including field pea, spring wheat, and canola) (Figure 5). In this study, the ANN based on seven feature bands (480, 550, 600, 670, 720, 840, and 930 nm), identified using SDA and PCA, yielded a high classification accuracy (94%) that was comparable to the full-wavelength result (95%) [112]. Later, 100% accuracy for classifying weeds in wheat and bean crops was obtained by an ANN using 12 spectral signatures (480, 485, 490, 520, 565, 585, 590, 595, 690, 720, 725, and 730 nm) [113]. In addition to ANN, a CNN based on U-Net was successfully used for semantic segmentation of crops and weeds in multispectral images [114]. The CNN showed acceptable performance, with an F1-score of about 0.8 for weed detection [115]. Such results suggest that a multispectral imaging system built on a small number of images at discrete spectral bands has higher potential for plant detection than full-wavelength hyperspectral imaging [39,116].
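Since segmentation results are summarized above with the F1-score, a brief Python sketch of how this metric is computed from pixel labels may be helpful. The F1-score is the harmonic mean of precision and recall for the class of interest. The function names and the toy label sequences below are illustrative assumptions and are unrelated to the data in [115].

```python
def confusion_counts(predicted, truth, positive="weed"):
    """Count true positives, false positives, and false negatives."""
    if len(predicted) != len(truth):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == positive and t == positive)
    fp = sum(1 for p, t in pairs if p == positive and t != positive)
    fn = sum(1 for p, t in pairs if p != positive and t == positive)
    return tp, fp, fn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall (0.0 if undefined)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical flattened pixel labels from a segmentation mask.
truth_labels = ["weed", "weed", "weed", "weed", "crop", "crop", "soil", "soil"]
predicted_labels = ["weed", "weed", "weed", "crop", "weed", "crop", "soil", "soil"]
counts = confusion_counts(predicted_labels, truth_labels, positive="weed")
weed_f1 = f1_score(*counts)
```

Because the F1-score ignores true negatives, it is less inflated than overall accuracy when weeds occupy only a small fraction of the image.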
The research reviewed here indicates a trend toward identifying crop and weed plants in real time. These studies demonstrate that different crops (such as rice, soybean, tomato, and cabbage) can be successfully discriminated from weeds by hyperspectral imaging, which provides a good basis for detecting other plants in the future. Spectral images in both the VIS and NIR regions provided enough information for classification. For vegetables such as lettuce, classifier accuracy should be further improved. In particular, the findings point toward hyperspectral imaging based on reduced wavelengths for real-time crop and weed discrimination. More studies are needed to validate the precision of VIS and NIR spectra in simplified model development.
Given that CNN has shown higher performance, the stability of such methods should be studied further. More research is needed to monitor the changes in spectral and image features of plants during different growth stages. In addition, future research based on multispectral imaging and new algorithms should be explored to measure the characteristics of different crops and weeds. Applications of hyperspectral imaging are listed in Table 3.
Figure 5. Full (a, b, c, d) and reduced hyperspectral band sets (e, f, g, h) of crop (canola in yellow; wheat in cyan; field pea in green) and weed (redroot pigweed in red; wild oat in orange) combinations [112].
RF--random forest; SVM--support vector machines; LDA--linear discriminant analysis; MOG--mixture of Gaussians; SOM--self-organising map; SAM--spectral angle mapper; ANN--artificial neural network; PLSDA--partial least squares discriminant analysis; SSC--site specific classifier; HMW--haar mother wavelet.

Discussions
The development of advanced technology for plant identification in the context of automatic weed control is a difficult but realistic task. Machine learning algorithms coupled with non-imaging spectroscopy, RGB imaging, and hyperspectral imaging have proven to be effective methods for rapid classification of crops and weeds. These three sensing techniques are rapid, eco-friendly, and non-destructive, require no sample preparation, and have great potential for wide use in automatic weed control in the field. They can provide accurate reference values of weed or crop features to be used as inputs to machine learning models. A non-imaging sensor does not provide spatial information over an entire plant, but it generates spectral information (VIS/NIR/MIR) from very small points or local parts. Correct classification rates are similar for the VIS, NIR, and MIR regions. Some studies found spectra in the VIS region to be very important for classification, while others found the NIR/MIR spectra to be more important. Given these contrasting results, the optimal spectral range may depend on the properties of the crops and weeds concerned. Many studies have revealed that the spectral characteristics of plants are affected by environmental factors such as humidity, solar radiation, and temperature [100]. For instance, Henry et al. [121] demonstrated that moisture content at the leaf level affected the ability of spectral reflectance in the VIS/NIR regions to discriminate weeds from soybean crops. Increasing water stress changed the amplitudes of the plants' spectral signatures, which improved the accuracy of the model in distinguishing soybean from weed plants. Nightingale [122] observed that the accumulation of anthocyanin (which absorbs light in the VIS region) in the foliage of tomato plants was inversely proportional to the growth temperature.
Overall, all the spectroscopic ranges (VIS/NIR/MIR) provide an average classification accuracy higher than 81.58%. In some instances, MIR spectroscopy improves the average prediction accuracy. The advantage of MIR spectroscopy is that MIR peaks are more clearly defined and more easily assigned than those in other bands [123]. RGB imaging provides a broad-band RGB image of a scene but cannot capture detailed spectral data. Accurate plant identification with RGB imaging appears difficult when plants are imaged in the early stages of growth, because their appearance changes significantly as they grow. Although RGB data provide spectral information in only three broad bands, competitive classification results have been achieved, as shown in the section on RGB imaging, by exploiting other features such as biological morphology and spatial context. For both spectroscopy and hyperspectral imaging, spectra collected over the full wavelength range contain a great deal of redundant information that should be removed before modeling. Hyperspectral imaging, which provides both spectral and image information, shows great potential for plant identification, but this requires a deeper understanding of the hyperspectral images of crops and weeds to determine which combination of input variables contributes most to model accuracy.
Machine learning has been used to solve the classification problem of crops and weeds in many cases. Establishing a robust model requires large data sets, but the stability of the training data is affected by seasonal factors. The color similarity of weeds and crops makes it difficult to separate leaves or plants with occlusions. Other features (such as spectral features, visual textures, biological morphology, and spatial contexts) have higher potential for plant identification. Machine learning methods (such as SVM, ANN, and CNN) that integrate multiple features have achieved reasonable accuracies in determining whether a plant is a weed or a crop. Such methods are computationally intensive in the training phase, while deployment is generally much lighter. Thus, the training cost does not reduce the ability to make real-time decisions during herbicide application, especially for CNN-based methods. Deep learning-based techniques including CNN have received increasing attention in recent years and have become an essential research tool. The main reasons for the popularity of CNN are its scalability to different data sets and the performance gains such algorithms achieve during training. Parallel processing on graphics processing units (GPUs) and the availability of large-scale data sets have simplified the study of CNN. Although CNNs show great performance in image classification of different plants, they require extremely large numbers of labelled samples for model training. The training data set should be large enough to prevent overfitting, which requires a lot of manual labor to annotate the images. This entails high labor costs, considering the retraining required for new applications and the labeling errors introduced by different experts. Moreover, deep learning has not yet been fully integrated with prior knowledge.
Also, the current CNN algorithms are not robust enough for identifications of plants with high degrees of foliage occlusion and cannot meet the speed of the tractor in real-time applications.
Selecting several discontinuous feature spectral images from the full spectral range can be a more operationally feasible way to distinguish between weeds and crops, because it allows the use of simpler multispectral imaging with better spectral differentiation. Recognition based on specific band images should help in developing dedicated sensing equipment, which can significantly reduce costs and the dependence on computing power. It would be feasible to use such multispectral sensors on sprayers or UAVs for real-time weed management at specific locations in a large field. However, the design of robust and effective multispectral imaging sensors faces several challenges: (a) the loading limit requires the adoption of a very limited number of spectral bands, (b) if the purchased optical filter is not selective enough to filter out redundant information thoroughly, the accuracy of the model in identifying crops and weeds will be reduced, (c) the acquired spectral images may be affected by the surrounding uncontrolled environment (such as cloud cover and solar radiation), (d) the spectral reflectance of plant leaves may change abnormally under physiological stress, (e) the accuracy of the detection algorithm may be affected by various interference factors, and (f) the machine learning tools developed may not be robust enough for real-time operation.
In the future, more in-depth research on multispectral imaging should be carried out to improve the discrimination of weeds and crops in a more effective and rational way, and to ensure that the technological achievements can be readily used for automatic real-time applications. Listed below are the recommendations for the future:
1. The sensing system with selected optical filters should be continuously adjusted to improve the spectral image resolution and the detection accuracy.
2. It is necessary to obtain accurate reference values of plant characteristics in samples collected over many years to improve the robustness of the training model.
3. The simplified machine learning models should be further optimized to ensure their effectiveness for plant-specific tasks.
4. The final developed system should be robust enough to support automatic weed removal and handle various abnormal situations in a given task.
5. More practical and feasibility studies on farmers' fields should be carried out, as crop-weed interaction is a very complex phenomenon.
6. The potential cost of automatic weed control should be assessed and compared with conventional approaches in order to commercialize the technology.
Besides machine learning-based sensing methods, crop signaling techniques have also been used for crop and weed discrimination. Crop signaling works by applying a specific fluorescent marker (such as systemic markers, topical markers, plant labels, or fluorescent proteins) to the crop plants, making them machine readable and distinguishable from weeds [124]. For example, Su et al. [125] established a fluorescence imaging system and successfully distinguished different weeds (such as burning nettle and groundsel) from snap bean crops by detecting a systemic marker (Rhodamine B) that was applied only to the bean seeds before planting. However, the technology is still developing and has not received widespread attention. Higher concentrations of the fluorescent marker have been found to pose a greater risk to normal plant growth [124]. Because the fluorescent signal is unstable, further studies in different outdoor areas under full sunlight are needed to assess the photobleaching and persistence of the fluorescent signal in crop plants in diverse environments [126]. In addition, it is not clear whether weeds will absorb the fluorescent compound.
Overall, although crop signaling technique appears promising in plant identification, machine learning is still the most effective technology for rapid detection of weeds and crops.

Conclusions
In this review, recent applications of non-imaging spectroscopy, RGB imaging, and hyperspectral imaging have been highlighted as potential techniques for automatic discrimination between crops and weeds. These sensing technologies used in smart agriculture have made substantial progress by generating large amounts of data from the field. Non-imaging spectroscopy provides extensive VIS/IR information that correlates well with plant species, and it has already been widely used to distinguish grain crops (such as maize, wheat, and soybean), vegetable crops (such as tomato, lettuce, and radish), and fiber crops (such as cotton) from many kinds of weeds. Machine learning algorithms can analyze high-dimensional data with different features from the training data, thereby achieving accurate crop discrimination. For RGB imaging, machine learning including CNN and k-FLBPCM is considered a valuable approach for classifying plant images with high accuracy. As a more promising technique, hyperspectral imaging shows great capability in identifying crop and weed plants in real time. With respect to the goal of robotic weed control, non-imaging spectroscopy, RGB imaging, and hyperspectral imaging have all been added to the knowledge base of automatic plant care in crops. The discussion of the different sensing methods focuses on hyperspectral imaging, and it emphasizes that feature variable selection is important for developing simpler multispectral imaging systems based on advanced machine learning methods. Given the recent boom in machine learning and sensor development, it is anticipated that multispectral imaging will become the prevailing method for real-time differentiation of weeds and crops.