A Computer-Vision-Based Approach for Nitrogen Content Estimation in Plant Leaves

Nitrogen is an essential nutrient required for optimum crop growth and yield. If an adequate amount of nitrogen is not applied to crops, their yield suffers. Estimating the nitrogen level in crops is therefore important for deciding how much nitrogen fertilizer to apply. The amount of nitrogen in crops is measured through different techniques, including visual inspection of leaf color and texture and laboratory analysis of plant leaves. Laboratory-based techniques are more accurate than visual inspection, but they are costly and time-consuming, and they require skilled laboratory staff and precise equipment. Therefore, computer-based systems are required to estimate the amount of nitrogen in field crops. In this paper, a computer-vision-based solution is introduced to solve this problem and to help farmers by providing an easier, cheaper, and faster approach for measuring nitrogen deficiency in crops. The system takes an image of the crop leaf as input and estimates the amount of nitrogen in it. The image is captured by placing the leaf on a specially designed slate that contains the reference green and yellow colors for that crop. The proposed algorithm automatically extracts the leaf from the image and computes its color similarity with the reference colors. In particular, we define a green color value (GCV) index from this analysis, which serves as a nitrogen indicator. We also present an evaluation of different color distance models to find a model that accurately captures the color differences. The performance of the proposed system is evaluated on a Spinacia oleracea dataset. The results of the proposed system and the laboratory analysis are highly correlated, which shows the effectiveness of the proposed system.


Introduction
After water, nitrogen (N) is one of the most important macronutrients in plants [1], as it is associated with proteins that are directly involved in plant metabolic processes. Therefore, to get higher crop production and to improve food quality, an adequate supply of nitrogen is required by plants. It plays a fundamental role in enhancing the yield and productivity [2]. Nitrogen deficiency is characterized by poor plant growth and pale yellow leaves due to insufficient chlorophyll content and some environmental stresses [3].
Nitrogen is a key component of the chlorophyll that intensifies photosynthesis in crops [4-7], and plant yield is affected by irregularity in nitrogen content [8-10]. The amount of nitrogen required varies from crop to crop [11-14]. Leaf color is an indicator of the nutrient status of many plants, especially the chlorophyll content in the leaf [15]. The color and texture of the leaf determine the nitrogen status in the crop as the nitrogen deficiency

Review of Existing Methods
Nitrogen measurement in plants is carried out by four manual methods: chemical test, normalized difference vegetation index, leaf color chart, and SPAD meter [32]. The Kjeldahl digestion method is a popular chemical-analysis-based method for nitrogen nutrition estimation. The method was developed by Johan Kjeldahl [33], and it works in three steps. First, sample leaf powder is mixed in a Kjeldahl flask with concentrated acid in a specific ratio. To evolve the CO2, the mixture is heated at high flame, and the end product is an ammonium sulfate solution. Second, to get ammonia from the ammonium ion (present in ammonium sulfate), sodium hydroxide is added to the mixture. Ammonia gas is evolved by heating the solution and is trapped in a solution of boric acid and standard acid in the flask. In the final step, titration is used to estimate the ammonia in the sample; the ammonia concentration is proportional to the nitrogen content. The reagents used in this process are analyzed in the study [34]. This method gives quite accurate results, but it is very tedious and time-consuming [32].
Soil-Plant Analysis Development (SPAD) is one of the optimal methods for the calculation of plant chlorophyll content. The SPAD meter does not give absolute values but a unitless, relative chlorophyll value, which can be converted into nitrogen quantity [35]. However, SPAD gives more accurate results for small samples, especially in ornamental plants, as it is sensitive to chemical changes in leaves [36].
The leaf color chart is another means of estimating the N-nutrition status of plants. This chart has different shades of green, which are matched to the leaf color to get an estimate of the nitrogen content in the leaf. Every crop has its own leaf color chart with shades of green chosen according to its possible leaf colors [37]. This method is simple but not very accurate [21].
All the above-mentioned methods are manual and very time-consuming, and some of them also require skilled staff. Recently, computer-aided design (CAD) for nitrogen estimation has attracted significant research efforts. In particular, image processing, optical, and multi-spectral based nitrogen estimation techniques have been proposed for different crops and vegetables, e.g., [4,13,20-22,24,28,30]. Normalized difference vegetation index (NDVI) is a remote sensing technique for vegetation, which has been used for quite some time now [38]. To determine the live green vegetation of the crop, NDVI uses the visible and near-infrared bands of the spectrum. This method, however, is expensive and is not suitable for nitrogen estimation in a small area of crops [21]. Numerous methods, e.g., [29-31], exploit the red-green-blue (RGB) color model for estimating chlorophyll and nitrogen content in plants and crops.
The chlorophyll meter was the very first tool introduced for the measurement of the chlorophyll content in plants. Leaf chlorophyll absorbs red light but does not absorb infrared. Therefore, the chlorophyll meter emits red and infrared light and uses the difference in absorption between the two. Based on this difference, the meter estimates the chlorophyll content in the leaf [39]. The studies in [11,40] use reflectance and absorbance in tropical tree areas to estimate nitrogen in plants. Kawashima et al. [23] proposed a system for measuring chlorophyll content in crops. They use a digital camera to capture leaf images and calculate the ratios of different combinations of red (R), green (G), and blue (B) colors. The research concluded that the ratio of the red and blue channels gives the best correlation with the chlorophyll content. The nitrogen estimation algorithms presented in [20,41,42] also use color image analysis. The system in [43] uses three-wavelength diffuse reflectance for nitrogen estimation in plants. Similarly, the system proposed in [44] used the CIE-LAB color system to estimate nitrogen content in leaves. The system proposed in [4] presents a manually operated trolley with which images of crop leaves were captured to predict the nitrogen amount in a specific field. The system consists of a camera to capture an image, four lights for proper lighting, and a laptop for processing. The system was semi-automated and required an operator to handle and operate it properly. The image-processing-based system proposed in [45] estimates the leaf chlorophyll content at a canopy scale. It presented a triangular greenness index (TGI) and other spectral indices as low-cost indicators of fertilizer requirement in various crops.
The study presented in [46] analyzed various image processing techniques and concluded that color-model-based techniques give the most accurate results for nitrogen measurement in plants. The system in [47] uses an artificial neural network to predict the leaf color chart (LCC) panel of the leaf. The system classified the LCC panel, together with SPAD meter readings, with good accuracy for rice leaves. The system proposed in [48] also uses a digital camera to measure nitrogen in cotton plants. The method presented in [49] uses three color models for estimating nitrogen in rice crops. After image segmentation, 13 color indices were calculated from three color models (i.e., RGB, Lab, and HSV). The values were compared with plant nitrogen status, and a significant correlation was found. The algorithm proposed in [50] uses an optical method for the measurement of chlorophyll content in crops. The method in [1] also uses RGB color image analysis for nitrogen estimation in crops. The study presented in [51] used nursery seedlings of five tropical trees; 10 different RGB indices were calculated and correlated with SPAD results.

Methodology
The amount of nitrogen nutrient in plants is estimated by analyzing the color of the leaves. In most crops, the leaf color shows the level of nitrogen in the plants and crops; the crop is healthy if the color is lush green but gets pale-yellow with a deficiency of nitrogen in the absence of biotic and abiotic stresses. Like many other image-processing-based systems, the proposed method also exploits this basic phenomenon to estimate the n-nutrient in crops. The proposed method has three steps. First, the image of the selected leaf is captured using any ordinary camera and a specially designed leaf board, also known as slate. The leaf board is white in color with two reference green and yellow-colored circles. Second, the leaf and the reference color circles are automatically segmented from the image using different image processing techniques. Third, the average color of the leaf is computed and compared with the reference colors to estimate the green color value (GCV) index, which is found to be highly correlated with the chlorophyll content in the plants. Moreover, we also evaluate many existing color based methods for n-nutrient estimation and compare their performance with the proposed method using various statistical tools.

Image Acquisition
The first step in any computer-based algorithm for n-nutrient estimation is leaf image acquisition. This is a critical step, as the quality of the image can affect the accuracy of the later processing steps. The lighting conditions can affect the leaf color in the image. For example, if the image is taken in bright light, the RGB values are higher than those of the original leaf color and the image becomes brighter. Conversely, when the image is taken in low light, the RGB values are lower than those of the original colors, so the image becomes darker than the original. Therefore, it is important to control the illumination and other factors that can impact the color of the leaf and hence the accuracy of the method. Some existing systems use a digital camera, e.g., [25,28,31,52], others use a smartphone camera [29,53], and still others use scanners [27,30,54] or hyperspectral imaging systems [55] for image acquisition. Many existing systems capture images in a specially controlled environment, e.g., [4,26,28,29].
In order to overcome the above-mentioned problems, the proposed system obtains the reference colors from the same leaf image. To ensure accurate results in both dark and bright environments, a special slate was made having a green and a yellow color circle on it. These circles are used as reference colors in the n-nutrient estimation. The designed slate and a sample leaf image are shown in Figure 1. Leaves were placed on the above-mentioned slates and images were captured by using an ordinary smartphone camera.

Region of Interest Detection
To compare the leaf color with the reference green and yellow colors, the leaf and the two reference color circles must be identified and extracted from the image. For object detection, the bounding box method is used, which encloses each object in one bounding box. The process starts by converting the image into grayscale; Otsu's method is then applied to obtain a global threshold for converting the grayscale image to a binary image. Finally, all the bounding boxes in that image are detected. Let $I$ be a leaf image of size $M \times N$ and $BW$ be its binary image. Each pixel $BW(i, j)$ of the binary image is picked, and its 8-connected pixels are checked by a voting technique. The connected pixels with a value of 1 correspond to a region and are considered as one bounding box. Let $b_1, b_2, b_3, \dots, b_n$ be the $n$ bounding boxes detected in image $I$. For each bounding box, four attributes are computed, namely $x_b$, $y_b$, $w_b$, and $h_b$, representing the starting x value, starting y value, width, and height of the box, respectively. Some of the detected bounding boxes are too small, and a few are very large, covering the whole page. Figure 2a shows the bounding boxes detected in the image shown in Figure 1b, highlighted in red.
From all the detected bounding boxes, we are concerned only with the boxes containing the reference circles, which may appear oval or elliptical due to changes in camera orientation while capturing the image. We devise a method using the ratio between the width and the height of the boxes to efficiently identify the objects of interest. Let $r_w$ be the ratio of box width to height and let $r_h$ be the ratio of box height to width:
$r_w = w_b / h_b$ (1)
$r_h = h_b / w_b$ (2)
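The thresholding and box-detection steps described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, a breadth-first flood fill is swapped in for the "voting technique" mentioned in the text, and it is assumed that the objects are darker than the white slate, so pixels at or below the Otsu threshold are treated as foreground.

```python
from collections import deque


def otsu_threshold(gray):
    """Return the Otsu threshold for a 2-D list of 8-bit gray values (0-255)."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2  # unnormalized between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(gray, threshold):
    """Mark pixels with value <= threshold as 1 (ASSUMES dark objects on a light slate)."""
    return [[1 if v <= threshold else 0 for v in row] for row in gray]


def bounding_boxes(bw):
    """Return (x, y, w, h) for every 8-connected region of 1-pixels, in scan order."""
    rows, cols = len(bw), len(bw[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for y0 in range(rows):
        for x0 in range(cols):
            if bw[y0][x0] != 1 or seen[y0][x0]:
                continue
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for dy in (-1, 0, 1):          # visit all 8 neighbours
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < cols and 0 <= ny < rows
                                and bw[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes
```

Each returned tuple corresponds to the four box attributes (starting x, starting y, width, height) used in the ratio tests that follow.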
The boxes containing the reference color circles appear (nearly) square, and they can be detected using the ratios $r_w$ and $r_h$. Since the boxes are almost square, these ratios should be close to 1. From the experiments, we found that the following limits serve best in detecting the reference color circles.
It is observed that sometimes the whole page is detected as a square or elliptical box. Obviously, it cannot be the circle we are interested in, and such cases are rejected by confirming that the ratio of the box height to the image height is not large; it should be less than 0.6. That is, a valid reference color box should satisfy the following relation:
$h_b / N < 0.6$ (4)
where $h_b$ is the box height and $N$ is the image height. With the help of (3) and (4), the two boxes containing the green and the yellow reference circles are detected and extracted from the original image; that is, a bounding box $b_i$ contains a reference circle if it satisfies both conditions. The remaining image contains only the leaf, which is extracted by background removal using the Gaussian mixture model, as explained in Section 3.1.3.
In Figure 2, the results of the proposed object detection strategy are presented. Figure 2a shows the objects detected in the image (Figure 1b) using the bounding box algorithm. Figure 3 shows the detected regions of interest containing the reference color circles (Figure 3a,b). The rest of the image, excluding the circles' regions, is shown in Figure 3c. The regions containing the green and yellow color circles are referred to as $I_g$ and $I_y$, respectively, and the remaining image containing the leaf is denoted as $I_l$ in the rest of the text.
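The two tests described above (near-square aspect ratio and limited height relative to the image) can be sketched as a single predicate. The 0.6 height limit comes from the text; the aspect-ratio limits `lo` and `hi` are hypothetical placeholders, since the experimentally chosen limits are not reproduced here, and the function name is illustrative only.

```python
def is_reference_box(box, image_height, lo=0.8, hi=1.25, max_height_ratio=0.6):
    """Return True if a bounding box plausibly contains a reference color circle.

    box is (x, y, w, h). lo and hi are ASSUMED aspect-ratio limits (hypothetical
    values); max_height_ratio = 0.6 is the limit stated in the text.
    """
    _, _, w, h = box
    if w <= 0 or h <= 0:
        return False
    r_w = w / h  # ratio of box width to height
    r_h = h / w  # ratio of box height to width
    near_square = lo <= r_w <= hi and lo <= r_h <= hi
    not_whole_page = h / image_height < max_height_ratio
    return near_square and not_whole_page
```

With these placeholder limits, a 50 x 48 box in a 400-pixel-high image passes, while a 390 x 390 box (the whole page) is rejected by the height test.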

Background Removal
In order to accurately calculate the color, the background of the circles and the leaf must be removed. This can be achieved by color-based thresholding techniques, but from experiments, we found that such techniques are not very accurate. Therefore, to ensure that only green and yellow colors are used in the analysis for the reference colors, the backgrounds of both images ($I_g$, $I_y$) were removed by applying a rule based on the distance of each pixel from the center of the region. For each region, $I_g$ and $I_y$, the center $(x_c, y_c)$ is estimated as $x_c = w_{I_g}/2$ and $y_c = h_{I_g}/2$, where $w_{I_g}$ and $h_{I_g}$ represent the width and the height of image $I_g$, respectively. Since the circle is inscribed in the square box, its radius is half the width or height of the box (i.e., $r_{I_g} = w_{I_g}/2$). The pixels whose distance from the center is greater than the radius $r_{I_g}$ lie outside the circle and are removed. For each pixel location $(i, j)$, its value is updated according to the following rule:
$I_g(i, j) = \begin{cases} I_g(i, j), & d(i, j) \le r_{I_g} \\ 0, & \text{otherwise} \end{cases}$
where $d(i, j)$ is the distance of the pixel $(i, j)$ from the center of the box $(x_c, y_c)$, calculated as
$d(i, j) = \sqrt{(i - x_c)^2 + (j - y_c)^2}$
The background of the other circle ($I_y$) is removed analogously. However, since the leaf image may contain complex structures, e.g., veins and different colors, it is difficult to obtain an accurate segmentation using simple color-based thresholding. The background of the leaf image ($I_l$) is therefore removed by using a Gaussian mixture model (GMM). It divides the image into two clusters, and pixels are characterized in RGB space by their intensity [56,57]. Each pixel in the image is modeled by a mixture of Gaussian distributions. The likelihood of a pixel $X_t$ in the image at time $t$ is calculated as
$P(X_t) = \sum_{i=1}^{K} \omega_{i,t} \, \eta(X_t, \mu_{i,t}, \Sigma_{i,t})$
where $K$ is the number of clusters (here assumed to be two, i.e., foreground and background), and $\omega_{i,t}$ is the weight associated with the $i$-th Gaussian $\eta$ at time $t$, with mean $\mu_{i,t}$ and covariance $\Sigma_{i,t}$:
$\eta(X_t, \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (X_t - \mu)^T \Sigma^{-1} (X_t - \mu)\right)$
Since the background is dominant in the images, the background pixels are expected to have more weight and less variance. Therefore, the first $b$ Gaussian distributions whose accumulated weight exceeds the designated threshold $T$ are taken to model the background; pixels matching them are classified as background pixels ($B$) by the following rule, and the rest are foreground:
$B = \arg\min_b \left( \sum_{i=1}^{b} \omega_{i,t} > T \right)$
If a pixel matches one of the $K$ Gaussians, the values of $\omega$, $\mu$, and $\sigma$ are updated; otherwise, only the value of $\omega$ is updated. Using the above-mentioned model, the leaf pixels were separated from the background, turning the background pixels black and keeping the leaf pixels unchanged.
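To make the mixture model more concrete, the sketch below evaluates the mixture likelihood of an RGB pixel and selects the background components by accumulated weight. It is only an illustration under stated assumptions: the parameters are given rather than fitted, each Gaussian is isotropic (one standard deviation per component), the components are ordered by weight divided by standard deviation as in common background-subtraction formulations, and all function names and numbers are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Isotropic Gaussian density of an RGB vector x (same sigma for every channel)."""
    n = len(x)
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    norm = (2.0 * math.pi * sigma ** 2) ** (n / 2.0)
    return math.exp(-0.5 * sq / sigma ** 2) / norm


def mixture_likelihood(x, components):
    """Sum of weight * density over all (weight, mean, sigma) components."""
    return sum(w * gaussian_pdf(x, mu, s) for w, mu, s in components)


def background_components(components, threshold):
    """Indices of the first components whose accumulated weight exceeds threshold.

    ASSUMPTION: components are ranked by weight / sigma (high weight, low spread first).
    """
    order = sorted(range(len(components)),
                   key=lambda i: components[i][0] / components[i][2],
                   reverse=True)
    chosen, acc = [], 0.0
    for i in order:
        chosen.append(i)
        acc += components[i][0]
        if acc > threshold:
            break
    return chosen


def classify(x, components, threshold):
    """Label a pixel by whether its best-matching component is a background one."""
    background = set(background_components(components, threshold))
    best = max(range(len(components)),
               key=lambda i: components[i][0]
               * gaussian_pdf(x, components[i][1], components[i][2]))
    return "background" if best in background else "foreground"
```

For example, with a dominant, tight component around the white slate color and a smaller, wider component around a leaf green, slate pixels are labeled as background and leaf pixels as foreground.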
It is observed that even after applying the GMM, some background pixels may still appear as foreground. To resolve this issue, the image is converted into HSI space, with $H_{I_l}$, $S_{I_l}$, and $I_{I_l}$ representing the hue, saturation, and intensity, respectively. We observed that the leaf pixels always have hue and saturation components greater than 0.5 and 0.3, respectively. Therefore, any non-leaf pixel incorrectly marked as leaf is removed by enforcing these two conditions. That is, a pixel $(i, j)$ is kept as a leaf pixel only if $H_{I_l}(i, j) > 0.5$ and $S_{I_l}(i, j) > 0.3$. Figure 4 shows the results of the proposed method applied to the images shown in Figure 3.
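Returning to the reference regions, the radius-based rule described earlier in this subsection can be sketched as follows. This is a minimal illustration: the function name is hypothetical, removed pixels are assumed to be set to black, and distances are measured from pixel centers (an assumption made here so that a small region is masked symmetrically).

```python
import math


def mask_outside_circle(region, fill=(0, 0, 0)):
    """Return a copy of a square RGB region with pixels outside the inscribed circle set to fill.

    region is a list of rows, each a list of (R, G, B) tuples.
    """
    height, width = len(region), len(region[0])
    x_c, y_c = width / 2.0, height / 2.0  # center of the box
    radius = width / 2.0                  # circle inscribed in the box
    masked = []
    for j, row in enumerate(region):
        new_row = []
        for i, pixel in enumerate(row):
            # ASSUMPTION: distance is taken from the pixel center (i + 0.5, j + 0.5)
            d = math.hypot(i + 0.5 - x_c, j + 0.5 - y_c)
            new_row.append(pixel if d <= radius else fill)
        masked.append(new_row)
    return masked
```

On a 4 x 4 region, only the four corner pixels fall outside the inscribed circle and are replaced.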

Nitrogen Estimation
After removing the background, we have both the reference circles and the leaf. The mean color value of each of the three objects is computed for each color channel (red, green, and blue), denoted as $\mu_r$, $\mu_g$, and $\mu_b$, respectively. For the green reference circle, this gives $\mu_{I_g} = (\mu_r, \mu_g, \mu_b)$, where each component is the average of that channel over the foreground pixels of $I_g$.
The mean color values of the reference yellow circle ($\mu_{I_y}$) and the leaf ($\mu_{I_l}$) are calculated analogously. The green color value (GCV) index of the leaf is then computed from the distances of the leaf mean color to the reference green and yellow colors. Let $d_1$ be the distance of the leaf from the green reference color and $d_2$ be the distance of the leaf from the yellow reference color, computed using any color distance model. The GCV is then computed as the percentage of their ratio.
The value of GCV represents the n-nutrient percentage present in the leaf. There exist numerous ways to compute the color differences $d_1$ and $d_2$, which are briefly explained in the following section.
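The computation described above can be sketched as follows. The exact GCV expression is not reproduced in this text, so the normalization used here, $100 \cdot d_2 / (d_1 + d_2)$, is only one plausible reading of "the percentage of their ratio" and should be treated as an assumption; it is chosen so that a leaf matching the green reference scores 100 and a leaf matching the yellow reference scores 0. The function names are hypothetical.

```python
import math


def mean_color(pixels):
    """Mean (R, G, B) of a non-empty list of foreground pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def gcv(leaf, green_ref, yellow_ref, distance=math.dist):
    """Green color value index as a percentage.

    ASSUMPTION: GCV = 100 * d2 / (d1 + d2); the source does not give the formula here.
    distance may be any color distance model (Euclidean by default).
    """
    d1 = distance(leaf, green_ref)   # leaf to green reference
    d2 = distance(leaf, yellow_ref)  # leaf to yellow reference
    if d1 + d2 == 0:
        raise ValueError("reference colors coincide with the leaf color")
    return 100.0 * d2 / (d1 + d2)
```

Any of the color distance models discussed in the next section can be passed as `distance`, provided the colors are first converted to the color space that model expects.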

Model Development
Numerous color distance models are available that can be used to compute the leaf color difference from the reference colors. We evaluated these models to find the most suitable model for the problem at hand. These models are briefly introduced in the following sections.

Euclidean Distance
The simplest method of finding the distance between two colors within the RGB color space is the Euclidean distance. It computes the distance between two colors $C_1(R_1, G_1, B_1)$ and $C_2(R_2, G_2, B_2)$ as
$d = \sqrt{(\Delta R)^2 + (\Delta G)^2 + (\Delta B)^2}$
where $\Delta R = R_2 - R_1$, $\Delta G = G_2 - G_1$, and $\Delta B = B_2 - B_1$.

Color Approximation Distance
The perception of brightness in the human eye is non-linear. Experiments suggest that the curve for this non-linearity is not the same for each color [58,59]. Therefore, there have been many attempts to weight the RGB components to better fit human perception. The color approximation distance between two colors $C_1(R_1, G_1, B_1)$ and $C_2(R_2, G_2, B_2)$ is calculated as
$d = \sqrt{\left(2 + \frac{r}{256}\right) (\Delta R)^2 + 4 (\Delta G)^2 + \left(2 + \frac{255 - r}{256}\right) (\Delta B)^2}$
where $r = \frac{R_1 + R_2}{2}$.
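As an illustration, both RGB-space distances can be written in a few lines. The weighted form below is the commonly used low-cost "redmean" approximation, which is assumed here to be the color approximation distance meant in the text; the function names are hypothetical, and channel values are taken to be in the 0-255 range.

```python
import math


def euclidean_rgb(c1, c2):
    """Plain Euclidean distance between two (R, G, B) colors."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(c1, c2)))


def approx_rgb(c1, c2):
    """Weighted RGB distance using the mean red level r = (R1 + R2) / 2."""
    r_mean = (c1[0] + c2[0]) / 2.0
    d_r = c2[0] - c1[0]
    d_g = c2[1] - c1[1]
    d_b = c2[2] - c1[2]
    return math.sqrt((2.0 + r_mean / 256.0) * d_r ** 2
                     + 4.0 * d_g ** 2
                     + (2.0 + (255.0 - r_mean) / 256.0) * d_b ** 2)
```

Note that the weighted distance gives a pure green difference twice the weight of the Euclidean distance, reflecting the eye's higher sensitivity to green.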

CIEXYZ
The RGB color space does not represent visible colors in a perceptually clear way. Since the human eye has three types of color sensors that respond to different wavelengths, a full plot of visible color is three-dimensional. The CIE XYZ [60] color space includes all the colors that are visible to the human eye. Therefore, in order to obtain more precise results, the color values were first converted into the CIE XYZ color space and then their distance was calculated. The following transformation is used to convert mean values into the CIE XYZ color space.
The normalized tristimulus values $x$, $y$, and $z$ were calculated from $X$, $Y$, and $Z$ as $x = X/(X+Y+Z)$, $y = Y/(X+Y+Z)$, and $z = Z/(X+Y+Z)$, and the difference between two colors in xy-chromaticity space is computed by the simple distance formula:
$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$
The positions of the reference colors and the leaf are shown in Figure 5. The distances from the leaf to the two reference colors are shown as $d_1$ and $d_2$.
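A minimal sketch of the chromaticity-based distance is given below. The conversion matrix used in the paper is not reproduced in this text, so the standard linear sRGB (D65) matrix is assumed here, and gamma linearization is omitted for brevity; the function names are hypothetical.

```python
import math

# ASSUMPTION: linear sRGB (D65) to CIE XYZ matrix; the paper's own matrix may differ.
RGB_TO_XYZ = (
    (0.4124, 0.3576, 0.1805),
    (0.2126, 0.7152, 0.0722),
    (0.0193, 0.1192, 0.9505),
)


def rgb_to_xyz(rgb):
    """Convert an (R, G, B) color in 0-255 to XYZ, treating the values as linear."""
    r, g, b = (v / 255.0 for v in rgb)
    return tuple(m[0] * r + m[1] * g + m[2] * b for m in RGB_TO_XYZ)


def chromaticity(xyz):
    """Return the normalized (x, y) chromaticity coordinates."""
    total = sum(xyz)
    if total == 0:
        raise ValueError("chromaticity is undefined for black")
    return xyz[0] / total, xyz[1] / total


def xy_distance(rgb1, rgb2):
    """Euclidean distance between two colors in xy-chromaticity space."""
    x1, y1 = chromaticity(rgb_to_xyz(rgb1))
    x2, y2 = chromaticity(rgb_to_xyz(rgb2))
    return math.hypot(x2 - x1, y2 - y1)
```

Because the chromaticity coordinates are normalized, two colors that differ only in brightness (for example, mid-gray and white) have a distance of essentially zero under this model.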

CIE76
The CIELAB (L*a*b*) color space, introduced in 1976, was designed to be a uniform color space [61]. In any uniform color space, the color difference is easily calculated by the Euclidean formula and expressed as a straight line. The difference between two colors $C_1(L_1, a_1, b_1)$ and $C_2(L_2, a_2, b_2)$ in the CIELAB color space is calculated by the following formula:
$d = \sqrt{(\Delta L^*)^2 + (\Delta a^*)^2 + (\Delta b^*)^2}$ (19)
where $\Delta L^* = L_1 - L_2$, $\Delta a^* = a_1 - a_2$, and $\Delta b^* = b_1 - b_2$.
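For illustration, the sketch below converts an 8-bit color to CIELAB and evaluates the CIE76 difference. It assumes the input is sRGB with a D65 white point, which the text does not state; the function names are hypothetical.

```python
import math


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB color to CIELAB (ASSUMES sRGB input and D65 white)."""
    def linearize(v):
        v = v / 255.0
        return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(v) for v in rgb)
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    def f(t):
        # cube root above the CIE threshold, linear segment below it
        return t ** (1.0 / 3.0) if t > 216.0 / 24389.0 else (24389.0 / 27.0 * t + 16.0) / 116.0

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def cie76(lab1, lab2):
    """CIE76 color difference: Euclidean distance in CIELAB."""
    return math.dist(lab1, lab2)
```

The same `srgb_to_lab` conversion can feed the later CIELAB-based distance models, which differ only in how the lightness, chroma, and hue differences are weighted.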

CIE94
Different color distance models have been explored to address the non-uniformities while retaining the CIELAB color space. These models generally agree better with the human perception of color differences. $\Delta E^*_{94}$ is defined in the LCh color space, with differences in lightness, chroma, and hue calculated from the Lab coordinates [62,63]. The difference between two colors $C_1(L_1, a_1, b_1)$ and $C_2(L_2, a_2, b_2)$ is computed as
$\Delta E^*_{94} = \sqrt{\left(\frac{\Delta L^*}{k_L S_L}\right)^2 + \left(\frac{\Delta C^*}{k_C S_C}\right)^2 + \left(\frac{\Delta H^*}{k_H S_H}\right)^2}$
Further details of the model can be found in [62].
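A compact sketch of the CIE94 difference is shown below. It assumes the graphic-arts parameter set ($k_L = k_C = k_H = 1$, $K_1 = 0.045$, $K_2 = 0.015$), which the text does not specify, and takes CIELAB triples as input; the function name is hypothetical.

```python
import math


def cie94(lab1, lab2, k_l=1.0, k1=0.045, k2=0.015):
    """CIE94 color difference between two (L, a, b) colors.

    ASSUMPTION: graphic-arts weights (k_l = 1, k1 = 0.045, k2 = 0.015), with k_C = k_H = 1.
    The first color is treated as the reference, so the measure is not symmetric.
    """
    l1, a1, b1 = lab1
    l2, a2, b2 = lab2
    c1 = math.hypot(a1, b1)
    c2 = math.hypot(a2, b2)
    d_l = l1 - l2
    d_c = c1 - c2
    # squared hue difference; clamped at 0 to absorb rounding error
    d_h2 = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - d_c ** 2, 0.0)
    s_l = 1.0
    s_c = 1.0 + k1 * c1
    s_h = 1.0 + k2 * c1
    return math.sqrt((d_l / (k_l * s_l)) ** 2 + (d_c / s_c) ** 2 + d_h2 / (s_h ** 2))
```

For a pure lightness difference the result equals the CIE76 value, while chroma and hue differences are scaled down as the chroma of the reference color grows.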

CIEDE2000
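The CIE organization decided to fix the lightness inaccuracies by introducing $\Delta E^*_{00}$. It is currently the most complicated CIE color difference algorithm, yet it is also the most accurate [64-66]. The difference between two colors is calculated in $\Delta E^*_{00}$ as follows:
$\Delta E^*_{00} = \sqrt{\left(\frac{\Delta L'}{k_L S_L}\right)^2 + \left(\frac{\Delta C'}{k_C S_C}\right)^2 + \left(\frac{\Delta H'}{k_H S_H}\right)^2 + R_T \frac{\Delta C'}{k_C S_C} \frac{\Delta H'}{k_H S_H}}$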
A comprehensive description of the model and its computation can be found in [65].
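For readers who want to experiment with this model, a self-contained sketch of the standard CIEDE2000 computation is given below. It follows the widely published formulation with unit parametric factors ($k_L = k_C = k_H = 1$), takes CIELAB triples as input, and is not taken from the authors' code; the function name is hypothetical.

```python
import math


def ciede2000(lab1, lab2, k_l=1.0, k_c=1.0, k_h=1.0):
    """CIEDE2000 color difference between two (L, a, b) colors."""
    l1, a1, b1 = lab1
    l2, a2, b2 = lab2
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2.0
    g = 0.5 * (1.0 - math.sqrt(c_bar ** 7 / (c_bar ** 7 + 25.0 ** 7)))
    a1p, a2p = (1.0 + g) * a1, (1.0 + g) * a2          # rescaled a* axis
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360.0 if c1p else 0.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360.0 if c2p else 0.0

    d_lp = l2 - l1
    d_cp = c2p - c1p
    if c1p * c2p == 0:
        d_hp_deg = 0.0
    else:
        d_hp_deg = h2p - h1p
        if d_hp_deg > 180.0:
            d_hp_deg -= 360.0
        elif d_hp_deg < -180.0:
            d_hp_deg += 360.0
    d_hp = 2.0 * math.sqrt(c1p * c2p) * math.sin(math.radians(d_hp_deg / 2.0))

    l_bar = (l1 + l2) / 2.0
    c_bar_p = (c1p + c2p) / 2.0
    if c1p * c2p == 0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180.0:
        h_bar = (h1p + h2p) / 2.0
    elif h1p + h2p < 360.0:
        h_bar = (h1p + h2p + 360.0) / 2.0
    else:
        h_bar = (h1p + h2p - 360.0) / 2.0

    t = (1.0
         - 0.17 * math.cos(math.radians(h_bar - 30.0))
         + 0.24 * math.cos(math.radians(2.0 * h_bar))
         + 0.32 * math.cos(math.radians(3.0 * h_bar + 6.0))
         - 0.20 * math.cos(math.radians(4.0 * h_bar - 63.0)))
    d_theta = 30.0 * math.exp(-(((h_bar - 275.0) / 25.0) ** 2))
    r_c = 2.0 * math.sqrt(c_bar_p ** 7 / (c_bar_p ** 7 + 25.0 ** 7))
    s_l = 1.0 + 0.015 * (l_bar - 50.0) ** 2 / math.sqrt(20.0 + (l_bar - 50.0) ** 2)
    s_c = 1.0 + 0.045 * c_bar_p
    s_h = 1.0 + 0.015 * c_bar_p * t
    r_t = -math.sin(math.radians(2.0 * d_theta)) * r_c  # rotation term for the blue region

    t_l = d_lp / (k_l * s_l)
    t_c = d_cp / (k_c * s_c)
    t_h = d_hp / (k_h * s_h)
    return math.sqrt(t_l ** 2 + t_c ** 2 + t_h ** 2 + r_t * t_c * t_h)
```

Unlike CIE94, this measure is symmetric in its two arguments, so the order of the leaf color and the reference color does not matter.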

CMC l:c
Since the human eye is sensitive to chroma, the difference between colors can be visualized more clearly in the LCh color space [66]. $\Delta E^*_{CMC}$ calculates the color difference in the LCh color space. It has two parameters, lightness ($l$) and chroma ($c$), allowing users to weight the difference based on the ratio $l:c$ that is deemed appropriate for the application. The distance between two colors $C_1(L_1, C_1, h_1)$ and $C_2(L_2, C_2, h_2)$ is found as
$\Delta E^*_{CMC} = \sqrt{\left(\frac{\Delta L}{l \, S_L}\right)^2 + \left(\frac{\Delta C}{c \, S_C}\right)^2 + \left(\frac{\Delta H}{S_H}\right)^2}$
where $S_L$, $S_C$, and $S_H$ are weighting functions computed from the lightness, chroma, and hue of the first color, as explained in [66].
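A short sketch of the CMC l:c difference follows. It uses the standard published weighting functions, takes CIELAB triples as input (chroma and hue are derived inside the function), and defaults to the 2:1 ratio; the text does not state which ratio was used, so that default is an assumption, and the function name is hypothetical.

```python
import math


def cmc(lab1, lab2, l=2.0, c=1.0):
    """CMC l:c color difference, with the first (L, a, b) color as the reference.

    ASSUMPTION: default ratio l:c = 2:1; the ratio used in the study is not stated.
    """
    l1, a1, b1 = lab1
    l2, a2, b2 = lab2
    c1 = math.hypot(a1, b1)
    c2 = math.hypot(a2, b2)
    d_l = l1 - l2
    d_c = c1 - c2
    d_h2 = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - d_c ** 2, 0.0)  # squared hue difference
    h1 = math.degrees(math.atan2(b1, a1)) % 360.0

    s_l = 0.511 if l1 < 16.0 else 0.040975 * l1 / (1.0 + 0.01765 * l1)
    s_c = 0.0638 * c1 / (1.0 + 0.0131 * c1) + 0.638
    f = math.sqrt(c1 ** 4 / (c1 ** 4 + 1900.0))
    if 164.0 <= h1 <= 345.0:
        t = 0.56 + abs(0.2 * math.cos(math.radians(h1 + 168.0)))
    else:
        t = 0.36 + abs(0.4 * math.cos(math.radians(h1 + 35.0)))
    s_h = s_c * (f * t + 1.0 - f)
    return math.sqrt((d_l / (l * s_l)) ** 2 + (d_c / (c * s_c)) ** 2 + d_h2 / (s_h ** 2))
```

Raising `l` relative to `c` makes the measure more tolerant of lightness differences, which may matter when illumination varies between images.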

Evaluation of Dataset
In this section, we introduce the dataset used to test the performance of the proposed nitrogen estimation method. In the second part of the section, we present the statistical metrics and tools used to evaluate the performance of the proposed method.

Sample Collection and Processing
To evaluate the performance of the proposed method, a set of 15 spinach leaves was collected from the field. The plants were not grown under controlled conditions. The plants were chosen so that the dataset contains samples with different n-nutrient contents. One leaf from each selected plant was taken for the experiment. The leaves were placed one by one on the designed slate, and images were captured using a simple mobile phone camera. To test the robustness of the system, the leaf was placed in different locations on the slate, covering all four positions of the circles, i.e., circles on the left, right, top, and bottom of the leaf. It may be noted that the leaves have different orientations and shapes, making the dataset challenging for accurate leaf and reference circle detection. The amount of nitrogen in each leaf was determined by the Kjeldahl method (discussed in Section 2), which serves as the ground truth for the test dataset.

Performance Evaluation Parameters
We evaluated the performance of the proposed algorithm and compared the results with various existing color-based n-nutrient estimation algorithms. For each test image, the leaf n-nutrient in terms of GCV is computed. For the test leaves in the dataset, the nitrogen values were determined in the laboratory by the Kjeldahl method and are used as ground truth in this analysis. The objective scores computed by the proposed method were compared with the corresponding ground truth using different correlation measures to assess its accuracy. Before computing the performance parameters, we used the logistic function outlined in [67] for a non-linear mapping between the nitrogen values and the results of the proposed system and the other compared methods.
Different correlations between the objective scores and the ground truths are computed to evaluate the performance of the proposed method. In addition to the conventionally used coefficient of determination, also known as R-squared (R 2 ), we used various other statistical metrics in performance evaluation. These include Pearson linear correlation coefficient, Spearman rank correlation coefficient, Kendall rank correlation coefficient, and root mean square error.
The coefficient of determination, also known as R-squared ($R^2$), measures the proportion of the variability of one variable that is explained by its relationship to another variable. Its value varies between 0 and 1, and a higher value indicates a better fit for the observations:
$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$
where $\hat{y}$ are the fitted (predicted) values against the actual values $y$, and $\bar{y}$ is the mean of the $y$ values.
The Pearson linear correlation coefficient (PLCC) is the most common measure of correlation. It shows the linear relationship between two sets of data. It is a normalized measure of covariance, such that it gives results between −1 and 1. If the result is 0, there is no linear correlation between the data. A positive result shows a positive correlation, and a negative result shows a negative correlation between the data points:
$PLCC = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2} \sqrt{\sum_i (y_i - \bar{y})^2}}$
where $\bar{x}$ and $\bar{y}$ are the mean values of $x$ and $y$.
Spearman's rank correlation coefficient (SROCC) is a non-parametric version of PLCC that measures the direction and strength of the association between the ranked values. It assesses how well the relationship between two variables can be described using a monotonic function. If the ranks of the two sets of values are similar, the SROCC is high; otherwise, its value is low. We therefore correlate the proposed values with the ground truth to measure how well they agree under a monotonic mapping:
$SROCC = 1 - \frac{6 \sum_i d_i^2}{n (n^2 - 1)}$
where $d_i$ is the difference between the ranks of the $i$-th pair of values and $n$ is the number of observations.
The Kendall rank correlation coefficient (KROCC) is also a non-parametric measure, used to determine the strength of dependence between two variables on an ordinal scale. It measures the similarity of the orderings of the data when ranked by each of the quantities:
$KROCC = \frac{n_c - n_d}{\frac{1}{2} n (n - 1)}$
where $n_c$ is the number of concordant pairs and $n_d$ is the number of discordant pairs.
Root mean square error (RMSE) is a commonly used measure of the difference between the predicted and the actual values. It is the standard deviation of the residuals (prediction errors):
$RMSE = \sqrt{\frac{1}{n} \sum_i (y_i - \hat{y}_i)^2}$
Lower values mean there is less difference between the two sets of values; when the predictions match the actual values exactly, the RMSE is 0. Since the errors are squared before averaging, the RMSE gives relatively high weight to large errors. Therefore, it is most useful when large errors are particularly undesirable.
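The five measures described above can be sketched in plain Python as follows. This is a minimal illustration using the standard textbook definitions; the function names are hypothetical, tied values receive average ranks in the Spearman computation, and the Kendall variant shown is the simple form that divides by the total number of pairs.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def r_squared(actual, predicted):
    """Coefficient of determination of the predictions against the actual values."""
    y_bar = _mean(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


def plcc(x, y):
    """Pearson linear correlation coefficient."""
    x_bar, y_bar = _mean(x), _mean(y)
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var_x = sum((a - x_bar) ** 2 for a in x)
    var_y = sum((b - y_bar) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def _ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def srocc(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    return plcc(_ranks(x), _ranks(y))


def krocc(x, y):
    """Kendall rank correlation: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2.0)


def rmse(actual, predicted):
    """Root mean square error between the actual and predicted values."""
    return math.sqrt(_mean([(y - p) ** 2 for y, p in zip(actual, predicted)]))
```

With only 15 samples, as in the dataset used here, the rank-based measures change in fairly coarse steps, so small differences between methods in SROCC or KROCC should be interpreted with some caution.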

Results and Discussion
In this section, we report the experiments performed to evaluate the performance of the proposed method and also compare it with the performance of existing similar methods introduced to estimate the nitrogen in plants. Moreover, numerous experiments were also performed to test the performance of different color distance models to find the best-suited model for the nitrogen estimation.
In the first experiment, we evaluate the performance of the proposed system with the different color distance models introduced in Section 3.2. The proposed method with each color distance model was executed on the whole test dataset, and the nitrogen percentage was computed. The performance parameters were computed from the obtained scores and the ground truth values. The results of these experiments are presented in Table 1. The RGB color-space-based distance models showed an $R^2$ of more than 65%, performing better than the CIEXYZ and CMC l:c models, which achieved an $R^2$ of less than 12%. The performance of the CIE76 and CIE94 models is similar to that of the RGB color-space-based distance models, with $R^2$ values of 66% and 63%, respectively. The results show that the CIEDE2000 model performs the best, with an $R^2$ of more than 91%. The results in terms of PLCC are similar to those for $R^2$. For example, the RGB color-space-based distance models show a PLCC of more than 80%, performing better than the CIEXYZ- and CMC l:c-based models, which achieved a PLCC of less than 43%. The best results were obtained with the CIEDE2000 distance, with a PLCC of more than 95%. A similar trend can be observed with the SROCC, KROCC, and RMSE measures. Therefore, in the proposed algorithm, we choose the CIEDE2000 model for measuring the difference between the leaf and the reference colors.
Scatter plots are typically used to show the agreement between two variables; they use Cartesian coordinates to display the values of the two variables for a set of data. We use this plot to show how well the nitrogen content in the leaves correlates with the values extracted from the pictures. Figure 6 shows the scatter plot, where the x-axis represents the values (%) estimated by the proposed system and the y-axis represents the ground truth nitrogen percentage. The plot shows that a straight line fits the data well, indicating a strong positive correlation between the estimated and the ground truth values. The figure also shows the variation in the selected dataset, as the nitrogen estimates vary from approximately 0.17% to 0.80%.
To further investigate the performance of the proposed method, its results are compared with different RGB color-space-based n-nutrient estimation metrics. These metrics have been proposed in different studies, e.g., [24-31,51]. These algorithms compute different ratios of the red (R), green (G), and blue (B) components of the leaf color. For ease of reference, these are labeled as M1, M2, ..., M17 in this study and are briefly described in Table 2.
Table 2. List of leaf color-analysis-based nitrogen estimation metrics used in performance comparison.

The research presented in [23] evaluated different RGB color-space-based methods and reported that metric M11 performs the best. The metric M5 was proposed as optimal in [24,48]. In [51], the metrics M16 and M17 achieved good correlation with the nitrogen value in grape leaves. Moreover, the study in [68] proposed that the red color component (R) is the most accurate predictor of nitrogen. All the compared metrics, M1 to M17, were computed on the whole dataset, and the performance parameters were calculated. The results are presented in Table 3. They show that the proposed method outperforms all the competing methods, achieving an R² of 0.9198, a PLCC of 0.9590, an SROCC of 0.7643, and a KROCC of 0.6190, with the minimum root mean square error (RMSE) of 0.08.

Table 3. Performance of the proposed and the compared methods on the whole test dataset. The best results are marked in bold and the second-best in italic.

We also compare the performance of the proposed method with numerous existing vision-based plant N-nutrition estimation methods. These methods include [23,69–76], and their performance comparison is presented in Table 4. In [69], Liu et al. investigated the use of SPAD-502 (Minolta, Japan) for nitrogen estimation in spinach plants. Their results showed an accuracy (R²) of 0.89 for N-nutrient estimation in spinach using SPAD-502. The study presented by Muchecheti et al. [70] also explored the efficacy of SPAD-502 for N-nutrient estimation in spinach plants. They achieved R² values of 0.84, 0.89, and 0.91 for different datasets, and on the whole dataset, the coefficient of determination was 0.89. A vision-based method was developed in [71] for assessing the N-nutrition status of barley plants. Their method applies principal component analysis (PCA) to digital images and computes a greenness index using the RGB components of the color image. They evaluated the greenness index against the SPAD-502 readings and found correlations between 0.60 and 0.95 for different sample sets. Agarwal et al. [72] computed different features of plant leaf images and evaluated their correlation with the SPAD readings. Their dark-green color index (DGCI) achieved a high correlation of 0.80 with the SPAD readings.
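The performance parameters reported for Table 3 can be sketched in a few lines of Python. The following is a minimal illustration using only the standard library, not the evaluation code used in this study. It assumes that R² is the squared Pearson correlation of a simple linear fit (the reported values are consistent with this, since 0.9590² ≈ 0.9197), and that KROCC is Kendall's tau-a without tie correction. The sample values are hypothetical and are not taken from the dataset.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def plcc(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return plcc(ranks(x), ranks(y))


def krocc(x, y):
    """Kendall rank-order correlation (tau-a, assumes no tied values)."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)


def rmse(x, y):
    """Root mean square error between estimates and ground truth."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical nitrogen percentages, for illustration only
estimated = [0.20, 0.35, 0.41, 0.58, 0.66, 0.79]
measured = [0.17, 0.40, 0.38, 0.55, 0.70, 0.80]

r = plcc(estimated, measured)
print(f"PLCC={r:.4f} R2={r * r:.4f} SROCC={srocc(estimated, measured):.4f}")
print(f"KROCC={krocc(estimated, measured):.4f} RMSE={rmse(estimated, measured):.4f}")
```

PLCC measures linear agreement, whereas SROCC and KROCC depend only on the ordering of the samples, which is why a method can score high on the former and noticeably lower on the latter.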

The study presented in [73] by Tafolla et al. used SPAD-502 and atLeaf meters to estimate the nitrogen content in romaine lettuce. SPAD-502 achieved a correlation of 0.90 with the laboratory results, and atLeaf showed a correlation of 0.91. Kawashima et al. [23] proposed an image-processing-based algorithm to assess the chlorophyll content in leaves using a video camera. Their method proposed a function of the red and blue components of the leaf image as an N-nutrient indicator and showed a 0.81 correlation with the SPAD-502 readings. In [74], Noh et al. studied the correlation between the SPAD readings and the reflectance from individual channels of images captured with a multi-spectral sensor. They found that the ratio of the near-infrared (NIR) band to the green component (G) showed a strong correlation, 0.86, with the SPAD readings. The vision-based method presented by Borhan et al. in [75] computes different image features to estimate the chlorophyll content in potato leaves. Their method showed a 0.88 correlation with the SPAD-502 readings. In [76], Graeff et al. proposed an image-processing-based method for nitrogen level estimation in broccoli plants. They converted the image into the L*a*b* color space and proposed using the b* component as an N-nutrient estimate. Their method showed a correlation of 0.82 with the laboratory results. The results of the proposed and the compared methods presented in Table 4 show that the proposed vision-based method for N-nutrient estimation is reliable and achieves appreciable accuracy.

Table 4. Performance comparison of the proposed method with existing vision-based approaches for N-content estimation in plants.

Method              R²
Liu [69]            0.89
Muchecheti [70]     0.89
Pagola [71]         0.60-0.95
DGCI [72]           0.80
Tafolla M1 [73]     0.90
Tafolla M2 [73]     0.91
Kawashima [23]      0.81
Noh [74]            0.86
Borhan [75]         0.88
Graeff [76]         0.82
Proposed            0.92

The leaf color-based assessment of nitrogen status in plants is very accurate and reliable. Numerous handheld devices and meters have been introduced in recent times to assess the nitrogen content in the plant by analyzing the leaf color. Some popular and widely used devices include SPAD-502, CCM-200, Dualex-4, atLeaf, and GreenSeeker. We compare the performance of the proposed method with these devices; the results are presented in Table 5. Although this comparison is not strictly fair, as the datasets used in these evaluations are different, it gives an indication of the effectiveness of the proposed method. The study presented in [77] evaluated the performance of SPAD-502, CCM-200, and Dualex-4 for the measurement of leaf chlorophyll concentration in four different crops. For the corn crop, they achieved coefficients of determination of 0.90, 0.81, and 0.69, respectively. GreenSeeker and atLeaf are also popular devices for estimating the nitrogen in leaves. The studies in [78] and [73] evaluated the GreenSeeker and atLeaf devices, respectively. GreenSeeker achieved a correlation of 0.73, and atLeaf showed a strong correlation of 0.91 with the laboratory results.

We attribute the superior performance of the proposed method over the existing similar techniques to several factors. First, the proposed system uses reference green and yellow colors, representing healthy and nitrogen-deficient leaves, respectively, on a specially designed slate on which the crop leaf is placed for image acquisition. This helps cancel the illumination changes that may occur for different reasons, such as the time at which the image is captured, the weather conditions, and the quality of the camera. That is, if the leaf color changes due to lighting effects, the reference colors are equally affected in the image, thus canceling the effect. Second, the color distance formula used in the proposed method accurately captures the color differences between the leaf and the reference colors. In particular, a large number of available color distance models were evaluated, and the results favored the CIEDE2000 model for the proposed algorithm. Third, the proposed reference color circle detection and the GMM-based leaf extraction produce accurate segmentation, which is essential for true color analysis. All these factors enhance the system's performance and make it a reliable tool for estimating the N-nutrient in plants.
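The idea of scoring a leaf color relative to in-image green and yellow references can be illustrated with a short sketch. It is not the GCV formulation of the proposed system: the function greenness_index and the sample colors are hypothetical, and the simpler CIE76 distance (Euclidean distance in L*a*b*) is used as a stand-in for CIEDE2000 to keep the example short. The sRGB to L*a*b* conversion assumes 8-bit sRGB input and the D65 white point.

```python
import math


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE L*a*b* (D65 white point)."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    # Linear sRGB -> CIE XYZ (D65)
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(lab1, lab2):
    """CIE76 color difference (a simpler stand-in for CIEDE2000)."""
    return math.dist(lab1, lab2)


def greenness_index(leaf_rgb, green_ref_rgb, yellow_ref_rgb):
    """Hypothetical index: 1.0 at the green reference, 0.0 at the yellow one.

    Assumes the two reference colors differ, so the denominator is nonzero.
    """
    leaf = srgb_to_lab(leaf_rgb)
    d_green = delta_e76(leaf, srgb_to_lab(green_ref_rgb))
    d_yellow = delta_e76(leaf, srgb_to_lab(yellow_ref_rgb))
    return d_yellow / (d_green + d_yellow)


# Hypothetical mean colors sampled from one image (leaf and both references)
green_ref = (34, 139, 34)
yellow_ref = (255, 215, 0)
leaf = (120, 170, 40)

print(f"index = {greenness_index(leaf, green_ref, yellow_ref):.3f}")
```

Because the leaf and both references are sampled from the same photograph, the index depends on where the leaf color lies between the two references rather than on its absolute value, which reflects the motivation for the reference slate described above.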

Conclusions and Future Research Directions
This paper presents a vision-based, fully automated system to estimate nitrogen content in crops and plants. The system can be used by non-experts, as no specific camera or controlled environment is required to capture the images. Leaf images can be captured using any type of camera, including smartphone cameras. The system takes a leaf image and uses the leaf color to calculate its nitrogen content. The proposed algorithm uses different feature detection and background removal techniques to accurately segment the leaf and the reference colors from the image. Moreover, numerous color distance models were evaluated to find the best model for the problem under investigation. The experimental evaluations were performed on a dataset of spinach leaves, which comprises images captured with an ordinary smartphone camera. The performance was measured using different parameters, revealing that the estimates of the proposed algorithm are highly correlated (R² of 0.9198 and PLCC of 0.9590) with the nitrogen amount measured by the Kjeldahl method in the laboratory, which demonstrates its effectiveness. A software release of the proposed vision-based framework for N-nutrient estimation in crops is made publicly available on the project website: http://faculty.pucit.edu.pk/~farid/Research/GCV.html, accessed on 8 June 2021.
In the future, we plan to investigate the performance of the proposed system for other crops and plants. Studying the impact of color models for different crops is another interesting future research direction. In the dataset used in the performance evaluation of the proposed method, the images were of different resolutions, and the proposed leaf detection and reference color circle extraction worked accurately. However, building a dataset of leaf images captured with different cameras, under different illumination conditions, and at varying resolutions is important to truly assess the performance of N-nutrient estimation algorithms.

Conflicts of Interest:
The authors declare no conflict of interest.