Automatic Stomatal Segmentation Based on Delaunay-Rayleigh Frequency Distance

The CO2 and water vapor exchange between leaf and atmosphere are relevant for plant physiology. This process is done through the stomata. These structures are fundamental in the study of plants since their properties are linked to the evolutionary process of the plant, as well as its environmental and phytohormonal conditions. Stomatal detection is a complex task due to the noise and morphology of the microscopic images. Although in recent years segmentation algorithms have been developed that automate this process, they all use techniques that explore chromatic characteristics. This research explores a unique feature in plants, which corresponds to the stomatal spatial distribution within the leaf structure. Unlike segmentation techniques based on deep learning tools, we emphasize the search for an optimal threshold level, so that a high percentage of stomata can be detected, independent of the size and shape of the stomata. This last feature has not been reported in the literature, except for those results of geometric structure formation in the salt formation and other biological formations.


Introduction
Stomata are the gates through which gas exchange through the leaf takes place, involving carbon dioxide intake and water vapor loss (reviewed in Reference [1][2][3]). The carbon dioxide intake drives plant growth and productivity [4], while the water vapor loss through a process known as transpiration regulates leaf temperature, nutrient uptake, and root-to-shoot signaling [5,6]. The remaining of the epidermis is covered by an impervious cuticle, with restricted gas exchange properties [7,8]. Gas-exchange capacity depends on stomatal density (i.e., stomata number per unit area), size, and patterning (spacing) [9][10][11]. Methods for automatic segmentation of stomatal density and size have been earlier presented [12,13], whereas patterning (spacing) is currently performed manually [14].
From a computational point of view, stomatal analysis has been explored mainly on pixel or object analysis. Although localization and processing tools exists, these are limited to search for stomatal position through techniques based on chromatic and morphological combinatorial operations [15][16][17], texture analysis [16], fractal analysis [18], segmentation using object-oriented method in multiple resolutions [19], and, more recently, by Deep Convolutional Neural Networks [12,13,[20][21][22]. Today, no technique allows full spectrum analysis of different stomatal types since they have a great variation according to species, shape, position, and, in general, noise and technique present in the microscopy

Results
Delaunay-Rayleigh Threshold Binarization (DRTB) algorithm has been evaluated over a set of 31 optic microscopy images from Jatoba Hymenaea Courbaril tree localized in Caribbean, Central and South American zones. The images were taken with an Olympus E-330 camera with optic microscope with 3136 × 2352 resolution pixels, at the Biosciences Institute at University of São Paulo, Brazil (USP). The images' set has 3087 regions hand-classified as stomata (see Supplementary Materials section) with its coordinates (x,y).
The abaxial surfaces of the epidermis were obtained from three or four leaves (one per individual) of each species. From each collected leaf, a sample of approximately 1 cm 2 was removed from the leaf's middle region, between the margin and midrib. Dissociation of the leaf epidermis was performed using a 1:1 solution of glacial acetic acid and hydrogen peroxide at 60 • C for 12 h, or the time required to completely decouple the epidermis [34].
The algorithm's performance was evaluated by two statistical indicators: precision (also known as positive predictive value, PPV) and recall (or true positive rate, TPR). The procedure consists of manually identifying stoma's center (x,y) coordinates. Then, the segmentation-algorithm identifies this position and proceeds with a comparison. If the distance between the manual coordinate is less than a threshold with respect to the coordinate of the segmented region, we consider that it has been classified correctly. This case is considered as True Positive (TP). In case the algorithm does not find the stoma, even when it is present in the image, we consider this case as False Negative (FN). Finally, when the algorithm classifies a region where there is no stoma, we consider this coordinate as False Positive (FP) (see Figure 1). Therefore, the evaluation metrics are: From this set, the proposed algorithm is capable of 2752 detections (recall) of 89.14 ± 8%. On the other hand, the number of false positive regions (precision) is 72.8 ± 10%. As shown on Figure 2, the performance is variable with each sample, given structural conditions; however, a high classification rate is achieved as a result of multilevel threshold traveling design (see examples in Figures 3 and 4). In contrast, lower results occur at lower image quality where the stomata are located because they are out of focus. The combined performance of both measures reflects this fact, and it is shown in Figure 3 (specimen). Performance of the proposed algorithm applied to 31 specimens of Hymenaea Courbaril (Jatoba Database). Overall recall and precision are 89.14% and 72.8%, respectively. Maximum recall performance is achieved at specimens #8, #9, and #10 with 100% and maximum precision performance is reached at 98% by specimen #3. Worst recall and precision performance is 70% and 58%, respectively, both at specimen #30. Illustration of precision and recall concepts. True positive (TP) occurs when the distance between manual-coordinate (yellow cross) is less than a threshold with respect to the coordinate of the segmented-region (ellipses). False negative (FN) occurs when the algorithm does not find the stoma, even when it is present in the image. Finally, when the algorithm classifies a region where there is no stoma, we consider this coordinate as False Positive (FP). The ratio TP/(FP + TP) is known as precision (positive predictive value (PPV)), and the ratio (TP/TP + FN) is called recall (true positive rate (TPR)).
From this set, the proposed algorithm is capable of 2752 detections (recall) of 89.14 ± 8%. On the other hand, the number of false positive regions (precision) is 72.8 ± 10%. As shown on Figure 2, the performance is variable with each sample, given structural conditions; however, a high classification rate is achieved as a result of multilevel threshold traveling design (see examples in Figures 3 and 4). In contrast, lower results occur at lower image quality where the stomata are located because they are out of focus. The combined performance of both measures reflects this fact, and it is shown in Figure 3 (specimen). From this set, the proposed algorithm is capable of 2752 detections (recall) of 89.14 ± 8%. On the other hand, the number of false positive regions (precision) is 72.8 ± 10%. As shown on Figure 2, the performance is variable with each sample, given structural conditions; however, a high classification rate is achieved as a result of multilevel threshold traveling design (see examples in Figures 3 and 4). In contrast, lower results occur at lower image quality where the stomata are located because they are out of focus. The combined performance of both measures reflects this fact, and it is shown in Figure 3 (specimen). Performance of the proposed algorithm applied to 31 specimens of Hymenaea Courbaril (Jatoba Database). Overall recall and precision are 89.14% and 72.8%, respectively. Maximum recall performance is achieved at specimens #8, #9, and #10 with 100% and maximum precision performance is reached at 98% by specimen #3. Worst recall and precision performance is 70% and 58%, respectively, both at specimen #30. Performance of the proposed algorithm applied to 31 specimens of Hymenaea Courbaril (Jatoba Database). Overall recall and precision are 89.14% and 72.8%, respectively. Maximum recall performance is achieved at specimens #8, #9, and #10 with 100% and maximum precision performance is reached at 98% by specimen #3. Worst recall and precision performance is 70% and 58%, respectively, both at specimen #30. Orange cross indicates mean (recall/performance 83%/72%) and standard deviation (8%/10%). (Right) Specimen #1 represents an average performance (85%/72%) with clear regions but high diffusion. Specimen #3 has 98%/89% detection performance, and stomata shows sharp borders with clear regions leading to good detection metrics. Specimen #12 shows 94%/58% with high recall and lower precision, explained by diffuse borders and poor region definition. Specimen #30 has the poorest performance with 70%/58% with very diffuse borders and poor regions with high false positive rate.  (Right) Specimen #1 represents an average performance (85%/72%) with clear regions but high diffusion. Specimen #3 has 98%/89% detection performance, and stomata shows sharp borders with clear regions leading to good detection metrics. Specimen #12 shows 94%/58% with high recall and lower precision, explained by diffuse borders and poor region definition. Specimen #30 has the poorest performance with 70%/58% with very diffuse borders and poor regions with high false positive rate. Orange cross indicates mean (recall/performance 83%/72%) and standard deviation (8%/10%). (Right) Specimen #1 represents an average performance (85%/72%) with clear regions but high diffusion. Specimen #3 has 98%/89% detection performance, and stomata shows sharp borders with clear regions leading to good detection metrics. Specimen #12 shows 94%/58% with high recall and lower precision, explained by diffuse borders and poor region definition. Specimen #30 has the poorest performance with 70%/58% with very diffuse borders and poor regions with high false positive rate.

Discussion
Previous results reveal a high precision rate in the detection of Hymenaea Courbaril stomata. However, is this distribution similar on other species? To answer this question, we analyzed seven species with different microscopy techniques. Experimental results show that not only Hymenaea Courbaril has a stable distribution, we applied a manual segmentation to 2842 stomata with 12 different configurations shown in Figure 5 and described in Table 1. Despite morphological and chromatic differences, the spatial distributions are similar, as it is shown in Figure 4. These results are consistent with the findings found by Croxdale [9], where the relationship between area and number is analyzed in the first instance. This result is relevant since it provides statistical evidence that would allow us to design an optimal algorithm for the relative distance between stomata, which is consistent with Peat & Fitter [35] developments. The only requirement is a preprocess with efficient acceptance/rejection range for stomatal segmentation. This last task could be performed by any segmentation algorithm, such as those mentioned above. It is known that environmental conditions during growth may also affect stomatal dispersion pattern among species [36,37]. Based on this background, stomatal spacing may be employed as a marker for studying plant adaptation to growth environment [38]. Next, we present a brief discussion of manual segmentation in different types of plants of the Monocot family, compared to the automatic evaluation of the Dicot family.
Plants 2020, 9, x FOR PEER REVIEW 6 of 17 Figure 5. Hand segmentation to 2842 stomata within 12 species as described in Table 1. Despite morphological and chromatically differences, the spatial distributions are similar to the example shown in Figure 4 (Hymenaea Courbaril).

Manual Segmentation
The automatic algorithm evaluation gives = 89.14 ± 8% and = 72.8 ± 10%. Those results are relevant given image type. Our results are statistically valid, considering the total number of stomata studied. To assess if this relation is relevant for other species, we evaluate 12 different configurations on 7 different plant species. Despite interspecies variations, a relevant result is the Root-Mean-Square Deviation (RMSD) falling with segmented region number. In terms of distribution parameters, they are stable in the range 3 < < 8 as long as region number is lower than 100. As future work, it remains to design an optimal search algorithm based on a distribution parameter in the appropriate range, depending on the type of species, the analysis of second (autocovariance) and higher order moments of geometric characteristics, and the cross covariances with other relevant properties captured by microscopes, like color.
According to the analyzed data, the segmented region number affects the algorithm performance. In general, the higher the segmented region number the lower RMSD between Rayleigh distribution and frequency histogram; which confirms the true nature of the spatial distribution of stomata (see Figure 6). Moreover, Rayleigh distribution ideal parameters (that minimize RMSD) have  Table 1. Despite morphological and chromatically differences, the spatial distributions are similar to the example shown in Figure 4 (Hymenaea Courbaril).

Manual Segmentation
The automatic algorithm evaluation gives TPR = 89.14 ± 8% and PPV = 72.8 ± 10%. Those results are relevant given image type. Our results are statistically valid, considering the total number of stomata studied. To assess if this relation is relevant for other species, we evaluate 12 different configurations on 7 different plant species. Despite interspecies variations, a relevant result is the Root-Mean-Square Deviation (RMSD) falling with segmented region number. In terms of distribution parameters, they are stable in the range 3 < σ < 8 as long as region number is lower than 100. As future work, it remains to design an optimal search algorithm based on a distribution parameter in the appropriate range, depending on the type of species, the analysis of second (autocovariance) and higher order moments of geometric characteristics, and the cross covariances with other relevant properties captured by microscopes, like color.
According to the analyzed data, the segmented region number affects the algorithm performance. In general, the higher the segmented region number the lower RMSD between Rayleigh distribution and frequency histogram; which confirms the true nature of the spatial distribution of stomata (see Figure 6). Moreover, Rayleigh distribution ideal parameters (that minimize RMSD) have a narrow range between 3 < σ < 8; see Figure 7. For each species, this parameter is different; however, results tend to cluster as the segmented region number reaches 100.

Defocus Stomata in Automatic Mode
In the automatic segmentation mode, we observe relevant differences in the maximum and minimum performance of the proposed algorithm. When analyzing these differences in detail, we observe that they are due to the lack of focus of the stomata in some samples of the study (Figure 8).

Defocus Stomata in Automatic Mode
In the automatic segmentation mode, we observe relevant differences in the maximum and minimum performance of the proposed algorithm. When analyzing these differences in detail, we observe that they are due to the lack of focus of the stomata in some samples of the study (Figure 8).

Figure 8.
Comparison between high and low focused stomata. In specimen #10, stomata are clearly visible with sharp edges with very low RMSD (see Figure 7). Specimen #29 has diffused edges.

Main Advantages and Disadvantages
As a first advantage, we highlight that our method is based on the stomatal geometric properties, easily verifiable, such as centroids spatial distribution; which does not depend on the shape and size of the stomata. Second, the image-preprocessing method is a well-established procedure in the field of imaging, and it is inspired by the property of diffusion, which is one of the fundamental mechanisms of stomatal physiology. Third, stomatal geometry is described by a Delaunay tessellation, which is a technic known to captures diffusive-processes, such as those occurring at the cellular level. Fourth, the use of the median as the main statistic decreases outliers-weight, which increases robustness. Regarding the disadvantages, our proposal might be classified as standard as it was not implemented through a deep-learning scheme (although it would be possible to do so). Despite our efforts to capture a large number of samples, we recognize that the number of species is limited, although high variability between samples can be observed. Regarding the distribution's discrimination method, it is known that RMDS has some bias, which could be avoided with an Akaike scheme or similar. The preprocessing phase of our proposal is designed for Hymenaea Courbaril, as future work, and this process might be automated in such a way that it is possible to accept segmentation in other types of stomata.

Materials and Methods
The natural way to deal with structural complexity found in stomata images is noise analysis. The latter is associated, by large, to simplification and information-reduction processes, like anisotropic diffusion, wavelet transform techniques, and nonlinear, statistical, or adaptive filters [39][40][41]. Even when image segmentation is possible, for example, by morphological processing, the use of geometric properties is desirable. Therefore, we propose a novel segmentation method that seeks an optimal binarization threshold through Rayleigh-distribution distance-minimization. This idea was inspired by Staff et al. [42], where they analyzed stomatal pore center coordinates, but, in this case, only areas of the inner triangles and their angles were reported. In Reference [43], a statistical mechanics explanation of this phenomenon is given through particle interaction, in this case, the Figure 8. Comparison between high and low focused stomata. In specimen #10, stomata are clearly visible with sharp edges with very low RMSD (see Figure 7). Specimen #29 has diffused edges.

Main Advantages and Disadvantages
As a first advantage, we highlight that our method is based on the stomatal geometric properties, easily verifiable, such as centroids spatial distribution; which does not depend on the shape and size of the stomata. Second, the image-preprocessing method is a well-established procedure in the field of imaging, and it is inspired by the property of diffusion, which is one of the fundamental mechanisms of stomatal physiology. Third, stomatal geometry is described by a Delaunay tessellation, which is a technic known to captures diffusive-processes, such as those occurring at the cellular level. Fourth, the use of the median as the main statistic decreases outliers-weight, which increases robustness. Regarding the disadvantages, our proposal might be classified as standard as it was not implemented through a deep-learning scheme (although it would be possible to do so). Despite our efforts to capture a large number of samples, we recognize that the number of species is limited, although high variability between samples can be observed. Regarding the distribution's discrimination method, it is known that RMDS has some bias, which could be avoided with an Akaike scheme or similar. The preprocessing phase of our proposal is designed for Hymenaea Courbaril, as future work, and this process might be automated in such a way that it is possible to accept segmentation in other types of stomata.

Materials and Methods
The natural way to deal with structural complexity found in stomata images is noise analysis. The latter is associated, by large, to simplification and information-reduction processes, like anisotropic diffusion, wavelet transform techniques, and nonlinear, statistical, or adaptive filters [39][40][41]. Even when image segmentation is possible, for example, by morphological processing, the use of geometric properties is desirable. Therefore, we propose a novel segmentation method that seeks an optimal binarization threshold through Rayleigh-distribution distance-minimization. This idea was inspired by Staff et al. [42], where they analyzed stomatal pore center coordinates, but, in this case, only areas of the inner triangles and their angles were reported. In Reference [43], a statistical mechanics explanation of this phenomenon is given through particle interaction, in this case, the distance between stomata, as a means of regulation and state of the tissue in general. An algorithm summary is shown in Figure 9.
Plants 2020, 9, x FOR PEER REVIEW 9 of 17 distance between stomata, as a means of regulation and state of the tissue in general. An algorithm summary is shown in Figure 9.

Preprocessing
Standard noise-reduction and boundary-preservation techniques are used given the common noise properties present in most stomata images. We propose a two-step methodology based on Perona-Malik (PM) diffusion and Meanshift-Hadamard intensity-reduction.
Step I. Perona-Malik: The first step is PM iterative filter application [44]. This algorithm reduces noise level preserving and, at the same time, boundary structure through diffusion adaptation [45], meaning a lower diffusion constant near boundaries, and higher otherwise. Being an interactive filter, level simplification of image structures is possible, which is a leading characteristic in stomata search. Filter design criteria defined by Perona and Malik [46] are causality, immediate localization, and piecewise smoothing. The last three properties are relevant in our problem given that stomata have well defined structures hard to segment, mainly by their surrounding structures. As for causality, PM filters all regions classified as noise. Immediate localization allows sharp boundaries, despite scale changes, and the most useful property here is piecewise smoothing since it allows smaller structures to collapse into larger ones sharing a visual similarity criterion.
Let be an image defined over the space , 3 (ℕ) of arrays with rows and columns with pixels in the natural numbers. Depending on the color-space, {Red, Green, Blue} (RGB) or {Hue,Saturation,Value} (HSV)superscripts are used to refer the components. The PM filter defined from , (ℕ) into itself is a solution of the heat-equation parameterized by representing the number of times the filter is applied (time in the original partial differential equation context) and a diffusion parameter controlling the noise level (see Reference [46] for more details) (see Figure 10). As a result of PM filter application over RGB space, the red channel increases separation levels with respect to background, and this result is relevant as the application of PM filter improves stomata profile detection, as it is exemplified in Figure 11.

Preprocessing
Standard noise-reduction and boundary-preservation techniques are used given the common noise properties present in most stomata images. We propose a two-step methodology based on Perona-Malik (PM) diffusion and Meanshift-Hadamard intensity-reduction.
Step I. Perona-Malik: The first step is PM iterative filter application [44]. This algorithm reduces noise level preserving and, at the same time, boundary structure through diffusion adaptation [45], meaning a lower diffusion constant near boundaries, and higher otherwise. Being an interactive filter, level simplification of image structures is possible, which is a leading characteristic in stomata search. Filter design criteria defined by Perona and Malik [46] are causality, immediate localization, and piecewise smoothing. The last three properties are relevant in our problem given that stomata have well defined structures hard to segment, mainly by their surrounding structures. As for causality, PM filters all regions classified as noise. Immediate localization allows sharp boundaries, despite scale changes, and the most useful property here is piecewise smoothing since it allows smaller structures to collapse into larger ones sharing a visual similarity criterion.
Let P be an image defined over the space M 3 n,m (N) of arrays with n rows and m columns with pixels in the natural numbers. Depending on the color-space, {Red, Green, Blue} (RGB) or {Hue, Saturation, Value} (HSV)superscripts are used to refer the components. The PM filter ∂ t defined from M n,m (N) into itself is a solution of the heat-equation parameterized by t representing the number of times the filter is applied (time in the original partial differential equation context) and a diffusion parameter controlling the noise level (see Reference [46] for more details) (see Figure 10). As a result of PM filter application over RGB space, the red channel increases separation levels with respect to background, and this result is relevant as the application of PM filter improves stomata profile detection, as it is exemplified in Figure 11.   Figure 10) the image is decomposed into red, green, and blue channels (RGB decomposition) with a standard routine (see shared code at GitHub). The red channel is used later in the next process.
Step II. Meanshift-Hadamard: The second step is the application of Meanshift operator over the red channel to obtain an image with a reduced intensity level but with sharp defined boundaries, which eases the segmentation process. This step uses an unsupervised clustering algorithm over the intensity-space through a minimization process between each energy cluster [47]. With the aim at improving signal-background splitting, a Hadamard division [48] operator between saturation (HSV space) and red (RGB space) channels is applied. This operation enhances signal splitting, as it can be seen in Figure 12. The final operator is the consecutive application of PM, Meanshift and Hadamard division (3): where is the resultant preprocessed image.   Figure 10) the image is decomposed into red, green, and blue channels (RGB decomposition) with a standard routine (see shared code at GitHub). The red channel is used later in the next process.
Step II. Meanshift-Hadamard: The second step is the application of Meanshift operator over the red channel to obtain an image with a reduced intensity level but with sharp defined boundaries, which eases the segmentation process. This step uses an unsupervised clustering algorithm over the intensity-space through a minimization process between each energy cluster [47]. With the aim at improving signal-background splitting, a Hadamard division [48] operator between saturation (HSV space) and red (RGB space) channels is applied. This operation enhances signal splitting, as it can be seen in Figure 12. The final operator is the consecutive application of PM, Meanshift and Hadamard division (3): where is the resultant preprocessed image.  Figure 10) the image is decomposed into red, green, and blue channels (RGB decomposition) with a standard routine (see shared code at GitHub). The red channel P R is used later in the next process.
Step II. Meanshift-Hadamard: The second step is the application of Meanshift M operator over the red channel to obtain an image with a reduced intensity level but with sharp defined boundaries, which eases the segmentation process. This step uses an unsupervised clustering algorithm over the intensity-space through a minimization process between each energy cluster [47]. With the aim at improving signal-background splitting, a Hadamard division [48] operator between saturation (HSV space) and red (RGB space) channels is applied. This operation enhances signal splitting, as it can be seen in Figure 12. The final operator is the consecutive application of PM, Meanshift and Hadamard division (3): where F is the resultant preprocessed image. This two-step algorithm might be modified based upon stomata-image type. However, the proposed technique is useful as segmentation method as it helps with the post-labeling process, which is a main focus of the research.

Delaunay-Rayleigh Threshold Binarization (DRTB Algorithm)
Our main algorithm uses stomata spatial localization as a method to find an optimal binarization threshold. As input, it uses the grey-scale image and outputs an optimal umbralization level such that error in Delaunay-distance frequencies be minimum with respect to an ideal Rayleigh distribution [28] (Figure 13). Next, we present detailed description of DRTB algorithm phases.  Figure 10), the filtered red-channel ∂ t P R is used as input for Meanshift; the output M ∂ t P R is combined with the saturation channel of the original image P S through the Hadamard division, and this later image is subject to clustering. This two-step algorithm might be modified based upon stomata-image type. However, the proposed technique is useful as segmentation method as it helps with the post-labeling process, which is a main focus of the research.

Delaunay-Rayleigh Threshold Binarization (DRTB Algorithm)
Our main algorithm uses stomata spatial localization as a method to find an optimal binarization threshold. As input, it uses the grey-scale image F and outputs an optimal umbralization level such that error in Delaunay-distance frequencies be minimum with respect to an ideal Rayleigh distribution [28] ( Figure 13). Next, we present detailed description of DRTB algorithm phases.
Step III. Threshold level binarization: First, phase is binary-image building given a threshold. Literature gives various binarization techniques; however, most of them use as input information the relationship between intensity levels of the image. Instead, our algorithm explores a geometric-analysis of image embedded structures as optimal binarization-level search method. Initially, all binarization level-space values are traversed by normalization of intensity-levels in [0, 1] domain and then we explore such space. Binarization operator H l (F) at level l is defined as: which has the same size of F but with values in [0, 1]. All pixels with values higher that l are given the value 1 and the rest 0. From the image H l (F), we can obtain a series of n R regions with areas given by r = [r 1 , r 2 , r 3 , . . . , r n R ].
Step IV. Binary labeling and filtering: Valid regions are those whose area be not an outlier. We propose [49] methodology for outlier detection. First, median absolute deviation (MAD) is determined as: measured in pixels with value 1, and b = 1.4826 is a normality constant. Area outlier detection criterion is defined as: This means that all regions not fulfilling the criterion will be discarded in the following analysis.
Step V. Delaunay tessellation: This phase consists in the generation of a triangular tessellation for all regions defined as segmented inliers. Mass centers for each i-th region are determined through central moment estimation [50].  Figure  10), the filtered red-channel is used as input for Meanshift; the output ( ) is combined with the saturation channel of the original image through the Hadamard division, and this later image is subject to clustering. This two-step algorithm might be modified based upon stomata-image type. However, the proposed technique is useful as segmentation method as it helps with the post-labeling process, which is a main focus of the research.

Delaunay-Rayleigh Threshold Binarization (DRTB Algorithm)
Our main algorithm uses stomata spatial localization as a method to find an optimal binarization threshold. As input, it uses the grey-scale image and outputs an optimal umbralization level such that error in Delaunay-distance frequencies be minimum with respect to an ideal Rayleigh distribution [28] (Figure 13). Next, we present detailed description of DRTB algorithm phases.  Let m i the mass center corresponding to i-th region (see Figure 14). A Delaunay algorithm uses an iterative process to generate a triangular tessellation considering a given region. The output is a set of 3 vertices for each triangle in terms of its coordinates m 1 i , m 2 i , m 3 i ; given a two-dimensional set of points, it is easy to build a tessellation joining the points (vertices) with lines (edges) in a triangulation. For each triangle, a circumcircle might be drawn. A Delaunay tessellation is such a triangulation that no vertex from the tessellation is found inside the circles. Figure 14 shows a triangulation composed by 17 vertices. For each triangle, its circumcircle is drawn. A circle meets three vertices, and no vertex is inside. Delaunay tessellations tend to maximize the smallest inner angle possible. Diffusion operators complies with a maximum principle, which are relevant in existence and uniqueness of solutions under this operation; it is possible to extend these properties to numerical solution calculations obtained over Delaunay tessellations, mainly because of the geometric regularity imposed over the angle distribution [51]. This kind of tessellation are used in partial differential equation analysis and morphogenesis studies, and we expect it might play a role given the importance of gas-diffusion in stomatal physiology. Step VI. Distance analysis: As result of the last process, a set of triangles in obtained (Figure 14), each vertex an inlier-region. We analyze the set of all distances: = [ ( 1 , 2 ), ( 2 , 3 ), ( 1 , 3 ) ], = 1, 2, 3, . . . , where (•,•) is the Euclidian distance between vertices, and is the number of tessellated triangles. Let = [ 1 , 2 , ⋯ , , ⋯ , 3 ], a vector composed of all edges, and let the number of occurrences of distance ; then: is the occurrence probability of in the vector for a given threshold obtained from the binarization process.
Step VII. Error estimation: Consider the Rayleigh distribution of parameter : our experimental results point at its presence in all analyzed images (manual or automatic process); see Reference [28] for other applications of this probability in natural formations. Initial calibration points at = 2 and > 0. By root mean-square (RMS) deviation (RMSD) minimization, we assess the proximity of ( | ) to ℛ( | ) as a function of threshold and -parameter: See the Discussion for comments on the significance of the parameter .
Step VIII. Optimal level selection: The ideal stomata distribution is associated with the minimum RMSD. The optimal parameter ̂ happens when such distance is minimum: Once the best level is found, the image is finally binarized. An example is shown in Figure 15. Step VI. Distance analysis: As result of the last process, a set of triangles in obtained (Figure 14), each vertex an inlier-region. We analyze the set of all distances: where δ(·, ·) is the Euclidian distance between vertices, and n is the number of tessellated triangles. Let d = [d 1 , d 2 , · · · , d i , · · · , d 3n ], a vector composed of all edges, and let n i the number of occurrences of distance d i ; then: is the occurrence probability of d i in the vector d for a given threshold l obtained from the binarization process.
Step VII. Error estimation: Consider the Rayleigh distribution of parameter σ: our experimental results point at its presence in all analyzed images (manual or automatic process); see Reference [28] for other applications of this probability in natural formations. Initial calibration points at σ = 2 and t > 0. By root mean-square (RMS) deviation (RMSD) minimization, we assess the proximity of p(d i |l) to R(d i |σ) as a function of threshold l and σ-parameter: RMSD(σ, l) = See the Discussion for comments on the significance of the parameter σ.
Step VIII. Optimal level selection: The ideal stomata distribution is associated with the minimum RMSD. The optimal parameterl happens when such distance is minimum: Once the best level is found, the image is finally binarized. An example is shown in Figure 15.

Conclusions
We showed an optimal segmentation algorithm based on stomata-position frequency-analysis through Delaunay tessellation and Rayleigh distribution. We tested the algorithm with Hymenaea Courbaril before a preprocess technique aimed at noise reduction. Our proposal is centered around a Delaunay-Rayleigh Threshold Binarization (DRTB algorithm) that allows an optimal binarization threshold with a posterior segmentation. The automatic segmentation shows the presence of a stable Rayleigh distribution, despite image differences. This result is relevant given that DRTB algorithm might be applied to other species, as was shown in the manual experimental phase with seven different species.

Conclusions
We showed an optimal segmentation algorithm based on stomata-position frequency-analysis through Delaunay tessellation and Rayleigh distribution. We tested the algorithm with Hymenaea Courbaril before a preprocess technique aimed at noise reduction. Our proposal is centered around a Delaunay-Rayleigh Threshold Binarization (DRTB algorithm) that allows an optimal binarization threshold with a posterior segmentation. The automatic segmentation shows the presence of a stable Rayleigh distribution, despite image differences. This result is relevant given that DRTB algorithm might be applied to other species, as was shown in the manual experimental phase with seven different species.