Global Image Thresholding Adaptive Neuro-Fuzzy Inference System Trained with Fuzzy Inclusion and Entropy Measures

Thresholding algorithms segment an image into two parts (foreground and background) by producing a binary version of the initial input. It is a complex procedure (due to the distinctive characteristics of each image) which often constitutes the initial step of other image processing or computer vision applications. Global techniques calculate a single threshold for the whole image while local techniques calculate a different threshold for each pixel based on specific attributes of its local area. In some of our previous work, we introduced specific fuzzy inclusion and entropy measures which we managed to use efficiently in both global and local thresholding. The general method which we presented was an open and adaptable procedure; it was free of sensitivity or bias parameters and it involved image classification, mathematical functions, a fuzzy symmetrical triangular number and some criteria for choosing between two possible thresholds. Here, we continue this research and try to avoid all of these by automatically connecting our measures with the desired threshold using an Artificial Neural Network (ANN). Using an ANN in image segmentation is not uncommon, especially in the domain of medical images. However, our proposition involves the use of an Adaptive Neuro-Fuzzy Inference System (ANFIS), which means that all we need is a proper database. It is a simple and immediate method which could provide researchers with an alternative approach to the thresholding problem, considering that they probably have at their disposal some appropriate and specialized data.


Introduction
Thresholding is used in separating the foreground (which contains one or more objects) from the background of an image. This is achieved by rendering all of its pixels as either black or white. The result is a binarized version of our initial input, which is usually the introductory step of other applications concerning image processing or computer vision. It is an important process and, though it may seem an easy task, it is in fact a complicated and elaborate procedure because of the wide variety of characteristics which are unique to each image. Thresholding techniques are sorted into two distinct groups: global thresholding methods and local or adaptive thresholding methods. The former assign a unique threshold to the whole image while the latter find a different threshold for every pixel based on the grayscale information of its adjoining pixels. Sometimes, these two are combined into a hybrid method. Local techniques are of course slower and they are usually used in cases where our input is degraded or contains multiple objects, lighting "irregularities" and various distortions. In samples like these, the performance of global methods is usually poor and their results are insufficient. However, in other cases, global techniques are preferred due to their speed and simplicity.
As far as global methods are concerned, when a specific intensity value between 0 and 255 (for greyscale images) is chosen as the threshold t of the image, the pixels whose intensity is larger than t are set as "white" while the other pixels are set as "black". Depending on the way this value t is calculated, global techniques are categorized as histogram shape-based, clustering-based, entropy-based, object attribute-based and spatial approaches [1]. Although there are several general methods of image binarization, the quality of their results may vary depending on the set of images they are applied to. In other words, a method may perform well for specific images, but this does not mean that it will be suitable for all inputs. Hence, we often need to confine our area of experimentation to specific domains with particular attributes.
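The thresholding rule just described can be sketched in a few lines. This is only an illustration of the rule "larger than t becomes white, everything else becomes black"; the function name `binarize` and the 3 x 3 sample image are hypothetical and do not come from our experiments.

```python
def binarize(image, t):
    """Global thresholding: pixels brighter than t become white (255), the rest black (0)."""
    return [[255 if pixel > t else 0 for pixel in row] for row in image]


# Hypothetical 3 x 3 greyscale image with intensities in 0..255.
image = [
    [12, 200, 90],
    [130, 128, 127],
    [255, 0, 64],
]

# One threshold for the whole image, which is what makes the method "global".
binary = binarize(image, t=127)
```

Note that a pixel whose intensity equals t is set to black here, following the strict inequality in the text.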
In [2], we introduced some possible fuzzy inclusion measures and their respective (based on Young's theorem [3]) fuzzy entropy measures. These measures were extracted from a specific formula in which we used fuzzy binary operations (fuzzy intersections or t-norms and fuzzy implications). At the same time, we tried to "enclose" these inclusion functions in some theoretical mathematical context by examining their possibility to follow some axioms concerning fuzzy subsethood. We saw that they can be connected with Young's axioms [3] of fuzzy inclusion, which were introduced as a link and a continuation of Kosko's respective research on fuzzy subsethood and entropy [4][5][6]. Although we had to slightly alter one of Young's original axioms, we didn't invalidate the rest of her work (concerning the connection between fuzzy subsethood and fuzzy entropy as stated earlier by Kosko), which means that our inclusion measures returned some respective entropy measures according to her theorem.
Fuzzy subsethood and entropy functions can be used in a variety of applications (image processing, feature selection, fuzzy classification, fuzzy controllers, fuzzy rules, similarity measures) and there are several presentations dealing with such measures (e.g., [7][8][9][10][11][12][13][14][15][16][17][18][19], to name a few after 2000). In [2], while examining the behavior of our measures, we saw that some of them had some interesting attributes which could offer additional information when specific applications are concerned. In an attempt to prove the applicability of these measures as well as to support our belief that they could offer us a different approach to the solution of a particular problem, we also introduced a general algorithm of global image thresholding. This method efficiently used some of our indicators and was based entirely on them. This means that, unlike other methods of image thresholding, it does not rely on the histogram of the image nor does it depend on the optimization of some statistical or fuzzy measure. We didn't involve any kind of concavity or statistical analysis or fuzzy classification. We only used specific fuzzy inclusion values between the image and some standard images (black, white, grey). The entropy of the image was used as well. It is a very immediate and comprehensible method and its main characteristic is that it is an open and adaptable procedure which can be adjusted to the distinctive characteristics of someone's area of interest.
Then, in [20,21], we adapted this general method to a local level. There, we focused on documents of non-uniform illumination and our algorithm was adjusted to cases of this particular domain. These documents usually include text and line-drawings and there are several occasions where we need to recognize or improve their content. This can be achieved by their proper binarization and it is important for document analysis systems or optical character recognition (OCR) processes. Global methods are insufficient for this kind of image, so we had to adjust our method to a local thresholding technique. Once again, the results were very competitive when compared with those of MATLAB's "adaptthresh" command (version: 9.0.0.341360-R2016a, The MathWorks Inc., Natick, MA, USA). It is also very important that we managed to present a fully automated procedure (regarding user provided constants) and we avoided sensitivity or bias parameters (which are used by most local methods) by properly exploiting some of the information that our measures provided.
After this research on a global and on a local level, we were convinced not only of the applicability of our measures but of their potential to offer valuable information to thresholding problems as well. Therefore, we thought that a logical next step in our research could be an effort to establish the connection between our measures and the desired threshold t in an automatic way. Based on our previous work, a fuzzy neural network seemed very appropriate for this task. One of our main goals here is to examine the possibility of using fuzzy inclusion and entropy measures as input variables of an inference system. Of course, the target setting of the whole process can be puzzling, but, at the same time, it may prove to be an advantage as well. For instance, it could help someone enhance specific results of an existing method (without having to modify or improve the method itself) or approach more specialized targets without having to fabricate a respective algorithm from scratch. Thus, our second goal is to support this argument by presenting some corresponding results. These derive from two specific systems trained with the ANFIS (Adaptive Neuro-Fuzzy Inference System) method. The first one tries to enhance specific binarizations of Otsu's method on a particular set of images and the second one attempts to approach some more specialized results which are mainly focused on edge detection. These examples are only indicative and a visual expert could estimate this whole process better and more accurately.
After the introduction, we give an overview of global thresholding methods and of adaptive neuro-fuzzy inference systems. We also summarize some basic parts of our previous work. In the next section, we explain the procedure that we followed and we present the results of our first global thresholding ANFIS, which is trained with targets obtained from a combination of Otsu's thresholds and human observation. Then, in Section 4, we experiment on public databases and we train a second ANFIS in order to achieve some more focused results, which we evaluate both visually and numerically. This is followed by a summary and some further discussion. Finally, Section 6 contains our conclusions.

Global Thresholding Methods and Neural Networks in Image Segmentation
As we said in our introduction, thresholding techniques are sorted (according to the information they obtain from the data) as histogram shape-based, clustering-based, entropy-based, object attribute-based, spatial approaches and local methods [1]. Histogram-based methods analyze the shape properties of grey-level histograms. The proper threshold is estimated via concavity analysis or regularization and statistical measurements. Some methods examine the distance of the histogram from its convex hull [22][23][24][25][26] and others use curvature analysis in order to find peaks and valleys or overlapping peaks [27][28][29]. In other cases, the histogram is converted into a smoothed two-peaked representation using auto-regressive modeling [30,31] or we have a rectangular approximation to the lobes of the histogram [32,33].
Clustering-based methods output two groups of pixels from gray-level information. Some iterative techniques search for the midpoints of the peaks [34][35][36][37], whereas other algorithms are based on minimization of the error between the form of the image and Gaussian distributions [38,39]. We also have techniques that use fuzzy clustering based on the histogram of the image [33,40] and methods relying on the optimization of some statistical measure. Otsu's [41] method belongs in this category and it is one of the most referenced, well-known and widely used thresholding techniques.
Entropy-based methods exploit the entropy of the distribution of the gray levels of the image. The first to study thresholding using Shannon's entropy were Pal et al. [42] and Johannsen and Bille [43]. In some cases, we have an attempt at maximizing the entropy as an indication of maximum information transfer [44][45][46][47] and in others an effort at minimizing the entropy in order to have maximum information preservation [48][49][50][51].
Object attribute-based methods are based on some attribute quality or on the similarity between the original image and the binarized one. Such attributes may be edge matching [52], gray-level moments [53][54][55], shape compactness [56,57], connectivity [58], texture [59], or stability of segmented objects [60]. Other algorithms use fuzzy measures in order to calculate the resemblance between the original image and its binarized version [61,62].
Spatial methods use gray-level distribution along with the dependency of pixels in a local area. One of the first techniques was Rosenfeld's [63], who used the local average. We also have the use of relaxation [64,65], the Laplacian of the images [24], quadtrees [66] and second-order statistics [67].
Of course, there are several variations or modifications of these algorithms as well. As one can see, there are numerous presentations and a lot of different approaches dealing with thresholding; however, it still remains an open problem. Overall, the dependence of many methods on the shape of the histogram prevents them from generalizing, while other methods are bias or parameter dependent or have a significant computational cost. Generally, all thresholding techniques work well for some types of images but not for all of them. We also mentioned that local methods ([81][82][83][84] are some of the best-known algorithms) calculate a different threshold for each pixel and we believe it is fair for them to be studied separately. In any case, here we focus on global thresholding fuzzy neural networks and we won't elaborate on these techniques.
Finally, we need to mention that there are several attempts at image segmentation with the use of neural networks, especially in the case of medical images. According to Othman and Tizhoosh [85], methods with a self-organizing map (unsupervised learning) use a large number of parameters. Furthermore, these need to be modulated before training and their reliance on prior information complicates their generalization. In [86], we have a two-stage neural network for volume segmentation of medical images. The first stage consists of a self-organizing principal components analysis network and the second of a self-organizing map used for clustering. In [87], medical images are segmented using a constraint-based Hopfield network. An incremental neural network for the segmentation of tissues in ultrasound images is presented in [88], where discrete Fourier transform (DFT) and discrete cosine transform (DCT) are used. In [89], we have a constraint satisfaction neural network used on a survey on image segmentation and, in [90], we have self-organizing neural networks used in the segmentation of astronomical images.
What we present here is different from these methods and our technique directly derives from our algorithms in [2,20,21]. The binarization of the image is accomplished based solely on some fuzzy inclusion and entropy measurements (after our image is transformed into a fuzzy set) and we don't use any information regarding the histogram of our input. Moreover, we don't rely on any kind of analysis (concavity, statistical, etc.) of the image and no sensitivity or bias parameters are used. These inclusion and entropy values (five in total) constitute the inputs of our ANFIS, while the target could be extracted, for example, from some other methods, human experience, data analysis or a combination of several procedures. All of these render our whole process very immediate and comprehensible and someone only needs some proper data to train their fuzzy system. Of course, the setting of the ground truth may be difficult or tiresome; nevertheless, it could help us improve specific results of a global technique or accomplish more specialized targets without having to modify the method itself or turn to more complex algorithms (local or hybrid).

Fuzzy Subsethood and Entropy Measures
In [2,91], we studied the production of possible fuzzy subsethood and entropy measures according to Young's axioms and theorem. Let X be our universal set (a finite set), ∅ its complement, F(X) its power set and A, B members of F(X) with corresponding membership grades m_A(x) and m_B(x). Moreover, A^C will be the standard complement of A (meaning that m_{A^C}(x) = 1 − m_A(x)), P will be the fuzziest set (m_P(x) = 1/2 for every x ∈ X) and A ⊆ B in Zadeh's sense will mean that m_A(x) ≤ m_B(x) for every x ∈ X. Next are Young's original axiomatization of fuzzy inclusion and her theorem regarding fuzzy entropy measures:
Definition 1. A fuzzy inclusion measure is a function S : F(X) × F(X) → I which satisfies the following:
Theorem 1. If S is a fuzzy subsethood measure (meaning a function satisfying S1-S3), then E defined as is an entropy measure (according to De Luca and Termini (1972)) of fuzzy set A, where
We initially tried to produce fuzzy inclusion measures (according to Young's axioms) and their corresponding (according to Theorem 1) entropy measures based on the following formula: where T is a fuzzy intersection (t-norm) and I is a fuzzy implication. Our initial efforts to combine several known fuzzy t-norms and implications resulted in producing only Kosko's measures [4][5][6].
After we replaced axiom S3 with the following: without affecting the rest of Young's work, we managed to produce various possible fuzzy inclusion and entropy measures by combining several common fuzzy intersections and implications. More precisely, we had seven possible fuzzy inclusion indicators which satisfy (at least) S1, S2 and S3 and three corresponding entropy measures (some inclusion measures returned the same entropy indicators).
Here, once again, we focus on the following two inclusion indicators: and where and These derive from our formula by using as T the usual product. I_Luc is Łukasiewicz's implication whereas I_Z is an alteration of Zadeh's implication in order to satisfy the following property: (which is in fact the boundary condition regarding fuzzy implications as presented in [92]). These two functions measure the inclusion of a fuzzy set A into ∅ as larger than zero and this was important for our algorithms in [2,20,21] to work.
The following function E_1 is their common respective entropy measure: This will also be used during the binarization process.
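Since the text notes that our first combinations of t-norms and implications reproduced only Kosko's measures, a small sketch using Kosko's classical subsethood may help fix ideas. It is not the S_1 or S_2 of this paper. It assumes the usual statement of Young's theorem, E(A) = S(A ∪ A^C, A ∩ A^C), with the standard max/min union and intersection; the function names and sample sets are hypothetical.

```python
def kosko_subsethood(a, b):
    """Kosko's degree of inclusion of fuzzy set a in b: sum(min(a, b)) / sum(a)."""
    total = sum(a)
    if total == 0:
        return 1.0  # the empty set is included in every set
    return sum(min(x, y) for x, y in zip(a, b)) / total


def entropy_from_subsethood(a, subsethood=kosko_subsethood):
    """Entropy built from a subsethood measure (assumed form of Young's theorem)."""
    complement = [1.0 - x for x in a]
    union = [max(x, y) for x, y in zip(a, complement)]
    intersection = [min(x, y) for x, y in zip(a, complement)]
    return subsethood(union, intersection)


crisp = [0.0, 1.0, 1.0, 0.0]     # no fuzziness at all
fuzziest = [0.5, 0.5, 0.5, 0.5]  # the set P of the text
mixed = [0.2, 0.9, 0.5, 0.6]

e_crisp = entropy_from_subsethood(crisp)        # minimal entropy
e_fuzziest = entropy_from_subsethood(fuzziest)  # maximal entropy
e_mixed = entropy_from_subsethood(mixed)        # strictly between the two
```

With Kosko's subsethood the construction reduces to his ratio of sum(min(a, 1 − a)) to sum(max(a, 1 − a)), which is zero for crisp sets and one for the fuzziest set.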

Global and Local Thresholding Using Fuzzy Inclusion and Entropy Measures
During the second part of [2], we tried to validate the usefulness and the applicability of these measures by designing a global thresholding method based on them. Our main goal was to check their reliability through their potential to be used in real applications. Thresholding ensued from our experimentations since we were obtaining some very satisfying and promising results. Thus, we didn't focus on any particular domain of images and our research included several pictures of various characteristics.
Let's have a brief synopsis of what we saw there. Let im be an m × n gray-scale image. We flatten im and we divide its pixel intensities by 255 so that we have a fuzzy set A of m · n elements. X is a completely white image of the same size, ∅ is a completely black image and P is a "completely grey" image (the fuzziest set). Our set of inputs consisted of images of various sizes and characteristics and we measured the inclusion (using S_1) between them and sets X, ∅ and P. We also measured their entropy using E_1. Let's say that The value of s_1 indicates the "brightness" of our picture and the value of s_2 its "darkness". s_3 and s_4 measure its "grayness". Several experimentations and observations led us to the assumption that there might be a relation between these values and the desired threshold of im. Then, we tried to establish this connection in a more specific mathematical manner. After further trials and tests, we came up with Algorithm 1, which is our general algorithm of global thresholding. What we have done could also be done using measure S_2 in a similar way. The distinctive feature of Algorithm 1 is that it is not a strict mathematical process but an open and adaptable procedure. This lies in steps 4 and 7, where the connection between r and s_1, s_2 (or even r and all of s_1, s_2, s_3, s_4 and e) isn't unequivocal and the same goes for the criterion of setting t as t_1 or t_2. Since it is widely accepted that thresholding algorithms don't necessarily perform efficiently on every particular set of images, we consider this flexibility as an advantage of our method.
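The fuzzification described at the start of this synopsis, together with the three reference sets, can be sketched as follows. This is a minimal illustration only: the helper name `fuzzify` and the 2 x 2 sample image are hypothetical, and the inclusion and entropy measures themselves are not computed here.

```python
def fuzzify(image):
    """Flatten an m x n greyscale image and scale its intensities to [0, 1]."""
    return [pixel / 255 for row in image for pixel in row]


image = [[0, 51], [204, 255]]  # hypothetical 2 x 2 greyscale image

A = fuzzify(image)             # fuzzy set A with m * n membership grades
X = [1.0] * len(A)             # completely white image
EMPTY = [0.0] * len(A)         # completely black image
P = [0.5] * len(A)             # "completely grey" image (the fuzziest set)
```

All four sets share the same length, so any inclusion measure defined elementwise can be evaluated between A and each of X, EMPTY and P.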
A main difficulty in testing image processing algorithms is the limited number of proper and specialized databases of images for comparing the various techniques [93]. Researchers usually use a small set of inputs (occasionally repeating the same samples over and over again) and we often have a technique which is sufficient for this specific set, but we don't have adequate evidence regarding its effectiveness in other cases. Thus, due to this absence of data (regarding global thresholding and the proper single value target), we had to set specific targets on our own. Considering that our main target was to evaluate the reliability and applicability of our measures, we used various samples which belong to several domains and have different characteristics (bright images, dark images, high-entropy images, one-object images, multi-object images, noisy images and so on). As our goal, we tried to match (or, if possible, surpass) the results of Otsu's method. Certainly, in several cases, this may not be the ideal target, but it was a good initial standard for estimating the efficiency of our process.
Our first implementation of Algorithm 1 was based on linear regression analysis where we used a specific function r = r(s_1, s_2) and a particular criterion cr_1. We effectively binarized more than 90% of about two hundred samples. The second implementation of our general Algorithm 1 included a classification of the images (six classes from very bright to very dark), a different function r = r(s_1, s_2) for each class and a new criterion cr_2. This showed the adaptability of our method since it allowed us to efficiently binarize all of our previous inferior (according to us) binarizations without damaging the rest of our results.
Then, in [20,21], we took the above method to a local level. The transition was immediate. Let's say we have a pixel p and its m × n neighborhood N (as a fuzzy set). We set We proceeded similarly to the second implementation of Algorithm 1 using eight groups according to the values of s_1 and s_2. These groups were: Our general algorithm was Algorithm 2. During our first implementation of this algorithm on degraded or unevenly illuminated text documents, we used some standard r for each of our eight classes and we obtained some very satisfying results. Then, we extended our experimentation to other categories of images as well. While we again had some very satisfactory results, at some point, we had two binarizations which we considered to be inadequate. They were not unacceptable, but we thought we could accomplish even better binarizations. Thus, we proceeded to a second implementation of Algorithm 2 using a varying r for each group.

Algorithm 2: (Local)
• Step 1 Take the m × n neighborhood of pixel p and create fuzzy set N of m · n elements by dividing the gray intensities by 255,
• Step 2 Compute s_1, s_2, s_3, s_4 and e,
• Step 3 Classify set N and compute r according to the group it belongs to,
• Step 4 Create the fuzzy symmetric triangular number t = (r − c, r, r + c) where c = |s_3 − s_4|,
• Step 5 Use e as the truth value and compute t_1 = c(e − 1) + r and t_2 = c(1 − e) + r,
• Step 6 Based on a criterion, set either t = t_1 or t = t_2,
• Step 7 Binarize pixel p with threshold t.
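Steps 4 and 5 of Algorithm 2 reduce to simple arithmetic once r, s_3, s_4 and e are known, and can be sketched as below. The function name and the sample values are hypothetical; the computation of r (Step 3) and the selection criterion (Step 6) are deliberately left out, since they depend on the chosen implementation.

```python
def candidate_thresholds(r, s3, s4, e):
    """Steps 4-5: build the triangular number around r and cut it at truth value e."""
    c = abs(s3 - s4)                # half-width of the triangular number
    triangular = (r - c, r, r + c)  # fuzzy symmetric triangular number t
    t1 = c * (e - 1) + r            # left candidate threshold
    t2 = c * (1 - e) + r            # right candidate threshold
    return triangular, t1, t2


# Hypothetical measure values for one neighborhood.
triangular, t1, t2 = candidate_thresholds(r=0.5, s3=0.7, s4=0.6, e=0.8)
```

The two candidates are symmetric about r, and they coincide with r when e = 1 or when s_3 = s_4. In other words, t_1 and t_2 are the end points of the cut of the triangular number at level e.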
This way, we avoided user provided sensitivity or bias parameters and we didn't lose the automatic character of the whole process. In other words, we included some kind of sensitivity during the calculation of r, allowing measures s_1 and s_2 to provide further information during the procedure. This second implementation resulted in the "correction" of our previous insufficient binarizations without substantially harming our previous adequate results. Many of them were in fact slightly improved. This was another proof of the adaptability of our general method and of the advantages that something like this can offer.
After our research in [2,20,21], we were convinced of the connection between threshold t and s_1, s_2, s_3, s_4, e. Of course, this relation is not some kind of simple or linear mathematical function, so it was only logical to try establishing it in a more automatic manner. The use of an ANFIS seemed very appropriate and we directed our efforts towards this goal. The main problem during this process was certainly the lack of specialized databases of the form "Image im - Proper threshold t", so we had, once again, to construct our own data. This demanded a lot of trials and experimentation. In the next sections, we will see and discuss the results of these attempts.

Why ANFIS
ANFIS was introduced by Jang in 1993 [94] and constitutes a method of constructing Fuzzy Inference Systems (FIS) based on the available data. It has the structure of an artificial neural network and combines the advantages of both ANN and fuzzy logic systems. The ANFIS model shares both numerical and linguistic knowledge. It has been used in numerous applications and presentations in various fields (like mechanics, physics, economics, biology, industry and others) (e.g., [95][96][97][98][99][100][101][102][103][104][105], to name just a few recent ones). It has also been studied by other authors or compared with other neural networks (e.g., [103,106,107]). Its main advantage lies in the fact that it automatically adjusts the membership functions to our available data in order to achieve the best performance of our system. This is accomplished by using a hybrid learning model which includes gradient descent and Least Square Estimators (LSE). This model has proved very effective in the training of such systems.
The ANFIS algorithm is also included in many software packages (like MATLAB, which we use here) and is a very quick and effective way of evaluating the quality of the data and their capability of interpreting some actual process or phenomenon. ANFIS benefits also include its adaptation capability, nonlinear ability, and rapid learning capacity. Thus, based on the conclusions of our previous work as well as on the way our general algorithms were constructed, we thought it was fitting to turn to some fuzzy neural network in order to proceed to binarizations in an automatic manner.
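To make the structure that ANFIS tunes more concrete, here is a rough sketch of the forward pass of a small Sugeno-type fuzzy inference system with triangular membership functions and the product as fuzzy intersection. It is not the MATLAB implementation and it performs no training: every membership and consequent parameter below is a hypothetical placeholder that hybrid learning would normally fit, and only two inputs are used for brevity instead of five.

```python
from itertools import product


def triangular(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def sugeno_forward(inputs, mfs, consequents):
    """One forward pass: fuzzify, fire rules, normalize, combine linear consequents."""
    # Layer 1: membership grade of every input in each of its fuzzy sets.
    grades = [[triangular(x, *p) for p in params] for x, params in zip(inputs, mfs)]
    # Layer 2: one rule per combination of fuzzy sets, fired with the product t-norm.
    strengths = []
    for combo in product(*grades):
        w = 1.0
        for g in combo:
            w *= g
        strengths.append(w)
    total = sum(strengths)
    if total == 0:
        raise ValueError("no rule fires for these inputs")
    # Layers 3-5: normalized, weighted sum of the rule outputs k . x + bias.
    outputs = [sum(k * x for k, x in zip(coef[:-1], inputs)) + coef[-1]
               for coef in consequents]
    return sum(w * f for w, f in zip(strengths, outputs)) / total


# Hypothetical system: two inputs in [0, 1], two fuzzy sets ("low", "high") each.
mfs = [[(-1.0, 0.0, 1.0), (0.0, 1.0, 2.0)]] * 2
# One consequent per rule: (coefficient for input 1, coefficient for input 2, bias).
consequents = [(0, 0, 0.2), (0, 0, 0.4), (0, 0, 0.6), (0, 0, 0.8)]

t = sugeno_forward([0.25, 0.5], mfs, consequents)
```

During training, gradient descent would adjust the membership parameters in `mfs` while least squares would fit the entries of `consequents`; this sketch only shows how a fitted system maps measure values to a threshold.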

Data Construction and Initial Evaluation
In order to construct a proper database, we measured s_1, s_2, s_3, s_4, e (as described above) for a large set of images. We ended up with 252 images of various characteristics. These include famous paintings, random photos, noisy images, face images, comic pictures, object (one or many) images, landscapes and others. We didn't focus on any specific domain since we mainly wanted to evaluate the proper transfer of information from our measures to a neural network. Our sources were the internet, articles of similar interest and the Caltech collections [108,109]. Some of these inputs constitute a brighter or a darker version of the same image, since we wanted to create a set that covers all proper ranges (regarding the values of s_1 and s_2) based on the eight categories we saw in Section 2.3. We have to make clear that we don't own most of these samples and that we used them because they were available for downloading and free of watermarks. Their copyright ownership is rather vague. Some of these have been used several times in similar research and others are uploaded on many sites without any further specific copyright details. Regardless, we downloaded and used these samples for research purposes only, so we hope that no one will object to this. However, if the owner of any particular image has any objections, we apologize in advance and we pledge to ask for their permission or to remove this specific sample.
In order to proceed to the training of our ANFIS, we needed of course a target t for each sample. Since there is no publicly free, specialized database of the form "Image im - Proper threshold t", we decided to initially set Otsu's threshold as our t. This would help us to evaluate the potential of the whole process and to see if there is any meaning in continuing with it. Of course, based on our results in [2,20,21], we believed that most probably we would obtain some pretty good results. Thus, we trained an ANFIS with 252 vectors of the form (s_1^v, s_2^v, s_3^v, s_4^v, e^v, t^v), v = 1(1)252. These can be seen using the following link: https://www.dropbox.com/s/tc297yl6crljgvc/GANFIS1Data.pdf?dl=0 and we will refer to them as data matrix 1. Samples are sorted according to s_1 − s_2 and one can see that almost half of our samples have s_1 < s_2 and the others s_1 > s_2. In other words, we made a substantial effort to have a good balance in our data. The smallest error occurred when we used triangular, bell-shaped and Gaussian membership functions and it was about 0.02 (which is rather small, about five intensity values for grayscale images, since 0.02 · 255 ≈ 5). We should also mention that we tried to use as input variables various combinations of s_1, s_2, s_3, s_4 and e (e.g., only s_1; s_1 and s_2; all of the s_i; s_1, s_2 and e; etc.). The best training was accomplished by using the maximum data, that is to say all of s_1, s_2, s_3, s_4 and e, and it seems that each of them carries significant information concerning the attributes of the image. As fuzzy union and intersection, we used the default functions probor and product respectively since they returned the best results. From now on, all our results are obtained from systems trained with probor, product and triangular membership functions. Thus, our first conclusion was that our ANFIS is trained very satisfactorily using these data. This derives not only from the small error but from the visual results as well. Of course, the latter cannot be displayed for obvious reasons. This was another proof that our measures are logical and applicable and that they can be related to thresholding very sufficiently. After this trial, we moved to the practical evaluation of the training of our ANFIS by separating a testing set of 50 images. These samples were picked randomly, but we made sure that all eight categories (from very dark to very bright) were properly represented. These images, like those of our training set, are of various sizes and characteristics and they are the samples listed in Table 1. We trained our ANFIS using the remaining 202 samples. We used triangular membership functions and our error was again around 0.02 (slightly smaller than before). Then, we used our trained ANFIS to binarize our testing set. Our results ranged from similar to indistinguishable compared to those of Otsu's algorithm. Thus, based on these observations, we were convinced that fuzzy inclusion and entropy measures can effectively be used as inputs of the fuzzy inference system.

G(lobal) ANFIS 1-Adjusting Otsu's Targets
Of course, our target was not just the reproduction of an existing global method via an ANFIS. The above procedure was carried out in order to examine the possibility of using our measures as inputs of the inference system. At the same time, it is important that we managed to approach a global method in this specific way because someone could interfere with the targets of the available data in order to "aid" this particular method in some "difficult" cases. Global algorithms which are free of sensitivity or bias parameters produce a specific result which cannot be further changed (unless, of course, someone proceeds to a modification or alteration of the whole methodology). Thus, we have to accept this result or turn to some more complex method (local or hybrid). On the other hand, if we use an inference system, we can experiment with the targets of our data set and modify specific results without having to mathematically redesign our whole algorithm. Certainly, the main difficulty of this process lies in the target setting, but this could derive from complicated analysis, someone's experimentations or even simple observations. Let's see an example of this.
During our data collection and the construction of our database, we observed some "imbalances" among Otsu's thresholds, which could, to a certain degree, be responsible for our training error. What we mean is that some samples with relatively close values of s_i and e had a rather large difference regarding their threshold obtained from Otsu's method. For instance, let's observe in Figure 1 the two images which correspond to lines 11 and 12 of our data matrix 1. In the matrix, we can see that their values (regarding s_i and e) are very similar, whereas their thresholds differ significantly (0. Thus, in this sense, we felt that the threshold of sample 12 is rather high. If we set it closer to that of sample 11, let's say by cutting their difference in half, we can see what happens in Figure 2. We could say that, apart from the fact that the thresholds of the two samples come closer, the binarization of sample 12, using this lower value, is improved. We made similar observations in other cases as well. For example, samples 39 and 41 (Figure 3) also have similar fuzzy inclusion and entropy values, but their thresholds are 0.26 and 0.48, respectively. We can see what happens if we lower the target of sample 41 (as described above) in Figure 4. By bringing the thresholding targets closer, we can see that we have obtained some further details of the swan. Thus, based on these kinds of observations as well as on some visual experimentations, we readjusted the threshold-targets of our set and we trained our G(lobal) ANFIS 1. These new targets can be seen in the last column of our data matrix 1. Initially, we trained an ANFIS using all samples (1-252) to observe possible variation in our training error. Indeed, our error was reduced to almost half (around 0.01). Then, we trained G ANFIS 1 using the same training set of 202 inputs and our new targets. Again, the error remained around 0.01 and in Figure 5 we can see the differences in the binarizations of some samples of our testing set by G ANFIS 1 and Otsu's algorithm. We should also mention that, due to samples of "extreme" illumination (sample 220 is a characteristic case), someone could add some "safety conditions", e.g., we could set t always larger than 0.04 or 0.05 and lower than 0.95 or 0.96. This way, we can avoid the risk of an unacceptable binarization (like a completely black or white image) in case of such an "extreme" input. As we can see, G ANFIS 1 helped us (slightly or significantly) with the thresholding of some dark images like samples 2, 5, 11, 14, some medium-lighted images like samples 41, 65, 89 and some bright images like samples 204, 220, 231. At the same time, we didn't substantially harm the other results. The complete results of our testing set can be seen using the following link: https://www.dropbox.com/s/1qugs1l38hrg1up/GANFIS1Binarizations.pdf?dl=0. The binarizations of our training samples remained almost the same as those obtained before the target readjustment, apart from a few samples (about 8 out of 202) where some more noticeable differences can be spotted. Regardless, we didn't come up with any totally unacceptable binarizations and we could say that G ANFIS 1 responded sufficiently (considering its purpose).
Though we did not alter many of our testing set binarizations, the above is just an indicative example of how someone could intervene in the database and alter specific results of a method without having to modify the algorithm itself. This was done based on simple observations, and we are not vision experts. We are sure that someone with more specialized knowledge could perform this much more efficiently and produce more distinguishable differences.
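The two adjustments discussed above (moving a target halfway toward that of a similar sample, and bounding the output threshold with "safety conditions") can be sketched as follows. This is only an illustrative sketch: the function names `halve_gap` and `clamp_threshold` are hypothetical, the bounds 0.05 and 0.95 are one of the options mentioned in the text, and the readjusted targets actually used for G ANFIS 1 were also based on visual experimentation, so they need not coincide with the values this rule produces.

```python
def halve_gap(target, reference):
    """Move `target` halfway toward `reference` (hypothetical helper).

    Mirrors the "cut half their difference" idea: the new target lies
    at the midpoint between the two original thresholds.
    """
    return target - (target - reference) / 2


def clamp_threshold(t, low=0.05, high=0.95):
    """Keep a predicted threshold inside [low, high] (assumed bounds).

    Guards against completely black or completely white binarizations
    for inputs of "extreme" illumination.
    """
    return min(max(t, low), high)


# Samples 39 and 41 have Otsu thresholds 0.26 and 0.48; halving the gap
# would move the target of sample 41 to about 0.37.
new_target_41 = halve_gap(0.48, 0.26)

# A predicted threshold outside the safe range is pulled back inside it.
safe_t = clamp_threshold(0.01)
```

The clamp is applied to the output of the inference system only, so it does not change the training data or the training error.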
Next, we present part of our research on public databases related to image segmentation. This gives us the opportunity to set some more specialized targets and to obtain numerical results in addition to the visual observations.

Public Databases Which We Used
Two distinguished online databases, which concern image segmentation and were suggested to us, are the "Berkeley Segmentation Data Set" [110] and the "Stanford Background Data Set" [111]. They are available to everyone and they offer a significant number of samples for algorithm testing. In fact, the Berkeley set 500 even has three separate sets of images (training, validation and testing), which seem very appropriate for our experiments. However, the problem is that the ground truth of each sample is a set of several possible separations of the objects of the image according to human perception. This is far from the single-value ground truth we need, and Berkeley's research team applies complex algorithms (including contour plots and combinations of global and local thresholds) in order to identify the borders of the objects. In other words, the team is mainly focused on edge detection, and the human annotations do not contain the "filled" objects that occur when someone applies a single global threshold to the whole image. The database also provides some ground truth files in "Microsoft access table shortcut" form, which we could not handle easily. Another problem with the samples of this database is that almost all of them are "medium-lighted" (they belong to groups 3, 4, 7 and 8 according to the classification of Section 2.3, and just a few belong to group 2), so we cannot cover the whole required range (in terms of the difference $s_1 - s_2$).
The Stanford database shares the same features (its goal is very different from ours and the range is even narrower), but it accompanies each sample with some matrices in "txt" form (regarding the layers, regions and surfaces of the image) which can help in constructing a single-value target. More specifically, the "regions" file contains a chromatic separation of the different objects of the image, which helped us produce some possible binarizations of the sample including "filled" (black) objects. Thus, we decided to train our ANFIS with data obtained from the Stanford database and test it on the samples (of similar lighting) of the Berkeley database. We believed that this could give our process some extra impartiality.

Data Set Construction of G(lobal) ANFIS 2
Let $A$ be our image (from the Stanford collection). Based on the "regions" matrix of $A$, we built two possible binarizations $SB_1$ and $SB_2$ of the image (black objects, with no characteristics of the objects visible) and we assigned to each sample two possible thresholds $t_1$ and $t_2$ based on the Jaccard similarity index between two $m \times n$ matrices $A = (a_{ij})$ and $B = (b_{ij})$. Its formula is: Thresholds $t_1$ and $t_2$ were chosen in order to maximize $J_g(A, B_1)$ and $J_g(A, B_2)$, respectively. As the final target of image $A$, we set the average of these two thresholds. Thus, we obtained 715 vectors of the form (s 1)715. Since some targets seemed "awkward" and we believed that they were not appropriate for our inference system, we filtered these data (there were a lot of similar images in terms of fuzzy inclusion and entropy values) and we ended up with a total of 511 vectors. These can be seen using the following link: https://www.dropbox.com/s/ld5xiz20tho4z7p/GANFIS2Data.pdf?dl=0. After this, we trained our G(lobal) ANFIS 2 with the filtered database and we proceeded to its testing on the Berkeley dataset.
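The target construction described above can be sketched roughly as follows. Several points here are assumptions rather than the procedure actually used: the similarity is taken to be a generalized Jaccard index (sum of element-wise minima over sum of element-wise maxima), which may differ from the $J_g$ defined in the paper; each candidate threshold is scored by binarizing the image and comparing the result with a reference binarization; and all function names and the candidate grid are hypothetical.

```python
def jaccard(a, b):
    """Generalized Jaccard index of two equally sized matrices (assumed form).

    For 0/1 matrices this is |intersection| / |union| of the white pixels.
    """
    num = den = 0.0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            num += min(x, y)
            den += max(x, y)
    return 1.0 if den == 0 else num / den


def binarize(image, t):
    """Pixels brighter than t become white (1), the rest black (0)."""
    return [[1 if p > t else 0 for p in row] for row in image]


def best_threshold(image, reference, candidates):
    """Candidate whose binarization is most similar to `reference`."""
    return max(candidates, key=lambda t: jaccard(binarize(image, t), reference))


def target_threshold(image, ref1, ref2, candidates):
    """Average of the two best thresholds, used as the training target."""
    t1 = best_threshold(image, ref1, candidates)
    t2 = best_threshold(image, ref2, candidates)
    return (t1 + t2) / 2


# Toy 2 x 2 grayscale image (values in [0, 1]) and two reference binarizations.
image = [[0.1, 0.4], [0.6, 0.9]]
sb1 = [[0, 0], [1, 1]]
sb2 = [[0, 1], [1, 1]]
target = target_threshold(image, sb1, sb2, [0.2, 0.5, 0.8])
```

In this toy case the best thresholds are 0.5 for the first reference and 0.2 for the second, so the target is their average, 0.35.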

Testing of G ANFIS 2
For our testing, we used the BSD300 database (an older version of the latest BSD500 database), which contains 200 training images and 100 testing images. We used all 300 samples as our testing set, which we will refer to as BTS. We binarized these images with Otsu's method and G ANFIS 2 and we evaluated the results based on the human segmentation of each image (we used the one with the most segments), which looks like the example in Figure 6. Since, as we mentioned earlier, the range of our training set was not adequate, some samples of BTS were out of the range of G ANFIS 2. In addition, there were some images of BTS which were not represented in the training set (regarding the values of $s_i$ and $e$). Thus, we had to eliminate some images of BTS (the results were black or nearly black and white or nearly white binarizations) and conduct our measurements on the remaining images. We excluded 59 samples (out of the 300) and, for each of the remaining images, we measured the similarity of the binarized image with its corresponding human segmentation using two measures. The first was Jaccard's index $J_g$ and the other was the structural similarity index (SSI) of MATLAB. Let us call $A$ our original sample, $GT$ its binary ground truth, $B_1$ its binarization with Otsu's method and $B_2$ its binarization with G ANFIS 2. Out of the 241 samples, we had:
• 211 cases where $J_g(GT, B_2) > J_g(GT, B_1)$ (the difference $J_g(GT, B_2) - J_g(GT, B_1)$ varied from 0.76 to 0.01).
• 24 cases where $J_g(GT, B_2) < J_g(GT, B_1)$ (the difference $J_g(GT, B_1) - J_g(GT, B_2)$ varied from 0.17 to 0.01).
• 6 cases where the difference of $J_g(GT, B_1)$ and $J_g(GT, B_2)$ was zero.
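The case counts above come from a per-image comparison of two similarity scores. A minimal sketch of such a tally is given below; the function name `tally` and the sample scores are purely illustrative (they are not values from Table 2), and the rounding to two decimals follows the convention stated for the measurements.

```python
def tally(otsu_scores, anfis_scores, ndigits=2):
    """Count images where G ANFIS 2 scores higher, lower or equal to Otsu.

    Scores are rounded to `ndigits` decimals before being compared, so
    very small differences are counted as ties.
    """
    better = worse = equal = 0
    for j1, j2 in zip(otsu_scores, anfis_scores):
        diff = round(round(j2, ndigits) - round(j1, ndigits), ndigits)
        if diff > 0:
            better += 1
        elif diff < 0:
            worse += 1
        else:
            equal += 1
    return better, worse, equal


# Hypothetical similarity scores for three images (ground truth vs. binarization).
otsu = [0.12, 0.40, 0.301]
anfis = [0.55, 0.31, 0.299]
counts = tally(otsu, anfis)  # (better, worse, equal)
```

The same tally applies unchanged to the SSI values, since only the pair of scores per image is needed.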
All measurements were made on "tiff" formatted images and were rounded to two decimal places. Since there are many images, in Table 2 we present (in two columns) the first forty and the last two lines of the respective table. The full table (sorted by the difference $J_g(GT, B_2) - J_g(GT, B_1)$) can be seen using the following link: https://www.dropbox.com/s/4m52xo29dy00sva/JaccValuesandDifferences.pdf?dl=0. Again, in Table 3, we present (in two columns) the first forty and the last two lines of the respective table. The full table (sorted by the difference $SSI(GT, B_2) - SSI(GT, B_1)$) can be seen using the following link: https://www.dropbox.com/s/okl9e126i9vyjxa/SSIValuesandDifferences.pdf?dl=0. Based on the values of Tables 2 and 3, we can deduce that the binarizations of G ANFIS 2 were generally closer to our ground truth image than those of Otsu's algorithm. In other words, by setting our targets in the way described above, we led our system to brighter and more "empty" binarizations in order to approximate the human border sketching. Of course, as mentioned before, Berkeley's research is completely different and far more complex than ours, and it is impossible to replicate these results using a single threshold (a local approach, which is part of our current research, would be more appropriate). Nevertheless, it is again indicative of the benefit someone could gain by setting the global targets on their own.
Besides the previous tables, we felt that it would be useful to present some visual results as well. Since there are many images, we chose some (from the union of the top 65 lines of Tables 2 and 3) which we believe depict this "approximation" more distinctly, and we present them in Figure 7. As far as our "bad" results are concerned, in Figure 8 we show the last lines (our "worst" binarizations according to the similarity indices) of Tables 2 and 3. As one can see, these do not differ significantly from the respective binarizations of Otsu's algorithm.

Summary and Some Further Remarks
To sum up, in the previous two sections, we saw how fuzzy inclusion and entropy measures could efficiently be used as inputs of an inference system and how someone could benefit from setting the thresholding targets on their own. This can be hard in certain cases, but, on the other hand, it can help us produce specific results without having to alter an existing method or construct a new mathematical algorithm. Moreover, it could be implemented in various ways, ranging from complicated analysis to simple observations. G ANFIS 1 and 2 were just some indicative evidence of how something like this could be done. Of course, in the case of G ANFIS 2, it would be impossible to reach the results of Berkeley's and Stanford's research teams using a single global threshold. Nevertheless, considering the circumstances, we think we managed to approach these results somewhat better than Otsu's algorithm or, at least, to obtain clearer binarizations in many cases. The use of these data sets was an interesting suggestion; however, their content is more related to our research on local and color segmentation. These and several other procedures form part of our current work and, soon, we will be able to share some of its results.
Moreover, during the whole process described in the previous sections, we saw once again why it is important to have a variety of fuzzy measures, which could lead to new ideas and alternative ways of solving specific problems. We mention this because our algorithms in [2,20,21] could not be implemented using just any fuzzy inclusion function, since most of them measure the inclusion of a fuzzy set $A$ into $\emptyset$ as zero. Under these circumstances, we would not be able to measure the "darkness" of an image, and using only $s_1$ would not be enough for the proper connection between our measures and the threshold $t$. The same applies here, since without the values of $s_2$ our ANFIS returned results inferior to those we saw earlier. Another inclusion measure $S$ that returns $S(A, \emptyset) \neq 0$ is Goguen's index with the following formula: Its respective entropy measure is: Of course, in this case, we have $S(A, \emptyset) = 1 - S(X, A)$. We experimented with measures (2), (3) and (9), and $S_1$ gave us the smallest error and the best results. This is the reason we used it in this presentation. Perhaps someone could achieve better results using some other measure according to their own needs and targets.
Our theoretical work on the measures we introduced in [2] is ongoing; however, their application to real problems and their possible connection with other research concerning such fuzzy functions could offer us a lot of valuable information regarding their potential as well as their behavior. Some procedures which often deal with fuzzy inclusion or entropy measurements concern feature selection, fuzzy classification, fuzzy controllers, fuzzy rules and similarity measures. There are several presentations of this kind (e.g., [112-119] are a few recent ones) and it would be extremely useful if a certain part of our work could be connected with (or even included in) some of these studies. This would be a significant aid to our research and would help us better comprehend the attributes and the performance of these measures.

Conclusions
In this paper, we continued our work in [2,20,21] and presented an alternative approach to the thresholding problem, using the ANFIS method to automatically connect our fuzzy inclusion and entropy measures with the wanted threshold of the image. As in all of our previous presentations, we avoided any kind of analysis of our input (depending on its histogram) as well as bias or sensitivity parameters. All we need is a simple transformation of the image into a fuzzy set along with some simple measurements and, once again, we can see why it is important to have a variety of fuzzy measures, since this could lead us to new ways of dealing with specific problems.
Of course, our research continues and, as we mentioned, we have already used this ANFIS approach on a local level as well. Certainly, in this case, the process of data collection is different, but again, our method has returned some very adequate results (at least for the kind of images we used in [20,21], which are mainly degraded or unevenly illuminated documents). These will be displayed in a future presentation along with some other results derived from our current experiments on several domains of images. At the same time, we also continue our theoretical research on our measures and we already have several ideas for combining them with other related works which deal with fuzzy inclusion and entropy or include processes connected with these concepts.

Figure 7. Original samples, their Otsu and G ANFIS 2 binarizations and their ground truth image.

Figure 8. Original testing samples, their binarizations with Otsu's method and G ANFIS 2 and their ground truth image.

Table 3. Structural similarity index measurements and their differences.