Salient Object Detection Based on Optimization of Feature Computation by Neutrosophic Set Theory

In recent saliency detection research, too many or too few image features are used in the algorithm, and the processing of saliency map details is not satisfactory, resulting in significant degradation of the salient object detection result. To overcome these deficiencies and achieve better object detection results, we propose a salient object detection method based on feature optimization by neutrosophic set (NS) theory in this paper. First, prior object knowledge is built using foreground and background models, which include pixel-wise and super-pixel cues. Simultaneously, the feature maps are selected and extracted for feature computation, allowing the object and background features of the image to be separated as much as possible. Second, the salient object is obtained by fusing the features decomposed by the low-rank matrix recovery model with the prior object knowledge. Finally, we present a novel mathematical description of NS theory for salient object detection, which reduces the uncertainty of the obtained saliency map and yields good saliency detection results. Extensive experiments on five public datasets demonstrate that the results are competitive and superior to previous state-of-the-art methods.


Introduction
Regarding an image, we are only interested in the salient regions, as they are the most fascinating and expressive regions [1]. Due to the uniqueness of these features and the absence of prior knowledge, saliency maps obtained through traditional theory-based algorithms [2][3][4][5][6][7][8][9][10][11][12][13][14] rely heavily on the inherent features of both the object and the background in the image. These attributes encompass color, brightness, texture, position, and contrast [2][3][4][5][6][7][8], all of which pose challenges in isolating the object's features. Hence, one might expect that the greater the number of features fused, the easier it is to identify the object. However, such fusion is often performed without taking into account the algorithm's structural merits and performance. Furthermore, in real-world application scenarios, an excess of fused low-level features may lead to deteriorated saliency detection. We posit two primary reasons for this: first, as the number of features increases, so does the likelihood of redundant information surpassing the algorithm's processing capacity, which can result in the salient object information being overshadowed by extraneous features. Second, with an abundance of features, non-object features may appear more salient, an issue akin to missing feature information. In this paper, we experimentally demonstrate the critical importance of selecting features that align with the algorithm's structure, yet differ sufficiently for subsequent target feature processing.
Furthermore, the effective organization, discrimination, and rectification of the features of the object and background, along with the accurate classification of feature pixels, require an analysis of their characteristics and patterns. This process involves uncovering the inherent connections between features and subsequently constructing an algorithmic model. Low-rank matrix decomposition methods [15,16], rooted in an understanding of low-level features like image color, brightness, texture, position, and contrast, are well-suited for extracting object features. These methods operate on the assumption that an image can be partitioned into a low-rank section containing highly redundant information (e.g., a visually consistent background region) and a sparse salient section (e.g., a unique foreground target region). Consequently, the feature matrix of the input image can be deconstructed into a low-rank redundant matrix corresponding to the background and a sparse matrix corresponding to the salient object. After the features are decomposed, those within the sparse matrix represent the most refined expression of the image's content, revealing its unique information. Nevertheless, it remains uncertain whether it encompasses the complete object information, or how to highlight the object information as prominently as possible. To address this issue, introducing prior knowledge of the object is a valuable approach. Adopting an iterative fusion approach to incorporate this prior information highlights the object information and completes the initial salient object detection. As shown in Figure 1, the initially obtained saliency map contains too much redundant and interfering information, which affects the results of saliency detection. Therefore, further optimization is necessary. Moreover, it is worth noting that the majority of saliency maps are now presented as hierarchical multi-valued images, where many region pixels have values within the range of [0, 1], and the distinctions
between these features are inherently fuzzy. This paper proposes that transforming the saliency optimization problem into a fuzzy processing problem is an important approach. The NS theory [17,18] offers significant advantages in handling fuzzy problems. It extends the scope of fuzzy theory, broadening its applications and enabling a more precise expression and processing of fuzzy and uncertain information. Therefore, we present a salient object detection algorithm based on feature computation optimization using neutrosophic set theory (SMNS). The following is a list of our major contributions.

1.
We show that the selection of features should be focused on their uniqueness rather than their quantity. Prior object knowledge is constructed using the foreground and background models, integrating pixel-wise and super-pixel cues. In this way, we effectively obtain the main information about the object.

2.
We provide a new mathematical definition of neutrosophic set theory, redefining the formulas for its three subsets: True (T), Indeterminacy (I), and False (F).

3.
A new fusion update method is applied to highlight object features and achieve salient object detection, significantly improving the saliency map's overall quality and accuracy.
The rest of the paper is structured as follows: Section 2 presents related works. Section 3 describes the proposed salient object detection approach in detail. Section 4 presents experimental results in comparison with other approaches, along with discussions. Section 5 concludes the paper.

Related Works
In most traditional theory-based saliency detection algorithms, low-level features, spatial information, and central prior knowledge are employed to generate the object information. In 1998, Itti et al. [19] proposed a center-surround contrast model, which detects salient objects using color, intensity, and orientation features. Huo et al. [20] exploited the link between several color attributes and their spatial information to describe an image's global information. Borji and Itti [2] employed the sparsity of local and global image blocks as two complementary processes to design saliency models. Goferman et al. [3] assumed that salient objects have a compact spatial structure and detected salient objects based on the difference information of the image's spatial structure. In [13], a salient object detection method based on foreground, center, and background is proposed that employs regional color as the foreground cue and highlights salient objects by integrating perceptible uniform color differences within the region. According to [14], the saliency map of an image can be produced via a global estimate of linear color combinations in a high-dimensional color space, and the performance of saliency detection is then further improved by additional local features and learning-based algorithms. Other works [4,5] assume that the salient object is generally framed toward the center of the image and construct the saliency map using central prior knowledge. However, salient objects sometimes exist outside the image center in many images, so a central prior map improperly suppresses salient areas far from the image center while highlighting some background regions near the center.
In recent years, image boundaries have been routinely used as background prior information for salient object detection [5][6][7][8][9][10][11][12]. The saliency estimation problem is transformed into a rank-and-retrieve task in [10], with image boundary information taken as background prior knowledge. In [6], the difference between the object and the background is assumed to be on the shortest path to the image boundary. The separation strategy for the object and background is based on the very short distance between the background region and the boundary and the relatively long distance to the object. Zhu et al. [8] proposed the concept of background connectivity, which determines whether a region belongs to the background by evaluating the extent of connectedness between the region and the boundary. Yuan et al. introduced a saliency regression correction approach in [12], which improves the accuracy and robustness of saliency estimates based on prior boundary knowledge by correcting and deleting foreground super-pixels near the boundary. While edge priors can improve the accuracy of salient object detection, they still face many challenges, such as the processing of saliency map details not being satisfactory under low-level features. Therefore, we drew inspiration from related algorithms in the fields of deep learning [21,22], image segmentation [23], and image registration [24] to conduct research on salient object detection based on traditional theories.

Proposed Algorithm
In this section, we provide a detailed overview of the proposed algorithm, illustrated in Figure 1. The proposed algorithm comprises two main parts. The first part involves constructing the detection algorithm for salient objects. Initially, we establish prior object knowledge using foreground and background models that incorporate pixel-wise and super-pixel cues. Subsequently, we select and extract feature maps for computation. This step aims to effectively differentiate the object features from the background features in the image. Ultimately, we fuse the features decomposed by the low-rank matrix recovery model with the prior object knowledge to accomplish salient object detection. The second part is to optimize the saliency map. The results of saliency detection preserve the mutual uncertainty between object and background information in the image. As a result, saliency optimization becomes a fuzzy processing issue. Given that NS theory has demonstrated advantages in addressing fuzzy problems, it proves to be well-suited for saliency optimization. Therefore, we propose a novel mathematical description of the NS theory to optimize the saliency map and improve object detection accuracy.

Feature Extraction and Selection
In salient object detection algorithms, the low-level features of an image, or depth features based on these inherent features, are critical. Furthermore, feature selection is also important. The features chosen for this paper are described as follows:

• Color features: Because the L*a*b* color space is based on human perception of color and includes all perceptible colors, its color gamut exceeds that of the red, green, and blue (RGB) color space, making it appropriate for image processing. Its three components, together with the color volume, serve as four color features; the color volume is defined in [13].
• Image local information entropy: Image entropy is a statistical feature that represents the average amount of information in an image. The local entropy of the image is defined according to the gray distribution of the pixels, which reflects the richness of the image information. The greater the entropy, the clearer the image details [25,26].

• Guided image filtering [27]: As the name implies, a guided filter requires a guide image, and different guide images can be selected depending on their different functions. We utilize it to extract edge features in this study, so its input is the image itself.
• Gradient [28]: Gradient images capture directional variations in grayscale intensity or color. Mathematically, the gradient is a vector indicating the direction along which the directional derivative of a function takes its largest value. The image can thus be viewed as a two-dimensional discrete function, and its grayscale gradient is the derivative of this function, with differences used in place of differentials. Different features are then obtained by setting the gradient operator.
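As a rough illustration of the entropy and gradient cues above, the following sketch computes a local-entropy map and a gradient-magnitude map with NumPy; the window size, bin count, and test image are illustrative choices, not values taken from the paper.

```python
import numpy as np

def local_entropy(gray, win=9, bins=16):
    """Local Shannon entropy over a sliding window.

    gray: 2-D float array in [0, 1]; win and bins are illustrative choices."""
    q = np.clip((gray * bins).astype(int), 0, bins - 1)
    h, w = gray.shape
    r = win // 2
    pad = np.pad(q, r, mode="edge")
    out = np.zeros_like(gray)
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + win, j:j + win].ravel()
            p = np.bincount(patch, minlength=bins) / patch.size
            p = p[p > 0]
            out[i, j] = -(p * np.log2(p)).sum()  # entropy in bits
    return out

def gradient_magnitude(gray):
    """Grayscale gradient magnitude via finite differences."""
    gy, gx = np.gradient(gray)
    return np.hypot(gx, gy)

rng = np.random.default_rng(0)
img = rng.random((32, 32))
ent = local_entropy(img)
grad = gradient_magnitude(img)
```

On real images, the window size trades locality against the stability of the histogram estimate.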
As previously stated, insufficient feature information prevents the object from being entirely highlighted, degrading the saliency map and causing over-segmentation in the subsequent segmentation process. Conversely, when the feature information exceeds the algorithm's processing capability, more worthless information appears in the results, lowering the saliency map's quality, and under-segmentation will occur. Moreover, with an excess of features, some non-object features may be highlighted, producing the same problem as having too few features.
In [14], 75-dimensional and 38-dimensional feature vectors were extracted from images for subsequent saliency map estimation. The experimental results shown in Figure 2 demonstrate that having more features is not always optimal for detecting salient objects. In Table 1, the feature combinations are designated 1*, 2*, and 3*. The optimal detection effect can be obtained only if the feature set is appropriate for the proposed algorithm. The performance of our algorithm for the three sets of features is shown in Figure 3. These results demonstrate that the selection of features should be based not on quantity but on the rational combination of features and an analysis of the algorithm structure, which can remove redundancy and extract useful information. Therefore, feature combination 2* is employed as the input for subsequent feature computation in this paper, namely the feature maps illustrated in Figure 1.
Table 1. Three sets of image feature combinations.

The Prior Object Knowledge
To obtain the best object saliency map, prior knowledge of the object, which provides the most significant possible help, is indispensable. In this paper, prior knowledge based on the foreground and background models, incorporating pixel-wise and super-pixel-wise cues, is built. In [8], a background model with good robustness and accuracy is provided, and its structure is expressed as follows: where Ctr(p_i) and w_i^bg represent the contrast of the ith super-pixel and its background probability, respectively, and p_i is the ith super-pixel in the image. Ctr(p_i) is the sum of the contrasts of the ith super-pixel with the other super-pixels, and the probability w_i^bg is the similarity between the ith super-pixel p_i and the background.
The SLIC method is employed for image super-pixel preprocessing in this work, since it has fewer parameters and faster performance than other super-pixel techniques. However, a background model based on the SLIC method with erroneous super-pixel boundaries may lose some object information. To address this shortcoming of the background model and obtain complete object information, we therefore propose a pixel-based foreground model, built as follows: first, the input image is transformed into the local entropy map G1 [31], the guided image filtering map G2 [27], and the grayscale image G3. The salient feature map G2′ is then obtained from G2 by Equation (3) in [32].
where G(u) is the mean pixel value of G2. Finally, G1, G2′, and G3 are weighted and superimposed, and the object is highlighted by the sigmoid function to obtain a saliency map.
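The weighted superposition and sigmoid highlighting described above can be sketched as follows; the weights, the sigmoid parameters, and the mean-based enhancement standing in for Equation (3) of [32] are all hypothetical choices.

```python
import numpy as np

def sigmoid(x, k=10.0, c=0.5):
    # Steepness k and midpoint c are illustrative choices.
    return 1.0 / (1.0 + np.exp(-k * (x - c)))

def foreground_model(G1, G2, G3, w=(0.4, 0.4, 0.2)):
    """Weighted superposition of entropy map G1, filtered map G2, and
    grayscale G3, followed by a sigmoid to highlight the object.
    The weights w are hypothetical; the paper does not list them here."""
    # Stand-in for Equation (3) of [32]: keep pixels above the mean of G2.
    G2s = np.clip(G2 - G2.mean(), 0, None)
    G2s = G2s / (G2s.max() + 1e-12)
    fused = w[0] * G1 + w[1] * G2s + w[2] * G3
    return sigmoid(fused / (fused.max() + 1e-12))

rng = np.random.default_rng(1)
G1, G2, G3 = rng.random((3, 24, 24))
Fg = foreground_model(G1, G2, G3)
```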
The foreground model is defined as follows: Here, the foreground model mainly enriches the details of the image. It also compensates for a deficiency of the background model: because the background model considers pixel regions connected to the boundary as background, an object touching the boundary is wrongly regarded as background. Therefore, we obtain relatively complete prior object knowledge by combining the foreground model and the background model. The specific fusion steps of the two models are as follows: first, the common features of the foreground and background models are generated using the intersection operation, whose mathematical expression is as follows: where Fg and Bg denote the foreground model in Equation (4) and the background model in Equation (2), respectively, and ϑ1 is a normalization coefficient. The difference method is then applied to obtain the different features of the foreground and background models. It is defined as follows: where we set the difference coefficients ϑ2 = 1.2 and ϑ3 = 0.8, and ϑ4 is a normalization coefficient.
Then, to further optimize the features of the salient object, the correlation matrix C(Fg, Bg) is obtained by calculating the correlation between the foreground and background models over a 9×9 region. The common features and different features are combined and multiplied by the correlation matrix to obtain the salient features of the object P_s.
where ρ(Fg, Bg) is the Spearman correlation coefficient, which measures the correlation between the foreground and background models [33]. Finally, the prior object knowledge is obtained by the following definition: The prior object knowledge P_s of the image provides the object feature information for the feature computation in the next section.
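A minimal sketch of this prior-knowledge fusion, assuming an elementwise product as one realization of the "intersection" of the two models and using the stated coefficients ϑ2 = 1.2, ϑ3 = 0.8 and 9×9 correlation windows; the max-normalization steps stand in for ϑ1 and ϑ4.

```python
import numpy as np

def spearman(a, b):
    """Spearman correlation as Pearson correlation of ranks
    (ties handled crudely; adequate for a sketch)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean(); rb -= rb.mean()
    denom = np.sqrt((ra**2).sum() * (rb**2).sum())
    return (ra * rb).sum() / denom if denom > 0 else 0.0

def prior_knowledge(Fg, Bg, th2=1.2, th3=0.8, win=9):
    """Common features, difference features, and a local Spearman
    correlation map, combined into the prior P_s."""
    common = Fg * Bg
    common /= common.max() + 1e-12          # stands in for ϑ1
    diff = np.clip(th2 * Fg - th3 * Bg, 0, None)
    diff /= diff.max() + 1e-12              # stands in for ϑ4
    r = win // 2
    pf = np.pad(Fg, r, mode="edge")
    pb = np.pad(Bg, r, mode="edge")
    C = np.zeros_like(Fg)
    for i in range(Fg.shape[0]):
        for j in range(Fg.shape[1]):
            C[i, j] = spearman(pf[i:i + win, j:j + win].ravel(),
                               pb[i:i + win, j:j + win].ravel())
    Ps = (common + diff) * np.abs(C)
    return Ps / (Ps.max() + 1e-12)

rng = np.random.default_rng(4)
Fg = rng.random((20, 20)); Bg = rng.random((20, 20))
Ps = prior_knowledge(Fg, Bg)
```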

Feature Computation
The foundation for salient object detection is effectively organizing, discriminating, and correcting the features of the object and background to produce precise feature separation. As a result, feature separation techniques are critical, and we find that the low-rank matrix decomposition approach is well suited for feature separation under the assumption that images can be separated into low-rank features and sparse features. Then, we can achieve feature separation using the low-rank matrix L and the sparse matrix S of the image. Therefore, in this paper, the low-rank matrix decomposition algorithm described in [16] is employed for feature computation:

min_{L,S} Ψ(L) + α Ω(S) + β Θ(L, S), s.t. F = L + S, (8)

where F is the feature matrix of the input image, Ψ(L) is a constraint on the low-rank feature space, Ω(S) indicates the l1-norm, Θ(L, S) is a regularization term on L and S such that the separation of their two features is possible, and α and β are positive trade-off parameters.
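A simplified stand-in for the decomposition in Equation (8): plain alternation between singular-value thresholding for L and soft-thresholding for S, without the structured regularizer Θ(L, S) of [16]; the thresholds and iteration count are illustrative choices.

```python
import numpy as np

def soft(x, lam):
    """Soft-thresholding, the proximal operator of the l1-norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def lowrank_sparse(F, lam=0.1, tau=1.0, iters=50):
    """RPCA-style alternation: shrink singular values of F - S to get
    the low-rank part L, then soft-threshold F - L to get the sparse S."""
    L = np.zeros_like(F)
    S = np.zeros_like(F)
    for _ in range(iters):
        U, sv, Vt = np.linalg.svd(F - S, full_matrices=False)
        L = (U * soft(sv, tau)) @ Vt      # singular-value thresholding
        S = soft(F - L, lam)
    return L, S

rng = np.random.default_rng(5)
base = rng.standard_normal((30, 4)) @ rng.standard_normal((4, 30))  # low-rank
spikes = np.zeros((30, 30))
idx = rng.integers(0, 30, size=(20, 2))
spikes[idx[:, 0], idx[:, 1]] = 5.0                                  # sparse
F = base + spikes
L, S = lowrank_sparse(F)
```

By construction of the final soft-thresholding step, the residual F − L − S is bounded elementwise by lam.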
In terms of low-rank matrix decomposition, the most significant difference from [16] is that we remove the multilevel tree structure and change it to a single-layer structure, which accelerates the algorithm. Figure 4 shows that the performance of the low-rank matrix decomposition with a tree structure in [16] and our single-layer low-rank matrix decomposition are equivalent. However, as the complex tree structure is removed in this paper, the complexity of the algorithm is effectively reduced.
By performing the low-rank matrix decomposition of Equation (8), we obtain a sparse matrix S including the salient object information and a low-rank matrix L containing the background portion.

The Rule of Features Fusion
When the decomposition of the features is completed, sparse matrix S contains the most refined feature expression of the image content, representing the unique information in an image. However, it may not contain the desired object information. To solve this problem, the prior object knowledge P_s provides object feature information for matrix S, which can highlight the object features. Therefore, to further separate the object and background features, the sparse matrix S, the low-rank matrix L, and the prior object knowledge P_s are combined to reduce the difference between S and P_s, and the effective object features are extracted. We define the feature fusion update formula of Equation (9), where I_1 is the identity matrix, W is the weight matrix with entries exp(−w_{i,j}), with w_{i,j} = ||x_i − x_j||^2 the weight between super-pixels, and x_i and x_j the ith and jth pixels in the image. The parameters µ and η are the loss factor and the adjustment factor, set to 0.6 and 0.3, respectively. When t = 0, the initial value of S is obtained by Equation (8). (I_1 − µW^{-1})·S^t is the main body of the object features after t iterations, where (I_1 − µW^{-1}) is the object coefficient used to highlight the object. η·diag(L·(1 − P_s)) represents the background features, and diag(W·S^t)·(diag(P_s·S^t))^{-1} is the adjustment amount of the object features. By adding the prior knowledge, we can fine-tune the background and object features by iteration, and separate the object and background as much as possible.
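The update rule of Equation (9) is only partially recoverable from the text, so the sketch below is a hypothetical reading that keeps the two ingredients the text names: the object body (I1 − µW^{-1})·S^t and an η-weighted background term built from L·(1 − P_s); the clipping and the affinity bandwidth are additional assumptions.

```python
import numpy as np

def fusion_update(S0, L, Ps, W, mu=0.6, eta=0.3, iters=5):
    """Hypothetical reading of the Equation (9) iteration: object body
    (I1 - mu * W^-1) @ s minus an eta-weighted background term, with the
    result clipped to [0, 1] each step."""
    n = S0.size
    I1 = np.eye(n)
    Winv = np.linalg.inv(W + 1e-6 * I1)      # regularized inverse
    s = S0.ravel().copy()
    bg = eta * L.ravel() * (1.0 - Ps.ravel())  # background-feature term
    for _ in range(iters):
        s = (I1 - mu * Winv) @ s - bg
        s = np.clip(s, 0.0, 1.0)
    return s.reshape(S0.shape)

rng = np.random.default_rng(6)
S0 = rng.random((6, 6))
Lr = rng.random((6, 6))
Ps = rng.random((6, 6))
v = S0.ravel()
# Gaussian affinity between (flattened) saliency values.
W = np.exp(-((v[:, None] - v[None, :]) ** 2) / 0.1)
S_ref = fusion_update(S0, Lr, Ps, W)
```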
Although we have detected the object in the image, the saliency result still contains several blurs and noises.In the following section, we present an optimization approach based on NS theory that can remove the fuzzy value of the results and obtain a clean saliency map.

Saliency Optimization
The ultimate purpose of salient object detection is to accurately segment the object, i.e., to produce a binary map that matches the ground truth. However, the majority of saliency maps are multi-valued images in which many region pixels take values in the range [0, 1]. Because the distinction between these features is fuzzy, the saliency optimization issue can be turned into a fuzzy processing problem.
We discover that NS theory has a significant advantage in resolving uncertainty [34]. The basic idea of NS theory is that all propositions and things have three memberships: True (T), Indeterminacy (I), and False (F), whose values differ. It is the handling of uncertainty that distinguishes it from other theories. In traditional sets, T and F are determined by thresholds; in fuzzy sets, membership functions are employed to determine whether an element belongs to T or F; and in NS, the values of the three components T, I, and F are defined by different functions. In [35], T, I, and F are defined as follows: where g(i, j) and G(i, j) are the intensity and gradient values at pixel coordinate (i, j), respectively.
As the NS theory is only a guiding theory (a set of principles), there are multiple mathematical descriptions of NS in different applications. Therefore, we introduce a novel mathematical description of the three subsets T, I, and F to deal with the fuzzy problem. We can observe that the variables in Equations (10)-(12) for the three subsets represent linear relationships, which makes it difficult to handle fuzzy information. For this reason, the ability of the three subsets to process uncertainty can be increased by introducing nonlinear functions instead of linear ones. Moreover, the sigmoid function is a classical nonlinear function that can be adjusted to obtain the T and F subsets by tuning its parameters. For subset I, the idea of non-local mean filtering is utilized to reduce the uncertainty. Their definitions are then as follows: where ω, δ and γ, ζ control the steepness of the curves and are set to 0.2, 0.5 and 0.1, 0.1, respectively.
N_i denotes the neighborhood of pixels around pixel i, and h controls the degree of filtering. f(x_i, x_j) = exp(−||x(N_i) − x(N_j)||_2^2 / h^2) represents the similarity between two points, and g(x_j) = P_s(x_j) · S(x_j) is the data to be processed, where x_i and x_j are the ith and jth pixels in the image.
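A sketch of the three redefined subsets, assuming the stated parameters ω = 0.2, δ = 0.5, γ = ζ = 0.1 enter as sigmoid steepness/midpoint values, and using a simplified local-window version of the non-local-means idea for I; the exact parameterization is an assumption, not the paper's formula.

```python
import numpy as np

def T_subset(s, omega=0.2, delta=0.5):
    """True membership: sigmoid over saliency values (omega = steepness
    scale, delta = midpoint; a hypothetical parameterization)."""
    return 1.0 / (1.0 + np.exp(-(s - delta) / omega))

def F_subset(s, gamma=0.1, zeta=0.1):
    """False membership: one minus a second, differently tuned sigmoid."""
    return 1.0 - 1.0 / (1.0 + np.exp(-(s - zeta) / gamma))

def I_subset(s, h=0.3, win=3):
    """Indeterminacy: a non-local-means-style weighted average,
    simplified to a small local window."""
    r = win // 2
    pad = np.pad(s, r, mode="edge")
    out = np.zeros_like(s)
    for i in range(s.shape[0]):
        for j in range(s.shape[1]):
            patch = pad[i:i + win, j:j + win]
            wgt = np.exp(-((patch - s[i, j]) ** 2) / h ** 2)
            out[i, j] = (wgt * patch).sum() / wgt.sum()
    return out

rng = np.random.default_rng(7)
s = rng.random((16, 16))
Ts, Fs, Is = T_subset(s), F_subset(s), I_subset(s)
```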
The process of saliency optimization is relatively simple and straightforward: each of the three newly defined subsets is used to process the salient object detection results obtained from Equation (9), and the three optimized results are then fused to obtain the final saliency detection results. From Equations (13) and (15), it can be seen that T(x) and F(x) are sigmoid transformations whose parameters are set differently. By adjusting the parameters, we can enhance or weaken the feature points to highlight the object and separate the object and background as far as possible. Inspired by the non-local algorithm for image denoising [36], we redefine I(x_i), which maintains the features surrounding each pixel while removing the noise points in similar regions. After the three components are obtained, they are fused to obtain the ultimate features of different regions and generate the saliency map of object detection, as defined in Equation (16). The last term in Equation (16) relates to continuous saliency values: where the coefficient is set to 0.8, w_{i,j} is as in Equation (9), and S(x_i) and S(x_j) are the ith and jth elements of S in Equation (9).
The inconsistent parameter settings of T(S) and F(S) are applied to separate similar eigenvalues. In this way, we can distinguish similar values and enhance them differently, resulting in the separation of object and background features. I(S) is applied to process the values at approximately similar feature positions and reduce the impact of prominent outliers or noise. The final fusion improves the overall quality and accuracy of the output salient object detection and the visual smoothness of the saliency map.

Experiments
We perform a series of experiments to completely verify the SMNS algorithm, analyzing the outputs from three perspectives: benchmark datasets, salient object detection methods, and evaluation metrics.

Experiment Fundamentals

Benchmark Datasets
Five popular datasets are employed to validate our algorithm: MSRA10K [37], ECSSD [38], PASCAL-S [39], DUT-OMRON [40], and SOCK [41]. There are 10,000 images in MSRA10K and 1000 in ECSSD, all of which contain simple scenes with obvious objects. PASCAL-S, DUT-OMRON, and SOCK are composed of images with relatively complex backgrounds and multiple salient objects, containing 850, 5168, and 1800 images of various sizes, respectively. Moreover, SOCK is obtained from SOC [41] by removing images with no salient object. From Figure 5 and Table 2, we can see that the performance of these evaluation metrics gradually degrades across these datasets, which shows that the scenes contained in their images become progressively more complicated.

Evaluation Metrics
To show the efficacy of the SMNS approach, the precision-recall (P-R) curve [44], F-measure curve [44], mean absolute error (MAE) [45], AUC [16], ROC [16], and S-measure [46] are employed. Precision (P) is the proportion of detected salient pixels that are correct with respect to the ground truth, and recall (R) is defined as the ratio of correctly detected salient object pixels to ground-truth object pixels. The F-measure is defined based on P and R as follows:

F_β = (1 + β^2) · P · R / (β^2 · P + R),

where β^2 is set to 0.3 to emphasize precision more than recall [44].
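These definitions can be checked with a few lines of NumPy; the binarization threshold is an illustrative choice, since the P-R curve sweeps over all thresholds.

```python
import numpy as np

def precision_recall_fmeasure(sal, gt, thresh=0.5, beta2=0.3):
    """P, R, and F-measure for a binarized saliency map against a binary
    ground truth; beta^2 = 0.3 as in [44]."""
    pred = sal >= thresh
    gt = gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    p = tp / max(pred.sum(), 1)
    r = tp / max(gt.sum(), 1)
    f = (1 + beta2) * p * r / max(beta2 * p + r, 1e-12)
    return p, r, f

gt = np.zeros((10, 10))
gt[3:7, 3:7] = 1          # a 4x4 ground-truth object
sal = gt * 0.9            # a (perfect) saliency map for illustration
p, r, f = precision_recall_fmeasure(sal, gt)
```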
The P-R curve and F-measure are concerned with the proportion of correctly assigned salient pixels while ignoring detection results for non-salient pixels. As a result, we introduce the MAE to evaluate the accuracy of salient object detection.
The MAE is the pixel-wise average absolute difference between the saliency map S(k) and the ground truth GT(k).
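The MAE reduces to a one-line computation:

```python
import numpy as np

def mae(sal, gt):
    """Pixel-wise mean absolute difference between saliency map and
    ground truth (both treated as float arrays)."""
    return np.abs(sal.astype(float) - gt.astype(float)).mean()

err = mae(np.ones((8, 8)) * 0.25, np.zeros((8, 8)))
```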
The ROC curve plots the true positive rate (TPR) on its vertical axis against the false positive rate (FPR) on its abscissa. The AUC, the area under the ROC curve, is a metric used to compare the performance of classification prediction algorithms.
FPR = FP/(FP + TN), TPR = TP/(TP + FN),

where the FPR denotes the probability that a non-positive sample is judged as positive and the TPR denotes the probability that a positive sample is judged as positive. FP is the number of false positives, TN the number of true negatives, TP the number of true positives, and FN the number of false negatives.
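FPR, TPR, and the AUC (via the trapezoidal rule) can be sketched as follows; the threshold grid is an illustrative choice.

```python
import numpy as np

def roc_points(sal, gt, thresholds):
    """FPR/TPR pairs over a set of binarization thresholds."""
    gt = gt.astype(bool)
    fprs, tprs = [], []
    for t in thresholds:
        pred = sal >= t
        tp = np.logical_and(pred, gt).sum()
        fp = np.logical_and(pred, ~gt).sum()
        fn = np.logical_and(~pred, gt).sum()
        tn = np.logical_and(~pred, ~gt).sum()
        fprs.append(fp / max(fp + tn, 1))
        tprs.append(tp / max(tp + fn, 1))
    return np.array(fprs), np.array(tprs)

def auc(fprs, tprs):
    """Area under the ROC curve via the trapezoidal rule."""
    order = np.argsort(fprs)
    x = np.asarray(fprs, float)[order]
    y = np.asarray(tprs, float)[order]
    return float(((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2).sum())

gt = np.zeros((12, 12))
gt[4:8, 4:8] = 1
sal = gt.astype(float)    # a perfectly separable example
fprs, tprs = roc_points(sal, gt, np.linspace(0, 1, 11))
area = auc(fprs, tprs)
```

For a perfectly separable saliency map the curve passes through (0, 1), so the AUC is 1.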
The above evaluation metrics are all pixel-based similarity comparisons that do not consider the similarity of the image structure. Therefore, the S-measure is applied to describe the structural similarity between the saliency map and the ground truth, which is defined as follows [46]: where α is in the range [0, 1] and µ is the proportion of the salient object in the ground truth. S_o denotes an object-aware structural similarity measure, S_r is the region-aware structural similarity measure, O_FG is the foreground comparison, O_BG is the background comparison, w_k indicates the weight coefficient, and ssim(k) is a structural similarity measure.

Parameter Settings
In this paper, we use the source codes and data of the compared methods, with model parameters set according to the corresponding papers. The parameter settings of our method are noted in the formulas and are fixed for all experiments. We set the number of super-pixels to 250. All experiments are performed on a PC workstation with a 3.6 GHz CPU and 8 GB RAM using MATLAB 2016a.

Performance of the SMNS Model with Some Feature Combinations
We demonstrate that the number of image features must be suitable for the proposed algorithm; otherwise, the salient detection results will be significantly worse.
In MSRA10K, the feature combination 2* adopted in this paper is superior to the other two feature combinations with respect to the three evaluation metrics. In PASCAL-S, the performance of the three combinations is comparable in terms of the MAE, but combination 2* performs best in terms of the S-measure and P-R curve. As revealed by the performance of the SMNS model with the three feature combinations, shown in Figure 6, the selection of features depends not on the number of features but on their suitability for the algorithm.

Comparison of Low-Rank Matrix Decomposition with or without Tree Structure
To fully evaluate low-rank matrix decomposition with and without a tree structure, a series of experiments was carried out on the five benchmark datasets, covering various scenarios and three evaluation metrics. Figure 4 shows that the performance of the low-rank matrix decomposition with a tree structure in [16] and our single-layer low-rank matrix decomposition are equivalent. However, with the complex tree structure removed, the complexity of the algorithm is effectively reduced in theory: the complexity with the tree structure is O(MN), whereas without it the complexity becomes O(N), where N is the number of pixels and M is the number of root nodes of the tree. In addition, taking the average running time per image on the ECSSD dataset as an example, the average time with the tree structure is 1.524 s, while the average time without it is 1.205 s.

Saliency Optimization of NS Theory
To validate the effectiveness of the NS theory in optimizing the results, we conducted a comparative analysis of the saliency detection results with and without NS theory optimization, employing P-R curves, MAE, and S-measure on both the MSRA10K and PASCAL-S datasets, as illustrated in Figure 7. SMNS denotes the results optimized using the NS theory, while SMNS* represents results without NS theory processing. On the MSRA10K dataset, the three objective evaluation metrics for the results after NS theory optimization surpassed those without it, and the improvement was significant. On the PASCAL-S dataset, although the S-measure was comparable between the two, in terms of MAE and P-R curves, the results after NS theory optimization clearly outperformed those without it. This comparative analysis demonstrates that applying the NS theory to optimize saliency detection results is highly effective.

Validation of the Proposed Algorithm
To fully assess the validity of our SMNS method, we ran a series of experiments on five benchmark datasets encompassing different scenes, using six evaluation metrics and ten state-of-the-art approaches for comparison. The proposed algorithm's performance is shown in Figure 5 and Table 2, and Figure 8 provides visual comparisons against the ten state-of-the-art approaches. In Table 2, the proposed SMNS algorithm demonstrates outstanding MAE performance: on three of the five datasets, its MAE outperforms that of all the comparative algorithms. On the PASCAL-S and DUT-OMRON datasets, the SMNS algorithm does not achieve the best MAE but still ranks second, just behind the top-performing algorithms. The S-measure and AUC of the SMNS algorithm, while not as strong as its MAE, are also highly competitive. Specifically, the S-measure performs exceptionally well on all datasets except SOCK. For AUC, the SMNS algorithm performs well on the MSRA10K and PASCAL-S datasets, but on the ECSSD, DUT-OMRON, and SOCK datasets its AUC is not as impressive as that of some comparative algorithms. This matches the ROC results in Figure 5, since the AUC is determined by the ROC curve.
Figure 5 compares all algorithms across the five datasets in terms of P-R curves, F-measure, and ROC. On the MSRA10K, PASCAL-S, and DUT-OMRON datasets, the P-R curves of the SMNS algorithm mostly envelop those of the comparative algorithms. Although SMNS may not be the best on the other two datasets, its P-R curves remain highly competitive, indicating the accuracy of the SMNS algorithm in salient object detection. Since the F-measure is derived from the P-R curve, the F-measure of the SMNS algorithm behaves similarly to its P-R curve. Additionally, across the five datasets, the ROC performance of the SMNS algorithm, while not the absolute best, is among the top performers overall.
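For reference, the core metrics discussed above can be computed as in the following minimal sketch, which follows the conventions common in saliency evaluation (a threshold sweep for the P-R curve and β² = 0.3 for the F-measure); it is not the paper's exact evaluation code.

```python
import numpy as np

def mae(sal, gt):
    """Mean absolute error between a [0,1] saliency map and binary ground truth."""
    return np.abs(sal.astype(float) - gt.astype(float)).mean()

def pr_curve(sal, gt, n_thresh=256):
    """Precision/recall pairs obtained by sweeping a binarisation threshold."""
    gt = gt.astype(bool)
    P, R = [], []
    for t in np.linspace(0.0, 1.0, n_thresh, endpoint=False):
        pred = sal >= t
        tp = (pred & gt).sum()
        P.append(tp / max(pred.sum(), 1))   # precision at this threshold
        R.append(tp / max(gt.sum(), 1))     # recall at this threshold
    return np.array(P), np.array(R)

def f_measure(p, r, beta2=0.3):
    """F-measure as commonly used in saliency work (beta^2 = 0.3)."""
    return (1 + beta2) * p * r / np.maximum(beta2 * p + r, 1e-12)
```

The maximum of `f_measure` over the swept thresholds gives the max-F score often reported alongside the P-R curve.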
In conclusion, the results show that the proposed SMNS model successfully addresses the aforementioned problems and improves salient object detection accuracy.
However, when researching salient object detection in complex scenes, we must also acknowledge the possibility of detection failures in practical applications. Figure 9 presents examples of salient object detection failures in complex scenes. In natural environments, objects may blend in with surrounding elements, making it challenging for the algorithm to detect them accurately. These cases contribute to a better understanding of the algorithm's limitations and provide a valuable reference for improving subsequent algorithms.

Parameters Analysis
In this study, several parameters are applied in the SMNS approach, and the key parameters influence the performance of the final detection and classification results. Based on their content, structure, and correlation, these parameters are divided into four parts. The first part contains the super-pixel parameters; the second part contains the fusion parameters for the foreground and background models; the third part contains the settings for low-rank matrix decomposition; and the fourth part contains the saliency optimization parameters. Cross-validation experiments are carried out to assess the effect of the parameters on the final detection and classification performance.
As shown in Figure 10, the P-R curves on ECSSD and PASCAL-S are chosen as examples. We find that adjusting any of the parameters affects the results of the SMNS algorithm, and different parameters have different effects. Among them, the number of super-pixels has a significant influence on the SMNS algorithm, whereas the parameters of the second part have a lower impact. In addition, regardless of the direction in which the second-part parameters are adjusted, the performance is equivalent, because these parameters only affect the prior knowledge and do not directly affect salient object detection. The parameters of the third and fourth parts are also crucial: the former determine the extraction of object features, and the latter control the degree of optimization and can be used to obtain better results than changing the other parameters. These parameters are critical, and the relationships among them are worth optimizing and exploring.
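The cross-validation described above can be organized as an exhaustive sweep over the four parameter groups. The sketch below is generic; the parameter names and value ranges are hypothetical placeholders, not the paper's actual settings.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Exhaustively sweep a dict of parameter lists, keeping the best score."""
    best_score, best_params = -float("inf"), None
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = evaluate(params)          # e.g., mean F-measure on a held-out fold
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Hypothetical parameter groups mirroring the four parts described above.
grid = {
    "n_superpixels": [100, 200, 400],        # part 1: super-pixel granularity
    "prior_fusion_weight": [0.3, 0.5, 0.7],  # part 2: fg/bg model fusion
    "rank_penalty": [0.05, 0.1],             # part 3: low-rank decomposition
    "ns_strength": [0.5, 1.0, 2.0],          # part 4: NS optimization degree
}
```

In practice, one would pass a scoring function that runs the detector on a validation fold and returns a metric such as the mean F-measure.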

Conclusions
We present a novel salient object detection method based on the optimization of feature computation by NS theory. Prior object knowledge is provided for fusion with the features obtained by feature computation, which separates the object features and background features of the image as much as possible. The selected features must be suitable for the algorithm. It is demonstrated that the NS theory, whose formulas are redefined here, can be used as an optimization algorithm to obtain good saliency detection results. This approach to salient object detection offers several strategies: for instance, non-deep-learning algorithm features are introduced as prior knowledge for feature computation, and a further extension of NS theory allows us to optimize the uncertainty in saliency maps. This methodological versatility and efficacy could find applications in a wide range of domains, including robotics, video analysis, and content creation. As biological brains are most effective at salient object detection, and as NS theory is known in psychology, one research question that can be addressed in the future is the development of brain-inspired techniques.

Figure 1 .
Figure 1. The structure of the proposed algorithm. The first step is to extract and select the input image features: one part is used to construct prior knowledge, and the other part serves as the input of low-rank matrix decomposition. The prior knowledge and the output of the low-rank decomposition are then fused to complete the salient object detection. The final output is obtained by our saliency optimization algorithm.

Figure 2 .
Figure 2. Comparison between the 75-dimensional feature saliency map in the second column and the 38-dimensional feature saliency map in the third column.

Figure 4 .
Figure 4. Comparison of low-rank matrix decomposition with or without tree structure in terms of MAE, S-measure, and AUC.

Figure 5 .
Figure 5. Comparison of five datasets in terms of P-R curves, F-measure, and ROC.

Figure 7 .
Figure 7. Comparison of the P-R curve, MAE, and S-measure results on the MSRA10K and PASCAL-S datasets. SMNS indicates the result of the NS theory optimization process, and SMNS* has not been processed by NS theory.

Figure 10 .
Figure 10. The impact of different parameter adjustments in terms of P-R curves, where SMNS-P1-4 represent the four parameter parts. '+' indicates that the parameters are adjusted to be greater than their original values; '-' indicates that they are adjusted to be smaller.

Table 2 .
Comparison of five datasets in terms of MAE, S-measure, and AUC. The best three results are highlighted in red, blue, and green fonts, respectively.