Improved Dominance Soft Set Based Decision Rules with Pruning for Leukemia Image Classification

Acute lymphoblastic leukemia is a well-known type of pediatric cancer that affects the blood and bone marrow. If left untreated, it becomes fatal because it spreads into the circulatory system and other vital organs. Worldwide, leukemia affects both children and adults. The early diagnosis of leukemia is essential for the recovery of patients, particularly in the case of children. Computational tools for medical image analysis are therefore of significant use and have become a focus of research in medical image processing. The particle swarm optimization (PSO) algorithm is employed to segment the nucleus in the leukemia image. Texture, shape, and color features are extracted from the nucleus. In this article, an improved dominance soft set-based decision rules with pruning (IDSSDRP) algorithm is proposed to predict the blast and non-blast cells of leukemia. This approach proceeds in three distinct phases: (i) improved dominance soft set-based attribute reduction using the AND operation in multi-soft set theory, (ii) generation of decision rules using the dominance soft set, and (iii) rule pruning. The efficiency of the proposed system is compared with other benchmark classification algorithms. The research outcomes demonstrate that the derived rules efficiently classify cancer and non-cancer cells. Classification metrics are applied along with receiver operating characteristic (ROC) curve analysis to evaluate the efficiency of the proposed framework.


Introduction
During the last few decades, digital image analysis has been enriched with significant advancements and new techniques. A large volume of digital medical image data has been captured and recorded through regular clinical observation, research, and analysis. In the field of medical image analysis, various kinds of image processing and analysis techniques have been developed and applied to extract clinical data from the captured images. Despite these advancements in science and technology, medical practitioners in oncology still experience uncertainties when classifying malignant features. This constraint has prompted many researchers to design frameworks that analyze the image and diagnose the disease accurately so that better treatment can be given to the patient at the right time. Leukemia is a collective term applied to a group of malignant diseases with significant myeloid or lymphoid impacts. Manual evaluation of microscopic leukemia images is time-consuming and less reliable, which makes it difficult for the hematologist to correctly interpret the features of the leukemia cells. Recent researchers have used various statistical and image recognition methods to classify leukemia cells. Two broad types of leukemia are recognized, acute and chronic, depending upon the degree of development of the disease. Acute lymphoblastic leukemia (ALL) is the most prevalent group of cancer in adolescents.
The National Cancer Institute of the United States estimated that, in the year 2019, around 60,300 new cases of leukemia would be identified, of which 24,370 would be fatal [1]. In India, leukemia is found to be the ninth leading cause among youngsters of age under 14 years [2]. In the case of boys, the highest age-adjusted incidence rate (AAIR) and the lowest AAIR are reported as 101.4 and 8.4 in Delhi and Meghalaya, respectively. Concerning girls, the highest AAIR of leukemia is recorded in Delhi as 62.3, and the lowest AAIR is recorded in Cachar District, Assam, as 6.3 [3]. Early detection and complete remission of leukemia are the most challenging tasks for oncologists. Globally, several research institutions are striving towards finding effective treatment of leukemia [4]. The correct and timely diagnosis of leukemia helps greatly in choosing the right treatment. The present research focuses on the application of rough set theory and an extension of soft set theory for diagnosing ALL from blood microscopic images.
Particle swarm optimization (PSO) is a popular evolutionary computation method introduced by Kennedy and Eberhart in 1995. The inspiration for this concept came from observing the social behavior of bird flocking [5]. It is a powerful population-based optimization technique that has been applied successfully to a wide variety of search and optimization problems, including image processing problems such as image segmentation, feature selection, and classification [6][7][8][9][10]. In this paper, we describe an image processing pipeline in which the leukemia nucleus is segmented using the PSO algorithm, and we then present how relevant representative features are extracted from the segmented nucleus. During this process, different kinds of features, namely shape, color, and texture features, are extracted. For the texture features, the grey level co-occurrence matrix (GLCM) is computed for the directions 0°, 45°, 90°, and 135°.
In image processing, a large number of features can be extracted, which leads to the following issues. (1) Using the complete feature set can decrease the prediction accuracy. (2) It also slows down processing and increases the computational time. This is where feature selection comes into the picture. Feature selection (FS) is the procedure of choosing the subset of features that is most correlated with the decision classes [11]. Molodtsov [12] introduced soft set theory. It is a mathematical tool for dealing with ambiguity and imprecision, such as that found in leukemia images, and is widely used for medical image processing. Its application in the decision-making process has been widely studied. Maji et al. discussed the application of soft set theory in decision making [13]. Isa et al. proposed an extension of soft set theory involving a dominance relation. It is used to deal with the uncertainty that occurs in multiple criteria-based decision making. In this research, improved dominance soft set-based decision rules are derived to classify the blast and non-blast cell images.

Research Motivation
Medical imaging has improved the understanding of the structure and function of human anatomy and is widely used for the detection, treatment, and management of clinical conditions. The inspiration for our recent research comes from the potential of dominance soft set theory and its application in the medical field. The overarching goal of our research is the design of computational algorithms for extracting relevant features from a segmented nucleus and reducing their dimensionality. Our method analyzes digital images of leukemia cells, and the derived rules are utilized to classify the blast and non-blast cells. This approach allows us to interpret the visual information of the cellular elements in a way similar to how we use our senses to identify objects. The proposed solution for cell morphology analysis follows a methodology that uses soft computing and data mining techniques. This methodology includes segmentation, feature extraction, feature selection, classification, and diagnosis of acute leukemia. In the existing approach, the decision rules are generated based on the dominance-based soft approximation. To enhance the performance of the proposed approach, the AND operation in multi-soft set theory is employed in the dominance-based soft approximation. The dependency of the reduct set is then computed and decision rules are generated. The derived rules are then simplified by using a rule pruning algorithm [14], which reduces the classification processing time. From the experimental results, it is deduced that the overall classification accuracy of the proposed IDSSDRP is 98.08%, 97.12%, 99.04%, 97.60%, and 95.67% for the GLCM_0, GLCM_45, GLCM_90, GLCM_135, and shape and colour datasets, respectively. The ROC curve of the IDSSDRP algorithm lies close to the top left border of the ROC graph. This means that the proposed approach differentiates the blast and non-blast cells more accurately than the existing traditional approaches.

Research Contribution
The research contributions of this work are enumerated below:

• A new algorithm based on particle swarm optimization (PSO), which is a popular search optimization algorithm, is applied to segment the leukemia nucleus.
• The Haralick texture-based GLCM is employed to extract features in four directions, together with shape- and color-based features, from the segmented image.
• The improved dominance soft set-based decision rules with pruning (IDSSDRP) algorithm is applied to classify the leukemia cancerous image. This is carried out in three phases:
1. In the first phase, an improved dominance soft set-based reduction technique using the AND operation in multi-soft set theory is applied to find the reduct set.
2. In the second phase, the dominance soft set-based approach is applied to generate decision rules. Receiver operating characteristic (ROC) curve analysis is used to evaluate the efficiency of the proposed decision rules.
3. In the third phase, the rule pruning method is employed to simplify the rules and so minimize the processing time for predicting the disease (tumor image).
• Different classification algorithms are evaluated using appropriate classification measures.

The rest of the paper is organized as follows. Section 2 presents the literature survey of related works on leukemia image analysis and soft set theory. Section 3 explains the methods and materials. Section 4 discusses the proposed method of decision rules making and pruning algorithm with numerical example. Detailed empirical results of the research paper are discussed in Section 5. Finally, Section 6 presents the conclusion and indicates the scope of further research.

Related Work
Applications of soft set theory and its extensions are discussed as follows. In [15], dominance-based rough fuzzy approximations (DFRSA) of an upward or downward cumulated fuzzy set were explained. Attribute reduction was performed using rough set theory based on the discernibility matrix and a heuristic strategy. A fuzzy dominance relation was then used to extract the decision rules. A case study in bankruptcy risk analysis was employed to verify the performance of the DFRSA method. In [16], the authors established a soft-dominance relation based on soft set theory in the area of multi-criteria decision analysis.
Many researchers have worked in the field of soft set theory and its extension. In [17], the researchers discussed how various hybrid soft set models could be utilized in the field of decision making. Karaaslan [18] introduced two possible neutrosophic soft sets, namely AND-product and OR-product, to apply in decision-making problems. The arithmetical illustration displays the applications of the neutrosophic soft decision-making method, also called the PNS-decision making method. In [19], the bijective soft set was utilized to generate decision rules. Various medical datasets were analyzed, and the empirical results showed that the bijective soft set-based decision rules effectively classified the diseases. Z-soft fuzzy rough sets-based decision making was proposed by Zhan Jianming et al. [20]. In this approach, some other types of soft set models were also investigated. The mathematical results showed that the proposed method reduced the computational time when compared to other hybrid soft set models. In [21], the association between rough sets, soft sets, and hemirings was examined. The concept of soft, rough hemirings was applied to solve multi-criteria group decision-making problems. Some theoretical concepts of C-soft sets, CC-soft sets, and BC-soft sets lower and upper MSR-hemirings (k-ideals and h-ideals) were also discussed.
The automatic detection of tumors from blood microscopic images is a challenging task. Currently, many researchers analyze leukemia images to detect the blast cells using various machine learning and soft computing techniques. In [22], the author(s) developed an automated technique for white blood cell recognition and categorization. This approach is necessary to analyze each cell component in detail. Different features, namely shape-based, color-based, and texture-based features, are extracted using a new approach for background pixel removal. This process works very well and allows for the early diagnosis of suspicious cells. In [23], the researchers employed an ensemble classifier to predict ALL in blood microscopic images. It is observed that an ensemble of classifiers leads to higher accuracy in comparison with the existing classifiers, namely Naive Bayes, KNN, MLP, RBNF, and SVM.
In [24], the authors described a histogram-based soft covering rough K-means clustering (HSCRKM) algorithm for leukemia nucleus image segmentation. This approach incorporates the benefits of a soft covering rough set and rough k-means clustering. The histogram method is utilized to find the number of clusters to avoid random initialization. Machine learning algorithms were applied to categorize the healthy and leukemia cells. The proposed approach is compared with an existing clustering algorithm, and the efficiency is evaluated based on the prediction metrics. The results indicate that the HSCRKM method efficiently segments the nucleus, and it is also inferred that logistic regression and neural networks perform better than other classification algorithms.
In [25], the authors developed a computer-aided system to detect acute lymphoblastic leukemia cells. In this approach, the discrete orthonormal S-transform was utilized to extract texture features, and linear discriminant analysis was employed to reduce the dimension of the feature set. An AdaBoost algorithm with random forest (ADBRF) classifier was proposed to distinguish the blast and non-blast cells. The simulation results based on five runs of K-fold cross-validation indicate that the proposed method yields superior accuracy as compared to existing schemes.
In [26], the author(s) have designed a graphical user interface (GUI) technique to differentiate acute lymphoblastic leukemia nucleus from healthy lymphocytes in an image. In this approach, three kinds of hybrid metaheuristic algorithm, namely supervised tolerance rough set PSO based quick reduct (STRSPSO-QR), supervised tolerance rough set PSO based relative reduct (STRSPSO-RR), and supervised tolerance rough set firefly based quick reduct (STRSFF-QR), have been applied to eliminate the redundant features. The selected features were then fed into the classification process and the generated rules were optimized using the Jaya algorithm. The experimental results showed that, after improving the Jaya algorithm, the accuracy of the classification was improved.
In [27], the authors presented an automatic leukocyte cell segmentation process using a machine learning approach and an image processing technique. The features were extracted using four-moment statistical features and artificial neural networks (ANNs). It was found that the proposed method for blast cell segmentation provides better accuracy under different conditions.
In [28], the authors developed a decision support system for acute leukemia classification based on digital microscopic images. In this approach, K-means clustering is used to segment the leukemia cells and the features are extracted. The developed system classified leukemia cells according to their morphological features. A total of 757 images were collected from two datasets labeled with three different categories, namely blast, myelocyte, and segmented cells. The experimental results show that the proposed approach achieved promising results.

Methods and Materials
The system architecture of tumor detection in acute lymphoblastic leukemia using improved dominance soft set-based decision rule generation with pruning is presented in Figure 1. This architecture contains several processing steps: input image acquisition, preprocessing, nucleus segmentation, feature extraction, decision rule generation with pruning, and prediction.

Preprocessing
All the ALL images are generated by digital microscopes and are usually in the RGB color space, which is difficult to segment. Therefore, each RGB image is converted into a LAB color image. The L*a*b* space consists of a luminosity layer L* and chromaticity layers a* and b*. Here, the color information is represented in two components, i.e., a* and b*. Because the color information is confined to these two dimensions, the L*a*b* color space is mostly employed in color-based clustering [33,34]. The sample outputs of LAB color conversion are shown in Figure 2.
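The color conversion described above can be illustrated with a minimal per-pixel sketch. It assumes 8-bit sRGB input and the D65 reference white, which the text does not specify; the function name `srgb_to_lab` is purely illustrative and is not taken from the original work.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIE L*a*b* (assumed D65 white point)."""

    def to_linear(c):
        # Undo the sRGB gamma encoding.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)

    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white.
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116.0 * fy - 16.0   # L*: luminosity layer
    a_star = 500.0 * (fx - fy)      # a*: green-red chromaticity
    b_star = 200.0 * (fy - fz)      # b*: blue-yellow chromaticity
    return lightness, a_star, b_star
```

Only the a* and b* values would then be passed to a color-based clustering or segmentation step, since L* carries the luminosity.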

Segmentation
Segmentation is a process used to simplify the representation of an image into a more meaningful image. It facilitates the analysis of images [35]. Segmentation is an important phase in many image processing tasks such as medical image analysis, object identification, tumour detection, satellite imagery, etc. A great variety of segmentation methods has been proposed in the past decades. In this research, the particle swarm optimization (PSO) algorithm, which is a widely used segmentation method, was applied to segment the leukemia nucleus [5].
PSO is initialized with a population of particles. Each image is treated as a particle in an $S$-dimensional space. The $i$th particle is represented as $X_i = (x_{i1}, x_{i2}, \ldots, x_{iS})$. The best previous position of the $i$th particle is $P_i = (p_{i1}, p_{i2}, \ldots, p_{iS})$. The index of the global best particle is represented by $g$.
The velocity of the $i$th particle is $V_i = (v_{i1}, v_{i2}, \ldots, v_{iS})$. In each iteration, the particle velocities and positions are updated using Equations (1) and (2). The pseudo code for PSO algorithm-based segmentation is presented in Algorithm 1.

Algorithm 1 Pseudo Code for PSO algorithm
Input: Each image is considered as a particle
Output: Segmented image
For each particle
    Initialize particle
End
Do
    For each particle
        Calculate fitness value
        If the fitness value is better than pBest
            Set pBest = current fitness value
        If pBest is better than gBest
            Set gBest = pBest
    End
    For each particle
        Calculate particle velocity
        Use gBest and velocity to update the particle
    End
While maximum iterations or minimum error criteria is not met

After the preprocessing, the PSO-based segmentation algorithm was utilized to segment the nucleus. The results for some sample images are shown in Figure 3a-d.

Figure 3. Segmentation results using PSO: (a) Im114_1, Im070_1 and Im073_1; (b) Im192_0, Im259_0 and Im248_0; (c) Im001_1, Im002_1 and Im018_1; (d) Im056_1, Im057_1 and Im060_1.
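The loop in Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the inertia weight `w`, the acceleration constants `c1` and `c2`, the position clamping, and the toy fitness function `within_class_variance` (a grey-level threshold search) are all assumptions, since the text does not state the fitness function or parameter values used for nucleus segmentation.

```python
import random


def pso_minimize(fitness, dim, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `fitness` over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clamped to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def within_class_variance(threshold, pixels):
    """Toy fitness (hypothetical): spread of the two classes split at `threshold`."""
    low = [p for p in pixels if p <= threshold]
    high = [p for p in pixels if p > threshold]
    if not low or not high:
        return float("inf")

    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    return (sse(low) + sse(high)) / len(pixels)


pixels = [10, 12, 11, 200, 205, 198]   # made-up grey levels: dark vs. bright
best, best_val = pso_minimize(lambda p: within_class_variance(p[0], pixels),
                              dim=1, bounds=(0.0, 255.0))
```

In this toy run each particle is a single threshold value; any threshold separating the dark pixels from the bright ones attains the minimum, so `best[0]` lands between the two groups.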

Feature Extraction
In medical image processing, the process of detection and description of global or local properties of objects present in images is called feature extraction. In the present research, different categories of features were extracted, namely shape-based, color-based, and texture-based features [36][37][38][39]. The leukemia image consists of a massive nucleus of irregular shape and size. The shape is a fundamental feature that describes the physical characteristics of an image. It can be corrupted by noise, random distortion, and obstruction, which makes image recognition more complex. Colour-based features represent the colour components of an image. Leukemia images are in RGB colour format, so colour is a discriminative feature of blood and bone marrow cells [11]. However, it is not desirable to distinguish the images based on colour-based features alone. The texture feature describes the organization of the basic elements of an image. Many methods are available to describe the texture features, and one of the commonly used measures is the gray-level co-occurrence matrix (GLCM). In this research, the GLCM was computed for the directions 0°, 45°, 90°, and 135°. For each segmented image, a total of 110 features were extracted, which consisted of 11 shape-based features, 88 texture-based features (i.e., 22 features per direction), and 11 color-based features [40][41][42]. A detailed delineation of the extracted features is presented in Figure 4.
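To make the directional GLCM concrete, the sketch below counts co-occurring grey-level pairs at distance 1 for each of the four directions and derives one Haralick-style feature (contrast). The distance of 1, the non-symmetric counting, and the small 4-level test image are assumptions chosen for illustration; the text does not list the exact settings or the 22 features per direction.

```python
# (row, column) offsets for the four directions named in the text.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, levels, angle):
    """Count pairs (i, j): grey level i at a pixel, j at its neighbour along `angle`."""
    dr, dc = OFFSETS[angle]
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def contrast(counts):
    """Contrast: sum of p(i, j) * (i - j)^2 over the normalised matrix."""
    total = sum(map(sum, counts))
    n = len(counts)
    return sum(counts[i][j] * (i - j) ** 2
               for i in range(n) for j in range(n)) / total


# Hypothetical 4x4 image with grey levels 0..3.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
features = {angle: contrast(glcm(image, 4, angle)) for angle in OFFSETS}
```

Each direction yields its own matrix and therefore its own feature values, which is why the texture features in the text form four separate datasets (GLCM_0, GLCM_45, GLCM_90, GLCM_135).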

Dominance based Soft Set Theory
The dominance-based soft set approach (DSSA) is an extension of soft set theory which is utilized for decision-making analysis [16]. The $P$-lower approximation, the $P$-upper approximation, and the $P$-boundaries of $Cl_t^{\geq}$ and $Cl_t^{\leq}$ ($t = 1, \ldots, n$) are related as follows:

$$Bn_P(Cl_t^{\geq}) = \overline{P}(Cl_t^{\geq}) - \underline{P}(Cl_t^{\geq})$$
$$Bn_P(Cl_t^{\leq}) = \overline{P}(Cl_t^{\leq}) - \underline{P}(Cl_t^{\leq})$$

The quality of approximation of the classification $Cl$ by a set of soft sets $P$ is denoted $\gamma_P(Cl)$ and is defined as in [16], where $\gamma_P(Cl)$ is the degree of consistency of the objects from $U$, $P$ is the set of criterion soft sets, and $Cl$ is the classification. Every minimal subset $P \subseteq C$ such that $\gamma_P(Cl) = \gamma_C(Cl)$ is called a reduct of $Cl$, denoted $RED_{Cl}$.
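As an illustration of how such approximations behave, the following sketch uses the classical dominance-based rough set definitions on a small numeric table. This is an assumption made for illustration: the soft-set formulation in [16] may differ in detail, and the table, the gain-type ordering of the criteria, and all function names are hypothetical.

```python
def dominates(a, b, criteria):
    """True if object a is at least as good as b on every criterion."""
    return all(a[q] >= b[q] for q in criteria)


def approximations(objects, labels, criteria, t):
    """Lower and upper approximations of the upward union Cl_t^>= (classical DRSA)."""
    idx = range(len(objects))
    upward = {i for i in idx if labels[i] >= t}
    # Objects dominating i, and objects dominated by i.
    d_plus = {i: {j for j in idx if dominates(objects[j], objects[i], criteria)}
              for i in idx}
    d_minus = {i: {j for j in idx if dominates(objects[i], objects[j], criteria)}
               for i in idx}
    lower = {i for i in idx if d_plus[i] <= upward}   # certainly in the union
    upper = {i for i in idx if d_minus[i] & upward}   # possibly in the union
    return lower, upper


def quality(objects, labels, criteria):
    """Share of objects outside every boundary region.

    In classical DRSA the boundary of a downward union coincides with the
    boundary of the complementary upward union, so only upward unions are
    scanned here.
    """
    boundary = set()
    for t in set(labels):
        lower, upper = approximations(objects, labels, criteria, t)
        boundary |= upper - lower
    return (len(objects) - len(boundary)) / len(objects)


# Hypothetical table: two criteria, classes 1 (worse) and 2 (better).
objects = [(3, 3), (2, 2), (2, 3), (1, 1)]
labels = [2, 2, 1, 1]
lower, upper = approximations(objects, labels, [0, 1], t=2)
gamma = quality(objects, labels, [0, 1])
```

Here the third object dominates the second but belongs to the worse class, so both fall into the boundary region and only half of the objects are consistent.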

The Proposed Method: Improved Dominance Soft Set Based Decision Rules with Rule Pruning (IDSSDRP)
In this research, the improved dominance soft set-based decision rules with rule pruning algorithm is proposed to generate decision rules that efficiently classify the acute lymphoblastic leukemia images. The proposed system contains three different phases, namely, improved dominance soft set-based attribute reduction (IDSSA) using the AND operation in soft set theory, decision rule (DR) making, and rule pruning (PR). In phase 1, the improved dominance soft set-based attribute reduction algorithm presented in Algorithm 2 was utilized to select the critical features that are related to the decision class. The selected features were fed into phase 2, which generated the decision rules based on the P-soft lower, upper, and boundary region values. Finally, in phase 3, the rule pruning algorithm was used to simplify the rules, which reduces the processing time. The detailed description of each phase is as follows.

Algorithm 2 (closing steps):
T ← S ∪ {P}
(11) S ← T
(12) until $\gamma_S(Cl) = \gamma_C(Cl)$
(13) return S

In phase 1, the feature set is reduced to its prominent features by the improved dominance soft set-based attribute reduction using AND operations in multi-soft set theory. The conditional features are denoted as $c_1, c_2, c_3, \ldots, c_n$ and the decision feature is denoted as $D$. The IDSSA algorithm begins with an empty set. Then, the multi-valued information table (F, S) is constructed [43]. For each conditional feature, the $P$-boundaries of $Cl_t^{\geq}$ and $Cl_t^{\leq}$ are computed. The dependency value for each feature is calculated using AND operations [44] and the maximum dependency value is obtained. If the conditional feature dependency value $\gamma_P(Cl)$ is greater than or equal to the dependency value of the decision feature, then the reduced feature set $S$, where $P \subseteq C$, is retained. Otherwise, a combination of the minimal feature set is taken and the dependency value is calculated. This process is continued until the stopping condition is met.
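The stopping rule at the end of Algorithm 2 (grow S until its dependency equals that of the full criterion set) can be sketched as a greedy forward selection. This is a simplified stand-in under stated assumptions: the dependency is computed here as the share of objects consistent with the dominance principle rather than through the AND operation on multi-soft sets, and the data and the names `consistency` and `greedy_reduct` are hypothetical.

```python
def dominates(a, b, criteria):
    """True if object a is at least as good as b on every criterion."""
    return all(a[q] >= b[q] for q in criteria)


def consistency(objects, labels, criteria):
    """Share of objects never contradicted by the dominance principle."""
    n = len(objects)
    consistent = 0
    for i in range(n):
        ok = True
        for j in range(n):
            # i dominates j but sits in a worse class, or the reverse.
            if dominates(objects[i], objects[j], criteria) and labels[i] < labels[j]:
                ok = False
            if dominates(objects[j], objects[i], criteria) and labels[j] < labels[i]:
                ok = False
        consistent += ok
    return consistent / n


def greedy_reduct(objects, labels, all_criteria):
    """Add the most helpful criterion until the full-set dependency is reached."""
    target = consistency(objects, labels, all_criteria)
    selected = []
    while consistency(objects, labels, selected) < target:
        remaining = [c for c in all_criteria if c not in selected]
        best = max(remaining,
                   key=lambda c: consistency(objects, labels, selected + [c]))
        selected.append(best)
    return selected


# Hypothetical data: criterion 0 alone orders the classes correctly.
objects = [(1, 5, 0), (2, 1, 0), (3, 3, 0), (4, 2, 0)]
labels = [1, 1, 2, 2]
reduct = greedy_reduct(objects, labels, [0, 1, 2])
```

A greedy search of this kind returns a subset that preserves the dependency value, although it is not guaranteed to be minimal in every case.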
In phase 2, the decision rules based on the dominance relations are generated as described in Algorithm 3.

In phase 3, the derived rules are pruned based on the rule pruning method as described in Algorithm 4. Initially, the algorithm begins with an empty pruned rule set $P$, and each rule $R_i$ of the rule set $R$ is processed in turn. The conditional features in $R_i$ are eliminated one by one. In each step, it is verified whether the rule $R_i$ has become inconsistent with any other rule in $R$; if so, the dropped conditional feature is restored. The resulting rules are stored in $P$. Before the rule $R_i$ is added to $P$, it is checked for redundancy: if $R_i$ is logically included in any rule already in $P$, then $R_i$ is discarded. This process is continued until the last rule is verified. Finally, the pruned rules are accumulated in $P$.

The sample dataset of job application acceptance is presented in Table 1. Let a1, a2, a3, a4 denote the condition attributes and d denote the decision attribute. Table 1 is reconstructed into a multi-value information table with respect to each criterion of the soft set, as presented in Table 2.

Table 2. Multi-value information system.

The rule pruning method eliminates a total of three rules: one for upward unions of classes and two for downward unions of classes.
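The pruning procedure of phase 3 can be illustrated with a simplified sketch. It assumes equality-based rules of the form "attribute = value" rather than the dominance-based conditions of the original method, treats two rules as inconsistent when some object could satisfy both while their decisions differ, and treats a rule as redundant when an already pruned rule with the same decision uses a subset of its conditions. The rule set and all names are hypothetical.

```python
def conflicts(cond, decision, other):
    """True if some object could match both rules while the decisions differ."""
    other_cond, other_decision = other
    if other_decision == decision:
        return False
    # Compatible when no shared attribute carries two different values.
    return all(other_cond[a] == v for a, v in cond.items() if a in other_cond)


def prune_rules(rules):
    """Drop conditions one by one, restore on conflict, discard redundant rules."""
    pruned = []
    for idx, (cond, decision) in enumerate(rules):
        cond = dict(cond)
        others = rules[:idx] + rules[idx + 1:]
        for attr in list(cond):
            value = cond.pop(attr)
            if any(conflicts(cond, decision, other) for other in others):
                cond[attr] = value  # dropping this condition broke consistency
        redundant = any(p_dec == decision and p_cond.items() <= cond.items()
                        for p_cond, p_dec in pruned)
        if not redundant:
            pruned.append((cond, decision))
    return pruned


# Hypothetical rule set over binary attributes a1, a2, a3.
rules = [
    ({"a1": 1, "a2": 1}, "accept"),
    ({"a1": 1, "a2": 0}, "reject"),
    ({"a1": 0, "a2": 0}, "reject"),
    ({"a1": 1, "a2": 1, "a3": 1}, "accept"),
]
pruned = prune_rules(rules)
```

In this toy rule set only attribute a2 is needed to separate the two decisions, so four rules collapse into two shorter ones.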

Performance Analysis of Attribute Reduction Algorithm
In this research, improved dominance soft set-based attribute reduction (IDSSA) using the AND operation in multi-soft set theory was employed to choose the most relevant features. Five different feature datasets, i.e., GLCM_0, GLCM_45, GLCM_90, GLCM_135, and shape and colour features, were considered. Each feature set contained twenty-two features. On average, the IDSSA algorithm removes about 50 percent of the features. A detailed description of the datasets and the number of features extracted and selected is presented in Table 3.

Table 3. Acquired Reducts using IDSSA.

Dataset             No. of Features Extracted    IDSSA
GLCM_0              22                           10
GLCM_45             22                           11
GLCM_90             22                           11
GLCM_135            22                           11
Shape and Colour    22                           12

The reduction percentage for each dataset is presented in a pie chart (Figure 5). From this chart, it can be noted that the improved dominance soft set-based feature selection algorithm eliminates almost 50% of the features in all the datasets. For GLCM_0, the reduct retains 10 of the 22 features (about 45%), which is the smallest retained share among all the datasets.
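The percentages behind Table 3 can be recomputed directly from the feature counts. The short sketch below does only this arithmetic; the variable and function names are illustrative.

```python
# Feature counts taken from Table 3.
EXTRACTED = 22
selected = {
    "GLCM_0": 10,
    "GLCM_45": 11,
    "GLCM_90": 11,
    "GLCM_135": 11,
    "Shape and Colour": 12,
}


def reduction_percent(n_extracted, n_selected):
    """Percentage of features removed by the attribute reduction step."""
    return 100.0 * (n_extracted - n_selected) / n_extracted


reductions = {name: reduction_percent(EXTRACTED, k) for name, k in selected.items()}
average_reduction = sum(reductions.values()) / len(reductions)
```

Averaged over the five datasets, the share of removed features comes out at 50%, in line with the figure quoted in the text.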

Evaluation of Proposed IDSSDRP Algorithm
The selected features are then fed into the dominance soft set-based decision rule algorithm. In this algorithm, the P-soft lower and P-soft boundary approximations are taken as the desired rules. In this experiment, five different datasets, namely GLCM_0, GLCM_45, GLCM_90, GLCM_135, and shape and colour, were used to generate the decision rules. For each dataset, 80% of the samples were used for training and the remaining 20% were used for testing. The decision rule generation algorithm was employed to generate the rules required to predict the tumor image. Finally, the rule pruning algorithm was applied to simplify the obtained rules. The efficiency of the proposed rule pruning algorithm is given in Figure 6. The decision rules derived from the P-soft lower approximation of the upward and downward unions of class 1 and class 2 for the GLCM_0 dataset are shown in Appendix A1. The pruned rules after applying the proposed rule pruning algorithm to the GLCM_0 dataset are shown in Appendix A2. The number of rules generated for class 1 is three and that for class 2 is one.
A prediction algorithm learns a model from the training set, and this model is then utilized to predict new objects. In machine learning, a classifier is evaluated by a confusion matrix. A confusion matrix shows the number of correct and incorrect predictions made by the classification model compared to the actual outcomes (target values) in the data. Table 4 shows the values of the entries in the confusion matrix for various classifiers. The four performance measures have the advantage of being independent of class costs and prior probabilities. A classifier aims to minimize the false positive and false negative rates or, conversely, to maximize the true negative and true positive rates. The performance of the proposed algorithm, i.e., the improved dominance soft set-based decision rule generation with pruning algorithm, is compared with other well-known classification algorithms, namely decision tree [45], J48 [46], JRip [47], LMT [48], and random forest [49]. Various classification assessment metrics are used to evaluate the performance of the proposed IDSSDRP algorithm. The detailed interpretation of each metric is presented in Table 5 [50][51][52][53][54][55][56].

Table 5 (continued). Classification metrics.

G-mean: The geometric mean of the prediction accuracies for both classes. $\sqrt{\text{precision} \times \text{recall}}$

Youden's index: A single value that combines sensitivity and specificity. sensitivity + specificity − 1

Balanced Classification Rate (BCR): The mean of sensitivity and specificity. ½(sensitivity + specificity)

Balanced Error Rate (BER): The mean of the errors in each class. It is also named the Half Total Error Rate (HTER). 1 − BCR

Note: TP-True Positive; TN-True Negative; FP-False Positive; FN-False Negative.

Table 6 shows the classification results for the GLCM_0 dataset. Various classification metrics are computed for each existing classifier and for the proposed algorithm. It is noted that the proposed IDSSDRP algorithm performs well when compared to the existing classification algorithms.
The error rate and the balanced error rate are very small for the proposed algorithm, which indicates that the algorithm classifies the blast and non-blast cell ALL images more accurately.

Table 7 shows the performance of the decision-making algorithms for the GLCM_45 dataset. The proposed IDSSDRP algorithm achieves an overall accuracy of 97% and an error rate of 3%. For Youden's index, the proposed decision-making algorithm achieves the highest score, i.e., 2.5 times better than the average score of the existing algorithms.

The efficiency of the proposed algorithm with respect to the GLCM_90 dataset is presented in Table 8. Among the experimental results for all five feature datasets, the highest overall classification accuracy, i.e., 99%, is achieved on the GLCM_90 dataset. The error rate is 0.01, i.e., 1%. It is also noted that all the classification algorithms produced a prediction accuracy above 80%.

The empirical results of the IDSSDRP algorithm and the existing classification algorithms for the GLCM_135 dataset appear in Table 9. It is noted that the decision tree and J48 classifiers produced equal values for all the metrics. Furthermore, the proposed decision rules classified almost all the blast and non-blast ALL images correctly.

Table 10 shows the experimental results of the different classification approaches for the shape and colour dataset. On this dataset, the proposed algorithm achieved a prediction accuracy of 95%, which is its lowest accuracy among the five datasets.

Figure 7 exhibits the performance of decision tree, J48, JRip, random forest, and the proposed IDSSDRP for each dataset based on prediction accuracy. It is found that the proposed algorithm gives the highest prediction accuracy. With respect to the GLCM_90 dataset, the highest prediction accuracy, i.e., 99%, is achieved. On the contrary, the lowest prediction accuracy, i.e., 77%, is produced by the random forest (RF) classifier on the GLCM_135 dataset. The decision tree and J48 classifiers achieved a prediction accuracy of about 80%.

The error rate is calculated as the number of all incorrect predictions divided by the total number of inputs. The best error rate is 0.0, whereas the worst is 1.0. Figure 8 shows the classification error rates for the various classifiers and the proposed decision-making algorithm with respect to all the feature datasets. The proposed IDSSDRP algorithm gives the best error rates, i.e., less than 0.05. In this graph, it is also noted that the random forest algorithm gives an error rate of 0.23 (a relatively high value) on the GLCM_135 dataset.

Figure 9 illustrates the performance of the proposed and existing algorithms with respect to each dataset in terms of precision, recall, and F1-measure. Precision is the fraction of selected items that are relevant, whereas recall is the fraction of relevant items that are selected. The harmonic mean of these two metrics is the F1-measure. From Figure 9, it is observed that the proposed algorithm performs consistently well and produces the highest precision, recall, and F1-measure values for the feature datasets.

Table 11 compares the various classification approaches with our proposed IDSSDRP in terms of accuracy, sensitivity, and specificity. In the existing approach [23], the SVM classifier gives 91.43% accuracy, 73.13% sensitivity, and 98.7% specificity. The experimental results show that the classification accuracy of the proposed IDSSDRP is 98.08%, 97.12%, 99.04%, 97.60%, and 95.67% for the GLCM_0, GLCM_45, GLCM_90, GLCM_135, and shape and colour datasets, respectively. It is also noted that the proposed approach gives higher accuracy than the SVM classifier.
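The metrics used in this section can all be derived from the four confusion matrix counts. The sketch below shows one way to compute them; the counts in the usage line are made up for illustration and are not results from the paper, and the G-mean follows the precision and recall form listed in Table 5 with a square root assumed (some sources define it from sensitivity and specificity instead).

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Derive common classification metrics from confusion matrix counts."""
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    bcr = 0.5 * (sensitivity + specificity)
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2.0 * precision * sensitivity / (precision + sensitivity),
        "g_mean": math.sqrt(precision * sensitivity),  # assumed form, see lead-in
        "youden": sensitivity + specificity - 1.0,
        "bcr": bcr,
        "ber": 1.0 - bcr,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Because BCR and BER are computed per class before averaging, they remain informative even when the blast and non-blast classes are not equally represented.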

Graphical Performance Assessment for IDSSDRP
The receiver operating characteristic (ROC) curve is a chart that plots the true positive rate against the false positive rate at various cut-off values. It is an important tool for investigating the performance of the various classifiers. ROC graphs are widely used in the fields of decision rule making, machine learning, data analytics, and data mining analysis [57]. In this work, ROC curves are used to demonstrate the superiority of soft set-based decision-making. A decision-making algorithm whose curve appears in the top left corner of the ROC space forecasts the class precisely. The diagonal line denotes the strategy of randomly predicting a class. Any classifier that appears at the bottom right of the ROC graph performs worse than random predictions. The region on the far-left side of the ROC graph, where the false positive rate is low, is of particular importance. Figure 10a-e shows the ROC curve analysis of the proposed IDSSDRP and the existing decision-making algorithms. The ROC curve evaluates the graphical performance of the proposed improved dominance soft set-based decision rules making with pruning algorithm (IDSSDRP). With respect to all the datasets, i.e., GLCM_0, GLCM_45, GLCM_90, GLCM_135, and shape and color, the proposed decision-making algorithm performed much better than the other existing classification algorithms. The curve of the IDSSDRP algorithm appears at the top left border of the ROC graph. This means that the proposed approach correctly distinguishes the blast and non-blast cells.
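How the points of an ROC curve arise from a ranked list of predictions can be sketched as follows. The labels and scores are invented for illustration, the scores are assumed to be distinct, and the area under the curve is computed with the trapezoidal rule; none of this is taken from the experiments reported here.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per cut-off."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    positives = sum(labels)
    negatives = len(labels) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        # Lower the cut-off just enough to accept the next highest score.
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical test set: 1 = blast cell, 0 = non-blast cell.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
```

A curve that hugs the top left corner gives an area close to 1, whereas the diagonal of random guessing gives an area of 0.5.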

Conclusions and Future Scope
In this paper, a novel improved dominance soft set-based decision rules generation with pruning algorithm (IDSSDRP) is proposed to classify acute lymphoblastic leukemia images. The proposed method has the following advantages. (1) Features are reduced using the dominance soft set with the AND operation in multi-soft set theory. This improves the classification accuracy and reduces the memory space. (2) The generated decision rules are utilized to predict the blast and non-blast cells. (3) The rule pruning algorithm simplifies the generated decision rules, which helps to increase the computational speed. The empirical results show that the proposed IDSSDRP algorithm effectively predicts the tumor cells in ALL images. The ROC curve analysis illustrates the proposed system's performance in the accurate diagnosis of the disease.
In the future, we plan to develop a hybrid method that combines the advantages of certain evolutionary algorithms, such as firefly optimization, gray wolf optimization, moth-flame optimization, Lloyd's algorithm, and the Huffman algorithm, with set theory extensions.