Iterative Reflect Perceptual Sammon and Machine Learning-Based Bagging Classification for Efficient Tumor Detection

Bose, S. Subash Chandra; Natarajan, Rajesh; H L, Gururaj; Flammini, Francesco; Praveen Sundar, P. V.

doi:10.3390/su15054602


by S. Subash Chandra Bose ¹, Rajesh Natarajan ², Gururaj H L ³,*, Francesco Flammini ⁴,* and P. V. Praveen Sundar ⁵

¹ Department of Computer Science, Islamiah College (Autonomous), Vaniyambadi 635751, India

² Information Technology Department, University of Technology and Applied Sciences-Shinas, Al-Aqr, Shinas 324, Oman

³ Department of Information Technology, Manipal Institute of Technology Bengaluru, Manipal Academy of Higher Education, Bangalore 560064, India

⁴ IDSIA USI-SUPSI, University of Applied Sciences and Arts of Southern Switzerland, 6928 Manno, Switzerland

⁵ Department of Computer Science and Applications, Adhiparasakthi College of Arts and Science, Kalavai 632506, India

* Authors to whom correspondence should be addressed.

Sustainability 2023, 15(5), 4602; https://doi.org/10.3390/su15054602

Submission received: 9 December 2022 / Revised: 22 February 2023 / Accepted: 27 February 2023 / Published: 4 March 2023


Abstract:

A tumor is an abnormal growth of cells in the human body that develops when cells divide without control. Tumors can grow from a small to a large lump and can appear anywhere in the body. Early diagnosis is essential for treatment. Many researchers have studied tumor detection methods; however, detection accuracy has not improved sufficiently and detection time has not been minimized. To address these problems, an Iterative Reflect Perceptual Sammon Bagging Classification (IRPS-BAC) Method is introduced. The aim is to detect brain tumors accurately and as early as possible, and to make the method suitable for real-time applications. The IRPS-BAC Method comprises two processes: feature selection, using the iterative reflect perceptual sammon feature selection process, and classification, using the bagging classification process. In the IRPS-BAC Method, input medical data are gathered from the Epileptic Seizure Recognition Data Set and the Cervical Cancer Risk Classification database. The iterative reflect perceptual sammon feature selection process is then carried out to select the relevant features. Iterative reflect perceptual divergence computes the variation between two features, after which sammon mapping projects the similar and dissimilar features into feature space. In this manner, the relevant features are selected. The bagging classification process is then carried out on the selected features: the internal node processes the selected features and the leaf node makes the tumor decision (normal or cancerous) based on information gain. This, in turn, helps to reduce the time complexity and error rate. The performance of the proposed IRPS-BAC Method is evaluated on two benchmark datasets by comparing tumor detection time, tumor detection accuracy, and error rate with existing approaches. On the Epileptic Seizure Recognition Data Set, the proposed IRPS-BAC Method improves tumor detection accuracy by 16% and reduces tumor detection time and error rate by 41 ms and 58%, respectively, compared to existing methods. On Cervical Cancer Risk Classification, the proposed IRPS-BAC Method likewise outperforms the conventional approaches in accuracy (14%), time (46 ms), and error rate (61%).

Keywords:

machine learning; tumor detection; iterative reflect perceptual sammon; bagging classification; relevant features; internal node

1. Introduction

Tumor detection is a challenging and essential task in medical-image applications, as it involves a large amount of data. In the medical field, brain tumor detection, whether manual or automatic, is vital for clinical diagnosis and treatment and for reducing the death rate. The novel Iterative Reflect Perceptual Sammon Bagging Classification (IRPS-BAC) Method is therefore useful for medical applications, especially for diagnosing and treating brain tumors accurately at an early stage.

An automatic method was introduced in [1] for epileptic seizure detection depending on deep metric learning. A one-dimensional convolutional embedding module was employed for single-channel and multichannel EEG signals correspondingly. However, the detection time was not reduced by the designed method. The cell-free DNA (cfDNA) methylome was introduced in [2] to identify ovarian cancer at the early stage, but the detection accuracy level was not improved by cfDNA methylome.

A new patient-specific seizure prediction technique was introduced in [3] depending on deep learning with electroencephalogram (EEG) recordings. However, the computational complexity was not reduced by the patient-specific seizure prediction technique. An artificial intelligence system was discussed in [4] for identifying the epileptic focus based on features that utilized interictal EEGs. An efficient computer-aided solution was obtained for epilepsy focus detection, but the computational time consumption was not reduced by the artificial intelligence system.

A new analysis system was introduced in [5] for identifying the epileptic seizure from EEG signals. The designed system employed the statistical features depending on optimum allocation technique (OAT) with logistic model trees (LMT), but the space complexity was not minimized by the designed analysis system. The blood samples of DNA methylation profiling were carried out in [6] for ovarian cancer detection. A supervised machine learning algorithm was used to predict and classify the blood sample as malignant or non-malignant, but the computational cost was not reduced by DNA methylation profiling.

Epithelial ovarian cancer (EOC) was linked in [7] with the pathogenic variants (PVs) in homologous recombination and/or mismatch repair genes. The women’s testing was carried out with familial EOC. However, the tumor detection accuracy was not improved by EOC. The circulating tumor DNA (ctDNA) utility was carried out in [8] with biomarker for EOC, but feature selection time was not reduced by ctDNA utility.

An ultrasensitive and selective electrochemical biosensor was employed in [9] for rapid DNA methylation detection in blood. DNA methylation sensing included hybridization of DNA modified gold-coated magnetic nanoparticles to target DNA and to differentiate methylated DNA. However, computational cost was not reduced by designed biosensor. The signal decomposition and statistical method was discussed in [10] for epileptic seizure detection. The variational mode decomposition (VMD) was carried out to extract components of intrinsic mode functions (IMFs) through EEG signal decomposition. However, the feature selection accuracy was not improved by signal decomposition and statistical method.

A brain tumor occurs due to the unrestrained and fast expansion of cells. If not treated at an early stage, it may lead to death. Despite numerous important efforts and promising results in this domain, accurate feature selection and classification remain challenging. A few machine learning methods for brain tumor detection have been examined. In previous work, the relevant features were selected, but the feature selection time increased, and the conventional classification process failed to attain sufficient accuracy. To address these issues, the Iterative Reflect Perceptual Sammon Bagging Classification (IRPS-BAC) Method is introduced.

The following is a list of our key contributions to this research:

➢: This paper presents the proposed IRPS-BAC Method to improve the tumor detection performance with higher accuracy and lesser time complexity;
➢: This paper uses the iterative reflect perceptual sammon feature selection process and bagging classification process in the proposed IRPS-BAC Method;
➢: This paper applies the iterative reflect perceptual sammon feature selection process for computing divergence between features and objective. The sammon mapping projects the similar and dissimilar features into feature space and also selects the relevant features. In this way, the tumor detection time is said to be minimized;
➢: This paper performs the bagging classification process in the IRPS-BAC Method. The internal node processes the selected features and the leaf node makes the tumor decision (normal or cancerous) based on information gain. This, in turn, helps to improve accuracy and reduce the time complexity and error rate.

The rest of the paper is arranged as follows: related work on tumor detection is presented in Section 2. Section 3 describes the IRPS-BAC Method in detail with a diagram. Section 4 provides the experimental setup. The results and discussions are presented in Section 5. Finally, Section 6 concludes the paper.

2. Related Works

An efficient encrypted EEG data classification and recognition system was designed in [11] through a chaotic baker map and Arnold Transform algorithm with convolutional neural networks (CNNs). A fully automated system depending on the Hybrid Grey Wolf Optimizer Improved Sine Cosine Algorithm (HGWOISCA) was introduced in [12] for EEG signal classification.

An automated deep learning-enabled brain signal classification for epileptic seizure detection was introduced in [13] to categorize the brain signals to identify the existence of seizure or not. A new enhanced search ability based on atomic search optimization (ESAASO) was introduced in [14] for seizure and non-seizure detection. An inertia weight, levy flight, and ranking strategies were combined to increase the performance.

A new epileptic seizure detection method was introduced in [15] with empirical mode decomposition, the mutual information-based best individual feature (MIBIF) selection algorithm, and the multi-layer perceptron neural network. An autonomously generalized retrospective and patient-specific hybrid model was introduced in [16] based on convolutional neural network with long short-term memory. An automated learning framework termed Fourier–Bessel series expansion-based empirical wavelet transform (FBSE-EWT) method was introduced in [17] for identifying epileptic seizures from the EEG signals. A deep long short-term memory (LSTM) network was introduced in [18] to study high-level representations of EEG patterns. The features were given to softmax layer to attain the predicted labels.

A support vector machine classifier was introduced in [19] for mapping time series EEG signals to a complex network. Edge weight fluctuation (EWF) was used to extract fluctuation in EEG signals. A new method was introduced in [20] for mapping time series EEG signals to a complex network. A random forest classifier was applied in [21] to tumor detection, relying contextually on several spatial and temporal features. A segmentation and detection method for brain tumors was developed in [22], using images from an MRI sequence as input to identify the tumor area. Deep hybrid learning (DeepTumorNetwork) was introduced in [23] for categorizing brain cancers. However, the time was not reduced.

3. Methodology

A tumor is an abnormal increase in cells that forms an unnatural mass with features different from normal cells. Tumor detection is one of the most challenging tasks in medical applications [24], and tumor classification is a difficult task in medical data analysis. Machine learning technology helps radiologists to detect tumors without surgical intervention [25]. Researchers have discussed different classification methods for detecting tumors effectively, but the conventional methods neither improved detection accuracy nor reduced detection time. To address these issues, an efficient method called the Iterative Reflect Perceptual Sammon Bagging Classification (IRPS-BAC) Method is introduced for tumor detection. The structural diagram of the IRPS-BAC Method is illustrated in Figure 1.

Figure 1 explains the flow of the proposed IRPS-BAC Method, which performs efficient tumor detection with higher accuracy and lesser time consumption. The input medical data ‘$d_1, d_2, d_3, \dots, d_m$’ are gathered from the database. The collected medical data with their features are taken as the input for feature selection. Finally, tumor classification is carried out on the selected features with higher accuracy and lesser time consumption. The iterative reflect perceptual sammon feature extraction and the bagging classification are explained in the following subsections.

3.1. Iterative Reflect Perceptual Sammon Feature Extraction

The proposed IRPS-BAC Method performs the feature selection for accurate tumor detection with lesser time consumption. Feature selection is the process that finds the feature subset from the input dataset. Iterative reflect perceptual divergence computes the distance between the two points. In the IRPS-BAC Method, sammon mapping projects the similar and dissimilar features into feature space. The diagrammatic representation of the feature selection is illustrated in Figure 2.

Figure 2 explains the diagrammatic representation of the feature selection process. Let the features in the database be ‘$fet_j = fet_1, fet_2, fet_3, \dots, fet_n$’. By applying the iterative reflect perceptual divergence, the distance between each feature and the objective is determined. It is formulated as

$$\varphi_{irp} = \lVert fet_j - objective \rVert \tag{1}$$

From (1), ‘$\varphi_{irp}$’ denotes the iterative reflect perceptual divergence and ‘$fet_j$’ denotes the features of the input data. The iterative reflect perceptual divergence value varies from ‘0’ to ‘1’. After that, a predefined threshold value is used to map the input features from the database into one of the subsets. Consequently, the sammon mapping result is given as

$$SM \to \begin{cases} \varphi_{irp} > th; & \text{dissimilar features} \\ \varphi_{irp} < th; & \text{similar features} \end{cases} \tag{2}$$

From (2), ‘$SM$’ symbolizes the sammon mapping function and ‘$th$’ represents the threshold. A feature that diverges more from the objective is considered a dissimilar feature; otherwise, the feature is considered a relevant feature. In this manner, the similar features are mapped into a low-dimensional space. The algorithmic process of iterative reflect perceptual sammon feature selection is shown in Algorithm 1.

Algorithm 1: Iterative Reflect Perceptual Sammon Feature Selection

Input: Database, number of features ‘$fet_j = fet_1, fet_2, fet_3, \dots, fet_n$’

Output: Selected features

1. Begin
2. Number of input features ‘$fet_j = fet_1, fet_2, fet_3, \dots, fet_n$’ taken as an input
3. For each feature ‘$fet_j$’
4. Compute the iterative reflect perceptual divergence
5. Map the features into subsets based on divergence value
6. Select the relevant features from database
7. End for
8. End

The algorithmic step of the IRPS-BAC Method is described with feature selection process. Initially, the number of features is considered as an input. Then, the divergence value is computed for every feature. Based on the values, the relevant features are selected. Finally, the relevant features are used to perform efficient tumor detection.
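The steps of Algorithm 1 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the experiments: the objective is taken to be the class-label column, Equation (1) is realized as a Euclidean distance after min-max scaling and divided by the number of samples so that the value stays in [0, 1], and the threshold `th = 0.5`, the helper names, and the toy data are all hypothetical.

```python
import math

def min_max(values):
    """Rescale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def irp_divergence(feature, objective):
    """Eq. (1) read as a normalized Euclidean distance (assumption), so it lies in [0, 1]."""
    f, o = min_max(feature), min_max(objective)
    squared = sum((a - b) ** 2 for a, b in zip(f, o))
    return math.sqrt(squared / len(f))

def select_features(columns, objective, th=0.5):
    """Steps 3-6 of Algorithm 1: keep the features whose divergence is below th."""
    divergences = [irp_divergence(col, objective) for col in columns]
    selected = [j for j, d in enumerate(divergences) if d < th]
    return selected, divergences

# Toy data (hypothetical): feature 0 tracks the objective, feature 1 opposes it.
objective = [0, 1, 0, 1]
columns = [[10, 20, 10, 20], [5, 1, 5, 1]]
selected, divergences = select_features(columns, objective)
print(selected, divergences)
```

In this toy case feature 0 has zero divergence and is kept, while feature 1 has the maximum divergence and is mapped to the dissimilar subset.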

3.2. ID3 Ensembled Bagging Classification

In the IRPS-BAC Method, the bagging classifier is an ensemble of base classifiers, each trained on a random subset of the dataset, whose individual predictions are combined. Each base classifier is trained on a random sample drawn with replacement, so the training sets of the base classifiers are independent of each other. The bagging classifier reduces overfitting through voting, trading a small increase in bias for a reduction in variance. Bootstrap aggregating is a machine learning technique used to improve the stability and accuracy of classification and regression analysis. Let us consider that the bagging classifier constructs ‘$n$’ ID3 decision trees, one for each bootstrap sample of the medical data. The ID3 decision tree is used in the IRPS-BAC Method for categorizing the medical data into normal data or cancerous data (i.e., abnormal data). The ID3 decision tree has a root node, internal nodes, and leaf nodes. The root node holds the data with the selected features. The internal node processes the selected features and the leaf node makes the decision to categorize the tumor as normal or cancerous based on information gain. The information gain is given as

$$B_i(\tau_i) = IG = h(s) - \sum_{c_s \in \{n,\, ab\}} \frac{|c_s|}{|I|}\, h(c_s) \tag{3}$$

From (3), ‘$c_s$’ denotes the two classes of the ID3 classifier (i.e., normal ‘$n$’ or abnormal ‘$ab$’). ‘$h(s)$’ symbolizes the entropy of the total set. ‘$s$’ symbolizes the input data. ‘$h(c_s)$’ symbolizes the entropy of the classes. The information gain is used to partition the data according to the selected features. The data with high information gain are categorized as normal. The data with less information gain are categorized as abnormal. The output of classification is attained at the leaf node. Each weak classifier presents its classification result. After that, the bagging classifier in the IRPS-BAC Method aggregates all base classifiers into a strong classifier. It is formulated as

$$B(\tau_i) = B_1(\tau_i) + B_2(\tau_i) + \dots + B_n(\tau_i) \tag{4}$$

From (4), the strong classifier results are obtained. Subsequently, the IRPS-BAC Method applies a vote ‘$\delta_i$’ to each base ID3 classifier result ‘$B(\tau_i)$’. It is computed as

$$\delta_i \to \sum_{i=1}^{n} B(\tau_i) \tag{5}$$

From (5), the majority vote of all base ID3 classifier results is employed to formulate the strong classifier for grouping the medical data. Consequently, the strong classifier result is computed as

$$SC(\tau_i) = \arg\max_{n} \delta(B(\tau_i)) \tag{6}$$

From (6), ‘$SC(\tau_i)$’ designates the final strong classification result, whereas ‘$\arg\max_{n} \delta$’ denotes the majority vote over the base classifier outputs. The obtained strong classification results are used to accurately characterize the data as normal data or abnormal data. The algorithmic description of the ID3 Ensembled Bagging Classification in the IRPS-BAC Method is described as follows.

Algorithm 2 gives the algorithmic description of ID3 ensembled bagging classification in the IRPS-BAC Method. The medical data with the selected features are taken as the input. With the selected features, the ensemble bagging classifier builds a number of decision trees to categorize the medical data. The bagging classifier combines the decision trees and assigns a vote to every decision tree result. The result with the maximum number of votes is taken as the final classification result. In this manner, the classification results are improved and the time consumption is minimized. The proposed IRPS-BAC Method effectively performs brain tumor detection with higher accuracy.

Algorithm 2: ID3 Ensembled Bagging Classification

Input: Selected features

Output: Classification results

Begin
1. for each data with selected features
2. Construct base learners with selected features
3. Classify data based on maximum information gain
4. Combine all weak learners
5. for each weak learner
6. Assign votes to each base learner
7. Find base learner with majority of votes
8. Obtain strong classification results
9. end for
10. end for
End
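The flow of Algorithm 2 together with Equations (3) to (6) can be illustrated with a compact Python sketch. It is a sketch under assumptions rather than the implementation evaluated in this paper: it uses the standard ID3 information gain over categorical features, bootstrap samples of the same size as the input, and a plain majority vote; the function names, the number of trees, the seed, and the toy records are hypothetical.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy h(s) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, j):
    """Entropy of the set minus the size-weighted entropy of the subsets induced by feature j."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[j], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

def build_id3(rows, labels, features):
    """Return a leaf (class label) or an internal node (feature, children, majority)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    best = max(features, key=lambda j: information_gain(rows, labels, j))
    children = {}
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        children[value] = build_id3([rows[i] for i in idx], [labels[i] for i in idx],
                                    [j for j in features if j != best])
    return (best, children, majority)

def predict_tree(tree, row):
    """Walk from the root to a leaf; unseen feature values fall back to the node majority."""
    while isinstance(tree, tuple):
        feature, children, majority = tree
        tree = children.get(row[feature], majority)
    return tree

def bagging_fit(rows, labels, n_trees=15, seed=0):
    """Train n_trees ID3 trees, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    features = list(range(len(rows[0])))
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        trees.append(build_id3([rows[i] for i in idx], [labels[i] for i in idx], features))
    return trees

def bagging_predict(trees, row):
    """Majority vote over the base classifiers, in the spirit of Eqs. (5) and (6)."""
    votes = Counter(predict_tree(tree, row) for tree in trees)
    return votes.most_common(1)[0][0]

# Toy records (hypothetical): feature 0 separates the classes, feature 1 is noise.
rows = [["a", "x"], ["a", "y"]] * 5 + [["b", "x"], ["b", "y"]] * 5
labels = ["normal"] * 10 + ["abnormal"] * 10
trees = bagging_fit(rows, labels)
print(bagging_predict(trees, ["a", "x"]), bagging_predict(trees, ["b", "y"]))
```

On the toy records, feature 0 has an information gain of 1 bit and feature 1 has none, so each tree that sees both classes splits on feature 0 and the vote recovers the class of every record.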

4. Experimental Settings

The proposed IRPS-BAC Method and the existing methods, namely the automatic method [1], cfDNA methylome [2], and Deep Tumor Network [25], are implemented in Java for detecting tumors with minimum time consumption. The medical data are collected from two different datasets, namely the Epileptic Seizure Recognition Data Set and Cervical Cancer Risk Classification. The first dataset [23] is taken from the UCI machine learning repository (https://archive.ics.uci.edu/ml/datasets/Epileptic+Seizure+Recognition (accessed on December 2022)). The objective of the dataset is to predict epileptic seizures, which are a symptom of excessive neuronal activity in the brain. The dataset comprises 11,500 instances and 179 attributes. Of the 179 attributes, 178 are explanatory variables (X1, X2…X178) and the last is the class attribute. The second dataset [26] is taken from Kaggle (https://www.kaggle.com/datasets/loveall/cervical-cancer-risk-classification (accessed on December 2022)). This file comprises the list of risk factors for cervical cancer leading to biopsy examination. The dataset comprises 858 instances and 36 attributes.

5. Results Analysis

The simulation results of the proposed IRPS-BAC Method and the existing methods, namely the automatic method [1], cfDNA methylome [2], and Deep Tumor Network [25], are presented in this section. The efficiency of the proposed and existing methods is compared using different parameters, including:

➢: Tumor detection time;
➢: Tumor detection accuracy;
➢: Error rate;
➢: Throughput.

The effectiveness of the proposed IRPS-BAC Method and existing methods are discussed in terms of tables and graphical representation.

5.1. Impact on Tumor Detection Accuracy

Tumor detection accuracy is defined as the ratio of number of data points that are correctly identified as tumor or normal through the classification to the total number of data points. The tumor detection accuracy is expressed as

$$TDA = \left(\frac{\text{Number of data points that are correctly detected}}{\text{Number of data points}}\right) \times 100 \tag{7}$$

From (7), ‘$TDA$’ symbolizes the tumor detection accuracy. The tumor detection accuracy is computed as a percentage (%).

Table 1 and Table 2 report the tumor detection accuracy of four methods, namely the three existing methods (the automatic method, cfDNA methylome, and Deep Tumor Network) and the proposed IRPS-BAC Method, for two datasets, namely the Epileptic Seizure Recognition Data Set and Cervical Cancer Risk Classification.

The various medical data points are collected from the database. The performance of the tumor detection accuracy of the proposed IRPS-BAC Method is significantly improved when compared to the existing techniques. The simulation graph with different results is illustrated in Figure 3 and Figure 4.

Figure 3 and Figure 4 provide the simulation results of tumor detection accuracy for the four methods with respect to the number of data points. As the number of data points increases, the tumor detection accuracy of each method increases or decreases. Comparatively, the IRPS-BAC Method attained higher tumor detection accuracy than the three existing methods. This is because of the feature selection and classification processes used for efficient tumor detection. The divergence is computed for every feature to select the relevant features. With the relevant features selected, the ensembled bagging classification, a machine learning technique using ID3, categorizes the medical data as normal or cancerous for accurate tumor detection. The bagging classifier integrates the decision trees and allocates votes to each decision tree outcome. The outcome with the most votes is taken as the final classification result. These classification results are used to obtain precise brain tumor detection. For the Epileptic Seizure Recognition Data Set with 5000 input data points, the tumor detection accuracy is 93.7% using the IRPS-BAC Method, compared with 83.4% for the automatic method [1], 77.5% for cfDNA methylome [2], and 73.85% for Deep Tumor Network [25]. On average, the tumor detection accuracy on the Epileptic Seizure Recognition Data Set is improved by 11%, 17%, and 21% when compared to [1,2,25], respectively. For Cervical Cancer Risk Classification, 160 data points are considered in the second iteration for calculating the tumor detection accuracy. By applying the proposed IRPS-BAC Method, the tumor detection accuracy was found to be 93.1%, whereas the tumor detection accuracy of the existing models in [1,2,25] is 81.9%, 86.3%, and 78%, respectively. Similarly, different performance results are observed for each method with respect to the number of data points. The average of ten comparison results indicates that the IRPS-BAC Method improves tumor detection accuracy on Cervical Cancer Risk Classification by 14%, 8%, and 19% when compared to the three state-of-the-art algorithms in [1,2,25].

5.2. Impact on Error Rate

Error rate is defined as the ratio of the number of data points that are incorrectly detected as a tumor or normal to the total number of data points. The error rate is calculated as

$$ER = \left(\frac{\text{Number of data points that are incorrectly detected}}{\text{Number of data points}}\right) \times 100 \tag{8}$$

From (8), ‘$ER$’ denotes the error rate. The error rate is calculated as a percentage (%).
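Equations (7) and (8) translate directly into code. The following minimal Python sketch is for illustration only; the function names and the five toy labels are hypothetical and are not taken from the experimental data.

```python
def tumor_detection_accuracy(y_true, y_pred):
    """Eq. (7): percentage of data points whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)

def error_rate(y_true, y_pred):
    """Eq. (8): percentage of data points whose predicted class differs from the true class."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return 100.0 * wrong / len(y_true)

# Toy labels (hypothetical): four of the five predictions are correct.
y_true = ["normal", "abnormal", "normal", "abnormal", "normal"]
y_pred = ["normal", "abnormal", "abnormal", "abnormal", "normal"]
tda = tumor_detection_accuracy(y_true, y_pred)
er = error_rate(y_true, y_pred)
print(tda, er)
```

With these two definitions, the accuracy and the error rate computed on the same set of data points always sum to 100%.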

Table 3 and Table 4 report the error rate of four methods, namely the three existing methods (the automatic method, cfDNA methylome, and Deep Tumor Network) and the proposed IRPS-BAC Method, for two datasets, namely the Epileptic Seizure Recognition Data Set and Cervical Cancer Risk Classification.

The various medical data points are collected from the database. The performance of the error rate of the proposed IRPS-BAC Method is reduced when compared to the existing techniques. The simulation graph with different error rate results is shown in Figure 5 and Figure 6.

Figure 5 and Figure 6 provide the simulation results of error rate for the four methods with respect to the number of data points. As the number of data points increases, the error rate of each method increases or decreases. Comparatively, the IRPS-BAC Method attained a lesser error rate than the three existing methods. The main reason for the lower error rate is the application of the feature selection and classification processes in the IRPS-BAC Method. The divergence value is determined for every feature, and the relevant features are chosen by the iterative reflect perceptual sammon feature selection process. Next, in the bagging classification process, the internal node processes the selected features and the leaf node makes the tumor decision (normal or cancerous) based on information gain. In this manner, the error rate is reduced.

For the Epileptic Seizure Recognition Data Set with 1000 input data points, the error rate is 7.6% using the IRPS-BAC Method, compared with 14.2% for the automatic method [1], 18.2% for cfDNA methylome [2], and 20.2% for Deep Tumor Network [25]. On average, the error rate on the Epileptic Seizure Recognition Data Set is reduced by 50%, 59%, and 64% when compared to [1,2,25], respectively. For Cervical Cancer Risk Classification, with 80 data points taken in the first iteration, the observed error rate using the IRPS-BAC Method is 6.9%, whereas the error rates of [1,2,25] are 18.1%, 13.7%, and 22%, respectively. The comparison results indicate that the error rate of the IRPS-BAC Method is reduced by 62%, 52%, and 70% when compared to [1,2,25], respectively.
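The formula behind these percentage reductions is not given explicitly. One plausible reading, assumed here, is the relative change with respect to each existing method. The short Python sketch below applies that reading to the error rates quoted above for 80 data points (6.9% against 18.1%, 13.7%, and 22%); the function name is hypothetical, and because the reported percentages appear to be averages over several runs, exact agreement with a single sample point is not expected.

```python
def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline` (assumed definition)."""
    return 100.0 * (baseline - proposed) / baseline

proposed_er = 6.9                   # IRPS-BAC error rate (%) quoted in the text
baseline_ers = [18.1, 13.7, 22.0]   # existing methods' error rates (%) quoted in the text
reductions = [round(relative_reduction(b, proposed_er), 1) for b in baseline_ers]
print(reductions)
```

For this single sample point the sketch gives reductions of roughly 62%, 50%, and 69%, which is close to, but not identical with, the reported figures.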

5.3. Impact on Tumor Detection Time

Tumor detection time is defined as the amount of time consumed to detect the tumor from the given input data points through the classification. The time is calculated using the given formula

$$TDT = D_n \times t(DSD) \tag{9}$$

From (9), ‘$D_n$’ represents the number of input data points and ‘$t(DSD)$’ symbolizes the time consumed for detecting a single data point. The tumor detection time is measured in milliseconds (ms).

Table 5 and Table 6 report the tumor detection time of four methods, namely the three existing methods (the automatic method, cfDNA methylome, and Deep Tumor Network) and the proposed IRPS-BAC Method, for two datasets, namely the Epileptic Seizure Recognition Data Set and Cervical Cancer Risk Classification. The medical data points are gathered from the database. The tumor detection time of the proposed IRPS-BAC Method is lower than that of the existing techniques. The simulation graph with different tumor detection time results is illustrated in Figure 7 and Figure 8.

Figure 7 and Figure 8 illustrate the simulation results of tumor detection time for the four methods with respect to the number of data points. As the number of data points increases, the tumor detection time of every method increases correspondingly. Comparatively, the IRPS-BAC Method attained lesser tumor detection time than the three existing methods. This is due to the feature selection and classification processes in the IRPS-BAC Method. Feature selection uses the divergence value to pick the significant features for efficient tumor detection. Further, the bagging classification process combines every weak learner into a strong one for classifying the data as normal or abnormal. This helps to reduce the tumor detection time. For the Epileptic Seizure Recognition Data Set with 9000 input data points, the tumor detection time is 56 ms using the IRPS-BAC Method, compared with 96 ms for the automatic method [1], 64 ms for cfDNA methylome [2], and 98 ms for the Deep Tumor Network [25]. On average, the tumor detection time on the Epileptic Seizure Recognition Data Set is reduced by 49%, 18%, and 55% when compared to [1,2,25], respectively. For Cervical Cancer Risk Classification, consider 400 data points for calculating the tumor detection time. The tumor detection time using the proposed IRPS-BAC Method is 0.8 ms, whereas the tumor detection time using [1,2,25] was found to be 1.48 ms, 1.2 ms, and 1.67 ms, respectively. For each method, various performance results are observed with respect to different counts of input. The tumor detection time of the IRPS-BAC Method is reduced by 47%, 30%, and 62% when compared to the three state-of-the-art algorithms in [1,2,25].

5.4. Throughput

Throughput is the number of data points successfully processed per unit of time. It is mathematically expressed as given below:

$$Throughput = \left[\frac{\text{Number of data points}_{exec}}{time(s)}\right]$$

where ‘$\text{Number of data points}_{exec}$’ denotes the number of data points executed and ‘$time(s)$’ denotes the time in seconds. Throughput is measured in data points per second (data/sec).
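Equation (9) and the throughput definition can be related through a small worked example. The Python sketch below is illustrative only: the function names are hypothetical, and the inputs (1000 data points at 0.5 ms each) are made-up values chosen for easy arithmetic rather than measurements from the experiments.

```python
def detection_time_ms(n_points, time_per_point_ms):
    """Eq. (9): total detection time as the point count times the per-point time."""
    return n_points * time_per_point_ms

def throughput(n_executed, elapsed_seconds):
    """Number of data points executed per second."""
    return n_executed / elapsed_seconds

# Hypothetical inputs: 1000 data points, 0.5 ms to detect a single data point.
n_points = 1000
total_ms = detection_time_ms(n_points, 0.5)
points_per_second = throughput(n_points, total_ms / 1000.0)
print(total_ms, points_per_second)
```

Under these assumed inputs the total detection time is 500 ms, which corresponds to a throughput of 2000 data points per second; a lower per-point detection time therefore translates directly into a higher throughput.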

Table 7 and Table 8 demonstrate the throughput of three different methods for two datasets. The various medical data points are gathered from the database. The performance of the throughput of the proposed IRPS-BAC Method is enhanced when compared to the obtainable techniques. The simulation graph with different results is illustrated in Figure 9 and Figure 10.

Figure 9 and Figure 10 compare the throughput of the four methods with respect to the number of data points. As the number of data points increases, the throughput of all four methods increases. Comparatively, the IRPS-BAC Method attained the highest throughput among the methods compared. The main reason for the higher throughput is the application of the feature selection and classification processes for efficient tumor detection. On average, the throughput on the Epileptic Seizure Recognition Data Set is increased by 14%, 27%, and 44% compared with [1,2,25], respectively. The throughput of the IRPS-BAC Method on the Cervical Cancer Risk Classification data set is improved by 11%, 22%, and 41% compared with the three state-of-the-art methods [1,2,25], respectively.

6. Conclusions

A new method, termed the IRPS-BAC Method, is introduced, combining an iterative reflect perceptual sammon feature selection process with a bagging classification process. The input medical data are gathered from the input databases. The iterative reflect perceptual sammon feature selection process selects the relevant features by computing the variation between features, and sammon mapping projects the similar and dissimilar features into the feature space. Then, the bagging classification process classifies the data as normal or cancerous based on information gain. This, in turn, helps to reduce the time complexity and error rate. The performance of the proposed IRPS-BAC Method is evaluated against the existing approaches in terms of tumor detection time, tumor detection accuracy, and error rate. The proposed IRPS-BAC Method improves tumor detection accuracy while requiring less time for tumor detection.
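The classification step summarised here rests on two standard ingredients: information gain for choosing splits, and a majority vote over learners trained on bootstrap samples. The following is a minimal, self-contained sketch of those ingredients on toy data. All function names and the toy labels are illustrative assumptions; this does not reproduce the IRPS-BAC implementation.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction obtained by splitting labels on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def bootstrap_sample(rows, rng):
    """Sample len(rows) items with replacement, as used to train each bagged learner."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Combine the predictions of several learners by taking the most common class."""
    return Counter(predictions).most_common(1)[0][0]


# Toy data (hypothetical): one feature separates the classes, the other does not.
labels = ["cancerous", "cancerous", "normal", "normal"]
informative = ["high", "high", "low", "low"]
uninformative = ["a", "b", "a", "b"]

print(information_gain(informative, labels))    # gain of the separating feature
print(information_gain(uninformative, labels))  # gain of the non-separating feature
print(majority_vote(["normal", "cancerous", "cancerous"]))
```

A feature that separates the classes perfectly yields the maximum gain, whereas a feature that leaves every subset as mixed as the original yields zero gain; an ID3-style tree would split on the former.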

Author Contributions

Conceptualization, S.S.C.B. and R.N.; methodology, S.S.C.B. and R.N.; software, R.N.; validation, S.S.C.B., R.N., G.H.L., F.F. and P.V.P.S.; formal analysis, R.N.; investigation, S.S.C.B. and R.N.; resources, S.S.C.B., R.N., G.H.L., F.F. and P.V.P.S.; writing—original draft preparation, S.S.C.B., R.N., G.H.L., F.F. and P.V.P.S.; writing—review and editing, S.S.C.B.; visualization, R.N.; supervision, S.S.C.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Available on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Duan, L.; Wang, Z.; Qiao, Y.; Wang, Y.; Huang, Z.; Zhang, B. An Automatic Method for Epileptic Seizure Detection Based on Deep Metric Learning. IEEE J. Biomed. Health Inform. 2022, 26, 2147–2157.
2. Lu, H.; Liu, Y.; Wang, J.; Fu, S.; Wang, L.; Huang, C.; Li, J.; Xie, L.; Wang, D.; Li, D.; et al. Detection of ovarian cancer using plasma cell-free DNA methylomes. Clin. Epigenetics 2022, 14, 74.
3. Daoud, H.; Bayoumi, M.A. Efficient Epileptic Seizure Prediction Based on Deep Learning. IEEE Trans. Biomed. Circuits Syst. 2019, 13, 804–813.
4. Islam, M.R.; Zhao, X.; Miao, Y.; Sugano, H.; Tanaka, T. Epileptic seizure focus detection from interictal electroencephalogram: A survey. Cogn. Neurodynamics 2022, 17, 1–23.
5. Kabir, E.; Siuly; Zhang, Y. Epileptic seizure detection from EEG signals using logistic model trees. Brain Inform. 2016, 3, 93–100.
6. Li, N.; Zhu, X.; Nian, W.; Li, Y.; Sun, Y.; Yuan, G.; Zhang, Z.; Yang, W.; Xu, J.; Lizaso, A.; et al. Blood-based DNA methylation profiling for the detection of ovarian cancer. Gynecol. Oncol. 2022, 167, 295–305.
7. Flaum, N.; Crosbie, E.J.; Edmondson, R.; Woodward, E.R.; Lalloo, F.; Smith, M.J.; Schlecht, H.; Evans, D.G. High detection rate from genetic testing in BRCA-negative women with familial epithelial ovarian cancer. Genet. Med. 2022, 24, 2578–2586.
8. Hou, J.Y.; Chapman, J.S.; Kalashnikova, E.; Pierson, W.; Smith-McCune, K.; Pineda, G.; Vattakalam, R.M.; Ross, A.; Mills, M.; Suarez, C.J.; et al. Circulating tumor DNA monitoring for early recurrence detection in epithelial ovarian cancer. Gynecol. Oncol. 2022, 167, 334–341.
9. Chen, D.; Wu, Y.; Tilley, R.D.; Gooding, J.J. Rapid and ultrasensitive electrochemical detection of DNA methylation for ovarian cancer diagnosis. Biosens. Bioelectron. 2022, 206, 114126.
10. Zhang, S.; Liu, G.; Xiao, R.; Cui, W.; Hu, J.C.; Sun, Y.; Qiu, J.; Qi, Y. A combination of statistical parameters for epileptic seizure detection and classification using VMD and NLTWSVM. Biocybern. Biomed. Eng. 2022, 42, 258–272.
11. EinShoka, A.A.; Dessouky, M.M.; El-Sayed, A.; Hemdan, E.E.-D. An efficient CNN based epileptic seizures detection framework using encrypted EEG signals for secure telemedicine applications. Alex. Eng. J. 2022, 65, 399–412.
12. Divya, P.; Devi, B.A. Hybrid metaheuristic algorithm enhanced support vector machine for epileptic seizure detection. Biomed. Signal Process. Control. 2022, 78, 103841.
13. Escorcia-Gutierrez, J.; Jimenez-Cabas, K.B.J.; Elhoseny, M.; Alshehri, M.D.; Selim, M.M. An automated deep learning enabled brain signal classification for epileptic seizure detection on complex measurement systems. Measurement 2022, 196, 111226.
14. Mohapatra, S.K.; Patnaik, S. ESA-ASO: An enhanced search ability based atom search optimization algorithm for epileptic seizure detection. Meas. Sens. 2022, 24, 100519.
15. Hassan, K.M.; Islam, M.R.; Nguyen, T.T.; Molla, M.K.I. Epileptic seizure detection in EEG using mutual information-based best individual feature selection. Expert Syst. Appl. 2022, 193, 116414.
16. Hussain, W.; Sadi, M.T.; Siuly, S.; Rehman, A.U. Epileptic seizure detection using 1 D-convolutional long short-term memory neural networks. Appl. Acoust. 2021, 177, 107941.
17. Anuragi, A.; Sisodia, D.S.; Pachori, R.B. Automated FBSE-EWT based learning framework for detection of epileptic seizures using time-segmented EEG signals. Comput. Biol. Med. 2021, 136, 104708.
18. Hussein, R.; Palangi, H.; Ward, R.K.; Wang, Z.J. Optimized deep neural network architecture for robust detection of epileptic seizures using EEG signals. Clin. Neurophysiol. 2019, 130, 25–37.
19. Mahmoodian, N.; Boese, A.; Friebe, M.; Haddadnia, J. Epileptic seizure detection using cross-bispectrum of electroencephalogram signal. Seizure 2019, 66, 4–11.
20. Supriya, S.; Siuly, S.; Wang, H.; Zhang, Y. New feature extraction for automated detection of epileptic seizure using complex network framework. Appl. Acoust. 2021, 180, 108098.
21. Bianchetti, G.; Taralli, S.; Vaccaro, M.; Indovina, L.; Mattoli, M.V.; Capotosti, A.; Scolozzi, V.; Calcagni, M.L.; Giordano, A.; De Spirito, M.; et al. Automated detection and classification of tumor histotypes on dynamic PET imaging data through machine-learning driven voxel classification. Comput. Biol. Med. 2022, 145, 105423.
22. Arif, M.; Ajesh, F.; Shamsudheen, S.; Geman, O.; Izdrui, D.; Vicoveanu, D. Brain Tumor Detection and Classification by MRI Using Biologically Inspired Orthogonal Wavelet Transform and Deep Learning Techniques. J. Healthc. Eng. 2022, 2022, 2693621.
23. Andrzejak, R.G.; Lehnertz, K.; Rieke, C.; Mormann, F.; David, P.; Elger, C.E. Indications of nonlinear deterministic and finite dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Phys. Rev. E 2001, 64, 061907.
24. Younis, A.; Qiang, L.; Nyatega, C.O.; Adamu, M.J.; Kawuwa, H.B. Brain Tumor Analysis Using Deep Learning and VGG-16 Ensembling Learning Approaches. Appl. Sci. 2022, 12, 7282.
25. Amran, G.A.; Alsharam, M.S.; Blajam, A.O.A.; Hasan, A.A.; Alfaifi, M.Y.; Amran, M.H.; Gumaei, A.; Eldin, S.M. Brain Tumor Classification and Detection Using Hybrid Deep Tumor Network. Electronics 2022, 11, 3457.
26. Fernandes, K.; Cardoso, J.S.; Fernandes, J. Transfer Learning with Partial Observability Applied to Cervical Cancer Screening. In Proceedings of the Pattern Recognition and Image Analysis: 8th Iberian Conference, IbPRIA 2017, Faro, Portugal, 20–23 June 2017; Springer International Publishing: Cham, Switzerland, 2017.

Figure 1. Architecture of proposed IRPS-BAC Method.

Figure 2. Feature selection process.

Figure 3. Measurement of tumor detection accuracy for Epileptic Seizure Recognition Data Set.

Figure 4. Measurement of tumor detection accuracy for Cervical Cancer Risk Classification.

Figure 5. Measurement of error rate for Epileptic Seizure Recognition Data Set.

Figure 6. Measurement of error rate for Cervical Cancer Risk Classification.

Figure 7. Measurement of tumor detection time for Epileptic Seizure Recognition Data Set.

Figure 8. Measurement of tumor detection time for Cervical Cancer Risk Classification.

Figure 9. Measurement of throughput for Epileptic Seizure Recognition Data Set.

Figure 10. Measurement of throughput for Cervical Cancer Risk Classification.

Table 1. Tabulation of tumor detection accuracy for Epileptic Seizure Recognition Data Set.

Number of Data Points	Automatic Method (%)	cfDNA Methylome (%)	Deep Tumor Network (%)	Proposed IRPS-BAC Method (%)
1000	85.8	81.8	79.8	92.4
2000	82.1	77.35	75	92.4
3000	81.6	76.3	74.85	91.4
4000	81.2	79.5	74.05	87
5000	83.4	77.5	73.85	93.7
6000	81.4	78.0	75.8	87.5
7000	78.5	75	73	91.2
8000	82.3	77	75	89.1
9000	79.5	77.5	74.2	95.3
10,000	85.7	83.5	80.3	91.9

Table 2. Tabulation of tumor detection accuracy for Cervical Cancer Risk Classification Data Set.

Number of Data Points	Automatic Method (%)	cfDNA Methylome (%)	Deep Tumor Network (%)	Proposed IRPS-BAC Method (%)
80	82.5	87.5	80.3	93.8
160	81.9	86.3	78	93.1
240	70.8	77.1	68.28	86.7
320	71.9	82.8	68.76	93.4
400	77.5	82.5	70	92
480	85.4	87.5	82.42	93.8
560	87.9	89.3	81.29	92.9
640	89.1	92.2	87.05	96.1
720	90.3	91.7	85.29	96.5
800	90	92.3	87.5	97.5

Table 3. Tabulation of error rate for Epileptic Seizure Recognition Data Set.

Number of Data Points	Automatic Method (%)	cfDNA Methylome (%)	Deep Tumor Network (%)	Proposed IRPS-BAC Method (%)
1000	14.2	18.2	20.2	7.6
2000	17.9	22.65	25	7.6
3000	18.4	23.7	25.15	8.6
4000	18.8	20.5	25.95	13
5000	16.6	22.5	26.15	6.3
6000	18.6	22	24.2	12.5
7000	21.5	25	27	8.8
8000	17.7	23	25	10.9
9000	20.5	22.5	25.8	4.7
10,000	14.3	16.5	19.7	8.1

Table 4. Tabulation of error rate for Cervical Cancer Risk Classification Data Set.

Number of Data Points	Automatic Method (%)	cfDNA Methylome (%)	Deep Tumor Network (%)	Proposed IRPS-BAC Method (%)
80	17.5	12.5	19.7	6.2
160	18.1	13.7	22	6.9
240	29.2	22.9	31.72	13.3
320	28.1	17.2	31.24	6.6
400	22.5	17.5	30	8
480	14.6	12.5	17.58	6.2
560	12.1	10.7	18.71	7.1
640	10.9	7.8	12.95	3.9
720	9.7	8.3	14.71	3.5
800	10	7.7	12.5	2.5

Table 5. Tabulation of tumor detection time for Epileptic Seizure Recognition Data Set.

Number of Data Points	Automatic Method (ms)	cfDNA Methylome (ms)	Deep Tumor Network (ms)	Proposed IRPS-BAC Method (ms)
1000	30	20	38	15
2000	54	36	63	26
3000	75	45	84	33
4000	88	48	97	36
5000	100	50	116	42.5
6000	108	54	115	48
7000	105	59.5	118	52.5
8000	96	64	105	56
9000	90	67.5	98	58.5
10,000	90	70	100	60

Table 6. Tabulation of tumor detection time for Cervical Cancer Risk Classification Data Set.

Number of Data Points	Automatic Method (ms)	cfDNA Methylome (ms)	Deep Tumor Network (ms)	Proposed IRPS-BAC Method (ms)
80	0.4	0.32	0.6	0.24
160	0.72	0.608	0.94	0.432
240	0.984	0.84	1.23	0.6
320	1.248	1.024	1.465	0.704
400	1.48	1.2	1.67	0.8
480	1.68	1.296	1.89	0.864
560	1.848	1.4	2.032	0.952
640	2.048	1.408	2.225	1.024
720	2.232	1.44	2.452	1.008
800	2.32	1.44	2.52	0.96

Table 7. Tabulation of throughput for Epileptic Seizure Recognition Data Set.

Number of Data Points	Automatic Method (Mbits/sec)	cfDNA Methylome (Mbits/sec)	Deep Tumor Network (Mbits/sec)	Proposed IRPS-BAC Method (Mbits/sec)
1000	146	110	85	180
2000	250	220	189	295
3000	355	325	270	416
4000	480	442	389	527
5000	525	505	445	638
6000	660	630	580	748
7000	770	710	650	854
8000	887	812	762	960
9000	978	912	872	1080
10,000	1160	1065	1003	1270

Table 8. Tabulation of throughput for Cervical Cancer Risk Classification Data Set.

Number of Data Points	Automatic Method (Mbits/sec)	cfDNA Methylome (Mbits/sec)	Deep Tumor Network (Mbits/sec)	Proposed IRPS-BAC Method (Mbits/sec)
80	194	163	105	246
160	305	275	229	357
240	397	361	310	462
320	531	487	439	585
400	620	574	505	673
480	734	686	640	782
560	804	740	690	866
640	908	857	802	975
720	1052	980	932	1132
800	1222	1156	1094	1295

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
