DA-Based Parameter Optimization of Combined Kernel Support Vector Machine for Cancer Diagnosis
Abstract
1. Introduction
2. Related Work
3. The Basic Idea of SVM and the Construction of the Combined Kernel Function
3.1. Linear SVM
3.2. Nonlinear SVM
4. The Construction of the Combined Kernel Function
- (1) Linear kernel function
- (2) Polynomial kernel function
- (3) RBF kernel function
- (4) Sigmoid kernel function
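The four kernels listed above, and one way to combine two of them, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the kernel forms follow the common LIBSVM-style definitions, and the combined kernel is assumed to be a convex combination of the RBF and polynomial kernels weighted by λ, consistent with the parameter set (C, gamma, d, λ) named in Algorithm 1. The function names, the `coef0` offset, and the default values are hypothetical.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = x . y
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, d=2):
    # K(x, y) = (gamma * x . y + coef0) ** d
    return (gamma * dot(x, y) + coef0) ** d


def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # K(x, y) = tanh(gamma * x . y + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)


def combined_kernel(x, y, lam=0.5, gamma=1.0, d=2):
    # Assumed form: lam * K_rbf + (1 - lam) * K_poly, with 0 <= lam <= 1.
    # The polynomial part uses gamma = 1 and coef0 = 1 for simplicity.
    local_part = rbf_kernel(x, y, gamma)
    global_part = polynomial_kernel(x, y, gamma=1.0, coef0=1.0, d=d)
    return lam * local_part + (1.0 - lam) * global_part


# Example: 0.5 * 1.0 (RBF of identical points) + 0.5 * 4.0 (polynomial) = 2.5
value = combined_kernel([1.0, 0.0], [1.0, 0.0], lam=0.5, gamma=1.0, d=2)
```

A convex combination of two valid kernels is itself a valid kernel, which is why λ is restricted to [0, 1] in this sketch. Setting λ to 1 recovers the pure RBF kernel and setting it to 0 recovers the pure polynomial kernel.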
5. Dragonfly Algorithm (DA)
- Separation, whose aim is to avoid the collision between individuals and their neighbors in the static swarm.
- Alignment, whose purpose is to match the individual velocity with others in the same group.
- Cohesion, which is used to indicate the tendency of individuals to move towards the center of the group.
- (1) Separation
- (2) Alignment
- (3) Cohesion
- (4) Attraction
- (5) Distraction
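The five behaviors above can be illustrated with a short sketch. The formulas below follow the standard formulation of the dragonfly algorithm by Mirjalili; it is an assumption that the paper's own equations match them term by term. Positions and step vectors are plain lists, and all function names are hypothetical.

```python
def separation(x, neighbors):
    # S = -sum_j (x - x_j): pushes the individual away from its neighbors.
    return [-sum(x[k] - nb[k] for nb in neighbors) for k in range(len(x))]


def alignment(neighbor_steps):
    # A = mean of the neighbors' step (velocity) vectors.
    n = len(neighbor_steps)
    dim = len(neighbor_steps[0])
    return [sum(v[k] for v in neighbor_steps) / n for k in range(dim)]


def cohesion(x, neighbors):
    # C = (mean neighbor position) - x: pulls toward the group center.
    n = len(neighbors)
    return [sum(nb[k] for nb in neighbors) / n - x[k] for k in range(len(x))]


def attraction(x, food):
    # F = x_food - x: pulls toward the best position found (food source).
    return [f - xi for f, xi in zip(food, x)]


def distraction(x, enemy):
    # E = x_enemy + x: moves the individual outward, away from the enemy.
    return [e + xi for e, xi in zip(enemy, x)]


# Small example with one individual at the origin and two neighbors.
x = [0.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 2.0]]
s_vec = separation(x, neighbors)   # [1.0, 2.0]
c_vec = cohesion(x, neighbors)     # [0.5, 1.0]
```

Each function returns one vector per individual; in the algorithm these vectors are weighted and summed to form the next step vector.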
6. Proposed Algorithm: DA-CKSVM
6.1. The Basic Process of DA-CKSVM
Algorithm 1: The main process of DA-CKSVM
Step 1: Set the maximum number of iterations, the number of dragonflies, and the upper and lower bounds of each parameter in the parameter set (C, gamma, d, λ).
Step 2: Initialize the step vectors, the weight coefficients in Equation (24), and the position of each individual.
Step 3: Train the SVM classifier with the training set and test it with the testing set.
Step 4: Evaluate the fitness value of each individual and update the enemy and food source.
Step 5: Update the weight coefficients.
Step 6: Calculate S, A, C, F, and E according to Equations (19)–(23).
Step 7: Update the neighboring radius.
Step 8: If the dragonfly has at least one neighbor, calculate its step vector and position vector according to Equations (24) and (25). Otherwise, update its position vector by Equation (26).
Step 9: Adjust the new position so that it stays within the parameter bounds.
Step 10: If the maximum number of iterations is reached, go to Step 11. Otherwise, return to Step 3.
Step 11: Output the final SVM classifier with the optimal parameters.
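The loop in Algorithm 1 can be sketched in a few dozen lines. This is an illustrative outline under stated assumptions, not the authors' code: the fitness function here is a toy stand-in for training and testing the SVM (Steps 3 and 4), the weight names `s`, `a`, `c`, `f`, `e`, `w` and their schedules follow the usual dragonfly algorithm conventions, the Lévy step uses a Mantegna-style generator, and the bounds in `BOUNDS` are hypothetical ranges for (C, gamma, d, λ).

```python
import math
import random


def levy_step(rng, beta=1.5):
    # Mantegna-style Levy-distributed step length (assumed generator).
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u, v = rng.gauss(0, sigma), rng.gauss(0, 1)
    return 0.01 * u / max(abs(v), 1e-12) ** (1 / beta)


def dragonfly_optimize(fitness, bounds, n_agents=30, n_iter=300, seed=0):
    """Maximize `fitness` over the box `bounds`; returns (best_x, best_f)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Steps 1-2: random positions inside the bounds, zero step vectors.
    X = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    dX = [[0.0] * dim for _ in range(n_agents)]
    best_x, best_f = None, -math.inf      # food source (best so far)
    worst_x, worst_f = None, math.inf     # enemy (worst so far)
    for t in range(n_iter):
        # Steps 3-4: evaluate every individual, update food and enemy.
        for x in X:
            score = fitness(x)
            if score > best_f:
                best_x, best_f = list(x), score
            if score < worst_f:
                worst_x, worst_f = list(x), score
        # Step 5: update the weights (assumed linear schedules).
        progress = t / n_iter
        w = 0.9 - 0.5 * progress
        decay = max(0.0, 0.1 * (1 - 2 * progress))
        s, a, c = (2 * rng.random() * decay for _ in range(3))
        f, e = 2 * rng.random(), decay
        # Step 7: the neighborhood radius grows with the iteration count.
        radius = [(hi - lo) * (0.25 + 2 * progress) for lo, hi in bounds]
        new_X, new_dX = [], []
        for i, x in enumerate(X):
            nbrs = [j for j in range(n_agents) if j != i and
                    all(abs(X[j][k] - x[k]) <= radius[k] for k in range(dim))]
            if nbrs:
                # Steps 6 and 8: weighted sum of the five behaviors plus inertia.
                n = len(nbrs)
                step = []
                for k in range(dim):
                    S = -sum(x[k] - X[j][k] for j in nbrs)
                    A = sum(dX[j][k] for j in nbrs) / n
                    C = sum(X[j][k] for j in nbrs) / n - x[k]
                    F = best_x[k] - x[k]
                    E = worst_x[k] + x[k]
                    step.append(s * S + a * A + c * C + f * F + e * E
                                + w * dX[i][k])
            else:
                # Step 8 (no neighbors): random walk with a Levy step.
                step = [levy_step(rng) * x[k] for k in range(dim)]
            # Step 9: clip the new position to the parameter bounds.
            pos = [min(max(x[k] + step[k], bounds[k][0]), bounds[k][1])
                   for k in range(dim)]
            new_X.append(pos)
            new_dX.append(step)
        X, dX = new_X, new_dX
    # Step 11: return the best parameters seen during the search.
    return best_x, best_f


# Hypothetical bounds for (C, gamma, d, lambda) and a toy fitness whose
# maximum lies inside the box; a real run would return classifier accuracy.
BOUNDS = [(0.01, 100.0), (0.001, 10.0), (1.0, 5.0), (0.0, 1.0)]
TARGET = [10.0, 0.5, 3.0, 0.7]


def toy_fitness(p):
    return -sum((v - t) ** 2 for v, t in zip(p, TARGET))


best_params, best_score = dragonfly_optimize(toy_fitness, BOUNDS,
                                             n_agents=10, n_iter=30)
```

In an actual DA-CKSVM run, `toy_fitness` would be replaced by a function that trains the SVM with the candidate parameters and returns its classification accuracy, and the polynomial degree d would need to be rounded to an integer before training.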
6.2. Fitness Function
7. Experimental Results and Discussion
7.1. Data Sets and Experimental Platform
7.2. Data Preprocessing
7.3. Cross-Validation
7.4. Experimental Results
8. Conclusions and Future Work
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Data Sets | Instances | Features | Classes |
---|---|---|---|
Breast Cancer Coimbra (BCC) | 116 | 10 | 2 |
Haberman’s Survival (HS) | 306 | 3 | 2 |
Hepatocellular Carcinoma (HCC) | 165 | 49 | 2 |
Thoracic Surgery (TS) | 470 | 17 | 2 |
Breast Cancer Wisconsin Diagnostic (BCWD) | 569 | 30 | 2 |
Breast Cancer Wisconsin Prognostic (BCWP) | 198 | 33 | 2 |
Diffuse Large B-cell Lymphoma (DLBCL_D) | 129 | 3795 | 4 |
Breast_A (B_A) | 98 | 1213 | 3 |
NAME | Detailed Settings |
---|---|
Hardware | |
Central Processing Unit (CPU) | Advanced Micro Devices (AMD) Ryzen 7 2700X |
Frequency | 3.70 GHz |
Random Access Memory (RAM) | 16 GB |
Hard Drive | 250 GB |
Software | |
Operating System | Windows 10 |
Programming Language | MATLAB R2014a |
Tool for support vector machine (SVM) | LIBSVM |
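Algorithm | Parameter | Value |
---|---|---|
Dragonfly algorithm (DA) | Number of dragonflies | 30 |
Dragonfly algorithm (DA) | Generations | 300 |
Particle swarm optimization (PSO) | | 2 |
Particle swarm optimization (PSO) | | 2 |
Particle swarm optimization (PSO) | Inertia w | 1 |
Particle swarm optimization (PSO) | Number of particles | 30 |
Particle swarm optimization (PSO) | Generations | 300 |
Bat algorithm (BA) | Minimum frequency | 0 |
Bat algorithm (BA) | Maximum frequency | 2 |
Bat algorithm (BA) | Loudness | 0.5 |
Bat algorithm (BA) | Pulse rate | 0.5 |
Bat algorithm (BA) | Number of bats | 30 |
Bat algorithm (BA) | Generations | 300 |
Genetic algorithm (GA) | Crossover ratio | 0.6 |
Genetic algorithm (GA) | Mutation ratio | 0.1 |
Genetic algorithm (GA) | Selection mechanism | Roulette wheel |
Genetic algorithm (GA) | Population size | 30 |
Genetic algorithm (GA) | Generations | 300 |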
Average Classification Accuracy and Standard Deviation (%)
Data Set | DA-CKSVM | DA-SVM | PSO-SVM | BA-SVM | GA-SVM |
---|---|---|---|---|---|
BCC | 84.00 ± 1.21 | 82.80 ± 0.00 | 82.81 ± 0.02 | 80.47 ± 3.44 | 75.67 ± 2.94 |
HS | 77.94 ± 0.56 | 76.95 ± 0.28 | 75.05 ± 2.76 | 76.78 ± 1.01 | 73.70 ± 2.91 |
HCC | 75.88 ± 1.15 | 77.43 ± 0.00 | 75.26 ± 5.07 | 76.47 ± 3.02 | 67.82 ± 6.59 |
TS | 85.19 ± 0.15 | 85.11 ± 0.00 | 82.04 ± 0.11 | 85.11 ± 0.00 | 81.83 ± 0.11 |
BCWD | 98.07 ± 0.00 | 98.30 ± 0.08 | 97.89 ± 0.18 | 96.71 ± 0.78 | 97.15 ± 0.07 |
BCWP | 82.06 ± 0.71 | 81.39 ± 0.16 | 81.34 ± 0.00 | 78.13 ± 0.87 | 77.46 ± 0.35 |
DLBCL_D | 80.53 ± 0.03 | 75.77 ± 0.00 | 75.77 ± 0.00 | 75.77 ± 0.00 | 40.01 ± 6.10 |
B_A | 95.00 ± 0.00 | 92.00 ± 0.00 | 92.00 ± 0.00 | 92.00 ± 0.00 | 56.61 ± 11.91 |
The Best Results (%)
Data Set | DA-CKSVM | DA-SVM | PSO-SVM | BA-SVM | GA-SVM |
---|---|---|---|---|---|
BCC | 86.21 | 82.80 | 82.88 | 82.88 | 81.97 |
HS | 78.77 | 77.47 | 77.16 | 77.47 | 76.83 |
HCC | 77.43 | 77.43 | 77.43 | 77.43 | 77.43 |
TS | 85.53 | 85.11 | 82.13 | 85.11 | 81.91 |
BCWD | 98.07 | 98.42 | 98.24 | 97.19 | 97.18 |
BCWP | 83.37 | 81.84 | 81.34 | 79.39 | 77.82 |
DLBCL_D | 80.58 | 75.77 | 75.77 | 75.77 | 57.37 |
B_A | 95.00 | 92.00 | 92.00 | 92.00 | 89.89 |
Comparison | BCC | HS | HCC | TS | BCWD | BCWP | DLBCL_D | B_A |
---|---|---|---|---|---|---|---|---|
DA-CKSVM vs. DA-SVM | <0.05 | <0.05 | <0.05 | 0.08 | <0.05 | <0.05 | <0.05 | <0.05 |
DA-CKSVM vs. PSO-SVM | <0.05 | <0.05 | 0.15 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 |
DA-CKSVM vs. BA-SVM | <0.05 | <0.05 | <0.05 | 0.08 | <0.05 | <0.05 | <0.05 | <0.05 |
DA-CKSVM vs. GA-SVM | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 |
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Xie, T.; Yao, J.; Zhou, Z. DA-Based Parameter Optimization of Combined Kernel Support Vector Machine for Cancer Diagnosis. Processes 2019, 7, 263. https://doi.org/10.3390/pr7050263