# Minimal Complexity Support Vector Machines for Pattern Classification

## Abstract

## 1. Introduction

## 2. L1 Support Vector Machines and Minimal Complexity Machines

#### 2.1. L1 Support Vector Machines

#### 2.2. Minimal Complexity Machines

## 3. Minimal Complexity L1 Support Vector Machines

#### 3.1. Architecture

#### 3.2. KKT Conditions

- $\alpha_i = 0$. Because $y_i\,b - y_i\,F_i + \xi_i \ge 0$ and $\xi_i = 0$, $$y_i\,b \ge y_i\,F_i, \quad \text{i.e.,} \quad b \ge F_i \ \text{if}\ y_i = 1; \quad b \le F_i \ \text{if}\ y_i = -1.$$
- $C > \alpha_i > 0$. Because $\beta_i > 0$, $\xi_i = 0$ is satisfied. Therefore, $$b = F_i.$$
- $\alpha_i = C$. Because $\beta_i = 0$, $\xi_i \ge 0$ is satisfied. Therefore, $$y_i\,b \le y_i\,F_i, \quad \text{i.e.,} \quad b \le F_i \ \text{if}\ y_i = 1; \quad b \ge F_i \ \text{if}\ y_i = -1.$$
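Taken together, the three cases pin the bias down to an interval: indices with $\alpha_i = 0$ or $\alpha_i = C$ bound $b$ from one side depending on $y_i$, while unbounded support vectors ($C > \alpha_i > 0$) force $b = F_i$. A minimal sketch of this bookkeeping (the function name and array layout are mine, not from the paper):

```python
def bias_interval(alpha, y, F, C, tol=1e-12):
    """Interval [b_lo, b_up] that the bias b must lie in, per the KKT cases above."""
    b_lo, b_up = float("-inf"), float("inf")
    for a_i, y_i, F_i in zip(alpha, y, F):
        if a_i <= tol:                 # alpha_i = 0
            if y_i == 1:
                b_lo = max(b_lo, F_i)  # b >= F_i
            else:
                b_up = min(b_up, F_i)  # b <= F_i
        elif a_i >= C - tol:           # alpha_i = C
            if y_i == 1:
                b_up = min(b_up, F_i)  # b <= F_i
            else:
                b_lo = max(b_lo, F_i)  # b >= F_i
        else:                          # C > alpha_i > 0: b = F_i
            b_lo = max(b_lo, F_i)
            b_up = min(b_up, F_i)
    return b_lo, b_up
```

At the optimum the interval is nonempty; a violated pair (a lower bound above an upper bound) is exactly what SMO-type working set selection looks for.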

- $\alpha_{M+i} = 0$. From $h + y_i\,F_i - y_i\,b - 1 \ge 0$, $$y_i\,b - h \le y_i\,F_i - 1, \quad \text{i.e.,} \quad b - h \le F_i - 1 \ \text{if}\ y_i = 1; \quad b + h \ge F_i + 1 \ \text{if}\ y_i = -1.$$
- $C_h > \alpha_{M+i} > 0$. $$y_i\,b - h = y_i\,F_i - 1, \quad \text{i.e.,} \quad b - h = F_i - 1 \ \text{if}\ y_i = 1; \quad b + h = F_i + 1 \ \text{if}\ y_i = -1.$$
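Both classes collapse into the single comparison $y_i\,b - h \lessgtr y_i\,F_i - 1$, which makes the conditions easy to verify mechanically. A small checker as a sketch (the $\alpha_{M+i} = C_h$ branch is not listed above, so that case is my assumption, mirroring the bounded case of Subproblem 1):

```python
def kkt_subproblem2(a, y_i, F_i, b, h, Ch, tol=1e-6):
    """Check one alpha_{M+i} against the KKT conditions above (sketch)."""
    lhs = y_i * b - h            # equals y_i*F_i - 1 at interior points
    rhs = y_i * F_i - 1.0
    if a <= tol:                 # alpha_{M+i} = 0
        return lhs <= rhs + tol
    if a < Ch - tol:             # C_h > alpha_{M+i} > 0: equality
        return abs(lhs - rhs) <= tol
    return lhs >= rhs - tol      # alpha_{M+i} = C_h (assumed, by symmetry)
```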

#### 3.3. Variant of Minimal Complexity Support Vector Machines

## 4. Training Methods

#### 4.1. Calculating Corrections by Newton’s Method

#### 4.1.1. Subprogram 1

#### 4.1.2. Subprogram 2

#### 4.2. Working Set Selection

- If $V_1$ is the maximum in (97), we optimize Subproblem 1 (the $\alpha_i$). Let the variable pair associated with $b_{\mathrm{up}}$ and $b_{\mathrm{low}}$ be $\alpha_{i_{\min}}$ and $\alpha_{i_{\max}}$, respectively.
- If $\eta = 0$ and either $V_2$ or $V_3$ is the maximum in (97), or if $\eta \ne 0$ and exactly one of $V_2$ and $V_3$ exceeds $\tau$, we optimize Subproblem 2 (the $\alpha_{M+i}$ belonging to either Class 1 or Class 2). Let the variable pair associated with $b_{\mathrm{up}}^{k}$ and $b_{\mathrm{low}}^{k}$ (where $k$ is $+$ or $-$) be $\alpha_{M+i_{\min}^{k}}$ and $\alpha_{M+i_{\max}^{k}}$, respectively.
- If $\eta \ne 0$ and both $V_2$ and $V_3$ exceed $\tau$, we optimize Subproblem 2 (the $\alpha_{M+i}$ selected from Classes 1 and 2). Let the variable pair be $\alpha_{M+i_{\min}^{-}}$ and $\alpha_{M+i_{\max}^{+}}$. This makes the selected variables correctable, as will be shown in Section 4.4.2.
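The three rules amount to a small dispatcher. A sketch, assuming the $V_k$ are the violation measures defined in (97) with larger values meaning stronger violation (the return labels are illustrative, not the paper's notation):

```python
def select_subproblem(V1, V2, V3, eta, tau):
    """Pick which subproblem to optimize next, per the three rules above."""
    if V1 >= V2 and V1 >= V3:
        return "Subproblem 1"                 # correct a pair of alpha_i
    if eta != 0 and V2 > tau and V3 > tau:
        return "Subproblem 2, both classes"   # alpha_{M+i} from Classes 1 and 2
    return "Subproblem 2, one class"          # alpha_{M+i} from a single class
```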

#### 4.3. Training Procedure of ML1 SVM

1. (Initialization) Select $\alpha_i$ and $\alpha_j$ in opposite classes and set $\alpha_i = \alpha_j = C$; $\alpha_k = 0$ for $k \ne i, j$, $k = 1, \dots, M$; $\alpha_{M+i} = \alpha_{M+j} = a\,C_h$; and $\alpha_{M+k} = 0$ for $k \ne i, j$, $k = 1, \dots, M$, where $a \le 0.5$.
2. (Corrections) If Pr1 (Subprogram 1) is selected, calculate the partial derivatives (76) and (77), calculate the corrections by (75), and modify the variables by (81). Otherwise, if Pr2 is selected, calculate the partial derivatives (87) and (88), calculate the corrections by (75), and modify the variables by (94).
3. (Convergence check) Update $F_i$ and calculate $b_{\mathrm{up}}$, $b_{\mathrm{low}}$, $b_{\mathrm{up}}^{+}$, $b_{\mathrm{low}}^{+}$, $b_{\mathrm{up}}^{-}$, and $b_{\mathrm{low}}^{-}$. If (96) is satisfied, stop training. Otherwise, select Pr1 if $V_1$ is the maximum and Pr2 otherwise, and calculate the SMO variables.
4. (Loop detection and working set selection) Perform the loop detection and working set selection described in the previous section and go to Step 2.
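The overall cycle (initialize, correct a variable pair, check convergence through the $b_{\mathrm{up}}$/$b_{\mathrm{low}}$ bounds) follows the familiar SMO skeleton. The fragment below is *not* the ML1 SVM method of this paper (it omits Subprogram 2, the Newton corrections, and loop detection); it is a minimal two-variable SMO for a plain linear-kernel L1 SVM, included only to make that cycle concrete:

```python
import numpy as np

def smo_l1svm(X, y, C=1.0, tol=1e-3, max_iter=1000):
    """Minimal b_up/b_low-style SMO for a linear-kernel L1 SVM (sketch)."""
    K = X @ X.T
    n = len(y)
    alpha = np.zeros(n)
    F = -y.astype(float)      # F_k = sum_j alpha_j y_j K(x_j, x_k) - y_k
    for _ in range(max_iter):
        # index sets whose F values bound the bias from above / below
        up = [k for k in range(n)
              if (y[k] == 1 and alpha[k] < C) or (y[k] == -1 and alpha[k] > 0)]
        low = [k for k in range(n)
               if (y[k] == 1 and alpha[k] > 0) or (y[k] == -1 and alpha[k] < C)]
        i_up = min(up, key=lambda k: F[k])    # attains b_up
        i_lo = max(low, key=lambda k: F[k])   # attains b_low
        if F[i_lo] - F[i_up] < tol:           # convergence check
            break
        eta = K[i_lo, i_lo] + K[i_up, i_up] - 2.0 * K[i_lo, i_up]
        if eta <= 0:
            break
        a1, a2 = alpha[i_lo], alpha[i_up]
        s = y[i_lo] * y[i_up]
        # two-variable update keeping sum_k y_k alpha_k = 0, clipped to [0, C]
        a2_new = a2 + y[i_up] * (F[i_lo] - F[i_up]) / eta
        if s < 0:
            L, H = max(0.0, a2 - a1), min(C, C + a2 - a1)
        else:
            L, H = max(0.0, a1 + a2 - C), min(C, a1 + a2)
        a2_new = min(max(a2_new, L), H)
        a1_new = a1 + s * (a2 - a2_new)
        F += (a1_new - a1) * y[i_lo] * K[i_lo] + (a2_new - a2) * y[i_up] * K[i_up]
        alpha[i_lo], alpha[i_up] = a1_new, a2_new
    b = -0.5 * (F[i_up] + F[i_lo])            # bias estimate from the two bounds
    return alpha, b
```

The ML1 SVM replaces the single closed-form pair update with Newton-computed corrections for the selected subprogram, but the stopping test and the role of the bound-attaining indices are the same.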

#### 4.4. Convergence Proof

#### 4.4.1. Convergence Proof for Subprogram 1

- $\Delta\alpha_{i_{\max}} > 0$ and $\Delta\alpha_{i_{\min}} < 0$ for $y_{i_{\min}} = y_{i_{\max}} = 1$,
- $\Delta\alpha_{i_{\max}} < 0$ and $\Delta\alpha_{i_{\min}} > 0$ for $y_{i_{\min}} = y_{i_{\max}} = -1$,
- $\Delta\alpha_{i_{\max}} < 0$ and $\Delta\alpha_{i_{\min}} < 0$ for $y_{i_{\min}} = 1,\ y_{i_{\max}} = -1$,
- $\Delta\alpha_{i_{\max}} > 0$ and $\Delta\alpha_{i_{\min}} > 0$ for $y_{i_{\min}} = -1,\ y_{i_{\max}} = 1$.
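The four cases form a fixed sign pattern, encoded below as a trivial lookup to make it explicit (the function name is mine):

```python
def correction_signs(y_i_min, y_i_max):
    """Signs of (delta alpha_{i_max}, delta alpha_{i_min}) per the cases above."""
    if y_i_min == 1 and y_i_max == 1:
        return (+1, -1)
    if y_i_min == -1 and y_i_max == -1:
        return (-1, +1)
    if y_i_min == 1 and y_i_max == -1:
        return (-1, -1)
    return (+1, +1)                     # y_i_min = -1, y_i_max = 1
```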

#### 4.4.2. Convergence Proof for Subprogram 2

## 5. Computer Experiments

#### 5.1. Analysis of Behaviors

#### 5.2. Performance Comparison

#### 5.2.1. Comparison Conditions

#### 5.2.2. Two-Class Problems

#### 5.2.3. Multiclass Problems

#### 5.3. Training Time Comparison

#### 5.4. Discussions

## 6. Conclusions

## Funding

## Conflicts of Interest


| Problem | Inputs | Training Data | Test Data | Sets | Prior (%) |
|---|---|---|---|---|---|
| Banana | 2 | 400 | 4900 | 100 | 54.62 (55.21) |
| Breast cancer | 9 | 200 | 77 | 100 | 70.59 (71.19) |
| Diabetes | 8 | 468 | 300 | 100 | 65.03 (65.22) |
| Flare-solar | 9 | 666 | 400 | 100 | 55.25 (55.27) |
| German | 20 | 700 | 300 | 100 | 69.92 (70.18) |
| Heart | 13 | 170 | 100 | 100 | 55.53 (55.60) |
| Image | 18 | 1300 | 1010 | 20 | 57.40 (56.81) |
| Ringnorm | 20 | 400 | 7000 | 100 | 50.27 (50.50) |
| Splice | 60 | 1000 | 2175 | 20 | 51.71 (52.00) |
| Thyroid | 5 | 140 | 75 | 100 | 69.51 (70.25) |
| Titanic | 3 | 150 | 2051 | 100 | 67.83 (67.69) |
| Twonorm | 20 | 400 | 7000 | 100 | 50.52 (50.01) |
| Waveform | 21 | 400 | 4600 | 100 | 66.90 (67.07) |

| Problem | ML1 SVM | ML1${}_{\mathbf{v}}$ SVM | L1 SVM | MLP SVM | LS SVM | ULDM |
|---|---|---|---|---|---|---|
| Banana | 89.18 ± 0.70 | 89.10 ± 0.70 | 89.17 ± 0.72 | 89.07 ± 0.73 | 89.17 ± 0.66 | 89.13 ± 0.72 |
| Cancer | 73.03 ± 4.45 | 73.12 ± 4.43 | 73.03 ± 4.51 | 72.81 ± 4.59 | 73.13 ± 4.68 | 73.82${}^{-}$ ± 4.44 |
| Diabetes | 76.17 ± 2.25 | 76.33 ± 1.94 | 76.29 ± 1.73 | 76.05 ± 1.74 | 76.19 ± 2.00 | 76.50 ± 1.94 |
| Flare-solar | 66.98 ± 2.14 | 66.99 ± 2.16 | 66.99 ± 2.12 | 66.62 ± 3.10 | 66.25${}^{+}$ ± 1.98 | 66.34${}^{+}$ ± 1.94 |
| German | 75.91 ± 2.03 | 75.97 ± 2.21 | 75.95 ± 2.24 | 75.63 ± 2.57 | 76.10 ± 2.10 | 76.15 ± 2.29 |
| Heart | 82.84 ± 3.26 | 82.96 ± 3.25 | 82.82 ± 3.37 | 82.52 ± 3.27 | 82.49 ± 3.60 | 82.70 ± 3.66 |
| Image | 97.29 ± 0.44 | 97.29 ± 0.47 | 97.16 ± 0.41 | 96.47${}^{+}$ ± 0.87 | 97.52 ± 0.54 | 97.15 ± 0.68 |
| Ringnorm | 98.12 ± 0.36 | 97.97 ± 1.11 | 98.14 ± 0.35 | 97.97${}^{+}$ ± 0.37 | 98.19 ± 0.33 | 98.16 ± 0.35 |
| Splice | 89.05 ± 0.83 | 88.99 ± 0.83 | 88.89 ± 0.91 | 86.71${}^{+}$ ± 1.27 | 88.98 ± 0.70 | 89.13 ± 0.60 |
| Thyroid | 95.32 ± 2.41 | 95.37 ± 2.50 | 95.35 ± 2.44 | 95.12 ± 2.38 | 95.08 ± 2.55 | 95.29 ± 2.34 |
| Titanic | 77.37 ± 0.81 | 77.40 ± 0.79 | 77.39 ± 0.74 | 77.41 ± 0.77 | 77.39 ± 0.83 | 77.40 ± 0.85 |
| Twonorm | 97.36 ± 0.28 | 97.38 ± 0.25 | 97.38 ± 0.26 | 97.13${}^{+}$ ± 0.29 | 97.43 ± 0.27 | 97.43 ± 0.25 |
| Waveform | 89.72 ± 0.73 | 89.67 ± 0.75 | 89.76 ± 0.66 | 89.39${}^{+}$ ± 0.53 | 90.05${}^{-}$ ± 0.59 | 90.24${}^{-}$ ± 0.50 |
| Average (B/S/W) | 85.26 (1/3/1) | 85.27 (3/3/1) | 85.26 (1/2/0) | 84.84 (1/0/9) | 85.23 (3/4/3) | 85.34 (6/2/0) |
| W/T/L | — | 0/13/0 | 0/13/0 | 5/8/0 | 1/11/1 | 1/10/2 |

| Problem | ML1 SVM | ML1${}_{\mathbf{v}}$ SVM | L1 SVM | LS SVM | ULDM |
|---|---|---|---|---|---|
| Banana | 88.97 ± 0.69 | 89.01 ± 0.62 | 89.01 ± 0.62 | 88.07${}^{+}$ ± 1.00 | 82.31${}^{+}$ ± 2.49 |
| Cancer | 72.90 ± 5.10 | 71.66${}^{+}$ ± 4.86 | 72.84 ± 5.25 | 72.75 ± 4.61 | 72.75 ± 4.71 |
| Diabetes | 76.19 ± 1.75 | 76.22 ± 1.67 | 76.29 ± 1.75 | 76.39 ± 1.91 | 76.29 ± 1.66 |
| Flare-solar | 67.30 ± 2.01 | 67.26 ± 2.12 | 67.19 ± 2.17 | 66.46${}^{+}$ ± 1.92 | 67.09 ± 1.97 |
| German | 75.62 ± 2.16 | 75.71 ± 2.23 | 75.79 ± 2.26 | 75.70 ± 2.05 | 75.23${}^{+}$ ± 1.92 |
| Heart | 82.77 ± 3.62 | 82.93 ± 3.25 | 82.85 ± 3.46 | 83.60${}^{-}$ ± 3.39 | 83.22${}^{-}$ ± 3.48 |
| Image | 96.59 ± 0.51 | 96.62 ± 0.51 | 96.74 ± 0.47 | 97.01${}^{-}$ ± 0.43 | 95.35${}^{+}$ ± 0.55 |
| Ringnorm | 93.29 ± 1.02 | 93.31 ± 0.94 | 93.39 ± 0.95 | 92.43${}^{+}$ ± 0.85 | 94.71${}^{-}$ ± 0.73 |
| Splice | 87.27 ± 0.79 | 86.00${}^{+}$ ± 1.47 | 87.67 ± 0.68 | 86.22${}^{+}$ ± 0.71 | 87.62 ± 0.67 |
| Thyroid | 95.09 ± 2.58 | 94.99 ± 2.59 | 95.04 ± 2.68 | 91.37${}^{+}$ ± 3.41 | 89.99${}^{+}$ ± 3.56 |
| Titanic | 77.60 ± 0.72 | 77.63 ± 0.66 | 77.61 ± 0.68 | 77.52 ± 0.74 | 77.59 ± 0.74 |
| Twonorm | 97.30 ± 0.42 | 97.25 ± 0.43 | 97.42 ± 0.34 | 97.47${}^{-}$ ± 0.24 | 97.14${}^{+}$ ± 0.51 |
| Waveform | 89.14 ± 0.73 | 89.16 ± 0.86 | 89.13 ± 0.76 | 89.70${}^{-}$ ± 0.71 | 90.00${}^{-}$ ± 0.62 |
| Average (B/S/W) | 84.62 (3/0/2) | 84.44 (2/2/2) | 84.69 (3/7/0) | 84.21 (4/1/3) | 83.79 (2/3/5) |
| W/T/L | — | 2/11/0 | 0/13/0 | 5/4/4 | 5/5/3 |

| Problem | Inputs | Classes | Training Data | Test Data | Prior (%) |
|---|---|---|---|---|---|
| Numeral | 12 | 10 | 810 | 820 | 10.00 (10.00) |
| Thyroid | 21 | 3 | 3772 | 3428 | 92.47 (92.71) |
| Blood cell | 13 | 12 | 3097 | 3100 | 12.92 (12.90) |
| Hiragana-50 | 50 | 39 | 4610 | 4610 | 12.90 (5.64) |
| Hiragana-13 | 13 | 38 | 8375 | 8356 | 6.29 (6.29) |
| Hiragana-105 | 105 | 38 | 8375 | 8356 | 6.29 (6.29) |
| Satimage | 36 | 6 | 4435 | 2000 | 24.17 (23.50) |
| USPS | 256 | 10 | 7291 | 2007 | 16.38 (17.89) |
| MNIST | 784 | 10 | 10,000 | 60,000 | 11.35 (11.23) |
| Letter | 16 | 26 | 16,000 | 4000 | 4.05 (4.20) |

| Problem | ML1 SVM | ML1${}_{\mathbf{v}}$ SVM | L1 SVM | MLP SVM | LS SVM | ULDM |
|---|---|---|---|---|---|---|
| Numeral | 99.76 | 99.76 | 99.76 | 99.27 | 99.15 | 99.39 |
| Thyroid | 97.26 | 97.23 | 97.26 | — | 95.39 | 95.27 |
| Blood cell | 93.45 | 93.55 | 93.16 | 93.36 | 94.23 | 94.32 |
| Hiragana-50 | 99.11 | 99.22 | 99.00 | 98.96 | 99.48 | 98.96 |
| Hiragana-13 | 99.89 | 99.94 | 99.79 | 99.90 | 99.87 | 99.89 |
| Hiragana-105 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 |
| Satimage | 91.85 | 91.85 | 91.90 | 91.10 | 91.95 | 92.25 |
| USPS | 95.42 | 95.37 | 95.27 | 95.17 | 95.47 | 95.42 |
| MNIST | 96.96 | 96.95 | 96.55 | — | 96.99 | 97.03 |
| Letter | 98.03 | 98.03 | 97.85 | — | 97.88 | 97.75 |
| Average (B/S/W) | 97.17 (4/1/0) | 97.19 (4/1/0) | 97.05 (3/0/3) | — (1/1/3) | 97.04 (3/3/1) | 97.03 (4/1/3) |

| Problem | ML1 SVM | ML1${}_{\mathbf{v}}$ SVM | L1 SVM | LS SVM | ULDM |
|---|---|---|---|---|---|
| Numeral | 99.63 | 99.63 | 99.63 | 99.02 | 99.27 |
| Thyroid | 97.38 | 97.38 | 97.38 | 94.66 | 94.66 |
| Blood cell | 94.32 | 93.77 | 92.13 | 94.39 | 94.55 |
| Hiragana-50 | 98.92 | 99.05 | 98.81 | 99.24 | 98.76 |
| Hiragana-13 | 99.75 | 99.77 | 99.64 | 99.89 | 99.44 |
| Hiragana-105 | 100.00 | 100.00 | 99.99 | 100.00 | 100.00 |
| Satimage | 89.25 | 89.25 | 89.25 | 90.30 | 88.85 |
| USPS | 94.92 | 95.27 | 94.42 | 95.42 | 95.37 |
| MNIST | 96.14 | 96.39 | 95.85 | 96.84 | 96.54 |
| Letter | 96.90 | 97.10 | 96.10 | 97.53 | 96.10 |
| Average (B/S/W) | 96.72 (3/1/0) | 96.76 (3/4/0) | 96.32 (2/1/5) | 96.73 (7/1/2) | 96.35 (2/2/5) |

| Problem | ML1 SVM (RBF) | ML1 SVM (Poly) | ML1${}_{\mathbf{v}}$ SVM (RBF) | ML1${}_{\mathbf{v}}$ SVM (Poly) | L1 SVM (RBF) | L1 SVM (Poly) | LS SVM (RBF) | LS SVM (Poly) | ULDM (RBF) | ULDM (Poly) |
|---|---|---|---|---|---|---|---|---|---|---|
| Numeral | 99.63 | 99.51 | 99.63 | 99.63 | 99.63 | 99.51 | 99.63 | 99.51 | 99.51 | 99.51 |
| Thyroid | 97.61 | 98.20 | 97.61 | 98.20 | 97.59 | 98.20 | 95.97 | 95.71 | 95.81 | 94.75 |
| Blood cell | 94.87 | 94.77 | 94.93 | 94.74 | 94.87 | 94.64 | 94.83 | 94.87 | 94.80 | 94.70 |
| Hiragana-50 | 99.70 | 99.63 | 99.72 | 99.65 | 99.57 | 99.46 | 99.67 | 99.63 | 99.63 | 99.59 |
| Hiragana-13 | 99.81 | 99.76 | 99.80 | 99.77 | 99.64 | 99.51 | 99.86 | 99.90 | 99.83 | 99.63 |
| Hiragana-105 | 99.95 | 99.86 | 99.94 | 99.86 | 99.89 | 99.79 | 99.98 | 99.92 | 99.95 | 99.89 |
| Satimage | 92.60 | 89.83 | 92.56 | 89.90 | 92.47 | 89.83 | 92.72 | 89.85 | 92.36 | 89.11 |
| USPS | 98.37 | 98.01 | 98.37 | 98.24 | 98.27 | 97.64 | 98.44 | 98.42 | 98.46 | 98.31 |
| MNIST | 97.56 | 96.86 | 97.58 | 97.14 | 97.31 | 96.68 | 97.60 | 97.48 | 97.65 | 97.14 |
| Letter | 97.64 | 96.97 | 97.72 | 97.10 | 97.43 | 95.52 | 97.83 | 97.37 | 97.73 | 96.18 |
| Average | 97.77 | 97.34 | 97.79 | 97.42 | 97.67 | 97.08 | 97.65 | 97.27 | 97.57 | 96.88 |

**Table 8.** Selected parameter values for the RBF kernels. For two-class problems, the most frequently selected values ($\gamma$, $C$, $C_h$) are shown.

| Data | ML1 SVM | ML1${}_{\mathbf{v}}$ SVM | L1 SVM | LS SVM | ULDM |
|---|---|---|---|---|---|
| Banana | 10, 1, 1 | 50, 1, 1 | 20, 1 | 50, 10 | 50, 10${}^{4}$ |
| B. cancer | 0.5, 1, 1 | 0.5, 1, 1 | 0.5, 1 | 5, 1 | 10, 10 |
| Diabetes | 0.1, 50, 1 | 0.1, 1, 1 | 0.1, 500 | 0.5, 1 | 5, 100 |
| Flare-solar | 0.01, 50, 1 | 0.01, 50, 1 | 0.01, 50 | 0.01, 10 | 0.01, 0.1 |
| German | 0.1, 1, 1 | 0.1, 1, 1 | 0.1, 1 | 0.1, 1 | 10, 100 |
| Heart | 0.01, 500, 1 | 0.01, 500, 1 | 0.01, 100 | 0.01, 10 | 0.01, 10${}^{8}$ |
| Image | 100, 10, 1 | 100, 50, 1 | 100, 50 | 50, 50 | 15, 10${}^{8}$ |
| Ringnorm | 100, 0.1, 1 | 50, 1, 1 | 50, 1 | 50, 0.1 | 50, 10 |
| Splice | 10, 10, 10 | 5, 10, 1 | 10, 10 | 10, 10 | 10, 10${}^{4}$ |
| Thyroid | 5, 50, 1 | 5, 500, 1 | 5, 50 | 100, 1 | 50, 10 |
| Titanic | 0.01, 50, 1 | 0.01, 50, 1 | 0.01, 50 | 0.01, 10 | 0.01, 10${}^{4}$ |
| Twonorm | 0.01, 50, 1 | 0.01, 50, 1 | 0.01, 1 | 0.01, 50 | 0.01, 1000 |
| Waveform | 5, 1, 1 | 50, 1, 1 | 15, 1 | 20, 1 | 50, 100 |
| Numeral | 5, 10, 1 | 5, 10, 1 | 5, 10 | 1, 100 | 15, 10${}^{4}$ |
| Thyroid (m) | 10, 2000, 1 | 10, 2000, 10 | 10, 2000 | 50, 2000 | 200, 10${}^{8}$ |
| Blood cell | 5, 100, 100 | 5, 100, 50 | 5, 100 | 5, 500 | 5, 10${}^{8}$ |
| Hiragana-50 | 5, 100, 50 | 5, 100, 50 | 5, 100 | 10, 100 | 10, 10${}^{6}$ |
| Hiragana-13 | 50, 50, 500 | 50, 50, 50 | 15, 1000 | 15, 2000 | 20, 10${}^{8}$ |
| Hiragana-105 | 20, 10, 500 | 20, 10, 50 | 10, 10 | 15, 2000 | 10, 10${}^{6}$ |
| Satimage | 200, 10, 10 | 200, 10, 10 | 200, 10 | 200, 10 | 200, 10${}^{4}$ |
| USPS | 10, 50, 2000 | 10, 50, 2000 | 10, 100 | 5, 500 | 5, 10${}^{8}$ |
| MNIST | 20, 10, 500 | 20, 10, 500 | 20, 10 | 10, 50 | 10, 10${}^{6}$ |
| Letter | 100, 10, 10 | 100, 10, 100 | 200, 10 | 50, 50 | 50, 10${}^{6}$ |

| Data | ML1 SVM | ML1${}_{\mathbf{v}}$ SVM | L1 SVM | LS SVM | ULDM |
|---|---|---|---|---|---|
| Banana | 0.096 | 0.053 | 0.067 | 0.192 | 0.254 |
| B. cancer | 0.006 | 0.005 | 0.005 | 0.010 | 0.018 |
| Diabetes | 0.025 | 0.026 | 0.029 | 0.119 | 0.222 |
| Flare-solar | 0.057 | 0.059 | 0.055 | 0.341 | 0.693 |
| German | 0.059 | 0.055 | 0.059 | 0.418 | 0.783 |
| Heart | 0.004 | 0.004 | 0.005 | 0.010 | 0.013 |
| Image | 0.306 | 0.354 | 0.327 | 8.24 | 21.7 |
| Ringnorm | 0.226 | 0.141 | 0.130 | 0.362 | 0.420 |
| Splice | 5.29 | 1.79 | 8.52 | 3.77 | 8.52 |
| Thyroid | 0.003 | 0.002 | 0.002 | 0.005 | 0.008 |
| Titanic | 0.017 | 0.016 | 0.017 | 0.028 | 0.031 |
| Twonorm | 0.244 | 0.250 | 0.336 | 0.422 | 0.484 |
| Waveform | 0.122 | 0.156 | 0.106 | 0.268 | 0.334 |
| Numeral | 0.125 | 0.125 | 0.047 | 1.17 | 1.14 |
| Thyroid (m) | 1.53 | 1.52 | 0.938 | 621 | 1452 |
| Blood cell | 2.91 | 3.08 | 0.734 | 33.1 | 49.7 |
| Hiragana-50 | 47.3 | 97.7 | 8.67 | 244 | 268 |
| Hiragana-13 | 348 | 295 | 9.67 | 740 | 920 |
| Hiragana-105 | 950 | 920 | 48.5 | 1779 | 1997 |
| Satimage | 27.4 | 28.9 | 19.7 | 292 | 693 |
| USPS | 513 | 634 | 35.5 | 1089 | 1996 |
| MNIST | 6143 | 6670 | 1435 | 8323 | 11,372 |
| Letter | 1390 | 2123 | 439 | 3036 | 6544 |
| B/S/W | 4/11/0 | 7/7/0 | 15/4/1 | 0/1/1 | 0/0/22 |


© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Abe, S.
Minimal Complexity Support Vector Machines for Pattern Classification. *Computers* **2020**, *9*, 88.
https://doi.org/10.3390/computers9040088
