# Classification of Hepatitis Viruses from Sequencing Chromatograms Using Multiscale Permutation Entropy and Support Vector Machines

Ersoy Öz and Öyküm Esra Aşkın

## Abstract


## 1. Introduction

## 2. Material and Methods

#### 2.1. Dataset

#### 2.2. Feature Extraction

#### 2.2.1. Statistical-Based Feature Extraction Method

#### 2.2.2. Entropy-Based Feature Extraction Method

#### 2.3. Support Vector Machines

#### 2.4. Performance Evaluation

#### 2.5. Proposed Framework

## 3. Results

#### 3.1. Classification with Statistical-Based Features

#### 3.2. Classification with Entropy-Based Features

## 4. Discussion

## 5. Conclusions

## Author Contributions

## Ethical Approval

## Funding

## Conflicts of Interest


| Kernel | $K(x_i, x_j)$ |
|---|---|
| Linear | $x_i^T x_j$ |
| Radial basis function | $\exp\left(-\gamma \Vert x_i - x_j \Vert^2\right),\ \gamma > 0$ |
| Polynomial | $\left(x_i^T x_j + 1\right)^d,\ \gamma > 0$ |
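As a minimal sketch, the three kernel functions in the table above can be evaluated for a pair of feature vectors as follows. The function names and the default values of `gamma` and `degree` are illustrative assumptions, not settings reported by the authors. The polynomial form follows the table as printed, which contains no $\gamma$ factor.

```python
import math
from typing import Sequence


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    """Inner product x^T y of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Linear kernel: x^T y."""
    return dot(x, y)


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float = 1.0) -> float:
    """Radial basis function kernel: exp(-gamma * ||x - y||^2), gamma > 0.

    The default gamma = 1.0 is an assumption for illustration only.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x: Sequence[float], y: Sequence[float], degree: int = 2) -> float:
    """Polynomial kernel: (x^T y + 1)^d.

    The default degree d = 2 is an assumption for illustration only.
    """
    return (dot(x, y) + 1.0) ** degree
```

For example, `linear_kernel([1, 2], [3, 4])` evaluates the plain inner product, while `rbf_kernel(x, x)` is always 1 because the distance between a vector and itself is zero.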

| Actual Value | Classifier Prediction Value: Positive | Classifier Prediction Value: Negative |
|---|---|---|
| Positive | True positives (TP) | False negatives (FN) |
| Negative | False positives (FP) | True negatives (TN) |
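The confusion matrix above is the basis for the accuracy (Acc), Cohen's kappa ($\kappa$), sensitivity (Se), and specificity (Sp) columns reported in the result tables. The sketch below computes these four measures from the cell counts, assuming the standard textbook definitions; the function name and the returned dictionary layout are hypothetical. The helper `accuracy_gap` assumes that $\epsilon_{diff}$ is the absolute difference between training and testing accuracy, which is consistent with the tabulated values but is an inference rather than a definition taken from the text.

```python
from typing import Dict


def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Compute Acc, kappa, Se, and Sp from binary confusion-matrix counts.

    Assumes the standard definitions; every class must have at least one case.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate

    # Cohen's kappa: agreement corrected for the agreement expected by chance.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    kappa = (accuracy - expected) / (1.0 - expected)

    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


def accuracy_gap(train_accuracy: float, test_accuracy: float) -> float:
    """Absolute train/test accuracy difference (assumed meaning of epsilon_diff)."""
    return abs(train_accuracy - test_accuracy)
```

A small gap between training and testing accuracy suggests that the classifier is not strongly overfitted to the training portion of the data.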

| Feature Method | Feature Description | Base Calling Signal: Adenine | Base Calling Signal: Cytosine | Base Calling Signal: Guanine | Base Calling Signal: Thymine |
|---|---|---|---|---|---|
| Statistical Based | Mean | $\mu_A$ | $\mu_C$ | $\mu_G$ | $\mu_T$ |
| Statistical Based | Median | $\mathrm{median}_A$ | $\mathrm{median}_C$ | $\mathrm{median}_G$ | $\mathrm{median}_T$ |
| Statistical Based | Standard Deviation | $\sigma_A$ | $\sigma_C$ | $\sigma_G$ | $\sigma_T$ |
| Entropy Based | PE | $\mathrm{HNPE}_A$ | $\mathrm{HNPE}_C$ | $\mathrm{HNPE}_G$ | $\mathrm{HNPE}_T$ |
| Entropy Based | MPE with s = 2 | $\mathrm{MPE}_A^{(2)}$ | $\mathrm{MPE}_C^{(2)}$ | $\mathrm{MPE}_G^{(2)}$ | $\mathrm{MPE}_T^{(2)}$ |
| Entropy Based | MPE with s = 3 | $\mathrm{MPE}_A^{(3)}$ | $\mathrm{MPE}_C^{(3)}$ | $\mathrm{MPE}_G^{(3)}$ | $\mathrm{MPE}_T^{(3)}$ |
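The entropy-based features in the table above can be illustrated with a short sketch of normalized permutation entropy (PE) in the sense of Bandt and Pompe, and of multiscale permutation entropy (MPE), here taken to be PE computed on a coarse-grained signal at scale $s$. The embedding order, the time delay, the tie-breaking rule, and all function names are assumptions chosen for illustration; the settings actually used by the authors are not stated in this section.

```python
import math
from collections import Counter
from typing import List, Sequence


def permutation_entropy(signal: Sequence[float], order: int = 3,
                        delay: int = 1, normalize: bool = True) -> float:
    """Permutation entropy of a one-dimensional signal.

    order and delay are assumed defaults. Ties are broken by position
    (stable sort), which is one common convention among several.
    """
    n_windows = len(signal) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("signal is too short for the given order and delay")

    # Count how often each ordinal pattern occurs.
    patterns: Counter = Counter()
    for i in range(n_windows):
        window = [signal[i + j * delay] for j in range(order)]
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1

    # Shannon entropy of the ordinal-pattern distribution.
    entropy = -sum((c / n_windows) * math.log(c / n_windows)
                   for c in patterns.values())
    if normalize:
        entropy /= math.log(math.factorial(order))
    return abs(entropy)  # avoid returning -0.0 for a single pattern


def coarse_grain(signal: Sequence[float], scale: int) -> List[float]:
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(signal) // scale
    return [sum(signal[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def multiscale_permutation_entropy(signal: Sequence[float], scale: int,
                                   order: int = 3, delay: int = 1) -> float:
    """Normalized PE of the coarse-grained signal at the given scale."""
    return permutation_entropy(coarse_grain(signal, scale), order, delay)


def entropy_features(trace: Sequence[float]) -> List[float]:
    """PE and MPE at s = 2 and s = 3 for one base-calling trace."""
    return [
        permutation_entropy(trace),
        multiscale_permutation_entropy(trace, scale=2),
        multiscale_permutation_entropy(trace, scale=3),
    ]
```

Applying `entropy_features` to each of the four base-calling traces (A, C, G, T) and concatenating the results would give a 12-element feature vector per chromatogram, matching the layout of the entropy rows in the table.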

| Training Set | Feature | SVM | Training $\overline{Acc}$ | Training $\overline{\kappa}$ | Training $\overline{Se}$ | Training $\overline{Sp}$ | Training $\overline{nSV}$ | Testing $\overline{Acc}$ | Testing $\overline{\kappa}$ | Testing $\overline{Se}$ | Testing $\overline{Sp}$ | $\epsilon_{diff}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10% | Mean | Linear | 0.983 | 0.966 | 0.969 | 1.000 | 10.9 | 0.961 | 0.923 | 0.927 | 0.999 | 0.022 |
| 10% | Mean | Poly. Kernel | 0.970 | 0.940 | 0.945 | 1.000 | 11.8 | 0.960 | 0.921 | 0.924 | 1.000 | 0.010 |
| 10% | Mean | RBF Kernel | 0.995 | 0.989 | 0.992 | 1.000 | 15.3 | 0.980 | 0.960 | 0.987 | 0.973 | 0.015 |
| 10% | Median | Linear | 0.793 | 0.532 | 0.669 | 0.861 | 12.9 | 0.708 | 0.425 | 0.590 | 0.844 | 0.085 |
| 10% | Median | Poly. Kernel | 0.905 | 0.782 | 0.761 | 1.000 | 10.2 | 0.772 | 0.555 | 0.637 | 0.928 | 0.133 |
| 10% | Median | RBF Kernel | 0.825 | 0.618 | 0.740 | 0.891 | 17.6 | 0.718 | 0.442 | 0.613 | 0.836 | 0.107 |
| 10% | Standard Deviation | Linear | 0.958 | 0.902 | 0.937 | 0.969 | 10.7 | 0.903 | 0.809 | 0.854 | 0.958 | 0.055 |
| 10% | Standard Deviation | Poly. Kernel | 0.980 | 0.957 | 0.979 | 0.974 | 8.9 | 0.922 | 0.846 | 0.871 | 0.979 | 0.058 |
| 10% | Standard Deviation | RBF Kernel | 0.970 | 0.932 | 0.972 | 0.958 | 15.1 | 0.963 | 0.927 | 0.967 | 0.960 | 0.007 |
| 10% | All Statistics | Linear | 0.992 | 0.984 | 0.985 | 0.999 | 8.9 | 0.953 | 0.906 | 0.913 | 0.996 | 0.039 |
| 10% | All Statistics | Poly. Kernel | 0.985 | 0.969 | 0.975 | 1.000 | 10.7 | 0.938 | 0.878 | 0.885 | 0.996 | 0.047 |
| 10% | All Statistics | RBF Kernel | 0.990 | 0.979 | 1.000 | 0.977 | 15.5 | 0.972 | 0.945 | 0.994 | 0.948 | 0.018 |
| 20% | Mean | Linear | 0.981 | 0.961 | 0.963 | 1.000 | 18.4 | 0.967 | 0.935 | 0.938 | 0.999 | 0.014 |
| 20% | Mean | Poly. Kernel | 0.997 | 0.995 | 0.995 | 1.000 | 19.4 | 0.964 | 0.928 | 0.934 | 0.996 | 0.033 |
| 20% | Mean | RBF Kernel | 0.995 | 0.989 | 0.991 | 1.000 | 30.4 | 0.983 | 0.966 | 0.990 | 0.975 | 0.012 |
| 20% | Median | Linear | 0.801 | 0.604 | 0.621 | 0.991 | 22.5 | 0.770 | 0.547 | 0.585 | 0.971 | 0.031 |
| 20% | Median | Poly. Kernel | 0.905 | 0.796 | 0.892 | 0.890 | 19.2 | 0.834 | 0.671 | 0.812 | 0.863 | 0.071 |
| 20% | Median | RBF Kernel | 0.832 | 0.661 | 0.674 | 0.982 | 27.2 | 0.773 | 0.554 | 0.620 | 0.945 | 0.059 |
| 20% | Standard Deviation | Linear | 0.955 | 0.908 | 0.932 | 0.978 | 17.7 | 0.931 | 0.863 | 0.884 | 0.982 | 0.024 |
| 20% | Standard Deviation | Poly. Kernel | 0.987 | 0.973 | 1.000 | 0.971 | 12.7 | 0.947 | 0.895 | 0.926 | 0.970 | 0.040 |
| 20% | Standard Deviation | RBF Kernel | 0.990 | 0.979 | 0.993 | 0.985 | 26.0 | 0.964 | 0.928 | 0.976 | 0.951 | 0.026 |
| 20% | All Statistics | Linear | 0.989 | 0.978 | 0.979 | 1.000 | 17.0 | 0.970 | 0.941 | 0.945 | 0.998 | 0.019 |
| 20% | All Statistics | Poly. Kernel | 0.990 | 0.979 | 0.979 | 1.000 | 16.3 | 0.970 | 0.941 | 0.945 | 0.998 | 0.020 |
| 20% | All Statistics | RBF Kernel | 0.997 | 0.994 | 1.000 | 0.993 | 30.1 | 0.975 | 0.950 | 0.995 | 0.953 | 0.022 |
| 30% | Mean | Linear | 0.986 | 0.972 | 0.974 | 1.000 | 22.7 | 0.968 | 0.937 | 0.939 | 1.000 | 0.018 |
| 30% | Mean | Poly. Kernel | 0.995 | 0.989 | 0.990 | 1.000 | 16.8 | 0.976 | 0.952 | 0.958 | 0.995 | 0.019 |
| 30% | Mean | RBF Kernel | 0.991 | 0.983 | 0.983 | 1.000 | 43.2 | 0.984 | 0.968 | 0.991 | 0.976 | 0.007 |
| 30% | Median | Linear | 0.807 | 0.617 | 0.638 | 0.989 | 31.0 | 0.784 | 0.574 | 0.608 | 0.976 | 0.023 |
| 30% | Median | Poly. Kernel | 0.921 | 0.839 | 0.897 | 0.934 | 20.4 | 0.839 | 0.681 | 0.809 | 0.879 | 0.082 |
| 30% | Median | RBF Kernel | 0.846 | 0.696 | 0.719 | 0.990 | 36.6 | 0.764 | 0.533 | 0.564 | 0.977 | 0.082 |
| 30% | Standard Deviation | Linear | 0.954 | 0.908 | 0.925 | 0.984 | 25.5 | 0.936 | 0.872 | 0.897 | 0.978 | 0.018 |
| 30% | Standard Deviation | Poly. Kernel | 0.988 | 0.975 | 0.993 | 0.980 | 14.6 | 0.955 | 0.909 | 0.947 | 0.963 | 0.033 |
| 30% | Standard Deviation | RBF Kernel | 0.985 | 0.969 | 0.987 | 0.982 | 31.6 | 0.972 | 0.945 | 0.986 | 0.958 | 0.013 |
| 30% | All Statistics | Linear | 0.990 | 0.980 | 0.981 | 1.000 | 21.3 | 0.976 | 0.952 | 0.954 | 0.999 | 0.014 |
| 30% | All Statistics | Poly. Kernel | 0.995 | 0.989 | 0.989 | 1.000 | 22.1 | 0.972 | 0.945 | 0.952 | 0.995 | 0.023 |
| 30% | All Statistics | RBF Kernel | 0.996 | 0.993 | 0.996 | 0.996 | 41.5 | 0.980 | 0.959 | 0.995 | 0.962 | 0.016 |
| 40% | Mean | Linear | 0.984 | 0.969 | 0.970 | 1.000 | 25.0 | 0.970 | 0.941 | 0.944 | 1.000 | 0.014 |
| 40% | Mean | Poly. Kernel | 0.990 | 0.979 | 0.985 | 0.994 | 29.1 | 0.980 | 0.960 | 0.963 | 0.998 | 0.010 |
| 40% | Mean | RBF Kernel | 0.993 | 0.987 | 0.987 | 1.000 | 57.3 | 0.990 | 0.979 | 0.991 | 0.987 | 0.003 |
| 40% | Median | Linear | 0.818 | 0.636 | 0.651 | 0.990 | 38.5 | 0.789 | 0.586 | 0.624 | 0.975 | 0.029 |
| 40% | Median | Poly. Kernel | 0.916 | 0.829 | 0.931 | 0.894 | 29.0 | 0.840 | 0.682 | 0.855 | 0.829 | 0.076 |
| 40% | Median | RBF Kernel | 0.828 | 0.661 | 0.684 | 0.989 | 48.0 | 0.827 | 0.656 | 0.678 | 0.984 | 0.001 |
| 40% | Standard Deviation | Linear | 0.954 | 0.907 | 0.923 | 0.987 | 33.2 | 0.932 | 0.865 | 0.890 | 0.979 | 0.022 |
| 40% | Standard Deviation | Poly. Kernel | 0.995 | 0.989 | 0.997 | 0.991 | 17.1 | 0.968 | 0.936 | 0.971 | 0.964 | 0.027 |
| 40% | Standard Deviation | RBF Kernel | 0.986 | 0.972 | 0.989 | 0.981 | 33.1 | 0.974 | 0.948 | 0.982 | 0.965 | 0.012 |
| 40% | All Statistics | Linear | 0.992 | 0.984 | 0.985 | 1.000 | 23.8 | 0.975 | 0.951 | 0.954 | 0.999 | 0.017 |
| 40% | All Statistics | Poly. Kernel | 0.995 | 0.989 | 0.990 | 1.000 | 21.0 | 0.981 | 0.963 | 0.967 | 0.996 | 0.014 |
| 40% | All Statistics | RBF Kernel | 0.996 | 0.992 | 0.995 | 0.997 | 41.5 | 0.984 | 0.968 | 0.983 | 0.984 | 0.012 |
| 50% | Mean | Linear | 0.987 | 0.974 | 0.975 | 1.000 | 25.4 | 0.971 | 0.941 | 0.943 | 1.000 | 0.016 |
| 50% | Mean | Poly. Kernel | 0.992 | 0.983 | 0.986 | 0.998 | 29.8 | 0.980 | 0.959 | 0.968 | 0.993 | 0.012 |
| 50% | Mean | RBF Kernel | 0.993 | 0.985 | 0.990 | 0.995 | 73.4 | 0.989 | 0.977 | 0.988 | 0.988 | 0.004 |
| 50% | Median | Linear | 0.817 | 0.639 | 0.655 | 0.995 | 47.8 | 0.803 | 0.611 | 0.632 | 0.987 | 0.014 |
| 50% | Median | Poly. Kernel | 0.913 | 0.823 | 0.959 | 0.857 | 34.4 | 0.867 | 0.735 | 0.900 | 0.837 | 0.046 |
| 50% | Median | RBF Kernel | 0.841 | 0.679 | 0.677 | 1.000 | 54.4 | 0.802 | 0.614 | 0.643 | 0.992 | 0.039 |
| 50% | Standard Deviation | Linear | 0.949 | 0.898 | 0.919 | 0.981 | 41.1 | 0.937 | 0.874 | 0.900 | 0.977 | 0.012 |
| 50% | Standard Deviation | Poly. Kernel | 0.984 | 0.967 | 0.990 | 0.976 | 32.2 | 0.970 | 0.939 | 0.965 | 0.974 | 0.014 |
| 50% | Standard Deviation | RBF Kernel | 0.987 | 0.973 | 0.995 | 0.977 | 35.7 | 0.968 | 0.935 | 0.977 | 0.954 | 0.019 |
| 50% | All Statistics | Linear | 0.995 | 0.990 | 0.991 | 1.000 | 21.3 | 0.980 | 0.961 | 0.963 | 0.999 | 0.015 |
| 50% | All Statistics | Poly. Kernel | 0.998 | 0.995 | 0.996 | 1.000 | 18.0 | 0.979 | 0.957 | 0.966 | 0.992 | 0.019 |
| 50% | All Statistics | RBF Kernel | 0.996 | 0.991 | 0.996 | 0.995 | 37.8 | 0.991 | 0.981 | 0.989 | 0.990 | 0.005 |

| Training Set | Feature | SVM | Training $\overline{Acc}$ | Training $\overline{\kappa}$ | Training $\overline{Se}$ | Training $\overline{Sp}$ | Training $\overline{nSV}$ | Testing $\overline{Acc}$ | Testing $\overline{\kappa}$ | Testing $\overline{Se}$ | Testing $\overline{Sp}$ | $\epsilon_{diff}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10% | PE | Linear | 0.944 | 0.880 | 0.984 | 0.894 | 10.8 | 0.933 | 0.867 | 0.957 | 0.909 | 0.011 |
| 10% | PE | Poly. Kernel | 0.960 | 0.919 | 1.000 | 0.921 | 8.4 | 0.950 | 0.900 | 0.977 | 0.921 | 0.010 |
| 10% | PE | RBF Kernel | 0.965 | 0.928 | 1.000 | 0.931 | 16.1 | 0.950 | 0.900 | 0.994 | 0.902 | 0.015 |
| 10% | MPE with s = 2 | Linear | 0.954 | 0.904 | 0.995 | 0.911 | 10.8 | 0.941 | 0.882 | 0.973 | 0.905 | 0.013 |
| 10% | MPE with s = 2 | Poly. Kernel | 0.995 | 0.990 | 1.000 | 0.990 | 9.7 | 0.945 | 0.890 | 0.954 | 0.935 | 0.050 |
| 10% | MPE with s = 2 | RBF Kernel | 0.945 | 0.890 | 1.000 | 0.894 | 15.5 | 0.956 | 0.911 | 1.000 | 0.909 | 0.011 |
| 10% | MPE with s = 3 | Linear | 0.949 | 0.894 | 0.984 | 0.909 | 11.3 | 0.937 | 0.874 | 0.963 | 0.909 | 0.012 |
| 10% | MPE with s = 3 | Poly. Kernel | 0.980 | 0.959 | 1.000 | 0.958 | 8.5 | 0.935 | 0.871 | 0.934 | 0.937 | 0.045 |
| 10% | MPE with s = 3 | RBF Kernel | 0.955 | 0.905 | 1.000 | 0.892 | 15.6 | 0.955 | 0.909 | 1.000 | 0.907 | 0.000 |
| 10% | All Entropies | Linear | 0.954 | 0.903 | 0.987 | 0.915 | 10.1 | 0.945 | 0.890 | 0.981 | 0.905 | 0.009 |
| 10% | All Entropies | Poly. Kernel | 0.980 | 0.956 | 1.000 | 0.950 | 11.0 | 0.954 | 0.908 | 0.987 | 0.919 | 0.026 |
| 10% | All Entropies | RBF Kernel | 0.970 | 0.938 | 1.000 | 0.940 | 15.7 | 0.955 | 0.909 | 0.994 | 0.911 | 0.015 |
| 20% | PE | Linear | 0.945 | 0.889 | 0.990 | 0.897 | 20.6 | 0.949 | 0.899 | 0.988 | 0.908 | 0.004 |
| 20% | PE | Poly. Kernel | 0.980 | 0.959 | 1.000 | 0.956 | 12.7 | 0.946 | 0.893 | 0.953 | 0.940 | 0.034 |
| 20% | PE | RBF Kernel | 0.955 | 0.906 | 1.000 | 0.896 | 31.4 | 0.953 | 0.907 | 1.000 | 0.905 | 0.002 |
| 20% | MPE with s = 2 | Linear | 0.946 | 0.889 | 0.992 | 0.894 | 21.2 | 0.950 | 0.901 | 0.989 | 0.909 | 0.004 |
| 20% | MPE with s = 2 | Poly. Kernel | 0.977 | 0.954 | 0.995 | 0.958 | 16.1 | 0.947 | 0.894 | 0.989 | 0.901 | 0.030 |
| 20% | MPE with s = 2 | RBF Kernel | 0.955 | 0.909 | 1.000 | 0.907 | 32.9 | 0.955 | 0.909 | 1.000 | 0.906 | 0.000 |
| 20% | MPE with s = 3 | Linear | 0.948 | 0.894 | 0.992 | 0.904 | 20.4 | 0.950 | 0.900 | 0.989 | 0.907 | 0.002 |
| 20% | MPE with s = 3 | Poly. Kernel | 0.970 | 0.938 | 0.991 | 0.952 | 16.9 | 0.949 | 0.897 | 0.982 | 0.912 | 0.021 |
| 20% | MPE with s = 3 | RBF Kernel | 0.965 | 0.928 | 0.993 | 0.933 | 31.2 | 0.941 | 0.883 | 0.972 | 0.910 | 0.024 |
| 20% | All Entropies | Linear | 0.952 | 0.902 | 0.989 | 0.911 | 19.8 | 0.950 | 0.900 | 0.988 | 0.909 | 0.002 |
| 20% | All Entropies | Poly. Kernel | 0.987 | 0.971 | 1.000 | 0.964 | 10.6 | 0.966 | 0.932 | 0.977 | 0.955 | 0.021 |
| 20% | All Entropies | RBF Kernel | 0.975 | 0.948 | 1.000 | 0.947 | 34.4 | 0.950 | 0.899 | 0.995 | 0.901 | 0.025 |
| 30% | PE | Linear | 0.949 | 0.897 | 0.989 | 0.906 | 29.4 | 0.948 | 0.896 | 0.987 | 0.906 | 0.001 |
| 30% | PE | Poly. Kernel | 0.986 | 0.973 | 1.000 | 0.972 | 17.5 | 0.957 | 0.915 | 0.970 | 0.944 | 0.029 |
| 30% | PE | RBF Kernel | 0.956 | 0.911 | 1.000 | 0.905 | 49.9 | 0.954 | 0.908 | 1.000 | 0.906 | 0.002 |
| 30% | MPE with s = 2 | Linear | 0.953 | 0.905 | 0.993 | 0.909 | 28.6 | 0.949 | 0.897 | 0.990 | 0.905 | 0.004 |
| 30% | MPE with s = 2 | Poly. Kernel | 0.996 | 0.993 | 0.996 | 0.995 | 11.7 | 0.964 | 0.928 | 0.970 | 0.959 | 0.032 |
| 30% | MPE with s = 2 | RBF Kernel | 0.950 | 0.896 | 1.000 | 0.888 | 50.2 | 0.956 | 0.912 | 1.000 | 0.911 | 0.006 |
| 30% | MPE with s = 3 | Linear | 0.950 | 0.899 | 0.993 | 0.904 | 30.7 | 0.950 | 0.899 | 0.989 | 0.907 | 0.000 |
| 30% | MPE with s = 3 | Poly. Kernel | 0.981 | 0.962 | 0.989 | 0.972 | 15.8 | 0.937 | 0.874 | 0.918 | 0.961 | 0.044 |
| 30% | MPE with s = 3 | RBF Kernel | 0.963 | 0.924 | 1.000 | 0.917 | 50.0 | 0.951 | 0.902 | 1.000 | 0.901 | 0.012 |
| 30% | All Entropies | Linear | 0.960 | 0.920 | 0.991 | 0.928 | 28.2 | 0.948 | 0.896 | 0.990 | 0.901 | 0.012 |
| 30% | All Entropies | Poly. Kernel | 0.996 | 0.996 | 1.000 | 0.992 | 10.6 | 0.983 | 0.967 | 0.983 | 0.983 | 0.013 |
| 30% | All Entropies | RBF Kernel | 0.983 | 0.965 | 1.000 | 0.963 | 36.1 | 0.970 | 0.939 | 1.000 | 0.937 | 0.013 |
| 40% | PE | Linear | 0.951 | 0.901 | 0.989 | 0.909 | 39.0 | 0.947 | 0.893 | 0.987 | 0.903 | 0.004 |
| 40% | PE | Poly. Kernel | 0.990 | 0.979 | 0.995 | 0.982 | 13.5 | 0.969 | 0.938 | 0.968 | 0.970 | 0.021 |
| 40% | PE | RBF Kernel | 0.948 | 0.896 | 1.000 | 0.891 | 70.0 | 0.959 | 0.917 | 1.000 | 0.914 | 0.011 |
| 40% | MPE with s = 2 | Linear | 0.951 | 0.901 | 0.994 | 0.905 | 37.4 | 0.950 | 0.899 | 0.989 | 0.906 | 0.001 |
| 40% | MPE with s = 2 | Poly. Kernel | 0.996 | 0.992 | 1.000 | 0.991 | 14.6 | 0.961 | 0.923 | 0.960 | 0.963 | 0.035 |
| 40% | MPE with s = 2 | RBF Kernel | 0.948 | 0.897 | 1.000 | 0.898 | 68.8 | 0.959 | 0.917 | 1.000 | 0.912 | 0.011 |
| 40% | MPE with s = 3 | Linear | 0.949 | 0.896 | 0.991 | 0.903 | 40.7 | 0.951 | 0.901 | 0.990 | 0.908 | 0.002 |
| 40% | MPE with s = 3 | Poly. Kernel | 0.976 | 0.951 | 0.995 | 0.952 | 21.3 | 0.952 | 0.904 | 0.971 | 0.933 | 0.024 |
| 40% | MPE with s = 3 | RBF Kernel | 0.956 | 0.911 | 1.000 | 0.904 | 68.4 | 0.954 | 0.908 | 1.000 | 0.907 | 0.002 |
| 40% | All Entropies | Linear | 0.964 | 0.927 | 0.994 | 0.928 | 30.3 | 0.952 | 0.904 | 0.990 | 0.912 | 0.012 |
| 40% | All Entropies | Poly. Kernel | 0.993 | 0.987 | 1.000 | 0.986 | 14.0 | 0.977 | 0.954 | 0.987 | 0.967 | 0.016 |
| 40% | All Entropies | RBF Kernel | 0.996 | 0.992 | 1.000 | 0.991 | 29.3 | 0.989 | 0.978 | 1.000 | 0.977 | 0.007 |
| 50% | PE | Linear | 0.948 | 0.895 | 0.990 | 0.903 | 49.5 | 0.951 | 0.901 | 0.989 | 0.909 | 0.003 |
| 50% | PE | Poly. Kernel | 0.992 | 0.983 | 0.995 | 0.987 | 11.1 | 0.960 | 0.919 | 0.971 | 0.948 | 0.032 |
| 50% | PE | RBF Kernel | 0.959 | 0.917 | 1.000 | 0.916 | 76.6 | 0.955 | 0.908 | 1.000 | 0.903 | 0.004 |
| 50% | MPE with s = 2 | Linear | 0.952 | 0.904 | 0.996 | 0.905 | 41.4 | 0.950 | 0.900 | 0.989 | 0.907 | 0.002 |
| 50% | MPE with s = 2 | Poly. Kernel | 0.996 | 0.991 | 0.996 | 0.995 | 13.4 | 0.972 | 0.943 | 0.966 | 0.977 | 0.024 |
| 50% | MPE with s = 2 | RBF Kernel | 0.947 | 0.893 | 1.000 | 0.893 | 84.4 | 0.963 | 0.924 | 1.000 | 0.919 | 0.016 |
| 50% | MPE with s = 3 | Linear | 0.951 | 0.902 | 0.992 | 0.907 | 46.6 | 0.949 | 0.898 | 0.991 | 0.904 | 0.002 |
| 50% | MPE with s = 3 | Poly. Kernel | 0.984 | 0.967 | 0.996 | 0.969 | 16.0 | 0.946 | 0.891 | 0.960 | 0.931 | 0.038 |
| 50% | MPE with s = 3 | RBF Kernel | 0.954 | 0.907 | 1.000 | 0.903 | 86.4 | 0.956 | 0.911 | 1.000 | 0.909 | 0.002 |
| 50% | All Entropies | Linear | 0.966 | 0.932 | 0.994 | 0.935 | 30.8 | 0.956 | 0.912 | 0.991 | 0.919 | 0.010 |
| 50% | All Entropies | Poly. Kernel | 0.999 | 0.997 | 1.000 | 0.997 | 14.6 | 0.981 | 0.962 | 0.991 | 0.971 | 0.018 |
| 50% | All Entropies | RBF Kernel | 0.988 | 0.975 | 1.000 | 0.975 | 41.9 | 0.979 | 0.957 | 1.000 | 0.953 | 0.009 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Öz, E.; Aşkın, Ö.E. Classification of Hepatitis Viruses from Sequencing Chromatograms Using Multiscale Permutation Entropy and Support Vector Machines. *Entropy* **2019**, *21*, 1149.
https://doi.org/10.3390/e21121149
