Pancreatic Cancer Early Detection Using Twin Support Vector Machine Based on Kernel

Early detection of pancreatic cancer is difficult, so many cases are diagnosed late, when the cancer is usually already well developed. Machine learning, a branch of artificial intelligence, can help detect pancreatic cancer early. This paper proposes the twin support vector machine (TWSVM) as a new approach to early detection of pancreatic cancer. TWSVM aims to find two non-parallel hyperplanes such that each hyperplane is close to one data class and as far as possible from the other. TWSVM builds a model quickly and generalizes well. However, TWSVM requires a kernel function to operate in the feature space. The commonly used kernel functions are the linear kernel, the polynomial kernel, and the radial basis function (RBF) kernel. This paper applies TWSVM with each of these kernels and compares them to find the best kernel for early detection of pancreatic cancer. Each TWSVM model is evaluated using 10-fold cross validation. The results show that kernel-based TWSVM detects pancreatic cancer with good performance. The best kernel is the RBF kernel, which produces an accuracy of 98%, a sensitivity of 97%, a specificity of 100%, and a running time of around 1.3408 s.


Introduction
According to the World Health Organization (WHO), the second leading cause of death in the world is cancer [1]. In 2018, around 17 million new cases of cancer occurred in the world and 9.6 million of the world's population died from cancer [2]. Cancer has many types depending on location. If cancer occurs in the pancreas, then the cancer is called pancreatic cancer. Pancreatic cancer is the seventh leading cause of cancer deaths in the world and ranks as the 14th most common cancer [3]. Based on the Global Cancer Observatory in 2018, the estimated number of diagnoses of this cancer in the world is 458,918 and the estimated number of deaths is 432,242 [4]. This cancer is expected to be the second leading cause of cancer deaths in the world in 2030 [5].
Pancreatic cancer is difficult to detect by physical examination because the pancreas lies deep in the body; there are no external lumps or skin changes such as those seen with breast lesions [6]. In addition, non-specific symptoms such as nausea, anorexia, jaundice, weight loss, and abdominal pain make early detection difficult [7]. As a result, many cases of pancreatic cancer are diagnosed late. By the time pancreatic cancer is detected, it is usually well developed; in one study, only 7% of pancreatic cancers were diagnosed as localized disease [6]. If pancreatic cancer is detected too late, it can progress and spread to other parts of the body, making it difficult to treat. Therefore, early detection of pancreatic cancer is an important problem.
The problem of detecting pancreatic cancer early is a classification problem in machine learning. Machine learning has been accepted for various classification problems in various fields, one of which is in the field of medicine. In the field of medicine, several machine learning methods are used to detect several types of cancer, namely breast cancer [8][9][10][11], cervical cancer [12,13], ovarian cancer [14,15], colon cancer [16], prostate cancer [17], and lung cancer [18].
Machine learning has also been applied to detect pancreatic cancer. Qiu et al. [19] implemented several machine learning methods, namely the decision tree, the k-nearest neighbor, and the support vector machine. Of these, the support vector machine produced the best performance, with 70% sensitivity and 70% specificity.
The support vector machine (SVM) is a well-known machine learning method. It uses the concept of the "maximum margin", which reduces generalization error by maximizing the margin between the two classes [20]. Many researchers have proposed extensions of SVM to obtain better performance. One such extension is the twin support vector machine proposed by Jayadeva et al. in 2007.
The twin support vector machine (TWSVM) aims to find two non-parallel hyperplanes such that each hyperplane is close to one data class and as far as possible from the other [20]. On several benchmark data sets, TWSVM is not only fast but also generalizes well [20]. TWSVM has become a popular method because of its excellent learning performance [21]. However, both SVM and TWSVM need a kernel method to classify data. The kernel method uses a kernel function that allows an algorithm to operate in a higher-dimensional feature space by computing inner products between the images of all pairs of data points in that space [22]. The kernel functions commonly used with SVM methods are the linear kernel, the polynomial kernel, and the radial basis function (RBF) kernel.
This paper proposes the TWSVM method as a novel approach for early detection of pancreatic cancer. The kernel functions used are the linear kernel, the polynomial kernel, and the RBF kernel. This paper compares the performance of TWSVM with each kernel to get the best kernel for early detection of pancreatic cancer.

Kernel Method
The kernel method uses a kernel function that allows an algorithm to operate in a higher-dimensional feature space by computing inner products between the images of all pairs of data points in that space [22]. Let $X^n$ be the input space, $F$ a feature space, and $\phi : X^n \to F$ a mapping. The kernel function is defined by

$$K(x, y) = \langle \phi(x), \phi(y) \rangle, \quad x, y \in X^n. \tag{1}$$

Kernel functions that are often used are the linear kernel, the polynomial kernel, and the radial basis function (RBF) kernel. The formulas for these kernel functions are given below.
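As a small illustration of this definition, the following sketch checks numerically that a kernel value computed in the input space equals an inner product in a feature space. The homogeneous degree-2 polynomial kernel on 2-dimensional inputs and the explicit map `phi` are assumptions chosen only for this example; they are not part of the proposed method.

```python
import math

def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map for K(x, y) = <x, y>^2 on 2-D inputs (illustrative)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def kernel(x, y):
    """The same quantity computed in the input space, without building phi."""
    return dot(x, y) ** 2

# Two arbitrary example points (made up for the illustration).
x, y = (1.0, 2.0), (3.0, -1.0)
k_input = kernel(x, y)             # evaluated in the 2-D input space
k_feature = dot(phi(x), phi(y))    # evaluated in the 3-D feature space
print(k_input, k_feature)
```

The two printed values agree up to floating-point rounding, which is the point of the kernel method: the algorithm only needs `kernel(x, y)` and never has to construct the feature vectors explicitly.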

Linear Kernel
The linear kernel is the simplest kernel function, represented by the inner product $\langle x, y \rangle$ [23]. It is the basic kernel most often used with SVM because, with this kernel, SVM separates the data linearly. The linear kernel formula is presented in Equation (2) [24].

Polynomial Kernel
The polynomial kernel is suitable for problems in which the training data are normalized [24]. In Equation (3), σ is a parameter that must be set, c is a constant, and d is the degree of the polynomial.

Radial Basis Function (RBF) Kernel
The RBF kernel belongs to a family of kernels in which a distance measure is smoothed by a radial function (an exponential function) [23]. The RBF kernel is given in Equation (4), where σ is an adjustable parameter.
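The three kernels can be sketched in a few lines of Python. The exact forms of Equations (2) to (4) are not reproduced in this text, so the functions below use common textbook forms as an assumption: $\langle x, y \rangle$ for the linear kernel, $(\sigma \langle x, y \rangle + c)^d$ for the polynomial kernel, and $\exp(-\|x - y\|^2 / (2\sigma^2))$ for the RBF kernel. The default parameter values and the helper `kernel_matrix` are placeholders for illustration.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(x, y):
    """Assumed form: K(x, y) = <x, y>."""
    return dot(x, y)

def polynomial_kernel(x, y, sigma=1.0, c=1.0, d=2):
    """Assumed form: K(x, y) = (sigma * <x, y> + c) ** d."""
    return (sigma * dot(x, y) + c) ** d

def rbf_kernel(x, y, sigma=1.0):
    """Assumed form: K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_matrix(kernel, rows, cols, **params):
    """Matrix of kernel values between every point in rows and every point in cols."""
    return [[kernel(r, col, **params) for col in cols] for r in rows]

# Two arbitrary example points (made up for the illustration).
x, y = (1.0, 2.0), (3.0, -1.0)
print(linear_kernel(x, y))
print(polynomial_kernel(x, y, sigma=1.0, c=1.0, d=2))
print(rbf_kernel(x, y, sigma=1.0))
```

Note that other conventions exist, for example an RBF kernel parameterized as $\exp(-\gamma \|x - y\|^2)$, so a reported parameter value is only meaningful together with the formula it belongs to.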

Twin Support Vector Machine (TWSVM)
The twin support vector machine (TWSVM) is an extension of the support vector machine proposed by Jayadeva et al. in 2007. TWSVM aims to find two non-parallel hyperplanes such that each hyperplane is close to one data class and as far as possible from the other [20]. Figure 1 shows an illustration of TWSVM.

Let $D = \{(x_i, y_i) \mid x_i \in X^n,\ y_i \in \{-1, +1\},\ i = 1, 2, \ldots, N\}$ be the training data. Let there be $d_1$ data points in class +1 and $d_2$ data points in class −1, such that $d_1 + d_2 = d$. We form the $(d_1 \times n)$ matrix $A$, which contains the data points in class +1, and the $(d_2 \times n)$ matrix $B$, which contains the data points in class −1. The two non-parallel hyperplanes are [20]:

$$x^T w_1 + b_1 = 0 \quad \text{and} \quad x^T w_2 + b_2 = 0,$$

where $x$ is the data vector, $w_1$ and $b_1$ are the weight and bias parameters of the first hyperplane, and $w_2$ and $b_2$ are the weight and bias parameters of the second hyperplane.

The TWSVM method is obtained by solving the following pair of quadratic programming problems [20]:

TWSVM 1:

$$\min_{w_1, b_1, \xi_2} \ \frac{1}{2} \|A w_1 + e_1 b_1\|^2 + c_1 e_2^T \xi_2$$

subject to

$$-(B w_1 + e_2 b_1) \geq e_2 - \xi_2 \tag{8}$$

and TWSVM 2:

$$\min_{w_2, b_2, \xi_1} \ \frac{1}{2} \|B w_2 + e_2 b_2\|^2 + c_2 e_1^T \xi_1 \tag{10}$$

subject to

$$-(A w_2 + e_1 b_2) \geq e_1 - \xi_1 \tag{11}$$

where $c_1 > 0$ and $c_2 > 0$ are penalty parameters, $\xi_1 \geq 0$ and $\xi_2 \geq 0$ are slack variables, and $e_1$ and $e_2$ are vectors of ones of dimensions $d_1$ and $d_2$, respectively [20].

The two hyperplanes of TWSVM with a kernel are [20]:

$$K(x^T, C^T) u_1 + b_1 = 0 \quad \text{and} \quad K(x^T, C^T) u_2 + b_2 = 0,$$

where $C^T = [A, B]^T$ (that is, $C$ stacks the rows of $A$ and $B$), $u_1, u_2 \in \mathbb{R}^d$, and $K$ is the kernel matrix corresponding to an appropriately chosen kernel function [20]. The kernel TWSVM is obtained by solving the following optimization problems [20]:

KTWSVM 1:

$$\min_{u_1, b_1, \xi_2} \ \frac{1}{2} \|K(A, C^T) u_1 + e_1 b_1\|^2 + c_1 e_2^T \xi_2$$

subject to

$$-(K(B, C^T) u_1 + e_2 b_1) \geq e_2 - \xi_2$$

and KTWSVM 2:

$$\min_{u_2, b_2, \xi_1} \ \frac{1}{2} \|K(B, C^T) u_2 + e_2 b_2\|^2 + c_2 e_1^T \xi_1$$

subject to

$$-(K(A, C^T) u_2 + e_1 b_2) \geq e_1 - \xi_1$$

with $c_1$, $c_2$, $\xi_1$, $\xi_2$, $e_1$, and $e_2$ as defined above.
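Once the two hyperplanes have been found, a new point is typically assigned to the class whose hyperplane lies nearer to it, as in the original TWSVM formulation of Jayadeva et al. The sketch below shows that decision rule for the linear case only. The hyperplane parameters are made-up values for illustration, not the result of solving the optimization problems, and breaking ties in favor of class +1 is an arbitrary choice.

```python
import math

def distance_to_plane(x, w, b):
    """Perpendicular distance from point x to the hyperplane x.w + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(xi * wi for xi, wi in zip(x, w)) + b) / norm

def twsvm_predict(x, w1, b1, w2, b2):
    """Return +1 if x is nearer the first hyperplane, otherwise -1."""
    d1 = distance_to_plane(x, w1, b1)
    d2 = distance_to_plane(x, w2, b2)
    return 1 if d1 <= d2 else -1

# Hypothetical, non-parallel hyperplanes in 2-D:
# plane 1 is the line x2 = x1, plane 2 is the line x2 = -x1.
w1, b1 = (1.0, -1.0), 0.0
w2, b2 = (1.0, 1.0), 0.0

print(twsvm_predict((2.0, 2.1), w1, b1, w2, b2))   # near plane 1
print(twsvm_predict((2.0, -1.9), w1, b1, w2, b2))  # near plane 2
```

For the kernel case, the same rule would compare distances to the two kernel-generated surfaces instead of the two linear hyperplanes.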

k-Fold Cross Validation
In this paper, to build and evaluate a model, the dataset was divided into training data and testing data. Training data are used by the machine to learn the patterns in the pancreatic cancer data, while testing data are used to evaluate the model obtained after learning. The dataset was divided using the k-fold cross validation method, which partitions the dataset into k sections (folds) of equal size [25]. Each fold is used in turn as validation data to test the model while the remaining folds are used for training, and the process is repeated k times [25]. The advantage of this method is that the random subsamples are used repeatedly for both training and validation [25].
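The splitting procedure can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in this paper; the function names, the shuffling step, and the fixed seed are assumptions. With 203 samples and k = 10 the folds cannot be exactly equal, so their sizes differ by at most one.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled folds whose sizes differ by at most one."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def train_test_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

splits = list(train_test_splits(203, k=10))
fold_sizes = sorted(len(test) for _, test in splits)
print(fold_sizes)
```

Each sample index appears in exactly one test fold and in the training set of the other nine iterations. A stratified split, which keeps the class proportions similar across folds, would be a reasonable alternative for imbalanced data.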

Proposed Method
In this paper, the method proposed for early detection of pancreatic cancer consisted of four stages. In the first stage, the data were divided into training data and testing data using k-fold cross validation with k = 10. This means that the dataset was divided into 10 folds of approximately the same size; in each iteration, 9 folds were used as training data and 1 fold was used as testing data. In the second stage, the training data were used by the TWSVM method with the linear, polynomial, and RBF kernels to learn data patterns and build classification models. In the third stage, the classification models obtained were evaluated on the testing data in terms of accuracy, sensitivity, specificity, and running time. In the fourth stage, the evaluation parameters produced by each kernel were compared to find the best kernel for early detection of pancreatic cancer using the TWSVM method. All stages were implemented in the Python 3 programming language.

Data
In this paper, the data used were pancreatic cancer data obtained from Al Islam Hospital Bandung, Indonesia. The data consisted of 203 samples and six features. The six features were the diagnosis of the patient, namely pancreatic cancer (Y) or not pancreatic cancer (N), and five blood test results: cancer antigens, hemoglobin, leukocytes, hematocrit, and platelets. The diagnosis feature was the target feature in detecting pancreatic cancer. Table 1 shows part of the data. In the data shown in Table 1, the diagnosis feature is categorical, so it must be converted into a numeric feature for the proposed method to work. Therefore, following the TWSVM formulation, the Y category of the diagnosis feature was transformed into +1 and the N category was transformed into −1.
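The label transformation described above can be sketched as a simple mapping. The function name, the handling of whitespace and letter case, and the example labels are assumptions made for illustration; they are not taken from the hospital dataset.

```python
LABEL_MAP = {"Y": 1, "N": -1}

def encode_diagnosis(labels):
    """Map 'Y'/'N' diagnosis strings to +1/-1 and raise on anything else."""
    encoded = []
    for label in labels:
        key = label.strip().upper()
        if key not in LABEL_MAP:
            raise ValueError(f"unexpected diagnosis label: {label!r}")
        encoded.append(LABEL_MAP[key])
    return encoded

# Made-up labels, only to show the mapping.
print(encode_diagnosis(["Y", "N", "n", "Y "]))  # [1, -1, -1, 1]
```

Raising an error on unknown labels, rather than silently assigning a class, helps keep data-entry mistakes from turning into training labels.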

Confusion Matrix
In this paper, a confusion matrix was used to assist in calculating the evaluation parameters of the classification model. Table 2 shows the confusion matrix used to evaluate the TWSVM classification model based on the kernel for early detection of pancreatic cancer.

Evaluation Parameters
The parameters to evaluate the performance of the TWSVM classification model were accuracy, sensitivity, specificity, and required running time. Table 3 shows the formula for accuracy, sensitivity, and specificity.
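The three ratio-based evaluation parameters can be computed directly from the confusion matrix counts. Table 3 is not reproduced in this text, so the sketch below assumes the standard definitions: accuracy = (TP + TN) / (TP + TN + FP + FN), sensitivity = TP / (TP + FN), and specificity = TN / (TN + FP), with pancreatic cancer (+1) as the positive class. The function names and the example labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false negatives, true negatives, and false positives."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp

def evaluate(tp, fn, tn, fp):
    """Return accuracy, sensitivity, and specificity (assumes both classes are present)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Made-up labels: three cancer cases (+1) and two non-cancer cases (-1).
y_true = [1, 1, 1, -1, -1]
y_pred = [1, 1, -1, -1, -1]
counts = confusion_counts(y_true, y_pred)
accuracy, sensitivity, specificity = evaluate(*counts)
print(counts, accuracy, sensitivity, specificity)
```

In this toy example one cancer case is missed, so sensitivity drops to 2/3 while specificity stays at 1. Running time, the fourth evaluation parameter, could be measured separately, for example with `time.perf_counter()` around training and prediction.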

Results
In this section, we discuss the performance evaluation of the TWSVM classification model with a linear kernel, a polynomial kernel, and an RBF kernel. The kernel-based TWSVM classification model proposed in this paper follows the research in [20], which detects hepatitis using TWSVM with a linear kernel and an RBF kernel. In [20], the accuracy produced by the RBF kernel is higher than that of the linear kernel, which indicates that the RBF kernel is the appropriate kernel for detecting hepatitis using TWSVM.
In this paper, we built TWSVM classification models with a linear kernel, a polynomial kernel, and an RBF kernel to detect pancreatic cancer. Table 4 compares the performance of TWSVM with the linear kernel, the polynomial kernel with d = 4, and the RBF kernel with σ = 0.05. The evaluation parameters compared are accuracy, sensitivity, specificity, and running time. Table 4 shows that the TWSVM model with the RBF kernel has the highest accuracy, reaching 98%; that is, it correctly classified 98% of all cases. The lowest accuracy, 80%, was produced by the TWSVM model with the polynomial kernel. The TWSVM model with the RBF kernel also has higher sensitivity and specificity than the models with the linear and polynomial kernels, at 97% and 100%, respectively. This means that the TWSVM model with the RBF kernel correctly identified 97% of all pancreatic cancer cases and 100% of all non-pancreatic cancer cases.
Considering the accuracy, sensitivity, and specificity produced by the TWSVM models with the linear, polynomial, and RBF kernels, the TWSVM model with the RBF kernel performs best overall. This suggests that the pancreatic cancer dataset can be separated almost exactly using the RBF kernel. However, the TWSVM model with the RBF kernel has the longest running time of the three, around 1.3408 s, whereas the model with the polynomial kernel has the shortest, around 1.2040 s. Even so, the running time of the RBF kernel is acceptable for early detection of pancreatic cancer. Thus, the RBF kernel is the best kernel for TWSVM in detecting pancreatic cancer early.

Conclusions
Early detection of pancreatic cancer is very important so that treatment can begin before the cancer spreads to other organs in the body. However, early detection of pancreatic cancer is difficult because this cancer has non-specific symptoms. The kernel-based twin support vector machine method can help detect pancreatic cancer early, based on blood tests. The most appropriate kernel for the TWSVM method in detecting pancreatic cancer is the RBF kernel, which produces an accuracy of 98%, a sensitivity of 97%, a specificity of 100%, and a running time of about 1.3408 s.