Article

Improved Generalized-Pinball-Loss-Based Laplacian Twin Support Vector Machine for Data Classification

by
Vipavee Damminsed
1,† and
Rabian Wangkeeree
1,2,*,†
1
Department of Mathematics, Faculty of Science, Naresuan University, Phitsanulok 65000, Thailand
2
Research Center for Academic Excellence in Mathematics, Naresuan University, Phitsanulok 65000, Thailand
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Symmetry 2024, 16(10), 1373; https://doi.org/10.3390/sym16101373
Submission received: 9 September 2024 / Revised: 9 October 2024 / Accepted: 12 October 2024 / Published: 15 October 2024
(This article belongs to the Section Mathematics)

Abstract

Nowadays, unlabeled data are abundant, whereas supervised learning cannot exploit them because it relies solely on labeled data, which are costly and time-consuming to acquire. Additionally, real-world data often suffer from label noise, which degrades the performance of supervised models. Semi-supervised learning addresses these issues by using both labeled and unlabeled data. This study extends the twin support vector machine with the generalized pinball loss function (GPin-TSVM) into a semi-supervised framework by incorporating graph-based methods. The assumption is that connected data points should share similar labels, with mechanisms to handle noisy labels. Laplacian regularization ensures uniform information spread across the graph, promoting a balanced label assignment. By leveraging the Laplacian term, two quadratic programming problems are formulated, resulting in the LapGPin-TSVM. Our proposed model reduces the impact of noise and improves classification accuracy. Experimental results on UCI benchmarks and image classification demonstrate its effectiveness. Furthermore, in addition to accuracy, performance is also measured using the Matthews Correlation Coefficient (MCC) score, and the experiments are analyzed through statistical methods.

1. Introduction

Support vector machine (SVM) [1] is an efficient machine learning model that remains widely used today, owing to its simplicity and easily interpretable mathematical foundation. It finds a globally optimal classification solution by separating the data with a single optimal hyperplane. Despite this popularity, the SVM must solve a single large quadratic programming problem (QPP), which involves handling large matrices.
To improve computational efficiency, Jayadeva et al. [2] developed the twin support vector machine (TSVM), which finds two nonparallel hyperplanes, each closer to one class. This reduces problem-solving time by splitting it into two smaller QPPs. Due to its lower computational cost and better generalization than an SVM, many adaptations of the TSVM have emerged. Kumar et al. [3] introduced a least-squares TSVM (LS-TSVM) to simplify computations by replacing QPPs with linear equations. Mei et al. [4] extended TSVMs to multi-task learning with a multi-task LS-TSVM algorithm. Rastogi et al. [5] proposed a robust parametric TSVM (RP-TWSVM), which adjusts the margin to handle heteroscedastic noise. Further extensions of TSVMs are discussed in [6,7,8,9].
A TSVM assigns equal importance to all data, making it sensitive to noise, outliers, and class imbalances [10], which can lead to reduced predictive capability or overfitting. To address these issues, Rezvani et al. [11] introduced Intuitionistic Fuzzy Twin SVMs (IFTSVMs) using intuitionistic fuzzy sets. Xu et al. [12] proposed PinTSVMs for noise insensitivity, and Tanveer et al. [13] introduced the general TSVM with pinball loss (Pin-GTSVM), which reduced sensitivity to outliers. However, TSVMs and Pin-GTSVMs lose model sparsity, motivating Tanveer et al. [14] to propose a Sparse Pinball TSVM (SP-TSVM) using the ϵ -insensitive-zone pinball loss. Rastogi et al. [15] developed the generalized pinball loss, which extends the pinball and hinge loss functions. Panup et al. [16] applied this generalized pinball loss to a TSVM, resulting in the GPin-TSVM. The GPin-TSVM improves accuracy in pattern classification, handles noise and outliers effectively, and retains model sparsity, enhancing scalability. Its structure, based on solving two smaller QPPs, reduces computational complexity and increases efficiency.
The model discussed above is classified as supervised learning, which relies on labeled data for training. However, a key challenge with this approach is the requirement for labeled data, which can be difficult and costly to obtain. Additionally, real-world data often suffer from label noise, where incorrect labels degrade the performance of supervised models. To overcome this limitation, unsupervised learning [17] has been developed, allowing models to be built using unlabeled data. To combine the advantages of both labeled and unlabeled data, a method called semi-supervised learning has emerged, enabling models to utilize both types of data effectively. This area has seen significant growth.
Semi-supervised learning (SSL) [18] integrates both labeled and unlabeled data to enhance the effectiveness of supervised learning. The goal is to build a more robust classifier by leveraging large volumes of unlabeled data alongside a relatively small set of labeled data. Recent advancements in deep learning, such as GACNet, CVANet, and CATNet, apply semi-supervised techniques and attention mechanisms to address data scarcity and improve feature extraction. GACNet [19] uses a semi-supervised GAN to augment hyperspectral datasets, while CVANet [20] and CATNet [21] enhance image resolution and classification with attention modules. These methods collectively demonstrate the power of combining semi-supervised learning and attention for robust image processing across various domains. In addressing noisy labels, the ECMB framework [22] introduced real-time correction with a Mixup entropy and a balance term to prevent overfitting. Then, C2MT [23] advanced this with a co-teaching strategy and the Median Balance Strategy (MBS) to maintain class balance. In 2024, BPT-PLR [24] improved accuracy and robustness by addressing class imbalance and optimization conflicts through a Gaussian mixture model and a pseudo-label relaxed contrastive loss.
While these advancements strengthened noisy label learning, a graph-based approach offers a more structured solution for combining labeled and unlabeled data. The Laplacian Twin Support Vector Machine (Lap-TSVM) [25] enhances the TSVM by incorporating Laplacian regularization to exploit the underlying data structure. This technique smooths the decision function across the data manifold, making the Lap-TSVM especially effective for data with local structures. Widely used in machine learning, spectral clustering, and semi-supervised learning, it captures the structural properties of graphs to improve model performance [26]. One notable feature of the graph Laplacian is its symmetry; when the graph is undirected, the Laplacian matrix is symmetric. This symmetry is critical in many spectral graph algorithms, simplifying computations and ensuring the stability of solutions derived from the matrix. The Laplacian matrix’s symmetry and regularization make it effective for graph-based representations in machine learning, particularly in semi-supervised learning. To further reduce computational time, Chen et al. [27] developed a least-squares version of Lap-TSVM that solved linear equations. Additionally, the Lap-PTSVM [28] was introduced, integrating pinball loss with Lap-TSVM and yielding promising results in classification tasks.
In earlier discussions, the GPin-TSVM extended SVM to nonparallel hyperplanes, offering a resilient model that tackled challenges like noise and sparsity, making it highly reliable for practical applications. To further improve classification performance, particularly when labeled data are limited, the GPin-TSVM is extended by integrating a Laplacian graph-based technique, resulting in the LapGPin-TSVM, a semi-supervised model. The LapGPin-TSVM leverages both labeled and unlabeled data, making it more robust in real-world scenarios where large-scale labeling is impractical.
Our approach involves formulating two quadratic programming problems (QPPs) to solve for the hyperplanes. We evaluate the model across 13 UCI benchmark datasets using 5-fold cross-validation and varying the ratio of unlabeled data (20%, 40%, 60%, 80%) and noise (0%, 5%, 10%). Additionally, we investigate its performance in image classification using CIFAR-10. The results are analyzed using the Wilcoxon signed-rank test to determine statistical significance. The overall concept of this study is illustrated in Figure 1. Our proposed approach is outlined and characterized by the following:
  • We combine the twin support vector machine based on the generalized pinball loss (GPin-TSVM) with the Laplacian technique, introducing a novel semi-supervised framework named LapGPin-TSVM. Additionally, we demonstrate noise insensitivity along with a corresponding analysis.
  • We evaluate the efficacy of our model through experiments on the UCI dataset, using various ratios of unlabeled data and noise, and compare the results with three state-of-the-art models. Moreover, we also investigate the application of our approach to image classification.
  • To analyze the performance of LapGPin-TSVM, we employ the win/tie/loss method, average rank, and use the Wilcoxon signed-rank test to better describe the effectiveness of our proposed method.
The rest of the paper is organized as follows: The preliminary section covers the notation used in the study. We then describe key models, including the TSVM, GPin-TSVM, and semi-supervised Lap-TSVM. Section 3 discusses the primal and dual problems, along with the properties of the model. Section 4 provides a model comparison, and Section 5 presents the experimental results. Section 6 offers the discussion, followed by the conclusion in Section 7.

2. Preliminaries

To understand the fundamental concepts, we define the notation and symbols used in this work as follows. For a matrix denoted as M and a vector represented by x, their transposes are denoted by $M^\top$ and $x^\top$, respectively. The inverse of M is written as $M^{-1}$. Here, we are dealing with a binary semi-supervised classification problem in a d-dimensional real space $\mathbb{R}^d$. The complete dataset is denoted as $M = \{(x_1, y_1), \ldots, (x_l, y_l), x_{l+1}, \ldots, x_{l+u}\}$, where $y_i \in \{-1, +1\}$, and $x_i$ for $i = 1, \ldots, l$ represents the labeled data, while $x_i$ for $i = l+1, \ldots, l+u$ represents the unlabeled data. Assume that we have $l_1$ and $l_2$ labeled samples belonging to classes +1 and −1, respectively. The positively labeled samples are collected in the matrix $A \in \mathbb{R}^{l_1 \times d}$, and the negatively labeled samples are denoted by the matrix $B \in \mathbb{R}^{l_2 \times d}$.

2.1. Twin Support Vector Machine (TSVM)

This method is based on the idea of identifying two nonparallel hyperplanes that classify the data points into their respective classes. It involves a pair of nonparallel planes given by:
$$x^\top w_1 + b_1 = 0 \quad \text{and} \quad x^\top w_2 + b_2 = 0,$$
where $w_1, w_2 \in \mathbb{R}^n$ and $b_1, b_2 \in \mathbb{R}$. This approach requires solving two small-sized quadratic programming problems (QPPs), formulated as follows:
$$\min_{w_1, b_1, \xi}\ \frac{1}{2}\|A w_1 + e_1 b_1\|^2 + c_1 e_2^\top \xi \quad \text{s.t.}\quad -(B w_1 + e_2 b_1) + \xi \ge e_2,\ \ \xi \ge 0,$$
and
$$\min_{w_2, b_2, \xi}\ \frac{1}{2}\|B w_2 + e_2 b_2\|^2 + c_2 e_1^\top \xi \quad \text{s.t.}\quad (A w_2 + e_1 b_2) + \xi \ge e_1,\ \ \xi \ge 0,$$
where $\xi$ is a slack variable, $c_1$ and $c_2$ are positive penalty parameters, and $e_1$ and $e_2$ are vectors of ones of the appropriate size. As the twin support vector machine demonstrates commendable performance, researchers continue to enhance it, in particular by integrating a new type of loss, the generalized pinball loss, with a specific focus on improving its handling of classification tasks.

2.2. Twin Support Vector Machine with Generalized Pinball Loss (GPin-TSVM)

Panup et al. [16] recently introduced a variant of the TSVM called the generalized pinball loss TSVM, denoted as GPin-TSVM. The generalized pinball loss function is defined as follows:
$$L^{\epsilon_1, \epsilon_2}_{\tau_1, \tau_2}(u) = \begin{cases} \tau_1\Big(u - \dfrac{\epsilon_1}{\tau_1}\Big), & u > \dfrac{\epsilon_1}{\tau_1},\\[4pt] 0, & -\dfrac{\epsilon_2}{\tau_2} \le u \le \dfrac{\epsilon_1}{\tau_1},\\[4pt] -\tau_2\Big(u + \dfrac{\epsilon_2}{\tau_2}\Big), & u < -\dfrac{\epsilon_2}{\tau_2}, \end{cases}$$
where $u = 1 - y(w^\top x + b)$ and $\tau_1, \tau_2, \epsilon_1, \epsilon_2 \ge 0$. This improvement over the $\epsilon$-insensitive zone [29] results in enhanced model sparsity while retaining all the inherent properties of the original. The optimization problems are outlined as follows:
$$\begin{aligned} \min_{w_1, b_1, \xi}\ & \frac{1}{2}\|A w_1 + e_1 b_1\|^2 + c_1 e_2^\top \xi\\ \text{s.t.}\ & -(B w_1 + e_2 b_1) \ge e_2 - \frac{1}{\tau_1}(\xi + \epsilon_1 e_2),\\ & -(B w_1 + e_2 b_1) \le e_2 + \frac{1}{\tau_2}(\xi + \epsilon_2 e_2),\quad \xi \ge 0, \end{aligned}$$
and
$$\begin{aligned} \min_{w_2, b_2, \xi}\ & \frac{1}{2}\|B w_2 + e_2 b_2\|^2 + c_2 e_1^\top \xi\\ \text{s.t.}\ & (A w_2 + e_1 b_2) \ge e_1 - \frac{1}{\tau_3}(\xi + \epsilon_3 e_1),\\ & (A w_2 + e_1 b_2) \le e_1 + \frac{1}{\tau_4}(\xi + \epsilon_4 e_1),\quad \xi \ge 0, \end{aligned}$$
where $\tau_1, \tau_2, \tau_3, \tau_4, \epsilon_1, \epsilon_2, \epsilon_3$, and $\epsilon_4$ are non-negative parameters. Define $P = [A \ \ e_1]$ and $Q = [B \ \ e_2]$. Their dual problems are as follows:
$$\begin{aligned} \min_{\alpha, \lambda}\ & \frac{1}{2}\lambda^\top Q (P^\top P)^{-1} Q^\top \lambda - \lambda^\top e_2\Big(1 + \frac{\epsilon_2}{\tau_2}\Big) + \alpha^\top e_2\Big(\frac{\epsilon_1}{\tau_1} + \frac{\epsilon_2}{\tau_2}\Big)\\ \text{s.t.}\ & 0 \le \Big(\frac{1}{\tau_1} + \frac{1}{\tau_2}\Big)\alpha - \frac{\lambda}{\tau_2} \le c_1 e_2,\quad \alpha \ge 0,\quad \alpha - \lambda \ge 0, \end{aligned}$$
and
$$\begin{aligned} \min_{\omega, \mu}\ & \frac{1}{2}\mu^\top P (Q^\top Q)^{-1} P^\top \mu - \mu^\top e_1\Big(1 + \frac{\epsilon_4}{\tau_4}\Big) + \omega^\top e_1\Big(\frac{\epsilon_3}{\tau_3} + \frac{\epsilon_4}{\tau_4}\Big)\\ \text{s.t.}\ & 0 \le \Big(\frac{1}{\tau_3} + \frac{1}{\tau_4}\Big)\omega - \frac{\mu}{\tau_4} \le c_2 e_1,\quad \omega \ge 0,\quad \omega - \mu \ge 0, \end{aligned}$$
where $\alpha$, $\lambda$, $\omega$, and $\mu \ge 0$ are Lagrange multipliers. After solving the QPPs, the separating hyperplanes are obtained from
$$\begin{bmatrix} w_1 \\ b_1 \end{bmatrix} = -(P^\top P)^{-1} Q^\top \lambda,$$
and
$$\begin{bmatrix} w_2 \\ b_2 \end{bmatrix} = (Q^\top Q)^{-1} P^\top \mu.$$
The GPin-TSVM improves the SVM model by allowing nonparallel hyperplanes, helping it handle various challenges while maintaining strong performance compared to the TSVM. It uses a special loss function called the generalized pinball loss, which is better at handling noise and outliers than the original TSVM. This means that noisy data have less effect on the decision boundaries, making it a reliable choice for real-world applications. Additionally, its increased sparsity makes it easier to compute, improving its scalability.
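As an illustration of how this loss behaves, the following is a minimal NumPy sketch of the generalized pinball loss defined above; the function name and vectorized form are ours, and it assumes $\tau_1, \tau_2 > 0$ so that the flat band $[-\epsilon_2/\tau_2, \epsilon_1/\tau_1]$ is well defined.

```python
import numpy as np

def generalized_pinball_loss(u, tau1, tau2, eps1, eps2):
    """Elementwise generalized pinball loss; zero on the band [-eps2/tau2, eps1/tau1]."""
    u = np.asarray(u, dtype=float)
    upper, lower = eps1 / tau1, -eps2 / tau2          # assumes tau1, tau2 > 0
    loss = np.zeros_like(u)
    loss = np.where(u > upper, tau1 * (u - upper), loss)    # tau1 * (u - eps1/tau1)
    loss = np.where(u < lower, -tau2 * (u - lower), loss)   # -tau2 * (u + eps2/tau2)
    return loss

# Example: penalties on the negative-class margins u = 1 + (B @ w1 + b1).
# penalties = generalized_pinball_loss(1 + B @ w1 + b1, 0.5, 0.5, 0.1, 0.1)
```

Samples that fall inside the flat band incur no penalty, which is the source of the sparsity discussed above.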
However, as a purely supervised method, the TSVM relies on labeled data, which can be limited in practice. While effective on well-annotated datasets, the TSVM struggles in scenarios where labeling large volumes of data is costly or impractical. To address this limitation, a method called semi-supervised learning (SSL) has emerged. SSL enables the use of abundant unlabeled data to improve classifier robustness and accuracy while requiring fewer labeled examples. Next, we discuss a semi-supervised learning model based on the TSVM.

2.3. Laplacian Twin Support Vector Machine

The Lap-TSVM model [25] is formulated by integrating a semi-supervised learning framework derived from the TSVM. As mentioned above, the semi-supervised technique uses the Laplacian matrix, which finds applications in various domains, particularly in spectral graph theory and graph-based machine learning. Its properties and eigenvalues are related to the structure and connectivity of the underlying graph. The Lap-TSVM finds a pair of nonparallel planes as follows:
$$x^\top w_1 + b_1 = 0 \quad \text{and} \quad x^\top w_2 + b_2 = 0,$$
where $w_1, w_2 \in \mathbb{R}^n$ and $b_1, b_2 \in \mathbb{R}$. The primal problems of this work can be expressed as
$$\begin{aligned} \min_{w_1, b_1, \xi}\ & \frac{1}{2}\|A w_1 + e_1 b_1\|^2 + c_1 e_2^\top \xi + c_2(\|w_1\|^2 + b_1^2) + c_3 (M w_1 + e b_1)^\top L (M w_1 + e b_1)\\ \text{s.t.}\ & -(B w_1 + e_2 b_1) + \xi \ge e_2,\quad \xi \ge 0, \end{aligned}$$
and
$$\begin{aligned} \min_{w_2, b_2, \xi}\ & \frac{1}{2}\|B w_2 + e_2 b_2\|^2 + c_1 e_1^\top \xi + c_2(\|w_2\|^2 + b_2^2) + c_3 (M w_2 + e b_2)^\top L (M w_2 + e b_2)\\ \text{s.t.}\ & (A w_2 + e_1 b_2) + \xi \ge e_1,\quad \xi \ge 0. \end{aligned}$$
The first three terms are concepts from the TSVM. The fourth term is the Laplacian term, which considers the entire dataset. Here, M is the matrix that contains both the labeled and unlabeled data. L denotes the graph Laplacian, defined as $L = D - W$, where D is the diagonal matrix of vertex degrees and W is the adjacency (weight) matrix of the k-nearest-neighbor graph, expressed as follows:
$$W_{ij} = \begin{cases} \exp\big(-\|x_i - x_j\|_2^2 / 2\sigma^2\big), & \text{if } x_i, x_j \text{ are neighbors},\\ 0, & \text{otherwise}, \end{cases}$$
where σ is a parameter that controls the width of the Gaussian kernel, influencing the similarity measure between neighboring points. The decision function of this problem is derived from:
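A minimal sketch of this graph construction is given below, using scikit-learn's kneighbors_graph; the library choice and the default values of k and σ are our own assumptions, as the Lap-TSVM only specifies the k-nearest-neighbor Gaussian weighting itself.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def build_graph_laplacian(X, k=5, sigma=1.0):
    """Graph Laplacian L = D - W for a k-NN graph with Gaussian edge weights."""
    dist = kneighbors_graph(X, n_neighbors=k, mode='distance', include_self=False)
    W = dist.copy()
    W.data = np.exp(-dist.data ** 2 / (2.0 * sigma ** 2))   # W_ij from the formula above
    W = W.toarray()
    W = np.maximum(W, W.T)        # symmetrize: an undirected graph gives a symmetric L
    D = np.diag(W.sum(axis=1))    # diagonal matrix of vertex degrees
    return D - W
```

The explicit symmetrization step reflects the symmetry property of the Laplacian matrix emphasized in the Introduction.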
$$\mathrm{class}(i) = \arg\min_{i=1,2} \frac{|x^\top w_i + b_i|}{\|w_i\|},$$
where $|\cdot|$ gives the perpendicular distance of a point x to the two hyperplanes $x^\top w_1 + b_1 = 0$ and $x^\top w_2 + b_2 = 0$.
In summary, integrating semi-supervised learning into the TSVM framework offers the benefit of utilizing information from both labeled and unlabeled data. This can enhance model performance, particularly in situations where labeled data are limited.

3. Proposed Work

Inspired by the concepts of GPin-TSVMs and Lap-TSVMs, we developed the LapGPin-TSVM model. This extension shifts from supervised to semi-supervised learning by integrating the Laplacian regularization term.
The motivation behind our interest in semi-supervised learning is rooted in the challenges posed by relying solely on labeled data. When dealing with problems that involve the entire dataset, labeled data alone may not be sufficient to establish the most accurate nonparallel hyperplane. This limitation is clearly demonstrated in Figure 2. As shown in Figure 2a, which illustrates the distribution of the entire dataset, the solid circles represent labeled data with two colors: red for the positive class and green for the negative class. The plus symbols, both red and green, represent unlabeled data. The original GPin-TSVM, a supervised learning model, can only be trained using labeled data, as illustrated in Figure 2b. However, in the new approach, the LapGPin-TSVM, where the Laplacian term is incorporated into the GPin-TSVM, the model learns from both labeled and unlabeled data, as demonstrated in Figure 2c. This approach results in a more reasonable nonparallel hyperplane that better corresponds to the data distribution.

3.1. Primal Problem

The classification task aims to define two nonparallel hyperplanes, similar to (11). We derive this by applying the frameworks of GPin-TSVM ((5) and (6)) and Lap-TSVM ((12) and (13)) to find the positive and negative hyperplanes. This leads to the following optimization problem:
$$\begin{aligned} \min_{w_1, b_1, \xi}\ & \frac{1}{2}\|A w_1 + e_1 b_1\|^2 + c_1 e_2^\top \xi + \frac{c_2}{2}(\|w_1\|^2 + b_1^2) + \frac{c_3}{2}(M w_1 + e b_1)^\top L (M w_1 + e b_1)\\ \text{s.t.}\ & -(B w_1 + e_2 b_1) \ge e_2 - \frac{1}{\tau_1}(\xi + \epsilon_1 e_2),\\ & -(B w_1 + e_2 b_1) \le e_2 + \frac{1}{\tau_2}(\xi + \epsilon_2 e_2),\quad \xi \ge 0, \end{aligned}$$
and
$$\begin{aligned} \min_{w_2, b_2, \xi}\ & \frac{1}{2}\|B w_2 + e_2 b_2\|^2 + c_1 e_1^\top \xi + \frac{c_2}{2}(\|w_2\|^2 + b_2^2) + \frac{c_3}{2}(M w_2 + e b_2)^\top L (M w_2 + e b_2)\\ \text{s.t.}\ & (A w_2 + e_1 b_2) \ge e_1 - \frac{1}{\tau_3}(\xi + \epsilon_3 e_1),\\ & (A w_2 + e_1 b_2) \le e_1 + \frac{1}{\tau_4}(\xi + \epsilon_4 e_1),\quad \xi \ge 0, \end{aligned}$$
where $c_1, c_2, c_3$ are non-negative parameters, $\tau_1, \tau_2, \tau_3, \tau_4, \epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4 \ge 0$, and $e_1$ and $e_2$ are vectors of ones of the appropriate size. If we set $c_2$ and $c_3$ to 0 in (16) and (17), the problems reduce to the GPin-TSVM, demonstrating that our proposed model is a generalization of the previous one.
In problem (16), the first term aims to minimize the sum of the squared distances of labeled samples to the hyperplane, while the second term represents the slack variable controlling the loss of samples through the generalized pinball loss. The term $\|w_1\|^2 + b_1^2$ serves as regularization to prevent ill-conditioning. The fourth term is the Laplacian regularization term, introducing a penalty for deviations from smoothness in the decision function across the data manifold. It encourages the model to respect the underlying geometric structure of the data, promoting a more coherent decision boundary.
The given minimization problems can be converted into an unconstrained optimization format by absorbing the constraints into the generalized pinball loss. This leads to the formulation of the new problems as follows:
$$\min_{w_1, b_1}\ \frac{1}{2}\|A w_1 + e_1 b_1\|^2 + c_1 e_2^\top L^{\epsilon_1, \epsilon_2}_{\tau_1, \tau_2}\big(e_2 + (B w_1 + e_2 b_1)\big) + c_2(\|w_1\|^2 + b_1^2) + c_3 (M w_1 + e b_1)^\top L (M w_1 + e b_1),$$
and
$$\min_{w_2, b_2}\ \frac{1}{2}\|B w_2 + e_2 b_2\|^2 + c_1 e_1^\top L^{\epsilon_1, \epsilon_2}_{\tau_1, \tau_2}\big(e_1 - (A w_2 + e_1 b_2)\big) + c_2(\|w_2\|^2 + b_2^2) + c_3 (M w_2 + e b_2)^\top L (M w_2 + e b_2).$$
These problems may yield different representations of the constrained optimization (16) and (17), but they have equivalent solutions to the original problems.
In seeking the solution to the optimization problems (16) and (17), we reformulate it into dual forms and utilize quadratic programming problems (QPPs) for resolution. This is further discussed in the next section.

3.2. Dual Problem

To derive the dual form and solve the problem, we focus on problem (16), as the computation for problem (17) is analogous. Here, we introduce the Lagrange multipliers $\alpha, \gamma, \beta \ge 0$ and obtain the following Lagrangian:
$$\begin{aligned} \mathcal{L}(w_1, b_1, \xi, \alpha, \gamma, \beta) ={}& \frac{1}{2}\|A w_1 + e_1 b_1\|^2 + c_1 e_2^\top \xi + \frac{c_2}{2}(\|w_1\|^2 + b_1^2) + \frac{c_3}{2}(M w_1 + e b_1)^\top L (M w_1 + e b_1)\\ &- \alpha^\top\Big({-(B w_1 + e_2 b_1)} - e_2 + \frac{1}{\tau_1}(\xi + \epsilon_1 e_2)\Big) - \gamma^\top\Big(e_2 + \frac{1}{\tau_2}(\xi + \epsilon_2 e_2) + (B w_1 + e_2 b_1)\Big) - \beta^\top \xi. \end{aligned}$$
Next, we apply the KKT optimality conditions to obtain the following results:
$$\frac{\partial \mathcal{L}}{\partial w_1} = A^\top(A w_1 + e_1 b_1) + c_2 w_1 + c_3 M^\top L (M w_1 + e b_1) + B^\top \alpha - B^\top \gamma = 0,$$
$$\frac{\partial \mathcal{L}}{\partial b_1} = e_1^\top(A w_1 + e_1 b_1) + c_2 b_1 + c_3 e^\top L (M w_1 + e b_1) + e_2^\top \alpha - e_2^\top \gamma = 0,$$
$$\frac{\partial \mathcal{L}}{\partial \xi} = c_1 e_2 - \frac{\alpha}{\tau_1} - \frac{\gamma}{\tau_2} - \beta = 0,$$
$$\alpha^\top\Big({-(B w_1 + e_2 b_1)} - e_2 + \frac{1}{\tau_1}(\xi + \epsilon_1 e_2)\Big) = 0,$$
$$\gamma^\top\Big(e_2 + \frac{1}{\tau_2}(\xi + \epsilon_2 e_2) + (B w_1 + e_2 b_1)\Big) = 0,$$
$$\beta^\top \xi = 0.$$
Since $\beta \ge 0$, we have
$$c_1 e_2 - \frac{\alpha}{\tau_1} - \frac{\gamma}{\tau_2} = \beta \ge 0,$$
which implies that
$$c_1 e_2 \ge \frac{\alpha}{\tau_1} + \frac{\gamma}{\tau_2}.$$
Now, we define $F = [B \ \ e_2]$, $H = [A \ \ e_1]$, $J = [M \ \ e]$, $z_1 = [w_1^\top \ \ b_1]^\top$ and $z_2 = [w_2^\top \ \ b_2]^\top$. By combining (21) and (22) and using (27), the dual form of problem (16) is obtained as follows:
$$\begin{aligned} \min_{\alpha, \gamma}\ & \frac{1}{2}(\alpha - \gamma)^\top F (H^\top H + c_2 I + c_3 J^\top L J)^{-1} F^\top (\alpha - \gamma) - (\alpha - \gamma)^\top e_2\Big(1 + \frac{\epsilon_2}{\tau_2}\Big) + \alpha^\top e_2\Big(\frac{\epsilon_1}{\tau_1} + \frac{\epsilon_2}{\tau_2}\Big)\\ \text{s.t.}\ & \frac{\alpha}{\tau_1} + \frac{\gamma}{\tau_2} \le c_1 e_2,\quad \alpha, \gamma \ge 0. \end{aligned}$$
Since the above dual form is considered a quadratic programming problem, we solve it to find the solution. Once solved, we obtain the values of α and γ , and then we obtain:
$$z_1 = -(H^\top H + c_2 I + c_3 J^\top L J)^{-1} F^\top (\alpha - \gamma).$$
Similarly, for the negative hyperplane (17), we obtain
$$\begin{aligned} \min_{\mu, \eta}\ & \frac{1}{2}(\mu - \eta)^\top H (F^\top F + c_5 I + c_6 J^\top L J)^{-1} H^\top (\mu - \eta) - (\mu - \eta)^\top e_1\Big(1 + \frac{\epsilon_4}{\tau_4}\Big) + \mu^\top e_1\Big(\frac{\epsilon_3}{\tau_3} + \frac{\epsilon_4}{\tau_4}\Big)\\ \text{s.t.}\ & \frac{\mu}{\tau_3} + \frac{\eta}{\tau_4} \le c_4 e_1,\quad \mu, \eta \ge 0, \end{aligned}$$
where $\mu, \eta \ge 0$ are Lagrange multipliers. The solution of this problem is
$$z_2 = (F^\top F + c_5 I + c_6 J^\top L J)^{-1} H^\top (\mu - \eta).$$
After acquiring the two hyperplanes, we categorize a new data sample $x_i$ using the following expression:
$$\mathrm{class}(i) = \arg\min_{i=1,2} \frac{|x^\top w_i + b_i|}{\|w_i\|}.$$
In this context, $|\cdot|$ represents the perpendicular distance of the data point from the hyperplane. The data point is assigned to the class of the hyperplane to which it has the minimum distance. The algorithm for our proposed method is summarized in Algorithm 1.
Algorithm 1 LapGPin-TSVM
1:
Input:
  • Labeled data: $(x_1, y_1), \ldots, (x_l, y_l)$, where $y_i \in \{-1, +1\}$
  • Unlabeled data: $x_{l+1}, \ldots, x_{l+u}$
  • Parameters: $c_1, c_2, c_3, \tau_1, \tau_2, \tau_3, \tau_4, \epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4$
2:
Preprocess Data:
  • Combine labeled and unlabeled data into matrix M
  • Build Laplacian matrix L using a graph-based approach
  • Split labeled data into positive class ( A ) and negative class ( B )
3:
Define two nonparallel hyperplanes:
$$x^\top w_1 + b_1 = 0 \quad \text{and} \quad x^\top w_2 + b_2 = 0.$$
4:
Formulate the primal optimization problems as (16) and (17), and convert the primal problem to the dual form using Lagrange multipliers as stated in (29) and (31), respectively.
5:
Solve the quadratic programming problems (29) and (31) for α , γ , μ , and η .
6:
Compute the final separating hyperplanes by using (30) and (32):
$$z_1 = -(H^\top H + c_2 I + c_3 J^\top L J)^{-1} F^\top (\alpha - \gamma)$$
and
$$z_2 = (F^\top F + c_5 I + c_6 J^\top L J)^{-1} H^\top (\mu - \eta).$$
7:
Classify new data point x i :
  • Calculate the distance from x i to both hyperplanes and assign class based on the minimal distance:
    $\mathrm{class}(i) = \arg\min_{i=1,2} \dfrac{|x^\top w_i + b_i|}{\|w_i\|}$
8:
Output: The two separating hyperplanes.
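To make Algorithm 1 concrete, the following is a minimal sketch of step 5 for the positive hyperplane: it assembles the dual (29) as a standard QP over the stacked variable $[\alpha; \gamma]$ and solves it with cvxopt, then recovers $z_1$ via (30). The solver choice, the small diagonal jitter, and the explicit sign conventions follow our reading of (29) and (30); the paper itself does not provide code.

```python
import numpy as np
from cvxopt import matrix, solvers

def solve_positive_hyperplane(A, B, M, L, c1, c2, c3, tau1, tau2, eps1, eps2):
    """Sketch of the dual QPP (29); returns (w1, b1) of the positive hyperplane."""
    m1, m2, d = A.shape[0], B.shape[0], A.shape[1]
    e1, e2, e = np.ones((m1, 1)), np.ones((m2, 1)), np.ones((M.shape[0], 1))
    H, F, J = np.hstack([A, e1]), np.hstack([B, e2]), np.hstack([M, e])
    S = H.T @ H + c2 * np.eye(d + 1) + c3 * (J.T @ L @ J)
    G = F @ np.linalg.solve(S, F.T)            # F (H'H + c2 I + c3 J'LJ)^{-1} F'

    # Quadratic form over v = [alpha; gamma]: 0.5 * (alpha - gamma)' G (alpha - gamma).
    P = np.block([[G, -G], [-G, G]]) + 1e-8 * np.eye(2 * m2)
    q = np.vstack([(eps1 / tau1 - 1.0) * e2,   # coefficients of alpha
                   (1.0 + eps2 / tau2) * e2])  # coefficients of gamma
    # Constraints: alpha/tau1 + gamma/tau2 <= c1, alpha >= 0, gamma >= 0.
    Gc = np.vstack([np.hstack([np.eye(m2) / tau1, np.eye(m2) / tau2]), -np.eye(2 * m2)])
    h = np.vstack([c1 * e2, np.zeros((2 * m2, 1))])

    sol = solvers.qp(matrix(P), matrix(q), matrix(Gc), matrix(h))
    v = np.array(sol['x']).reshape(-1, 1)
    alpha, gamma = v[:m2], v[m2:]
    z1 = -np.linalg.solve(S, F.T @ (alpha - gamma))   # Eq. (30)
    return z1[:-1].ravel(), float(z1[-1])
```

The negative hyperplane is obtained analogously from (31) and (32), and a new point is then assigned by the minimum-distance rule of step 7.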

3.3. Property of the LapGPin-TSVM

Noise Insensitivity

We explore how the LapGPin-TSVM tackles the problem of sensitivity to noise. To be concise, we concentrate on the linear case and the problem presented in (18). However, it is worth noting that the same analysis is applicable to the nonlinear case. Define the generalized sign function $\mathrm{sgn}^{\epsilon_1, \epsilon_2}_{\tau_1, \tau_2}(u)$ as
$$\mathrm{sgn}^{\epsilon_1, \epsilon_2}_{\tau_1, \tau_2}(u) = \begin{cases} \tau_1, & \text{if } u > \frac{\epsilon_1}{\tau_1},\\ [0, \tau_1], & \text{if } u = \frac{\epsilon_1}{\tau_1},\\ 0, & \text{if } -\frac{\epsilon_2}{\tau_2} < u < \frac{\epsilon_1}{\tau_1},\\ [-\tau_2, 0], & \text{if } u = -\frac{\epsilon_2}{\tau_2},\\ -\tau_2, & \text{if } u < -\frac{\epsilon_2}{\tau_2}, \end{cases}$$
where $u = 1 - y(w^\top x + b)$, and $\mathrm{sgn}^{\epsilon_1, \epsilon_2}_{\tau_1, \tau_2}(u)$ represents the subgradient of the generalized pinball loss function. By employing the Karush–Kuhn–Tucker (KKT) optimality condition for Equation (18), we can formulate it as follows:
$$0 \in c_1 \sum_{i=1}^{m_2} \mathrm{sgn}^{\epsilon_1, \epsilon_2}_{\tau_1, \tau_2}\big(1 + (w_1^\top x_i + b_1)\big)\, x_i + c_2 w_1 + c_3 M^\top L (M w_1 + e b_1) + A^\top (A w_1 + e_1 b_1).$$
Here, 0 represents the zero vector, $x_i \in B$, and $m_2$ is the number of negative samples. Given $w_1$ and $b_1$, we can partition the index set of B into five distinct sets:
$$\begin{aligned} V_1^+ &= \{i : 1 + (w_1^\top x_i + b_1) > \tfrac{\epsilon_1}{\tau_1}\},\\ V_2^+ &= \{i : 1 + (w_1^\top x_i + b_1) = \tfrac{\epsilon_1}{\tau_1}\},\\ V_3^+ &= \{i : -\tfrac{\epsilon_2}{\tau_2} < 1 + (w_1^\top x_i + b_1) < \tfrac{\epsilon_1}{\tau_1}\},\\ V_4^+ &= \{i : 1 + (w_1^\top x_i + b_1) = -\tfrac{\epsilon_2}{\tau_2}\},\\ V_5^+ &= \{i : 1 + (w_1^\top x_i + b_1) < -\tfrac{\epsilon_2}{\tau_2}\}, \end{aligned}$$
where $i \in \{1, 2, 3, \ldots, m_2\}$. By introducing the notation $V_1^+$, $V_2^+$, $V_3^+$, $V_4^+$, and $V_5^+$, Equation (35) can be reformulated to assert the existence of $\phi_i \in [0, \tau_1]$ and $\theta_i \in [-\tau_2, 0]$ for which:
$$\tau_1 \sum_{i \in V_1^+} x_i + \sum_{i \in V_2^+} \phi_i x_i + \sum_{i \in V_4^+} \theta_i x_i - \tau_2 \sum_{i \in V_5^+} x_i + \frac{1}{c_1} A^\top (A w_1 + e_1 b_1) + \frac{c_2}{c_1} w_1 + \frac{c_3}{c_1} M^\top L (M w_1 + e b_1) = 0.$$
As indicated in Equation (34), the samples in $V_3^+$ may not contribute to the improvement of $w_1$ because the generalized sign function is zero there. However, $V_3^+$ directly influences the sparsity of the model. It is important to note that the quantities $\epsilon_1$ and $\epsilon_2$ regulate the number of samples in $V_3^+$. As $\epsilon_1$ and $\epsilon_2$ approach zero, sparsity diminishes. Conversely, as $\epsilon_1$ and $\epsilon_2$ increase, sparsity is enhanced by including a greater number of samples in $V_3^+$.
Proposition 1.
If there exists a solution to the optimization problem (16) or (29), the following inequalities must be satisfied:
$$-\frac{1}{c_1 \tau_1 m_2}\Big[e_1^\top (A w_1 + e_1 b_1) + c_2 b_1 + c_3 e^\top L (M w_1 + e b_1)\Big] \le 1,$$
and
$$\frac{p_1}{m_2} \le \frac{1}{\tau_1 + \tau_2}\left(\tau_1 + \frac{\big\|e_1^\top (A w_1 + e_1 b_1) + c_2 b_1 + c_3 e^\top L (M w_1 + e b_1)\big\|}{c_1 m_2}\right),$$
where $p_1$ denotes the number of samples in $V_1^+$.
Proof. 
Consider an arbitrary negative point, denoted as $x_{i_0}$, belonging to the set $V_1^+$. Utilizing the KKT conditions represented by Equations (25) and (26), we derive that $\beta_{i_0} = \gamma_{i_0} = 0$. Further analysis of the KKT condition (23) leads to the conclusion that $\alpha_{i_0} = c_1 \tau_1$, resulting in $\alpha_{i_0} - \gamma_{i_0} = c_1 \tau_1$.
Define $\lambda = \alpha - \gamma$, and consequently, $\lambda_{i_0} = c_1 \tau_1$. Additionally, the expression $-\big[e_1^\top (A w_1 + e_1 b_1) + c_2 b_1 + c_3 e^\top L (M w_1 + e b_1)\big] = p_1 c_1 \tau_1 + \sum_{i \notin V_1^+} \lambda_i$ is obtained from (22).
Considering the constraints $\alpha_i \ge 0$ and $\gamma_i \ge 0$, it follows that $-c_1 \tau_2 \le \lambda_i \le c_1 \tau_1$. Thus, the sum of $\lambda_i$ over points not in $V_1^+$ is given by
$$\sum_{i \notin V_1^+} \lambda_i = -\big[e_1^\top (A w_1 + e_1 b_1) + c_2 b_1 + c_3 e^\top L (M w_1 + e b_1)\big] - p_1 c_1 \tau_1.$$
Consequently, we establish that $-(m_2 - p_1) c_1 \tau_2 \le \sum_{i \notin V_1^+} \lambda_i \le (m_2 - p_1) c_1 \tau_1$. This leads to the inequality:
$$-\big[e_1^\top (A w_1 + e_1 b_1) + c_2 b_1 + c_3 e^\top L (M w_1 + e b_1)\big] \le (m_2 - p_1) c_1 \tau_1 + p_1 c_1 \tau_1,$$
and
$$p_1 c_1 \tau_1 \le -\big[e_1^\top (A w_1 + e_1 b_1) + c_2 b_1 + c_3 e^\top L (M w_1 + e b_1)\big] + (m_2 - p_1) c_1 \tau_2.$$
This completes the proof of our argument. □
When analyzing the aforementioned proposition, we find that the values of $\tau_1$ and $\tau_2$ affect the number of samples in the set $V_1^+$. Decreasing the $\tau_1, \tau_2$ values results in fewer members in $V_1^+$, causing the decision boundary to be more sensitive to noise. Conversely, increasing the $\tau_1, \tau_2$ values makes the decision boundary less sensitive to noise. Additionally, there is a term related to the graph-based approach, namely, the Laplacian term. This means that the analysis of both labeled and unlabeled data influences the creation of the classification boundary.
In considering the negative hyperplane, we define sets in a similar manner, but with respect to the positive samples instead. These sets are as follows:
$$\begin{aligned} V_1^- &= \{i : 1 - (w_2^\top x_i^+ + b_2) > \tfrac{\epsilon_3}{\tau_3}\},\\ V_2^- &= \{i : 1 - (w_2^\top x_i^+ + b_2) = \tfrac{\epsilon_3}{\tau_3}\},\\ V_3^- &= \{i : -\tfrac{\epsilon_4}{\tau_4} < 1 - (w_2^\top x_i^+ + b_2) < \tfrac{\epsilon_3}{\tau_3}\},\\ V_4^- &= \{i : 1 - (w_2^\top x_i^+ + b_2) = -\tfrac{\epsilon_4}{\tau_4}\},\\ V_5^- &= \{i : 1 - (w_2^\top x_i^+ + b_2) < -\tfrac{\epsilon_4}{\tau_4}\}, \end{aligned}$$
where $i \in \{1, 2, 3, \ldots, m_1\}$. For the analysis, we use the same approach. Consequently, we obtain the following proposition.
Proposition 2.
If there exists a solution to the optimization problem (17) or (31), the following inequalities must be satisfied:
$$-\frac{1}{c_1 \tau_3 m_1}\Big[e_2^\top (B w_2 + e_2 b_2) + c_2 b_2 + c_3 e^\top L (M w_2 + e b_2)\Big] \le 1,$$
and
$$\frac{p_2}{m_1} \le \frac{1}{\tau_3 + \tau_4}\left(\tau_3 + \frac{\big\|e_2^\top (B w_2 + e_2 b_2) + c_2 b_2 + c_3 e^\top L (M w_2 + e b_2)\big\|}{c_1 m_1}\right),$$
where $p_2$ denotes the number of samples in $V_1^-$.
The inclusion of Laplacian regularization in LapGPin-TSVM contributes to enhanced generalization and robustness, particularly in scenarios where the data exhibit intrinsic geometric properties.

4. Comparison of the Models

We compare our model with three other models: GPin-TSVM [16], Lap-TSVM [25], and Lap-PTSVM [28].

4.1. LapGPin-TSVM vs. GPin-TSVM

Both the original GPin-TSVM and the new LapGPin-TSVM find solutions using the dual problem and obtain nonparallel hyperplanes. They utilize the same generalized pinball loss function to enhance model performance. Additionally, both approaches are designed to tackle issues of sparsity and noise sensitivity. However, the GPin-TSVM operates using only labeled data, while the LapGPin-TSVM can leverage both labeled and unlabeled data. Moreover, in the LapGPin-TSVM framework, setting the regularization coefficients $c_2$ and $c_3$ to zero reduces the problem to a GPin-TSVM, making the LapGPin-TSVM a more generalized approach.

4.2. LapGPin-TSVM vs. Lap-TSVM

The LapGPin-TSVM and Lap-TSVM generate two nonparallel hyperplanes and solve the dual problem. They operate within a semi-supervised framework that incorporates the Laplacian term. However, there are key differences between them. The Lap-TSVM is based on the study of TSVMs and hinge loss, which does not tackle the noise sensitivity problem since this loss penalizes only misclassified data. In contrast, the LapGPin-TSVM is improved based on the GPin-TSVM, which utilizes the generalized pinball loss function, penalizing all data, even if correctly classified. This makes the LapGPin-TSVM stable for resampling and imparts noise insensitivity to the model.

4.3. LapGPin-TSVM vs. Lap-PTSVM

The LapGPin-TSVM and Lap-PTSVM generate two nonparallel hyperplanes and address the dual problem within a semi-supervised framework. They differ notably in their loss functions. The Lap-PTSVM uses a pinball loss, which penalizes deviations from a specific threshold, focusing on particular quantile issues but not necessarily handling all data points uniformly. On the other hand, the LapGPin-TSVM utilizes a generalized pinball loss, which offers a more flexible and thorough approach to penalization. This type of loss considers the entire range of errors, including those from correctly classified instances, thus boosting the model’s robustness. Consequently, the LapGPin-TSVM enhances model sparsity and offers improved generalization, making it more adaptable to various datasets and resistant to noise.

5. Numerical Experiments

In this section, the GPin-TSVM [16], Lap-TSVM [25], and Lap-PTSVM [28] are compared with the LapGPin-TSVM. We selected 13 benchmark datasets to evaluate the performance of our proposed model. A comprehensive overview of these datasets is presented in Table 1. A grid search was employed to explore a range of hyperparameters. We fine-tuned the parameters $c_1$, $c_2$, and $c_3$ over the set $\{10^i : i = -5, -4, \ldots, 4, 5\}$, while $\tau_1, \tau_2, \tau_3, \tau_4, \epsilon_1, \epsilon_2, \epsilon_3$, and $\epsilon_4$ were selected from the range $(0, 1)$. In the nonlinear case, the kernel parameter $\sigma$ was tuned over the set $\{10^i : i = -3, -2, \ldots, 2, 3\}$.
All experiments were executed in Python 3.9.5 on a Windows 10 system, utilizing an Intel(R) Core(TM) i7-4500U CPU @ 1.80 GHz 2.40 GHz. The experimental results were obtained from a 5-fold cross-validation. Our investigation focused on the model performance, emphasizing the effects of varying ratios of unlabeled data and noise. The bold type indicates the best result.
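For concreteness, the search described above can be organized as a plain grid combined with the 5-fold protocol; the specific points sampled from $(0, 1)$ for the τ and ε parameters below are our own illustrative choice (the text only states the open interval), and train_and_score is a placeholder for fitting and scoring one LapGPin-TSVM configuration.

```python
import itertools
import numpy as np
from sklearn.model_selection import StratifiedKFold

c_grid = [10.0 ** i for i in range(-5, 6)]        # candidates for c1, c2, c3
sigma_grid = [10.0 ** i for i in range(-3, 4)]    # RBF kernel width (nonlinear case)
tau_eps_grid = [0.1, 0.3, 0.5, 0.7, 0.9]          # illustrative samples from (0, 1)

def cv_accuracy(X, y, params, train_and_score, n_splits=5, seed=0):
    """Average 5-fold cross-validated accuracy for one hyperparameter setting."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = [train_and_score(X[tr], y[tr], X[te], y[te], params)
              for tr, te in skf.split(X, y)]
    return float(np.mean(scores))

# Example of a (reduced) sweep; in practice every parameter of (16)-(17) is tuned.
# for c1, c3, sigma in itertools.product(c_grid, c_grid, sigma_grid):
#     acc = cv_accuracy(X, y, {"c1": c1, "c3": c3, "sigma": sigma}, train_and_score)
```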

5.1. Evaluation Metrics

In addition to standard measures such as accuracy (ACC), we employed the Matthews Correlation Coefficient (MCC) as a pivotal metric for evaluating the overall classification performance of the models. It takes into account true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). The formula for MCC is as follows:
$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}.$$
It ranges from −1 to +1, where +1 indicates perfect prediction, 0 indicates random prediction, and −1 indicates total disagreement between predictions and actual outcomes.
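The MCC used in the tables can be computed directly from the four counts; the helper below is a straightforward NumPy translation of the formula (for binary labels it agrees with scikit-learn's matthews_corrcoef), and it returns 0 when the denominator vanishes.

```python
import numpy as np

def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels in {-1, +1}."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == -1) & (y_pred == -1))
    fp = np.sum((y_true == -1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == -1))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return float(tp * tn - fp * fn) / denom if denom > 0 else 0.0
```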

5.2. Variation in Ratio of Unlabeled Data

The LapGPin-TSVM extends the GPin-TSVM by considering the incorporation of Laplacian techniques and transforming the model into a semi-supervised model. To test the performance of our model, we systematically changed the proportion of unlabeled data in the dataset. We considered ratios ranging from 20 % to 80 % of the total data. The accuracy results for both the linear and nonlinear cases (using the RBF kernel) are shown in Table 2, Table 3, Table 4 and Table 5, respectively. Each table corresponds to a different percentage of unlabeled data: Table 2 presents the results for 20 % of unlabeled data, Table 3 for 40 % , Table 4 for 60 % , and Table 5 for 80 % .
We present the win/tie/loss count for accuracy in the last column of each table. To provide a clearer view, Figure 3 illustrates the results for each percentage of unlabeled data. Each graph represents 26 cases, showing that our proposed model achieved more wins than the others. In some cases, with 60% and 80% unlabeled data, there were no ties or losses. The average rank of accuracy and MCC score over all cases was computed and is shown in Table 6. Ranks were assigned by ordering the values in ascending order, with the smallest value receiving rank 1; thus, a higher average rank generally indicates better performance.
It can be observed that in almost every case where the percentage of unlabeled data increased, the GPin-TSVM method exhibited lower accuracy compared to the other methods, as shown in Figure 4. The MCC scores for this method consistently pointed in the same direction, as shown in Figure 5. This is because this method is a supervised learning approach that builds a classifier using only labeled data.
From Figure 4, in the linear case, the GPin-TSVM, Lap-TSVM, Lap-PTSVM, and LapGPin-TSVM were evaluated as the percentage of unlabeled data increased from 20% to 80%. Initially, the Lap-PTSVM performed better with lower percentages of unlabeled data (20%) but dropped in accuracy as the percentage increased. The LapGPin-TSVM, on the other hand, started strong and maintained relatively stable performance, particularly as the percentage increased, indicating its robustness in handling larger proportions of unlabeled data. The Lap-TSVM also showed consistent performance, though with a slight downward trend as the quantity of unlabeled data increased.
In the nonlinear case using an RBF kernel, all methods generally performed better than in the linear case. The LapGPin-TSVM maintained the highest accuracy levels, even as the percentage of unlabeled data reached 80%, demonstrating its adaptability and superior performance in nonlinear cases.
As shown in Table 2, it is evident that our method outperformed others in both linear and nonlinear cases on WDBC, Heart, and Specf heart data. However, for Sonar, Bupa and Monk-2 data, our method exhibited lower accuracy compared to the Lap-TSVM and Lap-PTSVM. When considering the scenario where the proportion of unlabeled data was 40% of the total data in Table 3, it was observed that our method achieved the highest accuracy in Pima, Diabetes, and WDBC data. Similar to Table 4, our method continued to demonstrate the highest accuracy in both cases when considering Pima, Diabetes, Ionosphere and Monk-2 data. However, for Bupa and Specf heart data, the Lap-PTSVM achieved the highest accuracy. Additionally, in the case of Sonar data, the Lap-TSVM outperformed our approach in both cases. For Table 5, the LapGPin-TSVM outperformed most other methods, except on the Bupa, Sonar, Diabetes, Australian, and Specf heart data.
In addition to the evaluation based on accuracy, we now turn our attention to the results concerning the MCC score. In Table 2, the maximum MCC score for the nonlinear scenario in Monk-2 data was consistent across the Lap-TSVM, Lap-PTSVM, and our method, standing at 0.9464 and closely approaching 1. Furthermore, in Table 3, within the same datasets, our LapGPin-TSVM consistently outperformed in both cases, but the highest accuracy in the linear case belonged to Lap-TSVM. Notably, as indicated in Table 4, the Lap-TSVM outperformed other methods on the Sonar data. Nevertheless, our proposed method surpassed the others on the Ionosphere and Monk-2 data. These outcomes demonstrated a similar pattern to those observed in Table 5.
In conclusion, within the linear scenario, the LapGPin-TSVM showed superior performance in both accuracy and MCC, as evidenced by the values in Table 6, where it held an average rank of 3.45 for accuracy and 3.38 for MCC, while the Lap-PTSVM achieved 3.05 and 2.85, respectively. A higher average rank generally indicates better performance. Similarly, in the nonlinear case, our proposed model exhibited the best performance, as highlighted by the average rank values for accuracy and MCC of 3.28 and 3.38, respectively, as presented in Table 6. This investigation emphasizes the ability of the LapGPin-TSVM to utilize unlabeled instances, thereby leading to enhanced generalization and adaptability of decision boundaries.

Computational Efficiency of Model

Figure 6 shows that the Lap-PTSVM had the highest average computation time across all percentages of unlabeled data, making it the most computationally demanding. This is due to its use of the pinball loss function, which reduces model sparsity and increases processing time. The LapGPin-TSVM, the second most time-consuming method, uses the generalized pinball loss function to restore sparsity. Both models also require Laplacian matrix calculations, which further increase their processing time compared to GPin-TSVM.
In contrast, the GPin-TSVM and Lap-TSVM had the lowest average times. The GPin-TSVM avoids Laplacian matrix computations and benefits from the generalized pinball loss function, which restores sparsity. Despite needing Laplacian matrix calculations, the Lap-TSVM remains efficient due to its sparsity properties. As the percentage of unlabeled data increased, all models showed reduced computational time, since a larger unlabeled ratio leaves fewer labeled samples and therefore smaller QPPs to solve.

5.3. Ablation Study

An ablation study, often applied in the context of neural networks [30], involves systematically altering specific components to assess their impact on a model’s performance. In our investigation, we focused on the effects of ablation on the proposed model by modifying its Laplacian regularization term. This allowed us to evaluate the influence of the modification. After analyzing all test cases with unlabeled data, we identified the optimal configuration of our proposed LapGPin-TSVM model that achieved the highest accuracy.
To assess the impact of the Laplacian regularization term on model performance, we carried out ablation experiments on the Ionosphere dataset. The goal was to examine how the inclusion of the Laplacian regularization term affected the model’s overall accuracy. In this experiment, we tested the problems (16) and (17) of the LapGPin-TSVM without the Laplacian regularization by setting the parameter c 3 = 0 . The results in Table 7 reveal that the model’s performance declined when the Laplacian term was excluded. This decrease in accuracy likely resulted from information loss, as the model struggled to make effective use of the unlabeled data. However, when the Laplacian regularization was included, the performance improved notably, boosting accuracy from 80.91% to 82.91% on the Ionosphere dataset.

5.4. Variation in Ratio of Noise

Because the LapGPin-TSVM measures error through the generalized pinball loss, it inherits this loss function's ability to handle noise in the data. Therefore, to test the model performance regarding this feature, we conducted experiments on datasets corrupted with zero-mean Gaussian noise with variances of 0, 0.05, and 0.1. Here, we denote the noise level in the data as r. The results are presented in Table 8 for the linear case. For the nonlinear case, the results are displayed in Table 9. Similarly, we computed and presented the average rank for all cases in Table 10.
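A minimal sketch of how such noise might be injected is shown below. The text does not specify whether noise is added before or after feature scaling, and it uses r both for the noise level and its variance, so here we simply treat r as the variance of zero-mean Gaussian noise added to every feature.

```python
import numpy as np

def add_gaussian_noise(X, variance, seed=0):
    """Return a copy of X with zero-mean Gaussian noise of the given variance added."""
    rng = np.random.default_rng(seed)
    return X + rng.normal(loc=0.0, scale=np.sqrt(variance), size=X.shape)

# The three noise levels used in the experiments:
# noisy_versions = {r: add_gaussian_noise(X, r) for r in (0.0, 0.05, 0.1)}
```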
The results are categorized into linear and nonlinear cases. For the linear case presented in Table 8, our method achieved the best performance at 70% (21 out of 30 instances). On the Fertility data, the experimental results for each model were comparable. On the Pima, Sonar, and Spambase data, our proposed loss outperformed other models. Considering the case where r = 0.05 on the Sonar data, although the Lap-TSVM had the highest accuracy, the highest MCC score was achieved by the Lap-PTSVM. The results are shown in Figure 7.
In Table 9, which displays the results of the nonlinear case, our model achieved the highest accuracy in 23 out of 30 instances, equating to 76.67%. Similarly to the previous findings on the Fertility data, all models produced closely aligned results. In the Bupa data, considering r = 0 and 0.05, our model achieved the highest accuracy, but the highest MCC values belonged to the Lap-TSVM and Lap-PTSVM, respectively. Similar trends emerged on the Ionosphere and Monk-2 data when considering r = 0.1. Some results are shown in Figure 8.
Analyzing the average rank values for accuracy and MCC in Table 10, it is evident that our method attained the highest average rank. This denotes superior performance compared to the other models, with the second-ranking model being the Lap-PTSVM. This outcome is attributed to the fact that the generalized pinball loss is derived from the pinball loss, which possesses the capability to handle noise. Therefore, in addition to being a generalized version of the Lap-PTSVM, our model is also less sensitive to noise.
This comprehensively assesses robustness and performance under different levels of noise. The Laplacian regularization term in the LapGPin-TSVM improves model generalization by considering the local data structure, particularly beneficial for datasets with complex intrinsic geometry.
Furthermore, we also conducted a statistical analysis to assess the differences between the model we proposed and other models [31]. Due to the non-normal distribution of our data, we chose to employ the Wilcoxon signed-rank test [32]. This test is a non-parametric statistical test employed to compare two related samples and determine whether there is a significant difference between the paired observations in a sample, employing a significance level of 0.05 for our analysis.
In this analysis, we employed accuracy as the metric, and the results are presented in Table 11 and Table 12. Table 11 compares our model with others, incorporating data from Table 2, Table 3, Table 4 and Table 5, which present results based on the ratio of unlabeled samples. Table 12, on the other hand, extracts data from Table 8 and Table 9. As observed in both tables, the LapGPin-TSVM exhibited significant differences compared to the GPin-TSVM, Lap-TSVM, and Lap-PTSVM.
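The paired comparison can be reproduced with scipy.stats.wilcoxon; the two input vectors below are placeholders for the per-dataset accuracies of the two models being compared, and the 0.05 threshold matches the significance level stated above.

```python
from scipy.stats import wilcoxon

def compare_models(acc_proposed, acc_baseline, alpha=0.05):
    """Paired Wilcoxon signed-rank test on per-dataset accuracies of two models."""
    stat, p_value = wilcoxon(acc_proposed, acc_baseline)
    return p_value, p_value < alpha   # True -> significant difference at level alpha
```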

5.5. Experiment on an Image Dataset

We evaluated the proposed model on a binary classification task using the CIFAR-10 dataset [33], a widely used dataset for image retrieval and classification comprising 60,000 samples from ten classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck, as shown in Figure 9. Each class contains 6000 samples. For the binary classification experiments, specific class pairs were chosen: airplane vs. automobile, ship vs. truck, deer vs. horse, and dog vs. cat. In each instance, feature extraction utilized the ResNet18 architecture to enhance image representation for classification, and the percentage of unlabeled data was set at 70%.
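A minimal sketch of this feature-extraction step is shown below, using a torchvision ResNet18 backbone with its classification head removed so that each image is mapped to a 512-dimensional vector; the use of ImageNet-pretrained weights and the 224-pixel preprocessing are our assumptions, since the paper only names the architecture.

```python
import torch
import torchvision
from torchvision import transforms

# ResNet18 backbone with the final classification layer replaced by the identity.
backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize(224),                     # CIFAR-10 images are 32x32 PIL images
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(images):
    """Map a list of PIL images to an (N, 512) NumPy feature matrix."""
    batch = torch.stack([preprocess(img) for img in images])
    return backbone(batch).numpy()
```

The resulting feature vectors then play the role of the input features for the compared classifiers.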
As shown in Table 13, it is evident that the LapGPin-TSVM outperformed most other models across the various class pairs, including the Lap-PTSVM and Lap-TSVM. For instance, in the “deer vs. horse” classification, the LapGPin-TSVM achieved the highest accuracy of 92.85%, surpassing the Lap-PTSVM (92.46%), Lap-TSVM (91.76%), and the supervised GPin-TSVM (91.15%). Similarly, for the “ship vs. truck” pair, the LapGPin-TSVM led with an accuracy of 98.03%, exceeding both the Lap-PTSVM and Lap-TSVM.

6. Discussion

In our proposed method, the addition of the Laplacian term offers significant benefits but also introduces potential challenges. One of the primary concerns is the increase in computational complexity, as calculating the Laplacian matrix requires constructing a data graph based on similarity measures. Another challenge is related to data noise. While the Laplacian regularization helps enforce smoothness in the decision boundary, it may cause the model to overfit when dealing with noisy data, thus reducing generalization performance.
However, our method addresses these concerns by leveraging the robust properties of the generalized pinball loss function, which inherently handles noise more effectively. This ensures that despite the inclusion of the Laplacian term, the model can manage noisy data and still maintain high performance.

7. Conclusions

This study investigated the LapGPin-TSVM, a novel adaptation of the twin support vector machine (TSVM) enriched with the Laplacian graph-based technique, making it well suited for semi-supervised learning tasks. The LapGPin-TSVM improved model performance by integrating unlabeled data, leading to enhanced generalization, particularly in situations with limited labeled datasets. Our proposed model established a smooth decision boundary, enhancing robustness in the presence of noise, as guaranteed by theoretical foundations. The solution involved solving the quadratic programming problems to determine the classification’s hyperplanes. Additionally, this work considered an extended scenario, encompassing previous studies. A potential direction for future research is to develop techniques for semi-supervised learning that align with advancements in Laplacian matrix methodologies.

Author Contributions

Conceptualization, V.D. and R.W.; methodology, V.D. and R.W.; software, V.D.; validation, V.D.; formal analysis, R.W. and V.D.; writing—original draft preparation, V.D.; writing—review and editing, V.D. and R.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Faculty of Science, Naresuan University (NU), grant no. R2566E041.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors are thankful to the referees for their attentive reading and valuable suggestions. This research was supported by the Science Achievement Scholarship of Thailand.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Christmann, A.; Steinwart, I. Support Vector Machines; Springer: Berlin/Heidelberg, Germany, 2008. [Google Scholar]
  2. Jayadeva; Khemchandani, R.; Chandra, S. Twin Support Vector Machines for Pattern Classification. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 905–910. [Google Scholar] [CrossRef] [PubMed]
  3. Kumar, M.A.; Gopal, M. Least Squares Twin Support Vector Machines for Pattern Classification. Expert Syst. Appl. 2009, 36, 7535–7543. [Google Scholar] [CrossRef]
  4. Mei, B.; Xu, Y. Multi-task Least Squares Twin Support Vector Machine for Classification. Neurocomputing 2019, 338, 26–33. [Google Scholar] [CrossRef]
  5. Rastogi, R.; Sharma, S.; Chandra, S. Robust parametric twin support vector machine for pattern classification. Neural Process. Lett. 2018, 47, 293–323. [Google Scholar] [CrossRef]
  6. Xie, X.; Sun, F.; Qian, J.; Guo, L.; Zhang, R.; Ye, X.; Wang, Z. Laplacian Lp Norm Least Squares Twin Support Vector Machine. Pattern Recognit. 2023, 136, 109192. [Google Scholar] [CrossRef]
  7. Li, Y.; Sun, H. Safe Sample Screening for Robust Twin Support Vector Machine. Appl. Intell. 2023, 53, 20059–20075. [Google Scholar] [CrossRef]
  8. Si, Q.; Yang, Z.; Ye, J. Symmetric LINEX Loss Twin Support Vector Machine for Robust Classification and Its Fast Iterative Algorithm. Neural Netw. 2023, 168, 143–160. [Google Scholar] [CrossRef]
  9. Gupta, U.; Gupta, D. Least Squares Structural Twin Bounded Support Vector Machine on Class Scatter. Appl. Intell. 2023, 53, 15321–15351. [Google Scholar] [CrossRef]
  10. Tanveer, M.; Rajani, T.; Rastogi, R.; Shao, Y.H.; Ganaie, M.A. Comprehensive Review on Twin Support Vector Machines. Ann. Oper. Res. 2022, 1–46. [Google Scholar] [CrossRef]
  11. Rezvani, S.; Wang, X.; Pourpanah, F. Intuitionistic Fuzzy Twin Support Vector Machines. IEEE Trans. Fuzzy Syst. 2019, 27, 2140–2151. [Google Scholar] [CrossRef]
  12. Xu, Y.; Yang, Z.; Pan, X. A Novel Twin Support-Vector Machine with Pinball Loss. IEEE Trans. Neural Netw. Learn. Syst. 2016, 28, 359–370. [Google Scholar] [CrossRef] [PubMed]
  13. Tanveer, M.; Sharma, A.; Suganthan, P.N. General Twin Support Vector Machine with Pinball Loss Function. Inf. Sci. 2019, 494, 311–327. [Google Scholar] [CrossRef]
  14. Tanveer, M.; Tiwari, A.; Choudhary, R.; Jalan, S. Sparse Pinball Twin Support Vector Machines. Appl. Soft Comput. 2019, 78, 164–175. [Google Scholar] [CrossRef]
  15. Rastogi, R.; Pal, A.; Chandra, S. Generalized Pinball Loss SVMs. Neurocomputing 2018, 322, 151–165. [Google Scholar] [CrossRef]
  16. Panup, W.; Ratipapongton, W.; Wangkeeree, R. A Novel Twin Support Vector Machine with Generalized Pinball Loss Function for Pattern Classification. Symmetry 2022, 14, 289. [Google Scholar] [CrossRef]
  17. Alloghani, M.; Al-Jumeily, D.; Mustafina, J.; Hussain, A.; Aljaaf, A.J. A Systematic Review on Supervised and Unsupervised Machine Learning Algorithms for Data Science. In Supervised and Unsupervised Learning for Data Science; Springer Nature: Berlin, Germany, 2020; pp. 3–21. [Google Scholar]
  18. Reddy, Y.C.A.P.; Viswanath, P.; Reddy, B.E. Semi-supervised Learning: A Brief Review. Int. J. Eng. Technol. 2018, 7, 81. [Google Scholar] [CrossRef]
  19. Zhang, W.; Li, Z.; Li, G.; Zhuang, P.; Hou, G.; Zhang, Q.; Li, C. Gacnet: Generate adversarial-driven cross-aware network for hyperspectral wheat variety identification. IEEE Trans. Geosci. Remote Sens. 2023, 62, 5503314. [Google Scholar] [CrossRef]
  20. Zhang, W.; Zhao, W.; Li, J.; Zhuang, P.; Sun, H.; Xu, Y.; Li, C. CVANet: Cascaded visual attention network for single image super-resolution. Neural Netw. 2024, 170, 622–634. [Google Scholar] [CrossRef]
  21. Zhang, W.; Chen, G.; Zhuang, P.; Zhao, W.; Zhou, L. CATNet: Cascaded attention transformer network for marine species image classification. Expert Syst. Appl. 2024, 256, 124932. [Google Scholar] [CrossRef]
  22. Zhang, Q.; Lee, F.; Wang, Y.g.; Ding, D.; Yao, W.; Chen, L.; Chen, Q. An joint end-to-end framework for learning with noisy labels. Appl. Soft Comput. 2021, 108, 107426. [Google Scholar] [CrossRef]
  23. Zhang, Q.; Zhu, Y.; Yang, M.; Jin, G.; Zhu, Y.; Chen, Q. Cross-to-merge training with class balance strategy for learning with noisy labels. Expert Syst. Appl. 2024, 249, 123846. [Google Scholar] [CrossRef]
  24. Zhang, Q.; Jin, G.; Zhu, Y.; Wei, H.; Chen, Q. BPT-PLR: A Balanced Partitioning and Training Framework with Pseudo-Label Relaxed Contrastive Loss for Noisy Label Learning. Entropy 2024, 26, 589. [Google Scholar] [CrossRef] [PubMed]
  25. Qi, Z.; Tian, Y.; Shi, Y. Laplacian Twin Support Vector Machine for Semi-supervised Classification. Neural Netw. 2012, 35, 46–53. [Google Scholar] [CrossRef] [PubMed]
  26. Merris, R. Laplacian Graph Eigenvectors. Linear Algebra Its Appl. 1998, 278, 221–236. [Google Scholar] [CrossRef]
  27. Chen, W.J.; Shao, Y.H.; Deng, N.Y.; Feng, Z.L. Laplacian Least Squares Twin Support Vector Machine for Semi-supervised Classification. Neurocomputing 2014, 145, 465–476. [Google Scholar] [CrossRef]
  28. Damminsed, V.; Panup, W.; Wangkeeree, R. Laplacian Twin Support Vector Machine with Pinball Loss for Semi-supervised Classification. IEEE Access 2023, 11, 31399–31416. [Google Scholar] [CrossRef]
  29. Huang, X.; Shi, L.; Suykens, J.A.K. Support Vector Machine Classifier with Pinball Loss. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 36, 984–997. [Google Scholar] [CrossRef]
  30. Meyes, R.; Lu, M.; de Puiseau, C.W.; Meisen, T. Ablation studies in artificial neural networks. arXiv 2019, arXiv:1901.08644. [Google Scholar]
  31. García, S.; Fernández, A.; Luengo, J.; Herrera, F. Advanced Nonparametric Tests for Multiple Comparisons in the Design of Experiments in Computational Intelligence and Data Mining: Experimental Analysis of Power. Inf. Sci. 2010, 180, 2044–2064. [Google Scholar] [CrossRef]
  32. Demšar, J. Statistical Comparisons of Classifiers over Multiple Data Sets. J. Mach. Learn. Res. 2006, 7, 1–30. [Google Scholar]
  33. Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images; Computer Science University of Toronto: Toronto, ON, Canada, 2009. [Google Scholar]
Figure 1. A conceptual diagram illustrating the development of LapGPin-TSVM by integrating a Laplacian graph-based technique into the GPin-TSVM framework, evolving from both supervised and semi-supervised methods.
Figure 2. Visual representations of classification results on a 2D artificial dataset using the GPin-TSVM (supervised framework) and our proposed LapGPin-TSVM (semi-supervised framework), highlighting the impact of unlabeled data on the classifier. The solid circles represent labeled data, with red indicating the positive class and green indicating the negative class. The plus symbols, both red and green, represent unlabeled data.
Figure 3. The count of wins, ties, and losses for accuracy at each percentage of unlabeled data is shown. If there is no color, it means the value is 0.
Figure 4. Comparison of average accuracy in linear and nonlinear (RBF) cases across different percentages of unlabeled data.
Figure 5. Comparison of average MCC in linear and nonlinear (RBF) cases across different percentages of unlabeled data.
Figure 6. The graph shows the average computational time for the GPin-TSVM, Lap-TSVM, Lap-PTSVM, and LapGPin-TSVM across different unlabeled data percentages.
Figure 7. Accuracy on the Diabetes, Monk-2, and Bupa datasets under different noise ratios in the linear case.
Figure 8. Accuracy on the Diabetes, Monk-2, and Ionosphere datasets under different noise ratios in the nonlinear case.
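The noise ratios r in Figures 7 and 8 (and in Tables 8 and 9) denote the fraction of training labels that are flipped. A minimal sketch of how such label noise might be injected, assuming binary labels in {-1, +1}; the function name and data are illustrative and not taken from the authors' code.

    import numpy as np

    def add_label_noise(y, r, seed=0):
        """Flip the sign of a randomly chosen fraction r of binary labels."""
        rng = np.random.default_rng(seed)
        y_noisy = y.copy()
        n_flip = int(r * len(y))
        idx = rng.choice(len(y), size=n_flip, replace=False)
        y_noisy[idx] = -y_noisy[idx]
        return y_noisy

    y_train = np.array([1, -1, 1, 1, -1, -1, 1, -1])
    print(add_label_noise(y_train, r=0.25))   # two of the eight labels are flipped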
Figure 9. An illustration of the CIFAR-10 dataset.
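The binary image tasks in Table 13 (e.g., Airplane vs. Automobile) are formed by selecting two CIFAR-10 classes. A minimal sketch of building such a pair, assuming the Keras copy of CIFAR-10 and the standard class ordering (0 = airplane, 1 = automobile); the preprocessing shown here is illustrative only.

    import numpy as np
    from tensorflow.keras.datasets import cifar10

    (x_train, y_train), _ = cifar10.load_data()
    y_train = y_train.ravel()

    mask = np.isin(y_train, [0, 1])                           # keep only airplane and automobile images
    x_pair = x_train[mask].reshape(mask.sum(), -1) / 255.0    # flatten 32x32x3 images and scale pixels
    y_pair = np.where(y_train[mask] == 0, 1, -1)              # airplane -> +1, automobile -> -1
    print(x_pair.shape, y_pair.shape)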
Table 1. The detailed description of the 13 benchmark datasets.
Datasets | No. of Samples | No. of Features
Ionosphere | 351 | 33
Bupa | 345 | 6
Fertility | 100 | 10
Pima | 768 | 8
Banknote | 1372 | 4
Monk-2 | 432 | 7
Sonar | 208 | 60
Diabetes | 769 | 9
Spambase | 4601 | 57
WDBC | 569 | 30
Australian | 690 | 14
Heart | 303 | 13
Spect heart | 267 | 22
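In the semi-supervised experiments reported below, a fixed percentage of each training set is treated as unlabeled. A minimal sketch of such a split, assuming a random selection of the unlabeled portion; the function and variable names are illustrative and do not reproduce the paper's exact protocol.

    import numpy as np

    def split_labeled_unlabeled(X, y, unlabeled_ratio, seed=0):
        """Randomly mark a fraction of the training points as unlabeled."""
        rng = np.random.default_rng(seed)
        n = len(y)
        unlabeled_idx = rng.choice(n, size=int(unlabeled_ratio * n), replace=False)
        labeled_mask = np.ones(n, dtype=bool)
        labeled_mask[unlabeled_idx] = False
        return X[labeled_mask], y[labeled_mask], X[~labeled_mask]

    X = np.random.randn(100, 8)
    y = np.sign(np.random.randn(100))
    X_lab, y_lab, X_unlab = split_labeled_unlabeled(X, y, unlabeled_ratio=0.4)
    print(X_lab.shape, X_unlab.shape)   # (60, 8) (40, 8)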
Table 2. The average values of accuracy, MCC score, and time for experimenting with data containing 20% of unlabeled data on the UCI dataset.
Datasets | Kernel | Metric | GPin-TSVM | Lap-TSVM | Lap-PTSVM | LapGPin-TSVM
Each dataset and kernel block lists three rows: Acc (%) (mean ± std), MCC, and Time (s); within each row, the four values correspond, from left to right, to the four models above.
FertilityLinear88.00 ± 5.1088.00 ± 5.1088.00 ± 5.1088.00 ± 5.10
0.05000.0000.0000.000
0.03520.03610.03970.0404
RBF88.00 ± 4.0088.00 ± 4.0088.00 ± 4.0088.00 ± 4.00
0.00000.0000.0000.000
0.04690.04210.05210.0570
BanknoteLinear93.88 ± 1.9798.98 ± 0.6397.52 ± 1.4398.03 ± 0.94
0.88040.97930.95160.9607
2.14920.69190.95160.9607
RBF100.00 ± 0.0099.78 ± 0.1899.85 ± 0.29100.00 ± 0.00
1.0000.99560.99711.000
2.21770.62252.66252.1435
BupaLinear60.00 ± 3.8568.41 ± 4.8067.53 ± 3.5067.25 ± 3.73
0.88040.97930.95160.9607
2.14920.69190.95160.9607
RBF68.69 ± 7.2566.67 ± 4.8067.53 ± 3.5067.25 ± 3.73
0.88040.97930.95160.9607
2.14920.69190.95160.9607
IonosphereLinear81.48 ± 3.7588.03 ± 2.5187.74 ± 2.3590.88 ± 2.66
0.62070.74190.73650.8040
0.36620.17510.14990.1080
RBF89.19 ± 3.9492.89 ± 4.2191.18 ± 3.1191.18 ± 2.74
0.77090.84430.80650.8068
0.15060.07320.11580.1205
Monk-2Linear79.17 ± 3.7483.81 ± 1.9886.35 ± 2.0480.55 ± 2.25
0.59140.67690.72780.6264
0.12530.09560.16700.1221
RBF96.29 ± 3.7197.22 ± 3.3397.22 ± 3.3397.22 ± 3.33
0.92720.94640.94640.9464
0.17260.09860.18090.1586
PimaLinear68.36 ± 2.7675.39 ± 1.4175.39 ± 1.4176.95 ± 1.06
0.23430.42510.42510.4687
0.54480.14110.65380.4987
RBF72.67 ± 4.8275.79 ± 3.0679.97 ± 4.1576.05 ± 4.01
0.36630.44240.47160.4499
0.53410.18740.48130.5510
SonarLinear75.45 ± 9.5877.86 ± 6.0778.36 ± 6.2577.86 ± 6.54
0.53800.55850.57230.5605
0.06320.06880.07100.0704
RBF75.45 ± 3.0877.92 ± 6.1779.83 ± 2.2877.91 ± 4.79
0.51850.56050.59550.5642
0.05520.05460.07790.0661
DiabetesLinear75.39 ± 1.6274.48 ± 2.1977.73 ± 1.9576.82 ± 2.11
0.43500.40930.49280.4722
0.47280.14410.63760.5980
RBF75.77 ± 2.0376.43 ± 1.4776.82 ± 1.6376.95 ± 1.13
0.44330.46110.47150.4744
0.52730.20470.68800.5473
SpambaseLinear84.08 ± 1.0190.45 ± 0.9191.32 ± 1.1191.12 ± 0.95
0.71060.80000.81770.8135
50.070316.410061.556742.6016
RBF88.41 ± 0.9790.30 ± 1.0991.28 ± 0.5791.41 ± 0.51
0.75690.79710.81860.8194
41.556417.356079.490070.7805
WDBClinear88.93 ± 1.1792.96 ± 1.8694.90 ± 1.5295.78 ± 1.41
0.77290.85940.89270.9106
0.27420.10370.30750.2995
RBF93.15 ± 1.0295.26 ± 2.5795.26 ± 2.6395.43 ± 2.79
0.85970.89920.89910.9051
0.25950.12160.32490.3304
Australianlinear64.35 ± 3.7985.65 ± 4.2185.94 ± 4.3686.09 ± 3.95
0.33190.71920.72720.7265
0.33620.11950.35870.3109
RBF67.68 ± 2.0876.81 ± 3.6784.20 ± 4.0076.23 ± 1.96
0.36760.52980.68650.5216
0.35530.14100.42980.5042
Heartlinear75.56 ± 5.9081.85 ± 4.5984.07 ± 3.0184.81 ± 2.16
0.51690.63010.67560.6905
0.06910.06300.09190.0879
RBF79.62 ± 1.1783.70 ± 2.4681.48 ± 3.7081.48 ± 4.05
0.59230.66640.63170.6279
0.10390.07350.10300.0836
Spect heartlinear77.16 ± 3.1677.53 ± 3.9079.03 ± 3.7879.41 ± 3.91
0.14560.02120.01070.000
0.06910.06300.09190.0879
RBF76.42 ± 4.4577.19 ± 5.0481.62 ± 3.9282.01 ± 4.84
0.09810.05470.21360.2701
0.08170.06510.10780.1036
Win/tie/loss | 22/3/1 | 15/4/7 | 13/4/9
Table 3. The average values of accuracy, MCC score, and time for experimenting with data containing 40% of unlabeled data on the UCI dataset.
Datasets | Kernel | Metric | GPin-TSVM | Lap-TSVM | Lap-PTSVM | LapGPin-TSVM
Each dataset and kernel block lists three rows: Acc (%) (mean ± std), MCC, and Time (s); within each row, the four values correspond, from left to right, to the four models above.
FertilityLinear79.00 ± 11.5885.00 ± 7.7588.00 ± 5.1088.00 ± 5.10
0.0111−0.04390.0000.000
0.03390.03030.03650.0441
RBF88.00 ± 4.0088.00 ± 4.0088.00 ± 4.0088.00 ± 4.00
0.00000.0000.0000.000
0.03760.05210.04720.0427
BanknoteLinear94.31 ± 1.7498.76 ± 0.6897.45 ± 1.3698.17 ± 0.73
0.88800.97490.95020.9635
1.11780.36271.10451.1202
RBF99.78 ± 0.4399.78 ± 0.2999.56 ± 0.4299.85 ± 0.42
0.99560.99560.99120.9971
1.06690.40621.40741.1707
BupaLinear62.60 ± 5.5365.21 ± 5.5065.22 ± 6.2864.93 ± 6.18
0.20420.27180.27010.2636
0.05870.05720.07120.0680
RBF64.93 ± 6.8265.70 ± 6.1267.25 ± 5.9166.96 ± 6.69
0.28090.29900.34070.3362
0.06940.05200.08760.0795
IonosphereLinear82.04 ± 2.8286.60 ± 3.3688.03 ± 2.1789.17 ± 4.30
0.63050.70670.74670.7660
0.13100.06820.10600.0849
RBF88.05 ± 4.6090.04 ± 3.4790.62 ± 3.3788.89 ± 3.97
0.74020.77980.79040.7554
0.08380.05670.08950.0954
Monk-2Linear80.57 ± 3.8584.73 ± 1.3184.27 ± 2.7484.24 ± 4.66
0.61980.69280.68800.6954
0.09990.05870.12800.1051
RBF96.75 ± 2.8897.21 ± 3.3497.22 ± 3.3497.22 ± 3.34
0.93310.94610.94640.9464
0.09350.08480.12130.1084
PimaLinear68.23 ± 2.7275.65 ± 1.1675.91 ± 1.0976.95 ± 1.24
0.22720.43030.43750.4695
0.25820.11180.32200.2599
RBF72.41 ± 4.1475.66 ± 3.2475.66 ± 4.1375.92 ± 3.77
0.35050.43280.43440.4429
0.27190.13020.26590.2874
SonarLinear72.62 ± 8.5776.96 ± 10.07979.35 ± 7.8677.91 ± 8.48
0.47290.54310.60570.5743
0.05420.04340.05770.0595
RBF73.53 ± 7.6975.53 ± 4.4073.10 ± 5.5375.47 ± 4.67
0.46800.51920.46350.5154
0.04970.05430.06770.0586
DiabetesLinear73.57 ± 1.9273.83 ± 2.4176.69 ± 1.3777.08 ± 1.76
0.38830.39660.47360.4801
0.23610.10300.33580.3023
RBF74.99 ± 2.2475.78 ± 1.9876.30 ± 1.6977.60 ± 2.21
0.42180.44450.45990.4877
0.24820.14790.35970.2975
SpambaseLinear84.10 ± 1.3688.82 ± 0.3690.86 ± 1.1790.78 ± 1.05
0.71130.76560.80810.8062
23.15527.795526.844420.8474
RBF88.21 ± 0.8489.97 ± 0.9591.12 ± 0.7791.15 ± 0.69
0.75340.79010.81350.8138
20.14478.574238.310838.6832
WDBCLinear88.40 ± 2.0293.84 ± 2.6994.72 ± 1.4895.78 ± 1.03
0.76160.87340.88900.9099
0.14660.08450.17300.1735
RBF93.32 ± 0.8894.38 ± 3.5394.38 ± 2.6395.08 ± 2.25
0.86270.87960.87940.8966
0.13780.10440.17810.0065
Australianlinear64.49 ± 3.5885.36 ± 4.4585.80 ± 3.5786.09 ± 3.38
0.34350.70980.72110.7217
0.19230.09120.19310.2084
RBF69.57 ± 4.6974.92 ± 3.2681.88 ± 4.7475.51 ± 1.41
0.37910.49200.64260.5080
0.18210.10890.25680.2597
Heartlinear74.81 ± 8.6580.74 ± 5.9383.33 ± 5.1083.33 ± 4.96
0.50500.60710.65970.6610
0.05200.04930.08180.0719
RBF77.41 ± 3.5978.89 ± 3.4380.37 ± 3.4381.85 ± 2.16
0.56670.56890.60720.6312
0.07770.06870.08860.0817
Spect heartlinear78.27 ± 2.2680.16 ± 2.3879.02 ± 4.4979.03 ± 3.22
0.22660.16700.03900.0231
0.06040.05210.07560.0707
RBF76.43 ± 6.7975.66 ± 4.8780.51 ± 4.2780.89 ± 5.62
0.1111−0.03990.26590.2863
0.06380.05370.07340.0787
Win/tie/loss | 25/1/0 | 19/1/6 | 15/3/8
Table 4. The average values of accuracy, MCC score, and time for experimenting with data containing 60% of unlabeled data on the UCI dataset.
Datasets | Kernel | Metric | GPin-TSVM | Lap-TSVM | Lap-PTSVM | LapGPin-TSVM
Each dataset and kernel block lists three rows: Acc (%) (mean ± std), MCC, and Time (s); within each row, the four values correspond, from left to right, to the four models above.
FertilityLinear75.00 ± 10.9585.00 ± 8.9486.00 ± 4.9088.00 ± 5.10
0.07090.0882−0.03060.000
0.02280.03550.03720.0347
RBF88.00 ± 4.0088.00 ± 4.0089.00 ± 3.7488.00 ± 4.00
0.00000.0000.10920.000
0.02770.03820.03990.0467
BanknoteLinear94.31 ± 2.4598.47 ± 0.7897.45 ± 1.5197.96 ± 0.94
0.8880.96900.95020.9591
0.36440.20670.42490.4271
RBF99.19 ± 0.7499.13 ± 0.9199.71 ± 0.2799.85 ± 0.18
0.98380.98230.99410.9971
0.35210.26440.58510.4778
BupaLinear60.28 ± 5.5365.21 ± 5.5065.22 ± 6.2864.93 ± 6.18
0.20420.27180.27010.2636
0.05870.05720.07120.0680
RBF64.64 ± 3.8566.96 ± 2.9567.83 ± 7.5268.12 ± 6.54
0.25660.31620.34050.3512
0.06570.05060.06080.0658
IonosphereLinear79.77 ± 3.5786.61 ± 1.4787.45 ± 3.5787.75 ± 5.00
0.59710.70980.73230.7385
0.23760.05380.07220.0612
RBF86.05 ± 5.2586.35 ± 6.1286.62 ± 2.2787.20 ± 5.83
0.70100.69650.70540.7202
0.06530.05600.06270.0599
Monk-2Linear78.71 ± 3.0278.25 ± 3.1078.95 ± 2.1284.04 ± 3.13
0.59040.56010.57520.6840
0.05480.05040.07840.0755
RBF95.60 ± 2.6997.22 ± 3.3396.75 ± 3.2497.22 ± 3.33
0.91020.94640.93730.9464
0.06170.06570.08780.0799
PimaLinear69.27 ± 2.3574.99 ± 1.7375.00 ± 1.7377.08 ± 1.03
0.26930.41880.41880.4703
0.10950.07410.15520.1421
RBF73.32 ± 5.4675.14 ± 3.7875.66 ± 3.5475.92 ± 4.19
0.38030.42320.43360.4423
0.12240.12090.15070.1578
SonarLinear66.91 ± 12.2274.56 ± 5.9971.72 ± 8.7872.68 ± 9.93
0.33980.50160.45670.4740
0.03990.05430.04810.0452
RBF66.36 ± 3.5471.65 ± 2.6168.75 ± 4.7869.24 ± 8.48
0.33530.43340.38790.3932
0.03500.04020.04690.0461
DiabetesLinear73.96 ± 2.1873.83 ± 2.1876.17 ± 1.8976.56 ± 2.28
0.39830.39380.45720.4685
0.11180.08000.17730.1562
RBF76.30 ± 1.9976.69 ± 1.2576.95 ± 2.5077.08 ± 1.79
0.45560.46950.47700.4823
0.12930.10410.16700.1552
SpambaseLinear84.60 ± 1.1588.77 ± 1.4790.82 ± 1.1390.88 ± 1.01
0.71770.76450.80690.8084
7.81853.33419.72547.8930
RBF87.32 ± 0.9089.45 ± 1.2891.28 ± 0.7191.02 ± 0.93
0.73470.77920.81680.8113
7.21143.449212.882311.6014
WDBCLinear85.94 ± 2.0294.19 ± 0.8994.73 ± 2.1595.43 ± 1.69
0.71100.87830.88920.9023
0.10140.06320.12760.1188
RBF93.49 ± 1.0594.38 ± 2.5194.20 ± 2.4594.03 ± 2.30
0.86660.88060.87610.8750
0.07350.08360.13220.1245
Australianlinear65.07 ± 4.4884.05 ± 4.8585.22 ± 4.2485.94 ± 4.04
0.35130.68410.71100.7216
0.10130.07420.11690.1065
RBF69.27 ± 4.6874.78 ± 2.5678.12 ± 4.6273.48 ± 2.85
0.37540.48880.56570.4656
0.10710.08600.13880.1552
Heartlinear72.96 ± 5.3281.11 ± 5.9082.96 ± 4.2882.22 ± 5.19
0.46080.61660.65340.6396
0.05720.05560.06380.0525
RBF76.67 ± 2.5180.00 ± 1.3980.00 ± 0.7480.00 ± 3.59
0.56800.59520.60020.6034
0.05870.05400.06850.0645
Spect heartlinear74.89 ± 3.9277.54 ± 2.2378.65 ± 4.0278.65 ± 3.47
0.24150.13170.09520.0139
0.03940.03780.06850.0584
RBF75.65 ± 3.5277.51 ± 4.6879.43 ± 3.9477.18 ± 4.46
0.05780.15350.24000.2137
0.06480.04390.05520.0544
Win/tie/loss | 25/1/0 | 17/2/7 | 18/0/8
Table 5. The average values of accuracy, MCC score, and time for experimenting with data containing 80% of unlabeled data on the UCI dataset.
Datasets | Kernel | Metric | GPin-TSVM | Lap-TSVM | Lap-PTSVM | LapGPin-TSVM
Each dataset and kernel block lists three rows: Acc (%) (mean ± std), MCC, and Time (s); within each row, the four values correspond, from left to right, to the four models above.
FertilityLinear50.00 ± 15.1778.00 ± 12.0885.00 ± 8.3789.00 ± 4.90
0.01400.09260.04530.1092
0.02560.09260.03660.0274
RBF88.00 ± 4.0087.00 ± 4.0086.00 ± 5.8388.00 ± 4.00
0.0000−0.01530.06770.000
0.02680.03460.03640.0320
BanknoteLinear93.88 ± 2.3697.45 ± 0.897.38 ± 1.1698.54 ± 0.73
0.88030.94850.94870.9704
0.20190.14190.16050.1552
RBF97.38 ± 1.9598.32 ± 1.1898.69 ± 1.5698.84 ± 1.47
0.94730.96610.97390.9768
0.10540.18650.23450.2183
BupaLinear63.77 ± 3.8963.76 ± 6.0168.12 ± 4.1067.25 ± 2.98
0.23060.24740.33150.3129
0.02740.04030.04630.0411
RBF62.61 ± 4.7962.90 ± 4.1665.80 ± 4.2066.09 ± 4.81
0.22280.27180.32210.3300
0.02700.05170.04950.0514
IonosphereLinear78.89 ± 4.2583.18 ± 1.9582.89 ± 4.3884.32 ± 5.12
0.58960.63600.62830.6576
0.03340.05060.05000.0377
RBF84.60 ± 2.9586.34 ± 3.5884.62 ± 2.9188.05 ± 3.61
0.65840.70100.66110.7438
0.03780.04900.05120.0492
Monk-2Linear73.40 ± 3.7969.93 ± 6.9277.57 ± 5.4579.17 ± 1.53
0.49620.39030.55160.5878
0.03770.03850.05600.0480
RBF92.60 ± 2.1494.90 ± 3.3494.90 ± 3.3497.21 ± 3.86
0.85420.89860.89930.9470
0.03620.05020.05410.0591
PimaLinear68.36 ± 2.0074.74 ± 3.0975.13 ± 3.1976.43 ± 1.27
0.22870.41410.42510.4556
0.05340.05700.07910.0822
RBF73.45 ± 4.5775.27 ± 3.0475.14 ± 3.0876.18 ± 3.03
0.38140.42270.42370.4487
0.05530.08400.09700.0982
SonarLinear65.38 ± 6.9168.79 ± 4.8067.31 ± 5.4167.31 ± 5.58
0.30850.38040.36130.3613
0.03130.04630.04010.0396
RBF61.56 ± 4.8366.89 ± 6.0161.61 ± 7.6168.39 ± 10.48
0.22210.33690.24590.3813
0.03010.05220.04180.0399
DiabetesLinear73.18 ± 3.0374.74 ± 1.8877.47 ± 2.6377.21 ± 2.76
0.37660.41230.48290.4801
0.05400.08170.08950.0881
RBF74.22 ± 1.7475.40 ± 3.4174.48 ± 2.6175.66 ± 3.61
0.40230.44620.43270.4449
0.06060.08430.10970.0968
SpambaseLinear85.25 ± 1.7488.54 ± 1.5891.16 ± 1.3591.28 ± 1.21
0.72330.75980.81420.8165
1.14791.29682.33942.1206
RBF86.12 ± 1.7088.17 ± 1.4690.38 ± 1.2390.58 ± 1.07
0.71040.75100.79800.8020
1.32861.34742.71662.6171
WDBCLinear81.55 ± 3.4483.13 ± 7.1093.32 ± 1.0794.02 ± 1.71
0.62390.63790.85740.8727
0.04460.05410.06960.0700
RBF91.73 ± 3.2792.80 ± 2.6791.57 ± 2.1092.97 ± 2.47
0.83730.84950.82210.8501
0.05950.06360.08500.0752
Australianlinear63.91 ± 1.4184.20 ± 3.3485.22 ± 4.0485.80 ± 3.54
0.32660.68610.70560.7132
0.05160.05560.08410.0782
RBF64.05 ± 3.0271.45 ± 2.6568.41 ± 3.8570.00 ± 3.32
0.28720.42490.36080.3944
0.05140.07010.09260.0932
Heartlinear62.22 ± 10.7075.19 ± 7.6473.33 ± 8.9676.30 ± 6.13
0.25520.50330.46360.5198
0.03410.04320.04580.0492
RBF70.37 ± 5.6178.15 ± 1.3880.00 ± 2.1580.74 ± 2.77
0.47470.55700.59810.6141
0.04580.05530.05690.0553
Spect heartlinear62.94 ± 7.1665.51 ± 5.1677.15 ± 4.3679.04 ± 4.45
0.16690.11300.05250.0792
0.03730.04120.05030.0518
RBF64.88 ± 10.9358.43 ± 6.8570.04 ± 2.5969.66 ± 3.21
0.10660.11360.21320.2100
0.02900.04110.04550.0384
Win/tie/loss | 25/1/0 | 24/0/2 | 23/0/3
Table 6. Average rank of all methods when considering different ratios of unlabeled data.
Model | Mean Rank of Accuracy (Linear) | Mean Rank of Accuracy (Nonlinear) | Mean Rank of MCC (Linear) | Mean Rank of MCC (Nonlinear)
GPin-TSVM | 1.44 | 1.37 | 1.31 | 1.30
Lap-TSVM | 2.38 | 2.43 | 2.44 | 2.42
Lap-PTSVM | 3.05 | 2.85 | 2.84 | 2.89
LapGPin-TSVM | 3.45 | 3.38 | 3.28 | 3.38
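The mean ranks in Tables 6 and 10 follow the usual Friedman-style ranking: on each dataset, the four methods are ranked by accuracy (or MCC), and the ranks are then averaged over datasets. A small sketch with SciPy, assuming the convention that a larger rank indicates better performance (which appears consistent with the values above); the accuracy matrix is a placeholder.

    import numpy as np
    from scipy.stats import rankdata

    # Each row holds the accuracies of the four models on one dataset (placeholder values)
    acc = np.array([
        [88.0, 88.0, 88.0, 88.0],
        [93.9, 99.0, 97.5, 98.0],
        [60.0, 68.4, 67.5, 67.3],
    ])

    # rankdata ranks in ascending order, so the most accurate model receives the largest rank;
    # tied accuracies share the average rank.
    ranks = np.vstack([rankdata(row) for row in acc])
    print(ranks.mean(axis=0))   # mean rank of each model across the datasets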
Table 7. Effects of the Laplacian term.
Model | 30% Unlabeled | 50% Unlabeled | 70% Unlabeled
LapGPin-TSVM | 89.74 ± 2.11 | 88.04 ± 2.91 | 82.91 ± 1.94
LapGPin-TSVM without Laplacian term | 89.74 ± 2.12 | 86.62 ± 4.25 | 80.91 ± 2.93
Table 8. The mean and standard deviation of test accuracy with various amounts of noise on the UCI dataset calculated using a linear kernel.
Datasets | r | Metric | GPin-TSVM | Lap-TSVM | Lap-PTSVM | LapGPin-TSVM
Each dataset and noise-ratio (r) block lists three rows: Acc (%) (mean ± std), MCC, and Time (s); within each row, the four values correspond, from left to right, to the four models above.
Fertility088.00 ± 2.4587.00 ± 2.4588.00 ± 2.4588.00 ± 2.45
0.0000−0.01530.09390.000
0.02420.02970.03610.0292
0.0588.00 ± 2.4588.00 ± 2.4588.00 ± 2.4588.00 ± 2.45
0.00000.00000.0000.000
0.02330.03250.02200.0352
0.188.00 ± 2.4588.00 ± 2.4588.00 ± 2.4588.00 ± 2.45
0.00000.00000.06580.000
0.02060.02800.03110.0303
Banknote094.39 ± 1.9798.24 ± 1.0998.25 ± 0.7098.40 ± 1.04
0.89160.96540.96500.9675
0.11210.10760.15050.1667
0.0594.32 ± 1.9598.24 ± 1.9098.47 ± 0.7098.54 ± 1.03
0.89030.96540.96940.9706
0.10270.10730.16120.1706
0.194.46 ± 2.2498.25 ± 1.0998.32 ± 0.6798.54 ± 0.46
0.89330.96540.96640.9705
0.1110.12300.15970.1648
Bupa063.48 ± 3.2366.09 ± 4.3667.25 ± 4.8267.54 ± 4.45
0.21760.27740.30140.3193
0.03460.03650.05090.0433
0.0564.35 ± 2.6866.09 ± 3.5067.54 ± 4.7368.41 ± 4.88
0.23870.27960.30800.3409
0.02600.03720.05050.0439
0.164.05 ± 2.6566.09 ± 3.9566.67 ± 4.9368.12 ± 4.39
0.23220.27790.28960.3341
0.05310.04060.05160.0424
Ionosphere084.03 ± 8.0786.03 ± 2.3485.74 ± 5.2983.75 ± 2.82
0.64990.70390.69010.6510
0.03520.04020.05460.0515
0.0581.18 ± 4.7784.33 ± 0.9184.33 ± 1.6085.76 ± 1.23
0.58650.66030.66080.6962
0.03810.04940.05100.0540
0.179.49 ± 4.9984.33 ± 1.8084.33 ± 1.6085.75 ± 1.59
0.55150.65420.66120.6909
0.03830.04040.05200.0437
Monk-2079.19 ± 6.1578.95 ± 4.3878.50 ± 6.4580.11 ± 5.23
0.58450.57710.57120.5987
0.04130.05070.08280.0962
0.0578.96 ± 5.3979.18 ± 4.7179.43 ± 6.3680.34 ± 4.93
0.57880.57980.58940.6044
0.04430.04730.08800.0663
0.178.26 ± 6.2278.72 ± 4.5578.96 ± 6.4179.41 ± 5.01
0.56130.57020.58120.5860
0.03470.05030.05660.0705
Pima075.78 ± 1.2977.60 ± 1.4877.73 ± 0.7777.60 ± 1.58
0.43850.48210.48670.4843
0.06990.07960.09700.0863
0.0576.30 ± 1.4877.99 ± 1.7878.38 ± 1.2777.60 ± 1.96
0.45350.49230.50260.4831
0.06240.06710.09080.0920
0.176.18 ± 1.9276.82 ± 1.4677.73 ± 1.2878.00 ± 1.11
0.45010.46440.48440.4929
0.06090.06790.10220.0910
Sonar066.40 ± 5.5670.66 ± 2.9274.02 ± 7.2973.05 ± 4.79
0.33870.42210.48110.4678
0.03400.04610.04290.0419
0.0564.38 ± 7.8571.14 ± 4.6171.13 ± 5.6669.71 ± 3.27
0.30850.42470.42580.4043
0.02670.04240.03850.0404
0.162.49 ± 6.5968.75 ± 3.0368.77 ± 2.3670.21 ± 4.79
0.26530.37410.38110.4041
0.02780.04210.04300.0332
Diabetes075.27 ± 4.2976.30 ± 4.6677.22 ± 2.3477.48 ± 2.96
0.42400.45200.47700.4849
0.09020.06930.09710.0985
0.0576.05 ± 4.0376.56 ± 4.6077.48 ± 3.0778.13 ± 2.87
0.44440.45840.48170.4980
0.06810.05700.10130.1179
0.176.31 ± 4.5376.43 ± 4.0676.83 ± 3.4677.48 ± 2.88
0.45340.45280.46790.4832
0.1110.12300.15970.1648
Spambase090.36 ± 1.5591.58 ± 1.0790.67 ± 1.0891.39 ± 1.32
0.80190.82310.80390.8188
2.28551.49912.63453.1632
0.0589.19 ± 1.9391.34 ± 0.9590.82 ± 1.1090.47 ± 1.15
0.77940.81810.80730.7999
2.30511.47902.6322.9123
0.188.97 ± 1.5190.30 ± 0.7090.47 ± 0.5289.75 ± 0.86
0.77290.79610.80000.7848
2.20581.47142.72012.8718
WDBC085.77 ± 3.4395.78 ± 1.2995.95 ± 1.2194.37 ± 1.82
0.70860.90920.91350.8799
0.05860.06840.07890.0738
0.0583.13 ± 1.4893.15 ± 1.8794.02 ± 2.5894.03 ± 1.15
0.64750.85250.87160.8729
0.05390.06980.08180.0731
0.184.01 ± 3.8693.50 ± 2.0494.37 ± 2.1394.38 ± 2.12
0.66860.86150.88040.8787
0.06090.06790.09840.1074
Win/tie/loss | 26/3/1 | 21/2/7 | 19/3/8
Table 9. The mean and standard deviation of test accuracy with various amounts of noise on the UCI dataset calculated using an RBF kernel.
Datasets | r | Metric | GPin-TSVM | Lap-TSVM | Lap-PTSVM | LapGPin-TSVM
Each dataset and noise-ratio (r) block lists three rows: Acc (%) (mean ± std), MCC, and Time (s); within each row, the four values correspond, from left to right, to the four models above.
Fertility088.00 ± 2.4587.00 ± 2.4587.00 ± 2.4588.00 ± 2.45
0.0000−0.01530.15930.000
0.02430.03310.03270.0334
0.0588.00 ± 2.4588.00 ± 2.4588.00 ± 2.4588.00 ± 2.45
0.00000.0000−0.03060.1542
0.02390.03230.03240.0421
0.188.00 ± 2.4588.00 ± 2.4587.00 ± 2.4588.00 ± 2.45
0.00000.00000.05010.000
0.02320.03320.03320.0340
Banknote098.90 ± 0.8999.48 ± 0.7299.34 ± 0.58100.00 ± 0.00
0.97800.98980.98681.000
0.13150.19070.28530.2806
0.0598.90 ± 0.8999.34 ± 0.9999.42 ± 0.55100.00 ± 0.00
0.97800.98700.98821.000
0.14470.21740.28510.2734
0.198.76 ± 1.1499.34 ± 0.8199.49 ± 0.50100.00 ± 0.00
0.97520.98690.98971.000
0.13880.20040.27350.2652
Bupa068.11 ± 3.7770.72 ± 3.5970.14 ± 5.3170.72 ± 6.37
0.32490.39170.38230.3749
0.03570.03870.05890.0580
0.0568.41 ± 4.1470.14 ± 4.0570.43 ± 4.9871.01 ± 5.86
0.33130.37870.38610.3848
0.04040.04160.05710.0612
0.168.41 ± 4.6170.43 ± 3.1270.72 ± 5.0570.43 ± 4.72
0.33120.38570.39290.3773
0.03580.04640.05600.0624
Ionosphere085.18 ± 4.9387.47 ± 2.0786.03 ± 3.5887.46 ± 2.63
0.67810.73320.69900.7245
0.04120.05730.05250.0457
0.0584.03 ± 6.5086.32 ± 1.9687.75 ± 1.1388.88 ± 3.45
0.65050.70770.74130.7613
0.04190.05020.05350.0481
0.182.32 ± 3.9385.46 ± 4.4987.45 ± 4.8487.46 ± 1.69
0.61740.68560.73050.7266
0.03820.05210.04780.0494
Monk-2094.20 ± 4.5396.05 ± 3.9396.29 ± 3.4196.99 ± 3.34
0.89190.92160.92810.9420
0.04040.05790.07480.0747
0.0593.52 ± 4.4295.59 ± 3.4896.51 ± 3.6096.52 ± 3.21
0.87990.91380.93230.9327
0.05500.05840.07000.0726
0.191.44 ± 3.3495.36 ± 2.4595.36 ± 2.6694.21 ± 3.75
0.84090.90730.90720.8912
0.03850.06760.07150.0718
Pima076.04 ± 2.9976.17 ± 2.6577.21 ± 1.7176.56 ± 2.60
0.45200.45450.47720.4624
0.07140.12290.14620.1324
0.0576.04 ± 2.8976.69 ± 2.3377.21 ± 1.7176.56 ± 2.83
0.45260.46650.47760.4632
0.07790.11870.14500.1418
0.176.04 ± 2.7176.30 ± 2.7176.43 ± 1.5276.69 ± 2.83
0.44980.47560.45720.4662
0.06390.11370.13700.1359
Sonar068.32 ± 11.0469.23 ± 9.2771.21 ± 9.8173.12 ± 6.82
0.36710.38520.42320.4663
0.02960.03880.04270.0405
0.0569.71 ± 3.2372.13 ± 2.3273.03 ± 7.2371.64 ± 2.77
0.39720.44960.46000.4400
0.02850.04250.03700.0431
0.164.91 ± 1.6965.92 ± 7.4167.80 ± 8.4169.72 ± 7.30
0.29310.31450.35640.3876
0.02950.04270.04180.0395
Diabetes076.04 ± 4.7576.82 ± 3.5378.12 ± 3.6478.39 ± 4.18
0.44450.46880.49390.5017
0.06450.10210.13570.1281
0.0576.56 ± 4.4576.69 ± 3.3377.87 ± 4.1078.39 ± 4.18
0.45770.46320.48880.5015
0.06690.09270.12930.1216
0.176.69 ± 4.5676.82 ± 3.2878.13 ± 3.7778.65 ± 4.70
0.45930.46630.49580.5073
0.07030.09140.12790.1239
Spambase089.01 ± 1.3890.04 ± 1.6690.75 ± 1.2990.91 ± 1.15
0.76880.79050.80570.8089
2.82103.94674.24313.9955
0.0589.06 ± 1.0889.88 ± 1.4890.49 ± 1.0190.56 ± 0.62
0.76990.78730.80030.8018
2.76084.01913.99643.7862
0.188.16 ± 0.9389.42 ± 1.3089.58 ± 0.9989.43 ± 0.82
0.75130.77800.78080.7776
2.69063.92973.93923.6749
WDBC092.99 ± 0.9693.32 ± 1.2293.84 ± 2.3194.72 ± 1.78
0.84990.85740.86800.8869
0.05880.06810.08930.0878
0.0592.97 ± 0.9693.49 ± 1.3494.02 ± 2.3494.99 ± 1.72
0.84970.86140.87190.8907
0.05440.07360.08170.0930
0.193.33 ± 1.6293.66 ± 1.3294.02 ± 2.2794.72 ± 1.26
0.85720.86550.87220.8871
0.06050.06980.09410.0885
Win/tie/loss | 27/3/0 | 23/3/4 | 23/1/6
Table 10. Average rank of all methods when considering different ratios of noise.
Model | Mean Rank of Accuracy (Linear) | Mean Rank of Accuracy (Nonlinear) | Mean Rank of MCC (Linear) | Mean Rank of MCC (Nonlinear)
GPin-TSVM | 1.27 | 1.20 | 1.23 | 1.13
Lap-TSVM | 2.40 | 2.28 | 2.32 | 2.45
Lap-PTSVM | 3.00 | 2.97 | 3.15 | 3.10
LapGPin-TSVM | 3.33 | 3.55 | 3.30 | 3.32
Table 11. The results of the Wilcoxon signed-rank test analysis of the models when examining changes in the ratio of unlabeled data.
Compared with LapGPin-TSVM | Negative ranks (n / Mean Rank / Sum of Ranks) | Positive ranks (n / Mean Rank / Sum of Ranks) | Ties | p-Value
Linear case:
GPin-TSVM | 0 / NaN / 0.00 | 51 / 26.00 / 1326.00 | 1 | ≤0.001 *
Lap-TSVM | 11 / 13.73 / 151.00 | 39 / 28.82 / 1124.00 | 2 | ≤0.001 *
Lap-PTSVM | 15 / 18.53 / 278.00 | 35 / 28.49 / 997.00 | 2 | ≤0.001 *
Nonlinear case:
GPin-TSVM | 1 / 11.50 / 11.50 | 46 / 24.27 / 1116.50 | 5 | ≤0.001 *
Lap-TSVM | 11 / 22.50 / 247.50 | 36 / 24.46 / 880.50 | 5 | ≤0.001 *
Lap-PTSVM | 13 / 30.35 / 394.50 | 34 / 21.57 / 733.50 | 5 | 0.073
* Indicates a statistically significant difference.
Table 12. The results of the Wilcoxon signed-rank test analysis of the models when examining changes in the ratio of noise.
Compared with LapGPin-TSVM | Negative ranks (n / Mean Rank / Sum of Ranks) | Positive ranks (n / Mean Rank / Sum of Ranks) | Ties | p-Value
Linear case:
GPin-TSVM | 1 / 1.00 / 1.00 | 26 / 14.50 / 377.00 | 3 | ≤0.001 *
Lap-TSVM | 7 / 12.86 / 90.00 | 21 / 15.05 / 361.00 | 2 | 0.010 *
Lap-PTSVM | 8 / 17.06 / 136.50 | 19 / 12.71 / 241.50 | 3 | 0.207
Nonlinear case:
GPin-TSVM | 0 / NaN / 0.00 | 27 / 14.00 / 378.00 | 3 | ≤0.001 *
Lap-TSVM | 4 / 7.88 / 31.50 | 23 / 15.07 / 346.50 | 3 | ≤0.001 *
Lap-PTSVM | 6 / 15.67 / 94.00 | 23 / 14.83 / 341.00 | 1 | 0.008 *
* Indicates a statistically significant difference.
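The Wilcoxon signed-rank statistics in Tables 11 and 12 are obtained from paired per-dataset scores of LapGPin-TSVM and each competing model. A minimal sketch with SciPy; the two arrays below are illustrative placeholders rather than the paper's actual results.

    import numpy as np
    from scipy.stats import wilcoxon

    # Paired accuracies of LapGPin-TSVM and one baseline on the same datasets (placeholder values)
    lapgpin  = np.array([88.0, 98.0, 67.3, 90.9, 80.6, 77.0, 77.9, 76.8])
    baseline = np.array([88.0, 93.9, 60.0, 81.5, 79.2, 68.4, 75.5, 75.4])

    # With the default zero_method="wilcox", pairs with zero difference are discarded from the ranking,
    # which corresponds to reporting them separately as ties.
    stat, p_value = wilcoxon(lapgpin, baseline)
    print(f"W = {stat:.2f}, p = {p_value:.4f}")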
Table 13. Test accuracy of the four models under 5-fold cross-validation, reporting the accuracy for each fold and the average accuracy for each model.
Dataset | Model | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Average Accuracy
Airplane vs. Automobile | GPin-TSVM | 96.79 | 97.29 | 97.58 | 96.71 | 96.83 | 97.04 ± 0.34
Airplane vs. Automobile | Lap-TSVM | 97.96 | 98.46 | 98.25 | 98.33 | 98.54 | 98.31 ± 0.20
Airplane vs. Automobile | Lap-PTSVM | 98.00 | 98.17 | 98.17 | 97.79 | 97.79 | 97.98 ± 0.17
Airplane vs. Automobile | LapGPin-TSVM | 97.92 | 97.96 | 98.29 | 97.75 | 97.96 | 97.98 ± 0.18
Ship vs. Truck | GPin-TSVM | 90.50 | 91.29 | 90.46 | 89.08 | 89.38 | 90.14 ± 0.81
Ship vs. Truck | Lap-TSVM | 97.21 | 98.42 | 97.33 | 97.67 | 97.33 | 97.59 ± 0.44
Ship vs. Truck | Lap-PTSVM | 97.92 | 98.38 | 97.38 | 97.79 | 97.75 | 97.84 ± 0.32
Ship vs. Truck | LapGPin-TSVM | 98.08 | 98.71 | 97.54 | 97.96 | 97.83 | 98.03 ± 0.39
Deer vs. Horse | GPin-TSVM | 89.92 | 91.67 | 91.54 | 91.75 | 90.88 | 91.15 ± 0.69
Deer vs. Horse | Lap-TSVM | 91.30 | 91.88 | 91.75 | 92.83 | 91.04 | 91.76 ± 0.62
Deer vs. Horse | Lap-PTSVM | 92.71 | 92.58 | 92.46 | 92.46 | 92.08 | 92.46 ± 0.21
Deer vs. Horse | LapGPin-TSVM | 92.54 | 92.96 | 93.29 | 92.96 | 92.50 | 92.85 ± 0.29
Dog vs. Cat | GPin-TSVM | 79.30 | 78.75 | 79.17 | 79.96 | 80.88 | 79.61 ± 0.74
Dog vs. Cat | Lap-TSVM | 83.05 | 81.71 | 83.00 | 83.04 | 84.25 | 83.01 ± 0.80
Dog vs. Cat | Lap-PTSVM | 85.46 | 83.79 | 85.96 | 85.42 | 85.75 | 85.28 ± 0.77
Dog vs. Cat | LapGPin-TSVM | 85.26 | 84.96 | 85.83 | 86.04 | 87.67 | 85.95 ± 0.94
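The per-fold accuracies in Table 13 come from a standard 5-fold cross-validation. A hedged sketch of the evaluation loop using scikit-learn's StratifiedKFold and a stand-in classifier (the paper's own TSVM-based solvers are not reproduced here); the data are random placeholders.

    import numpy as np
    from sklearn.model_selection import StratifiedKFold
    from sklearn.svm import SVC                    # stand-in classifier, not the TSVM variants of the paper
    from sklearn.metrics import accuracy_score

    X = np.random.randn(200, 20)
    y = np.sign(np.random.randn(200))

    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    fold_acc = []
    for train_idx, test_idx in skf.split(X, y):
        clf = SVC(kernel="rbf").fit(X[train_idx], y[train_idx])
        fold_acc.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))

    print([f"{a:.4f}" for a in fold_acc],
          f"average = {np.mean(fold_acc):.4f} +/- {np.std(fold_acc):.4f}")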
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
