Article

Solving Variational Inclusion Problems with Inertial S* Forward–Backward Algorithm and Application to Stroke Prediction Data Classification

by Wipawinee Chaiwino 1,2,3, Payakorn Saksuriya 1,2,4 and Raweerote Suparatulatorn 1,2,5,*

1 Advanced Research Center for Computational Simulation, Chiang Mai University, Chiang Mai 50200, Thailand
2 Centre of Excellence in Mathematics, MHESI, Bangkok 10400, Thailand
3 Office of Research Administration, Chiang Mai University, Chiang Mai 50200, Thailand
4 International College of Digital Innovation, Chiang Mai University, Chiang Mai 50200, Thailand
5 Department of Mathematics, Faculty of Science, Lampang Rajabhat University, Lampang 52100, Thailand
* Author to whom correspondence should be addressed.
Mathematics 2026, 14(1), 101; https://doi.org/10.3390/math14010101
Submission received: 27 November 2025 / Revised: 19 December 2025 / Accepted: 22 December 2025 / Published: 26 December 2025
(This article belongs to the Special Issue Nonlinear Functional Analysis: Theory, Methods, and Applications)

Abstract

This article introduces an iterative algorithm constructed by integrating the $S^*$-iteration process with the inertial forward–backward algorithm. The algorithm is designed to solve optimization problems formulated as variational inclusions in a real Hilbert space, and we establish its weak convergence under conventional assumptions. As an application, the algorithm is used to train an extreme learning machine, whose training task can be reformulated as a variational inclusion problem. To evaluate performance, several algorithms, with all parameters set to identical values, are employed to solve a stroke classification problem. The results indicate that our algorithm converges faster than the others and achieves a precision of 93.90%, a recall of 100%, and an F1-score of 96.58%.

1. Introduction

Variational inclusion problems constitute an essential class of problems with numerous applications across many research areas, and they arise frequently in optimization. In particular, this research aims to develop a predictive model for assessing the risk of stroke, a major cause of mortality worldwide. The model is an extreme learning machine trained with the forward–backward algorithm, which is used to find the optimal parameters of the model. Stroke is a medical disorder and one of the leading causes of death globally. It occurs when blood flow to the brain is interrupted, damaging brain function [1]. The World Stroke Organization (WSO) emphasizes the worldwide importance of stroke, particularly in low-income and lower-middle-income nations [2]. From 1990 to 2019, stroke incidence surged by 70.0%, accompanied by a 43.0% rise in stroke-related deaths. Furthermore, approximately 5 million individuals worldwide suffer a stroke each year. The number of stroke survivors appears to be rising in developing nations: in 2013, there were approximately 25.7 million stroke survivors, 6.5 million stroke-related deaths, and 113 million disability-adjusted life years (DALYs) attributable to stroke [3]. Recently, the incidence of DALYs has almost doubled since 2013 [4]. This highlights the significance of stroke prevention and the necessity of continued study. Although multiple factors can lead to a stroke, recent evidence indicates that air pollution has become a significant environmental risk strongly connected with stroke incidence [5].
Global industrialization and urbanization have made air pollution a major public health problem. Particulate matter with a diameter of less than 2.5 μm, referred to as PM2.5, poses significant health risks: it can infiltrate the respiratory tract and penetrate the circulation. Scientific research increasingly demonstrates a strong connection between PM2.5 exposure and numerous health complications, including cardiovascular illnesses and stroke [6,7]. Strokes are a significant contributor to mortality and disability worldwide [8]; they impose substantial burdens on healthcare systems and diminish individuals' quality of life. Given the considerable human and economic repercussions of strokes, forecasting their incidence in relation to PM2.5 exposure is a critical issue. By developing predictive models that precisely evaluate the stroke risk associated with PM2.5 levels, we may enhance public health plans, guide policy decisions, and ultimately decrease the number of cases of this disease. Consequently, understanding and predicting the relationship between PM2.5 exposure and stroke risk is essential for preventative healthcare initiatives and for alleviating the effects of air pollution on public health.
The application of machine learning to predictive modeling and to the identification of critical risk factors for stroke is a growing interdisciplinary domain that utilizes artificial intelligence to improve stroke risk evaluation and management. Hassan et al. [9] introduced a dense stacking ensemble model for stroke prediction, identifying age, body mass index (BMI), average glucose level, cardiac disease, hypertension, and marital status as the most significant factors. Numerous machine learning models have since been developed and shown to be effective in predicting strokes, particularly by analyzing various data inputs, including sociodemographic information, medical histories, and lifestyle behaviors [10,11]. Abujaber et al. [12] identified critical factors such as age, gender, and pre-stroke health state, which significantly influence prognosis and treatment approaches. Moreover, machine learning techniques such as random forests and gradient boosting have demonstrated greater efficacy than conventional statistical methods, achieving higher accuracy rates and improving the prospects for early intervention and personalized patient treatment. Conversely, limitations such as small sample sizes and data reliability problems reduce the accuracy of all models, which demands further research into high-accuracy models.
In this article, we denote by $H$ a real Hilbert space equipped with the inner product $\langle \cdot , \cdot \rangle$ and the induced norm $\| \cdot \|$. Let $\mathbb{R}$ and $\mathbb{N}$ represent the sets of real numbers and positive integers, respectively.
A variational inclusion problem in H forms a fundamental class of mathematical problems with broad applications in optimization, equilibrium, and control theory. The standard formulation of this problem is given by
$$\text{Find } w \in H \text{ such that } 0 \in (F + G)w,$$
where $F : H \to H$ is a single-valued mapping and $G : H \to 2^{H}$ is a multivalued mapping. The set of all solutions to this problem is denoted by $(F + G)^{-1}(0)$. This formulation provides a unifying framework for various other problems, such as variational inequalities, complementarity problems, and fixed-point problems [13]. The importance of the variational inclusion problem is underscored by its wide-ranging applications across various fields. In optimization, it is employed to model and solve equilibrium problems in complex systems, such as transportation networks and financial markets [14]. In economics, it serves as a foundation for analyzing and predicting market behaviors, helping to model equilibrium conditions and assess the impacts of policy changes [15]. Moreover, in machine learning, this problem underpins algorithms for data classification and regression tasks [16]. These applications highlight the necessity for efficient and robust algorithms capable of managing the complexities inherent in the variational inclusion problem. There is a wealth of research related to the problem, which can be explored further in [17,18,19,20,21].
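To fix ideas, the inclusion specializes to composite convex minimization, which is exactly the setting used in Section 5: taking $F = \nabla f$ for a smooth convex function $f$ and $G = \partial g$ for a convex, proper, lower semicontinuous function $g$, the first-order optimality condition reads as follows (a standard identification, recorded here for reference):
$$\min_{w \in H} \; f(w) + g(w) \qquad \Longleftrightarrow \qquad 0 \in \nabla f(w^{\star}) + \partial g(w^{\star}) = (F + G)\, w^{\star}.$$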
One of the most effective methods for solving the variational inclusion problem is the forward–backward algorithm. It operates by iteratively applying a forward step to handle the smooth part of the problem and a backward step to address the non-smooth component. This dual-step process is particularly advantageous for problems that can be decomposed into these two parts, allowing for efficient computation and convergence [22]. Enhancing the forward–backward algorithm, the inertial forward–backward algorithm incorporates an inertial term, which accelerates convergence by utilizing information from previous iterations. This modification is inspired by momentum-based approaches in optimization, which are known to improve convergence rates significantly [23].
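As a minimal illustration of these two steps, the sketch below performs one plain forward–backward update and one inertial update for the composite problem above, with soft-thresholding playing the role of the backward (resolvent) step for $G = \partial(\lambda \|\cdot\|_1)$; the function names and this particular choice of $G$ are our own illustrative assumptions, not prescribed by the paper.

```python
import numpy as np

def soft_threshold(x, kappa):
    # Resolvent (proximal) step for G = subdifferential of kappa*||.||_1:
    # J_kappa^G(x) = argmin_u kappa*||u||_1 + 0.5*||u - x||^2.
    return np.sign(x) * np.maximum(np.abs(x) - kappa, 0.0)

def forward_backward_step(w, grad_F, tau, lam):
    # Forward step on the smooth part, backward step on the non-smooth part.
    return soft_threshold(w - tau * grad_F(w), tau * lam)

def inertial_forward_backward_step(w, w_prev, grad_F, tau, lam, sigma):
    # Inertial variant: extrapolate with momentum before the FB update.
    z = w + sigma * (w - w_prev)
    return soft_threshold(z - tau * grad_F(z), tau * lam)
```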
In 2018, Gibali and Thong [24] proposed two modified algorithms in which the step size rules are adapted via Mann and viscosity techniques, leading to computationally efficient methods and enhanced practical flexibility while guaranteeing strong convergence under suitable assumptions. In 2022, Shehu et al. [25] developed two forward–backward–forward-type inertial algorithms for variational inclusion problems in real Hilbert spaces: one guarantees weak convergence using a self-adaptive step size strategy independent of the Lipschitz constant, while the other establishes linear convergence under stronger assumptions, such as strong monotonicity, with step sizes explicitly bounded in terms of the Lipschitz constant. The proposed methods are further applied to an optimal control problem formulated as a variational inequality, with numerical results confirming their effectiveness.
Recent studies have explored the use of variational inclusion frameworks for applications in data classification and signal recovery. For example, Peeyada et al. [26] introduced the following inertial Mann forward–backward algorithm (Algorithm 1) for solving the variational inclusion problem, demonstrating its effectiveness in data classification using the Wisconsin breast cancer dataset and signal recovery tasks.
Algorithm 1 Inertial Mann forward–backward algorithm (IMFB).
  • Initialization: Select arbitrary elements $w_0, w_1 \in H$. Let $\{\sigma_n\} \subset [0, \infty)$, $\{\alpha_n\} \subset (0, 1)$, and $\{\tau_n\} \subset (0, 2\beta)$, where $\beta$ is a cocoercivity constant. Set $n := 1$.
  • Iterative Steps: Construct $\{w_n\}$ by using the following steps:
  • Step 1. Define
    $$z_n = w_n + \sigma_n (w_n - w_{n-1}),$$
    and
    $$y_n = z_n + \alpha_n (w_n - z_n).$$
    Step 2. Compute
    $$w_{n+1} = J_{\tau_n}^{G} (I - \tau_n F) y_n,$$
    where $J_{\tau_n}^{G} = (I + \tau_n G)^{-1}$. Replace $n$ with $n + 1$ and then repeat Step 1.
Similarly, another work by Peeyada et al. [27] proposed the following modified forward–backward algorithm (Algorithm 2) that incorporates an inertial technique to enhance convergence speed, with applications to breast cancer prediction using machine learning techniques.
Algorithm 2 Modified forward–backward algorithm (MFB).
  • Initialization: Select arbitrary elements $w_0, w_1 \in H$. Let $\{\sigma_n\} \subset [0, 1)$, $\{\alpha_n\}, \{\beta_n\} \subset [0, 1]$, and $\{\tau_n\} \subset (0, 2\beta)$, where $\beta$ is a cocoercivity constant. Set $n := 1$.
  • Iterative Steps: Construct $\{w_n\}$ by using the following steps:
  • Step 1. Define
    $$z_n = w_n + \sigma_n (w_n - w_{n-1}).$$
    Step 2. Compute
    $$y_n = \alpha_n w_n + (1 - \alpha_n) J_{\tau_n}^{G} (I - \tau_n F) z_n,$$
    and
    $$w_{n+1} = \beta_n J_{\tau_n}^{G} (I - \tau_n F) w_n + (1 - \beta_n) J_{\tau_n}^{G} (I - \tau_n F) y_n,$$
    where $J_{\tau_n}^{G} = (I + \tau_n G)^{-1}$. Replace $n$ with $n + 1$ and then repeat Step 1.
As demonstrated by Peeyada et al. [26,27], we observe that the adaptation of fixed point iterative schemes (such as Mann or S-iteration structures) into the framework of variational inclusion algorithms is a highly effective strategy. These schemes provide better approximation sequences, leading to empirical accelerated convergence and higher accuracy in numerical computations compared to standard methods.
In this study, we propose a novel algorithm that integrates the inertial forward–backward algorithm with the $S^*$-iteration process, which was introduced by Karahan and Ozdemir [28] for approximating fixed points and solving variational inequality problems. The proposed algorithm addresses the variational inclusion problem and ensures weak convergence under the standard assumptions outlined in Section 3. Furthermore, we demonstrate its practical utility by applying it to stroke prediction data classification using an extreme learning machine, as discussed in Section 5.

2. Preliminaries

In this section, we gather some essential definitions and lemmas that will be instrumental in establishing our main results. We use the notation ⇀ to indicate weak convergence and → for strong convergence. Let $p, q \in H$. The following identities hold:
$$\|p + q\|^2 = \|p\|^2 + 2\langle p, q \rangle + \|q\|^2 \tag{1}$$
and
$$\|\eta p + (1 - \eta) q\|^2 = \eta \|p\|^2 + (1 - \eta) \|q\|^2 - \eta (1 - \eta) \|p - q\|^2 \tag{2}$$
for any $\eta \in \mathbb{R}$.
Definition 1.
A multivalued mapping $G : H \to 2^{H}$ is classified as follows:
(i) $G$ is called monotone if, for every pair $(p, q), (v, u) \in \operatorname{Graph}(G)$, the following inequality holds:
$$\langle q - u, p - v \rangle \ge 0.$$
(ii) $G$ is called maximally monotone if it is monotone and there is no other monotone operator whose graph properly contains it. In other words, $G$ is maximally monotone if, for every $(p, q) \in H \times H$, the condition
$$\langle q - u, p - v \rangle \ge 0 \quad \text{for all } (v, u) \in \operatorname{Graph}(G)$$
implies that $(p, q) \in \operatorname{Graph}(G)$.
Definition 2.
A self-mapping $F : H \to H$ is classified as follows:
(i) $F$ is called firmly nonexpansive if, for all $p, q \in H$, it satisfies
$$\|Fp - Fq\|^2 \le \|p - q\|^2 - \|(I - F)p - (I - F)q\|^2.$$
(ii) $F$ is called a $\beta$-cocoercive mapping if there exists a constant $\beta > 0$ such that, for all $p, q \in H$, the following inequality holds:
$$\langle Fp - Fq, p - q \rangle \ge \beta \|Fp - Fq\|^2.$$
(iii) $F$ is called $L$-Lipschitz continuous if there exists a constant $L > 0$ such that, for all $p, q \in H$, the following inequality holds:
$$\|Fp - Fq\| \le L \|p - q\|.$$
Lemma 1.
Let $G : H \to 2^{H}$ be a maximally monotone mapping and let $\tau > 0$. The following properties are established:
(i) The operator $J_{\tau}^{G} = (I + \tau G)^{-1}$ is a single-valued firmly nonexpansive mapping [29].
(ii) If $F : H \to H$ is a mapping, then $(F + G)^{-1}(0)$ is equal to the set of fixed points of $J_{\tau}^{G}(I - \tau F)$ [29].
(iii) If $F : H \to H$ is a Lipschitz continuous and monotone mapping, then $F + G$ is a maximally monotone mapping [30].
Lemma 2.
([31]) Assume $F : H \to H$ is a $\beta$-cocoercive mapping. Then, $F$ is $\frac{1}{\beta}$-Lipschitz continuous and monotone.
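The Lipschitz claim in Lemma 2 follows directly from the Cauchy–Schwarz inequality; we record the one-line derivation:
$$\beta \,\| F p - F q \|^{2} \;\le\; \langle F p - F q ,\, p - q \rangle \;\le\; \| F p - F q \| \, \| p - q \| \quad \Longrightarrow \quad \| F p - F q \| \;\le\; \tfrac{1}{\beta} \, \| p - q \| ,$$
and monotonicity is immediate, since $\langle Fp - Fq, p - q \rangle \ge \beta \|Fp - Fq\|^2 \ge 0$.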
Lemma 3.
([32]) Let $\{c_n\}$ and $\{d_n\}$ be sequences of nonnegative real numbers such that $\sum_{n=1}^{\infty} d_n < \infty$ and $c_{n+1} \le c_n + d_n$ for all $n$. Then, the sequence $\{c_n\}$ converges.
Lemma 4.
([29] (Opial)) Let $\Psi$ be a nonempty subset of $H$, and let $\{w_n\}$ be a sequence in $H$. Assume the following conditions hold:
(i) For every $w \in \Psi$, the sequence $\{\|w_n - w\|\}$ converges.
(ii) Every weak sequential cluster point of $\{w_n\}$ belongs to $\Psi$.
Then, the sequence $\{w_n\}$ converges weakly to a point in $\Psi$.

3. Weak Convergence Results

To establish the weak convergence theorems for the variational inclusion problem using Algorithm 3, we first prove Lemma 5, which serves as a crucial tool for deriving the results of Lemma 6.
Lemma 5.
Assume that $F : H \to H$ is a $\beta$-cocoercive mapping and $G : H \to 2^{H}$ is a maximally monotone mapping such that $(F + G)^{-1}(0)$ is non-empty. Then, for every $z \in H$, $w \in (F + G)^{-1}(0)$, and $\tau > 0$, the following inequality holds:
$$\|J_{\tau}^{G}(I - \tau F)z - w\|^2 \le \|z - w\|^2 - \tau(2\beta - \tau)\|Fz - Fw\|^2 - \|z - J_{\tau}^{G}(I - \tau F)z - \tau(Fz - Fw)\|^2.$$
Proof. 
Let $w \in (F + G)^{-1}(0)$, $z \in H$, and $\tau > 0$. Then, according to Equation (1) and the fact that $F$ is a $\beta$-cocoercive mapping, we obtain
$$\begin{aligned} \|z - w - \tau(Fz - Fw)\|^2 &= \|z - w\|^2 - 2\tau \langle z - w, Fz - Fw \rangle + \tau^2 \|Fz - Fw\|^2 \\ &\le \|z - w\|^2 - 2\tau\beta \|Fz - Fw\|^2 + \tau^2 \|Fz - Fw\|^2 \\ &= \|z - w\|^2 - \tau(2\beta - \tau)\|Fz - Fw\|^2. \end{aligned}$$
By integrating this with Lemma 1 (i) and (ii), we derive the following conclusion:
$$\begin{aligned} \|J_{\tau}^{G}(I - \tau F)z - w\|^2 &= \|J_{\tau}^{G}(I - \tau F)z - J_{\tau}^{G}(I - \tau F)w\|^2 \\ &\le \|z - w - \tau(Fz - Fw)\|^2 - \|z - J_{\tau}^{G}(I - \tau F)z - \tau(Fz - Fw)\|^2 \\ &\le \|z - w\|^2 - \tau(2\beta - \tau)\|Fz - Fw\|^2 - \|z - J_{\tau}^{G}(I - \tau F)z - \tau(Fz - Fw)\|^2. \end{aligned}$$
   □
For clarity, we present Algorithm 3 in Lemma 6 as follows.
Lemma 6.
Let $F : H \to H$ be a $\beta$-cocoercive mapping, and let $G : H \to 2^{H}$ be a maximally monotone mapping such that $(F + G)^{-1}(0)$ is non-empty. Define the sequence $\{w_n\}$ as generated by the following iterative algorithm:
Algorithm 3 Inertial S* forward–backward algorithm (IS*FB).
  • Initialization: Select arbitrary elements $w_0, w_1 \in H$. Let $\{\alpha_n\}, \{\beta_n\}, \{\gamma_n\} \subset [0, 1]$, $\{\tau_n\} \subset (0, 2\beta)$, $\{\sigma_n\} \subset (-\infty, \infty)$, and set $n := 1$.
    Iterative Steps: Construct $\{w_n\}$ by using the following steps:
    Step 1. Define
    $$z_n = w_n + \sigma_n (w_n - w_{n-1}).$$
    Step 2. Compute
    $$t_n = J_{\tau_n}^{G}(I - \tau_n F) z_n, \qquad y_n = \alpha_n z_n + (1 - \alpha_n) t_n,$$
    and
    $$u_n = \beta_n t_n + (1 - \beta_n) J_{\tau_n}^{G}(I - \tau_n F) y_n.$$
    Step 3. Evaluate
    $$w_{n+1} = \gamma_n t_n + (1 - \gamma_n) J_{\tau_n}^{G}(I - \tau_n F) u_n.$$
    If $z_n = t_n$, then stop and $z_n \in (F + G)^{-1}(0)$. Otherwise, replace $n$ with $n + 1$ and then repeat Step 1.
Assume that $\sum_{n=1}^{\infty} |\sigma_n| \|w_n - w_{n-1}\| < \infty$. Then, for every $w \in (F + G)^{-1}(0)$, $\lim_{n \to \infty} \|w_n - w\|$ exists and $\lim_{n \to \infty} \|w_n - w\| = \lim_{n \to \infty} \|z_n - w\|$.
Proof. 
Let $w \in (F + G)^{-1}(0)$ and define $\Theta_n(z) = \|z - J_{\tau_n}^{G}(I - \tau_n F)z - \tau_n(Fz - Fw)\|^2$ for all $z \in H$ and $n \in \mathbb{N}$. In view of Lemma 5, we obtain the following three inequalities:
$$\|t_n - w\|^2 \le \|z_n - w\|^2 - \tau_n(2\beta - \tau_n)\|Fz_n - Fw\|^2 - \Theta_n(z_n), \tag{3}$$
$$\|J_{\tau_n}^{G}(I - \tau_n F)y_n - w\|^2 \le \|y_n - w\|^2 - \tau_n(2\beta - \tau_n)\|Fy_n - Fw\|^2 - \Theta_n(y_n), \tag{4}$$
and
$$\|J_{\tau_n}^{G}(I - \tau_n F)u_n - w\|^2 \le \|u_n - w\|^2 - \tau_n(2\beta - \tau_n)\|Fu_n - Fw\|^2 - \Theta_n(u_n). \tag{5}$$
Since $\tau_n \in (0, 2\beta)$, the three preceding inequalities reduce to
$$\|t_n - w\| \le \|z_n - w\|, \qquad \|J_{\tau_n}^{G}(I - \tau_n F)y_n - w\| \le \|y_n - w\|,$$
and
$$\|J_{\tau_n}^{G}(I - \tau_n F)u_n - w\| \le \|u_n - w\|.$$
Using Equation (2) along with inequalities (3)–(5), we demonstrate that inequality (6) holds, as outlined below:
$$\begin{aligned} \|w_{n+1} - w\|^2 &= \|\gamma_n t_n + (1 - \gamma_n) J_{\tau_n}^{G}(I - \tau_n F)u_n - w\|^2 \\ &\le \gamma_n \|t_n - w\|^2 + (1 - \gamma_n)\|J_{\tau_n}^{G}(I - \tau_n F)u_n - w\|^2 \\ &\le \gamma_n \|t_n - w\|^2 + (1 - \gamma_n)\|u_n - w\|^2 \\ &\le \gamma_n \|t_n - w\|^2 + (1 - \gamma_n)\big[\beta_n \|t_n - w\|^2 + (1 - \beta_n)\|J_{\tau_n}^{G}(I - \tau_n F)y_n - w\|^2\big] \\ &\le [\gamma_n + (1 - \gamma_n)\beta_n]\|t_n - w\|^2 + (1 - \gamma_n)(1 - \beta_n)\|y_n - w\|^2 \\ &\le [\gamma_n + (1 - \gamma_n)\beta_n]\|t_n - w\|^2 + (1 - \gamma_n)(1 - \beta_n)\big[\alpha_n \|z_n - w\|^2 + (1 - \alpha_n)\|t_n - w\|^2\big] \\ &= (1 - \gamma_n)(1 - \beta_n)\alpha_n \|z_n - w\|^2 + [\gamma_n + (1 - \gamma_n)\beta_n + (1 - \gamma_n)(1 - \beta_n)(1 - \alpha_n)]\|t_n - w\|^2 \\ &\le (1 - \gamma_n)(1 - \beta_n)\alpha_n \|z_n - w\|^2 + [\gamma_n + (1 - \gamma_n)\beta_n + (1 - \gamma_n)(1 - \beta_n)(1 - \alpha_n)] \\ &\qquad \cdot \big[\|z_n - w\|^2 - \tau_n(2\beta - \tau_n)\|Fz_n - Fw\|^2 - \Theta_n(z_n)\big] \\ &= \|z_n - w\|^2 - [\gamma_n + (1 - \gamma_n)\beta_n + (1 - \gamma_n)(1 - \beta_n)(1 - \alpha_n)] \cdot \big[\tau_n(2\beta - \tau_n)\|Fz_n - Fw\|^2 + \Theta_n(z_n)\big]. \end{aligned} \tag{6}$$
From inequality (6) and the definition of the sequence $\{z_n\}$, we derive that
$$\|w_{n+1} - w\| \le \|z_n - w\| = \|w_n + \sigma_n(w_n - w_{n-1}) - w\| \le \|w_n - w\| + |\sigma_n| \|w_n - w_{n-1}\|. \tag{7}$$
To conclude, we combine this inequality with Lemma 3 and the summability assumption, which shows that $\lim_{n \to \infty} \|w_n - w\|$ exists. Furthermore, it follows that $\lim_{n \to \infty} \|w_n - w\| = \lim_{n \to \infty} \|z_n - w\|$.    □
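Before turning to the convergence theorems, we note that Algorithm 3 is straightforward to implement. The sketch below is one possible realization for user-supplied forward and backward operators; the callables `grad_F` and `resolvent_G` and the tolerance-based version of the stopping test $z_n = t_n$ are our own illustrative choices.

```python
import numpy as np

def is_star_fb(w0, w1, grad_F, resolvent_G, tau, sigma, alpha, beta_p, gamma,
               max_iter=900, tol=1e-10):
    # One realization of Algorithm 3 (IS*FB).
    #   grad_F(w):         forward operator F evaluated at w
    #   resolvent_G(x, t): backward operator J_t^G(x) = (I + t G)^{-1} x
    #   tau, alpha, beta_p, gamma: callables n -> parameter value
    #   sigma: callable (n, w, w_prev) -> inertial parameter sigma_n
    w_prev, w = w0, w1
    for n in range(1, max_iter + 1):
        tn = tau(n)
        fb = lambda x: resolvent_G(x - tn * grad_F(x), tn)   # J(I - tau_n F)
        z = w + sigma(n, w, w_prev) * (w - w_prev)           # Step 1
        t = fb(z)                                            # t_n
        if np.linalg.norm(z - t) <= tol:                     # z_n = t_n: solution
            return z
        y = alpha(n) * z + (1 - alpha(n)) * t                # Step 2
        u = beta_p(n) * t + (1 - beta_p(n)) * fb(y)
        w_prev, w = w, gamma(n) * t + (1 - gamma(n)) * fb(u) # Step 3
    return w
```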
We are now ready to present the following three theorems on weak convergence.
Theorem 1.
Let $F : H \to H$ be a $\beta$-cocoercive mapping, and let $G : H \to 2^{H}$ be a maximally monotone mapping such that $(F + G)^{-1}(0)$ is non-empty. Define the sequence $\{w_n\}$ as generated by Algorithm 3. Assume that the following conditions are satisfied:
(i) $\sum_{n=1}^{\infty} |\sigma_n| \|w_n - w_{n-1}\| < \infty$.
(ii) $0 < \liminf_{n \to \infty} \tau_n \le \limsup_{n \to \infty} \tau_n < 2\beta$.
(iii) $\liminf_{n \to \infty} \gamma_n > 0$.
Then, $\{w_n\}$ converges weakly to an element of $(F + G)^{-1}(0)$.
Proof. 
Let $w \in (F + G)^{-1}(0)$. The proof of Lemma 6 demonstrates that inequalities (3)–(6) are satisfied and that $\lim_{n \to \infty} \|w_n - w\| = \lim_{n \to \infty} \|z_n - w\|$. Based on inequality (6), we obtain the following inequality:
$$\gamma_n \big[\tau_n(2\beta - \tau_n)\|Fz_n - Fw\|^2 + \Theta_n(z_n)\big] \le \|z_n - w\|^2 - \|w_{n+1} - w\|^2. \tag{8}$$
Applying the limit to this inequality, along with conditions (ii) and (iii), yields
$$\lim_{n \to \infty} \|Fz_n - Fw\| = \lim_{n \to \infty} \|z_n - t_n - \tau_n(Fz_n - Fw)\| = 0.$$
This implies that
$$\lim_{n \to \infty} \|z_n - t_n\| = 0. \tag{9}$$
It follows from the definition of the sequence $\{z_n\}$ and condition (i) that
$$\|w_n - t_n\| \le \|w_n - z_n\| + \|z_n - t_n\| = |\sigma_n| \|w_n - w_{n-1}\| + \|z_n - t_n\| \to 0 \quad \text{as } n \to \infty. \tag{10}$$
Next, let $\tilde{w}$ be a weak sequential cluster point of the sequence $\{w_n\}$, so there exists a subsequence $\{w_{n_k}\}$ such that $w_{n_k} \rightharpoonup \tilde{w}$ as $k \to \infty$. By Result (10), this further implies that $t_{n_k} \rightharpoonup \tilde{w}$ as $k \to \infty$. We then apply the maximal monotonicity of $F + G$ to show that $\tilde{w} \in (F + G)^{-1}(0)$. Let $(v, u) \in \operatorname{Graph}(F + G)$, meaning that $u - Fv \in Gv$. By the definition of $t_n$, the element $m_{n_k} := \frac{1}{\tau_{n_k}}(z_{n_k} - t_{n_k}) - F z_{n_k}$ belongs to $G t_{n_k}$. Applying the $\frac{1}{\beta}$-Lipschitz continuity of $F$, we obtain
$$\begin{aligned} \langle v - t_{n_k}, F t_{n_k} + m_{n_k} \rangle &\le \|v - t_{n_k}\| \, \|F t_{n_k} + m_{n_k}\| \\ &\le \|v - t_{n_k}\| \left( \|F t_{n_k} - F z_{n_k}\| + \frac{1}{\tau_{n_k}} \|z_{n_k} - t_{n_k}\| \right) \\ &\le \left( \frac{1}{\beta} + \frac{1}{\tau_{n_k}} \right) \|v - t_{n_k}\| \, \|z_{n_k} - t_{n_k}\|. \end{aligned}$$
Taking the limit in this inequality, we observe that the right-hand side tends to zero as $k \to \infty$, owing to the conditions $\beta > 0$ and $\liminf_{n \to \infty} \tau_n > 0$, Result (9), and the boundedness of $\{t_{n_k}\}$. Consequently, it follows immediately that
$$\lim_{k \to \infty} \langle v - t_{n_k}, F t_{n_k} + m_{n_k} \rangle = 0. \tag{11}$$
By exploiting the monotonicity of $G$ (since $u - Fv \in Gv$ and $m_{n_k} \in G t_{n_k}$), we establish that
$$\langle v - t_{n_k}, u - Fv - m_{n_k} \rangle \ge 0.$$
Thus, using the monotonicity of $F$, we derive
$$\begin{aligned} \langle v - t_{n_k}, u \rangle &\ge \langle v - t_{n_k}, Fv + m_{n_k} \rangle \\ &= \langle v - t_{n_k}, Fv - F t_{n_k} \rangle + \langle v - t_{n_k}, F t_{n_k} + m_{n_k} \rangle \\ &\ge \langle v - t_{n_k}, F t_{n_k} + m_{n_k} \rangle. \end{aligned}$$
Applying Result (11) and taking the limit of this inequality as $k \to \infty$, we obtain
$$\langle v - \tilde{w}, u \rangle = \lim_{k \to \infty} \langle v - t_{n_k}, u \rangle \ge 0.$$
This, combined with the maximal monotonicity of $F + G$, implies that $\tilde{w} \in (F + G)^{-1}(0)$. Finally, by invoking Lemma 4, we conclude that the sequence $\{w_n\}$ converges weakly to an element of $(F + G)^{-1}(0)$.    □
Theorem 2.
Let $F : H \to H$ be a $\beta$-cocoercive mapping, and let $G : H \to 2^{H}$ be a maximally monotone mapping such that $(F + G)^{-1}(0)$ is non-empty. Define the sequence $\{w_n\}$ as generated by Algorithm 3. Assume that the following conditions are satisfied:
(i) $\sum_{n=1}^{\infty} |\sigma_n| \|w_n - w_{n-1}\| < \infty$.
(ii) $0 < \liminf_{n \to \infty} \tau_n \le \limsup_{n \to \infty} \tau_n < 2\beta$.
(iii) $\liminf_{n \to \infty} \beta_n > 0$ and $\limsup_{n \to \infty} \gamma_n < 1$.
Then, $\{w_n\}$ converges weakly to an element of $(F + G)^{-1}(0)$.
Proof. 
Let $w \in (F + G)^{-1}(0)$. From inequality (6), we derive the following inequality:
$$(1 - \gamma_n)\beta_n \big[\tau_n(2\beta - \tau_n)\|Fz_n - Fw\|^2 + \Theta_n(z_n)\big] \le \|z_n - w\|^2 - \|w_{n+1} - w\|^2.$$
By using the same technique as in the proof of Theorem 1, it is established that the sequence { w n } converges weakly to an element in ( F + G ) 1 ( 0 ) .    □
Theorem 3.
Let $F : H \to H$ be a $\beta$-cocoercive mapping, and let $G : H \to 2^{H}$ be a maximally monotone mapping such that $(F + G)^{-1}(0)$ is non-empty. Define the sequence $\{w_n\}$ as generated by Algorithm 3. Assume that the following conditions are satisfied:
(i) $\sum_{n=1}^{\infty} |\sigma_n| \|w_n - w_{n-1}\| < \infty$.
(ii) $0 < \liminf_{n \to \infty} \tau_n \le \limsup_{n \to \infty} \tau_n < 2\beta$.
(iii) $\limsup_{n \to \infty} \alpha_n < 1$, $\limsup_{n \to \infty} \beta_n < 1$, and $\limsup_{n \to \infty} \gamma_n < 1$.
Then, $\{w_n\}$ converges weakly to an element of $(F + G)^{-1}(0)$.
Proof. 
Let $w \in (F + G)^{-1}(0)$. Inequality (6) leads to the following inequality:
$$(1 - \gamma_n)(1 - \beta_n)(1 - \alpha_n)\big[\tau_n(2\beta - \tau_n)\|Fz_n - Fw\|^2 + \Theta_n(z_n)\big] \le \|z_n - w\|^2 - \|w_{n+1} - w\|^2.$$
Utilizing the methodology outlined in the proof of Theorem 1, the sequence { w n } is shown to converge weakly to an element in ( F + G ) 1 ( 0 ) .    □

4. Data

This study employs a publicly available dataset from the Kaggle platform [33] to train and evaluate the proposed algorithms. The dataset, summarized in Table 1, contains 70,692 samples with 18 attributes, where the stroke variable serves as the binary target label (1 for stroke occurrence and 0 otherwise).
The dataset integrates diverse clinical, behavioral, and demographic factors, making it suitable for comprehensive stroke risk modeling. A key strength of this dataset lies in its inclusion of both modifiable and non-modifiable risk factors. Modifiable attributes encompass behavioral and physiological characteristics such as smoking status, physical activity, fruit and vegetable consumption, body mass index (BMI), and cholesterol levels. Non-modifiable variables include demographic factors such as age and gender. In addition, the dataset records mental health indicators and self-assessed general health, thereby introducing psychological aspects that complement traditional biomedical predictors. This integrated structure combines medical examinations (e.g., BMI, blood pressure), patient self-reports (e.g., lifestyle habits, general health), and clinical history (e.g., diabetes, heart disease), offering a realistic representation of healthcare data. The wide age range—from 18 to over 80 years—supports analysis of age-related stroke patterns, while binary and ordinal encodings ensure computational efficiency and reproducibility. Overall, this dataset provides a balanced and heterogeneous feature space that effectively captures the multifactorial nature of stroke risk, making it an appropriate benchmark for evaluating the robustness and generalization of the IS*FB, IMFB, and MFB algorithms.

5. Application

This section illustrates how a machine learning problem, particularly a classification problem, can be formulated as a convex minimization problem and solved using the proposed algorithm. In this study, we formulate the problem as a single-hidden-layer feedforward neural network (SLFN), which is formally defined as follows:
Let $x_k \in \mathbb{R}^{l}$, $k = 1, 2, \ldots, N$, denote the input training data, and let $r_k \in \mathbb{R}^{m}$ be the corresponding target values. A standard SLFN with $M$ hidden nodes employs an activation function $A$ such that
$$O_j = \sum_{i=1}^{M} \varphi_i A(\langle \omega_i, x_j \rangle + b_i),$$
where $\varphi_i$ represents the output weight at the $i$-th hidden node, $\omega_i$ is the input weight vector, and $b_i$ is the bias term. The hidden layer output matrix $\Lambda$ is constructed as follows:
$$\Lambda = \begin{bmatrix} A(\langle \omega_1, x_1 \rangle + b_1) & \cdots & A(\langle \omega_M, x_1 \rangle + b_M) \\ \vdots & \ddots & \vdots \\ A(\langle \omega_1, x_N \rangle + b_1) & \cdots & A(\langle \omega_M, x_N \rangle + b_M) \end{bmatrix}.$$
The primary objective of the SLFN is to determine the optimal output weight vector, denoted as
$$\varphi = [\varphi_1^{T}, \varphi_2^{T}, \ldots, \varphi_M^{T}]^{T},$$
which satisfies the equation $\Lambda \varphi = R$. Here, $R = [r_1^{T}, r_2^{T}, \ldots, r_N^{T}]^{T}$ represents the training target data. A natural candidate is the solution
$$\varphi = \Lambda^{\dagger} R,$$
where $\Lambda^{\dagger}$ denotes the Moore–Penrose generalized inverse of $\Lambda$. In cases where this solution is unavailable or $\Lambda$ is ill-conditioned, reformulating the problem as a convex minimization enables an effective computation of $\varphi$.
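A compact sketch of this construction is given below; the random initialization of $\omega_i$ and $b_i$ and the sigmoid activation follow the usual ELM recipe used later in this section, while the function and variable names are our own.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elm_hidden_matrix(X, M, rng):
    # X: (N, l) input data; M: number of hidden nodes. Input weights and
    # biases are drawn at random and kept fixed, as in a standard ELM.
    W = rng.standard_normal((X.shape[1], M))   # columns are the omega_i
    b = rng.standard_normal(M)                 # biases b_i
    return sigmoid(X @ W + b), W, b            # Lambda has shape (N, M)

def elm_output_weights(Lam, R):
    # Unregularized solution phi = Lambda^+ R via the Moore-Penrose
    # pseudoinverse; when Lambda is ill-conditioned, the regularized
    # problem below is solved with the proposed algorithm instead.
    return np.linalg.pinv(Lam) @ R
```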
We now describe the experimental setup for our classification problem. We employ Algorithm 3 on the regularized least squares problem given by
$$\min_{\varphi \in \mathbb{R}^{M}} \frac{1}{2} \|\Lambda \varphi - R\|_2^2 + \lambda \|\varphi\|_1, \qquad \lambda > 0.$$
Algorithm 3 addresses this problem with the following choice of operators (realized concretely in the sketch just below):
$$F(\varphi) = \nabla\!\left(\tfrac{1}{2}\|\Lambda \varphi - R\|_2^2\right) = \Lambda^{T}(\Lambda \varphi - R), \qquad G(\varphi) = \partial(\lambda \|\varphi\|_1).$$
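In code, both operators admit closed forms: the forward operator is the least-squares gradient and the backward operator is soft-thresholding. The sketch below (under these standard identifications; the helper name is ours) plugs directly into the IS*FB sketch given after Lemma 6.

```python
import numpy as np

def make_operators(Lam, R, lam):
    # Forward operator: gradient of 0.5 * ||Lam @ phi - R||_2^2.
    grad_F = lambda phi: Lam.T @ (Lam @ phi - R)
    # Backward operator: J_t^G for G = subdifferential of lam * ||.||_1,
    # i.e., soft-thresholding at level t * lam.
    resolvent_G = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t * lam, 0.0)
    # Cocoercivity constant of grad_F: beta = 1 / ||Lam||_2^2 (spectral norm).
    beta = 1.0 / np.linalg.norm(Lam, 2) ** 2
    return grad_F, resolvent_G, beta
```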
With this setting, the operator $F$ is $\frac{1}{\|\Lambda\|_2^2}$-cocoercive and the operator $G$ is maximally monotone; see [29]. The evaluation of the proposed algorithm is based on four key performance metrics: accuracy, precision, recall, and F1-score [34]. These metrics are computed from the numbers of true positives ($TP$), false positives ($FP$), true negatives ($TN$), and false negatives ($FN$), as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\%;$$
$$\text{Precision} = \frac{TP}{TP + FP} \times 100\%;$$
$$\text{Recall} = \frac{TP}{TP + FN} \times 100\%;$$
$$\text{F1-score} = \frac{2 \times (\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}}.$$
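Computed directly from the confusion-matrix counts, these metrics take only a few lines (a straightforward sketch; the function name is ours):

```python
def classification_metrics(tp, fp, tn, fn):
    # All four metrics from raw confusion-matrix counts; the first three
    # are reported as percentages, matching the formulas above.
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100.0
    precision = tp / (tp + fp) * 100.0
    recall = tp / (tp + fn) * 100.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```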
Before model construction, the dataset was randomly split into a training set (80%) and a test set (20%). The computational process employs a sigmoid activation function with $M = 10$ hidden nodes. The maximum number of iterations, set to 900, serves as a stopping criterion. To ensure a fair comparison across algorithms, all parameters are set to identical values, with $\sigma_n$ defined as follows and the remaining parameters summarized in Table 2:
$$t_0 = 1, \qquad t_n = \frac{1 + \sqrt{1 + 4 t_{n-1}^2}}{2},$$
and
$$\sigma_n = \begin{cases} \min\left\{ \dfrac{1}{(n+1)^2 \|w_n - w_{n-1}\|_2}, \; 0.5 \right\} & \text{if } w_n \ne w_{n-1} \text{ and } n > 900; \\[4pt] 0.5 & \text{if } w_n = w_{n-1} \text{ and } n > 900; \\[4pt] \dfrac{t_{n-1} - 1}{t_n} & \text{otherwise} \end{cases}$$
for all $n \in \mathbb{N}$.
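This schedule can be implemented as follows (a sketch; it mirrors the piecewise definition above, with the FISTA-type factor $(t_{n-1} - 1)/t_n$ used while $n \le 900$, and its signature matches the IS*FB sketch given after Lemma 6):

```python
import numpy as np

def make_sigma(threshold=900, cap=0.5):
    ts = [1.0]  # t_0 = 1; the list is extended on demand below

    def t(n):
        # t_n = (1 + sqrt(1 + 4 t_{n-1}^2)) / 2, memoized iteratively.
        while len(ts) <= n:
            ts.append((1.0 + np.sqrt(1.0 + 4.0 * ts[-1] ** 2)) / 2.0)
        return ts[n]

    def sigma(n, w, w_prev):
        diff = np.linalg.norm(w - w_prev)
        if n > threshold:
            if diff == 0.0:                          # w_n = w_{n-1}
                return cap
            return min(1.0 / ((n + 1) ** 2 * diff), cap)
        return (t(n - 1) - 1.0) / t(n)               # FISTA-type momentum
    return sigma
```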
The classification results, summarized in Table 3 and illustrated in Figure 1, Figure 2 and Figure 3, show that the three algorithms yield comparable predictive performance on the stroke dataset. IMFB and IS*FB achieve identical metrics, with a precision and accuracy of 93.90%, a recall of 100%, and an F1-score of 96.58%, while MFB attains slightly different results, with a precision of 93.71% and an F1-score of 96.75%. A consistent observation across all models is the perfect recall, confirming that all stroke cases in the test set were identified. The high F1-scores and accuracies exceeding 93% indicate that all algorithms maintain a strong balance between precision and recall, suggesting reliable generalization. While the overall predictive performances are similar, a clear distinction emerges in convergence efficiency: the IS*FB algorithm attains stable accuracy at the 151st iteration, compared with 171 for IMFB and 381 for MFB. This reduction in iteration count translates directly into lower computational cost and faster training time, which are crucial for large-scale and real-time learning applications. The rapid stabilization of the precision and F1-score curves further confirms the numerical stability and smooth error decay of IS*FB, attributable to the inertial $S^*$-iteration mechanism, which integrates momentum information from previous iterations to accelerate convergence without oscillation. Hence, IS*FB preserves the predictive reliability of traditional inertial schemes while achieving superior computational efficiency, making it a promising framework for high-dimensional optimization and data-driven classification tasks.

6. Conclusions and Future Work

This study introduces the IS*FB algorithm for solving variational inclusion problems. We establish its weak convergence under standard assumptions and demonstrate its practical effectiveness. Specifically, we apply the algorithm to a stroke prediction task using an extreme learning machine with a single hidden layer. The dataset, comprising 70,692 samples and 18 features, captures diverse risk factors such as lifestyle, medical history, and physiological attributes. Experimental results show that IS*FB attains competitive classification accuracy while converging more rapidly than existing forward–backward algorithms, making it a promising choice for large-scale optimization in machine learning applications.
Overall, these findings underscore the potential of variational inclusion–based methods in predictive modeling, especially within healthcare analytics. Future research could explore adaptive parameter strategies and extend the algorithm to other medical datasets or real-time prediction frameworks.

Author Contributions

Conceptualization, W.C., P.S. and R.S.; methodology, R.S.; software, W.C.; validation, P.S. and R.S.; formal analysis, W.C. and R.S.; investigation, W.C. and P.S.; resources, W.C. and P.S.; data curation, W.C. and P.S.; writing—original draft preparation, R.S.; writing—review and editing, W.C. and P.S.; visualization, W.C.; supervision, R.S.; project administration, R.S.; funding acquisition, W.C., P.S. and R.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by (1) Fundamental Fund 2026, Chiang Mai University, Chiang Mai, Thailand; (2) Thailand Science Research and Innovation (TSRI); (3) Chiang Mai University, Chiang Mai, Thailand; and (4) Centre of Excellence in Mathematics, MHESI, Bangkok 10400, Thailand.

Data Availability Statement

The dataset used to train the model and evaluate the algorithm’s performance, titled “Diabetes, Hypertension and Stroke Prediction”, is publicly available on the Kaggle platform at https://www.kaggle.com/datasets/prosperchuks/health-dataset/data (accessed on 27 November 2025).

Acknowledgments

This research was partially supported by (1) Fundamental Fund 2026, Chiang Mai University, Chiang Mai, Thailand; (2) Thailand Science Research and Innovation (TSRI); (3) Chiang Mai University, Chiang Mai, Thailand; and (4) Centre of Excellence in Mathematics, MHESI, Bangkok 10400, Thailand.

Conflicts of Interest

The authors declare no conflicts of interest in this paper.

References

  1. Bersano, A.; Gatti, L. Pathophysiology and treatment of stroke: Present status and future perspectives. Int. J. Mol. Sci. 2023, 24, 14848. [Google Scholar] [CrossRef]
  2. Feigin, V.L.; Brainin, M.; Norrving, B.; Martins, S.; Sacco, R.L.; Hacke, W.; Fisher, M.; Pandian, J.; Lindsay, P. World Stroke Organization (WSO): Global Stroke Fact Sheet 2022. Int. J. Stroke 2022, 17, 18–29. [Google Scholar] [CrossRef]
  3. Feigin, V.L.; Krishnamurthi, R.V.; Parmar, P.; Norrving, B.; Mensah, G.A.; Bennett, D.A.; Barker-Collo, S.; Moran, A.E.; Sacco, R.L.; Truelsen, T.; et al. Update on the Global Burden of Ischemic and Hemorrhagic Stroke in 1990–2013: The GBD 2013 Study. Neuroepidemiology 2015, 45, 161–176. [Google Scholar] [CrossRef] [PubMed]
  4. Sudharsanan, N.; Deshmukh, M.; Kalkonde, Y. Direct estimates of disability-adjusted life years lost due to stroke: A cross-sectional observational study in a demographic surveillance site in rural Gadchiroli, India. BMJ Open 2019, 9, e028695. [Google Scholar] [CrossRef] [PubMed]
  5. Lin, H.; Guo, Y.; Di, Q.; Zheng, Y.; Kowal, P.; Xiao, J.; Liu, T.; Li, X.; Zeng, W.; Howard, S.W.; et al. Ambient PM2.5 and stroke: Effect modifiers and population attributable risk in six low- and middle-income countries. Stroke 2017, 48, 1191–1197. [Google Scholar] [CrossRef]
  6. Ueshima, H.; Sekikawa, A.; Miura, K.; Turin, T.C.; Takashima, N.; Kita, Y.; Watanabe, M.; Kadota, A.; Okuda, N.; Kadowaki, T.; et al. Cardiovascular disease and risk factors in Asia: A selected review. Circulation 2008, 118, 2702–2709. [Google Scholar] [CrossRef] [PubMed]
  7. Alexeeff, S.E.; Liao, N.S.; Liu, X.; Van Den Eeden, S.K.; Sidney, S. Long-term PM2.5 exposure and risks of ischemic heart disease and stroke events: Review and meta-analysis. J. Am. Heart Assoc. 2021, 10, e016890. [Google Scholar] [CrossRef]
  8. GBD 2021 Stroke Risk Factor Collaborators. Global, regional, and national burden of stroke and its risk factors, 1990–2021: A systematic analysis for the Global Burden of Disease Study 2021. Lancet Neurol. 2024, 23, 973–1003. [Google Scholar] [CrossRef]
  9. Hassan, A.; Gulzar Ahmad, S.; Ullah Munir, E.; Ali Khan, I.; Ramzan, N. Predictive modelling and identification of key risk factors for stroke using machine learning. Sci. Rep. 2024, 14, 11498. [Google Scholar] [CrossRef]
  10. Dritsas, E.; Trigka, M. Stroke risk prediction with machine learning techniques. Sensors 2022, 22, 4670. [Google Scholar] [CrossRef]
  11. Chahine, Y.; Magoon, M.J.; Maidu, B.; Del Álamo, J.C.; Boyle, P.M.; Akoum, N. Machine learning and the conundrum of stroke risk prediction. Arrhythmia Electrophysiol. Rev. 2023, 12, e07. [Google Scholar] [CrossRef]
  12. Abujaber, A.A.; Alkhawaldeh, I.M.; Imam, Y.; Nashwan, A.J.; Akhtar, N.; Own, A.; Tarawneh, A.S.; Hassanat, A.B. Predicting 90-day prognosis for patients with stroke: A machine learning approach. Front. Neurol. 2023, 14, 1270767. [Google Scholar] [CrossRef] [PubMed]
  13. Rockafellar, R.T.; Wets, R.J.B. Variational Analysis; Springer Science & Business Media: Berlin, Germany, 2009; Volume 317. [Google Scholar] [CrossRef]
  14. Facchinei, F.; Pang, J.S. Finite-Dimensional Variational Inequalities and Complementarity Problems; Springer Science & Business Media: New York, NY, USA, 2003. [Google Scholar] [CrossRef]
  15. Whinston, M.D.; Green, J.R. Microeconomic Theory; Oxford University Press: Oxford, UK, 1995. [Google Scholar]
  16. Bishop, C.M. Pattern Recognition and Machine Learning; Springer: New York, NY, USA, 2006. [Google Scholar]
  17. Jun-On, N.; Cholamjiak, W. Enhanced double inertial forward-backward splitting algorithm for variational inclusion problems: Applications in mathematical integrated skill prediction. Symmetry 2024, 16, 1091. [Google Scholar] [CrossRef]
  18. Tongnoi, B. The forward-backward-forward algorithm with extrapolation from the past and penalty scheme for solving monotone inclusion problems and applications. Numer. Algor. 2025, 98, 2113–2143. [Google Scholar] [CrossRef]
  19. Khonchaliew, M.; Petrot, N.; Suwannaprapa, M. Two-step inertial viscosity subgradient extragradient algorithm with self-adaptive step sizes for solving pseudomonotone equilibrium problems. Carpathian J. Math. 2025, 41, 813–835. [Google Scholar] [CrossRef]
  20. Thang, T.V.; Tien, H.M. Parallel inertial forward-backward splitting methods for solving variational inequality problems with variational inclusion constraints. Math. Meth. Appl. Sci. 2025, 48, 748–764. [Google Scholar] [CrossRef]
  21. Alakoya, T.O.; Ogunsola, O.J.; Mewomo, O.T. An inertial viscosity algorithm for solving monotone variational inclusion and common fixed point problems of strict pseudocontractions. Bol. Soc. Mat. Mex. 2023, 29, 31. [Google Scholar] [CrossRef]
  22. Lions, P.L.; Mercier, B. Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 1979, 16, 964–979. [Google Scholar] [CrossRef]
  23. Nesterov, Y.E. A method for solving the convex programming problem with convergence rate O(1/k²). Sov. Math. Dokl. 1983, 27, 372–376. [Google Scholar]
  24. Gibali, A.; Thong, D.V. Tseng type methods for solving inclusion problems and its applications. Calcolo 2018, 55, 49. [Google Scholar] [CrossRef]
  25. Shehu, Y.; Liu, L.; Dong, Q.L.; Yao, J.C. A relaxed forward-backward-forward algorithm with alternated inertial step: Weak and linear convergence. Netw. Spat. Econ. 2022, 22, 959–990. [Google Scholar] [CrossRef]
  26. Peeyada, P.; Suparatulatorn, R.; Cholamjiak, W. An inertial Mann forward-backward splitting algorithm of variational inclusion problems and its applications. Chaos Soliton. Fract. 2022, 158, 112048. [Google Scholar] [CrossRef]
  27. Peeyada, P.; Dutta, H.; Shiangjen, K.; Cholamjiak, W. A modified forward-backward splitting methods for the sum of two monotone operators with applications to breast cancer prediction. Math. Methods Appl. Sci. 2023, 46, 1251–1265. [Google Scholar] [CrossRef]
  28. Karahan, I.; Ozdemir, M. A general iterative method for approximation of fixed points and their applications. Adv. Fixed Point Theory 2013, 3, 510–526. [Google Scholar]
  29. Bauschke, H.H.; Combettes, P.L. Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd ed.; Springer: Cham, Switzerland, 2017. [Google Scholar] [CrossRef]
  30. Brézis, H. Opérateurs Maximaux Monotones et Semi-Groupes de Contractions dans les Espaces de Hilbert; Mathematical Studies; North-Holland: Amsterdam, The Netherlands, 1973; Volume 5. [Google Scholar]
  31. Zhang, S.S.; Lee, J.H.; Chan, C.K. Algorithms of common solutions to quasi variational inclusion and fixed point problems. Appl. Math. Mech. 2008, 29, 571–581. [Google Scholar] [CrossRef]
  32. Auslender, A.; Teboulle, M.; Ben-Tiba, S. A logarithmic-quadratic proximal method for variational inequalities. Comput. Optim. Appl. 1999, 12, 31–40. [Google Scholar] [CrossRef]
  33. Diabetes, Hypertension and Stroke Prediction Dataset. Available online: https://www.kaggle.com/datasets/prosperchuks/health-dataset/data (accessed on 27 November 2025).
  34. Wardhani, N.; Rochayani, M.; Iriany, A.; Sulistyono, A.; Lestantyo, P. Cross-validation Metrics for Evaluating Classification Performance on Imbalanced Data. In Proceedings of the 2019 International Conference on Computer, Control, Informatics and its Applications (IC3INA), Tangerang, Indonesia, 23–24 October 2019; pp. 14–18. [Google Scholar] [CrossRef]
Figure 1. Performance metrics of the IMFB algorithm during training and testing. Subfigures show (a) Accuracy, (b) Precision, (c) Recall, and (d) F1 Score across 1000 iterations.
Figure 2. Performance metrics of the MFB algorithm during training and testing. Subfigures show (a) Accuracy, (b) Precision, (c) Recall, and (d) F1 Score across 1000 iterations.
Figure 3. Performance metrics of the IS*FB algorithm during training and testing. Subfigures show (a) Accuracy, (b) Precision, (c) Recall, and (d) F1 Score across 1000 iterations.
Table 1. Stroke dataset attributes information.

Attribute Name: Definitions and Encoding

Input
Age: age category of patient (1 = 18–24; 2 = 25–29; 3 = 30–34; 4 = 35–39; 5 = 40–44; 6 = 45–49; 7 = 50–54; 8 = 55–59; 9 = 60–64; 10 = 65–69; 11 = 70–74; 12 = 75–79; 13 = 80 or older)
gender: gender of patient (0 = female; 1 = male)
HighChol: 0 = no high cholesterol; 1 = high cholesterol
CholCheck: 0 = no cholesterol check in 5 years; 1 = cholesterol check in 5 years
BMI: body mass index
Smoker: have you smoked at least 100 cigarettes in your entire life? [Note: 5 packs = 100 cigarettes] (0 = no; 1 = yes)
HeartDiseaseorAttack: coronary heart disease (CHD) or myocardial infarction (MI) (0 = no; 1 = yes)
PhysActivity: physical activity in past 30 days, not including job (0 = no; 1 = yes)
Fruits: consume fruit 1 or more times per day (0 = no; 1 = yes)
Veggies: consume vegetables 1 or more times per day (0 = no; 1 = yes)
HvyAlcoholConsump: heavy alcohol consumption (adult men having more than 14 drinks per week; adult women having more than 7 drinks per week) (0 = no; 1 = yes)
GenHlth: would you say that in general your health is: 1 = excellent; 2 = very good; 3 = good; 4 = fair; 5 = poor
MentHlth: days of poor mental health, scale 1–30 days
PhysHlth: physical illness or injury days in past 30 days, scale 1–30
DiffWalk: do you have serious difficulty walking or climbing stairs? (0 = no; 1 = yes)
HighBP: 0 = no high BP; 1 = high BP
Diabetes: 0 = no diabetes; 1 = diabetes

Output
stroke: 0 = no stroke; 1 = suffered stroke
Table 2. Chosen parameters for each algorithm.

Algorithm | $\tau_n$ | $\alpha_n$ | $\beta_n$ | $\gamma_n$ | $\lambda$
IMFB | $1.9999 / \|\Lambda\|_2^2$ | $10^{-3}$ | – | – | 1
MFB | $1.9999 / \|\Lambda\|_2^2$ | $10^{-3}$ | $10^{-3}$ | – | 1
IS*FB | $1.9999 / \|\Lambda\|_2^2$ | $10^{-3}$ | $10^{-3}$ | $10^{-3}$ | 1
Table 3. The performance of all algorithms.

Algorithm | Precision | Recall | F1-Score | Accuracy
IMFB | 93.90 | 100.00 | 96.58 | 93.90
MFB | 93.71 | 100.00 | 96.75 | 93.71
IS*FB | 93.90 | 100.00 | 96.58 | 93.90
