Introducing AI in Pension Planning: A Comparative Study of Deep Learning and Mamdani Fuzzy Inference Systems for Estimating Replacement Rates

Lappas, Pantelis Z.; Symeonidis, Georgios

doi:10.3390/math13233737

Open AccessArticle

Introducing AI in Pension Planning: A Comparative Study of Deep Learning and Mamdani Fuzzy Inference Systems for Estimating Replacement Rates

by

Pantelis Z. Lappas

^1,*

and

Georgios Symeonidis

^2,*

¹

IDEAL Research Laboratory, Department of Statistics and Actuarial-Financial Mathematics, University of the Aegean, 83200 Karlovassi, Greece

²

Department of Statistics and Insurance Science, University of Piraeus, 18534 Piraeus, Greece

^*

Authors to whom correspondence should be addressed.

Mathematics 2025, 13(23), 3737; https://doi.org/10.3390/math13233737

Submission received: 6 October 2025 / Revised: 9 November 2025 / Accepted: 18 November 2025 / Published: 21 November 2025

(This article belongs to the Special Issue Quantitative Analysis and Mathematical Modeling in Economics and Financial Decision-Making)

Download

Browse Figures

Versions Notes

Abstract

Funded pensions have become a key focus in strategies to ensure supplementary income during retirement. This paper explores two distinct approaches for estimating replacement rates: a deep learning model and a Mamdani Fuzzy Inference System (FIS). Using synthetic datasets for training, the deep learning model delivered accurate replacement rate predictions when benchmarked against exact solutions. On the other hand, the FIS approach, which leverages expert insights and practical experience, produced encouraging results but revealed opportunities for refining the definitions of intervals and linguistic categories. To bridge the strengths of both approaches, we introduce a conceptual integration using the Analytic Hierarchy Process (AHP), providing a multi-criteria decision-support framework that combines predictive accuracy from neural networks with the interpretability of fuzzy systems. The findings emphasize the potential of artificial intelligence (AI) methods, including neural networks and fuzzy logic, in advancing pension planning. While these techniques remain underutilized in this area, they hold significant promise for developing decision-support systems, particularly in big data contexts. Such systems can offer initial replacement rate estimates, serving as valuable inputs for experts during the decision-making process. Additionally, the paper suggests future research into multi-criteria decision analysis to improve decision-making within multi-pillar pension frameworks.

Keywords:

funded pensions; replacement rate; deep learning; mamdani fuzzy inference system; expert knowledge; synthetic data; multi-criteria decision analysis

MSC:

68T20

1. Introduction

As the demographic factor, captured by the dependency ratio, is the driving force behind upward expenditure trends [1], many countries that traditionally avoided funded pensions are now incorporating these into their main pillars. One example is Greece, which established its first fully funded pension fund in the first pensions pillar in 2021. Looking at developed countries as a whole, ref. [2] discusses how many OECD (Organisation for Economic Co-operation and Development) countries have undertaken pension reforms to address pension and aging-related challenges. These reforms often include raising the retirement age, adjusting benefit formulas, and promoting private pension schemes to complement public pensions. Additionally, the report notes the importance of adapting pension systems to demographic changes to ensure that they remain both adequate and financially viable. On the other hand, many countries have been retracting privatization reforms and have returned to more traditional, Pay-as-you-go Pension Plan (PAYG) systems [3]. In total, eighteen countries, thirteen in Eastern Europe and the former Soviet Union and five in Latin America, reversed privatizations, meaning 60 per cent of the countries that had privatized pensions reversed the process and started to switch back to public systems [4].

The replacement rate has long been a key consideration in pension planning, reflecting how effectively a pension system can replace a retiree’s pre-retirement earnings. This metric is invaluable for both policymakers and pension planners, serving as a foundational tool to design systems that balance adequate retirement income with the sustainability of the pension framework [5]. Therefore, replacement rate helps determine if retirees will be able to sustain their standard of living after they stop working. A higher replacement rate indicates a pension system that provides greater financial security for retirees, offering a more comfortable standard of living during retirement. In contrast, a lower replacement rate may signal potential economic difficulties for retirees, raising concerns about the adequacy of their retirement income.

Given the complexity and dynamic nature of pension systems, there is a growing need for advanced tools that can accurately estimate and predict replacement rates. Recent actuarial studies have shown that deep learning can effectively model complex demographic and financial relationships, particularly in mortality forecasting and longevity risk estimation [6]. These developments demonstrate that AI techniques are increasingly capable of addressing dynamic, non-linear pension-related phenomena, motivating their use in this study.

While replacement rates can be calculated deterministically from known inputs, real-world applications often involve uncertainty, incomplete data, or multi-pillar interactions where deterministic formulas alone may be insufficient. Techniques rooted in Artificial Intelligence (AI) and Decision Sciences, such as neural networks [7] and fuzzy logic [8], offer robust solutions to these challenges. Furthermore, explainable AI methods are gaining importance in actuarial and financial modeling, as they enable transparency and trust in predictive systems [9]. This aspect is particularly relevant in pension planning, where interpretability supports regulatory compliance and stakeholder confidence.

In the context of pension planning, AI tools can play a pivotal role in (i) analyzing extensive datasets, (ii) identifying intricate patterns, (iii) developing decision-support tools, and (iv) delivering highly accurate predictions. For instance, these tools can provide preliminary replacement rate estimates, particularly in scenarios involving large-scale data analysis, enabling experts to make well-informed decisions based on data-driven insights. Even in scenarios where deterministic calculations exist, AI provides a scalable approach for integrating expert knowledge, handling large datasets, and supporting preliminary decision-making, enabling pension experts to focus on nuanced or multi-criteria adjustments.

This study investigates two innovative approaches for estimating replacement rates: (i) Artificial Neural Networks (ANNs) and (ii) a Fuzzy Inference System (FIS). Using a synthetic dataset, it evaluates the predictive capabilities of these AI-driven methods in comparison to traditional calculation techniques. These calculations take into account factors such as annuity values, contribution rates, and expected return rates. For both ANNs and FIS, the study explores various parameter configurations, aiming to identify optimal setups based on insights from previous research [10]. Furthermore, to leverage the complementary strengths of ANNs and FIS, this study introduces a conceptual integration using the Analytic Hierarchy Process (AHP). This multi-criteria decision-making framework enables the systematic combination of the predictive accuracy of neural networks with the interpretability of fuzzy systems.

The novelty of this research does not lie merely in approximating deterministic outputs but also in demonstrating how AI can serve as a decision-support system, offering flexible, extensible, and scalable tools that can adapt to more complex or uncertain pension scenarios. This study therefore fills a gap in the literature by providing the first comparative assessment of ANN and Mamdani FIS models for replacement-rate estimation under a controlled experimental framework.

This work aligns with recent advances in hybrid AI systems that combine neural and fuzzy reasoning to achieve interpretable and adaptive decision-making frameworks [11], further underscoring the potential of explainable intelligence in financial forecasting contexts. Particularly, this research lies in demonstrating how neural networks and fuzzy logic can enhance the precision and reliability of replacement rate estimations, offering significant improvements over conventional methods and paving the way for more effective pension management strategies.

The remainder of this paper is structured as follows: Section 2 provides a brief literature review on the introduction of funded schemes internationally and the relevance of the latter to the promotion of savings and potential growth, as well as the application of AI techniques in pension modeling. In the following section (Section 3), the methodology and all necessary equations together with the corresponding parameters are discussed. Moreover, the section outlines the methodology employed in this study, detailing the use of synthetic datasets and the implementation of neural networks and FIS for replacement rate calculations. Section 4 presents computational experiments and results, including computer specification, data generation processes and results for both AI methods and traditional calculations. In addition to this, we analyze and discuss the results obtained from our experiments, highlighting the performance of neural networks and FIS compared to conventional approaches. Finally, Section 5 concludes the paper with a summary of key findings and their implications for the future of pension planning. Moreover, it outlines future research directions, suggesting potential enhancements and expansions of the proposed methodologies. Additional details on the fundamental concepts of neural networks and fuzzy logic are provided in Appendix A and Appendix B, respectively.

2. Literature Review

Some countries decided to introduce mandatory fully funded pension schemes, as a second tier to the PAYG tier. Recent developments have led policy makers to recognize that fully funded components can be an integral part of an adequate retirement income. In this direction, several countries, including Sweden and some Eastern Member States such as Bulgaria, Estonia, Croatia, Latvia, Lithuania, Poland and Slovakia, and Sweden have switched part of their public pension schemes into (quasi-) mandatory private funded schemes [1]. These arrangements are typically statutory, but the insurance relationship is between the individual and the pension fund. As a result, insured individuals retain ownership of their pension assets, allowing them to benefit from the returns but also assume the associated risks.

The introduction of funded pillars became popular across Europe after the 1990’s [12], however, the reversals which followed the 2008 economic and financial crisis actually reduced or completely eliminated the role of funding in many countries [13]. There were also serious disagreements about the effect of these privatizations [4].

One of the main arguments in favor of funded schemes is that they promote growth. Based on the econometric results found in [14], it can be confirmed that the investments of pension funds have had a positive impact on the economic growth of selected non-OECD countries. It is found that the strong performance of pension funds from 2002 to 2018 directly boosted the economies of non-OECD countries. Increased funds available for pension asset investments led to improved efficiency in their capital markets, fueling economic growth.

For OECD countries, a recent study by the OECD [2] likewise stresses the importance of private pensions to economic growth. The underlying logic is that larger pension assets can translate into greater investment, which in turn can fuel economic activity and growth. OECD also shows that pension fund assets in the OECD area exceeded USD 35 trillion at end-2020, versus 32.3 in 2019 [15]. There was an increasing trend despite COVID-19 in almost all countries except those facing significant early withdrawals.

Having said that, let there be no doubt that funded schemes, as PAYG schemes, also come with flaws. In 2018, a study of the International Labour Organization (ILO) presented an extensive list of these flaws, as perceived by its authors [4]. Examples are the stagnation of coverage rates, the deterioration of pension benefits, the increase in gender and income inequality, the high transition costs and respective large fiscal pressures, high administrative costs, weak governance, the concentration of the private insurance industry, the actual benefits and their recipients, the actual effect on capital markets in developing countries, the transfer of financial markets and demographic risks from states to individuals and the deteriorated social dialogue.

In any case, the typology of the pension funds around the world shows an increase in funded pensions systems and an overall inclination of countries to implement these [1,2], without this meaning that negative aspects of funded schemes are neglected.

It is worth mentioning that the field of pension systems and aging populations has witnessed significant progress, largely driven by the integration of machine learning and advanced statistical techniques. Research reviewed in this survey highlights a wide range of applications for these methodologies, covering various aspects of pension management and elderly care. Recent studies demonstrate that deep learning techniques, such as Long Short-Term Memory (LSTM) models, have been successfully applied to mortality forecasting in developing countries, highlighting their potential for actuarial applications in pension systems [16]. In parallel, AI-driven portfolio optimization has proven valuable for managing pension fund assets, supporting both individual and institutional investment decisions [17].

Liu et al. (2024) [18] conducted a study aimed at predicting osteoporotic fractures using Electronic Health Records (EHR). Their work tested 12 machine learning algorithms and 3 hybrid models, focusing on identifying risk factors. The top-performing model combined Support Vector Machine (SVM) with XGBoost, achieving over 90% accuracy and highlighting key clinical variables. In another study, ref. [19] applied survival analysis techniques to Turkey’s private pension system, utilizing Cox regression, accelerated failure time models, and machine learning methods. Their findings show that machine learning provides non-parametric alternatives to traditional models, revealing socio-economic disparities in private pension participation.

In Hungary, the research by [20] compared the effectiveness of (LSTM) neural networks with the traditional Lee-Carter model for mortality forecasting. Their findings showed that LSTM models achieved higher accuracy, underscoring their potential to enhance actuarial calculations and contribute to the sustainability of pension systems. Similarly, ref. [21] examined the link between income-expenditure balance and retirement age in pension systems through machine learning techniques. Their analysis revealed a strong correlation between retirement age and life expectancy, offering important insights to guide policy decisions for creating sustainable pension schemes.

In Denmark, ref. [22] employed statistical regularization and group bridge methods to analyze retirement transitions in the country’s pension programs, identifying significant factors that influence retirement decisions. Expanding on the evaluation of pension services, ref. [23] introduced a novel method for assessing service quality by utilizing multi-dimension attention Convolutional Neural Networks (MACNN). The proposed method integrated emoticons, sentiment analysis, and textual features to provide deeper and more accurate insights into consumer sentiment compared to traditional methods.

Further advancing the role of AI technology, ref. [24] addressed gaps in smart home pension systems by combining machine learning with wireless sensor networks. The proposed combination improved elderly care services by providing detailed data-driven insights into daily routines and health status. Additionally, ref. [25] developed a machine learning algorithm specifically designed to model lexicographic preferences in decision-making. Applied to occupational pension schemes, this approach effectively supported decision-making by evaluating multiple attributes simultaneously.

In the field of public policy, researchers have also made significant contributions by advocating for Fuzzy Set Qualitative Comparative Analysis (FSQCA) as a valuable alternative to traditional case-oriented and regression-based methods. This method is particularly effective for research designs with smaller sample sizes, offering concise results while addressing the complexity of individual cases [26].

Fuzzy decision-making methods have recently been extended to pension fund management and identification of medical insurance fraud, enabling adaptive strategies for risk assessment and asset/resource allocation [27,28]. Moreover, hybrid architectures that combine neural networks with fuzzy inference systems (ANFIS) have been explored to enhance prediction accuracy while maintaining interpretability, reflecting contemporary trends in AI-driven financial modeling [29]. It should be noted that few studies on hybrid AI architectures have explored formal multi-criteria decision-making frameworks to systematically integrate the outputs of neural networks and fuzzy inference systems. This gap is particularly evident in pension planning, where balancing predictive accuracy with interpretability and expert knowledge is critical. Our study addresses this limitation by introducing AHP, providing a structured methodology to combine complementary AI approaches into a coherent decision-support framework.

Although numerous qualitative and quantitative studies have advanced the understanding of pension systems, a notable gap remains in the application of neural networks and fuzzy logic to replacement-rate estimation. Systematic reviews of fuzzy set applications in finance mostly emphasize their broad utility in risk management and decision-making [30]. Moreover, few studies have adopted a comparative framework that incorporates explainable models such as fuzzy inference systems. This gap highlights the novelty and significance of exploring these methodologies further.

It is worth mentioning that while deterministic formulas exist for replacement rate estimation, they are often rigid and do not easily accommodate uncertainty, large-scale data, or multi-criteria considerations. The application of AI approaches such as neural networks and fuzzy logic provides scalable, adaptable, and extensible frameworks that can serve both as accurate predictors and as decision-support tools. This flexibility underscores the practical relevance of our study beyond simple function approximation.

3. Methodology

In this study, we aim to provide a controlled comparative evaluation of two alternative AI methodologies, Neural Networks (NNs) and Mamdani FIS, for estimating replacement rates in funded pension schemes. The selected variables, including annuity, contribution rate, return rate, total working years, and expected life of the annuity, are standard and theoretically grounded determinants in pension modeling [10], capturing the key factors affecting replacement rate outcomes. The use of a large synthetic dataset allows precise benchmarking against exact calculations while ensuring diverse and realistic scenarios. While some linear relationships exist in the data, both AI methods are fully extensible to non-linear, stochastic, or real-world datasets. The models are intentionally presented independently as alternative approaches rather than hybridized, enabling a clear comparison of their respective strengths, limitations, and applicability for decision support.

In addition to separately modeling replacement rates using NN and FIS approaches, the study conceptually explores their potential integration through a multi-criteria decision-making framework based on the AHP, as described in Section 3.6.

3.1. Overview of the Replacement Rate Concept

This paper will follow and showcase the three parameters mentioned above, as these are presented and analyzed in [10]. These parameters are annuity, return on investment and contribution rate. In funded systems, a contributor puts aside a certain amount of money per period of time; in the end of their working life, they get back the accumulated funds. These comprise the future present value of the amounts paid, increased by the return on investment, and paid out in the form of a lifelong pension flow (annuity).

A simplified formula describing the accrued amount would be:

Acc = \sum_{i = 1}^{n} X {(1 + i)}^{n}

where:

$A c c$ : is the accrued amount at retirement;
X: is the amount of contributions per year (assumed fixed);
i: is the yearly rate of return on investment;
n: is the number of years of contribution.

3.1.1. The Baseline Scenario Analysis

As stated earlier, this paper will use the same baseline as in [10] and build up the examples with neural networks and fuzzy logic, respectively. We will, however, mention the most essential points of the scenario to avoid extensive need of reference to the original text.

In a fully funded system that is implemented de novo, without any financial burden from past commitments. The potential transition from an already existing system, which is many times the case, along with its financial burden, is not analyzed in this paper. As in all pensions, the single most critical indicator is the projected replacement rate at the time of retirement. The replacement rate is defined as the fraction of the first pension amount, over the last wage. An alternative not used in this paper, substitutes the denominator with the average career income instead of the last wage.

In order to calculate representative replacement rates for the fully funded system, we expand Equation (1) to account for income variability and maturity. This allows us to calculate the required contribution rate as a percentage of a specific amount of income (e.g., 6% on earnings). We also account for a percentage of expenses on contributions. In this context, the accrued amount at time n shall be given by:

{Acc}_{n} = X_{n} \cdot (1 - E) {(1 + i)}^{\frac{1}{2}} + {Acc}_{n - 1} (1 + i)

where:

${Acc}_{n}$ : is the accrued amount at time n;
$X_{n}$ : is the amount of contributions at time n as a percentage of income, before expenses;
E: is the percentage of expenses on contributions;
i: is the yearly rate of return on investment;
n: is the number of years of contributions.

The annual return rate on investment (i) is assumed to be constant for the purpose of simplicity. This means we practically use an average return rate overall instead of the different yearly rates. Payments are assumed monthly; hence, the amount of the last year is multiplied by

{(1 + i)}^{\frac{1}{2}}

as if there were one payment in the middle of the year, to simplify.

Let us use Equation (2) and proceed to calculate the replacement rate for a proposed funded scheme. To do that, certain additional assumptions are needed. These are the salary maturity per year (i.e., the percentage of increase in a person’s salary due to legislated increase based on more years of work, an increase in pay or a promotion), a baseline contribution rate which would serve as a norm or as the minimum (i.e., legislators are expected to present people with a standard percentage of contribution to either begin with or serve as the standard where choice is not made), an expense rate on contributions, and a real return rate (deducing expenses).

After careful considerations [10], the full career is assumed to be 40 years, annuity is set to 15.64, the baseline contribution rate on pensionable earnings is set at 6% of pensionable earnings, maturity per year on income and expense rates on contributions are both set at 0.5% (see Table 1). Finally, the real return rate (RRR) is carefully looked into [10] using historical returns of 100 years along with experience from literature, OECD and EU publications, and the 30-year maturity of the Zero-coupon yield curve spot rates of AAA-rated euro area central government bonds and the whole euro area central governments bonds, and is set at 3.5% for the baseline, ranging from 2% to 6% for the sensitivity scenarios.

Using these assumptions, we calculate the replacement rate at 26.02% for 40 years or 0.65% yearly, which constitutes the baseline calculation.

We observe that the replacement rate covers a significant amount of return on retirement, with a contribution of 6% on pensionable income.

3.1.2. Sensitivity Analysis

In the previous subsection, a careful selection of assumptions was made in order to depict the baseline scenario. Drawing on research from [10] we will now present the variations/sensitivity analysis for the three variables chosen, annuity, contribution, and return rate, respectively.

Based on different life tables and adjustments for the increase in life expectancy in the last years, ignoring the COVID-19 years which could lead to a serious miscalculation, we derive a span of 14.00 to 18.00, with a 0.1 interval. The replacement rates arising from the main annuity values can be found in Table 2.

Respectively, we may assume values for the return rate varying between 2% and 6% based on extensive analysis and careful consideration of historical returns, experience from literature, publications of international institutions, and government bonds. Under these considerations, the baseline was set at 3.5%, while the sensitivity scenarios will range from 2% to 6%. Using a step of 0.1%, the results for the main values of the return rate and respective replacement rates are depicted in Table 3 below.

Finally, the contribution rate is presented using values between 5% and 9%, based mainly on European experience. A step of 0.1% has been used. Table 4 below shows the main contribution rate values and respective replacement rates.

3.2. Computational Methodology Overview

The computational methodology for estimating replacement rates in pension systems is designed as a structured pipeline that transforms raw inputs into predictive outputs through well-defined algorithmic steps. This framework aims to formalize the sequence of data preprocessing, model training, inference, and evaluation, allowing a rigorous understanding of the procedures while providing reproducibility and transparency.

The methodology can be conceptualized as a set of complementary toolboxes. The first toolbox is a neural network-based approach, where parameters are estimated through optimization over training data. This method is fully data-driven and suitable for high-dimensional problems. The second toolbox is a Mamdani FIS, which is human-in-the-loop and rule-based, enabling interpretable predictions without the need for iterative training. Both toolboxes share the same input space but differ fundamentally in the way knowledge is incorporated and outputs are generated. The combination of these two frameworks provides a versatile and rigorous computational environment for replacement rate estimation, allowing analysts to select the appropriate method based on data availability, interpretability requirements, and computational resources.

Figure 1 presents a schematic overview of the computational methodology developed in this study. The diagram illustrates two complementary analytical pipelines built upon a shared preprocessing stage. The lower section of the figure also highlights the comparative strengths and limitations of each method, as well as their potential hybridization for future research.

3.2.1. Mathematical Formalization of the Neural Network Pipeline

Let the dataset be denoted by:

D = {(x^{(i)}, y^{(i)})}_{i = 1}^{N}, x^{(i)} = (x_{1}^{(i)}, x_{2}^{(i)}, \dots, x_{d}^{(i)}) \in R^{d}, y^{(i)} \in R^{+},

(1)

where

x^{(i)}

represents the input features (such as annuity, contribution rate, return rate, and years of contributions), and

y^{(i)}

is the target replacement rate. The dataset is first partitioned into training, validation, and test sets, denoted

D_{train}

,

D_{val}

, and

D_{test}

, respectively. Feature standardization is applied:

{\tilde{x}}_{j}^{(i)} = \frac{x_{j}^{(i)} - μ_{j}}{σ_{j}}, j = 1, \dots, d,

(2)

where

μ_{j}

and

σ_{j}

are the empirical mean and standard deviation of feature

x_{j}

over the training set.

The neural network is formalized as a function:

f_{θ} : R^{d} \to R, {\hat{y}}^{(i)} = f_{θ} ({\tilde{x}}^{(i)}),

(3)

where

θ

is the set of all weights and biases. For a network with L layers, the forward propagation is defined recursively by:

\begin{matrix} h^{(0)} & = {\tilde{x}}^{(i)}, \end{matrix}

(4)

\begin{matrix} h^{(ℓ)} & = ϕ_{ℓ} (W^{(ℓ)} h^{(ℓ - 1)} + b^{(ℓ)}), ℓ = 1, \dots, L - 1, \end{matrix}

(5)

\begin{matrix} {\hat{y}}^{(i)} & = ψ (W^{(L)} h^{(L - 1)} + b^{(L)}), \end{matrix}

(6)

where

W^{(ℓ)}

and

b^{(ℓ)}

are the weight matrix and bias vector for layer ℓ,

ϕ_{ℓ}

is the activation function for hidden layers (e.g., ReLU), and

ψ

is the output layer activation function (Softplus to ensure positive predictions).

The network parameters

θ = {W^{(ℓ)}, b^{(ℓ)}}_{ℓ = 1}^{L}

are optimized by minimizing the mean squared error over the training set:

\hat{θ} = arg min_{θ} \frac{1}{| D_{train} |} \sum_{i \in D_{train}} {(y^{(i)} - f_{θ} ({\tilde{x}}^{(i)}))}^{2} .

(7)

Hyperparameters, including the number of layers L, neurons per layer, learning rate, dropout rates, and batch size, are selected via K-fold cross-validation. The cross-validation loss is defined as:

{\hat{L}}_{CV} = \frac{1}{K} \sum_{k = 1}^{K} \frac{1}{| D_{k}^{val} |} \sum_{i \in D_{k}^{val}} {(y^{(i)} - f_{θ} ({\tilde{x}}^{(i)}))}^{2} .

(8)

Once trained, the network is evaluated on the test set using multiple metrics, including mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), R-squared (

R^{2}

), adjusted R-squared (

R_{adj}^{2}

), and mean absolute percentage error (MAPE). The complete formal definitions of these metrics are provided in Appendix A.

3.2.2. Mathematical Formalization of the Fuzzy Inference System Pipeline

Let the input space be

X \subset R^{d}

and the output space

Y \subset R

. Each input variable

x_{j}

is associated with a set of

K_{j}

fuzzy sets

A_{j, k}

, each with a membership function:

μ_{A_{j, k}} : R \to [0, 1], k = 1, \dots, K_{j} .

(9)

The output variable y has M fuzzy sets

B_{m}

with membership functions

μ_{B_{m}} (y)

.

A set of R rules is defined as:

R_{r} : IF x_{1} is A_{1, k_{1}^{r}} AND \dots AND x_{d} is A_{d, k_{d}^{r}} THEN y is B_{m^{r}}, r = 1, \dots, R .

(10)

For an input

x^{(i)}

, the firing strength of rule r is computed using the minimum t-norm:

α_{r}^{(i)} = min_{j = 1, \dots, d} μ_{A_{j, k_{j}^{r}}} (x_{j}^{(i)}) .

(11)

The aggregated fuzzy output is then:

{\tilde{y}}^{(i)} = ⋃_{r = 1}^{R} α_{r}^{(i)} \cdot B_{m^{r}},

(12)

and a crisp output is obtained using a defuzzification method. Appendix B presents a complete formal definitions of the fuzzy logic concepts.

The FIS approach provides interpretable predictions without the need for training, relying instead on expert-defined rules and membership functions. It is particularly advantageous in low-dimensional settings or when human-in-the-loop control is desired. Limitations include scalability issues and the need for careful design of fuzzy sets.

3.3. Description of the Synthetic Datasets Used

To assess the performance of our models, we created a comprehensive synthetic dataset designed to simulate realistic pension planning scenarios. The dataset was generated with a large number of data points to ensure reliable model training and evaluation. In total, we generated 500,000 samples, offering a diverse range of values for key variables relevant to pension systems.

It is important to note that the synthetic dataset is intentionally used for controlled benchmarking, allowing us to directly compare the predictive performance of NN and Mamdani FIS against exact replacement rate calculations. While some linear relationships exist in the data, the methodology is fully extensible to non-linear, stochastic, or real-world datasets, which represent potential future applications. This controlled setup ensures robust validation of the models before deployment in complex real-world scenarios.

The synthetic data was constructed by defining specific intervals for the main parameters involved in estimating replacement rates. The parameters and their corresponding ranges are as follows:

Annuity Interval: Annuity values were sampled between 14.00 and 18.00, covering a realistic range of potential annual pension payouts.
Return Rate Interval: The expected return on investment, a crucial factor in pension planning, was varied from 2.0% to 6.0%, reflecting a range from conservative to moderately aggressive investment strategies.
Contribution Rate Interval: The contribution rate, representing the percentage of income allocated to the pension, was sampled between 6.00% and 10.00%, capturing a variety of typical contribution rates.
Years: The number of contributing years was fixed at 40, representing a typical full career span for pension contributions.

These intervals were selected to reflect realistic pension scenarios and to ensure the dataset covers a wide range of potential outcomes for model evaluation.

Although the dataset is synthetic, the proposed computational framework is fully transferable to empirical pension data. The mathematical formulations of both the neural network and FIS pipelines (Section 3.2.1 and Section 3.2.2) are agnostic to the data source, requiring only consistent feature alignment and normalization. Application to real-world datasets would involve standard preprocessing steps, including data anonymization, noise filtering, and handling missing values, followed by retraining the models with observed variables such as income distributions, contribution patterns, and demographic characteristics. Consequently, the synthetic dataset serves as a controlled validation stage, while the overall methodological design remains directly applicable as a decision-support prototype in practical pension planning contexts.

3.4. Details of the Deep Learning Model

The primary goal of the neural network in this study is to provide an alternative approach for estimating replacement rates, rather than to replace deterministic formulas. The model evaluates how well NN can predict replacement rates based on key variables, offering a scalable and flexible method for preliminary decision support. Interpretability and feature-importance analysis, while valuable, are beyond the scope of this comparative study.

The development of our deep learning (DL) model involved two primary phases: an initial baseline neural network and a more exhaustive search for an optimal model architecture. Below, we detail the methodology followed for each phase.

3.4.1. Baseline Neural Network Approach

In the baseline approach (Algorithm 1), we started by constructing a straightforward neural network model using TensorFlow and Keras libraries in Python 3.12.7. Initially, the data preparation involved splitting the dataset into training and testing sets with an 80-20 ratio. The features were then standardized to ensure each feature had a mean of zero and a standard deviation of one, enhancing the model’s convergence.

Algorithm 1 Baseline Neural Network Approach

1:: Load and Preprocess Data
2:: Load the dataset into a DataFrame
3:: Split the dataset into features (X) and target (y)
4:: Split the data into training (80%) and testing (20%) sets using train_test_split
5:: Standardize the features using StandardScaler
6:: Build the Neural Network Model
7:: Initialize a sequential model
8:: Add a dense layer with 64 neurons and ReLU activation
9:: Add a dropout layer with 20% dropout rate
10:: Add a dense layer with 32 neurons and ReLU activation
11:: Add a dropout layer with 20% dropout rate
12:: Add a dense layer with 16 neurons and ReLU activation
13:: Add a dropout layer with 20% dropout rate
14:: Add a dense output layer with 1 neuron and Softplus activation
15:: Compile the model with Adam optimizer, MSE loss function, and MAE metric
16:: Train the Model
17:: Define early stopping with patience of 10 epochs
18:: Train the model using training data with validation split of 20%, for up to 100 epochs, and batch size of 32, including early stopping
19:: Evaluate the Model
20:: Evaluate the model on the test set and print MAE
21:: Plot training and validation loss over epochs
22:: Plot training and validation MAE over epochs
23:: Make Predictions and Visualize
24:: Make predictions on the test set
25:: Plot true vs. predicted values
26:: Plot the residuals distribution
27:: Save the Model and Scaler
28:: Save the model to an H5 file
29:: Save the scaler to a PKL file

The model architecture for the baseline approach comprised three hidden layers with 64, 32, and 16 neurons, respectively. Each of these layers was followed by a dropout layer to prevent overfitting. Rectified Linear Unit (ReLU) activation functions were employed for the hidden layers, and a Softplus activation function was used for the output layer to ensure positive predictions.

During the training phase, the model was compiled with the Adam optimizer and MSE loss function, tracking MAE as a metric. Early stopping was implemented to monitor validation loss and halt training when improvements ceased, thus preventing overfitting. The model was trained for up to 100 epochs with a batch size of 32, using 20% of the training data as a validation set.

The model’s performance was evaluated on the test set, reporting the MAE. Additionally, training and validation loss, along with MAE, were plotted over epochs to visualize the model’s learning process. Predictions were compared to true values using scatter plots, and residuals were analyzed to assess the model’s accuracy.

3.4.2. Exhaustive Search Approach

To optimize the model architecture, we conducted an exhaustive search exploring various network topologies (Algorithm 2). This phase aimed to identify the most effective neural network configuration for our predictive model. Initially, data preparation mirrored the baseline approach, involving standardization and splitting the dataset into training and testing sets.

Algorithm 2 Exhaustive Search Neural Network Approach

1:: Load and Preprocess Data
2:: Load the dataset into a DataFrame
3:: Split the dataset into features (X) and target (y)
4:: Split the data into training (80%) and testing (20%) sets using train_test_split
5:: Standardize the features using StandardScaler
6:: Define Model Building Function
7:: Initialize a sequential model
8:: for each hidden layer do
9:: Add a dense layer with specified number of neurons and ReLU activation
10:: Add a dropout layer with 20% dropout rate
11:: end for
12:: Add a dense output layer with 1 neuron and Softplus activation
13:: Compile the model with Adam optimizer, MSE loss function, and MAE metric
14:: Experiment with Different Topologies
15:: Define the range of hidden layers and neurons per layer
16:: Initialize variables to track the best model and lowest MAE
17:: Train and Evaluate Models
18:: for each configuration of hidden layers and neurons do
19:: Build the model with the current configuration
20:: Define early stopping with patience of 10 epochs
21:: Train the model using training data with validation split of 20%, for up to 100 epochs, and batch size of 32, including early stopping
22:: Evaluate the model on the test set
23:: if the model’s MAE is lower than the best MAE then
24:: Update the best MAE
25:: Save the current model and training history as the best model
26:: end if
27:: end for
28:: Plot Training History of Best Model
29:: Plot training and validation loss/MAE over epochs
30:: Make Predictions and Visualize
31:: Make predictions with the best model on the test set
32:: Evaluate the model using MAE, MSE, RMSE, $R^{2}$ , $R_{adj}^{2}$ and MAPE
33:: Plot true vs. predicted values & Plot the residuals distribution
34:: Save the Best Model and Scaler
35:: Save the best model to an H5 file
36:: Save the scaler to a PKL file

Exploration of model architectures involved testing multiple configurations, varying the number of hidden layers from 1 to 5 and adjusting the number of neurons per layer (specifically 64, 32, 16, 8, and 4 neurons). Each architecture configuration incorporated dropout layers to mitigate overfitting and ReLU activation functions for the hidden layers, while employing a Softplus activation function for the output layer to ensure positive predictions.

During training, early stopping mechanisms were implemented to prevent overfitting, utilizing the Adam optimizer and the MSE loss function. The models were trained with a focus on minimizing validation MAE, which served as the primary criterion for selecting the best-performing model.

Evaluation of the selected model encompassed multiple metrics including MAE, MSE, RMSE,

R^{2}

,

R_{adj}^{2}

, and MAPE. To visualize model performance, predictions were plotted against true values, while residual analysis provided insights into the accuracy and effectiveness of the selected architecture.

3.5. Explanation of the Mamdani Fuzzy Inference System

The Mamdani FIS allows us to model complex relationships between input variables (i.e., annuity, return rate, and contribution rate) and an output variable, replacement rate, using fuzzy logic principles (Algorithm 3). Fuzzy logic is particularly suited for handling imprecise and uncertain information by assigning degrees of membership to linguistic terms, such as “low”, “medium”, and “high”, rather than crisp numerical values.

The process begins with defining the range of values, or the universe of discourse, for each variable. For instance, the annuity spans values from approximately 14.0 to 18.0, the return rate varies between 2.0% and 6.0%, and the contribution rate ranges from 6.0% to 10.0%. These ranges are divided into smaller segments to ensure that the fuzzy sets accurately capture the variations within the input data.

Membership functions are then created for each variable, employing triangular shapes to represent different categories. For example, the annuity variable is divided into the categories “low”, “medium”, and “high”, with centers at 14.0, 15.5, and 17.0, respectively. Additional triangular membership functions are developed for the return rate and contribution rate, with linguistic terms tailored to reflect the unique characteristics of each variable’s range.

The output variable, which is the replacement rate, is also assigned membership functions that describe varying levels of desirability across its possible values. Categories such as “low”, “low to medium”, “medium”, “medium to high”, and “high” are used, covering a range from 0 to 100. Each category is represented by a triangular fuzzy set to indicate degrees of membership. For instance, “low” might correspond to replacement rates under 30.0, “medium” could represent rates between 30.0 and 60.0, and “high” might include rates above 60.0.

Once the membership functions are defined, a series of fuzzy rules is established to determine how the input variables influence the replacement rate. These rules reflect domain-specific knowledge and expert’s insights. For example, one rule might state, “If the annuity is high, the return rate is low, and the contribution rate is medium, then the replacement rate is medium to high”, producing consistent and meaningful output for a specific combination of input values.

Algorithm 3 Mamdani Fuzzy Inference System (FIS)

1:: Define Universe Variables and Membership Functions
2:: Define antecedent variables (e.g., $X_{1}$ , $X_{2}$ , $X_{3}$ ) and consequent variable (Y) with linguistic terms (e.g., low, medium, high)
3:: Define membership functions for each antecedent variable using triangular or Gaussian functions
4:: Define membership functions for the consequent variable with linguistic terms and corresponding fuzzy sets
5:: Define Fuzzy Rules
6:: Define a set of fuzzy rules linking antecedent variables to consequent variables:
7:: - Rule 1: If $X_{1}$ is low and $X_{2}$ is high, then Y is medium
8:: - Rule 2: If $X_{1}$ is medium and $X_{3}$ is low, then Y is high
9:: - …
10:: Create Fuzzy Control System
11:: Create a fuzzy control system using the defined rules
12:: Initialize the fuzzy control system simulation
13:: Apply Inputs and Compute Outputs
14:: for each input set in the dataset do
15:: Pass input values to the fuzzy control system simulation
16:: Compute the inferred output using the fuzzy inference mechanism
17:: Store computed fuzzy output in the dataset
18:: end for

Finally, a control system is built to facilitate the inference process. This system integrates all the defined rules and membership functions, creating a comprehensive framework for decision-making based on fuzzy logic principles. These principles allow for the simulation of the control system with specific input values. That is, when inputs are processed through the FIS, the system generates outputs designed to be straightforward and easy to understand.

3.6. Conceptual Integration Framework Through Analytic Hierarchy Process

To establish a unified and interpretable decision-support framework, this subsection formalizes the conceptual integration of the DL and Mamdani FIS results using the AHP [31]. The AHP provides a multi-criteria decision-making structure in which each modeling paradigm is evaluated under several criteria reflecting quantitative performance and qualitative interpretability.

3.6.1. Definition of Alternatives and Criteria

Let the set of modeling alternatives be:

A = {a_{1}, a_{2}, a_{3}} = {DL, FIS, Hybrid (DL + FIS)} .

(13)

and let the evaluation criteria be:

C = {c_{1}, c_{2}, c_{3}, c_{4}} = {Accuracy, Interpretability, Stability, Computational Cos t} .

(14)

Each alternative

a_{i}

(i = 1, 2, 3)

is evaluated under each criterion

c_{j}

(j = 1, 2, 3, 4)

by means of a normalized score

s_{i j} \in [0, 1]

, forming the performance matrix:

S = {[s_{i j}]}_{n \times m}, n = 3, m = 4 .

(15)

3.6.2. Criterion-Specific Score Computation

For each criterion

c_{j}

, a raw metric

M_{i}^{(j)}

is computed from the output of model

a_{i}

. To enable comparison across heterogeneous scales, the normalized score is defined as:

s_{i j} = \{\begin{matrix} \frac{M_{max}^{(j)} - M_{i}^{(j)}}{M_{max}^{(j)} - M_{min}^{(j)}}, & if larger M_{i}^{(j)} is undesirable, \\ \frac{M_{i}^{(j)} - M_{min}^{(j)}}{M_{max}^{(j)} - M_{min}^{(j)}}, & if larger M_{i}^{(j)} is desirable, \end{matrix}

(16)

where

M_{min}^{(j)} = min_{i} M_{i}^{(j)}, M_{max}^{(j)} = max_{i} M_{i}^{(j)} .

(17)

Specifically:

Accuracy ( $c_{1}$ ): quantified by the MAE, denoted $M_{i}^{(1)} = {MAE}_{i}$ ; its explicit formula is given in Appendix A. Since smaller values indicate better accuracy, the undesirable form in Equation (16) applies.
Interpretability ( $c_{2}$ ): represented by

$M_{i}^{(2)} = \frac{1}{3} [norm (R_{i}) + norm (H_{i}) + norm (S_{i}^{SHAP})],$

(18)

where $R_{i}$ denotes the number of fuzzy rules (FIS) or neurons per layer (DL), $H_{i}$ is the fuzzy entropy of the rule base, and $S_{i}^{SHAP}$ is a sparsity or feature-importance index from SHAP values.
Stability ( $c_{3}$ ): quantified by the variance $V_{i}$ of model predictions under small input perturbations; the corresponding measure is

$M_{i}^{(3)} = \frac{1}{V_{i}},$

(19)

with larger values denoting more stable behavior.
Computational Cost ( $c_{4}$ ): expressed by the total training time $T_{i}$ or parameter count $P_{i}$ ,

$M_{i}^{(4)} = T_{i} or P_{i},$

(20)

for which the undesirable form of Equation (16) applies.

3.6.3. AHP Weight Determination

In the AHP, pairwise comparisons between criteria define the positive reciprocal matrix:

P = {[p_{j k}]}_{m \times m}, p_{j k} = \frac{1}{p_{k j}}, p_{j j} = 1, p_{j k} > 0 .

(21)

The criterion weights

w = {(w_{1}, w_{2}, \dots, w_{m})}^{⊤}

are obtained as the normalized principal right eigenvector of

P

:

P w = λ_{max} w, \sum_{j = 1}^{m} w_{j} = 1 .

(22)

Consistency of judgments is assessed via the consistency ratio (CR),

CR = \frac{(λ_{max} - m) / (m - 1)}{RI},

(23)

where RI is the random index for a matrix of order m (given by Saaty [31]);

CR < 0.10

indicates satisfactory consistency.

3.6.4. Integration of FIS-Derived Fuzzy Weights

The FIS yields fuzzy weights

{\tilde{w}}_{j}

representing the subjective importance of each criterion

c_{j}

. Each fuzzy weight is characterized by its membership function

μ_{{\tilde{w}}_{j}} (x)

, with

x \in [0, 1]

(see Appendix B for more details). Defuzzification by the centroid (center-of-gravity) method produces a crisp equivalent:

{\hat{w}}_{j} = \frac{\int_{0}^{1} x μ_{{\tilde{w}}_{j}} (x) d x}{\int_{0}^{1} μ_{{\tilde{w}}_{j}} (x) d x} .

(24)

The resulting values are normalized so that:

\sum_{j = 1}^{m} {\hat{w}}_{j} = 1 .

(25)

These

{\hat{w}}_{j}

may directly replace the eigenvector weights in Equation (22) or serve to generate a consistent comparison matrix via:

p_{j k} = \frac{{\hat{w}}_{j}}{{\hat{w}}_{k}} .

(26)

3.6.5. Global Score Aggregation

Given the normalized performance matrix

S

and the criterion weights

w

(or

\hat{w}

), the overall score of each alternative

a_{i}

is:

S_{i} = \sum_{j = 1}^{m} w_{j} s_{i j}, i = 1, 2, 3,

(27)

and the preferred model is selected as

a^{*} = arg max_{a_{i} \in A} S_{i} .

(28)

3.6.6. Illustrative Example and Discussion

Table 5 reports a simplified example with hypothetical normalized scores and AHP-derived weights.

In this hypothetical case, the Hybrid model attains the highest global score (

S_{i} = 0.81

), representing the best compromise among the four evaluation criteria.

Equations (13)–(28) formalize how AHP can systematically integrate DL and FIS outputs into a coherent multi-criteria decision-support framework. The DL model contributes quantitative precision (via lower MAE), while the FIS provides qualitative interpretability and fuzzy-derived weighting of criteria. The AHP layer then aggregates these complementary perspectives into an analytically consistent decision metric (see Figure 2). Although implemented here conceptually, this formulation establishes a rigorous mathematical foundation for future hybrid systems that combine predictive performance and expert reasoning in pension planning.

4. Computational Experiments and Results

4.1. Computer Specifications and Model Implementation Framework

The experiments were conducted on a system featuring an AMD Ryzen 5 5500U processor with Radeon Graphics, operating at a base speed of 2.10 GHz. The computer was equipped with 8 GB of RAM, with about 5.83 GB available for use, and ran on the 64-bit version of Windows 11 Pro. Python served as the primary programming language for implementing the study’s methodologies.

The deep learning models were implemented using the Keras Sequential API, which allows a clear and structured definition of neural network architectures as sequences of layers. In this framework, the model parameters are optimized by minimizing a loss function, which in our case is the mean squared error. The choice of MSE was motivated by its stability during training and its sensitivity to large deviations, making it suitable for financial predictions where significant errors can have high impact. Complementary metrics, including MAE and MAPE, were used during evaluation to provide interpretable insights for decision-makers. These metrics, together with the formal definitions of network transformations and loss functions, are detailed in Section 3.2 of the methodology, as well as in Appendix A.

4.2. Data Preparation

We generated a synthetic dataset consisting of 500,000 entries, each capturing three key financial parameters: annuity, return rate, and contribution rate. The annuity values ranged from 14.00 to 18.00, with an average close to 16.00 and a standard deviation of 1.15, indicating a moderate degree of variation. The return rate spanned from 2.00 to 6.00, averaging around 4.00 with a comparable standard deviation. Contribution rates showed an average of approximately 8.00, fluctuating between 6.00 and 10.00, also with a standard deviation of 1.15.

The dataset was partitioned into training (80%) and test (20%) subsets, with the training subset further used in a five-fold cross-validation scheme for model selection and hyperparameter optimization.

4.3. Baseline Neural Network Approach Results

In our baseline neural network, we used the Keras Sequential API to build a straightforward layered architecture [32]. The network included an input layer, three hidden layers, and an output layer. The input layer matched the number of features in the dataset, while the hidden layers had 64, 32, and 16 neurons, respectively, with ReLU as the activation function (ReLU was chosen because it introduces non-linearity into the model, enabling it to effectively learn and represent complex patterns in the data).

To prevent overfitting, we added dropout layers after each hidden layer with a 20% dropout rate (this regularization technique temporarily deactivates a random subset of neurons during training, encouraging the model to rely on a broader set of features and improving its ability to generalize to unseen data). The output layer used a single neuron with a Softplus activation function, ensuring non-negative predictions, which are important for regression tasks.

The model was trained on the dataset over 100 epochs (each epoch represents one complete pass through the entire dataset) with a batch size of 32, allowing the data to be processed in manageable chunks before updating the model’s weights (this setup struck a balance between computational efficiency and stable training dynamics). To guide the optimization process, we used the MSE loss function to measure prediction accuracy and the Adam optimizer with a learning rate of 0.001 to fine-tune the weights efficiently (Adam optimizer adjusts the learning rate during training, ensuring efficient convergence while minimizing the loss function).

When tested on the evaluation dataset, the model achieved an MAE of approximately 1.086. This metric, which reflects the average magnitude of prediction errors, indicated that the model’s predictions were closely aligned with the actual values (a lower MAE suggests strong predictive accuracy, confirming the network’s capability in addressing this regression task).

To further evaluate the baseline model’s performance and identify areas for improvement, we generated a series of diagnostic visualizations. One key plot, shown in Figure 3, illustrates the loss and MAE trends across training epochs for both the training and validation datasets. This visualization is invaluable for understanding the model’s learning progression. Ideally, it demonstrates a steady decline in both loss and MAE, indicating that the model is improving without signs of overfitting.

The second visualization (Figure 4) presents a line plot of residuals, offering a detailed look at the model’s predictive errors. Residuals are the differences between the predicted values and the actual target values. In this plot, the residuals fluctuate, revealing that while the model performs well overall, there are instances where its predictions deviate significantly from the actual outcomes. These deviations suggest room for improvement, emphasizing the need for additional model refinement and optimization to minimize fluctuations and boost overall accuracy.

The third visualization (Figure 5) illustrates the distribution of residuals using a histogram, providing a clear overview of the prediction errors across the dataset. Most residuals are concentrated between −2 and 2, indicating that the model’s predictions are generally close to the actual values. However, there are some noticeable outliers, particularly at the distribution’s tails, where residuals range from −4 to −2 and from 4 to 6. These outliers highlight cases where the model’s predictions differ significantly from the true values, pointing to areas where adjustments to the model or input features could improve performance.

Together, these visualizations offer a comprehensive evaluation of the baseline neural network model’s performance. They highlight the model’s ability to learn from the data and produce accurate predictions while also pinpointing areas that could benefit from further fine-tuning. By analyzing these insights, we can make targeted improvements to the model, enhancing its effectiveness in the regression task.

4.4. Optimized Neural Network Approach

To improve the predictive accuracy of the neural network model, we systematically tested several configurations and carefully documented their results. The first setup, with a single hidden layer of 64 neurons, achieved a notably low MAE of 0.2006. Building on this, we explored a second configuration with two hidden layers (64 and 32 neurons), which resulted in a slightly higher MAE of 0.3472. Adding a third hidden layer (64, 32, and 16 neurons) increased the MAE to 0.6720, indicating that greater complexity did not necessarily improve performance.

When we expanded to four hidden layers (64, 32, 16, and 8 neurons), the MAE rose further to 1.2302. Finally, a configuration with five hidden layers (64, 32, 16, 8, and 4 neurons) recorded the highest MAE of 1.9828, suggesting diminishing returns with increased depth. Interestingly, the model with the lowest MAE (0.2006) also demonstrated strong overall performance across additional metrics, providing a balanced and effective solution. These included MSE (to measure the average squared differences between predicted and actual values, serving as an indicator of prediction variance), RMSE (to offer a more intuitive interpretation by aligning its scale with the target variable’s units),

R^{2}

(to assess how much of the variation in the target variable can be explained by the predictors, with values close to 1 reflecting a better fit),

R_{adj}^{2}

(to account for the number of predictors, providing a refined measure of model fit, particularly for complex models), and MAPE (to express prediction errors as a percentage of the actual values, offering a practical perspective on the relative magnitude of errors).

In all our experiments, we kept parameters like the number of epochs, batch size, activation functions, and optimization methods consistent with those used in the baseline neural network configuration. This allowed us to directly compare performance differences and attribute them to changes in the network’s architecture. With this approach, we could carefully assess the impact of model complexity on predictive accuracy, finding a balance between simplicity and complexity that worked best for our objectives.

The configuration that performed best consisted of a single hidden layer with 64 neurons (Figure 6), achieving an MAE of 0.2006. But the numbers tell only part of the story. The model’s performance was further highlighted by its impressively low error values, including an MSE of 0.0718, RMSE of 0.2680,

R^{2}

of 0.9995, and an

R_{adj}^{2}

of 0.9995. What stood out most, however, was its minimal MAPE of 0.0055, which underscores the model’s reliability in delivering accurate predictions. This not only confirms the model’s effectiveness but also makes it a robust solution for the regression task.

In contrast to the baseline model, the optimized neural network configurations have shown notable improvements in model performance metrics. The line plot (Figure 7 and Figure 8) depicting the convergence of model loss and MAE over epochs indicates enhanced stability and faster convergence during both training and validation phases.

Furthermore, the distribution plot of residuals (Figure 9) reveals a significant refinement. Most residuals are now concentrated within a narrower range, specifically between −1 and 1, indicating that the model predictions align closely with the actual values for most instances. There are some outliers visible on the right tail extending up to 3, but their reduced frequency compared to the baseline suggests better overall model accuracy and consistency.

In addition to this, the following 3D plots (Figure 10 and Figure 11) visualize the relationship between the variables annuity, return rate, and contribution rate, with the color of each data point representing the corresponding actual replacement rate (computed with exact formulas mentioned in previous sections) and replacement rate predicted by the optimized neural network model. Each axis of the plot represents a different variable: annuity, return rate, and contribution rate. The color bar on the right side of the plot indicates the scale of replacement rates, ranging from lower values (purple) to higher values (yellow).

The plot reveals how different combinations of annuity, return rate, and contribution rate contribute to actual (Figure 10) and predicting replacement rates (Figure 11). Data points are scattered across the plot, with colors varying based on the replacement rate values. The colormap helps distinguish between different levels of replacement rates, facilitating visual interpretation of the predictions.

The use of color coding plays a key role in enhancing the interpretability of the model’s performance across the dataset. In particular, we are visually distinguishing areas where the model accurately predicts replacement rates (shown by consistent color regions) from those where discrepancies or outliers exist (indicated by shifts in color). This approach provides an intuitive way to assess prediction quality. The inclusion of white as a marker for replacement rates around 30 further aids clarity, effectively highlighting specific ranges of both actual values and predictions.

In both visualizations, the data points are carefully color-coded to represent predicted replacement rates, with white explicitly indicating values close to 30. The presence of consistently white-colored data points suggests that the models perform exceptionally well when predicting replacement rates near this threshold. This uniformity reflects the accuracy of the models and indicates their ability to handle scenarios around this key value with minimal error. The similarity observed between the two visualizations reinforces the models’ effectiveness in generalizing across the dataset, capturing the relationships between input variables and replacement rates with remarkable consistency.

These results (see Table 6) underscore the reliability and precision of the neural network models in estimating replacement rates. By successfully learning the underlying patterns in the data and delivering stable predictions, the models demonstrate their capability to achieve high levels of accuracy in this regression task.

4.5. Mamdani Fuzzy Inference System

The Mamdani FIS designed for predicting replacement rates utilizes linguistic variables defined over specific intervals for its inputs and outputs. Here are the linguistic variables and their intervals:

Inputs:
−
Annuity: Low [13.99–15.0], Medium [14.0–17.0], High [17.0–18.01];
−
Return Rate: Low [1.99–3.0], Medium [2.0–5.0], High [5.0–6.01];
−
Contribution Rate: Low [5.99–7.0], Medium [6.0–9.0], High [9.0–10.01].
Output:
−
Replacement Rate: Low [0.0–30.0], Low to Medium [0.0–60.0], Medium [30.0–100.0], Medium to High [30.0–100.0], High [60.0–100.0].

The Mamdani FIS employs 27 fuzzy rules to map the input variables to the output replacement rate. These rules are defined as follows:

If annuity is low and return rate is low and contribution rate is low, then replacement rate is low.
If annuity is low and return rate is medium and contribution rate is low, then replacement rate is medium.
If annuity is low and return rate is high and contribution rate is low, then replacement rate is medium to high.
If annuity is low and return rate is low and contribution rate is medium, then replacement rate is low to medium.
If annuity is low and return rate is medium and contribution rate is medium, then replacement rate is medium to high.
If annuity is low and return rate is high and contribution rate is medium, then replacement rate is medium to high.
If annuity is low and return rate is low and contribution rate is high, then replacement rate is medium.
If annuity is low and return rate is medium and contribution rate is high, then replacement rate is medium to high.
If annuity is low and return rate is high and contribution rate is high, then replacement rate is medium to high.
If annuity is medium and return rate is low and contribution rate is low, then replacement rate is low.
If annuity is medium and return rate is medium and contribution rate is low, then replacement rate is low to medium.
If annuity is medium and return rate is high and contribution rate is low, then replacement rate is medium to high.
If annuity is medium and return rate is low and contribution rate is medium, then replacement rate is low to medium.
If annuity is medium and return rate is medium and contribution rate is medium, then replacement rate is low to medium.
If annuity is medium and return rate is high and contribution rate is medium, then replacement rate is medium to high.
If annuity is medium and return rate is low and contribution rate is high, then replacement rate is medium.
If annuity is medium and return rate is medium and contribution rate is high, then replacement rate is medium.
If annuity is medium and return rate is high and contribution rate is high, then replacement rate is medium to high.
If annuity is high and return rate is low and contribution rate is low, then replacement rate is low.
If annuity is high and return rate is medium and contribution rate is low, then replacement rate is low to medium.
If annuity is high and return rate is high and contribution rate is low, then replacement rate is medium.
If annuity is high and return rate is low and contribution rate is medium, then replacement rate is low to medium.
If annuity is high and return rate is medium and contribution rate is medium, then replacement rate is low to medium.
If annuity is high and return rate is high and contribution rate is medium, then replacement rate is medium.
If annuity is high and return rate is low and contribution rate is high, then replacement rate is medium.
If annuity is high and return rate is medium and contribution rate is high, then replacement rate is medium.
If annuity is high and return rate is high and contribution rate is high, then replacement rate is medium to high.

Figure 12, Figure 13, Figure 14 and Figure 15 depict the triangular membership functions for the input variables of interest (i.e., annuity, return rate and contribution rate), as well as the output variable of interest (i.e., replacement rate).

As an illustrative example, consider the following values:

Annuity: 16.746141;
Return rate: 3.67;
Contribution rate: 7.13.

Using the membership degrees, the Mamdani FIS applies the rules and aggregates them using triangular membership functions for the output replacement rate. In this case, it predicts a replacement rate of approximately 30.0, demonstrating how the Fuzzy Inference System effectively handles inputs and generates corresponding outputs based on fuzzy logic principles (Figure 16).

In order to mitigate computational demands, the Mamdani FIS was applied to a subset of 11,000 samples from the dataset. Similar to our previous analysis, 3D plots were generated to compare the behavior of the FIS-predicted replacement rate values against those predicted by the optimized neural network (Figure 17). Upon examination, it was observed that the neural network’s predicted values closely approximated the actual replacement rates (Figure 18). However, contrasting results were noted with the FIS predictions (Figure 19), revealing significant fluctuations. This suggests that further investigation is necessary to refine the rules and membership function intervals according to expert knowledge and preferences.

Investing more time and effort into refining these aspects is crucial for achieving a closer alignment between the FIS (i.e., the predictive accuracy and stability of the FIS) and the desired outcomes (i.e., estimated replacement rates). While the FIS method offers valuable advantages, especially in dealing with uncertain or imprecise data through fuzzy logic, it also comes with its own set of challenges. One key obstacle is the heavy reliance on expert knowledge to design rules and membership functions. If these rules are not clearly defined, the predictions may become unreliable, leading to inconsistent outcomes. This highlights the critical role experts play in optimizing FIS models and ensuring their effectiveness. Ultimately, developing precise rules and carefully tuning the parameters are essential steps for making the most of the FIS approach.

4.6. Discussion and Practical Considerations

4.6.1. Data Quality and Availability for Neural Networks

Implementing neural networks in real-world pension fund scenarios entails several challenges related to data quality and availability. Pension datasets are often incomplete or contain noisy entries due to reporting errors or heterogeneous data sources. Our synthetic experiments (see Table 6 and Figure 8, Figure 9, Figure 10 and Figure 11) demonstrate that the optimized neural network can achieve high predictive accuracy (MAE = 0.2006,

R^{2} = 0.9995

) under controlled conditions. In practice, preprocessing techniques such as imputation, outlier removal, and normalization, along with data augmentation strategies, can help mitigate the impact of missing or noisy data, ensuring robust model performance.

4.6.2. Dynamic Updating of Fuzzy Rules

The Mamdani Fuzzy Inference System relies on expert-defined rules and membership functions, which may need frequent updates in response to fluctuating macroeconomic conditions, such as changes in interest rates, inflation, or pension policy. While our experiments applied the FIS to a subset of 11,000 samples (Figure 17, Figure 18 and Figure 19), demonstrating reasonable performance, practical implementation would require a dynamic rule adaptation mechanism. Techniques such as adaptive fuzzy systems, rule-weight adjustment, or integration with reinforcement learning could allow the FIS to automatically adjust its rules over time, maintaining predictive accuracy in changing environments.

4.6.3. Interpretability of Neural Networks

One limitation often associated with deep learning models is their ‘black-box’ nature. In the context of pension advisory services, transparency and interpretability are critical for client trust and regulatory compliance. Our 3D visualizations (Figure 10 and Figure 11) help interpret the relationship between input parameters and replacement rates, but additional methods such as SHAP (SHapley Additive exPlanations) or LIME (Local Interpretable Model-agnostic Explanations) can further elucidate feature contributions [33,34]. These tools can provide insight into individual predictions, supporting explainable and accountable AI deployment in financial decision-making.

4.6.4. Cost-Effectiveness Compared to Traditional Methods

The development, deployment, and maintenance of AI-based systems involve initial investment and technical expertise. However, once implemented, neural networks and FIS models can process large-scale pension datasets with minimal human intervention, offering scalability and operational efficiency. For small and medium-sized pension funds, this automation may reduce reliance on highly specialized financial experts, streamline estimation of replacement rates, and enable rapid scenario analysis. Our results indicate that AI models can achieve precise and consistent predictions (e.g., neural network MAE = 0.2006), suggesting potential advantages in both accuracy and cost-effectiveness over traditional approaches while maintaining the ability to integrate expert knowledge where necessary.

4.6.5. Integration Through AHP and Hybrid Evaluation

Beyond the individual analysis of the DL and Mamdani FIS models, Section 3.6 introduced a conceptual framework based on the AHP to integrate both approaches within a unified decision-support structure. This framework enables a balanced evaluation of the models by considering multiple criteria such as predictive accuracy, interpretability, stability, and computational cost in a structured, transparent way.

Under this scheme, the DL and FIS models are treated as decision alternatives, while their respective performance metrics provide the criterion scores. The AHP weighting mechanism, optionally enhanced by fuzzy-derived importance factors from the FIS, allows decision-makers to compute a composite score reflecting the overall desirability of each modeling approach. Conceptually, this hybrid evaluation layer demonstrates how quantitative precision from DL can be systematically combined with the qualitative interpretability of FIS to support informed, multi-criteria pension planning decisions.

Although this integration is introduced conceptually rather than implemented empirically in the present study, it strengthens the methodological coherence between the two computational paradigms and outlines a concrete pathway for future work on hybrid AI-based decision-support systems in the pension domain.

5. Conclusions and Future Remarks

This research set out to find better ways to estimate replacement rates, which are key to planning for retirement. We focused on two main approaches: neural networks and the Mamdani FIS. The neural networks showed great promise, with low MAE and steady progress during training. On the other hand, the FIS method proved especially useful when dealing with the uncertainties that often arise in retirement planning. It became clear that the effectiveness of the FIS approach relies heavily on well-defined rules from experts and careful parameter adjustments.

Several limitations of the current study should be acknowledged. First, the analysis relied on synthetic data to enable controlled benchmarking, which limits the immediate applicability of the models to real-world pension systems. Second, the FIS approach, while interpretable, requires expert knowledge for rule definition, which may not always be readily available. Third, the neural network, although accurate, can be less interpretable and may require extensive hyperparameter tuning for optimal performance. Finally, the study considered a relatively small set of input variables, leaving room for expanding the feature space to capture additional financial or demographic factors.

This study also proposed a conceptual integration framework combining neural network and FIS outputs through the AHP, which enables systematic weighting of criteria such as prediction accuracy, interpretability, and stability in pension planning decisions. Future research should focus on the empirical implementation of this hybrid framework using real-world pension datasets, exploring larger and more diverse feature sets, and assessing its robustness in practical settings. Additional enhancements could include the incorporation of explainability tools for neural networks, such as SHAP or LIME, to improve transparency and trust, and further refinement of FIS rules to enhance interpretability and stability.

In addition, the AHP can provide valuable insights and enhance the robustness of the decision-making process. Specifically, AHP facilitates systematic comparisons of pension strategies by evaluating multiple criteria within a hierarchical framework (see Figure 20). For instance, when assessing and selecting pension strategies (i.e., alternatives), decision-makers can prioritize criteria such as prediction accuracy, risk management, and alignment with broader financial objectives. This approach enables a more effective evaluation of alternatives and supports informed decision-making tailored to individual requirements. Moreover, both individual and group decision-making processes can benefit from the AHP methodology, promoting consistency and agreement in decisions.

In conclusion, this study highlights the efficacy of neural networks in estimating replacement rates and underscores the potential of FIS models, provided that they undergo further refinement. Moreover, the integration of these approaches through the AHP offers a structured, multi-criteria decision-support framework that leverages the complementary strengths of both methods. Future advancements in retirement planning methodologies can help individuals and financial advisors make better-informed choices and achieve more reliable outcomes.

Author Contributions

Conceptualization, P.Z.L. and G.S.; methodology, P.Z.L. and G.S.; software, P.Z.L.; validation, P.Z.L. and G.S.; formal analysis, P.Z.L. and G.S.; investigation, P.Z.L. and G.S.; resources, P.Z.L. and G.S.; data curation, P.Z.L. and G.S.; writing—original draft preparation, P.Z.L. and G.S.; writing—review and editing, P.Z.L. and G.S.; visualization, P.Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable comments and constructive suggestions, which helped to improve the quality and clarity of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
AHP	Analytic Hierarchy Process
ANFIS	Adaptive/Artificial Neural Networks with Fuzzy Inference Systems
ANNs	Artificial Neural Networks
CONV	Convolutional Neural Networks (MACNN)
DL	Deep Learning
EHR	Electronic Health Records
FSQCA	Fuzzy Set Qualitative Comparative Analysis
FIS	Fuzzy Inference System
ILO	International Labour Organization
LIME	Local Interpretable Model-agnostic Explanations
LSTM	Long Short-Term Memory
MAE	Mean Absolute Error
MAPE	Mean Absolute Percentage Error
MSE	Mean Squared Error
NNs	Neural Networks
RMSE	Root Mean Squared Error
PAYG	Pay-as-you-go Pension Plan
$R^{2}$	R-squared
$R_{adj}^{2}$	Adjusted R-squared
RRR	Real Return Rate
SHAP	SHapley Additive exPlanations
SVM	Support Vector Machine
OECD	Organisation for Economic Co-operation and Development

Appendix A. Basics of Neural Networks

Neural networks are computational models inspired by the way our brains work [7]. They are designed to learn from data by identifying patterns and making predictions. At a high level, a neural network consists of several layers of interconnected “neurons” that process input data, adjusting their weights and biases as they learn.

In this appendix, we’ll break down the key components of a neural network, explain the mathematical concepts that govern them, and look at how we evaluate a network’s performance.

Appendix A.1. Basic Structure of a Neural Network

A typical neural network is made up of several layers:

Input Layer: This layer receives the raw input data.
Hidden Layers: These layers perform computations on the input, transforming it through neurons that apply weighted sums and activation functions.
Output Layer: The output layer gives us the network’s predictions. Depending on the task, this could be a class label for classification problems, or a continuous value for regression tasks.

It is worth noting that neural networks become “deep” when they have 2 or more hidden layers.

Appendix A.2. How Neurons Work

Each neuron in the network takes in inputs, processes them, and produces an output Figure A1. The process is mathematically represented as:

y = f (\sum_{i = 1}^{n} w_{i} x_{i} + b)

(A1)

where:

$x_{i}$ is the input;
$w_{i}$ are the weights;
b is the bias;
f is the activation function applied to the sum of the weighted inputs.

The goal of the neural network is to adjust the weights and biases so that it can accurately predict outputs when given new, unseen data.

Figure A1. Example of a neural network architecture with 2 hidden layers.

Appendix A.3. Loss Functions and Error Metrics

To train a neural network, we need a way to measure how “wrong” it is. This is where loss functions come in. These functions calculate the error between the predicted values and the actual values. We use different loss functions depending on the problem type (regression or classification).

Appendix A.3.1. Mean Absolute Error (MAE)

MAE tells us how far off the predictions are on average, without considering whether the predictions are too high or too low:

MAE = \frac{1}{n} \sum_{i = 1}^{n} | y_{i} - {\hat{y}}_{i} |

(A2)

where:

$y_{i}$ is the actual value;
${\hat{y}}_{i}$ is the predicted value;
n is the number of data points.

Appendix A.3.2. Mean Squared Error (MSE)

MSE is similar to MAE but squares the errors, which means it punishes larger errors more heavily:

MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}

(A3)

MSE is widely used because it helps minimize large errors, but can be overly sensitive to outliers.

Appendix A.3.3. Root Mean Squared Error (RMSE)

RMSE is just the square root of MSE. It brings the error back to the original units of the data, making it easier to interpret:

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}

(A4)

Appendix A.3.4. R-Squared ( $R^{2}$ )

R^{2}

tells us how well the model explains the variation in the data. A value of 1 means the model explains all the variance, and 0 means it explains none:

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(A5)

where:

$\bar{y}$ is the mean of the actual values.

R^{2}

values range from 0 to 1, with a higher value indicating a better fit.

Appendix A.3.5. Adjusted R-Squared ( $R_{adj}^{2}$ )

R_{adj}^{2}

takes into account the number of predictors in the model. It’s useful when comparing models with different numbers of features:

R_{adj}^{2} = 1 - (1 - R^{2}) \frac{n - 1}{n - p - 1}

(A6)

where:

n is the number of data points;
p is the number of features.

This version of

R^{2}

prevents us from overestimating the quality of a model when we have many predictors.

Appendix A.3.6. Mean Absolute Percentage Error (MAPE)

MAPE measures the average percentage error between predictions and actual values:

MAPE = \frac{100}{n} \sum_{i = 1}^{n} |\frac{y_{i} - {\hat{y}}_{i}}{y_{i}}|

(A7)

Appendix A.4. Training a Neural Network

Training a neural network involves “teaching” it to make accurate predictions by adjusting the weights and biases using a process called backpropagation. This process calculates how much each weight contributed to the error and updates the weights accordingly using gradient descent.

The basic update rule for weights is:

w = w - θ \frac{\partial L}{\partial w}

(A8)

where:

$θ$ is the learning rate, controlling how big the updates are;
$\frac{\partial L}{\partial w}$ is the gradient of the loss function with respect to the weight w.

Appendix A.5. Overfitting and Regularization

One common problem in neural network training is overfitting, where the model memorizes the training data but fails to generalize to new data. To prevent overfitting, we use techniques like regularization (e.g., L2 regularization) and dropout to penalize overly complex models.

Appendix A.6. Activation Functions

Activation functions are mathematical equations that determine the output of a neural network model given a set of inputs. They introduce non-linearity into the model, allowing it to learn from the errors and adapt its weights accordingly. Let’s discuss two widely used activation functions: ReLU and Softplus.

Appendix A.6.1. Rectified Linear Unit (ReLU)

RELU is one of the simplest and most commonly used activation functions. It is defined as:

f (x) = max (0, x)

(A9)

In simpler terms, if the input is positive, ReLU outputs the same value, and if the input is negative, it outputs zero. This function is simple yet powerful, enabling the model to learn faster and reduce the likelihood of vanishing gradients, which is an issue with other activation functions like the sigmoid or tanh.

Appendix A.6.2. Softplus Activation Function

The Softplus activation function is a smooth and differentiable approximation of the ReLU function. It is commonly used in scenarios where we want to ensure positive outputs while avoiding the abrupt cutoff of ReLU. The Softplus function is defined as:

f (x) = log (1 + e^{x})

(A10)

where: x is the input to the neuron.

The function transforms the input by applying the natural logarithm to the sum of 1 and the exponential of the input value, ensuring that the output is always positive.

Appendix A.7. The Adam Optimizer

The Adam optimizer is an optimization algorithm designed to compute adaptive learning rates for each parameter by using the first and second moments of the gradients.

The update rule for Adam involves two main components:

First moment (mean), which is the exponentially weighted moving average of the gradient.
Second moment (variance), which is the exponentially weighted moving average of the squared gradient.

The algorithm works by adjusting the learning rates based on these moments, allowing it to adapt to the structure of the data and optimize the learning process. The update rule for each parameter

η

at time step t is given by:

{\hat{m}}_{t} = \frac{m_{t}}{1 - β_{1}^{t}}, {\hat{v}}_{t} = \frac{v_{t}}{1 - β_{2}^{t}}

(A11)

η_{t + 1} = η_{t} - \frac{θ {\hat{m}}_{t}}{\sqrt{{\hat{v}}_{t}} + ϵ}

(A12)

where:

$m_{t}$ is the first moment estimate (mean);
$v_{t}$ is the second moment estimate (variance);
$β_{1}$ and $β_{2}$ are the decay rates for the moving averages;
$θ$ is the learning rate;
$ϵ$ is a small constant to avoid division by zero.

Appendix B. Basics Concepts of Fuzzy Logic

In this appendix, we’ll break down the key concepts of Fuzzy Logic and explain the mathematical terms that govern them [8].

Appendix B.1. Crisp and Fuzzy Sets

Crisp values are exact and deterministic, belonging to classical sets as we know them in traditional mathematics. In crisp sets, an element either fully belongs to the set (membership value = 1) or doesn’t belong to it at all (membership value = 0). Fuzzy sets, on the other hand, introduce some flexibility. They allow for degrees of membership, accommodating uncertainty or vagueness in how we define set membership. In a fuzzy set, an element can belong to the set to a certain degree, with membership values ranging between 0 and 1.

Appendix B.2. Fuzzy Membership Function

The fuzzy membership function is used to convert crisp input values into fuzzy values. Formally, we define a membership function for a fuzzy set A on the universe of discourse X as

μ_{A} : X \to [0, 1]

. This function maps each element in X to a value between 0 and 1, known as the degree of membership. Essentially, X is the universal set, and A is a fuzzy subset derived from X.

These functions are often visualized graphically, with the X-axis representing the input values (the universe of discourse) and the Y-axis showing the degree of membership. A common way to fuzzify inputs is by using the triangular membership function, which is defined by three parameters: a, b, and c. The values a and c represent the base of the triangle, while b represents its peak height (e.g., see Figure 12).

In simple terms, the triangular membership function works as follows: if

x = b

, the membership value is 1. If the input x is less than a or greater than c, the membership value is 0. If x falls between a and b, the membership value increases from 0 to 1 as x approaches b. Likewise, if x is between b and c, the membership value decreases from 1 to 0 as x moves closer to c.

Mathematically, the triangular membership function is expressed as:

μ (x; a, b, c) = \{\begin{matrix} 0, & if x \leq a or x \geq c \\ \frac{x - a}{b - a}, & if a \leq x \leq b \\ \frac{c - x}{c - b}, & if b \leq x \leq c \end{matrix}

(A13)

For example, let’s take a membership function

μ (x; 2, 5, 15)

and calculate the membership value for

x = 6.5

:

μ (x; 2, 5, 15) = max (min (\frac{x - a}{b - a}, \frac{c - x}{c - b}), 0)

(A14)

Substituting the values:

μ (6.5; 2, 5, 15) = max (min (\frac{6.5 - 2}{5 - 2}, \frac{15 - 6.5}{15 - 5}), 0) = \dots = max (0.85, 0) = 0.85

(A15)

Thus, the membership value of

x = 6.5

is 0.85.

Appendix B.3. Fuzzy Operators

Let

μ_{A}

and

μ_{B}

be the membership functions that define fuzzy sets A and B, respectively, in the universe of discourse X. To assess the disjunction (or union) of the rule inputs, we use an OR fuzzy operator, which is defined by the following relationship:

μ_{A \cup B} (X) = max (μ_{A} (X), μ_{B} (X))

(A16)

This means that the membership value of X in the union of sets A and B is simply the maximum membership value between the two sets.

Similarly, when we want to evaluate the conjunction (or intersection) of the rule inputs, we use an AND fuzzy operator, which represents the intersection of the fuzzy sets. It is mathematically defined as:

μ_{A \cap B} (X) = min (μ_{A} (X), μ_{B} (X))

(A17)

In this case, the membership value of X in the intersection of sets A and B is determined by the minimum membership value between the two.

Lastly, it’s important to note that the complement of a fuzzy set A can also be evaluated. The complement is simply the fuzzy set where each membership value is subtracted from 1, and it is defined as:

μ_{A^{c}} (X) = 1 - μ_{A} (X)

(A18)

This represents the degree to which an element does not belong to the fuzzy set A, effectively flipping its membership.

Appendix B.4. Fuzzy Rule Structure (IF-THEN Rules)

A fuzzy rule base is at the heart of a Fuzzy Inference System, consisting of a set of IF-THEN rules that define how the system maps input values to output values. These rules are essential because they guide the system’s behavior, and other components, like membership functions, are specifically designed to implement these rules in a way that is both realistic and efficient.

In the Mamdani FIS, the IF-THEN rules are used to translate fuzzy sets in the input universe (denoted as

U \subset R^{n}

) to fuzzy sets in the output universe (denoted as

V \subset R

) using fuzzy logic principles. The general structure of these rules is represented as follows:

R^{(ν)} = IF x_{1} is F_{1}^{ν} AND \dots AND x_{n} is F_{n}^{ν}, THEN y is G^{ν}

(A19)

Here,

F_{j}^{ν}

and

G^{ν}

(where

j = 1, \dots, n

) are fuzzy sets in the input universe

U_{j} \subset R

, representing linguistic variables that define the membership functions for each input. Additionally,

x = {[x_{1}, \dots, x_{n}]}^{T} \in U

represents the input vector, and

y \in V

is the output linguistic variable, both of which belong to their respective input and output universes. The parameter

ν

corresponds to the number of rules in the fuzzy rule base.

Appendix B.5. Defuzzification

Defuzzification is the process of converting a fuzzy output, which represents a range of possibilities, into a single crisp value. This value typically corresponds to the system’s final decision. Several methods can be used for defuzzification, including centroid, maximum membership, and weighted average.

In the centroid method, the crisp value is determined by finding the center of the fuzzy set’s area. Essentially, this is the point along the X-axis at which the fuzzy set would balance (e.g., see Figure 16). To compute the centroid, we use the following formula. Here,

μ (x_{i})

represents the membership value for each point

x_{i}

in the universe of discourse.

For a discrete universe of discourse, the centroid is calculated as:

x_{centroid} = \frac{\sum_{i} μ (x_{i}) x_{i}}{\sum_{i} μ (x_{i})}

(A20)

For a continuous universe of discourse, the centroid is calculated by integrating over the range of x, from $x_{l}$ to $x_{u}$ :

x_{centroid} = \frac{\int_{x_{l}}^{x_{u}} μ (x) x d x}{\int_{x_{l}}^{x_{u}} μ (x) d x}, x_{l} \leq x \leq x_{u}

(A21)

References

Economic Policy Committee (EPC). 2024 Ageing Report. Economic and Budgetary Projections for the EU Member States (2022–2070); Institutional Paper 279. 18 April 2024; Publications Office of the European Union: Luxembourg, 2024. [Google Scholar]
OECD. Pensions at a Glance 2023. OECD and G20 Indicators; OECD Publishing: Paris, France, 2023. [Google Scholar] [CrossRef]
Orenstein, M.A. Pension privatization in crisis: Death or rebirth of a global policy trend? Int. Soc. Secur. Rev. 2011, 64, 65–80. [Google Scholar] [CrossRef]
ILO. Reversing Pension Privatizations. Rebuilding Public Pension Systems in Eastern Europe and Latin America; Ortiz, I., Durán-Valverde, F., Urban, S., Wodsak, V., Eds.; Document and Publications Production, Printing and Distribution Branch (PRODOC) of the ILO: Geneva, Switzerland, 2018. [Google Scholar]
Barr, N.; Diamond, P. The economics of pensions. Oxf. Rev. Econ. Policy 2006, 22, 15–39. [Google Scholar] [CrossRef]
Zheng, H.; Wang, H.; Zhu, R.; Xue, J.-H. A brief review of deep learning methods in mortality forecasting. Ann. Actuar. Sci. 2025, 1–16. [Google Scholar] [CrossRef]
Aggarwal, C. Neural Networks and Deep Learning: A Textbook; Springer Nature: Cham, Switzerland, 2023. [Google Scholar]
Trillas, E.; Eciolaza, L. Fuzzy Logic: An Introductory Course for Engineering Students; Springer International Publishing: Cham, Switzerland, 2015. [Google Scholar]
Perla, F.; Richman, R.; Scognamiglio, S.; Wüthrich, M.V. Accurate and explainable mortality forecasting with the LocalGLMnet. Scand. Actuar. J. 2024, 2024, 739–761. [Google Scholar] [CrossRef]
Symeonidis, G.; Tinios, P.; Xenos, P. Enhancing pension adequacy while reducing the fiscal budget and creating essential capital for domestic investments and growth: Analyzing the risks and outcomes in the case of Greece. Risks 2021, 9, 8. [Google Scholar] [CrossRef]
Apiecionek, L. Fully scalable fuzzy neural network for data processing. Sensors 2024, 24, 5169. [Google Scholar] [CrossRef]
Boulhol, H.; Lüske, M. What’s new in the debate about pay-as-you-go versus funded pensions? In The Future of Pension Plans in the EU Internal Market. Financial and Monetary Policy Studies; da Costa Cabral, N., Rodrigues, N.C., Eds.; Springer: Cham, Switzerland, 2019; Volume 48. [Google Scholar] [CrossRef]
Bielawska, K.; Choń-Domińczak, A.; Stańko, D. Retreat from Mandatory Pension Funds in Countries of Eastern and Central Europe in Result of the Financial and Fiscal Crisis: Causes, Effects and Recommendations for Fiscal Rules; MPRA Paper No. 83345; University Library of Munich, Germany: Munich, Germany, 2017; Available online: https://mpra.ub.uni-muenchen.de/83345/1/ReportCEEreversals-2017_ISBN_book.pdf (accessed on 15 October 2024).
Morina, F.; Grima, S. The impact of pension fund assets on economic growth in transition countries, emerging economies, and developed countries. Quant. Financ. Econ. 2022, 6, 459–504. [Google Scholar] [CrossRef]
OECD. Pensions Funds in Figures; OECD Publishing: Paris, France, 2020. [Google Scholar]
Garido, J.; Shang, Y.; Xu, R. LSTM-based coherent mortality forecasting for developing countries. Risks 2024, 12, 27. [Google Scholar] [CrossRef]
Moreira, L.; dos Santos, I.L.; Gonzalez, P.H. Portfolio optimization for pension purposes: Literature review. J. Econ. Surv. 2025, 1–28. [Google Scholar] [CrossRef]
Liu, M.; Wei, X.; Xing, X.; Cheng, Y.; Ma, Z.; Ren, J.; Gao, X.; Xu, A. Predicting fracture risk for elderly osteoporosis patients by hybrid machine learning model. Digit. Health 2024, 10, 1–11. [Google Scholar] [CrossRef]
Simsek, G.; Karasoy, D. Applying machine learning techniques in survival analysis to the private pension system in Turkey. Commun. Stat.-Methods 2023, 53, 5706–5720. [Google Scholar]
Gabor, S.; Peter, V. Forecasting Hungarian mortality rates with recurrent neural networks. Statisztikai Szle. 2022, 100, 905–922. [Google Scholar]
Yan, S.; Zhou, Y.; Zhang, Y. Analysis of balance of income and expenditure and optimal retirement age of pension insurance co-ordination account based on improved machine learning algorithm. Comput. Intell. Neurosci. 2022, 2022, 5870893. [Google Scholar] [CrossRef] [PubMed]
Shi, S.; Wilke, R.A. Variable selection with group structure: Exiting employment at retirement age—A competing risks quantile regression analysis. Empir. Econ. 2022, 62, 119–155. [Google Scholar] [CrossRef]
Li, C.; Wang, Y.; Li, D.; Chu, D.; Ma, M. An effective method of evaluating pension service quality using multi-dimension attention convolutional neural networks. Int. J. Softw. Eng. Knowl. Eng. 2021, 31, 533–543. [Google Scholar] [CrossRef]
Cui, Y.; Zhang, L.; Hou, Y.; Tian, G. Design of intelligent home pension service platform based on machine learning and wireless sensor network. J. Intell. Fuzzy Syst. 2021, 40, 2529–2540. [Google Scholar] [CrossRef]
Brauning, M.; Hullermeier, E.; Keller, T.; Glaum, M. Lexicographic preferences for predictive modeling of human decision making: A new machine learning method with an application in accounting. Eur. J. Oper. Res. 2017, 258, 295–306. [Google Scholar] [CrossRef]
Rihoux, B.; Ragin, C.C. Configurational Comparative Methods: Qualitative Comparative Analysis (QCA) and Related Techniques; ALNAP; Sage: Thousand Oaks, CA, USA, 2009. [Google Scholar]
Yüks, S.; Eti, S.; Dinçer, H.; Meral, H.; Umar, M.; Gökalp, Y. A novel fuzzy decision-making approach to pension fund investments in renewable energy. Financ. Innov. 2025, 11, 18. [Google Scholar] [CrossRef]
Liu, X.; Zhang, S.; Wang, Z.; Zhang, S. Classification and identification of medical insurance fraud: A case-based reasoning approach. Technol. Econ. Dev. Econ. 2025, 31, 1345–1371. [Google Scholar] [CrossRef]
Asemi, A.; Asemi, A.; Ko, A. Adaptive neuro-fuzzy inference system for customizing investment type based on the potential investors’ demographics and feedback. J. Big Data 2023, 10, 87. [Google Scholar] [CrossRef]
de Andrés-Sánchez, J. A systematic review of the interactions of fuzzy set theory and option pricing. Expert Syst. Appl. 2023, 223, 119868. [Google Scholar] [CrossRef]
Saaty, R.W. The analytic hierarchy process—What it is and how it is used. Math. Model. 1987, 9, 161–176. [Google Scholar] [CrossRef]
James, G.; Witten, D.; Hastie, T.; Tibshirani, R.; Taylor, J. An Introduction to Statistical Learning with Applications in Python; Springer Nature: Cham, Switzerland, 2023. [Google Scholar]
Christou, I.; Soldatos, J.; Lappas, P. An empirical evaluation of explainable AI approaches. In Proceedings of the 21st International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT), Lucca, Italy, 9–11 June 2025; pp. 1–7. [Google Scholar] [CrossRef]
Saranya, A.; Subhashini, R. A systematic review of Explainable Artificial Intelligence models and applications: Recent developments and future trends. Decis. Anal. J. 2023, 7, 100230. [Google Scholar] [CrossRef]

Figure 1. Overview of the computational framework for replacement rate estimation using neural networks and fuzzy inference systems.

Figure 2. Integration of neural network and fuzzy inference system outputs through the AHP for pension replacement rate estimation.

Figure 3. Convergence of model loss and MAE (baseline neural network).

Figure 4. Line plot for comparing true (shown in red) and predicted values (shown in blue).

Figure 5. Residuals distribution (baseline neural network).

Figure 6. Artificial neural network topology.

Figure 7. Convergence of model loss and MAE (optimized neural network).

Figure 8. Line plot (improved) for comparing true (shown in red) and predicted values (shown in blue).

Figure 9. Residual distribution (optimized neural network).

Figure 10. Actual values of replacement rates.

Figure 11. Predicted values of replacement rates.

Figure 12. Triangular membership function (annuity).

Figure 13. Triangular membership function (return rate).

Figure 14. Triangular membership function (contribution rate).

Figure 15. Triangular membership function (replacement rate).

Figure 16. Estimated replacement rate (black line) together with the degree of membership (orange shaded area).

Figure 17. Actual values of replacement rates (sample of 11,000 rows).

Figure 18. Predicted values of replacement rates using optimized neural networks (sample of 11,000 rows).

Figure 19. Predicted values of replacement rates using the Mamdani FIS approach (sample of 11,000 rows).

Figure 20. An example of a hierarchy structure using AHP.

Table 1. Assumptions on calculating the replacement rates for a funded pension scheme.

Income Maturity	Total Contribution Rate	Expenses on Contributions	Total Working Life (Years)	Expected Life of the Annuity
0.50%	6.00%	0.50%	40	15.64

Source: calculations by authors.

Table 2. Main annuity values and respective replacement rates.

Annuity	Replacement Rate (%)
14	29.10
15	27.10
16	25.40
17	24.10
18	22.70

Source: calculations by authors.

Table 3. Main return rate values and respective replacement rates.

Return Rate (%)	Replacement Rate (%)
2	18.80
3	22.80
4	28.50
5	36.80
6	46.90

Source: calculations by authors.

Table 4. Main contribution rate values and respective replacement rates.

Contribution Rate (%)	Replacement Rate (%)
6	26.00
7	30.40
8	34.70
9	39.00
10	43.40

Source: calculations by authors.

Table 5. Illustrative AHP integration of DL, FIS, and Hybrid models.

Model ( $a_{i}$ )	$s_{i 1}$ (Accuracy)	$s_{i 2}$ (Interpretability)	$s_{i 3}$ (Stability)	$s_{i 4}$ (Cost)	$S_{i}$
DL	0.95	0.45	0.80	0.60	0.74
FIS	0.85	0.90	0.70	0.75	0.79
Hybrid	0.92	0.75	0.85	0.65	0.81
Weights $w_{j}$	0.40	0.30	0.20	0.10

Table 6. Performance metrics for neural network configurations.

Configuration (Hidden Layers and Neurons)	Metrics
1 Hidden Layer (64 neurons)
Mean Absolute Error (MAE)	0.2006
Mean Squared Error (MSE)	0.0718
Root Mean Squared Error (RMSE)	0.2680
R-squared ( $R^{2}$ )	0.9995
Adjusted R-squared ( $R_{adj}^{2}$ )	0.9995
Mean Absolute Percentage Error (MAPE)	0.0055
2 Hidden Layers (64, 32 neurons)
Mean Absolute Error (MAE)	0.3472
3 Hidden Layers (64, 32, 16 neurons)
Mean Absolute Error (MAE)	0.6720
4 Hidden Layers (64, 32, 16, 8 neurons)
Mean Absolute Error (MAE)	1.2302
5 Hidden Layers (64, 32, 16, 8, 4 neurons)
Mean Absolute Error (MAE)	1.9828

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Lappas, P.Z.; Symeonidis, G. Introducing AI in Pension Planning: A Comparative Study of Deep Learning and Mamdani Fuzzy Inference Systems for Estimating Replacement Rates. Mathematics 2025, 13, 3737. https://doi.org/10.3390/math13233737

AMA Style

Lappas PZ, Symeonidis G. Introducing AI in Pension Planning: A Comparative Study of Deep Learning and Mamdani Fuzzy Inference Systems for Estimating Replacement Rates. Mathematics. 2025; 13(23):3737. https://doi.org/10.3390/math13233737

Chicago/Turabian Style

Lappas, Pantelis Z., and Georgios Symeonidis. 2025. "Introducing AI in Pension Planning: A Comparative Study of Deep Learning and Mamdani Fuzzy Inference Systems for Estimating Replacement Rates" Mathematics 13, no. 23: 3737. https://doi.org/10.3390/math13233737

APA Style

Lappas, P. Z., & Symeonidis, G. (2025). Introducing AI in Pension Planning: A Comparative Study of Deep Learning and Mamdani Fuzzy Inference Systems for Estimating Replacement Rates. Mathematics, 13(23), 3737. https://doi.org/10.3390/math13233737

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Article metric data becomes available approximately 24 hours after publication online.

Article Menu

Introducing AI in Pension Planning: A Comparative Study of Deep Learning and Mamdani Fuzzy Inference Systems for Estimating Replacement Rates

Abstract

1. Introduction

2. Literature Review

3. Methodology

3.1. Overview of the Replacement Rate Concept

3.1.1. The Baseline Scenario Analysis

3.1.2. Sensitivity Analysis

3.2. Computational Methodology Overview

3.2.1. Mathematical Formalization of the Neural Network Pipeline

3.2.2. Mathematical Formalization of the Fuzzy Inference System Pipeline

3.3. Description of the Synthetic Datasets Used

3.4. Details of the Deep Learning Model

3.4.1. Baseline Neural Network Approach

3.4.2. Exhaustive Search Approach

3.5. Explanation of the Mamdani Fuzzy Inference System

3.6. Conceptual Integration Framework Through Analytic Hierarchy Process

3.6.1. Definition of Alternatives and Criteria

3.6.2. Criterion-Specific Score Computation

3.6.3. AHP Weight Determination

3.6.4. Integration of FIS-Derived Fuzzy Weights

3.6.5. Global Score Aggregation

3.6.6. Illustrative Example and Discussion

4. Computational Experiments and Results

4.1. Computer Specifications and Model Implementation Framework

4.2. Data Preparation

4.3. Baseline Neural Network Approach Results

4.4. Optimized Neural Network Approach

4.5. Mamdani Fuzzy Inference System

4.6. Discussion and Practical Considerations

4.6.1. Data Quality and Availability for Neural Networks

4.6.2. Dynamic Updating of Fuzzy Rules

4.6.3. Interpretability of Neural Networks

4.6.4. Cost-Effectiveness Compared to Traditional Methods

4.6.5. Integration Through AHP and Hybrid Evaluation

5. Conclusions and Future Remarks

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

Appendix A. Basics of Neural Networks

Appendix A.1. Basic Structure of a Neural Network

Appendix A.2. How Neurons Work

Appendix A.3. Loss Functions and Error Metrics

Appendix A.3.1. Mean Absolute Error (MAE)

Appendix A.3.2. Mean Squared Error (MSE)

Appendix A.3.3. Root Mean Squared Error (RMSE)

Appendix A.3.4. R-Squared ( R 2 )

Appendix A.3.5. Adjusted R-Squared ( R adj 2 )

Appendix A.3.6. Mean Absolute Percentage Error (MAPE)

Appendix A.4. Training a Neural Network

Appendix A.5. Overfitting and Regularization

Appendix A.6. Activation Functions

Appendix A.6.1. Rectified Linear Unit (ReLU)

Appendix A.6.2. Softplus Activation Function

Appendix A.7. The Adam Optimizer

Appendix B. Basics Concepts of Fuzzy Logic

Appendix B.1. Crisp and Fuzzy Sets

Appendix B.2. Fuzzy Membership Function

Appendix B.3. Fuzzy Operators

Appendix B.4. Fuzzy Rule Structure (IF-THEN Rules)

Appendix B.5. Defuzzification

References

Share and Cite

Article Metrics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Appendix A.3.4. R-Squared ( $R^{2}$ )

Appendix A.3.5. Adjusted R-Squared ( $R_{adj}^{2}$ )