Utilizing a Two-Stage Taguchi Method and Artificial Neural Network for the Precise Forecasting of Cardiovascular Disease Risk
Abstract
1. Introduction
2. Materials and Methods
2.1. Mathematical Background
2.1.1. Artificial Neural Network (ANN)
- (1) Input Layer: the first layer of the neural network, responsible for receiving raw data or features.
- (2) Hidden Layers: layers located between the input and output layers, responsible for further feature extraction and learning representations.
- (3) Output Layer: the final layer of the neural network, responsible for generating the ultimate output. Figure 1 shows an ANN comprising three layers of neurons.
- (4) Weights and Biases: the strength of the connections between neurons is represented by weights. Each neuron also has a bias that affects its activation state.
- (5) Activation Function: a function that transforms the weighted sum of a neuron's inputs into its output, introducing nonlinearity. Common activation functions include the logistic, tanh, and ReLU (Rectified Linear Unit) functions, explained as follows:
- (i) Logistic function (sigmoid function): maps inputs to the range (0, 1), so its output can be interpreted as a probability; it is commonly used in binary classification problems. However, when inputs are far from the origin (very large or very small), the gradient of the function becomes very small. This can lead to the vanishing gradient problem, making neural networks difficult to train effectively.
- (ii) Tanh function: maps inputs to the range (−1, 1). Compared to the logistic function, its output is symmetric around the origin. Although it still faces the vanishing gradient problem, the tanh function's outputs average closer to zero than the logistic function's, which helps mitigate the issue.
- (iii) ReLU function (Rectified Linear Unit): outputs the input value when it is positive and zero otherwise, i.e., ReLU(x) = max(0, x). It is a straightforward activation function that often leads to faster convergence during training. Unlike the sigmoid and tanh functions, ReLU does not saturate for positive inputs and thus largely avoids the vanishing gradient problem.
- (6) Loss Function: measures the disparity between the model's predicted output and the actual output and is a critical part of the training process. Common loss functions include Mean Squared Error and Cross-Entropy.
- (7) Optimizer: updates the weights and biases of the neural network based on the gradients of the loss function, playing a key role in training. Popular optimizers include lbfgs, sgd (stochastic gradient descent), and Adam.
- (i) lbfgs (limited-memory Broyden-Fletcher-Goldfarb-Shanno): a quasi-Newton method that approximates curvature information from a limited history of gradients, combining the simplicity of gradient descent with the fast convergence of Newton-type methods.
- (ii) sgd (stochastic gradient descent): a fundamental optimization algorithm for training artificial neural networks. Unlike traditional gradient descent, which computes gradients over the entire dataset, SGD updates the model's parameters using a single data point (or a small batch of data points) at a time. This stochastic nature introduces randomness into the optimization process.
- (iii) Adam (adaptive moment estimation): an adaptive learning rate optimization algorithm that combines the ideas of momentum and RMSprop. Adam dynamically adjusts the learning rates based on the magnitudes of past gradients, often providing faster convergence and better performance than fixed-learning-rate methods such as traditional gradient descent.
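The three activation functions described above can be written in a few lines of NumPy; this is a minimal illustrative sketch, not code from the study:

```python
import numpy as np

def logistic(x):
    """Sigmoid: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Hyperbolic tangent: maps input into (-1, 1), zero-centered."""
    return np.tanh(x)

def relu(x):
    """Rectified Linear Unit: ReLU(x) = max(0, x)."""
    return np.maximum(0.0, x)
```

Note how only ReLU keeps a constant gradient for positive inputs, which is why it is less prone to vanishing gradients than the two saturating functions.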
Forward propagation proceeds as follows:
- (i) Linear Combination: for the l-th neural layer, compute the linear combination output z^(l) = W^(l) a^(l−1) + b^(l). Here, W^(l) represents the weight matrix of the l-th layer, a^(l−1) is the activation output from the (l−1)-th layer, and b^(l) is the bias of the l-th layer.
- (ii) Activation Function: apply the activation function f to obtain the output for the l-th layer, a^(l) = f(z^(l)). Common activation functions include sigmoid, ReLU, tanh, and softmax.
- (iii) Repeat Steps (i) and (ii): iterate the linear combination and activation process until the output layer is reached.
- (iv) Activation Function for the Output Layer: apply an output activation appropriate to the task (e.g., sigmoid or softmax for classification).
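The forward-propagation steps above can be sketched in NumPy, assuming tanh hidden layers and a sigmoid output for binary classification (an illustrative sketch, not the study's code):

```python
import numpy as np

def forward(x, layers):
    """Forward propagation: z = W a + b, then a = f(z), layer by layer."""
    a = x
    for W, b in layers[:-1]:
        a = np.tanh(W @ a + b)          # hidden layer: linear combination + tanh
    W_out, b_out = layers[-1]
    z = W_out @ a + b_out               # output layer linear combination
    return 1.0 / (1.0 + np.exp(-z))     # sigmoid output: a probability in (0, 1)

# Tiny example: 3 inputs -> 4 hidden units -> 1 output.
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((4, 3)), np.zeros(4)),
          (rng.standard_normal((1, 4)), np.zeros(1))]
y_hat = forward(np.array([0.5, -1.0, 2.0]), layers)
```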
Training then proceeds by backpropagation and gradient descent:
- (i) Compute Loss: use the loss function to calculate the error between the predicted and actual values. Common choices are Mean Squared Error (MSE) for regression, L = (1/n) Σᵢ (yᵢ − ŷᵢ)², and Cross-Entropy for classification, L = −(1/n) Σᵢ [yᵢ log ŷᵢ + (1 − yᵢ) log(1 − ŷᵢ)] (binary form).
- (ii) Compute Gradients: calculate the gradients of the loss with respect to the weights W^(l) and biases b^(l), i.e., ∂L/∂W^(l) and ∂L/∂b^(l), typically through partial differentiation (the chain rule).
- (iii) Gradient Descent: update the weights and biases to minimize the loss function: W^(l) ← W^(l) − η ∂L/∂W^(l) and b^(l) ← b^(l) − η ∂L/∂b^(l), where η represents the learning rate.
- (iv) Repeat Steps (i) to (iii): iterate the process of computing loss, gradients, and updates until the network's parameters converge to optimal values.
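The loss-gradient-update cycle can be sketched for the simplest case, a single sigmoid output with binary cross-entropy loss; the toy data and the learning rate eta are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(W, b, X, y):
    """Binary cross-entropy loss."""
    p = sigmoid(X @ W + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def train_step(W, b, X, y, eta=0.3):
    """One gradient-descent update: W <- W - eta * dL/dW, b <- b - eta * dL/db."""
    y_hat = sigmoid(X @ W + b)      # forward pass
    grad_z = (y_hat - y) / len(y)   # dL/dz for sigmoid + cross-entropy
    grad_W = X.T @ grad_z           # dL/dW
    grad_b = grad_z.sum()           # dL/db
    return W - eta * grad_W, b - eta * grad_b

# Toy data: the loss should shrink over repeated updates.
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
y = np.array([1.0, 0.0, 1.0, 0.0])
W, b = np.zeros(2), 0.0
before = loss(W, b, X, y)
for _ in range(200):
    W, b = train_step(W, b, X, y)
after = loss(W, b, X, y)
```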
2.1.2. Taguchi Method
- (a) NTB (nominal-the-best) refers to a type of SN ratio in which a precise target value is established, and the greater the proximity of the quality characteristic value to the target value, the more desirable the outcome. The ultimate objective of the quality characteristic is to attain the target value, representing the optimal functionality. The formula used for NTB is SN_NTB = 10 log₁₀(ȳ² / s²), where ȳ is the mean and s² the variance of the n observations.
- (b) STB (smaller-the-better) is an additional SN ratio category that aims for lower values of the quality characteristic; the ideal value is zero. The formula used for STB is SN_STB = −10 log₁₀((1/n) Σᵢ yᵢ²).
- (c) LTB (larger-the-better) is another SN ratio type that prioritizes higher values of the quality characteristic; the ideal functionality is considered infinite. The formula used for LTB is SN_LTB = −10 log₁₀((1/n) Σᵢ 1/yᵢ²).
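The three SN ratio formulas can be sketched in NumPy (standard Taguchi definitions; the SN ratios reported with the accuracy tables later in the article are consistent with the larger-the-better form applied to accuracy):

```python
import numpy as np

def sn_ntb(y):
    """Nominal-the-best: SN = 10 log10(ybar^2 / s^2)."""
    y = np.asarray(y, dtype=float)
    return 10 * np.log10(y.mean() ** 2 / y.var(ddof=1))

def sn_stb(y):
    """Smaller-the-better: SN = -10 log10(mean(y_i^2))."""
    y = np.asarray(y, dtype=float)
    return -10 * np.log10(np.mean(y ** 2))

def sn_ltb(y):
    """Larger-the-better: SN = -10 log10(mean(1 / y_i^2))."""
    y = np.asarray(y, dtype=float)
    return -10 * np.log10(np.mean(1.0 / y ** 2))

# Three accuracy replicates of 0.4951 give an LTB SN ratio of about -6.11 dB,
# close to the -6.105 reported for experiment 1 of the first-stage array.
sn = sn_ltb([0.4951, 0.4951, 0.4951])
```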
2.1.3. Analysis of Variance (ANOVA)
2.2. Methodological Design
- (1) Problem definition: describing the source of the dataset and its relevant feature data.
- (2) Using the first-stage Taguchi method to find improved hyperparameter settings of the ANN model and trends in the level setting of each hyperparameter: this step uses the Taguchi method, specifically an L18(2^1 × 3^7) orthogonal array, to collect model accuracy data for various hyperparameter settings. In this study, an ANN was used as the predictive model for CVDs. Experimental and analytical techniques, including the orthogonal array, parameter response table, parameter response graph, and ANOVA, were then employed to identify better settings for the hyperparameter levels and the preferred trend for each hyperparameter affecting the average accuracy of CVD predictions. Finally, five confirmation experiments were conducted on the identified improved hyperparameter settings to ensure reproducibility.
- (3) Using the second-stage Taguchi method to find the best hyperparameter settings of the ANN model: in this step, a second orthogonal array, L9(3^4), was used to collect experimental data within the hyperparameter range that may contain the best solution. Through the analytical techniques mentioned earlier (parameter response table, parameter response graph, and ANOVA), the best settings of the hyperparameter levels were determined. After identifying the best hyperparameter settings, a set of five confirmation experiments was conducted to validate the efficacy of the proposed methodology.
- (4) Comparison: the best model accuracy obtained from the two-stage Taguchi optimization method was compared to the accuracy of relevant models reported in the literature to confirm the improvement achieved in this study.
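The hyperparameter names tuned in the orthogonal arrays (hidden layers, activation, optimizer, learning rate, momentum, hidden nodes) match scikit-learn's MLPClassifier, so a single experimental run can be sketched as follows; the synthetic data stands in for the CVD dataset and the exact training settings are assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in data with 11 features, mirroring the CVD dataset's feature count.
X, y = make_classification(n_samples=2000, n_features=11, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# One hypothetical array row: X1 = 4 hidden layers, X2 = tanh, X3 = sgd,
# X4 = learning rate 0.3, X5 = momentum 0.9, X6 = 8 nodes per hidden layer.
model = MLPClassifier(hidden_layer_sizes=(8,) * 4, activation="tanh",
                      solver="sgd", learning_rate_init=0.3, momentum=0.9,
                      max_iter=300, random_state=0)
model.fit(X_tr, y_tr)
accuracy = model.score(X_te, y_te)   # one replicate (N) for this setting
```

Running each array row several times yields the N1-N3 replicates from which the average accuracy and SN ratio per experiment are computed.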
3. Results
3.1. Dataset Introduction
- (a) “Age” is an integer variable measured in days. Analyzing the age distribution revealed that there were 8159 individuals (11.66%) below 16,000 days old, 10,027 individuals (14.32%) between 16,000 and 17,999 days old, 20,490 individuals (29.27%) between 18,000 and 19,999 days old, 20,011 individuals (28.59%) between 20,000 and 21,999 days old, and 11,313 individuals (16.16%) between 22,000 and 24,000 days old.
- (b) “Height” is an integer variable measured in centimeters. The height distribution showed that 1537 individuals (2.20%) were below 150 cm, 16,986 individuals (24.27%) were between 150 and 159 cm, 33,463 individuals (47.80%) were between 160 and 169 cm, 15,696 individuals (22.42%) were between 170 and 179 cm, 2213 individuals (3.16%) were between 180 and 189 cm, and 105 individuals (0.15%) were above 190 cm.
- (c) “Weight” is a float variable measured in kilograms. The weight distribution showed that 987 individuals (1.41%) weighed less than 50 kg, 7174 individuals (10.25%) weighed between 50 and 59 kg, 20,690 individuals (29.56%) weighed between 60 and 69 kg, 19,476 individuals (27.82%) weighed between 70 and 79 kg, 11,989 individuals (17.13%) weighed between 80 and 89 kg, 5831 individuals (8.33%) weighed between 90 and 99 kg, and 3853 individuals (5.50%) weighed over 100 kg.
- (d) “Gender” is a categorical variable, represented as follows: 1 for female and 2 for male. Out of 70,000 patients, 45,530 were female (approximately 65.04%) and 24,470 were male (approximately 34.96%).
- (e) “Systolic blood pressure” is an integer variable. Its distribution showed 13,038 individuals (18.63%) with systolic pressure below 120, 37,561 individuals (53.66%) between 120 and 139, 14,436 individuals (20.62%) between 140 and 159, 3901 individuals (5.57%) between 160 and 179, and 1064 individuals (1.52%) above 180.
- (f) “Diastolic blood pressure” is also an integer variable. Its distribution showed 14,116 individuals (20.17%) with diastolic pressure below 80, 35,450 individuals (50.64%) between 80 and 89, 14,612 individuals (20.87%) between 90 and 99, 4139 individuals (5.91%) between 100 and 109, and 1683 individuals (2.40%) above 110.
- (g) “Cholesterol” is a categorical variable, represented as follows: 1 for normal, 2 for above normal, and 3 for well above normal. Out of 70,000 patients, 52,385 (approximately 74.84%) had normal cholesterol levels, 9549 (approximately 13.64%) had above-normal levels, and 8066 (approximately 11.52%) had well-above-normal levels.
- (h) “Glucose” is a categorical variable, represented by 1 for normal, 2 for above normal, and 3 for well above normal. Out of 70,000 patients, 59,479 individuals (approximately 84.97%) had normal glucose levels, 5190 individuals (approximately 7.41%) had above-normal levels, and 5331 individuals (approximately 7.62%) had well-above-normal levels.
- (i) “Smoking” is a binary variable, with 0 indicating non-smokers and 1 indicating smokers. Of the 70,000 patients, 63,831 individuals (approximately 91.19%) were non-smokers and 6169 individuals (approximately 8.81%) were smokers.
- (j) “Alcohol intake” is a binary variable, with 0 indicating no alcohol consumption and 1 indicating alcohol consumption. Out of 70,000 patients, 66,236 individuals (approximately 94.62%) did not consume alcohol and 3764 individuals (approximately 5.38%) did.
- (k) “Physical activity” is a binary variable, with 0 indicating no physical activity and 1 indicating physical activity. Out of 70,000 patients, 13,739 individuals (approximately 19.63%) were inactive and 56,261 individuals (approximately 80.37%) were physically active.
- (l) The “Presence (or absence) of cardiovascular disease” is a binary variable, with 0 indicating the absence of cardiovascular disease and 1 indicating its presence. Out of 70,000 patients, 35,021 individuals (approximately 50.03%) did not have cardiovascular disease, while 34,979 individuals (approximately 49.97%) did.
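The binned breakdowns above can be reproduced with pandas; this sketch uses synthetic ages as a stand-in for the Kaggle file, with the age bin edges taken from item (a):

```python
import numpy as np
import pandas as pd

# Stand-in ages in days; the real analysis would load the Kaggle CSV instead.
rng = np.random.default_rng(0)
ages = pd.Series(rng.integers(14_000, 24_000, size=70_000), name="age")

bins = [0, 16_000, 18_000, 20_000, 22_000, 24_000]
labels = ["<16,000", "16,000-17,999", "18,000-19,999",
          "20,000-21,999", "22,000-24,000"]
counts = pd.cut(ages, bins=bins, labels=labels,
                right=False).value_counts().sort_index()
shares = (counts / len(ages) * 100).round(2)   # percentage of patients per bin
```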
3.2. Using the First-Stage Taguchi Method to Find Better Hyperparameter Settings
- (a) Central processing unit: 11th Generation Intel(R) Core(TM) i5-1135G7 @ 2.40 GHz;
- (b) Random access memory: 8.00 GB;
- (c) 64-bit operating system; and
- (d) Python version: 3.9.13.
3.3. Using the Second-Stage Taguchi Method to Find the Best Hyperparameter Settings
3.4. Comparative Study
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Gaziano, T.; Reddy, K.S.; Paccaud, F.; Horton, S.; Chaturvedi, V. Cardiovascular disease. In Disease Control Priorities in Developing Countries, 2nd ed.; Oxford University Press: New York, NY, USA, 2006.
- Patnode, C.D.; Redmond, N.; Iacocca, M.O.; Henninger, M. Behavioral Counseling Interventions to Promote a Healthy Diet and Physical Activity for Cardiovascular Disease Prevention in Adults without Known Cardiovascular Disease Risk Factors: Updated evidence report and systematic review for the US Preventive Services Task Force. JAMA 2022, 328, 375–388.
- Tektonidou, M.G. Cardiovascular disease risk in antiphospholipid syndrome: Thrombo-inflammation and atherothrombosis. J. Autoimmun. 2022, 128, 102813.
- World Health Organization. The World Health Report 2000: Health Systems: Improving Performance; World Health Organization: Geneva, Switzerland, 2000.
- Said, M.A.; van de Vegte, Y.J.; Zafar, M.M.; van der Ende, M.Y.; Raja, G.K.; Verweij, N.; van der Harst, P. Contributions of Interactions Between Lifestyle and Genetics on Coronary Artery Disease Risk. Curr. Cardiol. Rep. 2019, 21, 1–8.
- Arpaia, P.; Cataldo, A.; Criscuolo, S.; De Benedetto, E.; Masciullo, A.; Schiavoni, R. Assessment and Scientific Progresses in the Analysis of Olfactory Evoked Potentials. Bioengineering 2022, 9, 252.
- Rahman, H.; Bukht, T.F.N.; Imran, A.; Tariq, J.; Tu, S.; Alzahrani, A. A Deep Learning Approach for Liver and Tumor Segmentation in CT Images Using ResUNet. Bioengineering 2022, 9, 368.
- Centracchio, J.; Andreozzi, E.; Esposito, D.; Gargiulo, G.D. Respiratory-Induced Amplitude Modulation of Forcecardiography Signals. Bioengineering 2022, 9, 444.
- Yang, L.; Peavey, M.; Kaskar, K.; Chappell, N.; Zhu, L.; Devlin, D.; Valdes, C.; Schutt, A.; Woodard, T.; Zarutskie, P.; et al. Development of a dynamic machine learning algorithm to predict clinical pregnancy and live birth rate with embryo morphokinetics. F&S Rep. 2022, 3, 116–123.
- Olisah, C.C.; Smith, L.; Smith, M. Diabetes mellitus prediction and diagnosis from a data preprocessing and machine learning perspective. Comput. Methods Programs Biomed. 2022, 220, 106773.
- Arroyo, J.C.T.; Delima, A.J.P. An Optimized Neural Network Using Genetic Algorithm for Cardiovascular Disease Prediction. J. Adv. Inf. Technol. 2022, 13, 95–99.
- Kim, M.-J. Building a Cardiovascular Disease Prediction Model for Smartwatch Users Using Machine Learning: Based on the Korea National Health and Nutrition Examination Survey. Biosensors 2021, 11, 228.
- Khan, A.; Qureshi, M.; Daniyal, M.; Tawiah, K. A Novel Study on Machine Learning Algorithm-Based Cardiovascular Disease Prediction. Health Soc. Care Community 2023, 2023, 1406060.
- Moon, J.; Posada-Quintero, H.F.; Chon, K.H. A literature embedding model for cardiovascular disease prediction using risk factors, symptoms, and genotype information. Expert Syst. Appl. 2023, 213, 118930.
- Rosenblatt, F. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms; Spartan Books: New York, NY, USA, 1962.
- Olden, J.D.; Jackson, D.A. Illuminating the “black box”: A randomization approach for understanding variable contributions in artificial neural networks. Ecol. Model. 2002, 154, 135–150.
- Stern, H.S. Neural Networks in Applied Statistics. Technometrics 1996, 38, 205–214.
- McClelland, J.L.; Rumelhart, D.E. Explorations in Parallel Distributed Processing: A Handbook of Models, Programs, and Exercises, 1st ed.; MIT Press: London, UK, 1989.
- Fausett, L. Fundamentals of Neural Networks: Architectures, Algorithms, and Applications; Prentice Hall: Englewood Cliffs, NJ, USA, 1994.
- Hagan, M.T.; Demuth, H.B.; Beale, M. Neural Network Design; PWS Publishing: Boston, MA, USA, 1995.
- Malhotra, P.; Gupta, S.; Koundal, D.; Zaguia, A.; Enbeyle, W. Deep Neural Networks for Medical Image Segmentation. J. Healthc. Eng. 2022, 2022, 9580991.
- Pantic, I.; Paunovic, J.; Cumic, J.; Valjarevic, S.; Petroianu, G.A.; Corridon, P.R. Artificial neural networks in contemporary toxicology research. Chem. Biol. Interact. 2022, 369, 110269.
- He, B.; Hu, W.; Zhang, K.; Yuan, S.; Han, X.; Su, C.; Zhao, J.; Wang, G.; Wang, G.; Zhang, L. Image segmentation algorithm of lung cancer based on neural network model. Expert Syst. 2022, 39, e12822.
- Poradzka, A.A.; Czupryniak, L. The use of the artificial neural network for three-month prognosis in diabetic foot syndrome. J. Diabetes Complicat. 2023, 37, 108392.
- Krasteva, V.; Christov, I.; Naydenov, S.; Stoyanov, T.; Jekova, I. Application of Dense Neural Networks for Detection of Atrial Fibrillation and Ranking of Augmented ECG Feature Set. Sensors 2021, 21, 6848.
- Su, C.T. Quality Engineering: Off-Line Methods and Applications, 1st ed.; CRC Press: Boca Raton, FL, USA, 2013.
- Parr, W.C. Introduction to Quality Engineering: Designing Quality into Products and Processes; Asian Productivity Organization: Tokyo, Japan, 1986.
- Taguchi, G.; Elsayed, E.A.; Hsiang, T.C. Quality Engineering in Production Systems; McGraw-Hill: New York, NY, USA, 1989.
- Ross, P.J. Taguchi Techniques for Quality Engineering: Loss Function, Orthogonal Experiments, Parameter and Tolerance Design, 2nd ed.; McGraw-Hill: New York, NY, USA, 1996.
- Kaziz, S.; Ben Romdhane, I.; Echouchene, F.; Gazzah, M.H. Numerical simulation and optimization of AC electrothermal microfluidic biosensor for COVID-19 detection through Taguchi method and artificial network. Eur. Phys. J. Plus 2023, 138, 96.
- Tseng, H.-C.; Lin, H.-C.; Tsai, Y.-C.; Lin, C.-H.; Changlai, S.-P.; Lee, Y.-C.; Chen, C.-Y. Applying Taguchi Methodology to Optimize the Brain Image Quality of 128-Sliced CT: A Feasibility Study. Appl. Sci. 2022, 12, 4378.
- Safaei, M.; Moradpoor, H.; Mobarakeh, M.S.; Fallahnia, N. Optimization of Antibacterial, Structures, and Thermal Properties of Alginate-ZrO2 Bionanocomposite by the Taguchi Method. J. Nanotechnol. 2022, 2022, 7406168.
- Lagzian, M.; Razavi, S.E.; Goharimanesh, M. Investigation on tumor cells growth by Taguchi method. Biomed. Signal Process. Control 2022, 77, 103734.
- Ståhle, L.; Wold, S. Analysis of variance (ANOVA). Chemom. Intell. Lab. Syst. 1989, 6, 259–272.
- Blanco-Topping, R. The impact of Maryland all-payer model on patient satisfaction of care: A one-way analysis of variance (ANOVA). Int. J. Healthc. Manag. 2021, 14, 1397–1404.
- Mahesh, G.G.; Kandasamy, J. Experimental investigations on the drilling parameters to minimize delamination and taperness of hybrid GFRP/Al2O3 composites by using ANOVA approach. World J. Eng. 2021, 20, 376–386.
- Adesegun, O.A.; Binuyo, T.; Adeyemi, O.; Ehioghae, O.; Rabor, D.F.; Amusan, O.; Akinboboye, O.; Duke, O.F.; Olafimihan, A.G.; Ajose, O.; et al. The COVID-19 Crisis in Sub-Saharan Africa: Knowledge, Attitudes, and Practices of the Nigerian Public. Am. J. Trop. Med. Hyg. 2020, 103, 1997–2004.
- Ulianova, S. Cardiovascular Disease Dataset. Kaggle, 2019. Available online: https://www.kaggle.com/datasets/sulianova/cardiovascular-disease-dataset (accessed on 17 February 2023).
No | A | B | C | D |
---|---|---|---|---|
1 | 1 | 1 | 1 | 1 |
2 | 1 | 2 | 2 | 2 |
3 | 1 | 3 | 3 | 3 |
4 | 2 | 1 | 2 | 3 |
5 | 2 | 2 | 3 | 1 |
6 | 2 | 3 | 1 | 2 |
7 | 3 | 1 | 3 | 2 |
8 | 3 | 2 | 1 | 3 |
9 | 3 | 3 | 2 | 1 |
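Given an orthogonal array like the L9(3^4) above and one response value per run, the parameter response table is built by averaging the response over the runs at each factor level; the accuracies below are illustrative placeholders:

```python
import numpy as np

# The L9(3^4) orthogonal array (rows = runs, columns = factors A-D).
L9 = np.array([[1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
               [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
               [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1]])
response = np.array([0.74, 0.73, 0.72, 0.74, 0.73,
                     0.74, 0.73, 0.74, 0.72])  # illustrative accuracies

table = {}
for j, name in enumerate("ABCD"):
    # Average response over the three runs at each level of this factor.
    means = [response[L9[:, j] == level].mean() for level in (1, 2, 3)]
    table[name] = {"level_means": means,
                   "effect": max(means) - min(means)}  # range of level means
```

Ranking factors by "effect" reproduces the Rank row of the parameter response tables, and the level with the best mean suggests that factor's preferred setting.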
A (Independent Variable) | B (Independent Variable) | Y (Dependent Variable) |
---|---|---|
1 | 1 | A1B1 |
1 | 2 | A1B2 |
1 | 3 | A1B3 |
2 | 1 | A2B1 |
2 | 2 | A2B2 |
2 | 3 | A2B3 |
No | Feature | Data Type |
---|---|---|
1 | Age | int (days) |
2 | Height | int (cm) |
3 | Weight | float (kg) |
4 | Gender | categorical code |
5 | Systolic blood pressure | int |
6 | Diastolic blood pressure | int |
7 | Cholesterol | 1: normal, 2: above normal, 3: well above normal |
8 | Glucose | 1: normal, 2: above normal, 3: well above normal |
9 | Smoking | binary |
10 | Alcohol intake | binary |
11 | Physical activity | binary |
12 | Presence (or absence) of cardiovascular disease | binary |
Hyperparameter | Hidden Layers (X1) | Activation Function (X2) | Optimizer (X3) | Learning Rate (X4) | Moment Rate (X5) | Hidden Nodes (X6) |
---|---|---|---|---|---|---|
Level 1 | 4 | logistic | lbfgs | 0.2 | 0.7 | 4 |
Level 2 | 8 | tanh | sgd | 0.3 | 0.8 | 8 |
Level 3 | 12 | relu | adam | 0.4 | 0.9 | 12 |
Exp. | X1: Hidden Layers | X2: Activation | X3: Optimizer | X4: Learning Rate | X5: Moment Rate | X6: Hidden Nodes | Accuracy (N1) | Accuracy (N2) | Accuracy (N3) | Average Accuracy | Standard Deviation | SN Ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 4 | logistic | lbfgs | 0.20 | 0.70 | 4 | 0.4951 | 0.4951 | 0.4951 | 0.4951 | 0.0000 | −6.105 |
2 | 4 | tanh | sgd | 0.30 | 0.80 | 8 | 0.7389 | 0.7382 | 0.7262 | 0.7345 | 0.0071 | −2.682 |
3 | 4 | relu | adam | 0.40 | 0.90 | 12 | 0.4951 | 0.5049 | 0.4951 | 0.4984 | 0.0056 | −6.050 |
4 | 8 | logistic | lbfgs | 0.30 | 0.80 | 12 | 0.4951 | 0.4951 | 0.4951 | 0.4951 | 0.0000 | −6.105 |
5 | 8 | tanh | sgd | 0.40 | 0.90 | 4 | 0.7404 | 0.7405 | 0.7331 | 0.7380 | 0.0043 | −2.639 |
6 | 8 | relu | adam | 0.20 | 0.70 | 8 | 0.4951 | 0.4951 | 0.4951 | 0.4951 | 0.0000 | −6.105 |
7 | 12 | logistic | sgd | 0.20 | 0.90 | 8 | 0.4951 | 0.5049 | 0.5049 | 0.5016 | 0.0056 | −5.994 |
8 | 12 | tanh | adam | 0.30 | 0.70 | 12 | 0.4951 | 0.4951 | 0.5136 | 0.5013 | 0.0106 | −6.002 |
9 | 12 | relu | lbfgs | 0.40 | 0.80 | 4 | 0.7341 | 0.7414 | 0.4951 | 0.6569 | 0.1401 | −4.124 |
10 | 4 | logistic | adam | 0.40 | 0.80 | 8 | 0.4951 | 0.4951 | 0.4951 | 0.4951 | 0.0000 | −6.105 |
11 | 4 | tanh | lbfgs | 0.20 | 0.90 | 12 | 0.7361 | 0.7386 | 0.7372 | 0.7373 | 0.0013 | −2.647 |
12 | 4 | relu | sgd | 0.30 | 0.70 | 4 | 0.7389 | 0.7381 | 0.7404 | 0.7391 | 0.0012 | −2.626 |
13 | 8 | logistic | sgd | 0.40 | 0.70 | 12 | 0.5049 | 0.4951 | 0.4951 | 0.4984 | 0.0056 | −6.050 |
14 | 8 | tanh | adam | 0.20 | 0.80 | 4 | 0.4951 | 0.5049 | 0.4951 | 0.4984 | 0.0056 | −6.050 |
15 | 8 | relu | lbfgs | 0.30 | 0.90 | 8 | 0.7416 | 0.7412 | 0.7404 | 0.7411 | 0.0007 | −2.603 |
16 | 12 | logistic | adam | 0.30 | 0.90 | 4 | 0.4951 | 0.4951 | 0.4951 | 0.4951 | 0.0000 | −6.105 |
17 | 12 | tanh | lbfgs | 0.40 | 0.70 | 8 | 0.7441 | 0.7418 | 0.7423 | 0.7427 | 0.0012 | −2.584 |
18 | 12 | relu | sgd | 0.20 | 0.80 | 12 | 0.7384 | 0.7240 | 0.7431 | 0.7352 | 0.0100 | −2.674 |
Overall average | | | | | | | | | | 0.5999 | 0.0110 | −4.625 |
Level | X1 | X2 | X3 | X4 | X5 | X6 |
---|---|---|---|---|---|---|
1 | 0.6166 | 0.4968 | 0.6447 | 0.5771 | 0.5786 | 0.6038 |
2 | 0.5777 | 0.6587 | 0.6578 | 0.6177 | 0.6025 | 0.6184 |
3 | 0.6055 | 0.6443 | 0.4972 | 0.6049 | 0.6186 | 0.5776 |
Effect | 0.0389 | 0.1619 | 0.1605 | 0.0406 | 0.0400 | 0.0408 |
Rank | 6 | 1 | 2 | 4 | 5 | 3 |
Source | DF | SS | MS | F | p |
---|---|---|---|---|---|
X1 | 2 * | 0.004817 * | |||
X2 | 2 | 0.096382 | 0.048191 | 12.35 | 0.001 |
X3 | 2 | 0.095378 | 0.047689 | 12.23 | 0.001 |
X4 | 2 * | 0.005165 * | |||
X5 | 2 * | 0.00485 * | |||
X6 | 2 * | 0.005116 * | |||
Error | (13) | (0.050712) | 0.003901 | ||
Sum | 17 | 0.242471 | |||
R2 | R2 (adj.) | ||||
79.09% | 72.65% |

* Factors marked with an asterisk were pooled into the error term (here and in the subsequent ANOVA tables).
Level | X1 | X2 | X3 | X4 | X5 | X6 |
---|---|---|---|---|---|---|
1 | −4.3692 | −6.0775 | −4.0281 | −4.9292 | −4.9120 | −4.6083 |
2 | −4.9254 | −3.7673 | −3.7773 | −4.3538 | −4.6234 | −4.3454 |
3 | −4.5805 | −4.0303 | −6.0697 | −4.5920 | −4.3397 | −4.9214 |
Effect | 0.5562 | 2.3102 | 2.2924 | 0.5754 | 0.5723 | 0.5760 |
Rank | 6 | 1 | 2 | 4 | 5 | 3 |
Source | DF | SS | MS | F | p |
---|---|---|---|---|---|
X1 | 2 * | 0.946 * | |||
X2 | 2 | 19.1948 | 9.5974 | 11.78 | 0.001 |
X3 | 2 | 18.9717 | 9.4859 | 11.64 | 0.001 |
X4 | 2 * | 1.0031 * | |||
X5 | 2 * | 0.9827 * | |||
X6 | 2 * | 0.9979 * | |||
Error | (13) | (10.5947) | 0.815 | ||
Sum | 17 | 48.7612 | |||
R2 | R2 (adj.) | ||||
78.27% | 71.59% |
Method | N1 | N2 | N3 | N4 | N5 | Average Accuracy | Standard Deviation | SN Ratio |
---|---|---|---|---|---|---|---|---|
L18(2^1 × 3^7) | 0.7398 | 0.7392 | 0.7394 | 0.7369 | 0.7361 | 0.7383 | 0.0017 | −2.636 |
Hyperparameter | Learning Rate (X4) | Moment Rate (X5) | Hidden Nodes (X6) |
---|---|---|---|
Level 1 | 0.25 | 0.85 | 6 |
Level 2 | 0.3 | 0.9 | 8 |
Level 3 | 0.35 | 0.95 | 10 |
Exp. | X4: Learning Rate | X5: Moment Rate | X6: Hidden Nodes | Accuracy (N1) | Accuracy (N2) | Accuracy (N3) | Average Accuracy | Standard Deviation | SN Ratio |
---|---|---|---|---|---|---|---|---|---|
1 | 0.25 | 0.85 | 6 | 0.7390 | 0.7379 | 0.7417 | 0.7395 | 0.0020 | −2.621 |
2 | 0.25 | 0.90 | 8 | 0.7392 | 0.7381 | 0.7376 | 0.7383 | 0.0008 | −2.635 |
3 | 0.25 | 0.95 | 10 | 0.7348 | 0.7414 | 0.7398 | 0.7387 | 0.0035 | −2.631 |
4 | 0.30 | 0.85 | 8 | 0.7348 | 0.7411 | 0.7399 | 0.7386 | 0.0033 | −2.632 |
5 | 0.30 | 0.90 | 10 | 0.7399 | 0.7341 | 0.7400 | 0.7380 | 0.0034 | −2.639 |
6 | 0.30 | 0.95 | 6 | 0.7357 | 0.7360 | 0.7395 | 0.7371 | 0.0021 | −2.650 |
7 | 0.35 | 0.85 | 10 | 0.7418 | 0.7349 | 0.7388 | 0.7385 | 0.0035 | −2.633 |
8 | 0.35 | 0.90 | 6 | 0.7376 | 0.7301 | 0.7374 | 0.7350 | 0.0043 | −2.675 |
9 | 0.35 | 0.95 | 8 | 0.7420 | 0.7374 | 0.7291 | 0.7361 | 0.0065 | −2.661 |
Overall average | | | | | | | 0.7378 | 0.0033 | −2.642 |
Level | X4 | X5 | X6 |
---|---|---|---|
1 | 0.7388 | 0.7389 | 0.7372 |
2 | 0.7379 | 0.7371 | 0.7377 |
3 | 0.7365 | 0.7373 | 0.7384 |
Effect | 0.0023 | 0.0017 | 0.0012 |
Rank | 1 | 2 | 3 |
Source | DF | SS | MS | F | p-Value |
---|---|---|---|---|---|
X4 | 2 | 0.000008 | 0.000004 | 5.94 | 0.06 |
X5 | 2 | 0.000006 | 0.000003 | 4.12 | 0.10 |
X6 | 2 * | 0.000002 * | |||
Error | (4) | (0.000003) | 0.000001 | ||
Sum | 8 | 0.000016 | |||
R2 | R2 (adj.) | ||||
83.41% | 66.83% |
Level | X4 | X5 | X6 |
---|---|---|---|
1 | −2.629 | −2.629 | −2.648 |
2 | −2.640 | −2.650 | −2.643 |
3 | −2.656 | −2.648 | −2.634 |
Effect | 0.027 | 0.021 | 0.014 |
Rank | 1 | 2 | 3 |
Source | DF | SS | MS | F | p-Value |
---|---|---|---|---|---|
X4 | 2 | 0.001133 | 0.000566 | 6.07 | 0.06 |
X5 | 2 | 0.000773 | 0.000386 | 4.14 | 0.10 |
X6 | 2 * | 0.000299 * | |||
Error | (4) | (0.000373) | 0.000093 | ||
Sum | 8 | 0.002279 | |||
R2 | R2 (adj.) | ||||
83.62% | 67.23% |
Method | N1 | N2 | N3 | N4 | N5 | Average Accuracy | Standard Deviation | SN Ratio |
---|---|---|---|---|---|---|---|---|
L9(3^4) | 0.7401 | 0.7410 | 0.7439 | 0.7415 | 0.7407 | 0.7414 | 0.0015 | −2.598 |
Comparison | X1: Hidden Layers | X2: Activation | X3: Optimizer | X4: Learning Rate | X5: Moment Rate | X6: Hidden Nodes | Average Accuracy | Standard Deviation | SN Ratio |
---|---|---|---|---|---|---|---|---|---|
L18(2^1 × 3^7) | 4 | tanh | sgd | 0.3 | 0.9 | 8 | 0.7383 | 0.0017 | −2.636 |
L9(3^4) | 4 | tanh | sgd | 0.25 | 0.85 | 10 | 0.7414 | 0.0015 | −2.598 |
Improvement | | | | | | | 0.0032 | −0.0002 | 0.0374 |
Method | Accuracy |
---|---|
ANN | 0.6835 |
Logistic regression | 0.7235 |
Decision tree | 0.6172 |
Random forest | 0.6894 |
Support vector machine | 0.7216 |
K-Nearest Neighbor | 0.6834 |
GA-ANN | 0.7343 |
First-stage Taguchi method | 0.7383 |
Proposed TSTO | 0.7414 |
Lin, C.-M.; Lin, Y.-S. Utilizing a Two-Stage Taguchi Method and Artificial Neural Network for the Precise Forecasting of Cardiovascular Disease Risk. Bioengineering 2023, 10, 1286. https://doi.org/10.3390/bioengineering10111286