# CBRL and CBRC: Novel Algorithms for Improving Missing Value Imputation Accuracy Based on Bayesian Ridge Regression


## Abstract

The proposed algorithms were compared with state-of-the-art imputation methods using root mean square error (RMSE), the coefficient of determination (R^{2} score), and mean absolute error (MAE). The results showed that the performance varies depending on the percentage of missing values, the size of the dataset, and the missingness mechanism. In addition, the performance of the proposed methods is slightly better.

## 1. Introduction

#### 1.1. Missingness Mechanisms

- Missing completely at random (MCAR) [12]: Assume that the missing value indicator matrix is $M=\left({M}_{ij}\right)$ and the complete data are $Y=\left({y}_{ij}\right)$. The missing data mechanism is described by the conditional distribution of $M$ given $Y$, say $f(M|Y,\phi )$, where $\phi $ represents the unknown parameters. If missingness does not depend on the values of the data $Y$, missing or observed, then$$f\left(M|Y,\phi \right)=f\left(M|\phi \right)\quad \text{for all } Y,\phi $$
- Missing at random (MAR) [12]: Let ${Y}_{mis}$ and ${Y}_{obs}$ denote the missing data and observed data, respectively. If the missingness does not depend on the data that are missing, but only on ${Y}_{obs}$, then$$f\left(M|Y,\phi \right)=f\left(M|{Y}_{obs},\phi \right)\quad \text{for all } {Y}_{mis},\phi $$
- Missing not at random (MNAR) [2]: The missingness depends on both the observed and the missing values.
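The three mechanisms can be illustrated by simulating missingness masks on synthetic data. The 20% rate, column choices, and quantile thresholds below are assumptions for the sketch, not taken from the paper:

```python
# Illustrative sketch of the three missingness mechanisms on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(size=(1000, 2))  # complete data: two continuous features

# MCAR: the indicator M is independent of Y (pure coin flips).
M_mcar = rng.random(1000) < 0.2

# MAR: missingness in column 1 depends only on the *observed* column 0.
M_mar = Y[:, 0] > np.quantile(Y[:, 0], 0.8)

# MNAR: missingness in column 1 depends on the values that go missing.
M_mnar = Y[:, 1] > np.quantile(Y[:, 1], 0.8)

# Applying a mask produces the incomplete dataset an imputer sees.
Y_mar = Y.copy()
Y_mar[M_mar, 1] = np.nan
```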

#### 1.2. Dealing with Missing Data

- Simple linear regression: In which a linear relationship between the dependent variable $y$ and the independent variable $X$ holds$$y={\beta}_{0}+{\beta}_{1}X+\epsilon $$
- Multiple linear regression: In which several independent variables work together to obtain a better prediction. The linear relationship between the dependent and independent variables holds$$y={\beta}_{0}+{\beta}_{1}{x}_{1}+{\beta}_{2}{x}_{2}+\cdots +{\beta}_{p}{x}_{p}+\epsilon $$
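As a concrete illustration, the multiple linear regression above can be fitted with scikit-learn (the library the paper's experiments build on [24]); the true coefficients used to generate the synthetic data are assumptions for the demo:

```python
# Sketch: fitting y = b0 + b1*x1 + b2*x2 + b3*x3 + eps on synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))                    # three independent variables
beta = np.array([2.0, -1.0, 0.5])                # assumed (b1, b2, b3)
y = 1.0 + X @ beta + 0.1 * rng.normal(size=500)  # b0 = 1.0 plus small noise

model = LinearRegression().fit(X, y)
# model.intercept_ recovers b0; model.coef_ recovers (b1, b2, b3)
```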

#### 1.3. Relevant Imputation Algorithms

Most of these studies used the R^{2} score or RMSE to assess performance accuracy. In [21], an imputation method that uses an auto-encoder neural network to handle missing data was proposed; a two-stage training scheme was used to train the auto-encoder, and the method was compared against eight state-of-the-art imputation methods. In [22], two quantile-based imputation algorithms were proposed; one uses supplementary information and the other does not, but describing the relationship between the feature of concern and the extra feature was an issue. In [5], a genetic algorithm combined with support vector regression and fuzzy clustering was proposed to deal with missing data and compared with the FcmGa, SvrGa, and Zeroimpute methods. Although the proposed method improved imputation accuracy, the dimension of the whole dataset affects the efficiency of the training stage: if many features have many missing values, many instances will be rejected. In [23], the authors proposed an efficient method to impute missing data in classification problems using decision trees. It is closely related to the approach of treating ‘‘missing” as a category in its own right, generalizing it for use with both categorical and continuous features. Their method showed excellent performance across different collections of data types, sources, and proportions of missingness.

## 2. Proposed Algorithms

- In the first step, each proposed algorithm takes as input a dataset $D$ that holds missing data, then splits it into two sets: the first set ${X}^{\left(comp\right)}$ includes all complete features, and the second set ${X}^{\left(mis\right)}$ includes all incomplete features. The authors assume that the target feature $y$ contains no missing data, so ${X}^{\left(comp\right)}$ comprises all full features plus the target feature $y$.
- In the second step, each proposed algorithm implements its feature selection condition to select the candidate feature to be imputed.
- The first algorithm, which we call Cumulative Bayesian Ridge with Less NaN (CBRL), selects, as its name indicates, the feature that contains the least missing data, so that the model is built on the most available information (Algorithm 1).
**Algorithm 1** CBRL

1: **Input:**

2: D: A dataset with missing values containing $n$ instances.

3:**Output:**

4: D_{imputed}: A dataset with all missing features imputed.

5:**Definitions:**

6: ${X}^{\left(comp\right)}$ Set of complete features.

7: ${X}^{\left(mis\right)}$ Set of incomplete features.

8: ${X}_{imp}^{\left(mis\right)}$ Imputed feature from ${X}^{\left(mis\right)}$.

9: $m$ Number of features containing missing values.

10: $MissObs.{X}_{l}^{\left(mis\right)}$ Set of missing instances in the independent feature ${X}_{l}^{\left(mis\right)}$ $,l\in \left\{1,\dots ,m\right\}$.

11: $Card\left(MissObs.{X}_{l}^{\left(mis\right)}\right)$ Number of missing values in the independent feature ${X}_{l}^{\left(mis\right)}$.

12:**Begin**

13: 1 Split D into ${X}^{\left(comp\right)}$ and ${X}^{\left(mis\right)}$.

14: 2 From ${X}^{\left(mis\right)}$ select ${X}_{l}^{\left(mis\right)}$ that satisfies the condition:

15: Min (Card ($MissObs.{X}_{l}^{\left(mis\right)}$)).

16: 3 While ${X}^{\left(mis\right)}\ne \varnothing $

17: i $g$ ← index of the candidate feature in ${X}^{\left(mis\right)}$.

18: ii Fit a Bayesian ridge regression model on ${X}^{\left(comp\right)}$ as independent features and ${X}_{g}^{\left(miss\right)}$ as dependent feature.

19: iii ${X}_{imp}^{\left(mis\right)}$ ← Impute the missing data in ${X}_{g}^{\left(miss\right)}$ with the fitted model.

20: iv Delete ${X}_{g}^{\left(miss\right)}$ from ${X}^{\left(mis\right)}$ and add ${X}_{imp}^{\left(mis\right)}$ to ${X}^{\left(comp\right)}$.

21: End While

22: 4 return D_{imputed}← ${X}^{\left(comp\right)}$

23:**End**

- The second algorithm, which we call Cumulative Bayesian Ridge with High Correlation (CBRC), depends on the correlation between the candidate features that contain missing data and the target feature: CBRC chooses the incomplete feature with the highest correlation with the target feature. The correlation criterion (i.e., the Pearson correlation coefficient) is given by Equation (4) [25]:$$R\left(i\right)=\frac{cov\left({x}_{i},Y\right)}{\sqrt{var\left({x}_{i}\right)\ast var\left(Y\right)}}$$
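The selection criterion of Equation (4) can be sketched as follows, ranking each incomplete feature by its Pearson correlation with the target computed over the rows where that feature is observed. The helper name is hypothetical, and taking the absolute value of the correlation is an assumption of this sketch:

```python
# Sketch of CBRC's candidate selection (Equation (4)), not the authors' code.
import pandas as pd

def cbrc_candidate(df: pd.DataFrame, incomplete: list, target: str) -> str:
    """Return the incomplete feature most correlated with the target."""
    scores = {}
    for col in incomplete:
        obs = df[col].notna()  # restrict to rows where the feature is observed
        scores[col] = abs(df.loc[obs, col].corr(df.loc[obs, target]))
    return max(scores, key=scores.get)
```

With a toy frame in which one incomplete feature tracks the target perfectly, `cbrc_candidate` selects that feature first.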
**Algorithm 2** CBRC

1: **Input:**

2: D: A dataset with missing values containing $n$ instances.

3:**Output:**

4: D_{imputed}: A dataset with all missing features imputed.

5:**Definitions:**

6: ${X}^{\left(comp\right)}$ Set of complete features.

7: ${X}^{\left(mis\right)}$ Set of incomplete features.

8: ${X}_{imp}^{\left(mis\right)}$ Imputed feature from ${X}^{\left(mis\right)}$.

9: $m$ Number of features containing missing values.

10: $Corr\left({X}_{l}^{\left(mis\right)},y\right)$ Correlation between ${X}_{l}^{\left(mis\right)}$ and $y$, $l\in \left\{1,\dots ,m\right\}$.

11:**Begin**

12: 1 Split D into ${X}^{\left(comp\right)}$ and ${X}^{\left(mis\right)}$.

13: 2 From ${X}^{\left(mis\right)}$ select ${X}_{l}^{\left(mis\right)}$ that satisfies the condition:

14: Max (Corr (${X}_{l}^{\left(mis\right)},y$)).

15: 3 While ${X}^{\left(mis\right)}\ne \varnothing $

16: i $g$ ← index of the candidate feature in ${X}^{\left(mis\right)}$.

17: ii Fit a Bayesian ridge regression model on ${X}^{\left(comp\right)}$ as independent features and ${X}_{g}^{\left(miss\right)}$ as dependent feature.

18: iii ${X}_{imp}^{\left(mis\right)}$ ← Impute the missing data in ${X}_{g}^{\left(miss\right)}$ with the fitted model.

19: iv Delete ${X}_{g}^{\left(miss\right)}$ from ${X}^{\left(mis\right)}$ and add ${X}_{imp}^{\left(mis\right)}$ to ${X}^{\left(comp\right)}$.

20: End While

21: 4 return D_{imputed} ← ${X}^{\left(comp\right)}$

22:**End**

- After selecting the candidate feature ${X}_{g}^{\left(miss\right)}$, the model is fitted with the cumulative formula defined in Equation (5), using the candidate feature as the dependent feature and ${X}^{\left(comp\right)}$ as the independent features. The selected feature is deleted from ${X}^{\left(mis\right)}$, and after imputation, the imputed feature ${X}_{imp}^{\left(mis\right)}$ is added to ${X}^{\left(comp\right)}$. Now ${X}^{\left(comp\right)}$ consists of all complete features, $y$, and ${X}_{imp}^{\left(mis\right)}$. Another candidate feature is then selected from ${X}^{\left(mis\right)}$, and the model is fitted using the cumulative BRR formula with this candidate feature as the dependent feature and ${X}^{\left(comp\right)}$ as the independent features:$${X}_{g}^{\left(miss\right)}\sim \mathrm{N}\left({\mu}_{g},{\alpha}_{g}\right)$$$${\mu}_{g}={\beta}_{0}+{\displaystyle \sum}_{i=1}^{c}{\beta}_{i}{X}_{i}^{\left(comp\right)}+{\beta}_{c+1}y+{\displaystyle \sum}_{imp=1}^{g-1}{\beta}_{imp+c+1}{X}_{imp}^{\left(miss\right)}$$$$\beta \sim N\left(0,{\lambda}_{g}^{-1}{I}_{imp+c+1}\right)$$$${\alpha}_{g}\sim \mathcal{G}\left({\alpha}_{1g},{\alpha}_{2g}\right)$$$${\lambda}_{g}\sim \mathcal{G}\left({\lambda}_{1g},{\lambda}_{2g}\right)$$where $c$ is the number of complete features and $\mathcal{G}$ denotes a Gamma distribution.
- Repeat from step 2 of feature selection until ${X}^{\left(mis\right)}$ is empty, then return the imputed dataset (${X}^{\left(comp\right)}$), see Figure 1.
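The cumulative loop above can be sketched in a few lines, assuming numeric features, a complete target column, and scikit-learn's `BayesianRidge` as the BRR model; `cbrl_impute` is a hypothetical helper, not the authors' implementation:

```python
# Minimal sketch of the cumulative CBRL loop described above.
import numpy as np
import pandas as pd
from sklearn.linear_model import BayesianRidge

def cbrl_impute(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    comp = out.columns[out.notna().all()].tolist()   # X^(comp), target included
    mis = [c for c in out.columns if c not in comp]  # X^(mis)
    while mis:
        # CBRL condition: pick the feature with the fewest missing values.
        g = min(mis, key=lambda c: out[c].isna().sum())
        obs = out[g].notna()
        model = BayesianRidge().fit(out.loc[obs, comp], out.loc[obs, g])
        out.loc[~obs, g] = model.predict(out.loc[~obs, comp])
        mis.remove(g)                                # delete from X^(mis)...
        comp.append(g)                               # ...and grow X^(comp)
    return out
```

Swapping the `min` selection for a highest-correlation rule over the target column gives the CBRC variant of the same loop.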

## 3. Experimental Implementation

#### 3.1. Benchmark Datasets

#### 3.2. Evaluation

The performance was evaluated using RMSE, MAE, the R^{2} score, and the time of imputation in seconds (t). The performance evaluation was calculated for the four missingness ratios.

#### 3.2.1. RMSE and MAE

#### 3.2.2. R^{2} Score

The R^{2} score (Equation (8)) is a statistical measure that indicates how close the predicted values are to the real data values.
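All three metrics are available in scikit-learn; the sample vectors below are purely illustrative:

```python
# Sketch: evaluation metrics between original and imputed values.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, 2.5, 4.0, 7.0])  # values removed before imputation
y_imp = np.array([2.8, 2.7, 4.1, 6.5])   # values the imputer produced

rmse = np.sqrt(mean_squared_error(y_true, y_imp))  # lower is better
mae = mean_absolute_error(y_true, y_imp)           # lower is better
r2 = r2_score(y_true, y_imp)                       # closer to 1 is better
```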

## 4. Results and Discussion

#### 4.1. Error Analysis

#### 4.2. Imputation Time

#### 4.3. Accuracy Analysis

The accuracy analysis uses the R^{2} score (higher value is better). Figure 2a shows that CBRL equals least squares and MICE in accuracy and outperforms stochastic, norm, Fast KNN, and EMI; CBRC is less accurate than CBRL, least squares, and MICE, but better than stochastic, norm, Fast KNN, and EMI. Figure 3a shows that CBRL outperforms stochastic, norm, Fast KNN, and EMI, but is worse than least squares and MICE; CBRC equals CBRL in accuracy. In Figure 4a, CBRC is more accurate than all the stated methods except MICE, while CBRL is better than all methods except CBRC and MICE. In Figure 5a, CBRL and CBRC outperform stochastic, norm, Fast KNN, and EMI, but are worse than least squares and MICE; CBRC is more accurate than CBRL. Figure 6a shows that CBRL and CBRC outperform stochastic, norm, Fast KNN, and EMI and are worse than least squares and MICE; here CBRL is more accurate than CBRC. Figure 7a shows that under MAR, CBRL and CBRC outperform least squares, stochastic, norm, Fast KNN, and EMI, and equal MICE. Under MCAR, CBRL and CBRC outperform stochastic, norm, Fast KNN, and EMI but are worse than least squares and MICE. Under MNAR, CBRL and CBRC outperform stochastic, norm, Fast KNN, and EMI, and equal least squares and MICE; CBRC is more accurate than CBRL. Figure 8b, Figure 9b, Figure 10b, Figure 11b, Figure 12b, Figure 13b, Figure 14b and Figure 15b exhibit the error analysis for the BNG_heart_statlog and Poker Hand datasets using random samples of 10,000, 15,000, 20,000, and 50,000 instances. The results show that CBRL and CBRC equal least squares and MICE in performance and are better than stochastic, norm, Fast KNN, and EMI.

## 5. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

- Mostafa, S.M. Imputing missing values using cumulative linear regression. CAAI Trans. Intell. Technol. **2019**, 4, 182–200.
- Salgado, C.M.; Azevedo, C.; Manuel Proença, H.; Vieira, S.M. Missing data. Second. Anal. Electron. Health Rec. **2016**, 143–162.
- Hapfelmeier, A.; Hothorn, T.; Ulm, K.; Strobl, C. A new variable importance measure for random forests with missing data. Stat. Comput. **2014**, 24, 21–34.
- Batista, G.; Monard, M.-C. A study of k-nearest neighbour as an imputation method. Hybrid Intell. Syst. Ser. Front Artif. Intell. Appl. **2002**, 87, 251–260.
- Aydilek, I.B.; Arslan, A. A hybrid method for imputation of missing values using optimized fuzzy c-means with support vector regression and a genetic algorithm. Inf. Sci. **2013**, 233, 25–35.
- Pampaka, M.; Hutcheson, G.; Williams, J. Handling missing data: Analysis of a challenging data set using multiple imputation. Int. J. Res. Method Educ. **2016**, 39, 19–37.
- Abdella, M.; Marwala, T. The use of genetic algorithms and neural networks to approximate missing data in database. Comput. Inform. **2005**, 24, 577–589.
- Luengo, J.; García, S.; Herrera, F. On the choice of the best imputation methods for missing values considering three groups of classification methods. Knowl. Inf. Syst. **2012**, 32, 77–108.
- Donders, A.R.T.; van der Heijden, G.J.M.G.; Stijnen, T.; Moons, K.G.M. Review: A gentle introduction to imputation of missing values. J. Clin. Epidemiol. **2006**, 59, 1087–1091.
- Perkins, N.J.; Cole, S.R.; Harel, O.; Tchetgen Tchetgen, E.J.; Sun, B.; Mitchell, E.M.; Schisterman, E.F. Principled Approaches to Missing Data in Epidemiologic Studies. Am. J. Epidemiol. **2018**, 187, 568–575.
- Croiseau, P.; Génin, E.; Cordell, H.J. Dealing with missing data in family-based association studies: A multiple imputation approach. Hum. Hered. **2007**, 63, 229–238.
- Mostafa, S.M. Missing data imputation by the aid of features similarities. Int. J. Big Data Manag. **2020**, 1, 81–103.
- Iltache, S.; Comparot, C.; Mohammed, M.S.; Charrel, P.J. Using semantic perimeters with ontologies to evaluate the semantic similarity of scientific papers. Informatica **2018**, 42, 375–399.
- Yadav, M.L.; Roychoudhury, B. Handling missing values: A study of popular imputation packages in R. Knowl.-Based Syst. **2018**, 160, 104–118.
- Farhangfar, A.; Kurgan, L.A.; Pedrycz, W. A Novel Framework for Imputation of Missing Values in Databases. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. **2007**, 37, 692–709.
- Zahin, S.A.; Ahmed, C.F.; Alam, T. An effective method for classification with missing values. Appl. Intell. **2018**, 48, 3209–3230.
- Batista, G.E.A.P.A.; Monard, M.C. An analysis of four missing data treatment methods for supervised learning. Appl. Artif. Intell. **2003**, 17, 519–533.
- Acuña, E.; Rodriguez, C. The Treatment of Missing Values and its Effect on Classifier Accuracy. In Classification, Clustering, and Data Mining Applications; Springer: Berlin/Heidelberg, Germany, 2004; pp. 639–647.
- Li, D.; Deogun, J.; Spaulding, W.; Shuart, B. Towards Missing Data Imputation: A Study of Fuzzy K-means Clustering Method. In Proceedings of the International Conference on Rough Sets and Current Trends in Computing, Madrid, Spain, 9–13 July 2004; Springer: Berlin/Heidelberg, Germany, 2004; Volume 3066, pp. 573–579.
- Feng, H.; Chen, G.; Yin, C.; Yang, B.; Chen, Y. A SVM regression based approach to filling in missing values. In Proceedings of the International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, Melbourne, Australia, 14–16 September 2005; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3683, pp. 581–587.
- Choudhury, S.J.; Pal, N.R. Imputation of missing data with neural networks for classification. Knowl.-Based Syst. **2019**, 182.
- Muñoz, J.F.; Rueda, M. New imputation methods for missing data using quantiles. J. Comput. Appl. Math. **2009**, 232, 305–317.
- Twala, B.; Jones, M.C.; Hand, D.J. Good methods for coping with missing data in decision trees. Pattern Recognit. Lett. **2008**, 29, 950–956.
- Varoquaux, G.; Buitinck, L.; Louppe, G.; Grisel, O.; Pedregosa, F.; Mueller, A. Scikit-learn. J. Mach. Learn. Res. **2011**, 12, 2825–2830.
- Chandrashekar, G.; Sahin, F. A survey on feature selection methods. Comput. Electr. Eng. **2014**, 40, 16–28.
- Van Buuren, S.; Groothuis-Oudshoorn, K.; Robitzsch, A.; Vink, G.; Doove, L.; Jolani, S.; Schouten, R.; Gaffert, P.; Meinfelder, F.; Gray, B. MICE: Multivariate Imputation by Chained Equations. 2019. Available online: https://cran.rproject.org/web/packages/mice/ (accessed on 15 March 2019).
- Efron, B.; Hastie, T.; Iain, J.; Robert, T. Diabetes Data. 2004. Available online: https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html (accessed on 1 June 2019).
- Acharya, M.S. Graduate Admissions-1-6-2019. Available online: https://www.kaggle.com/mohansacharya/graduate-admissions (accessed on 1 June 2019).
- Stephen, B. Profit Estimation of Companies. Available online: https://github.com/boosuro/profit_estimation_of_companies (accessed on 8 August 2019).
- Kartik, P. Red & White Wine Dataset. Available online: https://www.kaggle.com/numberswithkartik/red-white-wine-dataset (accessed on 11 February 2019).
- Cam, N. California Housing Prices. Available online: https://www.kaggle.com/camnugent/california-housing-prices (accessed on 6 July 2019).
- Magrawal, S. Diamonds. Available online: https://www.kaggle.com/shivam2503/diamonds (accessed on 30 August 2019).
- Cattral, R.; Oppacher, F. Poker Hand Dataset. Available online: https://archive.ics.uci.edu/ml/datasets/Poker+Hand (accessed on 24 November 2019).
- Holmes, G.; Pfahringer, B.; van Rijn, J.; Vanschoren, J. BNG_heart_statlog. Available online: https://www.openml.org/d/267 (accessed on 11 September 2019).
- Kearney, J.; Barkat, S. Autoimpute. Available online: https://autoimpute.readthedocs.io/en/latest/ (accessed on 1 January 2020).
- Law, E. Impyute. Available online: https://impyute.readthedocs.io/en/latest/ (accessed on 8 August 2019).
- Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE)? Arguments against avoiding RMSE in the literature. Geosci. Model Dev. **2014**, 7, 1247–1250.

**Figure 2.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (graduate admission dataset).

**Figure 3.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (diabetes dataset).

**Figure 4.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (Profit dataset).

**Figure 5.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (Wine dataset).

**Figure 6.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (California dataset).

**Figure 7.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (diamond dataset).

**Figure 8.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (BNG (10,000) dataset).

**Figure 9.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (BNG (15,000) dataset).

**Figure 10.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (BNG (20,000) dataset).

**Figure 11.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (BNG (50,000) dataset).

**Figure 12.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (Poker (10,000) dataset).

**Figure 13.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (Poker (15,000) dataset).

**Figure 14.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (Poker (20,000) dataset).

**Figure 15.**Comparison between the proposed methods, least squares, stochastic, norm, MICE, Fast KNN and EMI (Poker (50,000) dataset).

**Table 1.** Datasets’ specifications. The first column presents the dataset name, the second the number of instances, the third the number of features, and the last three columns the missingness mechanisms applied.

| Dataset Name | #Instances | #Features | MAR | MCAR | MNAR |
|---|---|---|---|---|---|
| Diabetes [27] | 442 | 11 | √ | √ | √ |
| Graduate admissions [28] | 500 | 8 | √ | √ | √ |
| Profit estimation of companies [29] | 1000 | 6 | √ | √ | √ |
| Red & white wine [30] | 4898 | 12 | √ | √ | √ |
| California [31] | 20,640 | 9 | √ | √ | √ |
| Diamonds [32] | 53,940 | 10 | √ | √ | √ |
| Poker Hand [33] | 1,025,010 | 11 | √ | √ | √ |
| BNG_heart_statlog [34] | 1,000,000 | 14 | √ | √ | √ |

| Method Name | Function Name | Package | Description |
|---|---|---|---|
| MICE [9,26] | mice | impyute | Implements the multivariate imputation by chained equations algorithm. |
| Least squares (LS) [35] | SingleImputer | autoimpute | Produces predictions using the least squares methodology. |
| Norm [35] | SingleImputer | autoimpute | Creates a normal distribution using the sample variance and mean of the observed data. |
| Stochastic (ST) [35] | SingleImputer | autoimpute | Samples from the regression’s error distribution and adds the random draw to the prediction. |
| Fast KNN (FKNN) [36] | fast_knn | impyute | Uses a k-dimensional tree to find the k nearest neighbors and imputes using their weighted average. |
| EMI [36] | em | impyute | Imputes using expectation maximization. |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

M. Mostafa, S.; S. Eladimy, A.; Hamad, S.; Amano, H.
CBRL and CBRC: Novel Algorithms for Improving Missing Value Imputation Accuracy Based on Bayesian Ridge Regression. *Symmetry* **2020**, *12*, 1594.
https://doi.org/10.3390/sym12101594
