The data set is made up of a total of 17,763 samples corresponding to the period of time referred to in the description of the data. It is used to test two different algorithms: multiple imputation by chained equations (MICE) and the proposed Adaptive Assignation Algorithm (AAA). The dataset is subjected to a process of random data deletion, which assumes that the probability of an observation being missing depends on neither observed nor unobserved measurements; this mechanism is called missing completely at random (MCAR). The deletion process was repeated five times for each of three levels of missing data: 10%, 15%, and 20% of the total. After each deletion, both algorithms were applied to the resulting data subset and the performance of the two methods was compared.
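The MCAR deletion step can be sketched as follows. This is a minimal illustration, not the study's own code; the function name and the toy data are ours:

```python
import random

def mcar_delete(data, rate, rng=None):
    """Delete entries completely at random (MCAR): every cell is masked
    (set to None) independently with probability `rate`, regardless of
    the observed or unobserved values."""
    rng = rng or random.Random(42)
    return [[None if rng.random() < rate else v for v in row]
            for row in data]

# One of the five repetitions at each missing-data level:
toy = [[float(i + j) for j in range(10)] for i in range(100)]
for rate in (0.10, 0.15, 0.20):
    masked = mcar_delete(toy, rate)
```

Because the mask is drawn independently per cell, the realized fraction of missing values fluctuates around the nominal rate, which is why the study repeats the deletion five times per level.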
3.2. The Proposed Algorithm AAA
In order to introduce the new algorithm, let us assume that we have a dataset formed by $n$ different variables ${v}_{1},{v}_{2},\dots ,{v}_{n}$. In order to calculate the missing values of the $i$th column, all the rows with no missing value in that column are employed. Then, a certain number of MARS models are calculated. It is possible to find rows with very different amounts of missing data, from zero (no missing data) to $n$ (all values are missing). Columns with all values missing are removed; they are neither used for the model calculation nor imputed. Therefore, any amount of missing data from 0 to $n-2$ among the input variables is feasible (all input variables but one missing).
In other words, if the dataset is formed by variables ${v}_{1},{v}_{2},\dots ,{v}_{n}$ and we want to estimate the missing values in column ${v}_{i}$, then the maximum number of different MARS models that would be computed for this variable (and, in general, for each column) is as follows: ${{\displaystyle \sum}}_{k=1}^{n-1}\left(\begin{array}{c}n-1\\ k\end{array}\right)={2}^{n-1}-1$. For the case of the data under study in this research, with 10 different variables, a maximum of 5110 distinct MARS models would be trained (511 for each variable).
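The model count can be checked directly: the number of models per target variable equals the number of non-empty subsets of its $n-1$ candidate input variables. A minimal check in Python:

```python
from math import comb

def max_models(n: int) -> int:
    """Maximum number of MARS models per target variable: one model for
    every non-empty subset of the n - 1 candidate input variables,
    i.e. sum_{k=1}^{n-1} C(n-1, k) = 2**(n-1) - 1."""
    return sum(comb(n - 1, k) for k in range(1, n))

print(max_models(10))       # models per variable: 511
print(10 * max_models(10))  # models for the whole 10-variable dataset: 5110
```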
Table 3 shows the first 25 rows of the dataset to which the algorithm is applied. When the algorithm is applied to the third column of this dataset (variable ${v}_{3}$), all rows with missing data (represented by means of the symbol ‘o’) in the third column are not employed for the calculation of the models (rows in red). Once those rows are set aside, different models can be trained for the prediction of ${v}_{3}$ using different subsets of variables. Continuing with the example of variable ${v}_{3}$, and taking into account the data missing in the first 25 rows, it would be possible to train the following models:
Model 1: a model that uses as output variable ${v}_{3}$ and the other nine as input variables (${v}_{1},{v}_{2},{v}_{4},{v}_{5},{v}_{6},{v}_{7},{v}_{8},{v}_{9},{v}_{10}$).
Model 2: a model that uses as output variable ${v}_{3}$ and as input variables ${v}_{2},{v}_{4},{v}_{5},{v}_{6},{v}_{7},{v}_{8},{v}_{9},{v}_{10}$.
Model 3: a model that uses as output variable ${v}_{3}$ and as input variables ${v}_{1},{v}_{4},{v}_{5},{v}_{6},{v}_{7},{v}_{8},{v}_{9},{v}_{10}$.
Model 4: a model that uses as output variable ${v}_{3}$ and as input variables ${v}_{1},{v}_{2},{v}_{4},{v}_{6},{v}_{7},{v}_{8},{v}_{9},{v}_{10}$.
Model 5: a model that uses as output variable ${v}_{3}$ and as input variables ${v}_{1},{v}_{2},{v}_{4},{v}_{5},{v}_{7},{v}_{8},{v}_{9},{v}_{10}$.
Model 6: a model that uses as output variable ${v}_{3}$ and as input variables ${v}_{4},{v}_{5},{v}_{6},{v}_{7},{v}_{8},{v}_{9},{v}_{10}$.
Model 7: a model that uses as output variable ${v}_{3}$ and as input variables ${v}_{1},{v}_{5},{v}_{6},{v}_{7},{v}_{8},{v}_{9},{v}_{10}$.
Model 8: a model that uses as output variable ${v}_{3}$ and as input variables ${v}_{1},{v}_{4},{v}_{5},{v}_{6},{v}_{8},{v}_{9},{v}_{10}$.
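The eight models above can be represented simply by their input-variable subsets: a model can be trained on (or applied to) a row only when every one of its inputs is observed there. A small sketch, using a row missingness pattern like row 6 of Table 3 (only ${v}_{2}$ missing):

```python
# Input-variable subsets of the eight example models for target v3.
MODELS = {
    1: {1, 2, 4, 5, 6, 7, 8, 9, 10},
    2: {2, 4, 5, 6, 7, 8, 9, 10},
    3: {1, 4, 5, 6, 7, 8, 9, 10},
    4: {1, 2, 4, 6, 7, 8, 9, 10},
    5: {1, 2, 4, 5, 7, 8, 9, 10},
    6: {4, 5, 6, 7, 8, 9, 10},
    7: {1, 5, 6, 7, 8, 9, 10},
    8: {1, 4, 5, 6, 8, 9, 10},
}

def usable_models(observed: set) -> list:
    """Models whose input variables are all observed in the row."""
    return sorted(m for m, inputs in MODELS.items() if inputs <= observed)

# Row with only v2 missing (cf. row 6 of Table 3): v1 and v3..v10 observed.
print(usable_models({1, 3, 4, 5, 6, 7, 8, 9, 10}))  # [3, 6, 7, 8]
```

This matches the ‘yes’ entries in Table 3 for that row: only the models that do not use ${v}_{2}$ as input remain available.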
After the calculation of all the available models, the missing data of each row will be imputed using those models that employ all the available nonmissing variables of the row. In those cases in which no model could be calculated, the missing data will be replaced by the median of the column. Please note that, in the case of large data sets with a not-too-high percentage of missing data, this will be an infrequent case. In the case of missing completely at random data, the probability Q of not having at least two nonmissing values in a certain row can be expressed by the following formula:
$$Q={P}^{N}+N\left(1-P\right){P}^{N-1}$$
where:
N is the number of variables;
P is the rate of missing data in an MCAR case.
In the case of our example, none of the rows was in this situation for 10% and 15% of missing data, while in the case of 20% of missing data it happened in only one row (less than 0.006% of the total number of rows). These results are in line with those expected from the formula.
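The order of magnitude of these counts can be checked numerically. Taking Q as the binomial probability that an MCAR row has fewer than two observed values, $Q = P^{N} + N(1-P)P^{N-1}$ (this closed form is our reconstruction, as the displayed equation was lost in extraction):

```python
def q_prob(n_vars: int, p_missing: float) -> float:
    """Probability that an MCAR row has fewer than two observed values:
    either all N entries are missing, or exactly N - 1 are missing."""
    return (p_missing ** n_vars
            + n_vars * (1 - p_missing) * p_missing ** (n_vars - 1))

for p in (0.10, 0.15, 0.20):
    q = q_prob(10, p)
    print(f"P={p:.2f}: Q={q:.3e}, expected rows out of 17,763 = {17763 * q:.3f}")
```

For all three levels the expected number of affected rows is well below one, consistent with observing at most a single such row in the experiments.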
As a general rule for the algorithm, it has been decided that when a certain value can be estimated using more than one MARS model, it must be estimated using the MARS model with the largest number of input variables; if several models share that largest number, the value is estimated by one of them chosen at random. Finally, in those exceptional cases in which no model is available for estimation, the median value of the variable is used for the imputation.
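The selection rule just described can be sketched as follows. Models are again represented only by their input-variable subsets (a real implementation would attach a fitted MARS model to each subset); the function name is ours:

```python
import random
import statistics

def impute_choice(observed_inputs, model_inputs, column_values, rng=random):
    """Selection rule: among the models whose inputs are all observed in
    the row, keep those with the most input variables; break ties at
    random; fall back to the column median when no model applies."""
    usable = [m for m in model_inputs if m <= observed_inputs]
    if not usable:
        return "median", statistics.median(column_values)
    best = max(len(m) for m in usable)
    return "model", rng.choice([m for m in usable if len(m) == best])

# Two models tie with two inputs each; one of them is chosen at random.
source, chosen = impute_choice({1, 2, 4}, [{1, 2}, {2, 4}, {1}],
                               [3.0, 5.0, 9.0])
```

When nothing is observed in the row, the same call degrades gracefully to the median fallback described above.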
Table 3.
Example of the dataset (25 first rows).
Row #  v_{1}  v_{2}  v_{3}  v_{4}  v_{5}  v_{6}  v_{7}  v_{8}  v_{9}  v_{10}  Model 1  Model 2  Model 3  Model 4  Model 5  Model 6  Model 7  Model 8 

1  X  X  X  X  X  X  X  X  X  X  Yes  yes  yes  yes  yes  yes  yes  yes 
2  X  o  X  o  X  X  X  X  X  X  No  no  yes  no  no  yes  yes  no 
3  X  X  X  X  o  X  X  X  X  X  No  no  no  yes  no  no  no  no 
4  X  X  o  o  X  X  X  X  X  X  No  no  no  no  no  no  no  no 
5  X  X  X  X  X  X  X  X  X  X  Yes  yes  yes  yes  yes  yes  yes  yes 
6  X  o  X  X  X  X  X  X  X  X  No  no  yes  no  no  yes  yes  yes 
7  o  X  X  X  X  X  X  X  X  X  No  yes  no  no  no  yes  no  no 
8  X  X  X  X  X  X  X  X  X  X  Yes  yes  yes  yes  yes  yes  yes  yes 
9  o  o  o  X  X  X  X  X  X  X  No  no  no  no  no  no  no  no 
10  X  X  o  X  X  X  X  X  X  X  No  no  no  no  no  no  no  no 
11  X  X  o  X  X  X  X  X  X  X  No  no  no  no  no  no  no  no 
12  X  o  o  X  X  X  X  X  X  X  No  no  no  no  no  no  no  no 
13  X  X  X  X  X  X  X  X  X  X  Yes  yes  yes  yes  yes  yes  yes  yes 
14  o  o  X  X  X  X  X  X  X  X  No  no  no  no  no  yes  no  no 
15  o  X  X  X  X  X  X  X  X  X  No  yes  no  no  no  yes  no  no 
16  X  X  X  X  X  X  X  X  X  X  Yes  yes  yes  yes  yes  yes  yes  yes 
17  o  o  o  o  o  o  X  X  X  X  No  no  no  no  no  no  no  no 
18  X  X  X  X  X  X  X  X  X  X  Yes  yes  yes  yes  yes  yes  yes  yes 
19  X  o  X  X  X  X  o  X  X  X  No  no  yes  no  no  no  no  yes 
20  X  X  X  X  X  o  X  X  X  X  No  no  no  no  yes  no  no  no 
21  X  o  o  X  o  X  X  X  X  X  No  no  no  no  no  no  no  no 
22  X  X  X  X  X  X  X  X  X  X  Yes  yes  yes  yes  yes  yes  yes  yes 
23  X  X  o  o  X  X  X  X  X  X  No  no  no  no  no  no  no  no 
24  X  X  X  X  X  o  X  X  X  X  No  no  no  no  yes  no  no  no 
25  X  X  o  X  X  X  X  X  o  X  No  no  no  no  no  no  no  no 
3.3. The Benchmark Technique: The MICE Algorithm
The multiple imputation by chained equations (MICE) algorithm was developed by van Buuren and Groothuis-Oudshoorn [19]. It is a Markov Chain Monte Carlo method in which the state space is the collection of all imputed values [9]. As with any other Markov Chain, the MICE algorithm has to fulfil three properties [20,21,22,23] in order to converge. The referred properties are as follows:
The chain must be able to reach all parts of the state space. This means that it is irreducible.
The chain should not oscillate between different states. In other words, the Markov Chain must be aperiodic.
Finally, the chain must be recurrent. This means, as in any other Markov Chain, that the probability that the chain, starting from state $i$, eventually returns to $i$ is equal to one.
According to the experience of the algorithm's creators [19], and also from our own previous experience [9], the convergence of the MICE algorithm is achieved after a relatively low number of iterations, usually somewhere between 5 and 20 [23]. In the case of the present research, up to 20 iterations were considered, but as no statistically significant improvement with respect to five iterations was achieved, the results for five iterations are presented.
The MICE algorithm [23] for the imputation of multivariate missing data consists of the steps listed in Algorithm 1. In this algorithm, $Y$ represents an $n\times p$ matrix of partially observed sample data, $R$ is an $n\times p$ matrix of 0–1 response indicators of $Y$, and $\varnothing $ represents the parameter space. This methodology was already explained by the authors in previous research published in this journal [9]. For a more detailed explanation of the algorithm, we recommend another look at the original research by van Buuren and Groothuis-Oudshoorn [23].
Algorithm 1. MICE algorithm for imputation of multivariate missing data [19].
1. Specify an imputation model $P({Y}_{j}^{mis}\mid {Y}_{j}^{obs},{Y}_{-j},R)$ for variable ${Y}_{j}$, with $j=1,\dots ,p$.
2. For each $j$, fill in starting imputations ${Y}_{j}^{0}$ by random draws from ${Y}_{j}^{obs}$.
3. Repeat for $t=1,\dots ,T$ (iterations):
4.   Repeat for $j=1,\dots ,p$ (variables):
5.     Define ${\dot{Y}}_{-j}^{t}=\left({Y}_{1}^{t},\dots ,{Y}_{j-1}^{t},{Y}_{j+1}^{t-1},\dots ,{Y}_{p}^{t-1}\right)$ as the currently complete data except ${Y}_{j}$.
6.     Draw ${\varnothing}_{j}^{t}\sim P({\varnothing}_{j}^{t}\mid {Y}_{j}^{obs},{\dot{Y}}_{-j}^{t},R)$.
7.     Draw imputations ${Y}_{j}^{t}\sim P({Y}_{j}^{mis}\mid {Y}_{j}^{obs},{\dot{Y}}_{-j}^{t},R,{\varnothing}_{j}^{t})$.
8.   End repeat $j$.
9. End repeat $t$.
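The loop structure of Algorithm 1 can be sketched as below. This is a structural illustration only: the per-variable conditional model is simplified to a random draw from the column's observed values, whereas a real MICE implementation fits a regression of each variable on the other (currently complete) columns at every pass:

```python
import random

def mice_sketch(data, n_iter=5, rng=None):
    """Structural sketch of the MICE loop: initialise missing entries
    with random draws from each column's observed values, then cycle
    over variables for T iterations, redrawing each column's missing
    entries. `data` is a list of rows; missing entries are None."""
    rng = rng or random.Random(0)
    n_rows, n_cols = len(data), len(data[0])
    observed = [[row[j] for row in data if row[j] is not None]
                for j in range(n_cols)]
    missing = [(i, j) for i in range(n_rows) for j in range(n_cols)
               if data[i][j] is None]
    filled = [list(row) for row in data]
    # Starting imputations: random draws from the observed values.
    for i, j in missing:
        filled[i][j] = rng.choice(observed[j])
    for _ in range(n_iter):          # repeat for t = 1..T
        for j in range(n_cols):      # repeat for j = 1..p
            for i, jj in missing:    # redraw column j's missing entries
                if jj == j:
                    filled[i][j] = rng.choice(observed[j])
    return filled

data = [[1.0, None], [2.0, 5.0], [None, 7.0]]
completed = mice_sketch(data)
```

Observed values are never altered; only the entries flagged as missing are redrawn at each of the $T$ iterations, mirroring the double loop over $t$ and $j$ in Algorithm 1.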
