# Evolving Matrix-Factorization-Based Collaborative Filtering Using Genetic Programming


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Matrix Factorization

#### 2.2. Genetic Programming

`Zero`) and $1.0$ (`One`), in order to increase the expressiveness of the aggregation functions $h$ generated by the algorithm.

#### 2.3. Experimental Setup

## 3. Results

#### 3.1. Quality of the Recommender System Predictions

`EMF-2`, `EMF-4`, `EMF-5`, `EMF-8`, `EMF-9`, and `EMF-10`), the difference between EMF and BiasedMF is negligible, so the quality of the predictions can be considered equivalent. On the other hand, the method that achieves the best MSE is `EMF-4`. In addition, EMF outperforms the best baseline (BiasedMF) in 4 of the 10 executions (`EMF-2`, `EMF-4`, `EMF-6`, and `EMF-10`). These results suggest that the proposed method tends to produce fewer large differences between the real ratings and the predicted ones.
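The link between a lower MSE and fewer large errors can be illustrated with the standard definitions of both measures. The following is a minimal sketch; the rating vectors are made-up illustrative values, not data from the experiments. Both sets of predictions have the same MAE, but the one containing a single large error has a much higher MSE.

```python
def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Squared Error: average squared error, dominated by large errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical ratings, chosen only to make the arithmetic easy to follow.
actual = [4.0, 3.0, 5.0, 2.0]
many_small = [3.5, 3.5, 4.5, 2.5]  # four errors of 0.5
one_large = [4.0, 3.0, 5.0, 4.0]   # a single error of 2.0

print(mae(actual, many_small), mse(actual, many_small))  # 0.5 0.25
print(mae(actual, one_large), mse(actual, one_large))    # 0.5 1.0
```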

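`EMF-10` provides the best results for MAE and MSE. Furthermore, the proposed method outperforms the tested baselines in nearly all executions for both quality measures: only `EMF-4` has a higher MAE than BiasedMF (0.6303 vs. 0.6277), and only `EMF-9` has a higher MSE than BNMF (0.7050 vs. 0.6987).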

#### 3.2. Results of the Evolutionary Method

`EMF-4`). On the other hand, in Figure 3, we plot the best individual for the FilmTrust dataset (`EMF-10`). Furthermore, for the sake of completeness, the rest of the individuals are included in Appendix A. Recall that the correspondence between node symbols and operators is defined in Table 1.

#### 3.3. Convergence of the Evolutionary Method

`EMF-6` and `EMF-10`). On the other hand, the nodes in the MovieLens graph are sparser and cannot be grouped into a single cluster.

## 4. Discussion

## Author Contributions

## Funding

## Conflicts of Interest

## Appendix A

**EMF-1:** `* - cos cos log exp atan`${p}_{u4}$`- atan`–${q}_{i3}$`exp`–`atan log exp`${p}_{u1}$`exp cos log cos log exp - exp cos`${p}_{u2}$`exp`–`atan log atan sin cos cos atan atan exp - log exp atan cos log exp atan`${p}_{u4}$${p}_{u2}$

**EMF-2:** `- + + exp sin`${q}_{i3}$`exp + atan`${p}_{u2}$`cos One`${p}_{u1}$`* +`${p}_{u1}$`sin +`${p}_{u1}$`sin + +`${p}_{u1}$`+ + exp`${q}_{i5}$`exp + *`${p}_{u1}$${p}_{u0}$`sin + +`${p}_{u1}$`* + * * One One`${p}_{u0}$${p}_{u1}$${q}_{i2}$`One +`${p}_{u1}$`* + * * One One`${q}_{i4}$${p}_{u1}$${q}_{i2}$`One`${q}_{i2}$

**EMF-3:** `-`-`inv inv + inv inv + sin One cos Zero - inv inv + inv inv + sin atan`–`+ + *`${q}_{i0}$ ${p}_{u4}$ ${p}_{u0}$`*`${q}_{i0}$ ${p}_{u4}$`cos Zero cos Zero + * inv inv`${q}_{i3}$ ${p}_{u4}$–`sin atan + +`${q}_{i0}$ ${p}_{u0}$`*`${q}_{i0}$ ${q}_{i0}$`+ * inv inv`${q}_{i3}$ ${p}_{u4}$–`sin atan + +`${q}_{i0}$ ${p}_{u0}$`* cos Zero`${p}_{u4}$`atan`–`+ +`${q}_{i0}$ ${p}_{u0}$`* inv + inv inv + sin One cos Zero cos Zero`${p}_{u4}$

– **EMF-4:** `+ -`- ${q}_{i0}$`+ + + One One *`${p}_{u3}$ ${q}_{i0}$`One + One`${p}_{u2}$`*`${p}_{u0}$ ${q}_{i0}$

–– **EMF-5:** `+ + atan`${q}_{i1}$`+ + atan`${p}_{u5}$`exp`${p}_{u3}$`atan +`–`- exp`${q}_{i2}$`exp atan`${p}_{u4}$ ${q}_{i5}$`exp`${q}_{i1}$

**EMF-6:** `log exp + exp atan exp atan - exp atan - exp`${p}_{u0}$${q}_{i5}$${q}_{i5}$`exp atan -`${p}_{u3}$${q}_{i5}$

**EMF-7:** `* -`––`cos`–– ${p}_{u1}$`atan inv`–${q}_{i3}$`exp cos * atan`–`exp atan`${p}_{u1}$–${q}_{i4}$

**EMF-8:** `- + -`${q}_{i4}$${p}_{u2}$`inv * cos sin -`${q}_{i4}$${p}_{u2}$`atan exp`${q}_{i2}$`- cos -`${q}_{i4}$${p}_{u4}$`cos cos * -`- ${p}_{u2}$`*`${q}_{i4}$`One`${p}_{u2}$–`atan`${p}_{u0}$

**EMF-9:** `+ exp`${q}_{i0}$`- exp atan`${p}_{u0}$`+ - Zero - exp`${q}_{i0}$${q}_{i5}$`+ *`${p}_{u5}$${q}_{i5}$`-`${p}_{u4}$${q}_{i2}$

**EMF-10:** `exp atan`pow pow`* exp exp`${p}_{u5}$`inv exp`${q}_{i0}$`exp`${p}_{u4}$`exp inv exp`${p}_{u5}$

**EMF-1:** `exp atan + cos`${p}_{u5}$`exp + +`${p}_{u3}$${q}_{i2}$${p}_{u3}$

–– **EMF-2:** `+ sin`–– ${p}_{u0}$`+ One exp sin cos exp +`${p}_{u6}$–`+ sin`${p}_{u9}$`+ sin`${q}_{i4}$`exp sin`–`+ One exp sin cos`–`+ sin sin`${p}_{u9}$`+ One exp sin cos exp +`${p}_{u6}$–`+ sin`${q}_{i4}$`exp sin`–`sin`${p}_{u9}$

**EMF-3:** `inv exp atan -`${p}_{u2}$`exp exp atan - exp - exp -`${p}_{u2}$`exp atan -`${p}_{u2}$${p}_{u7}$`exp -`- ${p}_{u7}$ ${q}_{i3}$`exp atan -`${p}_{u2}$`exp -`${p}_{u2}$`exp atan -`${p}_{u2}$`exp exp atan - exp -`${p}_{u7}$`exp -`- ${p}_{u2}$`exp exp atan - exp -`${p}_{u2}$`exp -`- ${p}_{u7}$ ${q}_{i3}$`exp atan -`${p}_{u2}$`exp -`${p}_{u2}$`exp atan -`${p}_{u2}$`exp exp -`${p}_{u2}$`exp atan -`${p}_{u2}$`-`${p}_{u2}$ ${p}_{u7}$`-`${p}_{u7}$ ${p}_{u2}$ ${p}_{u7}$`-`${p}_{u7}$ ${p}_{u2}$`-`${p}_{u7}$ ${p}_{u2}$

**EMF-4:** `+`${p}_{u4}$`exp cos`${q}_{i6}$

**EMF-5:** `exp atan - exp exp`${p}_{u1}$–${q}_{i9}$

**EMF-6:** `exp atan +`${q}_{i4}$`+ +`${p}_{u3}$`atan atan`${p}_{u2}$`inv`${p}_{u2}$

**EMF-7:** `+ atan exp`${q}_{i2}$`exp cos`${p}_{u2}$

**EMF-8:** `+ exp`${p}_{u3}$`exp cos sin exp`–`atan`${q}_{i0}$

**EMF-9:** `exp inv cos cos exp cos exp atan - * One exp *`${q}_{i6}$`cos cos exp *`${p}_{u7}$${q}_{i2}$`exp atan - * One exp *`${q}_{i6}$`cos`${p}_{u4}$${p}_{u0}$

**EMF-10:** `exp atan + * atan + + atan + * atan + +`${q}_{i5}$`atan +`${q}_{i5}$`atan - inv`${p}_{u8}$`*`${q}_{i5}$`inv inv`${p}_{u8}$`inv`${p}_{u8}$`+`${q}_{i5}$`atan - inv`${p}_{u8}$`*`${q}_{i5}$`inv inv`${p}_{u8}$`inv`${p}_{u8}$`atan +`${q}_{i5}$`atan - inv`${p}_{u8}$`*`${q}_{i5}$`inv +`${q}_{i5}$`atan`${q}_{i5}$`inv`${p}_{u8}$`+`${q}_{i5}$`atan - inv`${p}_{u8}$`*`${q}_{i5}$`inv inv`${p}_{u8}$`inv`${p}_{u8}$

**Figure 2.** Tree representation of the best individual (`EMF-4`) achieved for the MovieLens dataset. The correspondence between node symbols and operators is defined in Table 1.

**Figure 3.** Tree representation of the best individual achieved for the FilmTrust dataset (`EMF-10`). The correspondence between node symbols and operators is defined in Table 1.

**Figure 4.** Evolution of the 25th percentile (first quartile) of the population fitness for each run on both the (**a**) MovieLens and (**b**) FilmTrust datasets.

**Figure 5.** Convergence of the evolutionary method on the MovieLens dataset. The plots show the moving average distance ${D}_{5}\left(n\right)$ with window width $W=5$ and $p=2$ during the 150 generations of the 10 executions.

**Figure 6.** Convergence of the evolutionary method on the FilmTrust dataset. The plots show the moving average distance ${D}_{5}\left(n\right)$ with window width $W=5$ and $p=2$ during the 150 generations of the 10 executions.

**Figure 7.** Graphical representation of the distances between the best individuals of each execution of EMF for the (**a**) MovieLens and (**b**) FilmTrust datasets. The length of each edge is proportional to the ${L}^{2}$ distance between the corresponding aggregate functions. The mean distance in the MovieLens graph is 70% larger than the mean distance in the FilmTrust graph.

Operator | Arity | Function | Symbol |
---|---|---|---|
Sine | 1 | $\mathrm{sin}\left(x\right)$ | sin |
Cosine | 1 | $\mathrm{cos}\left(x\right)$ | cos |
Arctangent | 1 | $\mathrm{arctan}\left(x\right)$ | atan |
Exponential | 1 | $\mathrm{exp}\left(x\right)$ | exp |
Logarithm | 1 | $\mathrm{log}\left(x\right)$ | log |
Inverse | 1 | $\frac{1}{x}$ | inv |
Sign | 1 | $-x$ | – |
Addition | 2 | $x+y$ | + |
Subtraction | 2 | $x-y$ | - |
Multiplication | 2 | $x\times y$ | * |
Power | 2 | ${x}^{y}$ | pow |
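The individuals in Appendix A are written in prefix notation using the symbols above. As an illustration, a prefix expression can be evaluated with a small recursive walker. This is a minimal sketch, not the authors' implementation, and it rests on several assumptions: the token names `p4`, `q6` (for ${p}_{u4}$, ${q}_{i6}$) and `neg` (for the en-dash sign symbol) are hypothetical spellings chosen for this example, and the operators are unprotected, so `log` of a non-positive value or `inv` of zero raises an exception. Whether the original method uses protected variants is not stated in this text.

```python
import math

# Arity and implementation of each symbol in the operator table.
# "neg" is a hypothetical spelling of the en-dash sign symbol, used here
# so that it cannot be confused with binary subtraction "-".
OPERATORS = {
    "sin": (1, math.sin),
    "cos": (1, math.cos),
    "atan": (1, math.atan),
    "exp": (1, math.exp),
    "log": (1, math.log),
    "inv": (1, lambda x: 1.0 / x),
    "neg": (1, lambda x: -x),
    "+": (2, lambda x, y: x + y),
    "-": (2, lambda x, y: x - y),
    "*": (2, lambda x, y: x * y),
    "pow": (2, lambda x, y: x ** y),
}
CONSTANTS = {"Zero": 0.0, "One": 1.0}


def evaluate(tokens, p_u, q_i):
    """Evaluate a prefix-notation token list for one user/item factor pair."""

    def walk(pos):
        token = tokens[pos]
        if token in OPERATORS:
            arity, fn = OPERATORS[token]
            pos += 1
            args = []
            for _ in range(arity):
                value, pos = walk(pos)
                args.append(value)
            return fn(*args), pos
        if token in CONSTANTS:
            return CONSTANTS[token], pos + 1
        # Hypothetical terminal spelling: "p4" -> p_u[4], "q6" -> q_i[6].
        vector = p_u if token[0] == "p" else q_i
        return vector[int(token[1:])], pos + 1

    value, end = walk(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after a complete expression")
    return value


# Made-up factor vectors, used only to exercise the evaluator.
p_u = [0.1 * k for k in range(10)]
q_i = [0.2 * k for k in range(10)]

# The individual `+ p_u4 exp cos q_i6` from Appendix A, i.e. p_u4 + exp(cos(q_i6)).
prediction = evaluate("+ p4 exp cos q6".split(), p_u, q_i)
```

Reading the expression left to right, each operator consumes as many sub-expressions as its arity, which is why no parentheses are needed in the appendix listings.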

Dataset | #users | #items | #ratings | Rating Scale |
---|---|---|---|---|
MovieLens | 943 | 1682 | 100,000 | 1–5 |
FilmTrust | 1508 | 2071 | 35,497 | 0.5–4.0 |

**Table 3.** Mean Absolute Error (MAE) and Mean Squared Error (MSE) in predictions for the MovieLens dataset.

Method | MAE | MSE |
---|---|---|
PMF | 0.7225 | 0.8492 |
BiasedMF | 0.7160 | 0.8406 |
NMF | 0.7672 | 0.9867 |
BNMF | 0.7500 | 0.8860 |
EMF-1 | 0.7256 | 0.8728 |
EMF-2 | 0.7195 | 0.8332 |
EMF-3 | 0.7210 | 0.8427 |
EMF-4 | 0.7197 | 0.8282 |
EMF-5 | 0.7195 | 0.8592 |
EMF-6 | 0.7220 | 0.8377 |
EMF-7 | 0.7255 | 0.8721 |
EMF-8 | 0.7193 | 0.8442 |
EMF-9 | 0.7161 | 0.8441 |
EMF-10 | 0.7163 | 0.8381 |
EMF (best) | 0.7161 | 0.8282 |
EMF (worst) | 0.7256 | 0.8728 |
EMF (avg) | 0.7205 | 0.8472 |
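The summary rows and the comparison against BiasedMF can be recomputed directly from the per-run values in Table 3. The snippet below is a small sketch for checking the figures; the variable names are illustrative and the numbers are copied from the table.

```python
from statistics import mean

# (MAE, MSE) for each EMF run on MovieLens, copied from Table 3.
EMF_RUNS = {
    "EMF-1": (0.7256, 0.8728),
    "EMF-2": (0.7195, 0.8332),
    "EMF-3": (0.7210, 0.8427),
    "EMF-4": (0.7197, 0.8282),
    "EMF-5": (0.7195, 0.8592),
    "EMF-6": (0.7220, 0.8377),
    "EMF-7": (0.7255, 0.8721),
    "EMF-8": (0.7193, 0.8442),
    "EMF-9": (0.7161, 0.8441),
    "EMF-10": (0.7163, 0.8381),
}
BIASED_MF = (0.7160, 0.8406)  # best baseline in Table 3

mae_values = [mae for mae, _ in EMF_RUNS.values()]
mse_values = [mse for _, mse in EMF_RUNS.values()]

# Run with the lowest MSE, and runs whose MSE is below BiasedMF's.
best_mse_run = min(EMF_RUNS, key=lambda run: EMF_RUNS[run][1])
beats_biased_mf_on_mse = [
    run for run, (_, mse) in EMF_RUNS.items() if mse < BIASED_MF[1]
]

# Summary rows: each column is aggregated independently, so the "best"
# MAE and the "best" MSE may come from different runs.
summary = {
    "best": (min(mae_values), min(mse_values)),
    "worst": (max(mae_values), max(mse_values)),
    "avg": (mean(mae_values), mean(mse_values)),
}

print(best_mse_run)            # EMF-4
print(beats_biased_mf_on_mse)  # ['EMF-2', 'EMF-4', 'EMF-6', 'EMF-10']
```

Note that the "EMF (best)" row combines the lowest MAE (from `EMF-9`) with the lowest MSE (from `EMF-4`), so it does not correspond to a single run.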

Method | MAE | MSE |
---|---|---|
PMF | 0.7514 | 1.1321 |
BiasedMF | 0.6277 | 0.7050 |
NMF | 0.7950 | 1.3710 |
BNMF | 0.6598 | 0.6987 |
EMF-1 | 0.6046 | 0.6705 |
EMF-2 | 0.6114 | 0.6653 |
EMF-3 | 0.6013 | 0.6778 |
EMF-4 | 0.6303 | 0.6780 |
EMF-5 | 0.6109 | 0.6846 |
EMF-6 | 0.6087 | 0.6808 |
EMF-7 | 0.6108 | 0.6652 |
EMF-8 | 0.6075 | 0.6672 |
EMF-9 | 0.6209 | 0.7050 |
EMF-10 | 0.5993 | 0.6581 |
EMF (best) | 0.5993 | 0.6581 |
EMF (worst) | 0.6303 | 0.7050 |
EMF (avg) | 0.6105 | 0.6752 |

