# Recommendation System Using Autoencoders

## Abstract

## 1. Introduction

## 2. Related Work

## 3. Methodology

#### 3.1. Business Understanding

- Increase the number of sales;
- Increase the company’s revenue;
- Encourage engagement and activity on products and services;
- Gain competitive advantage;
- Calibrate user preferences;
- Make personalized recommendations;
- Find the recommendation algorithm and parameterization that leads to the highest overall performance of the product recommendation system.

#### 3.2. Data Understanding

#### 3.3. Data Preparation

#### 3.4. Modelling

#### 3.4.1. Autoencoder Overview

- **Encoder** compresses the input information into a different latent space;
- **Code** is the part of the network that represents the compressed input, which is fed to the decoder;
- **Decoder** does the reverse work and reconstructs the original information, moving from the latent space back to the original information space.
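The three components above can be illustrated with a minimal forward pass. This is only a sketch under stated assumptions: the layer sizes, the random untrained weights, the sample ratings, and the helper names (`dense`, `init_layer`, `autoencoder`) are hypothetical and not taken from the paper. The activations follow the configuration reported here (Tanh for the encoder layer and latent space, Linear for the output layer).

```python
import math
import random


def dense(x, weights, biases, activation):
    """Apply one fully connected layer: activation(W @ x + b)."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialisation only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(z):
    """Identity activation used for the output layer."""
    return z


rng = random.Random(0)
n_items, n_hidden, n_code = 8, 4, 2  # hypothetical sizes, not from the paper

enc_w, enc_b = init_layer(n_items, n_hidden, rng)    # encoder layer
code_w, code_b = init_layer(n_hidden, n_code, rng)   # code (latent space)
dec_w, dec_b = init_layer(n_code, n_items, rng)      # decoder / output layer


def autoencoder(ratings):
    """Return the latent code and the reconstruction for one user's ratings."""
    hidden = dense(ratings, enc_w, enc_b, math.tanh)   # compress the input
    code = dense(hidden, code_w, code_b, math.tanh)    # compressed representation
    output = dense(code, dec_w, dec_b, linear)         # map back to item space
    return code, output


# Made-up, normalised ratings for a single user (0.0 stands for "not rated").
user_ratings = [0.8, 0.0, 1.0, 0.6, 0.0, 0.4, 0.0, 0.2]
code, reconstruction = autoencoder(user_ratings)
print(len(code), len(reconstruction))
```

The code vector is shorter than the input, which is what forces the network to learn a compressed representation; the reconstruction has the same length as the input so the two can be compared by the loss function.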

#### 3.4.2. Initialization of Parameters

- Loss function: because this is not a classification problem, where binary cross-entropy could be used, the Mean Square Error (MSE) was chosen;
- Optimizer: Adam was selected because it is fast, requires little memory, and is well suited to handling large amounts of data;
- Dropout: it is applied only during training and is automatically disabled at inference time;
- Activation function: the choice was based on the range of values the network works with and on experiments with other activation functions, all of which gave worse loss and RMSE results than Tanh for the encoder layer and latent space combined with Linear for the output layer.
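The choices listed above can be made concrete with a small, framework-free sketch. It shows MSE, dropout that is active only when a training flag is set, and a single Adam update. The dropout rate of 0.1 and the learning rate of 0.0001 are the values reported in this paper; the helper names, the sample numbers, and the Adam constants `beta1`, `beta2` and `eps` (the commonly used defaults) are assumptions, since the paper does not state them.

```python
import math
import random


def mse(predicted, target):
    """Mean Square Error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def dropout(values, rate, training, rng):
    """Inverted dropout: zero units with probability `rate`, only when training."""
    if not training:
        return list(values)  # disabled at inference time
    keep = 1.0 - rate
    # Surviving units are rescaled so the expected activation is unchanged.
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def adam_step(param, grad, state, lr=0.0001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter (beta/eps defaults are assumed)."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad        # first moment
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2   # second moment
    m_hat = state["m"] / (1 - beta1 ** state["t"])              # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


rng = random.Random(0)
activations = [0.5, -0.2, 0.9, 0.1]                    # made-up hidden activations
train_out = dropout(activations, 0.1, True, rng)       # some units may be zeroed
eval_out = dropout(activations, 0.1, False, rng)       # returned unchanged

loss = mse([0.9, 0.1], [1.0, 0.0])                     # made-up prediction/target

state = {"t": 0, "m": 0.0, "v": 0.0}
weight = adam_step(1.0, 2.0, state)                    # first step moves by about lr
```

On the first step the bias-corrected moments cancel the gradient magnitude, so the parameter moves by roughly the learning rate regardless of how large the gradient is.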

#### 3.4.3. Architecture

#### 3.5. Evaluation

## 4. Results and Discussion

## 5. Conclusions and Future Work

## Author Contributions

## Funding

## Conflicts of Interest

## Abbreviations

| Abbreviation | Meaning |
|---|---|
| CF | Collaborative Filtering |
| CRISP-DM | Cross Industry Standard Process for Data Mining |
| CVAE | Collaborative Variational Autoencoder |
| DAE | Deep Autoencoder |
| DM | Data Mining |
| ML | MovieLens |
| MSE | Mean Square Error |
| PCA | Principal Component Analysis |
| RBM | Restricted Boltzmann Machines |
| RMSE | Root Mean Square Error |
| SVD | Singular Value Decomposition |

## References

**Figure 4.** Distribution of ratings in the ML-1M dataset: (**A**) Distribution of the quantity of products by a certain number of evaluations; (**B**) Total number of ratings for each scale value [1–5].

| | ML-1M | ML-10M |
|---|---|---|
| Items | 3043 | 8940 |
| Users | 6022 | 69,838 |

| Parameter | Value |
|---|---|
| Activation Function | Tanh for encoder layer and latent space; Linear for output layer |
| Optimizer | Adam with a learning rate of 0.0001 |
| Loss Function | Mean Square Error (MSE) |
| Dropout | 0.1 |

**Table 3.** Comparison of Root Mean Square Error (RMSE) for different numbers of layers in the autoencoder.

| Architecture | RMSE (ML-1M) | RMSE (ML-10M) |
|---|---|---|
| 1 Layer + 1 Latent Space | 0.029 | 0.010 |
| 2 Layers + 1 Latent Space | 0.038 | 0.012 |
| 3 Layers + 1 Latent Space | 0.050 | 0.015 |

| Model | RMSE (ML-1M) | RMSE (ML-10M) |
|---|---|---|
| SVD | 0.184 | 0.176 |
| Autoencoder | 0.029 | 0.010 |
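
For reference, the RMSE metric used in the comparisons above can be computed as in the following sketch. It is an illustration only: the function name `rmse`, the sample values, and the decision to skip unobserved ratings (marked as `None`) are assumptions, since the exact evaluation protocol and rating scale behind the reported figures are not described in this section.

```python
import math


def rmse(predicted, actual):
    """Root Mean Square Error over entries that have a true rating.

    Entries where `actual` is None are treated as unobserved and skipped
    (an assumption made for this sketch).
    """
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual) if a is not None]
    if not squared:
        raise ValueError("no observed ratings to evaluate")
    return math.sqrt(sum(squared) / len(squared))


# Made-up example: the second item has no true rating and is ignored.
actual = [4.0, None, 5.0, 3.0]
predicted = [3.5, 2.0, 5.0, 4.0]
score = rmse(predicted, actual)
print(round(score, 4))
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones, so a lower value indicates reconstructions that are consistently close to the true ratings.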

