# PRIPRO: A Comparison of Classification Algorithms for Managing Receiving Notifications in Smart Environments

## 1. Introduction

## 2. Related Work

## 3. Methods and Materials

#### 3.1. Machine Learning Concept

#### 3.1.1. Classification Task

#### 3.1.2. Naive Bayes Algorithm

#### 3.1.3. J48 Algorithm

#### 3.1.4. K-Nearest Neighbors Algorithm

#### 3.1.5. Multilayer Perceptron Algorithm

#### 3.1.6. PRISM Algorithm

#### 3.1.7. Support Vector Machine Algorithm

#### 3.2. Statistical Hypothesis Tests

- Normality test, which evaluates the assumption that a sample was drawn from a normally distributed population [24];
- Correlation test, which analyzes sample data to determine whether two variables are related to each other [25];
- Association test, which reports on the statistical association between variables [26];
- Variance test, which compares the means of different populations [27];
- Central tendency test, which uses measures of central tendency (arithmetic mean, median) to test a probability distribution [28].

- Categorization, which indicates whether the test is parametric or nonparametric. Parametric tests evaluate the null hypothesis from specific parameters of the data (mean, standard deviation, etc.). Nonparametric tests evaluate the null hypothesis from distribution types and group relationships [29];
- Variable, which indicates the types of variables the test supports;
- Group, which indicates whether the comparison involves an individual group, a pair of groups, or multiple groups. In this context, the classification algorithms are the groups;
- Pairing, which indicates whether the test is paired or unpaired. In paired tests, the data used to train the predictive model are also used to test it, whereas unpaired tests use one dataset for training and another for testing [30].

- ${R}_{i}$ and ${R}_{j}$ are the sums of the ranking positions of algorithms $i$ and $j$;
- $|{R}_{i}-{R}_{j}|$ is the absolute difference between the rank sums of the two algorithms;
- $Z\left(\frac{\alpha}{k\left(k-1\right)}\right)\sqrt{\frac{N\times k\left(k+1\right)}{6}}$ is the critical difference.
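The critical difference can be evaluated numerically. The following is a minimal sketch, not the authors' implementation. It assumes that $Z(\cdot)$ denotes the upper-tail quantile of the standard normal distribution, that $k$ is the number of algorithms and $N$ the number of samples, and that two algorithms are taken to differ when $|{R}_{i}-{R}_{j}|$ reaches the critical difference. The function names and the rank sums in the example are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def critical_difference(alpha, k, n):
    """Z(alpha / (k(k-1))) * sqrt(n * k * (k+1) / 6).

    Assumption: Z is read as the upper-tail standard normal quantile.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / (k * (k - 1)))
    return z * sqrt(n * k * (k + 1) / 6.0)


def significant_pairs(rank_sums, alpha, n):
    """Return the algorithm pairs whose rank-sum difference reaches the critical difference."""
    k = len(rank_sums)
    cd = critical_difference(alpha, k, n)
    return [
        (a, b)
        for a, b in combinations(rank_sums, 2)
        if abs(rank_sums[a] - rank_sums[b]) >= cd
    ]


# Hypothetical rank sums for three algorithms ranked on n = 10 samples.
example = {"A": 12.0, "B": 20.0, "C": 28.0}
print(critical_difference(alpha=0.05, k=3, n=10))
print(significant_pairs(example, alpha=0.05, n=10))
```

In this illustration only the pair with the largest rank-sum gap is flagged; the two adjacent pairs fall below the critical difference.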

#### 3.3. Artificial Datasets

- Target: classifies which user the notification must be delivered to;
- Period: classifies at what time of day the notification must be delivered;
- Setting: classifies with which device configuration the notification must be delivered.

## 4. Notification Management Architecture

- Customer: He arrives at the dealership in the middle of the morning to look at cars in the showroom. He is attended to by the employee, who then directs him to the sales department to negotiate with the owner. He leaves late in the morning and returns midway through the afternoon. He is again attended to by the employee in the showroom, who directs him to the sales department to continue negotiating with the owner. The deal is closed at the owner’s office. He leaves late in the afternoon.
- Employee: He arrives at the dealership early in the morning to open it and perform his tasks in the showroom. He takes a break in the kitchen. He attends to the customer and refers him to the owner in the sales department for negotiation. He takes a lunch break in the kitchen. He reopens the dealership in the afternoon and performs tasks in the showroom. He covers for the owner on sales tasks and leaves late in the afternoon.
- Owner: He arrives at the dealership, already opened by the employee, early in the morning and performs his tasks in the sales department. He performs tasks in his office and soon after goes to the sales department to attend to the customer referred by the employee. He takes a break in the kitchen. In the early afternoon, he performs tasks in the sales department. He meets the customer again, and they close the deal in his private office. He takes a break in the kitchen in the early evening. He does some tasks in his office and leaves late in the evening.

## 5. Experiments and Results

- Intel Core i5-5200U;
- CPU: 2.20 GHz;
- RAM: 6.00 GB.

#### 5.1. Artificial Datasets Test

#### 5.2. Adjustable Parameters Test

#### 5.3. CPU Time Test

#### 5.4. Classification Precision Metric Test

#### 5.5. Friedman Test

#### 5.6. Application Scenario Test

## 6. Conclusions

- Extend the comparison to use real data;
- Extend the comparison to include other classification algorithms;
- Extend the comparison to include other predictive attributes;
- Extend the comparison to include other classifying attributes;
- Extend the comparison to include other metrics and statistical tests;
- Develop the notification manager PRIPRO and PRINM;
- Perform tests on real application scenarios.

**Figure 1.** Privacy manager model. Adapted from UBIPRI (2015) [6].

Work | Authors | Uses Classification Algorithms | Uses Artificial Data | Uses Algorithm Comparison | Uses Statistical Hypothesis Tests |
---|---|---|---|---|---|
[7] | Smith (2014) | Yes | No | No | No |
[8] | Corno (2015) | Yes | Yes | Yes | No |
[11] | Fraser (2017) | No | Yes | No | No |
[10] | Ghodse (2018) | No | No | No | No |
[12] | Martins (2018) | Yes | Yes | Yes | Partial |
[9] | Silva (2019) | Yes | No | No | No |
This Work | This Work | Yes | Yes | Yes | Yes |

Name | Categorization | Variable | Group | Pairing |
---|---|---|---|---|
Z-test | Parametric | Quantitative | Individual | - |
T-test | Parametric | Quantitative | Individual | - |
Wilcoxon for 1 sample | Nonparametric | Quantitative, ordinal qualitative | Individual | - |
T-test for 2 samples | Parametric | Quantitative, nominal | Pairs | Unpaired |
T-test for 2 samples with different variances | Parametric | Quantitative, nominal | Pairs | Unpaired |
Paired T-test | Parametric | Quantitative, nominal | Pairs | Paired |
ANOVA | Parametric | Quantitative, nominal | Multiple | Unpaired |
Welch’s ANOVA | Parametric | Quantitative, nominal | Multiple | Unpaired |
ANOVA for repeated measures | Parametric | Quantitative, nominal | Multiple | Paired |
Mann-Whitney | Nonparametric | Quantitative, ordinal qualitative, nominal | Pairs | Unpaired |
Paired Wilcoxon | Nonparametric | Quantitative, ordinal qualitative, nominal | Pairs | Paired |
Kruskal-Wallis | Nonparametric | Quantitative, ordinal qualitative, nominal | Multiple | Unpaired |
Friedman | Nonparametric | Quantitative, ordinal qualitative, nominal | Multiple | Paired |
Test for 1 proportion | Parametric | Nominal | Individual | - |
Test for 2 proportions | Parametric | Nominal | Pairs | Unpaired |

Artificial Dataset | Classification Accuracy | Majority Class |
---|---|---|
Target | 80% | OutTargetNone |
Period | 80% | OutPeriodNone |
Setting | 75% | OutSettingNone |

Artificial Dataset | Rules |
---|---|
Target | Member1 > OutTargetNone |
Target | Member2 > OutTargetNone |
Target | Member3 > OutTargetNone |
Period | Member1 > OutPeriodNone |
Period | Member2 > OutPeriodNone |
Period | Member3 > OutPeriodNone |
Setting | Member1 > OutSettingNone |
Setting | Member2 > OutSettingNone |
Setting | Member3 > OutSettingNone |

Artificial Dataset | Rules |
---|---|
Target | InMember1 > OutMember1 |
Target | InMember2 > OutMember2 |
Target | InMember3 > OutMember3 |
Target | InAll > OutAll |
Period | InMorning > OutMorning |
Period | InAfternoon > OutAfternoon |
Period | InNight > OutMorning |
Period | InDawn > OutMorning |
Setting | Relevance1 > OutCurrent |
Setting | Relevance2 > OutVibrate |
Setting | Relevance3 > OutSilent |
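The rules in the table above can be expressed as simple lookups. The sketch below reads each entry `A > B` as "input value A is labeled with class B", which is an assumption about the notation; the dictionary and function names are hypothetical and only illustrate how one labeled record could be derived from the three rule sets.

```python
# Rule sets copied from the table; "A > B" is read as "input A maps to class B" (assumption).
TARGET_RULES = {
    "InMember1": "OutMember1",
    "InMember2": "OutMember2",
    "InMember3": "OutMember3",
    "InAll": "OutAll",
}
PERIOD_RULES = {
    "InMorning": "OutMorning",
    "InAfternoon": "OutAfternoon",
    "InNight": "OutMorning",
    "InDawn": "OutMorning",
}
SETTING_RULES = {
    "Relevance1": "OutCurrent",
    "Relevance2": "OutVibrate",
    "Relevance3": "OutSilent",
}


def label_notification(target_in, period_in, relevance):
    """Return the class assigned by each rule set for one notification (hypothetical helper)."""
    return {
        "Target": TARGET_RULES[target_in],
        "Period": PERIOD_RULES[period_in],
        "Setting": SETTING_RULES[relevance],
    }


print(label_notification("InAll", "InNight", "Relevance3"))
```

Under this reading, a notification arriving at night or at dawn is labeled for delivery in the morning, as listed in the Period rules.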

Algorithm | Parameters |
---|---|
J48 | Pruning: enabled |
KNN | Distance calculation: Euclidean |
KNN | K: 3 |
SVM | Kernel: linear |
SVM | Cost: 10 |
MLP | Iterations: 500 |
MLP | Learning rate: 0.3 |
MLP | Momentum: 0.2 |
MLP | Hidden layer neurons: attributes + classes |
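To make the KNN parameters above concrete, the following is a minimal pure-Python sketch of a nearest-neighbor vote with Euclidean distance and K = 3. It is not the implementation used in the experiments; the numeric feature encoding and the sample records are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training records.

    `train` is a list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical, numerically encoded records with two features each.
train = [
    ((0.0, 0.0), "OutSilent"),
    ((0.0, 1.0), "OutSilent"),
    ((1.0, 0.0), "OutSilent"),
    ((5.0, 5.0), "OutVibrate"),
    ((5.0, 6.0), "OutVibrate"),
    ((6.0, 5.0), "OutVibrate"),
]
print(knn_predict(train, (0.2, 0.1)))
print(knn_predict(train, (5.5, 5.5)))
```

Because every prediction scans the whole training set, this style of classifier does almost no work at training time and most of its work at prediction time.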

Artificial Dataset | NB | PRISM | J48 | KNN | SVM | MLP |
---|---|---|---|---|---|---|
Target | 0.47 ms | 12.03 ms | 2.19 ms | 0.47 ms | 43.44 ms | 15,815.31 ms |
Period | 0.47 ms | 124.69 ms | 5.63 ms | 0.47 ms | 417.66 ms | 15,969.53 ms |
Setting | 0.94 ms | 48.91 ms | 3.75 ms | 0.16 ms | 472.66 ms | 14,249.53 ms |
Average | 0.62 ms | 61.87 ms | 3.85 ms | 0.36 ms | 311.25 ms | 62,011.45 ms |

Artificial Dataset | NB | PRISM | J48 | KNN | SVM | MLP |
---|---|---|---|---|---|---|
Target | 1.25 ms | 0.16 ms | 0.63 ms | 116.56 ms | 6.25 ms | 2.19 ms |
Period | 1.56 ms | 1.87 ms | 0.31 ms | 123.59 ms | 47.03 ms | 3.13 ms |
Setting | 1.72 ms | 0.47 ms | 0.31 ms | 123.28 ms | 49.06 ms | 1.25 ms |
Average | 1.51 ms | 0.83 ms | 0.41 ms | 121.14 ms | 34.11 ms | 2.19 ms |
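CPU times such as those tabulated above can be collected with a small harness. The sketch below is one possible way to do so and is not the measurement procedure used in the experiments; `mean_cpu_time_ms` and `dummy_training` are hypothetical names, and the dummy workload merely stands in for training or testing a classifier.

```python
import time


def mean_cpu_time_ms(task, repeats=10):
    """Average CPU time of task() in milliseconds over `repeats` calls."""
    total = 0.0
    for _ in range(repeats):
        start = time.process_time()  # process CPU time, excludes sleep
        task()
        total += time.process_time() - start
    return 1000.0 * total / repeats


def dummy_training():
    # Hypothetical stand-in for training or testing a classifier.
    return sum(i * i for i in range(50_000))


print(f"{mean_cpu_time_ms(dummy_training):.2f} ms")
```

Averaging over several repetitions reduces the effect of timer resolution, which matters for sub-millisecond measurements.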

Artificial Dataset | NB | PRISM | J48 | KNN | SVM | MLP |
---|---|---|---|---|---|---|
Target | 100% | 100% | 100% | 99.99% | 100% | 100% |
Period | 80% | 96.87% | 97.49% | 97.22% | 87.53% | 99.96% |
Setting | 85.26% | 98.86% | 99.54% | 99.89% | 86.89% | 100% |

Artificial Dataset | NB | PRISM | J48 | KNN | SVM | MLP |
---|---|---|---|---|---|---|
Target | 3.45 | 3.45 | 3.45 | 3.75 | 3.45 | 3.45 |
Period | 6.0 | 3.85 | 2.15 | 3.0 | 5.0 | 1.0 |
Setting | 6.0 | 4.0 | 3.0 | 2.0 | 5.0 | 1.0 |
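Friedman-style ranks assign position 1 to the best algorithm on a dataset and share the mean position among ties. The following minimal sketch ranks one row of precision values in that way; the function name is hypothetical. Applied to the aggregate Setting precisions it yields the same ordering as the Setting row above. The fractional values in the Target and Period rows suggest that ranks were averaged over repeated runs, which this sketch does not model.

```python
def average_ranks(scores):
    """Rank scores so the highest gets rank 1; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


# Setting precisions in the order NB, PRISM, J48, KNN, SVM, MLP.
setting = [85.26, 98.86, 99.54, 99.89, 86.89, 100.0]
print(average_ranks(setting))
```

The ranks in each row always sum to $k(k+1)/2$, which is 21 for six algorithms, and this offers a quick consistency check on a ranking table.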

Period (time) | Period (shift) | Profile | Environment (place) | Environment (type) | Activity |
---|---|---|---|---|---|
10:10–10:25 | Morning | Guest | Show Room | Public | 1 |
10:25–11:50 | Morning | Guest | Sector of Sales | Private | 2 |
16:00–16:15 | Afternoon | Guest | Show Room | Public | 1 |
16:15–16:30 | Afternoon | Guest | Sector of Sales | Private | 2 |
16:30–18:00 | Afternoon | Basic | Office | Restrict | 3 |

Period (time) | Period (shift) | Profile | Environment (place) | Environment (type) | Activity |
---|---|---|---|---|---|
7:00–9:30 | Morning | Basic | Show Room | Public | 2 |
9:30–10:00 | Morning | Advanced | Kitchen | Private | 1 |
10:00–11:50 | Morning | Basic | Show Room | Public | 2 |
11:50–14:00 | Afternoon | Advanced | Kitchen | Private | 1 |
14:00–16:30 | Afternoon | Basic | Show Room | Public | 2 |
16:30–18:00 | Afternoon | Basic | Sector of Sales | Private | 3 |

Period (time) | Period (shift) | Profile | Environment (place) | Environment (type) | Activity |
---|---|---|---|---|---|
7:30–8:30 | Morning | Administrator | Sector of Sales | Private | 2 |
8:30–10:25 | Morning | Administrator | Office | Restrict | 3 |
10:25–11:50 | Morning | Administrator | Sector of Sales | Private | 2 |
11:50–14:00 | Afternoon | Administrator | Kitchen | Private | 1 |
14:00–16:30 | Afternoon | Administrator | Sector of Sales | Private | 2 |
16:30–18:00 | Afternoon | Administrator | Office | Restrict | 3 |
18:00–19:00 | Night | Administrator | Kitchen | Private | 1 |
19:00–20:00 | Night | Administrator | Office | Restrict | 2 |
20:00–22:00 | Night | Administrator | Office | Restrict | 1 |

