# Persistence Landscapes—Implementing a Dataset Verification Method in Resource-Scarce Embedded Systems

## Abstract

## 1. Introduction

#### 1.1. Contributions

- The first implementation of 0-dimensional Persistence Landscape (PL) analysis in a Resource-Scarce Embedded System (RSES).
- A post-deployment data verification system for ML models based on PL, which is useful for TinyML/Embedded Intelligence.
- A mechanism to verify whether devices are collecting the same dataset, through dataset convergence verification, which is useful in applications such as the Internet of Things (IoT), Wireless Sensor Networks (WSN), and Cyber-Physical Systems (CPS).
- A reduced need for a central system (i.e., a cloud computer) to analyze the entire dataset.

#### 1.2. Organization of the Work

## 2. Topological Data Analysis (TDA)

#### 2.1. Persistence Modules

#### 2.2. Persistence Landscapes

#### 2.3. 0-Dimensional Persistence Landscapes

#### 2.4. Related Work

## 3. Methodology

## 4. Experimental Design

## 5. Results

#### 5.1. Convergence between Devices and Datasets

#### 5.2. The Distance in Landscapes

#### 5.3. Dataset Identification

## 6. Conclusions and Future Work

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

**Figure 1.** Persistence Diagram and its respective peak functions. (**a**) Peak function from the computed piecewise function (Equation (6)). (**b**) Persistence Diagram for the point cloud $\chi$.

**Figure 2.** Landscape function obtained from the Persistence Diagram presented in Figure 1. For this diagram, one obtains two Persistence Landscapes, $\Lambda = \{\lambda_{1}, \lambda_{2}\}$.

**Figure 3.** Peak functions and their respective $\overline{\lambda}$ values. (**a**) Example of 0-dimensional peak functions for an infinity-sign-shaped dataset. (**b**) $\overline{\lambda}$ for peak functions.

**Figure 4.** Graphical depiction of the point clouds used for testing. (**a**) Infinity-sign point cloud. (**b**) Two-circle point cloud. Each point cloud has 2000 points.

**Figure 5.** The training device’s ($\Pi$) landscape mean. (**a**) Mean of 1 barcode. (**b**) Mean of 4 barcodes. (**c**) Mean of 16 barcodes. (**d**) Mean of 32 barcodes. (**e**) Mean of 64 barcodes. (**f**) Mean of 128 barcodes.

**Figure 6.** The testing device’s ($\Psi$) landscape mean. (**a**) Mean of 1 barcode. (**b**) Mean of 4 barcodes. (**c**) Mean of 16 barcodes. (**d**) Mean of 32 barcodes. (**e**) Mean of 64 barcodes. (**f**) Mean of 128 barcodes.

**Figure 7.** The fake device’s ($\Omega$) landscape mean. (**a**) Mean of 1 barcode. (**b**) Mean of 4 barcodes. (**c**) Mean of 16 barcodes. (**d**) Mean of 32 barcodes. (**e**) Mean of 64 barcodes. (**f**) Mean of 128 barcodes.

**Figure 8.** Distance between each device’s landscape and the full point cloud landscape’s mean. (**a**) Distance between the training devices ($\Pi$). (**b**) Distance between the testing devices ($\Psi$). (**c**) Distance between the fake devices ($\Omega$).

**Figure 9.** Percentage of barcodes for each device flagged as not belonging to the original dataset, according to the number of barcodes used in the mean. (**a**) $\Pi$-Device’s flag rate. (**b**) $\Psi$-Device’s flag rate. (**c**) $\Omega$-Device’s flag rate.

