## Abstract


## 1. Introduction

#### 1.1. Problem Statement

- 1.1. Original-by-original SI is the task of identifying the source cameras of a set of “original images”, i.e., images coming directly from smartphones; see the arrow labeled “Classification (1)” in Figure 1b.
- 1.2. Social-by-original SI is the task of identifying the source cameras of a given set of “shared images”; see the arrow labeled “Classification (3)” in Figure 1c. In this case, the “original images” are the input data used to define the smartphone camera fingerprints.

- 2.1. Intra-layer UPL is the task of linking a given set of user profiles within the same SN by using “shared images”; see the arrows labeled “Classification (2)” on Facebook and WhatsApp in Figure 1b. Through this task, profiles that share images from the same source are linked within the same SN.
- 2.2. Inter-layer UPL is the task of linking a set of user profiles across different SNs by using “shared images”; see the arrow labeled “Classification (4)” in Figure 1c. Through this task, profiles from different SNs that share images from the same source are linked.
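
The linking rule in items 2.1 and 2.2 can be illustrated with a minimal sketch. It assumes that every shared image has already been assigned a source smartphone by some classifier; the function name `link_profiles` and the input layout are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def link_profiles(shared_images):
    """Link profiles that share images attributed to the same smartphone.

    shared_images: iterable of (profile_id, predicted_smartphone_id) pairs,
    one per shared image. The smartphone labels are assumed to come from a
    classifier; they are inputs here, not computed.
    """
    profiles_by_phone = defaultdict(set)
    for profile_id, phone_id in shared_images:
        profiles_by_phone[phone_id].add(profile_id)

    links = set()
    for phone_id, profiles in profiles_by_phone.items():
        # Every pair of distinct profiles with images from this phone is linked.
        for a, b in combinations(sorted(profiles), 2):
            links.add((a, b, phone_id))
    return links


# Hypothetical example: P1 and P2 both share images taken with S1.
example = [("P1", "S1"), ("P2", "S1"), ("P3", "S2"), ("P1", "S1")]
print(link_profiles(example))  # {('P1', 'P2', 'S1')}
```

For the inter-layer case, the same sketch applies if each profile identifier also carries its SN, for example as a `(sn, profile)` tuple.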

#### 1.2. Contribution

## 2. Related Works

## 3. Methodology

#### 3.1. Smartphone Fingerprinting

#### 3.2. Pre-Processing

#### 3.3. Original-By-Original Smartphone Identification and Intra-Layer User Profile Linking

#### 3.4. Social-By-Original Smartphone Identification and Inter-Layer User Profile Linking

## 4. Experimental Results

#### 4.1. Original-By-Original Smartphone Identification Results

#### 4.2. Social-By-Original Smartphone Identification Results

#### 4.3. Intra-Layer User Profile Linking Results

#### 4.4. Inter-Layer User Profile Linking Results

## 5. Discussion

## 6. Conclusions

**Figure 1.** A visual example of the proposed methods: (**a**) domain of the problem, (**b**,**c**) classification-based approaches for smartphone identification (SI) and user profile linking (UPL) by “original” and “shared images”, respectively. The labels (1) to (4) refer to Figure 2 presenting all the combinations of “original” and “shared images”.

**Figure 2.** All the possible combinations of “original” and “shared images” in the proposed methods. The green and magenta rounded arrows from A to A imply classifying images of A, while the blue and red straight arrows from A to B mean that we use the classified images of A to classify the images of B.

**Figure 3.** Original-by-original SI: the “original images” are classified according to the smartphone’s source camera.

**Figure 4.** Intra-layer UPL task: profiles ${P}_{1}$ and ${P}_{2}$ are linked since they share images taken from the same smartphone ${S}_{1}$.

**Figure 5.** Social-by-original SI task based on the classification approach: the classified “original images” are used to train the ANN and classify the “shared images”.

**Figure 6.** Inter-layer UPL task based on the classification approach: to classify the “shared images” on a given social network (SN) (e.g., WhatsApp), the ANN is trained by using the obtained classes of “shared images” on a different SN (e.g., Google Currents).

**Figure 7.** Pairwise similarities of residual noises (RNs): (**a**,**b**) without and (**c**,**d**) with the shared $\kappa$-nearest neighbor, respectively, from left to right for ${\mathcal{D}}_{\mathrm{L}}^{\mathrm{O}}$ with $\kappa = 20$ and ${\mathcal{D}}_{\mathrm{V}}^{\mathrm{O}}$ with $\kappa = 70$.
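
The shared $\kappa$-nearest neighbor refinement mentioned in the Figure 7 caption can be sketched as follows. This section does not give the exact formulation used, so the sketch assumes the common count-based variant: two items are similar in proportion to how many of their $\kappa$ nearest neighbors coincide. The function name and the toy matrix are illustrative only.

```python
import numpy as np


def shared_nearest_neighbor_similarity(similarity, kappa):
    """Count the neighbors two items have in common among their kappa nearest.

    similarity: (n, n) symmetric matrix, larger values mean more similar.
    Returns an (n, n) matrix with entries in [0, 1].
    """
    n = similarity.shape[0]
    sim = similarity.astype(float).copy()
    np.fill_diagonal(sim, -np.inf)  # an item is not its own neighbor

    # Indices of the kappa most similar items for every row.
    neighbors = np.argsort(-sim, axis=1)[:, :kappa]

    # Binary membership matrix: member[i, j] = 1 if j is a neighbor of i.
    member = np.zeros((n, n))
    member[np.arange(n)[:, None], neighbors] = 1.0

    # Dot products between rows count the shared neighbors of each pair.
    return (member @ member.T) / kappa


# Toy data (assumed): two groups of three items with high within-group similarity.
sim = np.full((6, 6), 0.1)
sim[:3, :3] = 0.9
sim[3:, 3:] = 0.9
snn = shared_nearest_neighbor_similarity(sim, kappa=2)
print(snn[0, 1], snn[0, 3])  # 0.5 0.0
```

In the toy example, items from different groups share no neighbors, so their similarity drops to zero, which is the kind of contrast enhancement the figure compares.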

**Figure 8.** Results (%) of original-by-original SI by using different methods on ${\mathcal{D}}_{\mathrm{V}}^{\mathrm{O}}$ with the RN resolution $1024\times 1024$.

**Figure 9.** Results (%) of social-by-original SI when systematically increasing the number of neurons in the hidden layer of the ANN. The images in ${\mathcal{D}}_{\mathrm{L}}^{\mathrm{G}}$ are classified by the obtained classes of images in ${\mathcal{D}}_{\mathrm{L}}^{\mathrm{O}}$ and the trained ANN.

Phone ID | Brand | Model | Resolution |
---|---|---|---|
S1 | LG | Nexus 4 | $3264\times 2448$ |
S2 | Samsung | Galaxy S2 | $3264\times 2448$ |
S3 | Apple | iPhone 6+ | $3264\times 2448$ |
S4 | LG | Nexus 5 | $3264\times 2448$ |
S5 | Huawei | Y550 | $2592\times 1944$ |
S6 | Apple | iPhone 5 | $3264\times 2448$ |
S7 | Motorola | Moto G | $2592\times 1456$ |
S8 | Samsung | Galaxy S4 | $4128\times 3096$ |
S9 | LG | G3 | $4160\times 3120$ |
S10 | LG | Nexus 5 | $3264\times 2448$ |
S11 | Sony | Xperia Z3 | $5248\times 3936$ |
S12 | Samsung | Samsung S3 | $3264\times 2448$ |
S13 | HTC | One S | $3264\times 2448$ |
S14 | LG | Nexus 5 | $3264\times 2448$ |
S15 | Apple | iPhone 6 | $3264\times 2448$ |
S16 | Samsung | Galaxy S2 | $3264\times 2448$ |
S17 | Nokia | Lumia 625 | $2592\times 1456$ |
S18 | Apple | iPhone 5S | $3264\times 2448$ |

Dataset | Lowest Resolution | Highest Resolution |
---|---|---|
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{O}}$ | $960\times 544$ | $5248\times 3936$ |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{G}}$ | $960\times 544$ | $5248\times 3936$ |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{W}}$ | $960\times 544$ | $1600\times 1200$ |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{FH}}$ | $960\times 544$ | $2048\times 1536$ |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{T}}$ | $960\times 544$ | $1280\times 960$ |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{O}}$ | $960\times 720$ | $5248\times 3936$ |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{W}}$ | $960\times 720$ | $1280\times 960$ |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{FH}}$ | $960\times 720$ | $2048\times 1536$ |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{FL}}$ | $1040\times 584$ | $1312\times 984$ |

Type | Multi-Layer Perceptron (MLP) |
---|---|
Number of layers | 2 |
Neurons in input layer | 900 for ${\mathcal{D}}_{\mathrm{L}}$; 7480 for ${\mathcal{D}}_{\mathrm{V}}$ |
Neurons in hidden layer | 50 |
Neurons in output layer | 18 for ${\mathcal{D}}_{\mathrm{L}}$; 35 for ${\mathcal{D}}_{\mathrm{V}}$ |
Learning rule | Back Propagation (BP) |
Training function | trainscg |
Activation function | logsig |
Error | Mean Squared Error (MSE) |
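
To make the architecture in the table above concrete, here is a minimal NumPy sketch of the forward pass and error for the ${\mathcal{D}}_{\mathrm{L}}$ configuration (900 inputs, 50 hidden neurons, 18 outputs). It is an illustration under stated assumptions, not the authors' implementation: the class name `TwoLayerMLP`, the weight initialisation, and the use of `logsig` on both layers are assumptions, and training with scaled conjugate gradient (MATLAB's `trainscg`) is not reproduced.

```python
import numpy as np


def logsig(x):
    """Logistic sigmoid, the transfer function MATLAB calls logsig."""
    return 1.0 / (1.0 + np.exp(-x))


class TwoLayerMLP:
    """Hidden layer plus output layer, both with logsig (an assumption)."""

    def __init__(self, n_inputs, n_hidden, n_outputs, seed=0):
        rng = np.random.default_rng(seed)
        # Small random initial weights; the initialisation scheme is assumed.
        self.w1 = rng.normal(0.0, 0.01, size=(n_inputs, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.01, size=(n_hidden, n_outputs))
        self.b2 = np.zeros(n_outputs)

    def forward(self, x):
        hidden = logsig(x @ self.w1 + self.b1)
        return logsig(hidden @ self.w2 + self.b2)


def mse(predictions, targets):
    """Mean squared error over all outputs and samples."""
    return float(np.mean((predictions - targets) ** 2))


# Layer sizes from the table for D_L; the input batch is random placeholder data.
net = TwoLayerMLP(n_inputs=900, n_hidden=50, n_outputs=18)
batch = np.random.default_rng(1).normal(size=(4, 900))
targets = np.eye(18)[[0, 3, 3, 17]]  # one-hot smartphone labels (hypothetical)
outputs = net.forward(batch)
print(outputs.shape, mse(outputs, targets))
```

The predicted smartphone for each row would be `outputs.argmax(axis=1)`; for ${\mathcal{D}}_{\mathrm{V}}$, the sizes change to 7480 inputs and 35 outputs.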

**Table 4.** Results (%) of resizing versus cropping the RNs in original-by-original SI on ${\mathcal{D}}_{\mathrm{L}}^{\mathrm{O}}$, for different image resolutions.

Size | Resizing 𝓢𝓔 | Resizing 𝓢𝓟 | Resizing 𝓐𝓡𝓘 | Resizing 𝓕 | Resizing 𝓟 | Cropping * 𝓢𝓔 | Cropping * 𝓢𝓟 | Cropping * 𝓐𝓡𝓘 | Cropping * 𝓕 | Cropping * 𝓟 |
---|---|---|---|---|---|---|---|---|---|---|
$1536\times 1536$ | 0.91 | 0.99 | 0.88 | 0.88 | 0.95 | n/a | n/a | n/a | n/a | n/a |
$1280\times 1024$ | 0.89 | 0.99 | 0.85 | 0.86 | 0.94 | n/a | n/a | n/a | n/a | n/a |
$1024\times 1024$ | 0.91 | 0.99 | 0.90 | 0.91 | 0.96 | n/a | n/a | n/a | n/a | n/a |
$960\times 544$ | 0.90 | 0.99 | 0.87 | 0.88 | 0.95 | 0.91 | 0.99 | 0.89 | 0.90 | 0.95 |
$512\times 512$ | 0.90 | 0.99 | 0.87 | 0.88 | 0.94 | 0.85 | 0.98 | 0.81 | 0.82 | 0.89 |
$256\times 256$ | 0.58 | 0.97 | 0.55 | 0.57 | 0.75 | 0.76 | 0.98 | 0.74 | 0.75 | 0.87 |
$128\times 128$ | 0.18 | 0.94 | 0.12 | 0.17 | 0.37 | 0.43 | 0.96 | 0.39 | 0.42 | 0.66 |

\* The highest resolution for cropping RNs is $960\times 544$ px, based on Table 2.
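
The two ways of bringing RNs to a common size compared in Table 4 can be sketched as below. The crop position and the interpolation method are not stated in this section, so a centered crop and nearest-neighbor resizing are assumptions made only for illustration, as are the function names.

```python
import numpy as np


def center_crop(rn, height, width):
    """Cut a height x width window from the middle of a 2-D residual noise."""
    h, w = rn.shape
    if height > h or width > w:
        # A crop cannot exceed the image, hence the limit in the footnote.
        raise ValueError("crop larger than the residual noise")
    top = (h - height) // 2
    left = (w - width) // 2
    return rn[top:top + height, left:left + width]


def resize_nearest(rn, height, width):
    """Resize a 2-D residual noise by nearest-neighbor sampling."""
    h, w = rn.shape
    rows = (np.arange(height) * h) // height
    cols = (np.arange(width) * w) // width
    return rn[rows[:, None], cols]


# Small placeholder array standing in for a residual noise.
rn = np.arange(24).reshape(4, 6)
print(center_crop(rn, 2, 2))     # [[ 8  9] [14 15]]
print(resize_nearest(rn, 2, 3))  # [[ 0  2  4] [12 14 16]]
```

Resizing works for any target size, including enlargement, whereas cropping is only defined up to the smallest image in the dataset, which is why the cropping columns are empty above $960\times 544$.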

Dataset | 𝓢𝓔 | 𝓢𝓟 | 𝓐𝓡𝓘 | 𝓕 | 𝓟 |
---|---|---|---|---|---|
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{O}}$ | 0.91 | 0.99 | 0.90 | 0.91 | 0.96 |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{O}}$ | 0.84 | 0.99 | 0.84 | 0.85 | 0.894 |

Dataset | 𝓢𝓔 | 𝓢𝓟 | 𝓐𝓡𝓘 | 𝓕 | 𝓟 |
---|---|---|---|---|---|
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{G}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{O}}$ | 0.92 | 0.99 | 0.91 | 0.91 | 0.97 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{W}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{O}}$ | 0.85 | 0.99 | 0.82 | 0.83 | 0.92 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{FH}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{O}}$ | 0.85 | 0.99 | 0.82 | 0.83 | 0.92 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{T}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{O}}$ | 0.86 | 0.99 | 0.83 | 0.84 | 0.93 |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{W}}-{\mathcal{D}}_{\mathrm{V}}^{\mathrm{O}}$ | 0.81 | 0.99 | 0.79 | 0.80 | 0.91 |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{FH}}-{\mathcal{D}}_{\mathrm{V}}^{\mathrm{O}}$ | 0.80 | 0.99 | 0.77 | 0.77 | 0.90 |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{FL}}-{\mathcal{D}}_{\mathrm{V}}^{\mathrm{O}}$ | 0.78 | 0.99 | 0.75 | 0.75 | 0.89 |

Dataset | ${\mathcal{D}}_{\mathrm{L}}^{\mathrm{G}}$ | ${\mathcal{D}}_{\mathrm{L}}^{\mathrm{W}}$ | ${\mathcal{D}}_{\mathrm{L}}^{\mathrm{FH}}$ | ${\mathcal{D}}_{\mathrm{L}}^{\mathrm{T}}$ | ${\mathcal{D}}_{\mathrm{V}}^{\mathrm{W}}$ | ${\mathcal{D}}_{\mathrm{V}}^{\mathrm{FH}}$ | ${\mathcal{D}}_{\mathrm{V}}^{\mathrm{FL}}$ |
---|---|---|---|---|---|---|---|
𝓢𝓔 | 0.91 | 0.87 | 0.88 | 0.87 | 0.75 | 0.73 | 0.43 |
𝓢𝓟 | 0.99 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 |
𝓐𝓡𝓘 | 0.88 | 0.84 | 0.86 | 0.86 | 0.74 | 0.71 | 0.40 |
𝓕 | 0.89 | 0.86 | 0.85 | 0.85 | 0.75 | 0.71 | 0.42 |
𝓟 | 0.96 | 0.94 | 0.93 | 0.92 | 0.84 | 0.80 | 0.58 |

Dataset | 𝓢𝓔 | 𝓢𝓟 | 𝓐𝓡𝓘 | 𝓕 | 𝓟 |
---|---|---|---|---|---|
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{W}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{G}}$ | 0.90 | 0.99 | 0.87 | 0.88 | 0.96 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{FH}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{G}}$ | 0.90 | 0.99 | 0.87 | 0.87 | 0.95 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{T}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{G}}$ | 0.92 | 0.99 | 0.90 | 0.91 | 0.96 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{G}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{W}}$ | 0.91 | 0.99 | 0.90 | 0.90 | 0.96 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{FH}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{W}}$ | 0.86 | 0.99 | 0.83 | 0.83 | 0.94 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{T}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{W}}$ | 0.90 | 0.99 | 0.88 | 0.87 | 0.95 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{G}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{FH}}$ | 0.90 | 0.99 | 0.88 | 0.88 | 0.95 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{W}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{FH}}$ | 0.86 | 0.98 | 0.82 | 0.83 | 0.94 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{T}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{FH}}$ | 0.87 | 0.99 | 0.84 | 0.85 | 0.93 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{G}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{T}}$ | 0.90 | 0.99 | 0.88 | 0.90 | 0.95 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{W}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{T}}$ | 0.87 | 0.98 | 0.85 | 0.85 | 0.94 |
${\mathcal{D}}_{\mathrm{L}}^{\mathrm{FH}}-{\mathcal{D}}_{\mathrm{L}}^{\mathrm{T}}$ | 0.87 | 0.98 | 0.85 | 0.86 | 0.94 |

Dataset | 𝓢𝓔 | 𝓢𝓟 | 𝓐𝓡𝓘 | 𝓕 | 𝓟 |
---|---|---|---|---|---|
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{FH}}-{\mathcal{D}}_{\mathrm{V}}^{\mathrm{W}}$ | 0.80 | 0.99 | 0.78 | 0.79 | 0.90 |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{FL}}-{\mathcal{D}}_{\mathrm{V}}^{\mathrm{W}}$ | 0.80 | 0.99 | 0.78 | 0.78 | 0.88 |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{W}}-{\mathcal{D}}_{\mathrm{V}}^{\mathrm{FH}}$ | 0.78 | 0.99 | 0.76 | 0.77 | 0.87 |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{FL}}-{\mathcal{D}}_{\mathrm{V}}^{\mathrm{FH}}$ | 0.77 | 0.99 | 0.76 | 0.76 | 0.87 |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{W}}-{\mathcal{D}}_{\mathrm{V}}^{\mathrm{FL}}$ | 0.61 | 0.99 | 0.58 | 0.59 | 0.72 |
${\mathcal{D}}_{\mathrm{V}}^{\mathrm{FH}}-{\mathcal{D}}_{\mathrm{V}}^{\mathrm{FL}}$ | 0.61 | 0.99 | 0.59 | 0.60 | 0.73 |

