# A Two-Stage Siamese Network Model for Offline Handwritten Signature Verification

## Abstract

## 1. Introduction

- This is a significant attempt to study Chinese signature identification.
- A two-stage Siamese network model is proposed to verify the offline handwritten signature.
- Visualization of the process of feature representation is analyzed.

## 2. Preliminaries

#### 2.1. Related Work

- Most existing methods treat a handwritten signature only as a picture and do not mine the deeper signature style.
- They commonly ignore the imbalanced distribution of positive and negative signatures that often occurs in real scenarios.
- In real scenarios, each writer usually provides only a few signature samples, and genuine and forged signatures are highly similar. Existing models usually generate synthetic data that differ considerably from the real data.

- The proposed model uses a two-stage Siamese network module to verify offline handwritten signatures. This network includes both traditional recognition on the original handwriting and recognition on data-enhanced handwriting, in order to mine the writers' deep signature style.
- It employs the Focal loss to deal with the extreme imbalance between positive and negative offline signatures, which distinguishes it from previous studies.
- It is the first attempt to study Chinese signatures with a real Chinese signature dataset.

#### 2.2. CNN and Siamese Neural Network

#### 2.3. Focal Loss

## 3. Model

#### 3.1. Problem Formulation

**s**signature image. Thus, this signature verification problem can be represented as

#### 3.2. Architecture of the Two-Stage Network

#### 3.3. The Feature Extractor

#### 3.4. The Signature Image Data Enhancement

#### 3.5. Loss Function

#### 3.6. Algorithm Design

**Algorithm 1:** Training process of the proposed algorithm

**Require:** the batch size $m$, the maximum number of epochs $k$, the learning rate $LR$, and the penalty factor $\lambda$.

**Require:** initial weights of the networks $\theta$.

**for** epoch number $= 1 : k$ **do**

- Randomly select $m$ images from the training image dataset: ${x}_{i}$.
- Select the $m$ corresponding genuine images from the preprocessed dataset: ${s}_{i}$.
- Calculate the feature vectors and the loss according to the network weights $\theta$.
- Update the network weights $\theta$ using the gradient

  $${\nabla}_{\theta}\frac{1}{m}\sum_{i=1}^{m}\left[F{L}_{O}({x}_{i},{s}_{i})+\lambda \cdot F{L}_{E}({\tilde{x}}_{i},{\tilde{s}}_{i})\right]$$

**end for**
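The objective inside the gradient of Algorithm 1 can be illustrated with a minimal sketch. It assumes the standard binary focal loss, $FL(p_t) = -\alpha_t (1-p_t)^{\gamma}\log p_t$, for both the original stream ($FL_O$) and the enhanced stream ($FL_E$). The values `gamma=2.0` and `alpha=0.25` are common defaults from the focal loss literature and are assumptions, not settings taken from this paper. The function names and the probability lists `p_orig` and `p_enh` are hypothetical. The default `lam=2.5` follows the empirical value of $\lambda$ reported in Section 4.2. Only the scalar objective is computed here; in practice the gradient step would be handled by an automatic differentiation framework.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single signature pair.

    p: predicted probability that the pair is genuine (label 1).
    y: ground-truth label, 1 (genuine pair) or 0 (forged pair).
    gamma, alpha: assumed defaults, not values from the paper.
    """
    eps = 1e-12
    # Probability and class weight assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # avoid log(0)
    # The factor (1 - p_t)^gamma shrinks the loss of well-classified pairs.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def batch_objective(p_orig, p_enh, labels, lam=2.5):
    """Batch mean of FL_O + lam * FL_E.

    p_orig: predictions from the original-image stream.
    p_enh: predictions from the data-enhanced stream.
    """
    m = len(labels)
    total = 0.0
    for po, pe, y in zip(p_orig, p_enh, labels):
        total += focal_loss(po, y) + lam * focal_loss(pe, y)
    return total / m


# Toy batch of four pairs with made-up probabilities.
labels = [1, 0, 1, 0]
p_orig = [0.9, 0.2, 0.6, 0.7]
p_enh = [0.8, 0.1, 0.5, 0.4]
print(batch_objective(p_orig, p_enh, labels))
```

Because of the modulating factor, a confidently correct pair (for example `p = 0.9` with `y = 1`) contributes far less to the objective than an uncertain one (`p = 0.6`), which is what lets the loss concentrate on hard and minority-class pairs.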

## 4. Empirical Studies

#### 4.1. General Settings

**Chinese Handwritten Signature Dataset:** Earlier Chinese handwritten signatures were imitated in the laboratory and the amount of data was small, so no suitable Chinese handwritten signature dataset existed. We therefore collected a multi-source Chinese handwritten signature dataset that covers a long time span and has strong practical significance. The dataset includes both positive and negative signature samples, which come from the National Forensic Center of Southwest University of Political Science and Law and were collected between 2009 and 2020. All signatures come from real cases, such as credit card consumption signatures, personal file signatures, and signatures on documents and contracts. To keep the samples consistent, each set of signature handwriting consists of one questioned signature (the signature under examination) and three certified genuine signatures. There are 500 such sets in total, comprising 220 negative sets and 280 positive sets. Within each set submitted for handwriting identification, the signatures are highly similar. All signatures were scanned into images at 300 DPI. The Chinese offline signature dataset thus consists of 500 names and 2000 signature images. The dataset is multi-source, real, and large in scale. First, all signatures come from real cases, which makes verification challenging. Second, the dataset is relatively large, and its handwriting spans decades. Third, genuine signatures were collected at different times and in different scenarios, so signatures of the same person may differ significantly. These characteristics make the dataset both valuable and challenging.
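The dataset table later in the paper lists 1500 samples for the CHINESE dataset with a positive:negative ratio of 840:660. These counts are consistent with pairing each questioned signature with each of its three certified genuine signatures (500 × 3 = 1500, 280 × 3 = 840, 220 × 3 = 660). The sketch below assumes that pairing scheme. The class `SignatureSet`, the function `build_pairs`, and the image identifiers are hypothetical placeholders, not part of the published dataset.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SignatureSet:
    questioned: str        # identifier of the signature under examination
    references: List[str]  # three certified genuine signatures
    is_genuine: bool       # True for a positive set, False for a negative set


def build_pairs(sets: List[SignatureSet]) -> List[Tuple[str, str, int]]:
    """Pair each questioned signature with every reference in its set."""
    pairs = []
    for s in sets:
        label = 1 if s.is_genuine else 0
        for ref in s.references:
            pairs.append((s.questioned, ref, label))
    return pairs


# Synthetic stand-in for the 500 sets: 280 positive, 220 negative.
sets = [
    SignatureSet(f"q{i}", [f"r{i}_{j}" for j in range(3)], i < 280)
    for i in range(500)
]
pairs = build_pairs(sets)
positives = sum(label for _, _, label in pairs)
print(len(pairs), positives, len(pairs) - positives)  # 1500 840 660
```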

**Evaluation Metrics:** In this study, a positive sample pair consists of two genuine signatures written by the same person, and its decision label is y = 1. The evaluation metrics are computed from the predictions on all sample pairs in the validation sets. Three indicators are used to compare the proposed method with other methods: false acceptance rate (FAR), false rejection rate (FRR), and accuracy (ACC). FAR is the number of false acceptances divided by the number of negative signature samples. FRR is the number of false rejections divided by the number of positive signature samples. Lower FRR and FAR and higher ACC mean better performance. They are calculated as follows:
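As a minimal sketch of these definitions, the function below computes FAR and FRR exactly as described in the text. ACC is assumed here to be the fraction of correctly classified pairs, which is the usual definition but is not spelled out in the surviving text. The function name `verification_metrics` and the toy labels are illustrative only.

```python
def verification_metrics(y_true, y_pred):
    """Return (FAR, FRR, ACC) in percent.

    y_true, y_pred: sequences of labels, 1 = genuine (accept), 0 = forgery (reject).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both positive and negative samples are required")
    # False acceptance: a forgery predicted as genuine.
    false_accept = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # False rejection: a genuine pair predicted as forgery.
    false_reject = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    far = 100.0 * false_accept / n_neg
    frr = 100.0 * false_reject / n_pos
    acc = 100.0 * (len(y_true) - false_accept - false_reject) / len(y_true)
    return far, frr, acc


# Toy example: 4 genuine pairs and 6 forged pairs.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
far, frr, acc = verification_metrics(y_true, y_pred)
print(f"FAR={far:.2f}% FRR={frr:.2f}% ACC={acc:.2f}%")
```

In the toy example, two of the six forgeries are accepted and one of the four genuine pairs is rejected, so seven of the ten pairs are classified correctly.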

**Baselines:** We compare the proposed model with eight state-of-the-art models: five writer-independent models (SigNet [37], Surroundness [38], Ensemble Learning [40], Morphology [41], and DeepHSV [30]) and three writer-dependent models (Chain code [39], Texture Feature [42], and Fusion of HTF [6]). Table 3 summarizes these models.

#### 4.2. Comparison with State-of-the-Art Models

^{−5}, and the batch size was set as 32. The λ in Equation (5) is set as the empirical value 2.5. The proposed model is compared with the baseline model, the traditional Siamese neural network method, and the classical cross-entropy loss function method. As shown in Table 4 and Table 5, our model achieves better performance than state-of-the-art models.

#### 4.3. Chinese Signature Dataset

#### 4.4. Process Visualization

- Compared with previous methods, the proposed model has better prediction performance. On the CEDAR signature dataset, the FRR, FAR, and ACC of the proposed method reach 6.78%, 4.20%, and 95.66%, respectively, which are better than those of the compared methods under all evaluation indicators. On the BHSig-Bengali and BHSig-Hindi signature datasets, our model achieves ACC of 90.64% and 88.98%, respectively, which is higher than that of the other models. In addition, our writer-independent approach still performs better than the writer-dependent approaches.
- The data enhancement method adopted in this study depends only on the original input signature image. The image is processed by a series of neural networks to generate a data enhancement weight matrix, and the degree of enhancement is then controlled by adjusting the proportion of this weight matrix. This improves the accuracy of the experimental results and makes the proposed model robust.
- The Focal loss function is effective for handling the imbalance between positive and negative data.
- The proposed model also performs well on the Chinese signature dataset, a conclusion that will be helpful for further research on offline Chinese signature verification.

## 5. Conclusions

**Figure 6.** (**a**) Original signature images; (**b**–**d**) feature extraction in the CNN process; (**e**) signatures after data enhancement.

[Table of sample images: genuine and forged signatures from the CEDAR, BHSig-Bengali, BHSig-Hindi, and CHINESE datasets.]

Dataset | CEDAR | BHSig-B | BHSig-H | CHINESE |
---|---|---|---|---|
Language | English | Bengali | Hindi | Chinese |
People | 55 | 100 | 160 | 500 |
Signatures | 2640 | 5400 | 8640 | 2000 |
Total samples | 46,860 | 99,600 | 159,360 | 1500 |
Positive:negative | 276:576 | 276:720 | 276:720 | 840:660 |

Model | Description |
---|---|
SigNet | A writer-independent Siamese network model proposed in 2017 [37]; it is often applied to signature verification. |
Surroundness | A signature feature extraction model based on surroundedness features, proposed in 2012 [38]. |
Chain code | A model based on the histogram features of chain codes, enhanced by a Laplacian of Gaussian filter, proposed in 2013 [39]. |
Ensemble Learning | A deep learning model proposed in 2019 [40] that improves an ensemble model for offline writer-independent signature verification. |
Morphology | A feature analysis technique based on a multi-layer perceptron, proposed in 2010 [41]. |
Texture Feature | A texture-oriented signature verification method proposed in 2016 [42]; it performs well on Indian scripts. |
Fusion of HTF | A signature verification model proposed in 2019 [6]; it adopts discrete wavelet and local quantized pattern features. |
DeepHSV | A neural network model proposed in 2019 [30] that improves the network with a two-channel CNN. |

Method | Type | FRR (%) | FAR (%) | ACC (%) |
---|---|---|---|---|
Morphology | WI | 12.39 | 11.23 | 88.19 |
Surroundness | WI | 8.33 | 8.33 | 91.67 |
Chain code | WD | 9.36 | 7.84 | 92.16 |
Ensemble Learning | WI | 8.48 | 7.88 | 92.00 |
ISNN + CrossEntropy | WI | 9.38 | 7.68 | 92.55 |
SNN + Focal Loss | WI | 8.92 | 6.94 | 93.47 |
Our method | WI | 6.78 | 4.20 | 95.66 |

Method | Type | BHSig-Bengali FRR (%) | BHSig-Bengali FAR (%) | BHSig-Bengali ACC (%) | BHSig-Hindi FRR (%) | BHSig-Hindi FAR (%) | BHSig-Hindi ACC (%) |
---|---|---|---|---|---|---|---|
SigNet | WI | 13.89 | 13.89 | 86.11 | 15.36 | 15.36 | 84.64 |
Texture Feature | WD | 33.82 | 33.82 | 66.18 | 24.47 | 24.47 | 75.53 |
Fusion of HTF | WD | 18.42 | 23.10 | 79.24 | 11.46 | 10.36 | 79.89 |
DeepHSV | WI | 11.92 | 11.92 | 88.08 | 13.34 | 13.34 | 86.66 |
ISNN + CrossEntropy | WI | 18.64 | 12.86 | 86.66 | 15.63 | 15.49 | 84.54 |
SNN + Focal Loss | WI | 16.87 | 9.43 | 87.69 | 13.38 | 10.91 | 84.79 |
Our method | WI | 14.25 | 6.41 | 90.64 | 12.29 | 9.60 | 88.98 |

Method | Type | FRR (%) | FAR (%) | ACC (%) |
---|---|---|---|---|
SigNet | WI | 42.36 | 42.36 | 57.64 |
DeepHSV | WI | 41.87 | 41.87 | 58.13 |
SNN + CrossEntropy | WI | 38.98 | 35.77 | 64.79 |
ISNN + CrossEntropy | WI | 33.66 | 31.24 | 68.88 |
SNN + Focal Loss | WI | 36.74 | 30.92 | 65.88 |
ISNN + Focal Loss | WI | 32.18 | 30.59 | 70.31 |

