Learning Global-Local Distance Metrics for Signature-Based Biometric Cryptosystems

Biometric traits, such as fingerprints, faces and signatures, have been employed in bio-cryptosystems to secure cryptographic keys within digital security schemes. Reliable implementations of these systems employ error correction codes formulated as simple distance thresholds, although they may not effectively model the complex variability of behavioral biometrics like signatures. In this paper, a Global-Local Distance Metric (GLDM) framework is proposed to learn cost-effective distance metrics, which reduce within-class variability and augment between-class variability, such that simple error correction thresholds of bio-cryptosystems provide high classification accuracy. First, a large number of samples from a development dataset are used to train a global distance metric that differentiates within-class from between-class samples of the population. Then, once user-specific samples are available for enrollment, the global metric is tuned to a local user-specific one. Proof-of-concept experiments on two reference offline signature databases confirm the viability of the proposed approach. Distance metrics are produced based on concise signature representations consisting of about 20 features and a single prototype. A signature-based bio-cryptosystem is designed using the produced metrics and has shown average classification error rates of about 7% and 17% for the PUCPR and the GPDS-300 databases, respectively. This level of performance is comparable to that obtained with complex state-of-the-art classifiers.


Introduction
Biometric traits, such as fingerprints, faces, signatures, etc., are strong candidates to replace traditional passwords and access codes in many security systems, including those for access control and digital rights management [1]. Since biometrics represent physiological or behavioral traits of a human, they cannot be lost and they are less likely to be stolen or shared. Recently, some researchers have focused on employing biometrics to operate cryptosystems, such as encryption and digital signature systems [2]. In such biometric cryptosystems (also known as bio-cryptosystems), biometric traits replace traditional passwords to protect the cryptographic keys (crypto-keys). A user must provide a genuine biometric sample, e.g., his fingerprint, to retrieve a crypto-key, by which he accesses confidential information or digitally signs some data.
Several authors have presented bio-cryptosystems. For instance, Soutar et al. [3] and Davida et al. [4] designed systems based on fingerprints and iris, respectively. These early systems addressed the challenge of producing robust crypto-keys from variable biometric signals. Davida et al. [5] highlighted the role of error correction codes in correcting biometric variability for bio-cryptography. This concept has been consolidated by Juels et al. [6,7], who proposed two generic bio-cryptographic schemes called fuzzy commitment (FC) [6] and fuzzy vault (FV) [7]. Both consider the query biometric signal as a noisy version of its prototype. If the query sample is genuine, the distance between the query and its prototype is limited, and as a result, this noise can be eliminated by the error correction decoder and the locked crypto-key is released to its owner.
Most FC and FV implementations focus on physiological traits such as fingerprints [8], iris [9], retina [10], and face [11]. With samples obtained for such biometrics, intra-class variability is a less intrinsic property, and results mostly from the acquisition process. For instance, fingerprint-based FVs are encoded with minutia points extracted from the enrolled fingerprint.

Biostatistics and Biometrics Open Access Journal
During authentication, decoding points are extracted from the query fingerprint, and might differ from corresponding encoding points due to misalignment. Researchers have alleviated such distances by aligning query and template fingerprints and positioning them so that they are within the error correction capacity of the decoder [8].
Conversely, intra-class variability is a more intrinsic property of behavioral biometrics, like handwritten signatures, since individuals do not behave identically at all times. For operating robust systems based on signatures, modeling high intra-class variations may require employing high-dimensional feature descriptors and complex classification rules, as found in traditional signature verification (SV) systems [12]. These tools are not suitable when designing signature-based bio-cryptosystems, since we are restricted by the error correction code functionality (a simple distance threshold) and the compact signal representation. For instance, it has been shown that a direct implementation of an FV scheme based on offline signature images produces unreliable systems, since the inherent variability is too high to model with a simple FV decoder [13].
In this paper, instead of using distance cancellation methods (like aligning samples), the bio-cryptosystem design problem is addressed by employing the distance metric learning concept [14]. To that end, a new approach for learning distance metrics, called the Global-Local Distance Metric (GLDM) approach, is proposed to produce similar within-class (WC) and dissimilar between-class (BC) distance measures. Once the metric is learned, its information is used to design the signature-based bio-cryptosystem. Since the distance metric is designed to minimize the WC and maximize the BC distance measures, it is more likely that the noise of genuine queries is corrected by the error-correction decoder of the bio-cryptosystem while impostor queries are not corrected, and thus good classification accuracy is obtained.
To initiate a bio-cryptosystem for a user when only a few reference samples are available for enrollment, the proposed approach starts with the learning of a global metric from an independent (global) development dataset that includes a large number of samples. Hence, the produced global distance metric discriminates between WC and BC distances for any user, even for users who are not included in the global dataset. Then, when more enrollment samples are available for a specific user, they are employed to tune his metric, producing a user-specific (local) distance metric.
Preliminary research on this approach appears in [15,16]. In this paper, the GLDM approach is proposed under a distance metric formulation of the biometric cryptosystem design problem. For experimental validation, the PUCPR and the GPDS-300 signature verification databases are used [17,18]. Distance metrics are optimized based on the proposed approach, where the impact of each processing step on the metric's effectiveness is measured by its impact on the separation of WC and BC distances. The resulting metrics are used to design signature-based bio-cryptosystems and the classification error rates are reported.
The rest of the paper is organized as follows. In the next section, the formulation of biometric cryptosystems as distance metrics is described. Section 3 reviews some distance metric learning approaches and their relation to the proposed method. Section 4 describes the new Global-Local Distance Metric (GLDM) learning approach. The experimental methodology is described in Section 5. Finally, the experimental results are presented and discussed in Section 6 (Figure 1).

Formulating Biometric Cryptosystems as Distance Metrics
Robust bio-cryptosystems (like FC and FV) operate in key-binding mode, where classical crypto-keys are coupled with a biometric message. In the enrollment phase, a prototype biometric message encodes the secret key. In the authentication phase, a message is extracted from the query sample to decode the key. If the query sample is genuine, the distance between the encoding and decoding messages is limited, and as a result, this distance can be eliminated by the decoder. On the other hand, if the query sample belongs to another person, or if it is a forged sample, the distance between the two messages is too high to cancel. Accordingly, the secret key is unlocked only for users who apply query samples that are sufficiently similar to the enrolled prototype. Because of the practical decoding complexity of such codes, the biometric messages used must be concise. Producing a concise and informative message from the biometric signal is a challenging task, as is using simple classifiers like bio-cryptographic decoders to differentiate between genuine and forged samples.
In this paper, the FV key-binding cryptographic scheme is considered [7]. The FV functionality can be formulated as a distance metric, as shown in Figure 1. Consider a prototype feature vector p^u = {f_n^{p^u}, n = 1, ..., N} selected to lock a cryptographic key K of a user u. During enrollment, each locking element f_n^{p^u} locks a piece of information about the key K. During authentication, an unlocking element f_n^Q is extracted from the query sample Q, and is matched to all locking elements of the FV. The unlocking element can locate its corresponding locking element only if their dissimilarity lies within a matching tolerance δ_n. Accordingly, two corresponding locking and unlocking elements constitute a distance element:

d_n(Q, p^u) = |f_n^Q − f_n^{p^u}| ⊖ δ_n (1)

where we define the ⊖ operator as follows:

a ⊖ δ = 0 if a ≤ δ; a ⊖ δ = 1 otherwise. (2)

There are two sources of matching errors: erasures and noise. In the case of erasures, some unlocking elements do not match their corresponding locking elements, and are therefore not added to the matching set. In the noise case, some unlocking elements falsely match some of the chaff points, and are therefore added as a noise error to the matching set:

δ′ = |{n : f_n^Q matches a chaff point}| (3)

The accumulation of the individual distance elements constitutes the FV distance metric:

D(Q, p^u) = Σ_{n=1}^{N} d_n(Q, p^u) (4)

Considering the extra noise (chaff) errors δ′, the total distance (matching error) between a query and its prototype should not exceed the FV error correction capacity ε: for efficient FV implementation, the sum of these errors may not exceed ε for genuine query signals, while it exceeds ε for impostors. Accordingly, the proposed formulation of the FV functionality is:

FV(Q) = genuine if D(Q, p^u) + δ′ ≤ ε; FV(Q) = impostor otherwise. (5)

Details of how the crypto-key is encoded/decoded by means of biometrics are out of the scope of this paper; for more details on this aspect, see [6,7]. Note that the error correction capacity of an FV bio-cryptosystem relies on the sizes of both the cryptographic key and the encoding message.
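To make Eqs. 1-5 concrete, the following sketch implements the thresholded distance elements and the decoding decision. The feature values, tolerances, chaff-noise count, and capacity below are invented for illustration; they are not the paper's settings.

```python
# Sketch of the FV distance-metric formulation (Eqs. 1-5).
# All numeric values are illustrative only.

def distance_element(f_q, f_p, delta):
    """Eqs. 1-2: 0 if |f_q - f_p| lies within the tolerance, else 1."""
    return 0 if abs(f_q - f_p) <= delta else 1

def fv_distance(query, prototype, tolerances):
    """Eq. 4: accumulate the individual distance elements."""
    return sum(distance_element(q, p, d)
               for q, p, d in zip(query, prototype, tolerances))

def fv_decision(query, prototype, tolerances, chaff_noise, capacity):
    """Eq. 5: the key is released only if the total matching error
    (erasures plus chaff noise) is within the error correction capacity."""
    return fv_distance(query, prototype, tolerances) + chaff_noise <= capacity

prototype  = [0.52, 0.10, 0.88, 0.40]
tolerances = [0.05, 0.05, 0.10, 0.05]
genuine  = [0.50, 0.12, 0.80, 0.41]   # small deviations, within tolerance
impostor = [0.90, 0.55, 0.20, 0.75]   # large deviations

print(fv_decision(genuine, prototype, tolerances, chaff_noise=1, capacity=2))   # True
print(fv_decision(impostor, prototype, tolerances, chaff_noise=1, capacity=2))  # False
```

Note that the decision is a single threshold on an accumulated error count, which is exactly the classification rule an error correction decoder can realize.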

Distance metric learning
State-of-the-art: The distance metric learning concept was introduced mainly to enhance the performance of distance classifiers that take explicit distances (or kernels) as inputs, e.g., KNN, SVM, etc. [14]. For such distance/kernel-based classifiers, a distance function that measures the true proximity between feature representations (FR) of patterns is first designed, and is then fed to the classification stage. The performance of such classifiers relies on the quality of the employed distance measure, which in turn relies on the employed FR, the distance function applied to the representation, and the prototypes that are used as references for distance computations.
In the literature, such systems are optimized with the use of distance function learning [19] and/or prototype selection [20]. Distance function learning is done through the optimization of a parameterized function, such that the WC distances are minimized and the BC distances are maximized. Examples of the employed distance functions are the L2 distance [21], Chi-squared [22], weighted similarity [23], and the probability of belongingness to different classes [24]. However, most employed distance functions take the following form:

d_A(F_Q, F_p) = sqrt((F_Q − F_p)^T A (F_Q − F_p)) (6)

where F_Q and F_p are the feature representations of the questioned and the prototype samples, respectively. This technique provides a means of translating hardly separable distributions to a space where the distributions are more separable. For conventional pattern recognition approaches to hold in the new space, A is restricted to being a symmetric and positive definite matrix (or kernel), such that d_A is a metric function [19]. According to Eq. 6, it is obvious that the entries of the A matrix determine the impact of the pairwise distances between individual features on the distance measure. Thus, learning A implies feature selection [25]. It has been shown that the accuracy of this metric increases when full matrices are considered (i.e., not only a diagonal matrix, but some weighted relations among individual distances exist) [26]. Also, it has been shown that global distance functions do not always represent all classes well [27]. Instead, the concept of local distance functions has been presented [19]. For instance, the metric tensor concept has been introduced, where instead of learning a single metric A for the whole population, a specific metric A_T is learned for every class T. This approach becomes complex for large numbers of classes, and some authors have suggested grouping similar classes into larger classes so that a trade-off between global and class-specific distance functions can be achieved [23].
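A minimal sketch of the generic learned metric of Eq. 6 follows. The matrix A and the vectors are invented for the example; with A equal to the identity, d_A reduces to the Euclidean distance, while off-diagonal entries weight interactions between individual feature distances.

```python
import math

def metric_distance(f_q, f_p, A):
    """d_A(F_Q, F_p) = sqrt((F_Q - F_p)^T A (F_Q - F_p)), as in Eq. 6.
    A must be symmetric positive definite for d_A to be a metric."""
    diff = [q - p for q, p in zip(f_q, f_p)]
    quad = sum(diff[i] * A[i][j] * diff[j]
               for i in range(len(diff)) for j in range(len(diff)))
    return math.sqrt(quad)

f_q, f_p = [1.0, 2.0], [0.0, 0.0]
identity = [[1.0, 0.0], [0.0, 1.0]]  # reduces d_A to the Euclidean distance
weighted = [[2.0, 0.5], [0.5, 1.0]]  # SPD matrix weighting feature pairs

print(metric_distance(f_q, f_p, identity))  # sqrt(5), about 2.236
print(metric_distance(f_q, f_p, weighted))  # sqrt(8), about 2.828
```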
Moreover, as indicated earlier, the quality of the proximity measure also depends on the prototype set used as a reference for distance measuring. Prototype selection is extensively studied for distance-based classifiers like KNN [20].

Proposed method
Since existing distance learning approaches are mainly designed to enhance the performance of generic distance classifiers such as KNN, SVM, etc., no specific constraints are applied to the classifier complexity, the training set size, or the employed representation. Conversely, several restrictions apply when designing distance metrics for signature-based bio-cryptography. For instance, these systems involve a simple thresholding distance classifier, which makes it hard to model complex problems like offline signature verification (OLSV). Moreover, OLSV systems should learn from limited positive signature samples, and almost no forgeries are available during the design phase. Lastly, the design of such systems requires concise feature representations, which might not be capable of alleviating the high variability in the signature images. It is because of these design constraints that a specialized distance learning method for the problem at hand is proposed. In this paper, the distance metric defined by Eq. 4 is optimized based on a mixture of Feature-Distance (FD) space analysis [28,29] and dissimilarity matrix analysis. Figure 2 illustrates the different distance metric computational spaces. Assume a system is designed for U different classes. The distance between a questioned sample Q_vj and a prototype p_ui constitutes a distance cell, which accumulates distance elements computed in the FD space. The distances between all of the questioned and prototype samples constitute a dissimilarity matrix, where each row contains the distances from a specific query to all of the prototypes.
When the questioned and prototype samples belong to the same class, i.e., u = v, the distance sample is a WC sample (black cells in Figure 2). On the other hand, if the questioned and prototype samples belong to different classes, i.e., u ≠ v, then the distance sample is a BC sample (white cells in Figure 2). An ideal distance metric implies that all the WC distances have zero values, while all the BC distances have large values. This occurs when the employed metric d absorbs all the WC variabilities and detects all the BC similarities. The proposed approach aims to enlarge the separation between the BC and WC distance ranges, such that a simple distance threshold rule (like that involved in error correcting codes) produces accurate classification.
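The dissimilarity-matrix view can be sketched with toy one-dimensional samples (all values invented); the black and white cells of Figure 2 correspond to the wc and bc lists below.

```python
# Build a dissimilarity matrix over toy 1-D samples from two classes.
# Sample values are illustrative; d is a simple absolute difference here.

queries    = [("u1", 1.0), ("u1", 1.2), ("u2", 5.0), ("u2", 5.3)]
prototypes = [("u1", 1.1), ("u2", 5.1)]

def d(q, p):
    return abs(q - p)

matrix = [[d(qv, pv) for (_, pv) in prototypes] for (_, qv) in queries]

wc = [matrix[i][j] for i, (qu, _) in enumerate(queries)
      for j, (pu, _) in enumerate(prototypes) if qu == pu]
bc = [matrix[i][j] for i, (qu, _) in enumerate(queries)
      for j, (pu, _) in enumerate(prototypes) if qu != pu]

# An effective metric keeps every WC distance below every BC distance,
# so a single threshold separates the two ranges.
print(max(wc) < min(bc))  # True for this toy data
```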
As shown in Figure 2, a distance measure accumulates distance elements computed in the FD space from individual feature-distance values.

Since a distance element relies on a specific feature f_n and its associated tolerance δ_n, these building blocks should be optimized accordingly.
In the case of signature-based bio-cryptography systems, the optimization of the aforementioned distance metric is a challenging task, since a concise representation must be selected from high-dimensional representations, especially when only a few positive samples and almost no forgery samples are available for training. We tackle this challenging problem by proposing a hybrid Global-Local learning framework that also achieves a trade-off between global and local distance metric approaches [23]. The global learning process overcomes the curse of dimensionality by using huge numbers of samples of a development dataset for learning in the original high-dimensional feature space. Therefore, it becomes feasible for the local learning process to learn in the resulting reduced space, even when limited samples are available for training.

Global-local distance metric learning approach (GLDM)
Overview: This section provides a detailed description of the proposed GLDM distance metric learning framework, illustrated in Figure 3. In the first step, a large number of samples of a global dataset is used to design a global distance metric that differentiates between WC and BC samples of the population. A preliminary feature space of huge dimensionality M is produced and is reduced to a global space of dimensionality H_g << M through the application of a Boosting Distance Elements (BDE) process. In the second step, once user-specific samples are available for enrollment, the global metric constituents are tuned to the user, producing a local solution. Finally, depending on whether or not a local solution is available, either a global or a local distance metric is computed by employing Eqs. 1-4, using the distance element constituents produced by the above steps. It is important to note that for both global and local distance metric computations, only the best elements are used, since the metric produced is mainly designed for building FV systems that require a concise number of locking/unlocking elements.

Global distance metric learning (GDM)
Algorithm 1 describes the processing steps for learning the global distance metric constituents. A feature representation F_G of high dimensionality M is extracted from a global dataset G, and is translated to the FD space by applying a dichotomy transformation. A boosting distance elements (BDE) process (Algorithm 2) runs in the FD space, producing a set of global feature indexes and their associated tolerances.

Algorithm 1: Global Distance Metric Learning (GDM)
Input: Global dataset G, consisting of S samples from different classes.
1: Extract feature representation F_G of dimensionality M.
2: Translate F_G to the FD space by applying the dichotomy transformation.
3: Label dichotomy samples: a sample is a WC sample if i, j belong to the same class; otherwise it is a BC sample.
4: Run the BDE process (Algorithm 2) to select the best distance elements (feature indexes and associated tolerances).
Output: Global feature indexes FI_g and global tolerances Δ_g, both of H_g dimensionality.

Algorithm 2: Boosting Distance Elements (BDE)
In the FD space, each dimension represents the distance as measured by a single feature. It is therefore easier to rank distance elements by their impact on the enlargement of the separation between the WC and BC distance ranges. Also, in the FD space, it is easier to learn a tolerance value δ_n for each element f_n that discriminates between WC and BC samples.

Dichotomy transformation
This transformation is applied to the original feature space F and translates samples to the feature-distance space FD, where distance elements are selected and optimized. Each distance element is computed based on a single feature f_n and its associated tolerance value δ_n. To illustrate the importance of this transformation, consider the example shown in Figure 4. In the feature space F, WC distances (like d_E(Q_11, p_11)) are generally smaller than BC distances (like d_E(Q_21, p_11)). However, in the feature space F, the impact of each feature on the WC and BC distances is not clear. With representations of high dimensionality, a high number of classes, and a small number of training samples per class, it is not feasible to select the most discriminative features in the feature space F. On the other hand, in the feature-distance space FD, the impact of every individual feature on the WC and BC distances is clear (see the right side of Figure 4).
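A minimal sketch of the dichotomy transformation follows; the three signature feature vectors are invented for the example.

```python
def dichotomy(f_a, f_b):
    """Map a pair of feature vectors to a single FD-space sample:
    each dimension is the distance measured by one feature."""
    return [abs(a - b) for a, b in zip(f_a, f_b)]

# Illustrative samples: two genuine signatures of one writer and a forgery.
genuine_1 = [0.50, 0.10, 0.80]
genuine_2 = [0.52, 0.12, 0.78]
forgery   = [0.90, 0.55, 0.20]

wc_sample = dichotomy(genuine_1, genuine_2)  # within-class FD sample
bc_sample = dichotomy(genuine_1, forgery)    # between-class FD sample

# In FD space, each coordinate directly shows that feature's contribution
# to separating WC from BC pairs, so features can be ranked individually.
print([round(x, 2) for x in wc_sample])  # [0.02, 0.02, 0.02]
print([round(x, 2) for x in bc_sample])  # [0.4, 0.45, 0.6]
```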

Boosting distance elements (BDE)
Algorithm 2 describes the BDE process. Training distance samples are represented in the FD space of high dimensionality, and they are initially given equal weights (Step 1). They are then fed to a boosting feature selection (BFS) method [30], for fast searching in high-dimensional spaces (Steps 2:20). At each boosting iteration b, the best dimension f_b is selected, along with the adjustment of its associated tolerance δ_b that best splits the WC and BC distances (Steps 5:11). After each boosting iteration, the training samples are given new weights based on the extent to which they are accurately classified by the current distance element, and the weight distribution is updated accordingly.

The resulting local representation could be employed directly to compute a local distance metric using any available prototype as a reference; however, we propose a prototype selection process instead, which picks the most stable and discriminant prototype for enhanced metric accuracy.
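The boosting loop can be sketched as follows. This is a hedged illustration: Algorithm 2's exact weight-update rule is not reproduced here, so a standard AdaBoost-style update is used, and the FD samples (label 0 = WC, 1 = BC) are invented.

```python
import math

def best_tolerance(samples, labels, weights, dim):
    """Find the tolerance for FD dimension `dim` that minimizes the
    weighted error of the split "distance <= tolerance => WC"."""
    best_err, best_tol = float("inf"), 0.0
    for s in samples:
        tol = s[dim]  # candidate thresholds at observed distance values
        err = sum(w for x, y, w in zip(samples, labels, weights)
                  if (0 if x[dim] <= tol else 1) != y)
        if err < best_err:
            best_err, best_tol = err, tol
    return best_err, best_tol

def boost_distance_elements(samples, labels, n_rounds):
    """Greedily select (feature index, tolerance) pairs, re-weighting the
    training distance samples after each round (AdaBoost-style)."""
    weights = [1.0 / len(samples)] * len(samples)
    selected = []
    for _ in range(n_rounds):
        candidates = []
        for dim in range(len(samples[0])):
            e, t = best_tolerance(samples, labels, weights, dim)
            candidates.append((e, dim, t))
        err, dim, tol = min(candidates)
        selected.append((dim, tol))
        alpha = 0.5 * math.log((1.0 - err) / max(err, 1e-10))
        # emphasize the samples misclassified by the selected element
        for i, (x, y) in enumerate(zip(samples, labels)):
            miss = (0 if x[dim] <= tol else 1) != y
            weights[i] *= math.exp(alpha if miss else -alpha)
        total = sum(weights)
        weights = [w / total for w in weights]
    return selected

# Feature 0 separates WC from BC distances cleanly; feature 1 is noisy.
samples = [[0.1, 0.5], [0.2, 0.1], [0.9, 0.4], [0.8, 0.2]]
labels = [0, 0, 1, 1]
print(boost_distance_elements(samples, labels, n_rounds=1))  # [(0, 0.2)]
```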

Prototype selection (PS)
Algorithm 4 illustrates the prototype selection process. First, the global representations are used for the distance metric computations defined by Eqs. 1-4. The produced matrix D is then used to select the most stable and discriminant prototype p*.

Algorithm 4: Prototype Selection (PS)
Input: Global feature representations. These are projected to a dissimilarity matrix D_u, where each row contains the distances from a specific query to all of the prototypes, and each column contains the distances from all of the queries to a specific prototype.
Here, we investigate a part of D (see Figure 4).

It is clear that the dissimilarity matrix provides an easy way of ranking prototypes according to their discriminative power. For instance, for class 1, the WC and BC distance ranges are better separated with respect to prototype p_12 than to p_11. Thus, for this class, measuring the distance relative to p_12 results in more isolated WC and BC distance ranges. To automate the dissimilarity matrix analysis and the selection of the best prototype, we propose a distance separability measure:

S(p_i) = (1/|BC|) Σ_{Q ∈ BC} D(Q, p_i) − (1/|WC|) Σ_{Q ∈ WC} D(Q, p_i) (9)

The left part of the above equation measures the discriminative power of a prototype, where high values indicate a large separation between global and local samples (BC distances). The right part measures the stability of a prototype, where low values indicate a small separation between local samples (WC distances). Accordingly, we select the prototype that maximizes this distance separability measure, and the best prototype is given by:

p* = argmax_{p_i} S(p_i) (10)

Finally, after the best prototype is selected, the optimal local distance metric d_l is computed according to Eqs. 1-4 with reference to p*.
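A sketch of the prototype selection step, assuming the separability measure contrasts the mean BC distance (discrimination) against the mean WC distance (stability); the per-prototype distance columns are invented.

```python
def separability(d_wc, d_bc):
    """Separability of a prototype: mean BC distance (discrimination)
    minus mean WC distance (instability); larger is better."""
    return sum(d_bc) / len(d_bc) - sum(d_wc) / len(d_wc)

# Illustrative distance columns of the dissimilarity matrix, one per prototype.
prototypes = {
    "p11": {"wc": [2.0, 3.0, 2.5], "bc": [4.0, 4.5, 5.0]},
    "p12": {"wc": [0.5, 0.7, 0.6], "bc": [6.0, 5.5, 6.5]},
}

best = max(prototypes, key=lambda p: separability(prototypes[p]["wc"],
                                                  prototypes[p]["bc"]))
print(best)  # p12: tighter WC range and wider separation from BC distances
```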

Experimental methodology
The experiments conducted investigate both the proposed GLDM distance metric learning approach and the signature-based FV bio-cryptosystem designed based on the resulting distance metrics. First, the experimental databases are split into a global (development) set G and an exploitation set that consists of several local sets (L), one for each user. A preliminary feature representation (FR) of huge dimensionality M is extracted and used to generate a preliminary dissimilarity matrix. This matrix is refined several times by applying the different steps of the GLDM learning approach (as described by Figure 3). The importance of the different processing steps is measured by their impact on separating the WC and BC distance ranges. The learned distance metrics are then used to design the target signature-based FV system. Since we proposed the first bio-cryptosystem based on offline signature images, there is no benchmark available for testing our system, and we make comparisons with state-of-the-art classical signature verification (SV) systems instead. However, we emphasize here that this comparative study might be biased, because while the FV systems involve a very simple classification rule (a distance threshold) and employ concise feature representations, the SV systems might employ complex classification rules and high-dimensional representations.

Databases
Two different offline signature databases are used for proof-of-concept simulations: the Brazilian PUCPR database [17] and the GPDS-300 database [18]. First, we executed a proof-of-concept for GLDM prototyping using the PUCPR database; the concept is then generalized by testing a GLDM-based FV system using both databases. While the PUCPR database is composed of random, simple and skilled forgeries, the GPDS database is composed of random and skilled forgeries. Random forgeries occur when the query signature presented to the system is mislabeled as belonging to another user; forgers produce random forgeries when they know neither the signer's name nor the signature morphology. For simple forgeries, the forger knows the writer's name but not the signature morphology, and can only produce a simple forgery using his own writing style. Finally, for skilled forgeries, forgers imitate the signature since they have access to a genuine signature sample.

Brazilian PUCPR database
The PUCPR database contains 7,920 signature samples that were digitized as 8-bit grayscale images of 400 × 1000 pixels at a resolution of 300 dpi. The signatures were provided by 168 writers. For the last 108 writers, there are only 40 genuine signatures per writer, and no forgeries. We consider these signatures as the global dataset (G), and they are employed for the GDM learning phase. For the first 60 writers, there are 40 genuine signatures, 10 simple forgeries and 10 skilled forgeries per writer. These signatures are considered as the exploitation dataset consisting of several local subsets (L), one per user. Of these, the first 30 genuine signatures are used for the LDM learning phase, while the last 10 genuine signatures and all of the different forgeries are used for performance evaluation.

GPDS-300 database
The GPDS-300 database contains signatures of 300 users, which were digitized as 8-bit grayscale images at a resolution of 300 dpi. This database contains images of different sizes (varying from 51 × 82 to 402 × 649 pixels). All users have 24 genuine signatures and 30 skilled forgeries. The database is split into two parts. One part contains the signatures of the last 140 users and is considered as the global set (G) used for GDM training. The other part contains the signatures of the first 160 users and is considered as the exploitation dataset that consists of several local subsets (L), one per user, as well. Of these, the first 14 genuine signatures are used for the LDM learning phase, while the last 10 genuine signatures and all of the forgeries are used for performance evaluation.

Global distance metric learning
The processing steps of the GDM learning algorithm (Algorithm 1) are executed using the global dataset (G) of both databases as follows: Feature extraction: The Extended-Shadow-Code (ESC) [31] and Directional Probability Density Function (DPDF) [32] features are employed. Features are extracted based on different grid scales, hence a range of details is detected in the signature image. A set of 30 grid scales is used for each feature type, producing 60 different single-scale feature representations. These representations are then fused to produce an FR of huge dimensionality, M = 30,201 [29].

Dichotomy transformation:
The initial dissimilarity matrix is constituted by translating the FR of dimensionality M, for all users of G, to an FD space of the same dimensionality. As this matrix is huge, not all of its cells are used for the GDM learning process. Also, to avoid overfitting the G dataset, some of the WC and BC samples are used for training, while other sets of samples are used for validation. The global distance metric d_g is then computed using the resulting representation FI_g and the associated tolerance values Δ_g according to Eqs. 1-4.

Local distance metric learning
The processing steps of the LDM learning algorithm (Algorithm 3) are executed using the local dataset (L) of both databases as follows: Prototype selection: Finally, the dissimilarity matrix is refined by selecting the most stable and discriminative prototype p* that maximizes the distance separability measure defined by Eqs. 9-10, where R = 30 for the PUCPR database and R = 14 for the GPDS database.

Fuzzy vault encoding and decoding
Once the local distance metric d_l is learned, the information embedded in its constituting distance elements is used for FV encoding and decoding (as described in Section 2). In the encoding phase, we set the encoding message length N = 20 and the cryptographic key K size to 128 bits. Accordingly, the distance metric constituents FI_l are extracted from the best enrolled signature prototype p* and are quantized into 8-bit words. Then, the cryptographic key K is used to generate a polynomial of degree k = 7, and the quantized features are projected onto the polynomial. The features and their projections constitute the set of genuine FV locking points. Finally, a set of z = 180 chaff (noise) points is generated to conceal the genuine points, and both sets constitute the FV.
During FV decoding, the same FI_l features are extracted from the query signature image Q. The features are quantized and matched with the FV points. Since the distance metric is designed such that WC distances are small and BC distances are large, it is expected that most decoding features of genuine query samples will match the FV locking points, while only a few features of impostor queries will match FV points. The matching points are then corrected by the FV error correction decoder, where we set the error correction capacity ε = 6, and they are used to regenerate the polynomial and release the cryptographic key K to the user.
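The locking/unlocking mechanics can be sketched as follows. This is a toy illustration over a small prime field, with exact point matching and Lagrange interpolation standing in for the error correction decoder; the field size, key, chaff count, and feature values are invented and do not reflect the paper's settings (8-bit features, degree k = 7, z = 180 chaffs).

```python
import random

P = 257  # toy prime field; values are illustrative only

def poly_mul(a, b, p=P):
    """Multiply two polynomials (coefficients low-order first) over GF(p)."""
    r = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            r[i + j] = (r[i + j] + ai * bj) % p
    return r

def poly_eval(coeffs, x, p=P):
    """Evaluate a polynomial (coefficients low-order first) at x over GF(p)."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % p
    return y

def lock(features, key_coeffs, n_chaff=20):
    """Project genuine feature points onto the key polynomial, then hide
    them among random chaff points that do not lie on the polynomial."""
    vault = [(f, poly_eval(key_coeffs, f)) for f in features]
    used = set(features)
    while len(vault) < len(features) + n_chaff:
        x = random.randrange(P)
        if x in used:
            continue
        y = random.randrange(P)
        if y == poly_eval(key_coeffs, x):
            continue  # a chaff point must not lie on the polynomial
        used.add(x)
        vault.append((x, y))
    random.shuffle(vault)
    return vault

def interpolate(points, p=P):
    """Lagrange interpolation over GF(p), returning polynomial coefficients."""
    coeffs = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        num, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = poly_mul(num, [(-xj) % p, 1], p)
                denom = denom * (xi - xj) % p
        scale = yi * pow(denom, p - 2, p) % p  # modular inverse via Fermat
        for k, c in enumerate(num):
            coeffs[k] = (coeffs[k] + scale * c) % p
    return coeffs

def unlock(query_features, vault, degree):
    """Match query features against vault abscissas and re-interpolate."""
    matched = [pt for pt in vault if pt[0] in set(query_features)]
    if len(matched) < degree + 1:
        return None  # too many erasures: the key stays locked
    return interpolate(matched[:degree + 1])

random.seed(1)
key = [123, 45, 67]                # toy key as degree-2 polynomial coefficients
features = [10, 37, 99, 150, 201]  # quantized feature values of the prototype
vault = lock(features, key)
print(unlock(features, vault, degree=2))          # [123, 45, 67]: key released
print(unlock([1, 2, 3], vault, degree=2) == key)  # False for impostor features
```

A real implementation would tolerate a few mismatches via an error correcting decoder rather than requiring exact interpolation points.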

Performance evaluation
Evaluation of the GLDM approach: The proposed GLDM approach is evaluated by investigating its power to generate well-separated WC and BC distance ranges. A testing distance matrix is generated for each user of the local datasets. To investigate the different processing steps of the proposed approach, the impacts of the different learning steps on the separation of the WC and BC clusters are decoupled by employing several experimental scenarios. In the scenarios where the learned feature tolerance Δ_l is not employed, we can decouple the impact of embedding the modeled feature tolerance in the metric computations, and only test the applicability of tuning the global metric to specific classes/users. In these scenarios, we employed a strict distance metric as a non-tolerant variant, where the distance element (defined by Eq. 1) is replaced by a strict distance element: d_n(Q, p^u) = |f_n^Q − f_n^{p^u}|. In the final scenario, the local distance metric (LDM) d_l is computed as defined by Eqs. 1-4. Therefore, the impact of absorbing feature variability, through learning the representation variability (tolerance), is tested in this experiment.
For all the above cases, the testing datasets are used to generate dissimilarity matrices according to the investigated scenario. The separability of the WC and BC clusters is then measured by the Hellinger distance [33]. Assuming normal distributions of the WC and BC clusters, the squared Hellinger distance between them is given by:

H^2 = 1 − sqrt(2 σ_WC σ_BC / (σ_WC^2 + σ_BC^2)) · exp(−(μ_WC − μ_BC)^2 / (4 (σ_WC^2 + σ_BC^2)))

Moreover, since the main target application of the proposed GLDM approach is the development of a reliable FV system, we decouple here any factor that impacts the recognition accuracy of such a system other than those related to the GLDM method. More specifically, FV recognition accuracy relies on the separability of the WC and BC distance ranges (which reflects the effectiveness of the GLDM approach), as well as on the error correction capacity (which is equivalent to the distance threshold that splits the WC and BC ranges; see Eq. 5). Accordingly, we decouple the impact of the choice of the threshold and only test the impact of the metric d_l on FV performance. To that end, we generate ROC (receiver operating characteristic) curves by computing recognition errors for all possible distance measures.
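The squared Hellinger distance between two normal distributions N(μ1, σ1²) and N(μ2, σ2²) can be computed as below; the means and standard deviations are invented sanity-check values.

```python
import math

def hellinger2(mu1, s1, mu2, s2):
    """Squared Hellinger distance between N(mu1, s1^2) and N(mu2, s2^2)."""
    var_sum = s1 ** 2 + s2 ** 2
    coeff = math.sqrt(2 * s1 * s2 / var_sum)
    return 1 - coeff * math.exp(-((mu1 - mu2) ** 2) / (4 * var_sum))

# Identical clusters: distance 0; well-separated clusters: close to 1.
print(hellinger2(0.0, 1.0, 0.0, 1.0))            # 0.0
print(round(hellinger2(0.0, 1.0, 10.0, 1.0), 6))  # 0.999996
```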
A ROC curve plots the False Accept Rate (FAR) against the Genuine Accept Rate (GAR) for all possible thresholds (all distance measures). The FAR for a specific threshold is the ratio of forgery samples with a distance measure smaller than this threshold, and the GAR is the ratio of genuine samples with a distance measure smaller than the threshold. To obtain a global assessment of FV quality, we compute the AUC (area under the ROC curve) and average it over all users in the testing subset. A higher average AUC indicates more separation between the distance score distributions of the WC and BC classes.
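The FAR/GAR sweep and AUC computation described above can be sketched as follows, on toy distance scores with perfect WC/BC separation (a minimal implementation; names are our own):

```python
import numpy as np

def roc_auc(genuine_d, forgery_d):
    # Sweep every observed distance (plus infinity) as a threshold;
    # a sample is accepted when its distance falls below the threshold.
    thresholds = np.sort(np.concatenate([genuine_d, forgery_d, [np.inf]]))
    far = np.array([(forgery_d < t).mean() for t in thresholds])
    gar = np.array([(genuine_d < t).mean() for t in thresholds])
    # FAR is non-decreasing in the threshold, so it can serve directly as
    # the x-axis; integrate GAR over FAR with the trapezoidal rule.
    auc = float(np.sum((gar[1:] + gar[:-1]) / 2.0 * np.diff(far)))
    return far, gar, auc

genuine = np.array([1.0, 1.5, 2.0, 2.5])  # toy WC distance scores
forgery = np.array([4.0, 4.5, 5.0, 5.5])  # toy BC distance scores
far, gar, auc = roc_auc(genuine, forgery)
```

With fully separated toy scores as here, the AUC reaches its maximum of 1.0; overlapping WC/BC distributions pull it toward 0.5.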

Evaluation of the GLDM-based FV system
After evaluating the GLDM approach, we investigate the performance of the FV bio-cryptosystem designed based on the learned distance metrics, by employing the same experimental scenarios mentioned above. To measure actual recognition rates, we apply a fixed FV error correction capacity.

Figure 6 illustrates the impact of each processing step on the separation of the WC and BC clusters, for a specific user. It is obvious that, without distance metric learning, the distance distributions overlap. Learning a global metric using a global dataset increases the separation. This validates our hypothesis that distance metrics learned from high-dimensional FRs extracted from a large number of classes generalize relatively well to unseen classes. Running a local metric learning process using class-specific datasets increases the separability further. This validates our hypothesis that the global metrics are adaptable to new classes. Embedding information about feature variability in the distance metric computations increases the stability of the genuine class: for instance, the maximum distance score for the genuine class decreased from 9 to 5. This validates our hypothesis that modeling representation variability in the FD space absorbs some intrinsic signal variability. Overall, the separation between the WC and BC clusters is increased from 0.24 to 0.66, and the average AUC is increased by about 47% (from 0.65 to 0.97).

The distance measures reported above are averaged over all prototypes. However, class separation differs between prototypes. For instance, Figure 7 shows the distributions of the best and worst prototypes for a specific user.

Recognition rates for the FV system, designed based on the learned local distance metric d_l, are reported in Tables 2 and 3 for the PUCPR and GPDS databases, respectively.
For the PUCPR database (see Table 2), it is clear that each step of the GLDM learning approach enhances the FV recognition performance, and this result is correlated with the distance separability investigations reported in Table 1. Also, applying the prototype selection method (variant 5) enhances the FV performance significantly. The recognition rates of this FV variant (where all GLDM processing steps are executed) are compared to those of state-of-the-art classical signature verification (SV) systems.

Results of the GLDM-based FV system
The first three SV systems are writer-independent (WI-SV), where a global (independent) development database is used to generate a global classifier. All of these systems employ an ensemble technique, such that the classification decision relies on multiple prototypes, and high-dimensional representations are employed. The other SV systems are writer-dependent (WD-SV), where a local (dependent) dataset is used to generate a local classifier per user. For these systems, various complicated techniques, such as the dynamic selection of classifiers and ensemble methods, are employed for enhanced recognition.
Generally, although the proposed FV implementation can be considered a very simple classifier (only a distance threshold) with concise feature representations (only 20 features), its recognition rates are comparable to those of more complex SV systems. For instance, compared to the state-of-the-art WI-SV system (system 3), where both the FV and SV systems rely on a single prototype for authentication, the proposed FV bio-cryptosystem shows similar accuracy while employing only 20 features instead of the 555 features used by the SV system [29]. Thus, applying the proposed GLDM learning approach maintains the performance while decreasing the representation complexity by about 96% (from 555 to only 20 features). Moreover, digitizing the feature values (they are represented as 8-bit words for bio-cryptographic encoding) has no impact on the recognition accuracy. Recently, the authors proposed a hybrid WI-WD SV system that tunes a global representation to specific users (see system 7). This system outperforms the aforementioned SV systems and provides similar insights about the effectiveness of applying a hybrid global-local training scheme.
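The digitization step mentioned above can be illustrated with a simple min-max quantizer into 8-bit words. The paper's exact encoding scheme is not specified in this section, so the following is only an assumed form:

```python
import numpy as np

def digitize_features(x, lo, hi, bits=8):
    # Assumed min-max quantization into fixed-width integer words;
    # the paper's actual bio-cryptographic encoding may differ.
    levels = 2 ** bits - 1
    scaled = np.clip((x - lo) / (hi - lo), 0.0, 1.0)
    return np.round(scaled * levels).astype(np.uint8)

feats = np.array([0.0, 0.25, 1.0])            # toy real-valued features
words = digitize_features(feats, lo=0.0, hi=1.0)
```

Each of the 20 features then occupies one 8-bit word, which is the granularity at which the FV encoding operates.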
For the GPDS database (Table 3), each step of the GLDM learning approach enhances the FV recognition performance, providing results similar to the PUCPR experimental results. Comparisons with state-of-the-art SV systems are promising as well. The first SV system employs a WI-SV classifier, where skilled forgery samples are used to train the global classifier. The following four SV systems employ WD-SV classifiers. The first three of these also use skilled forgery samples to train the classifiers, which might bias the performance evaluation, since such knowledge is not available when training practical SV systems, e.g., banking systems. On the other hand, the last WD-SV system uses no forgery knowledge for training; however, complicated classification rules, such as the dynamic selection of classifiers and ensemble methods, are employed for enhanced recognition. The hybrid WI-WD SV system outperforms both pure WI and WD systems, which leads to a similar conclusion as with the PUCPR database experiments and supports the hypothesis regarding the effectiveness of tuning global solutions to specific classes. The proposed FV system shows performance comparable to that of such complicated SV systems, while using no forgery samples for training, proving the effectiveness of the proposed GLDM learning approach and supporting the proposed modeling of FV systems as distance metric classifiers.

Conclusion
This paper presents an approach for learning distance metrics adapted to bio-cryptosystem design. The proposed approach produces global metrics that generalize well to unknown classes that are not used for training. In addition, these metrics can be further tuned to new classes. This property permits the design of global classification systems that are adaptable to specific classes. Moreover, the produced metrics rely on concise representations, in terms of the number of employed features and prototypes. This allows the design of systems that are limited in their computational complexity yet would otherwise rely on high-dimensional FRs, such as signature-based bio-cryptosystems. In addition, the modeling of representation variability and the selection of discriminant prototypes enhance the efficiency of the distance metrics.
The proposed Global-Local Distance Metric (GLDM) learning approach is applied to the design of a key-binding bio-cryptosystem based on the fuzzy vault (FV) scheme and handwritten signature images. To that end, the FV system functionality is formulated as a simple thresholding distance classifier. It is shown that such a simple classifier provides a level of accuracy as high as that of complex signature verification (SV) systems in the literature. The proposed approach can also be employed as an intermediate tool for designing traditional feature-based classifiers, where the produced distance metrics feed distance-based classifiers, e.g., KNN. Future work will investigate the power of the proposed approach in other applications (e.g., face recognition, video surveillance, image retrieval). Also, comparing the effectiveness of the produced metrics to that of other local metric learning methods in the literature is of great interest.
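As an illustration of feeding a learned metric into a distance-based classifier, the following sketch runs k-NN with a supplied metric callable. The weighted-L1 metric and all names here are hypothetical stand-ins for a learned local metric, not the paper's definition:

```python
import numpy as np

def knn_predict(query, prototypes, labels, metric, k=1):
    # Rank prototypes by the supplied distance metric and vote
    # among the k nearest neighbours.
    d = np.array([metric(query, p) for p in prototypes])
    nearest = np.argsort(d)[:k]
    vals, counts = np.unique(labels[nearest], return_counts=True)
    return int(vals[np.argmax(counts)])

# Hypothetical per-feature weights standing in for a tuned local metric.
w = np.array([1.0, 2.0])
metric = lambda a, b: float(np.sum(w * np.abs(a - b)))

protos = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = np.array([0, 1])
pred = knn_predict(np.array([0.9, 0.8]), protos, labels, metric)
```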