1. Introduction
Machine learning has made significant strides in various classification and recognition tasks, often surpassing human performance. For example, the current benchmark model achieves an exceptionally low error rate on the MNIST handwritten digit image dataset [1]. Although the field may appear to have overcome all challenges, these accomplishments are primarily confined to closed-set scenarios, where all classes are known during training. In real-world applications, however, new classes can appear at test time, requiring models to make informed rejections in open-set scenarios.
Our previous research tackled this fundamental challenge by introducing an effective Open-Set Recognition (OSR) methodology [2]. The core of our approach was generating synthetic samples from actual data instances to represent the unknown space. A key finding was that training the model to identify and reject these artificial samples significantly enhances its ability to reject genuine unknown samples during testing.
Our approach diverges from traditional methods by generating synthetic features within a hidden layer rather than entirely new inputs. This strategy not only improves accuracy but also reduces computational overhead. Because hidden-layer features have a simpler structure than raw inputs, the generative model that produces them can be simpler as well. Moreover, placing the synthetic samples in a hidden layer bypasses the initial model segments, saving substantial computational resources. Despite this departure, we still use Generative Adversarial Networks (GANs) [3], with refined and simplified generator and discriminator networks.
One application of OSR is authentication, because, besides recognizing known subjects, the model also needs to reject unknown subjects. The data serving as a base for classification is not necessarily in the form of images. Other types of data can also occur, for example, time series. We are unaware of any OSR model tailored for different data types besides images. Some existing methods appear to be easy to adapt for various kinds of data, but none have been applied to or tested on those. Hence, it is necessary to develop OSR models capable of processing time-series data.
Initially designed for image datasets using convolutional networks, our OSR model is highly adaptable to various data types. This adaptability is due to the model’s modular architecture, specifically the feature extraction module located just before the hidden layer, where synthetic samples are generated. Once the necessary features are extracted, the generative and feature-classifier components work seamlessly together.
In this work, we adapt this model to classify multi-channel time-series data, focusing on biometric signals. We aim to accurately identify users based on the vibrational patterns of their hands, which are captured by accelerometer and gyroscope sensors in mobile phones [4]. Our preliminary results, published in an earlier paper, demonstrate promising outcomes using one-dimensional convolutional networks for feature extraction. The model also retains its favorable time complexity, a significant advantage of its original design [5].
The focus of this work is not on advancing biometric methods per se, but rather on the broader machine learning challenge of Open-Set Recognition (OSR)—enabling classifiers to detect and reject inputs from previously unseen classes reliably. The biometric dataset serves as a case study to demonstrate the model’s adaptation to time-series data, but the methodology itself is domain-independent.
In this work, we conducted a comprehensive experiment, implementing and evaluating several feature extraction methods. We combined these methods into a single model, where the resulting feature vector is the concatenated output of the individual techniques. This approach enhances the model’s performance, albeit at the cost of additional computational resources due to the simultaneous execution of multiple feature-extraction models.
This paper is structured as follows: We begin with a literature review that provides the background needed to contextualize our work. Next, we present an overview of the original OSR model. Following this, we discuss in detail how our model was adapted to accommodate the new data type, including data preprocessing and feature extraction methodologies. Finally, we present our results, analyzing the model’s performance and cost-efficiency using various combinations of feature extraction methods.
2. Literature Review
In this section, we provide a succinct overview of the relevant literature. First, we introduce the concept of Open-Set Recognition, followed by an overview of the most relevant existing solutions. Then, we introduce our previous solution to the problem, which serves as the basis for the work described in this article. Lastly, we present the dataset utilized to evaluate our model’s performance.
2.1. Theory of Open-Set Recognition
Many algorithms have long been used to solve classification tasks where only some samples belong to any known class [6,7] or the model lacks sufficient confidence [8,9] in classifying a sample. These models have addressed problems similar to OSR, but without laying down the theoretical background for it. OSR itself was finally formalized by Scheirer et al. [10]:
Let $O$ denote the open space (i.e., the space far from any known data). The open space risk is defined as follows:

$$R_O(f) = \frac{\int_O f(x)\,dx}{\int_{S_o} f(x)\,dx},$$

where $S_o$ denotes the space containing both the positive training examples and the positively labeled open space, and $f$ is the recognition function, with $f(x) = 1$ if the sample $x$ is recognized as a known class, and $f(x) = 0$ otherwise. Intuitively, $R_O(f)$ measures the proportion of the function’s support that lies in open space, i.e., the extent to which the classifier incorrectly labels regions far from the training data as known.
Definition 1. Open-Set Recognition Problem: Let $V$ be the set of training samples, $R_O$ the open-space risk, and $R_\varepsilon$ the empirical risk (i.e., the closed-set classification risk, associated with misclassifications). Then, Open-Set Recognition is employed to find an $f \in H$, where $H$ is the hypothesis space of measurable recognition functions, such that $f(x) > 0$ indicates assignment to a known class, and $f$ minimizes the open-set risk:

$$\arg\min_{f \in H} \left\{ R_O(f) + \lambda_r R_\varepsilon(f(V)) \right\},$$

where $\lambda_r$ is a regularization parameter balancing open-space risk and empirical risk.

Definition 2. The openness of an Open-Set Recognition problem is defined as follows:

$$\text{openness} = 1 - \sqrt{\frac{2\,|C_{TR}|}{|C_{TA}| + |C_{TE}|}},$$

where $C_{TR}$, $C_{TA}$, and $C_{TE}$ denote the sets of training, target, and test classes, respectively.

2.2. Existing Approaches
After introducing the concept of Open-Set Recognition, Scheirer et al. [10] proposed the 1-vs-Set Machine as an initial solution. This specialized Support Vector Machine (SVM) model is designed to address open-set challenges. After training, the model incorporates a second hyperplane parallel to the first. Inputs classified between these hyperplanes are labeled as positive. The authors argue that by comparing the volume of a d-dimensional ball to the positively labeled slab inside it, the open-space risk of the model approaches zero as the ball’s radius increases. Even so, the positively labeled space remains unbounded.
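The slab decision rule described above can be illustrated with a minimal sketch. The weight vector and the two offsets below are hypothetical toy values chosen for illustration, not parameters from [10]; the example only shows that a slab between two parallel hyperplanes still labels points arbitrarily far from the origin as positive.

```python
def slab_label(x, w, lower, upper):
    """Label x positive if its projection onto w lies between the two
    parallel hyperplanes w.x = lower and w.x = upper."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return lower <= score <= upper


# Hypothetical 2D model: the slab is the vertical strip 0.5 <= x1 <= 2.0.
w = (1.0, 0.0)
lower, upper = 0.5, 2.0

inside = slab_label((1.0, 0.0), w, lower, upper)        # within the slab
outside = slab_label((5.0, 0.0), w, lower, upper)       # beyond the second plane
far_inside = slab_label((1.0, 1.0e6), w, lower, upper)  # far away, yet still positive
```

The third point lies a million units from the first one but remains inside the slab, which is the sense in which the positively labeled region is unbounded.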
In a subsequent study, ref. [11] applied the Radial Basis Function (RBF) kernel to the SVM model. They noted that the radial kernel function $K$ satisfies $K(x, x') \to 0$ as $d(x, x') \to \infty$, where $d(x, x')$ represents the distance between feature vectors $x$ and $x'$. The study identified a negative bias term as a necessary and sufficient condition for ensuring a bounded positively labeled open space. This was achieved by adding a regularization term for the bias in the objective function.
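The role of the bias term can be seen in a small numerical sketch. The support vectors, coefficients, and bias below are illustrative assumptions rather than values from [11]. Since every kernel term vanishes far from the data, the decision value there reduces to the bias, so a negative bias means that distant inputs are rejected.

```python
import math


def rbf_kernel(x, x_prime, gamma=1.0):
    """Radial kernel: depends only on the squared distance between x and x'."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-gamma * sq_dist)


def decision_value(x, support_vectors, coeffs, bias, gamma=1.0):
    """SVM decision function f(x) = sum_i coeff_i * K(x, sv_i) + bias."""
    kernel_sum = sum(
        c * rbf_kernel(x, sv, gamma) for c, sv in zip(coeffs, support_vectors)
    )
    return kernel_sum + bias


# Toy model (hypothetical values).
support_vectors = [(0.0, 0.0), (1.0, 0.0), (4.0, 4.0)]
coeffs = [1.0, 1.0, -1.0]  # signed dual coefficients
bias = -0.2                # negative bias term

near = decision_value((0.5, 0.0), support_vectors, coeffs, bias)
far = decision_value((100.0, 100.0), support_vectors, coeffs, bias)
```

Here `near` is positive, while `far` equals the bias up to floating-point precision and is therefore negative.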
Bendale et al. proposed OpenMax [12], a method that adapts a neural network classifier trained initially in a closed-set scenario using a SoftMax layer. OpenMax modifies this setup to allow for the rejection of open-set samples.
Traditional SVMs and SoftMax classifiers are initially tailored for closed-set scenarios, where all classes are known during training. To effectively reject open-set samples, these models must be adapted, which involves rethinking how the input or feature space is divided using hyperplanes or other methods. Such adaptations require fundamentally different approaches to improve the results.
In Open-Set Recognition, the likelihood that a sample belongs to a known class can be estimated through the distribution of its distances to training data in the feature space.
Distance-based classifiers naturally align with the open-set framework. Besides identifying the most similar class for a given input, they yield a similarity score that can be thresholded to decide whether the input should be assigned to that class or rejected as unknown.
Ref. [13] adapted the nearest neighbor method for open-set conditions. To classify a sample $s$, its nearest neighbor $t$ is located, followed by its nearest neighbor $u$ among the samples that belong to a different class than $t$. The ratio $R = d(s, t) / d(s, u)$ is then compared against a threshold $T$: if $R \le T$, the label of $t$ is assigned to $s$; otherwise, $s$ is rejected.
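The distance-ratio rule can be sketched in a few lines. This is a simplified illustration under our own assumptions (Euclidean distance, at least two classes in the training set, no duplicate points across classes, and a hypothetical threshold of 0.5), not the implementation of [13].

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nndr_classify(sample, train_points, train_labels, threshold):
    """Return the label of the nearest neighbor t, or None to reject the sample."""
    dists = [euclidean(sample, p) for p in train_points]
    # t: nearest training sample overall
    t_idx = min(range(len(dists)), key=dists.__getitem__)
    t_label = train_labels[t_idx]
    # u: nearest training sample whose class differs from that of t
    u_dist = min(d for d, lab in zip(dists, train_labels) if lab != t_label)
    ratio = dists[t_idx] / u_dist
    return t_label if ratio <= threshold else None


train_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
train_labels = ["a", "a", "b", "b"]

clear_case = nndr_classify((0.2, 0.1), train_points, train_labels, threshold=0.5)
ambiguous_case = nndr_classify((2.5, 2.8), train_points, train_labels, threshold=0.5)
```

The first sample lies close to class "a" and far from class "b", so the ratio is small and the label is accepted. The second sample is almost equidistant from both classes, so the ratio is close to one and the sample is rejected.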
Miller et al. [14] proposed an alternative that uses predefined class centroids instead of pairwise sample comparisons. Their method projects inputs into a logit space and evaluates Euclidean distances between the input logit and each class mean for classification.
A central difficulty in open-set learning is the lack of negative (unknown) samples during training, as these appear only at the time of testing. Introducing synthetic data to approximate the distribution of unknowns can help mitigate this limitation.
Generative Adversarial Networks (GANs), introduced by Goodfellow et al. [3], provide a way to produce realistic artificial data. A GAN is composed of two competing models: a generator that maps noise to candidate samples, and a discriminator that attempts to separate real from generated instances.
Kong and Ramanan [15] proposed an OSR-specific GAN framework that conditions the generator on feature vectors to synthesize class-related data, thereby improving rejection of unknowns.
Other works, such as those by Jo et al. and Ge et al. [16,17], also employed GAN-based augmentation strategies for Open-Set Recognition.
Generative modeling has also been applied successfully in biometric and physiological signal contexts, for tasks such as classification [18] and for data augmentation and validation [19,20].
Generating negative samples enhances the performance of an OSR model. However, producing them adds substantial computational overhead, which is a significant drawback.
Most OSR models are primarily designed for processing images, highlighting the need for algorithms working on time-series data. Among the pioneers in this field were Tornai et al. [21], who, following the extraction of statistical features from data collected via smartphones’ sensors, applied the model of [22] and the EVM model [23] to time-series data. Their work demonstrated the feasibility of OSR on time series, albeit with room for improvement in the results.
The latter solution was similar to that employed in this work in that it utilized OSR on biometric data for authentication purposes. As Maiorana et al. discussed, another source of such biometric data can be the user’s keystroke dynamics when typing various kinds of texts, e.g., PINs or passwords [24].
Wandji Piugie et al. [25] converted the time-series data of the HAR dataset [26] into images and processed the converted data with standard convolutional network-based models. We argue that images are not inherently better suited for efficient feature extraction than time series; rather, results on images are better because models working on them have received significantly more attention. Therefore, developing more time-series-optimized solutions remains a pressing need.
2.3. Previous Work
We previously implemented an OSR method that employs a distance-based approach, diverging from the traditional softmax-based structure to better accommodate open-set scenarios. In this method, training is simplified into a quadratic regression by using fixed class centers. To prepare the model for future unknown inputs, synthetic samples were generated within a hidden feature layer rather than in the input space. The neural network was divided into two segments: the first segment extracts features from samples, with its output serving as the layer where synthetic features are generated. This configuration enabled the training procedure described in Algorithm 1.
Initially, both segments of the model were pre-trained as a unified network. Subsequently, the outputs from the pre-trained first segment were recorded and used as real inputs for training the generative model. The genuine features, along with those produced by the generative model, were then used to train the second segment of the network further.
Figure 1 illustrates the overall model architecture.
| Algorithm 1 Procedure for training the model, which consists of a feature extractor, a classifier, a generator, a discriminator, and the fixed class centers Y. |
| Require: training samples, numbers of iterations |
| Ensure: trained networks and fixed class centers |
| 1: Initialize the networks with random parameters; initialize class centers |
| 2: for each pre-training iteration do | ▹ Train feature extractor and classifier |
| 3: for each batch do |
| 4: |
| 5: |
| 6: Update the feature extractor and the classifier with the gradient of the loss |
| 7: end for |
| 8: end for |
| 9: | ▹ Extract features |
| 10: | ▹ Train the generative model using the extracted features as real samples |
| 11: random noise |
| 12: | ▹ Generate synthetic features |
| 13: for each refinement iteration do | ▹ Refine classifier with generated samples |
| 14: for each batch j do |
| 15: |
| 16: |
| 17: |
| 18: |
| 19: Update the classifier with the gradient of the loss |
| 20: end for |
| 21: end for |
| 22: return the trained networks and the class centers |
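To make the distance-based idea more concrete, the following sketch shows a quadratic loss toward a fixed class center and a nearest-center decision with rejection. It is an illustration under our own assumptions: the one-hot-like centers, the squared Euclidean distance, and the rejection threshold are hypothetical choices, and the sketch does not reproduce the exact losses or decision rule of [2].

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quadratic_loss(output, center):
    """Regression-style loss pulling the classifier output toward its fixed class center."""
    return squared_distance(output, center)


def predict_open_set(output, centers, reject_threshold):
    """Return the label of the nearest class center, or None if even the
    nearest center is farther away than reject_threshold (assumed rule)."""
    dists = {label: squared_distance(output, c) for label, c in centers.items()}
    best = min(dists, key=dists.get)
    return best if dists[best] <= reject_threshold else None


# Hypothetical fixed class centers in a 3D output space.
centers = {"a": (1.0, 0.0, 0.0), "b": (0.0, 1.0, 0.0), "c": (0.0, 0.0, 1.0)}

known_like = predict_open_set((0.9, 0.1, 0.0), centers, reject_threshold=0.25)
unknown_like = predict_open_set((0.3, 0.3, 0.3), centers, reject_threshold=0.25)
loss_example = quadratic_loss((0.9, 0.1, 0.0), centers["a"])
```

An output near one center is assigned to that class, whereas an output that is far from every center is rejected as unknown.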
The model outperformed most competing methods on commonly used image datasets. For instance, on CIFAR10, both its open-set detection AUC and its closed-set accuracy were the highest among the evaluated OSR algorithms, with the closed-set accuracy trailing only behind that of highly optimized closed-set classifiers [2].
Like most OSR approaches, however, the original model was tailored for image data, using convolutional architectures optimized for spatial features. Direct application to time series or other modalities is not feasible due to fundamental differences in data structure. This limitation necessitated adapting the method to accommodate sequential data. Fortunately, the modular design of the model significantly simplifies this process: only the feature extraction component requires modification, while the remaining architecture—including the generative module and the distance-based classification mechanism—remains unchanged. This flexibility allows the core principles of the original method to be preserved across different data types.
2.4. Dataset
The primary motivation behind this study is to demonstrate the adaptability of our previously proposed OSR model to time-series data. To this end, we evaluate the method in a user identification setting based on hand-gesture signals, which serves as a biometric case study. A dedicated database tailored to this purpose is currently under development. The software that runs on smartphones and performs the measurements is complete; however, the necessary quantity of data has yet to be collected. Therefore, we conduct our tests on a publicly available dataset, that of Jiokeng et al., who devised a distinct biometric authentication system in a related effort. In their work, the basis of the classification is the subject’s heart signal, detected through the vibration of the hand and measured by a phone held in the hand. The data, when collected in this manner, exhibits a poor signal-to-noise ratio, with the meaningful component being very faint. Still, with extensive preprocessing efforts, the authors achieved high-accuracy results, although only in a closed-set scenario with 112 users [4,27].
The dataset contains measurements from 112 participants (93 male, 19 female), aged between 20 and 60 years. Each user contributed 10 measurement sessions of 30 s each, recorded with a Samsung Galaxy S8 smartphone. This resulted in a total of 1120 raw recordings. The dataset’s challenging properties—weak signals embedded in substantial noise—make it particularly well-suited for testing the robustness and adaptability of OSR approaches.
3. Adapting the Previously Proposed OSR Model to Time-Series Data
As shown in Figure 1 and described above, the model is composed of distinct components. The first part extracts suitable features for training the generative model. The second component classifies these features. Separately, a generative model produces negative inputs.
The latter two components operate on extracted features and are therefore independent of the input data type. As a result, adapting the model to time-series data requires modifying only the feature extraction component.
Three different feature extraction methods were implemented: a convolutional network with 1D convolutional layers reflecting the distinct nature of the data compared to images, predefined statistical features that do not need training, and a lightweight neural network operating in the frequency domain. Their combinations were also tested, since several extractors can run in parallel within the same model. The outputs of the individual methods are then concatenated to form the output of the model’s first part. Together with the individual methods, each with its own complexity and level of performance, this leaves ample room to calibrate the model toward either lower resource use or higher performance.
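The concatenation of extractor outputs can be sketched as follows. The two toy extractors (`channel_means` and `channel_energies`) and the helper `combine_extractors` are hypothetical stand-ins for the three methods named above and serve only to show how the combined feature vector is assembled.

```python
from typing import Callable, List, Sequence

Window = Sequence[Sequence[float]]           # channels x samples
Extractor = Callable[[Window], List[float]]  # maps a window to a feature vector


def channel_means(window: Window) -> List[float]:
    return [sum(ch) / len(ch) for ch in window]


def channel_energies(window: Window) -> List[float]:
    return [sum(v * v for v in ch) for ch in window]


def combine_extractors(extractors: Sequence[Extractor]) -> Extractor:
    """Build one extractor whose output is the concatenation of all outputs."""
    def combined(window: Window) -> List[float]:
        features: List[float] = []
        for extract in extractors:
            features.extend(extract(window))
        return features
    return combined


window = [[1.0, 2.0, 3.0], [0.0, 0.0, 2.0]]  # two channels, three samples
extract_all = combine_extractors([channel_means, channel_energies])
features = extract_all(window)
```

Because the later components only see the concatenated vector, any non-empty subset of extractors can be swapped in without changing the rest of the model; only the input size of the subsequent components changes.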
3.1. Convolutional Network
Convolutional Neural Networks (CNNs) have demonstrated exceptional performance in handling images due to their ability to process two-dimensional spatial data. When dealing with time-series data, which typically exists in a single dimension (not counting the channels of multivariate time series comparable to the channels of colored images), it is logical to utilize convolutional layers designed for one-dimensional data. Our research identified the architecture illustrated in Figure 2 as the most effective among the various configurations tested. This model comprises five stages, each doubling the number of channels from the previous stage. Each stage contains several 1D convolutional layers activated by the ReLU function, followed by a max pooling layer. The design of our network closely parallels the VGG19 network architecture [28]. Ultimately, the network generates a feature vector matching the size of those produced in our earlier image-based projects [2], ensuring consistency and compatibility with our existing methodologies. We have already implemented this method in [5]. In this work, the method is tested with slightly improved hyperparameters and compared to other methods.
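The five-stage layout can be illustrated by tracking how the channel count and the temporal length evolve through the stages. The numbers below are assumptions made only for this sketch: 64 channels in the first stage (borrowed from VGG-style networks), length-preserving convolutions, pooling with a factor of 2, and an input of 300 samples (1.5 s at 200 Hz). The actual configuration is the one shown in Figure 2.

```python
def stage_shapes(input_length=300, base_channels=64, num_stages=5, pool=2):
    """Return (channels, length) after each stage, assuming that convolutions
    keep the length and each max pooling layer divides it by `pool`."""
    shapes = []
    length, channels = input_length, base_channels
    for _ in range(num_stages):
        length //= pool                 # max pooling shortens the sequence
        shapes.append((channels, length))
        channels *= 2                   # next stage doubles the channel count
    return shapes


shapes = stage_shapes()
```

Under these assumptions, the temporal axis shrinks from 300 to 9 samples while the channel count grows from 64 to 1024, after which the result can be flattened or pooled into the final feature vector.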
3.2. Predefined Statistical Features
Jiokeng and colleagues [4] demonstrated promising results with a Support Vector Machine (SVM) model that leveraged predefined features. Their findings suggest that these features efficiently encapsulate the essential information needed for the task. Consequently, a similar approach has been adopted in this study, with notable advantages. By employing predefined features, we can bypass the initial phase of model training, requiring only a single extraction of features from each data element.
The selected features include statistical measures such as the mean, median, variance, and standard deviation, alongside other data aggregates like total signal energy, binned entropy, and permutation entropy. Additionally, ref. [4] employed Fourier transform, incorporating all Fourier coefficients and the complete set of auto-correlation and cross-correlation values between signal axes. However, incorporating all these coefficients resulted in their overwhelming dominance in the feature vector, effectively overshadowing the other aggregate features. Our final feature vector excluded these coefficients to maintain a balanced and representative feature set. The exhaustive list of features, covering statistical, entropy-based, and frequency-related descriptors, is provided in Appendix A.
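A few of the aggregates named above can be computed with the standard library alone, as in the sketch below. The parameter choices (population variance, 10 bins for the binned entropy, order 3 for the permutation entropy, natural logarithm) are our assumptions for illustration; the definitions actually used are those listed in Appendix A.

```python
import math
import statistics
from collections import Counter


def binned_entropy(signal, num_bins=10):
    """Shannon entropy of the histogram of the signal values."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return 0.0
    width = (hi - lo) / num_bins
    counts = Counter(min(int((v - lo) / width), num_bins - 1) for v in signal)
    n = len(signal)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def permutation_entropy(signal, order=3):
    """Shannon entropy of the ordinal patterns of `order` consecutive samples."""
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: signal[i + k]))
        for i in range(len(signal) - order + 1)
    )
    total = sum(patterns.values())
    return -sum((c / total) * math.log(c / total) for c in patterns.values())


def channel_features(signal):
    """Aggregate features of one channel of one window."""
    return {
        "mean": statistics.fmean(signal),
        "median": statistics.median(signal),
        "variance": statistics.pvariance(signal),
        "std": statistics.pstdev(signal),
        "energy": sum(v * v for v in signal),
        "binned_entropy": binned_entropy(signal),
        "permutation_entropy": permutation_entropy(signal),
    }


feats = channel_features([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```

For a six-channel window, such a function would be applied to each channel and the results concatenated, so the feature vector has a fixed size that does not depend on any training.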
3.3. Neural Network in Frequency Domain
In the frequency domain, the data can reveal significant insights that might not be immediately apparent. A notable benefit of this representation is that it does not necessitate using complex Convolutional Neural Networks (CNNs). Instead, more straightforward and cost-effective fully connected networks (FCNs) can be utilized. This is because translational invariance—a property where a slight shift in an image or time series does not change the input’s essence—does not apply in the frequency domain. Each value in a specific position retains its unique significance regardless of any shift. Consequently, we employed a relatively small, Fully Connected Network to extract features from the Fast Fourier Transform (FFT) values effectively. Specifically, the FCN consists of three hidden layers of 512 neurons each, with ReLU activations, followed by a linear output layer of 256 neurons. The output of this final layer serves as the feature vector that is passed on to the OSR model.
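The size of such a network can be estimated with a short calculation. The hidden and output layer widths below follow the description above, while the input size is an assumption made for this sketch: six channels, 151 non-redundant FFT bins for a real 300-sample window, and two values (real and imaginary part) per bin.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    )


# Assumed input size: 6 channels x 151 frequency bins x (real, imaginary).
assumed_input_size = 6 * 151 * 2

# Three hidden layers of 512 neurons and a linear output layer of 256 neurons.
fcn_layers = [assumed_input_size, 512, 512, 512, 256]
num_params = count_parameters(fcn_layers)
```

Under this assumption the network has roughly 1.6 million parameters, most of them in the first layer, so its cost is dominated by the number of frequency coefficients fed into it.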
Recurrent networks such as LSTMs [29] could also be considered for feature extraction from sequential data. They are particularly advantageous for modeling long-term dependencies or handling variable-length sequences. However, in our case, each input window is of fixed length and relatively short, where convolutional or fully connected approaches can already capture the relevant temporal or spectral patterns. LSTMs would therefore introduce additional computational cost without a clear performance benefit. For this reason, we reserve their exploration for future work.
3.4. Preprocessing
Each acquisition session was stored in a single file containing measurements from multiple sensors. For our experiments, only the accelerometer and gyroscope data were retained, as these form the basis of the classification task. The raw signals from these two sensors were sampled at irregular timestamps and with non-uniform sampling rates, which required resampling and synchronization. Following the procedure in [4], the data were resampled to a uniform frequency of 200 Hz, aligning the accelerometer and gyroscope streams. Since both sensors operate on three axes, the result was a synchronized six-channel time series.
To focus on the relevant signal components, a fourth-order Butterworth bandpass filter was applied with cutoff frequencies of 0.5 Hz and 30 Hz. The filtered signals were then segmented into windows of a 1.5 s duration with a stride of 0.05 s, which produces a large number of overlapping segments while ensuring sufficient variability for training. Importantly, the division into training and test sets was performed at the session level: two out of ten sessions per subject (20%) were assigned to the test set, and all segments from those sessions were included there. This ensures that overlapping windows never span across training and test sets, preventing information leakage and maintaining strict separation between the two.
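The segmentation and the session-level split can be sketched as follows. The sketch assumes a full 30 s session at 200 Hz with no samples lost during filtering, and the helper names and the random seed are hypothetical; the Butterworth filtering itself is omitted because it requires a signal-processing library.

```python
import random

SAMPLE_RATE_HZ = 200


def window_starts(num_samples, window_len, stride):
    """Start indices of all complete windows in a recording."""
    return list(range(0, num_samples - window_len + 1, stride))


def split_sessions(session_ids, num_test=2, seed=0):
    """Assign whole sessions to the test set so that no window overlaps both sets."""
    rng = random.Random(seed)
    test = set(rng.sample(sorted(session_ids), num_test))
    train = [s for s in session_ids if s not in test]
    return train, sorted(test)


window_len = round(1.5 * SAMPLE_RATE_HZ)   # 300 samples per window
stride = round(0.05 * SAMPLE_RATE_HZ)      # 10 samples between window starts
starts = window_starts(30 * SAMPLE_RATE_HZ, window_len, stride)

train_sessions, test_sessions = split_sessions(list(range(10)))
```

Under these assumptions, a single session yields 571 heavily overlapping windows. Consecutive windows share 290 of their 300 samples, which is why the split has to be made between sessions rather than between windows.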
For each segment, three different representations were produced and stored together for unified access: the raw six-channel signal (for the CNN-based extractor), the Fourier transform (for the frequency-domain FCN), and the predefined statistical and entropy-based features (for the handcrafted extractor). The Fourier transform values were processed as real and imaginary coefficients, while the handcrafted features are detailed in Appendix A. All representations were normalized to zero mean and unit variance: per channel for the raw and FFT data, and per feature element for the predefined features.
4. Experimental Results
The model has been extensively tested on the dataset described in Section 2.4. We are unaware of any OSR experiments conducted on this dataset by others. Thus, only the different feature extraction methods and their combinations can be compared regarding the open-set detection performance. The closed-set performance (accuracy) can still be compared to that presented in [4]. The relevant hardware specifications were as follows:
For each experiment, the set of classes was randomly divided into known and unknown groups. The training subset of the known classes was used for model training, while the corresponding test subset was used for evaluation. For the unknown classes, only the test samples were included, serving as negative examples during evaluation. The training subsets of the unknown classes were not used at all. This procedure was repeated independently in each run, ensuring that the unknown classes represented truly unseen categories at the time of testing.
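The random class split can be sketched as below, together with the openness measure of Definition 2, i.e., one minus the square root of twice the number of training classes divided by the sum of the numbers of target and test classes. Mapping the experimental setup onto that measure is our assumption: the known classes are taken as both training and target classes, and all 112 classes are taken as test classes. The function names and the seed are hypothetical.

```python
import math
import random


def split_classes(class_ids, num_known, seed):
    """Randomly divide the classes into known and unknown groups."""
    rng = random.Random(seed)
    shuffled = list(class_ids)
    rng.shuffle(shuffled)
    return sorted(shuffled[:num_known]), sorted(shuffled[num_known:])


def openness(n_train, n_target, n_test):
    """Openness = 1 - sqrt(2 * |C_TR| / (|C_TA| + |C_TE|))."""
    return 1.0 - math.sqrt(2.0 * n_train / (n_target + n_test))


known, unknown = split_classes(range(112), num_known=10, seed=0)
openness_10_known = openness(len(known), len(known), len(known) + len(unknown))
closed_set_openness = openness(10, 10, 10)
```

With 10 known classes out of 112, the openness under this mapping is about 0.6, whereas a closed-set problem, in which the three class sets coincide, has an openness of exactly zero.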
4.1. Evaluation Metrics
A suitable and commonly used metric for evaluating OSR methods is the F1 measure, which is capable of capturing both closed-set and open-set performance to some extent. However, this metric is highly sensitive to the model’s calibration, for example, to the confidence threshold above which a sample is considered to belong to a known class. Hence, we instead chose the three metrics described below.
Closed-set accuracy: This metric considers only the results for the positive (known) samples. It measures the percentage of correctly classified samples. The model’s prediction is the class with the highest probability, regardless of whether this probability is above a threshold or whether the sample would otherwise be classified as unknown. Since all classes in our dataset contribute an equal number of samples, overall accuracy is an appropriate and unbiased measure of closed-set performance. In the OSR setting, ensuring that known samples are not misclassified into other known classes is as important as detecting unknowns. Reporting closed-set accuracy therefore complements the open-set detection metrics by indicating how well the model preserves discrimination among the enrolled classes.
AUC: The receiver operating characteristic (ROC) curve is obtained by plotting the true positive rate (sensitivity) against the false positive rate (1 − specificity) at each relevant threshold setting. The area under this curve provides a calibration-free measure of open-set detection performance. While calculating it, the known samples are considered positive, regardless of their class, and the unknown samples are considered negative [30].
EER: The Equal Error Rate is the point on the ROC curve where the false acceptance rate (FAR) equals the false rejection rate (FRR). Like the AUC, it is derived from the ROC, but unlike the AUC, it corresponds to a specific operating point. While the AUC is the most widely used metric in OSR research, the EER is a standard measure in biometrics. Including it therefore allows us to position our results in the broader context of biometric evaluation.
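The three metrics can be computed as in the following sketch. It assumes that each test sample receives a scalar "knownness" score, with higher values meaning more likely known; the function names and the toy scores are hypothetical. The AUC is computed through its rank interpretation, and the EER is approximated at the candidate threshold where the two error rates are closest.

```python
def closed_set_accuracy(predicted, actual):
    """Share of known samples assigned to their correct known class."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def roc_auc(known_scores, unknown_scores):
    """Probability that a random known sample scores higher than a random
    unknown one (ties count as one half)."""
    wins = 0.0
    for k in known_scores:
        for u in unknown_scores:
            if k > u:
                wins += 1.0
            elif k == u:
                wins += 0.5
    return wins / (len(known_scores) * len(unknown_scores))


def equal_error_rate(known_scores, unknown_scores):
    """Error rate at the threshold where FAR and FRR are closest to each other."""
    best_gap, eer = None, None
    for thr in sorted(set(known_scores) | set(unknown_scores)):
        far = sum(u >= thr for u in unknown_scores) / len(unknown_scores)
        frr = sum(k < thr for k in known_scores) / len(known_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


known_scores = [0.9, 0.8, 0.7, 0.4]
unknown_scores = [0.6, 0.3, 0.2, 0.1]
auc = roc_auc(known_scores, unknown_scores)
eer = equal_error_rate(known_scores, unknown_scores)
accuracy = closed_set_accuracy(["a", "b", "a"], ["a", "b", "b"])
```

In this toy example, one known sample scores below one unknown sample, so 15 of the 16 known-unknown pairs are ordered correctly and one sample on each side is misjudged at the balanced threshold.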
The AUC provides a threshold-independent assessment of open-set detection capability, making it particularly suitable for comparing methods across different datasets and configurations. In practical biometric deployments, however, operating thresholds must be chosen to balance the rates of false acceptances and false rejections. This calibration can, for example, be performed by cross-class validation, as demonstrated in our previous work [2], where one testing scenario followed a protocol for outlier detection using the F1 measure. In that setup, the threshold was determined via cross-class validation under the assumption that both types of errors (misclassifying unknowns and rejecting known-class samples) carry equal weight.
4.2. Results
The model was evaluated with the number of known classes ranging from 10 to 60, using all non-empty subsets of the three feature extraction strategies introduced earlier. Table 1 reports the open-set detection performance. Among the single-method setups, the 1D convolutional network consistently delivers the strongest results, closely followed by predefined statistical features. While the frequency-domain model is not competitive as a standalone solution, it provides a measurable boost when combined with other methods. It therefore remains a valuable option in scenarios where computational resources are plentiful and performance is the top priority. In fact, incorporating multiple feature extraction methods generally leads to clear performance improvements. The number of known classes does not noticeably degrade performance, and the low standard deviation values confirm the robustness of the approach with respect to class selection. To complement the AUC results, Table 2 presents the corresponding EER values. Including the EER allows our results to be related more directly to the biometrics literature, where it is a widely used measure. As expected, the EER values are strongly and almost linearly correlated with the AUC, so they convey largely the same information about model performance. Nonetheless, their inclusion facilitates comparison with prior biometrics studies.
The closed-set accuracy results (Table 3) remain high across all configurations. Although marginally below the upper bound reported in the original HandBCG study [4] in a purely closed-set scenario (98.27% to 99%), our approach maintains strong accuracy while introducing the capability to reject unknown classes, an essential open-set property. The ranking of the methods remains consistent with the AUC results, although the differences in accuracy are smaller.
Table 4 highlights the computational trade-offs between the examined methods during the first training phase. As expected, the convolutional network is the most resource-intensive option. However, the additional cost translates to a solid performance advantage. At the other end of the spectrum, predefined features entirely eliminate the first phase of training. Their extraction cost is negligible (performed once), and the subsequent model is extremely lightweight, making this setup ideal for constrained environments such as embedded systems. Frequency-domain features also offer a resource-efficient alternative, requiring significantly less computation than convolutional networks—though this comes at the price of reduced performance. In all cases, the flexibility of combining feature sets allows practitioners to balance performance and efficiency according to deployment constraints.
Overall, these results demonstrate a favorable performance–cost trade-off across all configurations, underlining the scalability of the proposed approach. A lightweight yet effective model can be built with predefined features for low-resource settings, while maximum accuracy can be achieved through convolutional networks or their combinations with other methods.
A major advantage of our design lies in generating samples within the hidden layer. This strategy almost eliminates the overhead of producing additional training data while retaining the performance benefits of augmentation. The hidden layer’s simpler structure requires only a lightweight generative model, avoiding the need to propagate generated samples through the entire network. Previous work [2] confirmed this as a key efficiency advantage, and our experiments reinforce this observation. On the time-series dataset, the generative model trains in only 5.2 ms per batch, comparable to the image-based results due to identical feature vector sizes. Notably, there are no established GAN architectures specifically designed for time-series data, which makes input-level sample generation not only computationally expensive but also practically challenging. Our approach sidesteps this limitation by working in the feature space, eliminating the need to design or train a specialized generative model for raw sequences. Furthermore, skipping the heavy first stage of the model provides substantial runtime savings: for example, when using the convolutional network as a feature extractor, 91% of the cost of processing a generated batch is eliminated (2.1 ms versus 0.23 ms). This efficiency, combined with strong open-set performance, makes the method not only practical but also uniquely suited for domains where data-level generative models are less feasible.
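The data flow described above can be sketched as follows. This is a structural illustration only, not the authors' implementation: the layer sizes, the random untrained weights, the single extra "unknown" output, and all names are assumptions. The point it shows is that real segments pass through both stages, whereas generated feature vectors enter at the hidden layer and need only the second stage.

```python
import numpy as np

rng = np.random.default_rng(0)

SEGMENT_LEN = 128   # hypothetical input length
FEATURE_DIM = 16    # hypothetical hidden-layer width
NOISE_DIM = 8       # hypothetical generator input size
N_CLASSES = 5       # hypothetical number of known classes

# Random, untrained stand-in weights; a real model would learn these.
W_extract = rng.normal(size=(SEGMENT_LEN, FEATURE_DIM))
W_head = rng.normal(size=(FEATURE_DIM, N_CLASSES + 1))  # +1: assumed "unknown" output
W_gen = rng.normal(size=(NOISE_DIM, FEATURE_DIM))


def extract(x):
    """Heavy first stage (stand-in for the convolutional feature extractor)."""
    return np.tanh(x @ W_extract)


def head(h):
    """Lightweight second stage operating on hidden-layer features."""
    return h @ W_head


def generate_features(n):
    """Generator whose output lives directly in the hidden feature space."""
    z = rng.normal(size=(n, NOISE_DIM))
    return np.tanh(z @ W_gen)


def fraction_saved(full_ms, head_only_ms):
    """Share of per-batch cost avoided by skipping the first stage."""
    return 1.0 - head_only_ms / full_ms


real_batch = rng.normal(size=(32, SEGMENT_LEN))
real_logits = head(extract(real_batch))       # real data: both stages
fake_logits = head(generate_features(32))     # generated data: second stage only

saving = fraction_saved(2.1, 0.23)            # timings quoted in the text
print(real_logits.shape, fake_logits.shape, f"{saving:.1%}")
```

With the rounded timings quoted in the text, `fraction_saved` evaluates to about 89%, close to the reported 91%; the reported figure was presumably computed from unrounded measurements.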
5. Conclusions
Our exploration of Open-Set Recognition (OSR) in time-series data has yielded several insights. OSR extends traditional classification by allowing a model to reject samples from classes that were not seen during training, which makes it particularly relevant in real-world applications. One of the most pressing areas of application is authentication, which serves as the use case for our research.
Authentication systems must not only recognize legitimate users but also effectively identify unknown or unauthorized individuals. This is where OSR becomes indispensable, acting as a robust defense against potential security breaches and unauthorized access. Our research highlights the crucial role OSR can play in enhancing the security and reliability of authentication systems, emphasizing the urgency of integrating OSR into practical applications.
Additionally, our study identifies a significant gap in current research: the need for OSR methods specifically designed for time-series data. Our model demonstrates that it is feasible to adapt OSR techniques to time-series datasets, suggesting that this area will likely attract increasing research attention in the future.
In our experiments, we evaluated various feature extraction methods for OSR. The 1D convolutional network emerged as the most effective, although it incurred the highest computational cost. Predefined features offered a more lightweight alternative, delivering performance only slightly inferior to that of the convolutional network. Combining multiple methods further improved performance but also increased computational demands. This adaptability enables our approach to be tailored to diverse scenarios: users can either prioritize accuracy or, in constrained environments such as embedded systems, optimize the model for computational efficiency while maintaining satisfactory performance.
Looking ahead, several promising research avenues emerge. First, further evaluation on additional biometric modalities, such as voice, ECG, or gait, could demonstrate the generality of our OSR framework beyond the current case study. Sequential architectures like LSTMs also represent a natural extension, as they can explicitly model long-term temporal dependencies; while unnecessary for the fixed-length short segments considered here, they could become valuable in other settings. Similarly, multimodal or lightweight sensor-fusion approaches may increase robustness, especially in mobile or embedded applications. A further direction is the application of the adapted OSR framework to continuous monitoring scenarios, where the long-term non-stationarity of signals would have to be addressed.
Another direction concerns the variability of real-world conditions. While the present dataset already contains faint signals embedded in significant noise, making it a challenging benchmark, future work could explore robustness across different acquisition environments and sensor qualities. Adaptation strategies such as transfer learning or online learning may also help maintain performance under varying signal conditions or during long-term monitoring. Finally, threshold optimization and calibration strategies tailored to open-set detection should be studied more explicitly to bridge the gap between research evaluation and deployment in real authentication systems.
In parallel with these extensions, we are developing a new biometric database in which participants are classified based on hand gestures recorded while picking up their phones, using measurements from the device’s sensors. This aims to support a continuous authentication system that operates passively, without requiring any explicit action from the user. Future research may also study how time-dependent changes in the signal—caused by shifts in user behavior or health conditions—affect classification performance.
In summary, our findings underscore the transformative potential of OSR in enhancing authentication systems, particularly when applied to time-series data. As we continue to refine these techniques, OSR is poised to become an increasingly vital component in developing secure and adaptable systems.