MixCFormer: A CNN–Transformer Hybrid with Mixup Augmentation for Enhanced Finger Vein Attack Detection
Abstract
1. Introduction
1.1. Related Work
1.2. Motivation
1.3. Our Work
- MixCFormer Architecture: We propose MixCFormer, a convolutional–transformer hybrid architecture with residual connections that combines the local feature extraction of CNNs with the global context modeling of transformers. The CNN branch captures local vein texture features, while the transformer branch integrates global information and models long-range dependencies. The residual connections make feature transfer more efficient and stabilize the feature representation. Together, these components allow MixCFormer to achieve higher accuracy and robustness in finger vein liveness detection.
- Mixup Data Augmentation: We apply the Mixup data augmentation technique to improve the model’s generalization, reduce its reliance on large-scale real datasets, and raise recognition accuracy on forged samples. We also construct a new dataset that contains real live finger vein data and three types of attack samples (two live attacks and one non-live attack). The dataset increases the diversity of the training samples and provides a basis for validating the model against multiple attack scenarios.
- Feature Sequence Processing: We convert finger vein video data into feature sequences for more efficient processing. The sequences capture dynamically changing temporal information, which improves feature extraction and matching and increases the discriminative power between live and forged vein samples. As a result, the model’s real-time performance and recognition speed improve.
- Noise and Light Variation Suppression Techniques: To our knowledge, this is the first work to combine baseline drift cancellation, morphological filtering, and Butterworth filtering to mitigate the impact of noise and light variation on finger vein liveness detection. Baseline drift cancellation removes low-frequency noise, morphological filtering cleans up image structure and accentuates vein features, and Butterworth filtering suppresses high-frequency noise. Combined, the three steps make the model more robust, so that it maintains its detection performance under complex lighting conditions and in noisy environments.
- Experimental Validation and Performance Enhancement: In our experiments on the finger vein dataset, MixCFormer reaches 99.50% test accuracy and 99.51% precision, ahead of every compared method (the closest, MixCNN, reaches 97.50% accuracy and 97.53% precision). These results support the effectiveness of the proposed architecture and its potential for broader application in finger vein liveness detection.
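As a rough illustration of the Mixup idea named in the contributions above, the following minimal sketch blends two labelled feature sequences with a weight drawn from a Beta distribution. The function name `mixup_pair`, the value `alpha=0.2`, the two-class one-hot labels, and the toy sequences are assumptions made for this example; they are not taken from the paper's implementation.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two (sequence, one-hot label) pairs with lam ~ Beta(alpha, alpha).

    alpha=0.2 is an illustrative default, not a value reported in the paper.
    """
    lam = rng.betavariate(alpha, alpha)
    # The same convex combination is applied to the inputs and to the labels.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Toy usage: one "live" and one "attack" sequence (made-up numbers).
rng = random.Random(0)
live_seq, live_label = [0.10, 0.40, 0.35, 0.20], [1.0, 0.0]
attack_seq, attack_label = [0.30, 0.30, 0.30, 0.30], [0.0, 1.0]
x_mix, y_mix, lam = mixup_pair(live_seq, live_label, attack_seq, attack_label, rng=rng)
```

Because the label is mixed with the same weight as the input, the mixed label still sums to one and can be used directly with a soft-label cross-entropy loss.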
2. The Proposed Approach
2.1. MixCFormer Model
2.2. Data Acquisition and Processing
2.2.1. Acquisition of Attack Data
- (1) Acquisition of Real Human Vein Data: Each subject’s heart rate was first measured with a finger-clip pulse oximeter to confirm liveness. The finger vein acquisition device then captured video of the veins in the index, middle, and ring fingers of both hands, so that all six fingers were recorded. This procedure was repeated six times per subject. The heart rate was re-measured before each acquisition, ensuring that all six sets of video samples came from live subjects.
- (2) Acquisition of Heart-Rate-Based Attack Data: To increase the diversity of attack samples, two types of heart-rate-based attack data were designed:
  - Attack Type I: The subject wore thin gloves with disturbance patterns (Figure 4a), simulating surface disturbances over the finger veins. The data were collected in the same way as the real human vein data, with the same procedure applied to all six fingers.
  - Attack Type II: The subject wore thick gloves (Figure 4b) with disturbance patterns drawn on the glove surfaces, which interferes further with the detection algorithm. The acquisition method was the same as for the real vein data.
- (3) Acquisition of Heart-Rate-Free Attack Data: Prosthetic fingers made from colored clay (Figure 4c) were used to simulate attacks without a heart rate. The prostheses were modeled to resemble the index, middle, and ring fingers of both hands, and were recorded with the same video acquisition method to create the heart-rate-free attack samples.
2.2.2. Generating the Sequence Signal
- (1) Parameter Setting and Filter Design
- (2) Vein Signal Filtering
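The three filtering steps named in the contributions (baseline drift cancellation, morphological filtering, Butterworth filtering) can be sketched on a one-dimensional sequence as follows. This is only an illustrative sketch under stated assumptions: the moving-average drift estimate, the flat structuring element, the second-order low-pass design, and all parameter values (30 fps frame rate, 3 Hz cutoff, window sizes) are placeholders chosen for the example, not the settings used in the paper.

```python
import math


def remove_baseline(signal, window=31):
    """Subtract a centered moving average, used here as the drift estimate."""
    half, n = window // 2, len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(signal[i] - sum(signal[lo:hi]) / (hi - lo))
    return out


def _sliding(signal, size, pick):
    """Apply min or max over a centered window (truncated at the edges)."""
    half, n = size // 2, len(signal)
    return [pick(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def morph_open_close(signal, size=3):
    """Opening followed by closing with a flat structuring element."""
    opened = _sliding(_sliding(signal, size, min), size, max)   # erode, then dilate
    return _sliding(_sliding(opened, size, max), size, min)     # dilate, then erode


def butter_lowpass2(signal, cutoff_hz, fs_hz):
    """Second-order Butterworth low-pass (bilinear transform), one forward pass."""
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b1, b2 = 2.0 * b0, b0
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm
    out, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for x in signal:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1, y2, y1 = x1, x, y1, y
    return out


def preprocess(signal, fs_hz=30.0, cutoff_hz=3.0, drift_window=31, morph_size=3):
    """Chain the three steps in the order the text lists them (assumed values)."""
    detrended = remove_baseline(signal, drift_window)
    smoothed = morph_open_close(detrended, morph_size)
    return butter_lowpass2(smoothed, cutoff_hz, fs_hz)


# Toy usage: slow ramp (drift) + 1.2 Hz pulse-like component + one spike.
fs = 30.0
raw = [0.02 * t + math.sin(2 * math.pi * 1.2 * t / fs) for t in range(300)]
raw[150] += 4.0
clean = preprocess(raw, fs_hz=fs)
```

The single forward pass of the low-pass filter introduces a phase delay. If zero phase matters, the sequence can be filtered forward and then backward, which doubles the effective filter order.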
2.3. Mixup Data Augmentation
2.4. CNN–Transformer Hybrid Model
2.4.1. CNN Feature Extraction
2.4.2. Transformer Coding Module
2.4.3. Fully Connected Network
2.5. Model Training and Optimization
3. Experimental Results
3.1. Dataset Description
3.2. Evaluation Metrics
3.3. Comparison and Analysis
3.3.1. Comparison Experiment
3.3.2. Effect of Noise Enhancement on Model Robustness
3.4. Ablation Experiment
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Jain, A.K.; Kumar, A. Biometrics of next generation: An overview. Second. Gener. Biom. 2010, 12, 2–3.
- Zhang, L.; Li, W.; Ning, X.; Sun, L.; Dong, X. A local descriptor with physiological characteristic for finger vein recognition. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 4873–4878.
- Shaheed, K.; Liu, H.; Yang, G.; Qureshi, I.; Gou, J.; Yin, Y. A systematic review of finger vein recognition techniques. Information 2018, 9, 213.
- Chugh, T.; Cao, K.; Jain, A.K. Fingerprint spoof buster: Use of minutiae-centered patches. IEEE Trans. Inf. Forensics Secur. 2018, 13, 2190–2202.
- Opanasenko, V.M.; Fazilov, S.K.; Mirzaev, O.N.; Sa’dullo ugli Kakharov, S. An Ensemble Approach To Face Recognition In Access Control Systems. J. Mob. Multimed. 2024, 20, 749–768.
- Shen, J.; Liu, N.; Xu, C.; Sun, H.; Xiao, Y.; Li, D.; Zhang, Y. Finger vein recognition algorithm based on lightweight deep convolutional neural network. IEEE Trans. Instrum. Meas. 2021, 71, 1–13.
- Nguyen, K.; Proença, H.; Alonso-Fernandez, F. Deep learning for iris recognition: A survey. ACM Comput. Surv. 2024, 56, 1–35.
- Cola, G.; Avvenuti, M.; Musso, F.; Vecchio, A. Gait-based authentication using a wrist-worn device. In Proceedings of the 13th International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, New York, NY, USA, 28 November 2016; pp. 208–217.
- Qin, H.; Zhu, H.; Jin, X.; Song, Q.; El-Yacoubi, M.A.; Gao, X. EmMixformer: Mix transformer for eye movement recognition. arXiv 2024, arXiv:2401.04956.
- Poddar, J.; Parikh, V.; Bharti, S.K. Offline signature recognition and forgery detection using deep learning. Procedia Comput. Sci. 2020, 170, 610–617.
- Xie, J.; Zhao, Y.; Zhu, D.; Yan, J.; Li, J.; Qiao, M.; He, G.; Deng, S. A machine learning-combined flexible sensor for tactile detection and voice recognition. ACS Appl. Mater. Interfaces 2023, 15, 12551–12559.
- Hou, B.; Zhang, H.; Yan, R. Finger-vein biometric recognition: A review. IEEE Trans. Instrum. Meas. 2022, 71, 1–26.
- Mathur, L.; Matarić, M.J. Introducing representations of facial affect in automated multimodal deception detection. In Proceedings of the 2020 International Conference on Multimodal Interaction, New York, NY, USA, 22 October 2020; pp. 305–314.
- Hsia, C.-H.; Yang, Z.-H.; Wang, H.-J.; Lai, K.-K. A new enhancement edge detection of finger-vein identification for carputer system. Appl. Sci. 2022, 12, 10127.
- Godoy, R.I.U.; Panzo, E.G.V.; Cruz, J.C.D. Vein Location and Feature Detection using Image Analysis. In Proceedings of the 2021 5th International Conference on Electrical, Telecommunication and Computer Engineering (ELTICOM), Medan, Indonesia, 15–16 September 2021; pp. 33–37.
- Khellat-Kihel, S.; Cardoso, N.; Monteiro, J.; Benyettou, M. Finger vein recognition using Gabor filter and support vector machine. In Proceedings of the International Image Processing, Applications and Systems Conference, Sfax, Tunisia, 5–7 November 2014; pp. 1–6.
- Park, K.R. Finger vein recognition by combining global and local features based on SVM. Comput. Inform. 2011, 30, 295–309.
- Krishnan, A.; Thomas, T.; Mishra, D. Finger vein pulsation-based biometric recognition. IEEE Trans. Inf. Forensics Secur. 2021, 16, 5034–5044.
- Crisan, S.; Tebrean, B. Low cost, high quality vein pattern recognition device with liveness Detection. Workflow and implementations. Measurement 2017, 108, 207–216.
- Das, R.; Piciucco, E.; Maiorana, E.; Campisi, P. Convolutional neural network for finger-vein-based biometric identification. IEEE Trans. Inf. Forensics Secur. 2018, 14, 360–373.
- Qin, H.; Wang, P. Finger-vein verification based on LSTM recurrent neural networks. Appl. Sci. 2019, 9, 1687.
- Qin, H.; Gong, C.; Li, Y.; El-Yacoubi, M.A.; Gao, X.; Wang, J. Attention Label Learning to Enhance Interactive Vein Transformer for Palm-Vein Recognition. IEEE Trans. Biom. Behav. Identity Sci. 2024, 6, 341–351.
- Tyagi, S.; Chawla, B.; Jain, R.; Srivastava, S. Multimodal biometric system using deep learning based on face and finger vein fusion. J. Intell. Fuzzy Syst. 2022, 42, 943–955.
- Liu, W.; Lu, H.; Wang, Y.; Li, Y.; Qu, Z.; Li, Y. Mmran: A novel model for finger vein recognition based on a residual attention mechanism. Appl. Intell. 2023, 53, 3273–3290.
- Wang, Y.; Wu, W.; Yao, J.; Li, D. A Palm Vein Recognition Method Based on LSTM-CNN. In Proceedings of the 2023 IEEE 5th International Conference on Civil Aviation Safety and Information Technology (ICCASIT), Dali, China, 11–13 October 2023; pp. 1027–1030.
- Abbas, T. Finger Vein Recognition with Hybrid Deep Learning Approach. J. La Multiapp 2023, 4, 23–33.
- Li, X.; Zhang, B.-B. FV-ViT: Vision transformer for finger vein recognition. IEEE Access 2023, 11, 75451–75461.
- Qin, H.; Gong, C.; Li, Y.; Gao, X.; El-Yacoubi, M.A. Label enhancement-based multiscale transformer for palm-vein recognition. IEEE Trans. Instrum. Meas. 2023, 72, 2509217.
- Wang, S.; Qin, H.; Zhang, X.; Xiong, Z.; Wu, Y. VeinCnnformer: Convolutional neural network based transformer for vein recognition. In Proceedings of the Fourth International Conference on Computer Vision and Data Mining (ICCVDM 2023), Changchun, China, 20–22 October 2023; pp. 400–407.
- Kim, W.; Song, J.M.; Park, K.R. Multimodal biometric recognition based on convolutional neural network by the fusion of finger-vein and finger shape using near-infrared (NIR) camera sensor. Sensors 2018, 18, 2296.
- Alshardan, A.; Kumar, A.; Alghamdi, M.; Maashi, M.; Alahmari, S.; Alharbi, A.A.; Almukadi, W.; Alzahrani, Y. Multimodal biometric identification: Leveraging convolutional neural network (CNN) architectures and fusion techniques with fingerprint and finger vein data. PeerJ Comput. Sci. 2024, 10, e2440.
- El-Rahiem, B.A.; El-Samie, F.E.A.; Amin, M. Multimodal biometric authentication based on deep fusion of electrocardiogram (ECG) and finger vein. Multimedia Syst. 2022, 28, 1325–1337.
- Alay, N.; Al-Baity, H.H. Deep learning approach for multimodal biometric recognition system based on fusion of iris, face, and finger vein traits. Sensors 2020, 20, 5523.
- Tao, Z.; Zhou, X.; Xu, Z.; Lin, S.; Hu, Y.; Wei, T. Finger-Vein Recognition Using Bidirectional Feature Extraction and Transfer Learning. Math. Probl. Eng. 2021, 2021, 6664809.
- Huang, Z.; Guo, C. Robust finger vein recognition based on deep CNN with spatial attention and bias field correction. Int. J. Artif. Intell. Tools 2021, 30, 2140005.
- Babalola, F.O.; Bitirim, Y.; Toygar, Ö. Palm vein recognition through fusion of texture-based and CNN-based methods. Signal Image Video Process. 2021, 15, 459–466.
- Yang, H.; Fang, P.; Hao, Z. A gan-based method for generating finger vein dataset. In Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence, Sanya, China, 24–26 December 2020; pp. 1–6.
- Madhusudhan, M.; Udayarani, V.; Hegde, C. An intelligent deep learning LSTM-DM tool for finger vein recognition model using DSAE classifier. Int. J. Syst. Assur. Eng. Manag. 2024, 15, 532–540.
- Lu, S.; Fung, S.; Pan, W.; Wickramasinghe, N.; Lu, X. Veintr: Robust end-to-end full-hand vein identification with transformer. Vis. Comput. 2024, 40, 7015–7023.
- Li, X.; Feng, J.; Cai, J.; Lin, G. FV-MViT: Mobile Vision Transformer for Finger Vein Recognition. Sensors 2024, 24, 1331.
- Abdullahi, S.B.; Bature, Z.A.; Chophuk, P.; Muhammad, A. Sequence-wise multimodal biometric fingerprint and finger-vein recognition network (STMFPFV-Net). Intell. Syst. Appl. 2023, 19, 200256.
- Chen, Y.; Ji, D.; Ma, Q.; Zhai, C.; Ma, Y. A Novel Generative Adversarial Network for the Removal of Noise and Baseline Drift in Seismic Signals. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5904814.
- Khosravy, M.; Gupta, N.; Patel, N.; Senjyu, T.; Duque, C.A. Particle swarm optimization of morphological filters for electrocardiogram baseline drift estimation. In Applied Nature-Inspired Computing: Algorithms and Case Studies; Springer: Singapore, 2020; pp. 1–21.
- Zhang, X.; Jiang, S. Application of fourier transform and butterworth filter in signal denoising. In Proceedings of the 2021 6th International Conference on Intelligent Computing and Signal Processing (ICSP), Xi’an, China, 9–11 April 2021; pp. 1277–1281.
- Zhang, Z.; Wang, H.; Geng, J.; Deng, X.; Jiang, W. A New Data Augmentation Method Based on Mixup and Dempster-Shafer Theory. IEEE Trans. Multimedia 2023, 26, 4998–5013.
- Kong, F.; Zhang, R.; Guo, X.; Mensah, S.; Mao, Y. Dropmix: A textual data augmentation combining dropout with mixup. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 890–899.
- Li, Y.; Lu, H.; Wang, Y.; Gao, R.; Zhao, C. ViT-Cap: A novel vision transformer-based capsule network model for finger vein recognition. Appl. Sci. 2022, 12, 10364.
- Sağ, T.; Abdullah Jalil Jalil, Z. Vortex search optimization algorithm for training of feed-forward neural network. Int. J. Mach. Learn. Cybern. 2021, 12, 1517–1544.
- Basha, S.S.; Dubey, S.R.; Pulabaigari, V.; Mukherjee, S. Impact of fully connected layers on performance of convolutional neural networks for image classification. Neurocomputing 2020, 378, 112–119.
- Mao, A.; Mohri, M.; Zhong, Y. Cross-entropy loss functions: Theoretical analysis and applications. In Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; pp. 23803–23828.
- Chandriah, K.K.; Naraganahalli, R.V. RNN/LSTM with modified Adam optimizer in deep learning approach for automobile spare parts demand forecasting. Multimedia Tools Appl. 2021, 80, 26145–26159.

Models | GRU | CNN | LSTM | Transformer | Network Structure | Mixup | Precision (%) |
---|---|---|---|---|---|---|---|
GRU | √ | | | | / | / | 93.78 |
CNN [20] | | √ | | | / | / | 94.50 |
LSTM [21] | | | √ | | / | / | 94.39 |
Transformer [22] | | | | √ | / | / | 91.50 |
CNN + LSTM [25] | | √ | √ | | Cascade | / | 93.57 |
CFormer | | √ | | √ | Cascade | / | 95.50 |
CLT | | √ | √ | √ | Cascade | / | 94.26 |
MixCNN | | √ | | | / | √ | 97.53 |
MixLSTM | | | √ | | / | √ | 93.43 |
MixCLT | | √ | √ | √ | Cascade | √ | 93.73 |
Ours | | √ | | √ | Cascade | √ | 99.51 |

Models | Test Loss (%) | Test Accuracy (%) | Precision (%) | Recall (%) |
---|---|---|---|---|
GRU | 0.1950 | 93.75 | 93.78 | 93.75 |
CNN [20] | 0.1613 | 94.25 | 94.50 | 94.25 |
LSTM [21] | 0.1996 | 94.25 | 94.39 | 94.25 |
Transformer [22] | 0.2530 | 91.50 | 91.50 | 91.50 |
CNN + LSTM [25] | 0.2121 | 93.50 | 93.57 | 93.50 |
CFormer | 0.1706 | 95.50 | 95.50 | 95.50 |
CLT | 0.2017 | 94.25 | 94.26 | 94.25 |
MixCNN | 0.0949 | 97.50 | 97.53 | 97.50 |
MixLSTM | 0.1919 | 93.37 | 93.43 | 93.37 |
MixCLT | 0.1827 | 93.63 | 93.73 | 93.63 |
Ours | 0.0414 | 99.50 | 99.51 | 99.51 |
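
For readers who want to reproduce the accuracy, precision, and recall columns in the table above, the sketch below computes them from true and predicted labels. It assumes support-weighted averaging over classes, which is our assumption rather than something the source states; it would be consistent with recall tracking accuracy in most rows, since support-weighted recall equals accuracy by construction. The function name and the toy labels are illustrative.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall) as fractions."""
    n = len(y_true)
    support = Counter(y_true)        # samples per true class
    predicted = Counter(y_pred)      # samples per predicted class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    accuracy = sum(correct.values()) / n
    # Per-class scores are weighted by the class share in y_true.
    precision = sum(
        support[c] / n * (correct[c] / predicted[c] if predicted[c] else 0.0)
        for c in support
    )
    recall = sum(support[c] / n * (correct[c] / support[c]) for c in support)
    return accuracy, precision, recall


# Toy usage with made-up labels: six live samples, two attack samples.
y_true = ["live"] * 6 + ["attack"] * 2
y_pred = ["live"] * 5 + ["attack"] + ["attack"] * 2
acc, prec, rec = weighted_metrics(y_true, y_pred)
```

Multiplying the returned fractions by 100 gives values on the same scale as the table.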

Datasets | Gaussian Noise (σ = 0.01) | Gaussian Noise (σ = 0.1) | Poisson Noise (Slight) | Poisson Noise (Severe) | No Noise |
---|---|---|---|---|---|
Original dataset | 99.51 | 97.89 | 98.76 | 96.54 | 99.51 |
Noise-enhanced dataset | 99.06 | 98.15 | 99.05 | 97.10 | 99.51 |
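
One way to build a noise-enhanced training set of the kind compared in the table above is to keep every original sample and add noisy copies. The sketch below does this for the two Gaussian levels listed (σ = 0.01 and σ = 0.1). It assumes the sequences are scaled to roughly unit range and omits the Poisson case, whose "slight" and "severe" settings are not specified here. The function names and toy data are hypothetical.

```python
import random


def add_gaussian_noise(sequence, sigma, rng=random):
    """Return a copy of `sequence` with zero-mean Gaussian noise of std `sigma`."""
    return [x + rng.gauss(0.0, sigma) for x in sequence]


def noise_augment(dataset, sigmas=(0.01, 0.1), rng=random):
    """Keep each (sequence, label) pair and append one noisy copy per sigma."""
    augmented = list(dataset)
    for sequence, label in dataset:
        for sigma in sigmas:
            augmented.append((add_gaussian_noise(sequence, sigma, rng), label))
    return augmented


# Toy usage with made-up sequences and labels.
rng = random.Random(0)
train = [([0.1, 0.4, 0.3], "live"), ([0.3, 0.3, 0.3], "attack")]
train_aug = noise_augment(train, rng=rng)
```

Labels are left unchanged because additive noise of this size is assumed not to alter whether a sample is live or forged.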

Datasets | Normal Data | High Noise Conditions | No Noise Conditions |
---|---|---|---|
Original dataset | 2.34 | 4.56 | 2.33 |
Noise-enhanced dataset | 2.30 | 2.98 | 2.33 |

Models | Baseline Drift | Morphological Filtering | Butterworth Filtering | Mixup | Precision (%) |
---|---|---|---|---|---|
CFormer | / | √ | √ | / | 93.24 |
CFormer | √ | / | √ | / | 94.62 |
CFormer | √ | √ | / | / | 92.81 |
CFormer | √ | √ | √ | / | 95.50 |
MixCFormer | / | √ | √ | √ | 97.41 |
MixCFormer | √ | / | √ | √ | 97.77 |
MixCFormer | √ | √ | / | √ | 96.08 |
MixCFormer | √ | √ | √ | √ | 99.51 |
Ours | √ | √ | √ | √ | 99.51 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, Z.; Yang, S.; Qin, H.; Liu, Y.; Wang, J. MixCFormer: A CNN–Transformer Hybrid with Mixup Augmentation for Enhanced Finger Vein Attack Detection. Electronics 2025, 14, 362. https://doi.org/10.3390/electronics14020362