Article

Voter Authentication Using Enhanced ResNet50 for Facial Recognition

by Aminou Halidou 1,2,*,†, Daniel Georges Olle Olle 1,3, Arnaud Nguembang Fadja 4, Daramy Vandi Von Kallon 2 and Tchana Ngninkeu Gil Thibault 1,†

1 Department of Computer Sciences, University of Yaoundé 1, Yaounde 337, Cameroon
2 Department of Mechanical and Industrial Engineering Technology (DMIET), University of Johannesburg, Johannesburg 524, South Africa
3 Department of Computer Engineering, Higher Technical Teacher’s Training College (ENSET), University of Ebolowa, Ebolowa 886, Cameroon
4 Department of Engineering, University of Ferrara, 44121 Ferrara, Italy
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Signals 2025, 6(2), 25; https://doi.org/10.3390/signals6020025
Submission received: 14 December 2024 / Revised: 2 April 2025 / Accepted: 30 April 2025 / Published: 23 May 2025

Abstract: Electoral fraud, particularly multiple voting, undermines the integrity of democratic processes. To address this challenge, this study introduces an innovative facial recognition system that integrates an enhanced 50-layer Residual Network (ResNet50) architecture with Additive Angular Margin Loss (ArcFace) and Multi-Task Cascaded Convolutional Neural Networks (MTCNN) for face detection. Using the Mahalanobis distance, the system verifies voter identities by comparing captured facial images with previously recorded biometric features. Extensive evaluations demonstrate the methodology’s effectiveness, achieving an average facial recognition validation accuracy of 99.56%. This significant improvement over existing baseline methods has the potential to enhance electoral transparency and prevent multiple voting. The findings contribute to developing robust biometric-based electoral systems, thereby promoting democratic trust and accountability.

1. Introduction

The increasing adoption of biometric technologies worldwide has transformed various applications, including voting systems, by enhancing security and efficiency. Biometric modalities such as iris recognition, fingerprint analysis, voice recognition, and facial recognition offer significant advantages for authentication and identification. International organizations such as the Friedrich Ebert Stiftung emphasize the crucial role of free and fair electoral participation in promoting democratic governance [1]. In response to growing concerns over electoral integrity, several nations have implemented biometric-based voter authentication systems. For example, some electoral authorities have integrated biometric fingerprint systems since the early 2010s to improve the verification process [2]. However, the existing paper-based identification system used in polling stations has faced significant scrutiny. Post-election allegations of electoral fraud, including ballot stuffing, falsification of results, and multiple voting, have become commonplace [3,4,5]. The phenomenon of multiple voting is particularly problematic, as it represents a significant loophole in the current system. It erodes public trust in the electoral process and threatens the national stability that is essential for development [3,4,5]. To address these challenges, this study proposes a novel voter authentication system based on facial recognition.

1.1. Proposed System

The system we propose uses facial recognition technology to guarantee the integrity of the electoral process. Facial recognition is a technology based on facial features. It is used either to verify a person’s identity, that is, to ensure that the person is who they claim to be (for access control purposes), or to identify a person within a group of people, in a place, or in an image [6].
The system will be installed in polling stations, with cameras placed at the entrance and inside the polling station. The cameras at the entrance will capture images of voters as they arrive. The captured images will be RGB, 150 × 150 pixels in size. The RGB format was chosen because it is widely used in facial recognition systems, owing to its ability to represent color information in a way that is compatible with human vision. This representation provides more detail on facial features, which improves differentiation between individuals [7]. The use of this type of image also facilitates the extraction of more detailed features, which can improve the performance of recognition systems, particularly when distinguishing individuals who appear similar in grayscale [8].
The MTCNN algorithm will be used for efficient face detection. First, it is renowned for its ability to accurately detect faces, even in multi-angle and multi-scale scenarios [9]. Second, it is robust to variations in pose, lighting, and occlusion [10]. Third, it improves the accuracy of face detection by simultaneously locating facial regions, identifying facial landmarks, and aligning faces. Fourth, it optimizes computational resources by progressively filtering out non-face regions, which improves speed [11].
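As a concrete illustration, the following minimal sketch runs MTCNN detection with the widely used `mtcnn` Python package and OpenCV; the file name, confidence threshold, and resizing step are illustrative assumptions, not values prescribed by the system.

```python
# Minimal MTCNN detection sketch (assumptions: `mtcnn` pip package,
# "voter.jpg" as a placeholder file, 0.95 as an illustrative threshold).
import cv2
from mtcnn import MTCNN

detector = MTCNN()

bgr = cv2.imread("voter.jpg")                 # OpenCV loads images as BGR
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)    # MTCNN expects RGB

for face in detector.detect_faces(rgb):
    if face["confidence"] < 0.95:             # drop weak detections
        continue
    x, y, w, h = face["box"]                  # face bounding box
    landmarks = face["keypoints"]             # eyes, nose, mouth corners
    # Resize to the 47 x 55 px crop used later in preprocessing (Section 3.2).
    crop = cv2.resize(rgb[y:y + h, x:x + w], (47, 55))
```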
In addition, the ResNet50 architecture [12] (see Figure 1), combined with ArcFace loss, will further improve the accuracy of facial recognition. It offers several advantages. First, it allows gradients to flow more efficiently through the network, addressing the vanishing gradient problem that is common in deep architectures [12]. Second, it facilitates the training of deeper networks, improving performance on image classification tasks [13]. Third, thanks to its robustness, it serves as an effective basis for transfer learning in various computer vision tasks [14].

1.2. Expected Outcomes

Our approach has demonstrated a facial recognition accuracy of 99.56%, representing a substantial improvement over existing methods (Table 1). This significant improvement in accuracy can be attributed to the combination of the ResNet50 architecture, ArcFace loss, and MTCNN algorithm. The ResNet50 architecture’s ability to learn deeper networks and facilitate efficient gradient circulation contributes to its robustness. ArcFace loss further enhances facial recognition accuracy by optimizing the angular margin between different classes. The MTCNN algorithm’s efficiency in face detection and robustness to variations in pose, lighting, and occlusion also play a crucial role in achieving high accuracy.
The effectiveness of the proposed system in preventing electoral fraud, particularly multiple voting, has the potential to enhance electoral transparency and promote democratic trust and accountability. By verifying voter identities through facial recognition, the system ensures that only eligible voters can cast their ballots. This not only prevents fraudulent activities, but also boosts public confidence in the electoral process.
The implementation of this system can be tailored to various electoral contexts, making it a versatile solution to promote electoral integrity throughout the world. However, it is essential to ensure that voter registration is performed properly to guarantee the system’s effectiveness. By addressing the concerns surrounding electoral fraud, this system can contribute to the development of robust biometric-based electoral systems, ultimately promoting democratic governance and stability.

1.3. Paper Structure

This paper is meticulously structured into five sections, each contributing to a coherent narrative of our research on voter authentication through enhanced facial recognition. As an introduction, this initial section articulates the critical problem of electoral fraud, elucidating our study’s motivation and specific objectives. We emphasize the potential of biometric technologies, particularly facial recognition, to mitigate multiple voting and reinforce electoral integrity. A concise overview of our proposed system is provided, highlighting components and their synergistic integration for high accuracy and robustness. This introduction lays the groundwork for a detailed exploration of the existing literature, our innovative methodology, comprehensive experimental results, and insightful concluding remarks.
Section 2 presents a rigorous review of the existing literature on facial recognition and detection methods. Going beyond a mere list of previous work, this section critically analyzes the strengths and limitations inherent in various approaches, from traditional feature-based techniques to State-of-the-Art deep learning architectures. We examine the nuances of algorithms such as Haar cascades, HOG, and LBP, alongside contemporary convolutional neural networks (CNNs) like VGGNet, Inception, and MobileNet. Furthermore, we evaluate the characteristics and suitability of commonly used facial recognition datasets, including LFW, PubFig, and MegaFace. A key focus is placed on the performance metrics used to assess the accuracy and efficiency of facial recognition systems, such as precision, recall, F1 score, and ROC curves. This comprehensive review establishes a solid foundation for understanding the novelty and significance of our proposed approach.
Section 3 meticulously details our proposed methodology, providing a step-by-step explanation of the development and implementation of our ResNet50-based facial recognition system. We begin by describing the facial image collection process and specifying the hardware and software utilized to capture voter images at polling stations. We then elaborate on the data pre-processing and augmentation techniques employed to enhance the robustness of our model, addressing challenges posed by variations in lighting, pose, and expression. A critical aspect of this section is a detailed explanation of the feature extraction process, highlighting the integration of the MTCNN algorithm for accurate face detection and the ResNet50 architecture for robust feature embedding. We thoroughly discuss the ArcFace loss function, elucidating its role in optimizing the angular margin between different classes and improving the discriminative power of the model. Finally, we describe the creation of the secure facial image database for voter registration and the feature comparison method using Mahalanobis distance for reliable identity verification.
Section 4 presents a comprehensive evaluation of the performance of our system, analyzing the results obtained through rigorous experimentation. We present key metrics such as accuracy, precision, recall, F1-score, and area under the ROC curve (AUC), demonstrating the effectiveness of our approach in accurately identifying voters. These metrics are evaluated using a diverse set of facial images, carefully selected to reflect the real-world conditions encountered at polling stations. Furthermore, we conduct ablation studies to assess the individual contributions of the MTCNN algorithm, ResNet50 architecture, and ArcFace loss function to the overall system performance. We compare our results with those of existing baseline methods, demonstrating the superiority of our proposed approach in terms of accuracy, robustness, and efficiency.
Finally, Section 5 concludes the paper by summarizing our key findings, highlighting the contributions of our work, discussing the limitations of our study, and outlining potential directions for future research in the field of biometric voter authentication systems. We emphasize the potential of our system to improve electoral transparency, prevent multiple voting, and promote democratic trust and accountability.

2. Related Works

Many studies have investigated anti-cheating mechanisms in online voting systems to combat voter fraud. For example, Kumar et al. proposed a system that combines face detection and OTP-based authentication to prevent cheating. Similarly, Choudhary et al. presented a smart voting system that uses facial recognition technology to ensure secure and accurate voting processes [20]. Yang et al. explored the use of facial recognition and machine learning algorithms to detect and prevent cheating in online voting systems [21].

Facial recognition techniques have also been studied in depth. Traditional methods rely on manual extraction of facial features using techniques such as Local Binary Patterns (LBP) and Histogram of Oriented Gradients (HOG) for classification [22]. However, these approaches are difficult to apply to large datasets. Artificial neural networks have more recently been used to automatically extract features from facial images using substantial amounts of data. For example, Zhang et al. used multi-task cascaded convolutional networks (MTCNNs) to extract features from facial images [9]. Deep convolutional neural networks (CNNs) have achieved State-of-the-Art performance in a variety of image recognition tasks. ResNet50, a variant of the ResNet (Residual Network) family, has been widely used for facial recognition. Yuxiang et al. employed a deep learning approach using transfer learning with ResNet50, fine-tuning, data augmentation, and a softmax classifier for face and gender recognition, achieving a classification accuracy of 99.33% for face recognition and 93% for gender classification [19]. Omkar et al. used an 8-layer AlexNet to manually filter a dataset of 2.6 million images, ultimately producing 982,803 usable images [17]. Jing et al. improved the performance of ResNet50 for facial recognition by incorporating ArcFace loss, achieving an impressive 99.53% accuracy on the LFW dataset [23]. Mondal et al. used a 22-layer GoogLeNet, incorporating a triplet loss function to extract facial image features [18]. However, their proposed vote security system has limitations in handling cases of identical voters or false positives.

To address this, we propose a new voter authentication system that leverages the CutMix algorithm for data augmentation, convolutional neural networks (CNNs), and ArcFace loss to minimize false positives in facial recognition. Our approach aims to improve the efficiency of the system in solving the problems associated with multiple voting. To improve the model, we incorporated an existing dataset (LFW) alongside our own images and evaluated the approach through an election simulation, which yielded better accuracy. To validate our system, we performed an election simulation with a limited number of images captured by our voting system over four months (July–November 2021). This dataset was used at three polling stations (P1, P2, and P3).

3. Facial Recognition-Based Voting System

As illustrated in Figure 2, this section outlines the methodology used in the development of a facial recognition-based voting system. The proposed system comprises two main phases: enrollment and authentication.
During the enrollment phase, the system captures a facial image of each voter and stores its features in the facial database using a deep learning approach. During authentication, a newly captured image is compared with the stored features. To assess similarity, we utilize the Mahalanobis distance [22], which computes the similarity between the captured image and the stored features, allowing the system to assign the most relevant label to each image. If the calculated similarity value falls below a predetermined threshold, the system returns an error indicating that the individual cannot be identified. This mechanism reflects the core principle of facial recognition: if two images show the same individual, their features will be exceedingly similar [18].
The facial recognition process typically consists of four key modules: facial image acquisition, data pre-processing and augmentation, feature extraction using ResNet50, and feature comparison. In addition, a facial database is employed to store feature vectors and facilitate the comparison of facial features.

3.1. Facial Image Collection

First, the OpenCV 4.10.0 library, running under Python 3.11, is used to collect the 150 × 150 px facial images [24]. Images from the LFW dataset, images of people taken from the Internet, and images captured with OpenCV are used to verify the reliability of our system.

3.2. Preprocessing and Data Augmentation

Preprocessing uses the MTCNN method, a cascaded convolutional neural network, to detect faces in the image. MTCNN generates face proposals, refines them by removing false positives, and estimates facial landmarks. This robust approach enables accurate face detection even in challenging conditions. The detected faces are then aligned using a landmark estimation algorithm, normalizing the images for the facial recognition model. Finally, these images are cropped to 47 × 55 px using OpenCV. Data augmentation is performed only during the first enrollment phase [24]. For data augmentation, we propose to use CutMix. This method combines two training images by cutting a rectangular region from one image and pasting it onto another: it keeps 85% of an image from one class and randomly selects a 15% region of an image belonging to another class to perform the merge. This process is repeated ten times for each image. The results are then compared with those obtained using rotation-based data augmentation, which iterates over the dataset, copies each image, applies the rotation function ten times while saving after each step, and repeats until every image in the dataset has been processed. For our case, we augment the LFW dataset horizontally, expanding it from 13,233 images to 173,820 images.
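For clarity, the following is a minimal sketch of the CutMix step as just described (keep roughly 85% of a base image and paste in a rectangular region covering roughly 15% of the area from an image of another class); the function name, region placement, and aspect-ratio choice are illustrative assumptions rather than the authors' exact implementation.

```python
# CutMix sketch: paste a random rectangle (~15% of the area) taken from a
# donor image of another class onto a base image of the same shape.
import numpy as np

def cutmix(base: np.ndarray, donor: np.ndarray, donor_frac: float = 0.15,
           rng: np.random.Generator | None = None) -> np.ndarray:
    rng = rng or np.random.default_rng()
    h, w = base.shape[:2]
    cut_h = int(h * np.sqrt(donor_frac))   # rectangle with ~donor_frac of the
    cut_w = int(w * np.sqrt(donor_frac))   # area, same aspect ratio as image
    y = int(rng.integers(0, h - cut_h + 1))
    x = int(rng.integers(0, w - cut_w + 1))
    mixed = base.copy()
    mixed[y:y + cut_h, x:x + cut_w] = donor[y:y + cut_h, x:x + cut_w]
    return mixed

# As described above, each image would be mixed ten times with donors drawn
# from other classes (pick_donor_from_other_class is a hypothetical helper):
# augmented = [cutmix(img, pick_donor_from_other_class()) for _ in range(10)]
```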

3.3. Feature Extraction

The feature extraction process is a critical component of the proposed facial recognition system. It involves selecting an architecture that can efficiently extract relevant features from facial images. To achieve this, we chose ResNet50, which facilitates the flow of information between layers through shortcut connections, mitigating vanishing gradients as depth increases. Furthermore, the residual structure of ResNet50 enables simpler network optimization, even for very deep networks. Additionally, ResNet50’s suitability for advanced facial recognition algorithms makes it an ideal choice for this application [25]. The Deep Residual Network, presented by a Microsoft team at the ILSVRC competition in 2015, is similar to other well-known networks, such as VGG, GoogLeNet, and AlexNet, which stack convolution, pooling, activation, and fully connected layers on top of each other [12]. Our novel approach replaces the GAP layer with a flatten layer followed by dense layers: the flatten layer receives the output of ResNet50, and the subsequent dense layers (1024 and 512 units) further reduce the tensor dimensions. All layers utilize glorot_uniform kernel initializers in Keras with ReLU activation, and the final dense layer employs ArcFace activation, enhancing discriminative power by introducing a geometrically interpretable margin in the angular space [23].
The notation is as follows:
  • X: the initial image input;
  • F(X): the transformation performed by ResNet50;
  • Y1: the output of the flatten layer;
  • Y2: the output of the dense layer (1024 units) with ReLU activation;
  • Y3: the output of the dense layer (512 units) with ReLU activation;
  • Y4: the final output of the dense layer (5745 units) with ArcFace activation.
The mathematical formula for the new model can be expressed as follows:

$$Y_1 = \mathrm{flatten}(F(X))$$
$$Y_2 = \mathrm{ReLU}(Y_1 W_2 + b_2)$$
$$Y_3 = \mathrm{ReLU}(Y_2 W_3 + b_3)$$
$$Y_4 = \mathrm{ArcFace}(Y_3 W_4 + b_4)$$

where flatten transforms the spatial dimensions of its input into a one-dimensional vector, ReLU is the rectified linear activation function, ArcFace is the additive angular margin softmax function [14], $W_2, W_3, W_4$ are the weight matrices of the dense layers, and $b_2, b_3, b_4$ are the corresponding bias vectors. The output $Y_1$ is obtained by applying the flatten layer to the output of ResNet50; the outputs $Y_2$, $Y_3$, and $Y_4$ are then computed by the dense layers with their corresponding weights and biases, applying ReLU to $Y_2$ and $Y_3$ and ArcFace to $Y_4$.
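A minimal Keras sketch of this head is shown below. It is a re-implementation under stated assumptions rather than the authors' code: the ArcFace layer follows the standard additive-angular-margin formulation [14], which normalizes features and weights and omits the bias term written in the formula above, and the scale s = 30 and margin m = 0.5 are common defaults, not values reported in the paper.

```python
import tensorflow as tf

class ArcFace(tf.keras.layers.Layer):
    """Simplified ArcFace head: logits = s*cos(theta + m) for the true class,
    s*cos(theta) otherwise, followed by softmax."""
    def __init__(self, n_classes, s=30.0, m=0.50, **kwargs):
        super().__init__(**kwargs)
        self.n_classes, self.s, self.m = n_classes, s, m

    def build(self, input_shape):
        emb_dim = input_shape[0][-1]              # embedding size (512 here)
        self.w = self.add_weight(name="w", shape=(emb_dim, self.n_classes),
                                 initializer="glorot_uniform")

    def call(self, inputs):
        x, y = inputs                              # embeddings, one-hot labels
        x = tf.nn.l2_normalize(x, axis=1)          # ArcFace normalizes features
        w = tf.nn.l2_normalize(self.w, axis=0)     # ...and class weights (no bias)
        cos = tf.matmul(x, w)
        theta = tf.acos(tf.clip_by_value(cos, -1.0 + 1e-7, 1.0 - 1e-7))
        marged = tf.cos(theta + self.m)            # margin on the true class only
        logits = self.s * tf.where(tf.cast(y, tf.bool), marged, cos)
        return tf.nn.softmax(logits)

base = tf.keras.applications.ResNet50(include_top=False, weights="imagenet",
                                      input_shape=(55, 47, 3))  # 47 x 55 crops
images = tf.keras.Input((55, 47, 3))
labels = tf.keras.Input((5745,))                   # one-hot identity labels
x = tf.keras.layers.Flatten()(base(images))        # flatten replaces GAP
x = tf.keras.layers.Dense(1024, activation="relu",
                          kernel_initializer="glorot_uniform")(x)
x = tf.keras.layers.Dense(512, activation="relu",
                          kernel_initializer="glorot_uniform")(x)
outputs = ArcFace(5745)([x, labels])
model = tf.keras.Model([images, labels], outputs)
```

Because the margin needs the ground-truth class during training, the model takes the one-hot labels as a second input; at inference time the plain cosine logits can be used instead.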

3.4. Facial Database

The RGB face images, voter information, and feature templates are stored in a relational database on a remote server.

3.5. Comparison of Characteristics and Decision

Features are compared by searching the database to identify a match, and then making a decision based on a similarity threshold.
The search uses the 128-dimensional representation of each person’s face obtained during feature extraction, with the Mahalanobis distance serving directly as a measure of facial similarity. These values are then compared to make a decision.
The decision is whether or not a match is found. When the similarity measure is above the threshold, a match is declared and the input image is labeled with the enrolled name; the system then retrieves the voter’s identity and, from a Boolean variable, determines whether he or she has already voted. Otherwise, when the measure is below the threshold, the input image is labeled unknown, and the person will not be able to vote.
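The following sketch illustrates this matching step, assuming 128-dimensional embeddings for each enrolled voter stored in a `gallery` mapping; the function name and threshold value are illustrative, as the paper does not publish them.

```python
# Matching sketch: Mahalanobis distance between a probe embedding and each
# enrolled embedding. Smaller distance means more similar, so rejection
# happens when the best distance exceeds the (illustrative) threshold.
import numpy as np
from scipy.spatial.distance import mahalanobis

def identify(probe: np.ndarray, gallery: dict[str, np.ndarray],
             threshold: float = 12.0) -> str:
    emb = np.stack(list(gallery.values()))            # (n_voters, 128)
    vi = np.linalg.pinv(np.cov(emb, rowvar=False))    # inverse covariance
    best_name, best_dist = "unknown", np.inf
    for name, ref in gallery.items():
        d = mahalanobis(probe, ref, vi)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= threshold else "unknown"

# A dict such as has_voted: dict[str, bool] would then flag enrolled voters
# who have already cast a ballot.
```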

4. Results and Interpretation

The facial recognition-based voting system was evaluated on a personal dataset and on the LFW dataset.

4.1. Evaluation Metrics

The classifier is evaluated in terms of classification rate (accuracy), F-measure, recall, and precision.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
where TP, TN, FP, and FN represent, respectively (a short computation sketch follows the list):
  • True positives (TP): enrolled individuals are recognized and can vote;
  • True negatives (TN): non-enrolled individuals are not recognized and cannot vote;
  • False positives (FP): non-enrolled individuals are recognized and allowed to vote;
  • False negatives (FN): enrolled individuals are not recognized and cannot vote.
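As referenced above, a minimal sketch computing these metrics from raw counts; the F-measure shown is the standard F1 definition, which the paper names but does not spell out.

```python
# Evaluation-metric sketch from confusion-matrix counts.
def metrics(tp: int, tn: int, fp: int, fn: int) -> dict[str, float]:
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)        # fraction of enrolled voters recognized
    precision = tp / (tp + fp)     # fraction of recognized voters truly enrolled
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}

# Example with hypothetical counts: metrics(tp=95, tn=90, fp=5, fn=10)
```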

4.2. Datasets

The LFW dataset contains 5749 identities comprising 13,233 RGB images of size 250 × 250 px. The images are cropped to 47 × 55 px and then expanded by the data augmentation algorithm to 173,820 images, as mentioned above. This yields 139,069 images for training, 34,751 for validation, and 34,751 for testing. The dataset is accessible at [26].
The dataset for our image approach contains 1170 identities comprising 13,435 RGB images. It is likewise cropped to 47 × 55 px, yielding 10,756 images for training, 2679 for validation, and 2679 for testing. A further dataset contains 20 identities comprising 1140 RGB images of size 255 × 255 px. This dataset is cropped to 47 × 55 px and then expanded by the CutMix data augmentation algorithm to 5864 images, as mentioned above, yielding 4105 images for training, 1172 for validation, and 334 for testing.

4.3. Facial Recognition Performance

From the results obtained, the ResNet50-based model shows a strong ability to extract features. The high classification rates on the LFW dataset (99.60%) and on our image dataset (98.73%) indicate good facial recognition performance. Increasing the number of epochs improved some metrics, such as recall, and reduced the validation loss; however, from the 20th epoch onward, performance saturated, with a slight decrease in the precision rate. The use of detection, cropping, and data augmentation on LFW appears to be a relevant asset, giving an average validation performance of 99.56% (Table 1). The best validation precision (100%) was observed with our image dataset, although it decreased slightly to 99.81% after 20 epochs. The slight decrease in validation precision during longer training highlights the need to balance model generalization against the number of epochs to avoid overfitting. For the voting simulation, we present a portion of the observed results in Table 2 and Table 3. To preserve the identity of voters, we have changed some of the information in these tables. In this simulation, ten people participated, four were absent, and two were unable to vote because the system indicated that they had already voted.
Figure 3 illustrates three voting scenarios. In the first scenario (top row), three voters approach the entrance to the polling station. After image capture, the system processes the data for each voter. The first voter is granted access, which allows him to exercise his civic right.
In the second scenario, the system denies access to the second voter. This is because the voter has already cast a ballot at another polling station; the system prevents duplicate voting, thereby ensuring electoral integrity. Consequently, the voter cannot vote again.
Finally, the third scenario depicts a voter who is also denied access. Because this individual is not registered within the system, he cannot be recognized and is therefore ineligible to vote. This restriction ensures that only registered voters participate, maintaining the legitimacy of the electoral process.
We compared previously reported results with those obtained by our facial recognition voting system. As shown in Table 1, the final results are compared on the same dataset. There is a clear, noticeable improvement in the classification rate in our work.

4.4. Voting System Performance

Once the votes have been cast, information about the voters who participated is exported to an Excel file. This file certifies that each voter has voted only once. If voters wish to check their results, the manager of the polling station can provide them with this information, depending on the policy adopted by the responsible organization. In the simulation we performed, Table 2 and Table 3 show the voting results. As noted earlier, voter information has been modified to preserve identities. The voter number is generated automatically when a voter registers. We used three polling stations (P1, P2, and P3). Ten people participated, four were absent, and two were unable to vote because the system indicated that they had already voted. After voting, each voter received a confirmation text message.

4.5. Interpretation

Two datasets, LFW and personal, were introduced separately into the ResNet50 architecture. Each dataset was split 80% for training and 20% for validation, with the validation set also serving as the test set. The longest training time was around 10 hours and 10 minutes for the LFW dataset, while our image approach dataset took around 2 hours. The Adam optimization algorithm with a learning rate of lr = 0.00001 was used to improve the performance of the model and minimize the loss. On the LFW dataset (Table 4), after 10 epochs we obtained a validation classification rate of 99.51%, a validation precision of 99.92%, a validation recall of 99.27%, and a validation loss of 0.0296. After 20 epochs, some measures improved slightly: a validation classification rate of 99.60%, a validation recall of 99.50%, and a validation loss of 0.0210. However, the validation precision decreased from 99.92% to 99.84%. These results show that increasing the number of epochs improves overall performance and reduces the loss, but saturation occurs at a validation classification rate of 99.60%, and the validation precision does not increase with additional epochs but instead decreases slightly.
Furthermore, for our image approach dataset (Table 4), after 10 epochs we obtained a validation classification rate of 98.51%, a validation precision of 100%, a validation recall of 96.45%, and a validation loss of 0.1469. After 20 epochs, some measures improved slightly: a validation classification rate of 98.73%, a validation recall of 98.17%, and a validation loss of 0.0755. However, the validation precision decreased from 100% to 99.81%. These results likewise show that increasing the number of epochs improves overall performance and reduces the loss.
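For completeness, a sketch of this training configuration in Keras is shown below, continuing the model sketch from Section 3.3; the dataset pipelines `train_ds` and `val_ds` are assumed placeholders, and the loss and metric choices are plausible defaults rather than the authors' exact setup.

```python
# Training sketch (assumes `model` from the Section 3.3 sketch and tf.data
# pipelines yielding ((image, one_hot_label), one_hot_label), since the
# ArcFace head consumes the labels as a second input during training).
import tensorflow as tf

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),  # lr from the paper
    loss="categorical_crossentropy",
    metrics=["accuracy",
             tf.keras.metrics.Precision(name="precision"),
             tf.keras.metrics.Recall(name="recall")])

history = model.fit(train_ds, validation_data=val_ds, epochs=20)
```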

5. Conclusions

This study presents a facial recognition-based voter authentication system that effectively integrates modern biometric technology into the electoral process, addressing significant concerns about multiple voting and electoral fraud. By employing an enhanced ResNet50 neural network architecture combined with the Adam optimization algorithm and utilizing a flatten layer alongside a sequence of dense layers, we have improved classification accuracy compared to traditional methods that rely on global average pooling (GAP).
Key innovations in our system include the adoption of ArcFace loss, which minimizes false positives, and the implementation of MTCNN for precise facial detection. Furthermore, we used the CutMix data augmentation technique to increase the diversity of the training dataset, contributing to the overall enhancement of facial recognition capabilities. Our rigorous testing on the LFW dataset yielded an impressive average validation accuracy of 99.56%, demonstrating the efficacy of the proposed solution.
Looking ahead, there are opportunities to further refine this system. Future work could explore continual learning approaches to enable rapid adaptation to new datasets or changing conditions without extensive retraining. Furthermore, investigating adaptive lighting techniques, such as contrast-limited Adaptive Histogram Equalization (CLAHE), could enhance the system’s robustness against varying lighting scenarios, thereby improving recognition accuracy in real-world environments. In general, our proposed system not only improves the integrity of the electoral process but also provides a foundation for the future application of biometric technologies in voting systems around the world.

Author Contributions

Conceptualization: A.H. and T.N.G.T.; Methodology: A.H., D.G.O.O. and A.N.F.; Formal analysis: A.H., D.G.O.O. and D.V.V.K.; Investigation: A.H., D.G.O.O., A.N.F. and D.V.V.K.; Data Curation: A.H., T.N.G.T., D.V.V.K. and A.N.F.; Writing—original draft preparation: A.H. and T.N.G.T.; Writing—Review and Editing: A.H., T.N.G.T., D.G.O.O., A.N.F. and D.V.V.K.; Supervision: A.H. and T.N.G.T. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by University of Johannesburg.

Data Availability Statement

Supporting data are available upon request. Please contact Tchana Ngninkeu Gil Thibault (gilthibault5@gmail.com) for access.

Conflicts of Interest

The authors declare that they have no competing interests.

References

  1. Friedrich-Ebert-Stiftung. How to Prevent and Combat Electoral Fraud in Cameroon. 2012. Available online: https://library.fes.de/pdf-files/bueros/kamerun/09614.pdf (accessed on 1 January 2025).
  2. Jain, A.; Flynn, P.; Ross, A. Handbook of Biometrics; Springer: New York, NY, USA, 2007. [Google Scholar]
  3. Demers-Labrousse, N.; Vandal, G.; Aoun, S. La Démocratie en Afrïque Subsaharienne: Le cas du Cameroun; Université de Sherbrooke: Sherbrooke, QC, Canada, 2012. [Google Scholar]
  4. Kindzeka, M.E. Claiming Massive Fraud, Cameroon Opposition Challenges Ruling Party Landslide Victory. 2020. Available online: https://www.voanews.com/a/africa_claiming-massive-fraud-cameroon-opposition-challenges-ruling-party-landslide-victory/6184185.html (accessed on 1 January 2025).
  5. Nsongan, P.M. Cameroon Bishops Warn Against Election Fraud. 2018. Available online: https://www.africanews.com/2018/10/11/cameroon-bishops-warn-against-election-fraud/ (accessed on 1 January 2025).
  6. Jain, A.K.; Li, S.Z. Handbook of Face Recognition; Springer: New York, NY, USA, 2011; Volume 1, p. 699. [Google Scholar]
  7. Kortli, Y.; Jridi, M.; Al Falou, A.; Atri, M. Face recognition systems: A survey. Sensors 2020, 20, 342. [Google Scholar] [CrossRef] [PubMed]
  8. Bhatta, A.; Mery, D.; Wu, H.; Annan, J.; King, M.C.; Bowyer, K.W. What’s color got to do with it? Face recognition in grayscale. IEEE Trans. Biom. Behav. Identity Sci. 2025. Available online: https://ieeexplore.ieee.org/abstract/document/10887263 (accessed on 1 January 2025). [CrossRef]
  9. Zhang, K.; Zhang, Z.; Li, Z.; Qiao, Y. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process. Lett. 2016, 23, 1499–1503. [Google Scholar] [CrossRef]
  10. Zeng, D.; Veldhuis, R.; Spreeuwers, L. A survey of face recognition techniques under occlusion. IET Biom. 2021, 10, 581–606. [Google Scholar] [CrossRef]
  11. He, Y.; Xu, D.; Wu, L.; Jian, M.; Xiang, S.; Pan, C. Lffd: A light and fast face detector for edge devices. arXiv 2019, arXiv:1904.10633. [Google Scholar]
  12. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  13. He, K.; Zhang, X.; Ren, S.; Sun, J. Identity mappings in deep residual networks. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part IV 14. Springer: Berlin/Heidelberg, Germany, 2016; pp. 630–645. [Google Scholar]
  14. Deng, J.; Guo, J.; Xue, N.; Zafeiriou, S. Arcface: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4690–4699. [Google Scholar]
  15. Gurnani, A.; Shah, K.; Gajjar, V.; Mavani, V.; Khandhediya, Y. Saf-bage: Salient approach for facial soft-biometric classification-age, gender, and facial expression. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, 7–11 January 2019; pp. 839–847. [Google Scholar]
  16. Schroff, F.; Kalenichenko, D.; Philbin, J. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 815–823. [Google Scholar]
  17. Parkhi, O.M.; Vedaldi, A.; Zisserman, A. Deep face recognition. In Proceedings of the British Machine Vision Conference (BMVC) 2015, Swansea, UK, 7–10 September 2015; pp. 1–12. [Google Scholar] [CrossRef]
  18. Mondal, I.; Chatterjee, S. Secure and hassle-free EVM through deep learning based face recognition. In Proceedings of the 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, 14–16 February 2019; pp. 109–113. [Google Scholar]
  19. Zhou, Y.; Ni, H.; Ren, F.; Kang, X. Face and gender recognition system based on convolutional neural networks. In Proceedings of the 2019 IEEE International Conference on Mechatronics and Automation (ICMA), Tianjin, China, 4–7 August 2019; pp. 1091–1095. [Google Scholar]
  20. Choudhary, N.; Agarwal, S.; Lavania, G. Smart voting system through facial recognition. Int. J. Sci. Res. Comput. Sci. Eng. 2019, 7, 7–10. [Google Scholar] [CrossRef]
  21. Yang, D.; Alsadoon, A.; Prasad, P.W.C.; Singh, A.K.; Elchouemi, A. An emotion recognition model based on facial recognition in virtual learning environment. Procedia Comput. Sci. 2018, 125, 2–10. [Google Scholar] [CrossRef]
  22. Wolf, L.; Hassner, T.; Maoz, I. Face recognition in unconstrained videos with matched background similarity. In Proceedings of the CVPR 2011, Providence, RI, USA, 20–25 June 2011; pp. 529–534. [Google Scholar]
  23. Jing, H.; Lin, G.; Zhang, H.; Chen, T. A face recognition algorithm based on improved resnet. Front. Comput. Intell. Syst. 2022, 1, 22–25. [Google Scholar] [CrossRef]
  24. Kaehler, A.; Bradski, G. Learning OpenCV 3: Computer Vision in C++ with the OpenCV Library; O’Reilly Media: Sebastopol, CA, USA, 2016. [Google Scholar]
  25. Zaeemzadeh, A.; Rahnavard, N.; Shah, M. Norm-preservation: Why residual networks can become extremely deep? IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3980–3990. [Google Scholar] [CrossRef] [PubMed]
  26. Huang, G. Labeled Faces in the Wild Home. 2018. Available online: https://www.aiaaic.org/aiaaic-repository/ai-algorithmic-and-automation-incidents/labeled-faces-in-the-wild-lfw-dataset (accessed on 1 January 2025).
Figure 1. ResNet50 architecture.
Figure 2. A voting system based on facial recognition.
Figure 3. Voters captured at the entrance of the polling station.
Table 1. Face detection result on the LFW dataset.

Approaches | Techniques | Accuracy
Gurnani et al. [15] | AlexNet | 91.8%
Schroff et al. [16] | FaceNet | 98.87%
Omkar et al. [17] | ConvNet | 98.95%
Mondal et al. [18] | GoogLeNet | 99.1%
Yuxiang et al. [19] | Transfer learning with ResNet50 | 99.33%
Proposed | MTCNN + ResNet50 | 99.56%
Table 2. Vote result table (Part 1).

Attribute | 1 | 2 | 3 | 4 | 5
ID Number | 2021081528 | 2021075431 | 2021075617 | 2021075852 | 2021030908
Name | Voter1 | Voter2 | Voter3 | Voter4 | Voter5
Date of Birth | 15 December 1998 | 14 March 1996 | 17 May 1997 | 17 November 1998 | 25 December 1998
Place of Birth | City1 | City2 | City1 | City1 | City3
Phone | 694X | 57X | 59X | 59X | 69X
Gender | MALE | FEMALE | MALE | FEMALE | MALE
Voting Office | P1 | P1 | P3 | P3 | P1
Voted | No | Yes | Yes | Yes | No
Date | 19 August 2021 | 19 August 2021 | 19 August 2021 | 19 August 2021 | 16 August 2021
Time | 07:16:56 | 07:11:09 | 07:09:55 | 07:06:17 | 21:51:38
Table 3. Vote result table (Part 2).

Attribute | 6 | 7 | 8 | 9 | 10
ID Number | 2021021123 | 2021075224 | 2021145028 | 2021021123 | 2021021128
Name | Voter6 | Voter7 | Voter8 | Voter9 | Voter10
Date of Birth | 21 December 1999 | 15 October 1998 | 16 June 1990 | 8 December 1970 | 8 December 1970
Place of Birth | City1 | City2 | City3 | City3 | City1
Phone | 69X | 67X | 69X | 69X | 69X
Gender | MALE | MALE | FEMALE | MALE | MALE
Voting Office | P2 | P2 | P1 | P2 | P3
Voted | Yes | Absent | Absent | Absent | Absent
Date | 16 August 2021 | 16 August 2021 | 16 August 2021 | 16 August 2021 | 16 August 2021
Time | 21:39:59 | 21:55:59 | 21:55:59 | 21:55:59 | 21:55:59
Table 4. Training results comparison between the LFW dataset and our approach images.

Metric | LFW (10 epochs) | LFW (20 epochs) | Ours (10 epochs) | Ours (20 epochs)
Loss | 0.0613 | 0.0081 | 0.2296 | 0.0081
Accuracy | 98.78% | 99.86% | 97.11% | 99.87%
Val_Loss | 0.0296 | 0.0210 | 0.1469 | 0.0755
Val_Acc | 99.51% | 99.60% | 98.51% | 98.73%
Precision | 99.80% | 99.93% | 99.91% | 99.99%
Val_Prec | 99.92% | 99.84% | 100% | 99.81%
Recall | 99.83% | 99.74% | 93.22% | 99.58%
Val_Recall | 99.27% | 99.50% | 96.45% | 98.17%
Train Time (s) | 20,363 | 34,922 | 1383 | 2794

