1. Introduction
Facial recognition technology (FRT) has rapidly emerged as a cornerstone of digital identity verification, with applications ranging from smartphones and online services to border control and law enforcement. In controlled tests, leading systems can achieve recognition accuracy above 99% [1,2]. However, the increasing adoption of FRT has amplified concerns about spoofing attacks. Malicious actors can mount presentation attacks using printed photos, replayed videos, or, more recently, AI-generated deepfakes. The widespread availability of generative AI tools makes it inexpensive for almost anyone to create lifelike fake faces, intensifying the pressure on security measures [3].
Traditional face recognition systems, while accurate in ideal conditions, are highly vulnerable without robust liveness detection mechanisms. To mitigate such risks, this work introduces a secure Face Recognition-as-a-Service (FRaaS) platform that integrates multiple layers of anti-spoofing defenses. The system combines passive liveness checks, which analyze natural cues in captured face images, with active challenge–response prompts that request simple real-time user actions (e.g., blinking, smiling, or turning the head). This layered design ensures resilience against both low-effort and advanced spoofing attacks [4].
Beyond technical robustness, ethical and regulatory considerations guide the design. The platform aligns with the General Data Protection Regulation (GDPR) and emphasizes fairness, transparency, and user consent in all authentication processes. These principles are essential for ensuring not only system integrity but also public trust in the deployment of biometric authentication technologies.
This work introduces a two-layer anti-spoofing framework that combines passive single-frame analysis with active challenge–response prompts. It also implements a scalable cloud-based SaaS platform with an SDK that enables seamless third-party integration. In addition, the study provides a comprehensive evaluation of system performance, covering frontend responsiveness, backend scalability, and anti-spoofing effectiveness on public benchmarks. Furthermore, the platform is designed in accordance with ethical AI principles and privacy regulations, ensuring compliance and fostering user trust.
The main contributions of this paper are as follows:
Develops AuthVisage, a scalable Face Recognition-as-a-Service (FRaaS) platform, including an open-source SDK that enables seamless integration of face authentication into third-party applications.
Proposes a multi-layered anti-spoofing framework that combines fast passive liveness detection with on-demand active challenge–response, striking a balance between security and usability.
Integrates trustworthy AI principles—explainability, fairness, transparency, and privacy—into both technical and ethical dimensions of the design.
The potential applications of AuthVisage extend across multiple domains. In online banking and digital payments, the platform can provide secure, low-latency identity verification to prevent fraud. In enterprise access control, it can replace or complement traditional badge systems with a more secure and user-friendly alternative. In e-government services, the technology can support trustworthy remote authentication for sensitive citizen services. Furthermore, healthcare systems can benefit from reliable identity verification to protect patient records, while educational platforms can employ the system to secure remote examinations. Finally, in consumer applications such as social media or e-commerce, AuthVisage can offer frictionless login experiences while ensuring protection against impersonation and spoofing.
2. Related Work
Research on face recognition has matured considerably, with leading systems now achieving accuracy above 99% in controlled conditions [1,2]. Yet, the practical deployment of these technologies in high-stakes domains such as finance, e-commerce, and border security continues to raise concerns about spoofing, reliability, and fairness. This section reviews the most relevant prior work, focusing on three interrelated themes: anti-spoofing methods, reliability and usability trade-offs, and the ethical and regulatory landscape.
2.1. Face Anti-Spoofing Methods
Preventing presentation attacks remains one of the most studied challenges in biometric authentication. Traditional approaches relied on low-level cues such as texture, motion, or image quality. For instance, Fourati et al. [2] demonstrated that lightweight image-quality checks and motion cues could be used on mobile devices, offering fast performance but limited robustness. More recent work has shifted toward deep learning. For example, Vision Transformer (ViT) models [5] have been explored for zero-shot detection of novel attack types, though their computational cost hinders real-time use.
A growing body of work emphasizes generalization across domains. Xu et al. [6] proposed FasTCo, stabilizing temporal predictions to handle variable conditions, while Liu et al. [7] leveraged contrastive learning to detect high-quality 3D mask attacks. Generative approaches such as DiffFAS [8] augment training with synthetic variations to improve cross-domain robustness. Collectively, these works highlight the progress of data-driven methods but also their dependence on high computational resources and balanced datasets.
Additional research has further expanded the spectrum of anti-spoofing techniques. Yu et al. [9] introduced the Central Difference Convolutional Network (CDCN), built around a novel convolutional operator that incorporates gradient-level features into standard convolutions, thereby enhancing the model’s sensitivity to subtle spoofing cues. Their work also employed neural architecture search to optimize the CDCN design, achieving strong generalization across benchmark datasets. In another direction, Kong et al. [10] proposed an acoustic-based face anti-spoofing system for smartphones, which uses active audio signals emitted and received by the device to analyze echo patterns. This approach demonstrated that sound-based cues can effectively distinguish between real and fake faces, providing a complementary modality to vision-based methods. These contributions highlight the diversity of sensing modalities and architectural innovations being explored in the field.
2.2. Balancing Reliability and Usability
Another stream of research examines how to maintain strong security without undermining user experience. Hofbauer et al. [1] showed that commercial face-recognition systems often trade usability for robustness against spoofing, leading to friction in everyday scenarios. Lavens et al. [11] responded with adaptive frameworks that adjust liveness checks depending on context, thereby reducing false rejections in benign settings. These studies underscore the persistent tension between seamless user experiences and the stringent requirements of secure authentication.
2.3. Fairness, Transparency, and Accountability
Technical advances alone are insufficient without addressing fairness and transparency. Bias in face recognition has been well-documented; Buolamwini and Gebru [12] revealed significantly higher error rates for darker-skinned and female subjects compared to lighter-skinned males. Since then, vendors and researchers have integrated fairness testing and diverse training datasets to mitigate these disparities. Beyond bias, transparency has gained prominence: Microsoft’s responsible AI principles, for instance, emphasize user awareness, clear communication of system limitations, and external validation [13]. Accountability mechanisms, such as human-in-the-loop review for high-stakes decisions, are now considered essential safeguards.
2.4. Privacy and Regulatory Compliance
The collection and processing of biometric data bring stringent regulatory obligations. The European Union’s General Data Protection Regulation (GDPR) explicitly classifies facial images as sensitive biometric data, imposing strict consent, minimization, and purpose-limitation requirements [14]. Similar principles are echoed in the U.S. Biometric Information Privacy Act (BIPA) and California’s CCPA. To align with these regulations, privacy-by-design has become a guiding principle in system development, where face templates are processed locally when possible and storage is limited to encrypted embeddings rather than raw images [15].
2.5. Positioning of the Present Work
In summary, prior research demonstrates notable advances in anti-spoofing methods, practical strategies to balance security and usability, and growing attention to ethical and legal considerations. However, existing approaches often address these challenges in isolation: models emphasize technical robustness but overlook deployment efficiency, usability studies neglect integration with state-of-the-art anti-spoofing, and regulatory compliance is treated as an afterthought rather than a design principle. This work positions itself at the intersection of these domains by presenting a multi-layered, cloud-based authentication service that integrates advanced anti-spoofing, emphasizes usability through adaptive checks, and embeds privacy and fairness considerations into the platform architecture. In doing so, it builds directly on the trajectory of prior research while offering a practical framework for trustworthy facial recognition in real-world applications.
3. Methodology
This section describes the design and implementation of the face authentication platform. It covers the system’s structure, workflow, user roles, and anti-spoofing method. It also explains the chosen methods and technologies, why they were selected, and how they support the goals of this work and build on existing research.
3.1. System Architecture and Technology Stack
The platform is built on a modern web stack comprising four core components: the SDK (Software Development Kit), frontend, backend, and data management layer. Each was selected for its performance, scalability, and compatibility with real-time workflows. The following sections detail the role and rationale behind each component.
Frontend (Next.js (https://nextjs.org/docs/app/getting-started, accessed on 25 September 2025), TypeScript, Tailwind CSS, ShadCN UI, MediaPipe (https://ai.google.dev/edge/mediapipe/solutions/vision/face_landmarker, accessed on 25 September 2025)): The frontend is built using Next.js with TypeScript to enable static type checking and improved developer productivity. Server-side rendering in Next.js enhances performance, while Tailwind CSS and ShadCN UI provide utility-first styling and consistent component design. The frontend includes interfaces for user registration, project configuration, biometric capture, and OAuth-based authorization workflows. MediaPipe is used in-browser to detect face landmarks in real time. Frames that meet specific conditions (e.g., correct orientation, centered face) are streamed to the backend, optimizing both bandwidth and backend load.
SDK (AuthVisage SDK): The SDK is a central integration layer that enables third-party applications to adopt the platform’s face authentication flow with minimal effort. It abstracts the complexity of the authentication lifecycle, handling PKCE-based OAuth, session state, token retrieval, and event hooks through a streamlined interface. The SDK allows developers to integrate face authentication quickly and securely without handling the complexities of OAuth or token management directly.
Backend (FastAPI and Socket.IO in Python): The core server logic runs on FastAPI, a high-performance Python web framework ideal for building APIs [16]. While FastAPI can be used to define REST endpoints, in this system, Socket.IO is utilized for the real-time, event-driven communication required for functionalities such as user registration, face verification, and liveness checks. Python was chosen for the backend due to its rich ecosystem of machine learning libraries—for example, the face recognition and liveness detection model (such as the DeepFace model [17]) can be integrated directly in Python. FastAPI’s asynchronous capabilities and automatic data validation, complemented by Socket.IO’s real-time communication, help ensure the service can handle concurrent requests (e.g., multiple login attempts) efficiently and securely, providing immediate feedback during interactive processes.
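To make this event-driven design concrete, the following minimal sketch combines FastAPI with a python-socketio server in one ASGI application. The face_frame event and the liveness_result payload are illustrative assumptions, not the platform’s actual identifiers.

import socketio
from fastapi import FastAPI

fastapi_app = FastAPI(title="AuthVisage backend (sketch)")
sio = socketio.AsyncServer(async_mode="asgi", cors_allowed_origins="*")  # restrict origins in production

# Wrap the FastAPI app so REST routes and Socket.IO events share one server.
app = socketio.ASGIApp(sio, other_asgi_app=fastapi_app)

@sio.event
async def connect(sid, environ):
    # In the real system, a session_id would bind this socket to an auth attempt.
    print(f"client connected: {sid}")

@sio.on("face_frame")
async def face_frame(sid, data):
    # data: an encoded frame streamed from the capture page. A real handler
    # would run the passive liveness model here without blocking the event loop.
    await sio.emit("liveness_result", {"live": True, "score": 0.97}, to=sid)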
Database and Authentication (Supabase): All persistent data is stored in a Supabase-hosted PostgreSQL database [18]. Supabase was selected because it provides an integrated solution for database, authentication, and file storage. User accounts and authentication are managed via Supabase Auth, which handles secure user credential storage (passwords are hashed with bcrypt and never stored in plain text [19]) and provides JWTs (JSON Web Tokens) for session management. Supabase enforces row-level security policies to ensure that each client organization (project) can only access its own data, a critical feature for a multi-tenant service. The database holds tables for users, projects, and logs of authentication attempts. All network communication between the frontend, backend, and database occurs over HTTPS, aligning with best practices for data privacy and integrity.
3.2. System Workflow
The system workflow includes project setup, SDK integration, user enrollment, and face-based authentication.
Figure 1 summarizes the runtime login path between a third-party site, the AuthVisage SDK, and the AuthVisage platform.
Step-by-step description
User Clicks “Face Login” (Third-Party Site). The relying party (client) exposes a “Continue with Face” button. This action does not handle biometrics locally; it simply triggers the SDK.
AuthVisage SDK (Third-Party Site). The SDK initializes a PKCE-based OAuth request, binds it to a short-lived session identifier, and stores the code_verifier locally. No tokens are exposed in the browser URL.
Redirect to AuthVisage Platform. The browser is redirected to the platform’s hosted capture page with non-sensitive parameters (e.g., session_id, project_id, code_challenge, state). This ensures that biometric capture and evaluation happen in a trusted first-party context.
Identity and Liveness Check (AuthVisage Platform). The platform opens the camera and streams valid frames to the backend for:
Passive liveness (single frame) to screen out presentation attacks;
(If needed) Active challenge–response (e.g., blink/turn head) for higher assurance;
Identity matching against the user’s enrolled template (see Section 4.3.3);
Consent check for the requesting project.
Platform Decision.
Success: If liveness and identity checks pass (and consent is present), the platform issues an authorization code back to the browser context and returns control to the SDK.
Failure: If liveness or identity fails (or consent is denied), the platform responds with 401 Unauthorized, and the attempt is logged.
Auth Code Exchange (SDK on Third-Party Site). The SDK redeems the authorization code with the original code_verifier (PKCE) and receives a scoped token for the client app (login only; no project-admin privileges). The user is then considered authenticated in the relying party.
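To illustrate the PKCE mechanics underpinning steps 2 and 6, the sketch below implements verifier/challenge generation and the backend-side check from RFC 7636 using only the Python standard library; it mirrors the described flow rather than reproducing the shipped SDK or backend code.

import base64
import hashlib
import secrets

def make_verifier() -> str:
    # High-entropy code_verifier, stored locally by the SDK (never in the URL).
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()

def make_challenge(verifier: str) -> str:
    # code_challenge = BASE64URL(SHA-256(code_verifier)), sent in the redirect.
    digest = hashlib.sha256(verifier.encode()).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode()

def backend_check(verifier: str, challenge: str) -> bool:
    # Performed when the authorization code is redeemed (step 6).
    return secrets.compare_digest(make_challenge(verifier), challenge)

verifier = make_verifier()
challenge = make_challenge(verifier)
assert backend_check(verifier, challenge)

Because only the challenge travels through the browser, an intercepted authorization code is useless without the locally stored verifier.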
3.3. Multi-Layered Anti-Spoofing Mechanism
A core contribution of our system is its two-layered approach to liveness detection and anti-spoofing, which is crucial for fighting against fraudulent attempts. Face spoofing can involve using a photo, video, or mask of someone else’s face to impersonate them. Without robust liveness detection, a malicious actor could easily fool a face recognition system by presenting a well-crafted fake face. In fact, it is widely acknowledged that a facial recognition system without any liveness detector “is just useless” in terms of security [20]. Our methodology implements two sequential layers of defense:
3.3.1. First Layer: Passive Liveness Detection (Single-Image Analysis)
The first layer is a passive anti-spoofing check performed automatically on the initial face image captured during authentication. “Passive” means that the system does not require the user to do anything extra; the user simply looks at the camera as they normally would. Under the hood, when the image arrives at the FastAPI backend, it is processed by a deep learning model that has been trained to distinguish real live faces from fake presentations. Specifically, we employ the Deep Pixel-wise Binary Supervision (DeepPixBiS) network, introduced by George and Marcel [21], which is designed for frame-level face presentation attack detection (PAD). DeepPixBiS utilizes a convolutional neural network (CNN) architecture that combines both binary and pixel-wise binary supervision during training. This dual-supervision approach enables the model to learn discriminative features at both the global image level and the local pixel level, enhancing its ability to distinguish between genuine and spoofed faces.
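As a sketch of this dual supervision, the PyTorch snippet below combines the pixel-wise and image-level binary cross-entropy terms with equal weighting, following George and Marcel [21]; the tensor shapes are assumptions consistent with a 14 × 14 output map.

import torch
import torch.nn.functional as F

def deeppixbis_loss(pixel_map, image_score, is_live, lam=0.5):
    # pixel_map: (B, 1, 14, 14) sigmoid outputs; image_score: (B, 1) sigmoid output;
    # is_live: (B,) float labels, 1.0 for bona fide and 0.0 for attack.
    pixel_target = is_live.view(-1, 1, 1, 1).expand_as(pixel_map)
    loss_pixel = F.binary_cross_entropy(pixel_map, pixel_target)
    loss_image = F.binary_cross_entropy(image_score, is_live.view(-1, 1))
    return lam * loss_pixel + (1.0 - lam) * loss_image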
If the passive liveness check concludes that the face is genuine (score above a certain threshold), then the first layer is passed successfully. This means the system did not detect any obvious spoofing signs in the image. Because passive detection operates on just a single frame and requires no user effort, it is very fast—the check adds only a few milliseconds to the authentication process. It is also user-friendly: the user typically will not even notice that a liveness assessment is happening. The benefit of passive liveness is that it introduces no extra friction into the login experience; users are not asked to perform any special tasks, making the process seamless and natural [22]. This layer, by catching many simple attacks (like someone holding up a photo), prevents unauthorized access in the majority of cases without inconveniencing legitimate users.
However, passive liveness alone may not defeat more sophisticated attacks, especially those that mimic live traits or use AI-generated deepfakes. If the result of the first layer is uncertain or negative (i.e., the model outputs a low liveness score or flags the image as potential spoof), the system does not immediately reject the login. Instead, it escalates to the second layer of defense for a more conclusive verification.
3.3.2. Second Layer: Active Liveness Detection (Challenge–Response Test)
The second layer is an active liveness detection procedure, triggered only when the passive check raises suspicion (i.e., when the confidence in liveness is below a threshold). Active liveness detection requires the user to perform a specific action in real-time, proving that they are physically present. When this layer is invoked, the frontend will prompt the user with a simple task, such as: “Please blink your eyes,” or “Turn your head to the left,” or perhaps “Smile” or “Follow the moving icon on the screen with your eyes.” The exact action can be random or configurable to prevent predictability. The camera captures this interaction (often as a short video clip or a series of images), which is then sent to the backend for analysis.
The concept of active liveness is essentially a challenge–response mechanism [23,24]: the system challenges the user to perform a live motion, and the correct response (the user doing the motion in front of the camera) confirms liveness. For instance, a static image of a person cannot blink on command, and a deepfake video would have difficulty seamlessly performing an unexpected head turn in sync with the request unless a real person is controlling it. By using active prompts, we add an extra hurdle for an attacker—they would need to create not only a realistic image but a controllable puppet of the victim’s face that could respond to prompts, which is significantly harder.
It is important to note that we do not perform active liveness for every login attempt—it is used selectively. This design is deliberate: while active liveness is very secure, it can introduce some user friction. Prompting users to perform an action for every single login might annoy them or slow down the process, akin to having to solve a CAPTCHA at each login [22]. By using a risk-based approach (passive first, active only if needed), we maintain a balance between security and usability. If an attempt is clearly legitimate (high passive liveness score), users won’t be bothered with extra steps. But if there is any doubt (e.g., borderline score or known high-risk scenario), the system seamlessly escalates to the active check. This two-tier strategy significantly improves security: it is robust against both low-effort attacks (stopped by layer one) and more advanced spoofing attempts (caught by layer two). In fact, research recommends the incorporation of active strategies to strengthen defenses against new or AI-driven spoofing techniques [20].
The decision logic of the two-layered liveness and verification process is illustrated in Figure 2. It shows how a face sample is first passed through passive liveness detection, escalated to an active challenge if necessary, and finally verified against the enrolled identity before granting access.
3.4. Alignment with Trustworthy AI Principles
Beyond functional and security considerations, our system is built with a strong emphasis on trustworthy AI principles. Facial recognition, especially in authentication, carries significant ethical and social implications. In designing the service, we adhere to key principles such as explainability, fairness, transparency, robustness, and privacy [25]. These principles align with widely accepted guidelines for ethical AI (e.g., the EU’s AI ethics framework and IBM’s pillars of trustworthy AI) and are ingrained in our methodology. Below, we explain how each principle is addressed:
3.4.1. Explainability
We strive to make the system’s AI-driven decisions understandable to both end-users and administrators. In a face recognition authentication scenario, explainability means that users should not be left confused as to why they were accepted or rejected by the system. Our implementation provides feedback and information at multiple levels:
User-Facing Feedback: When an authentication attempt fails, the user interface provides a clear message. Instead of a generic “Authentication failed,” the system indicates the cause when possible, such as “Face not recognized” or “Liveness check unsuccessful.” This hints to the user whether the issue was a face matching problem or a spoofing suspicion. In the case of a liveness challenge, if the user fails (perhaps they did not blink in time or the detection failed), the system will explicitly inform them and often give another chance or guidance. By doing so, the user understands what aspect of the AI’s decision needs their attention, demystifying the process.
Model Interpretability: While deep learning models are complex “black boxes” in many respects, we incorporate techniques to make them partially interpretable. During development, for instance, we use visualization tools to confirm that the model focuses on relevant facial regions (e.g., eyes, nose) when predicting liveness, rather than arbitrary background patterns. Though we do not expose these visualization tools directly to end-users, they help the development team validate the model’s consistency and provide higher-level explanations to admins if needed.
3.4.2. Fairness
Facial recognition technologies have historically faced challenges with demographic bias—differences in accuracy across user groups of various ethnicities, skin tones, ages, or genders [12]. A trustworthy face recognition system must strive to be fair and not unduly favor or disadvantage any group. We take several steps to mitigate bias and ensure equitable performance:
Diverse Dataset and Testing: The face recognition model and the liveness detector were evaluated on a large and diverse dataset during development. We tested with faces of varied skin tones and from different demographic backgrounds to measure whether false rejection or false acceptance rates differed across groups.
Ongoing Monitoring: Fairness is not a one-time configuration but a continuous process. Our system includes monitoring for bias in the field. The Super Admin’s global statistics can reveal if certain demographics experience higher failure rates. Feedback channels allow Project Owners or users to report suspected bias. These insights feed back into model retraining or adjustments.
Transparency about Performance: We do not conceal known limitations. If certain lighting conditions or demographics need special consideration, we convey that to clients so they can guide their users accordingly. We also comply with best practices for fairness testing and publish (where possible) high-level metrics of model accuracy across subgroups.
3.4.3. Transparency
Transparency in our context means being open about how the face recognition service operates, which AI models are in use, how user data is processed, and which parties have access. Key steps include the following:
Documentation of the System and Models: We maintain thorough technical documentation describing the SDK, backend, and basics of the AI models used (e.g., “DeepFace for recognition,” “CNN-based liveness model”), including known performance metrics and limitations.
Access to Own Data: End-users can see and manage their own accounts (e.g., request data deletion, revoke a website’s access). Project Owners can export logs for independent analysis. We never hide behind a “black box” claim when it comes to user data usage.
3.4.4. Privacy
Privacy is of paramount importance where biometric data is concerned. Facial data is highly sensitive and essentially immutable. Thus, our methodology enforces strict privacy safeguards, aligned with regulations like GDPR [26]:
Data Minimization and Purpose Limitation: We store only the minimum facial data (embeddings or templates) required for authentication. We never repurpose it for advertising or other non-essential uses.
Informed Consent: Users cannot complete enrollment without explicitly agreeing to share their biometric data for face authentication. This aligns with GDPR’s stance on biometric data as a “special category” [26].
Encryption and Secure Storage: Biometric data is stored in encrypted form in Supabase [18]. All traffic is HTTPS, and backups are also encrypted. We employ row-level security, ensuring that only authorized roles can query specific data.
Access Control and Anonymization: Administrators do not see raw images; they only manage user accounts. If a face image is needed for debugging, the request is audited, and only minimal data is exposed. We rely primarily on biometric embeddings rather than storing full images, reducing the risk of reconstruction attacks.
User Control and Data Deletion: GDPR grants the “right to be forgotten.” Our system allows users to delete their accounts or facial data, which purges the relevant records from the database. When a project is removed, its users’ biometric data is also deleted. Periodic cleanup of inactive accounts further minimizes unnecessary retention.
Compliance and Auditing: We perform Data Protection Impact Assessments (DPIA) for biometric data usage, keep records of processing activities, and can provide them to authorities if required. This helps ensure ongoing compliance with privacy regulations.
4. Implementation
4.1. Frontend Implementation—SDK and Platform
The frontend of the face authentication system consists of two main modules: the AuthVisage SDK and the AuthVisage Platform. These components enable a secure and seamless OAuth-inspired facial authentication workflow using real-time video analysis and anti-spoofing techniques.
4.1.1. AuthVisage SDK
The AuthVisage SDK is an open-source JavaScript library published on npm (https://www.npmjs.com/package/authvisage-sdk, accessed on 25 September 2025). It abstracts the biometric login workflow into a simple API for third-party developers, allowing them to integrate “Continue with Face” authentication with minimal configuration. The SDK handles session creation, PKCE flow generation, redirect logic, and final token exchange after user verification.
To ensure secure authorization, the SDK adopts the Proof Key for Code Exchange (PKCE) flow, which is specifically designed for public clients that cannot store secrets securely. This is used in conjunction with a temporary session_id, which acts as a short-lived identifier to bind the client request to its corresponding biometric session on the backend.
The comparison highlights that PKCE eliminates token exposure in browser URLs and removes the need for a client secret, which is critical for public clients (SPAs, mobile) that cannot safeguard secrets (see Table 1). By binding the authorization code to a one-time code verifier, PKCE materially reduces interception and replay risks, thereby raising the baseline security of third-party integrations. The main trade-off is a minor increase in round-trip overhead and local storage of the short-lived verifier, which we mitigate with strict lifetimes and cleanup.
The intentionally small API surface lowers integration complexity for developers—faceLogin() encapsulates PKCE session creation and redirects, while authStateChange enables reactive UI patterns without polling. The explicit logout() endpoint enforces token hygiene on the client, reducing the risk of stale or orphaned sessions. In practice, this minimal interface was sufficient for our example apps while leaving room for future extensibility via event hooks (see Table 2).
4.1.2. AuthVisage Platform
The platform is a web application built using Next.js, TypeScript, Tailwind CSS, and ShadCN UI. It serves as the centralized biometric identity interface, handling consent flows, face capture, real-time validation, and communication with the backend.
Upon redirection from the SDK, the platform performs the following:
Parses query parameters: session_id, code_challenge, state, project_id.
Establishes a WebSocket connection to the backend using the session ID.
Initializes the user’s camera and starts MediaPipe-based facial landmark detection.
Sends only valid frames to the backend for anti-spoofing analysis.
MediaPipe is used for landmark detection to assess whether a frame is valid for analysis (e.g., correct orientation and a centered face) before any data is sent to the backend.
Once a user’s identity is verified, the backend returns an authorization code to the platform via WebSocket. The platform then redirects the user back to the SDK’s original redirect URL.
The SDK captures this code and sends a POST request with the stored code_verifier to the backend, which responds with a custom token. This token is used only within the third-party app and is scoped to login-only functionality.
The platform’s internal pages are protected via Supabase-authenticated sessions and are only available to first-party users.
4.2. Backend Implementation
4.2.1. Core Framework and Communication
The core of the backend is a FastAPI application, as initialized in app/main.py. FastAPI’s asynchronous nature, built on Starlette and Pydantic, allows for non-blocking I/O operations, which are crucial for handling multiple simultaneous connections and processing-intensive tasks like facial recognition without significant delays. The application is configured with CORS (Cross-Origin Resource Sharing) middleware to allow requests from configured origins, as specified in settings.all_cors_origins.
4.2.2. API Structure and Routing
The backend exposes a set of RESTful API endpoints and Socket.IO event handlers. The main API router (api_router) is defined in app/api/main.py and aggregates various sub-routers for different functionalities. This modular approach organizes the API into logical sections:
OAuth Endpoints (app.api.routes.oauth): Manages OAuth 2.0 flows, including token issuance and validation, crucial for secure communication between the SDK and the backend.
User Management Endpoints (app.api.routes.users): Handles user registration, profile management, and project associations.
Utility Endpoints (app.api.routes.utils): Provides miscellaneous helper functions and status checks.
These routers are included in the main FastAPI application with a prefix defined by settings.API_V1_STR (e.g., /api/v1).
4.2.3. Application Lifecycle and Initialization
The FastAPI application utilizes a lifespan context manager, defined in app/main.py, to handle startup and shutdown events. During startup (logger.info("lifespan start")), a key initialization step is cache_models(). This function, defined in app.utils.cache_models, is responsible for loading machine learning models (e.g., for face recognition and liveness detection) into memory. This pre-loading strategy minimizes latency during authentication requests, as models are readily available for inference.
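A minimal reconstruction of this hook is sketched below; the body of cache_models() is a placeholder, since only its role (pre-loading models at startup) is described here.

import logging
from contextlib import asynccontextmanager
from fastapi import FastAPI

logger = logging.getLogger("authvisage")

def cache_models() -> None:
    # Placeholder: load the recognition and liveness models into memory once,
    # so no authentication request pays the model-loading cost.
    ...

@asynccontextmanager
async def lifespan(app: FastAPI):
    logger.info("lifespan start")
    cache_models()
    yield  # the application serves requests here
    logger.info("lifespan shutdown")

app = FastAPI(lifespan=lifespan)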
4.2.4. Configuration and Logging
Application settings, such as project name, API version string, and CORS origins, are managed through app.core.config.settings. This centralized configuration allows for easy adjustments across different environments (development, testing, production). Logging is configured to include timestamps and detailed information for both default application logs and access logs, as demonstrated by the timestamp_log_config function in app/main.py. This structured logging is vital for monitoring system behavior, debugging issues, and auditing security events.
4.3. Face-Recognition and Liveness Detection
4.3.1. Face Detection
Faces are first localized with RetinaFace, a single-shot multi-level detector that jointly predicts bounding boxes and five landmark points in one feed-forward pass. RetinaFace reaches state-of-the-art accuracy on the WIDER-Face benchmark while operating in real time on modern GPUs. The five landmarks detected represent the eye centers, nose tip, and mouth corners.
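As one possible realization of this stage (an assumption rather than a confirmed dependency of the platform), the retina-face Python package exposes the detector’s boxes, confidence scores, and five landmarks directly:

from retinaface import RetinaFace

# Assumes probe.jpg contains at least one detectable face.
detections = RetinaFace.detect_faces("probe.jpg")
for key, det in detections.items():
    x1, y1, x2, y2 = det["facial_area"]   # bounding box corners
    landmarks = det["landmarks"]          # eye centers, nose tip, mouth corners
    print(key, det["score"], landmarks["right_eye"], landmarks["nose"])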
4.3.2. Liveness Detection
The passive liveness model is trained on two publicly available datasets: Replay-Mobile and OULU-NPU. Replay-Mobile comprises 1190 video clips of both bona fide (live) and spoofed face presentations captured under various lighting conditions using mobile devices. OULU-NPU consists of 4950 video clips recorded with six different smartphones across three sessions, encompassing diverse illumination conditions and presentation attack instruments. Each dataset provides subject-disjoint training, development, and testing splits, facilitating robust evaluation protocols [21].
The model employs the first eight layers of the DenseNet-161 architecture, initialized with ImageNet-pretrained weights. This configuration includes two dense blocks and two transition layers, culminating in a 14 × 14 feature map. A 1 × 1 convolution followed by a sigmoid activation generates a pixel-wise binary map, while a fully connected layer with sigmoid activation produces an image-level binary output. The training objective combines pixel-wise and image-level binary cross-entropy losses with equal weighting (λ = 0.5). Data augmentation techniques such as random horizontal flips and color jittering are applied. The model is optimized using the Adam optimizer with a learning rate of 1 × 10⁻⁴ and a weight decay of 1 × 10⁻⁵ [21].
For deployment, the trained model is exported to the ONNX format and quantized to FP16 precision, reducing the model size while maintaining performance. During inference, a single frame is processed to produce a 14 × 14 feature map, whose mean value serves as the final PAD score. This approach enables real-time processing with minimal computational overhead [21].
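A deployment-side sketch with ONNX Runtime is shown below; the model path, input name, and input shape are illustrative assumptions.

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("deeppixbis_fp16.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

def pad_score(face_crop: np.ndarray) -> float:
    # face_crop: preprocessed (1, 3, 224, 224) float16 tensor (assumed layout).
    pixel_map = session.run(None, {input_name: face_crop})[0]
    # The mean of the 14 x 14 map serves as the final liveness score.
    return float(pixel_map.mean())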
When higher assurance is needed, an active liveness check is triggered. The server sends a randomized head-movement prompt (e.g., “turn right, turn left”). The observed pose trajectory must reach all target angles within a specified time window; failure aborts the login.
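The trajectory check can be sketched as follows; the 20-degree targets, 5-degree tolerance, and 10-second window are illustrative assumptions rather than the platform’s tuned values.

from typing import List, Tuple

def passes_challenge(
    trajectory: List[Tuple[float, float]],  # (timestamp_s, yaw_degrees) samples
    targets: List[float],                   # e.g., [+20.0, -20.0] for right, then left
    tolerance: float = 5.0,
    window_s: float = 10.0,
) -> bool:
    # Each prompted target angle must be reached in order within the window.
    if not trajectory:
        return False
    start = trajectory[0][0]
    next_target = 0
    for t, yaw in trajectory:
        if t - start > window_s:
            break  # time window exceeded; remaining targets fail the check
        if next_target < len(targets) and abs(yaw - targets[next_target]) <= tolerance:
            next_target += 1
    return next_target == len(targets)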
4.3.3. Face Recognition
After the user successfully passes both passive and (if triggered) active liveness checks, the aligned face crop is forwarded to the recognition stage. First, the crop is processed by FaceNet, which produces a 128-dimensional embedding that compactly represents the facial identity. Identity matching is then cast as a cosine-similarity comparison between the probe embedding and the gallery templates; scores approaching 1 indicate that the two images depict the same person. A configurable decision threshold determines whether the claim is accepted or rejected.
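A sketch of this decision rule follows; the 0.7 threshold is an illustrative assumption standing in for the configurable value mentioned above.

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def accept(probe: np.ndarray, template: np.ndarray, threshold: float = 0.7) -> bool:
    # probe and template are 128-dimensional FaceNet embeddings; scores near
    # 1.0 indicate the same identity.
    return cosine_similarity(probe, template) >= threshold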
5. Evaluation
5.1. Frontend Evaluation
This section presents a comprehensive evaluation of the face authentication platform, examining both the SDK and the platform. The methodology combines unit testing, integration validation, and large-scale load simulations to measure reliability, scalability, and user experience.
5.1.1. Testing Methodology
The evaluation employed a multi-tiered strategy:
SDK Testing: Implemented using Jest, ensuring correct behavior across authentication workflows.
Platform Load Testing: Conducted with Grafana K6 (https://grafana.com/docs/k6/latest/, accessed on 25 September 2025), simulating diverse traffic patterns (e.g., login bursts, registration flows).
Performance Audits: Performed with Lighthouse to assess frontend quality.
5.1.2. SDK Evaluation
The SDK underwent extensive testing using Jest, with test cases spanning from initialization to session management and error handling. Test coverage metrics are as follows (Table 3):
Core test cases include the following:
OAuth Flow Testing: Verifies PKCE generation, redirect logic, and token exchange.
State Handling: Tests anti-CSRF state generation and validation.
Error Resilience: Confirms the SDK behaves predictably under network or logic failures.
5.1.3. Platform Evaluation
Using K6, 10 realistic traffic scenarios were implemented, including burst traffic, mobile interactions, and sustained sessions. Testing was performed locally on a developer-grade laptop (Intel i5, 16 GB RAM). While this limits the absolute scale tested, the simulation focuses on logical correctness and latency consistency.
All thresholds were met with a significant margin (Table 4). Below is a synthesis of the results:
Requests p95: 1.51 ms (Threshold: 1500 ms)
Auth Completions: 18,374 (Threshold: 50)
Request Rate: 276 req/s (Threshold: 100)
Error Rate: 0.00% (Threshold: <5%)
5.1.4. Frontend Performance (Lighthouse Audit)
Key routes were audited using Lighthouse. Most pages scored 96–100% in all categories, with minor accessibility dips.
5.1.5. Rationale for Local Testing
Due to hardware constraints, the tests were carried out without using distributed infrastructure, auto-scaling, or CDN optimizations. The focus was instead on verifying logical integrity, internal routing behavior, and the responsiveness of individual instances.
Although this setup does not reflect full production-scale performance, it offers a reliable baseline for validating system correctness and sets the stage for a stable rollout when transitioning to cloud-based scaling.
5.2. Backend Evaluation
5.2.1. Test Suite
A comprehensive test suite was developed to ensure the reliability and accuracy of the backend. Automated tests are located in the backend/tests/ directory and cover core API endpoints, authentication flows, and edge cases. For example, test_main_api.py and test_oauth.py validate the OAuth flow, user registration, and error handling. The test suite can be executed using the provided shell scripts, such as scripts/test.sh, which runs all tests and reports coverage (see Table 5).
5.2.2. Testing Methodology
Testing was performed using both unit and integration tests. Unit tests validate individual functions and endpoints in isolation, while integration tests simulate real-world authentication scenarios, including user registration, face data streaming, and liveness checks. The use of automated tests ensures repeatability and rapid feedback during development.
Manual testing was also conducted using API clients and simulated client applications to verify real-time interactions over WebSocket and to assess system behavior under concurrent load. This approach was chosen to complement automated tests and to validate the user experience in realistic conditions.
5.3. Face Recognition and Anti-Spoofing Models Evaluation
Benchmark and Metrics
The passive liveness component is assessed on the OULU-NPU benchmark using the standard error rates (Table 6): the Attack Presentation Classification Error Rate (APCER), the Bonafide Presentation Classification Error Rate (BPCER), and their mean, the Average Classification Error Rate (ACER). Lower values denote better performance. DeepPixBiS attains an ACER of 0.4%, improving upon previous best results and achieving a perfect BPCER of 0%, i.e., no genuine users are rejected during evaluation.
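For reference, ACER is the mean of the two error rates; given the reported BPCER of 0%, an ACER of 0.4% implies an APCER of 0.8%:

\mathrm{ACER} = \frac{\mathrm{APCER} + \mathrm{BPCER}}{2} = \frac{0.8\% + 0\%}{2} = 0.4\%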
6. Discussion
The evaluation confirms that our platform provides a secure, efficient, and scalable face authentication system. Each component—frontend, backend, SDK, and biometric models—was assessed using both automated and manual techniques and consistently met or exceeded performance and correctness thresholds.
The SDK exhibited high reliability, with 100% function coverage and robust handling of edge cases such as token errors, invalid states, and network interruptions. Its modular design and high test coverage ensure that it can be confidently integrated into third-party applications.
Frontend evaluation via Lighthouse audits demonstrated excellent performance and SEO optimization across all tested routes, with scores ranging from 96% to 100%. The main limitation lies in accessibility compliance, where some pages scored below 80%. These deficits are minor and primarily relate to ARIA labeling and color contrast, which can be addressed in future iterations.
Platform scalability was validated through simulated load tests using K6. The platform maintained a request p95 latency of 1.51 ms and zero error rate across burst and sustained traffic scenarios. These results provide confidence in its ability to scale under production conditions, especially when deployed to cloud-native environments.
The passive liveness detection model, based on DeepPixBiS, achieved an ACER of 0.4% on the OULU-NPU benchmark, outperforming previous state-of-the-art methods. The model also demonstrated real-time inference speed at approximately 5 FPS on commodity GPU hardware, making it suitable for mainstream deployment without requiring specialized accelerators.
We primarily report APCER, BPCER, and ACER since they are the standardized metrics for face anti-spoofing benchmarks such as OULU-NPU. While other metrics like AUC (Area Under the ROC Curve) and EER (Equal Error Rate) are also informative, we found ACER to provide a balanced interpretation across both false acceptance and false rejection. Importantly, prior anti-spoofing studies report results in terms of ACER, enabling direct comparison with state-of-the-art methods. For this reason, we prioritize ACER while noting that complementary metrics such as EER could be incorporated in future extensions to provide additional perspectives.
Beyond traditional presentation attacks such as printed photos or replayed videos, the rapid progress of generative AI introduces deepfakes as a significant new threat. Modern deepfake techniques can synthesize highly realistic facial videos that preserve subtle dynamics like blinking, head movement, and expression changes, making them increasingly difficult for conventional liveness checks to detect. This raises serious concerns for the reliability of face recognition in security-critical applications such as banking, e-government, and border control. Moreover, deepfakes lower the barrier for attackers: open-source models and consumer-level hardware now make the production of convincing forgeries accessible to non-experts. The broader risks extend beyond system compromise—successful deepfake-based intrusions can undermine public trust in biometric authentication as a whole, amplifying regulatory and ethical challenges. Addressing this evolving threat will likely require continuous innovation in multimodal liveness detection, adaptive training with new attack data, and integration of interpretability tools to help identify when systems are being deceived.
The OULU-NPU dataset includes 55 subjects with variation in gender and ethnic background (Asian and Caucasian), recorded under multiple illumination settings and devices [27]. However, the age distribution is relatively narrow, primarily young adults, which limits the ability to fully assess age-related effects. Future studies could extend validation to a broader range of demographics to further strengthen generalizability.
Future research directions could explore further optimizations in inference latency, broader demographic validations, and the integration of additional passive and active anti-spoofing layers. Emphasis on privacy-preserving methods such as federated learning could further enhance user trust and compliance with emerging privacy regulations.
7. Conclusions
In this work, we built a multi-layer Face Recognition-as-a-Service platform that combines passive and active liveness checks with strict role-based access controls and privacy-by-design measures to defend against spoofing attacks while safeguarding user data. The platform provides a seamless sign-in experience through a modular microservice backend, a reliable SDK, and localization-ready interfaces for multilingual deployment. Looking ahead, the system can be strengthened with additional anti-spoofing techniques, privacy-preserving methods such as federated learning, and broader demographic validation to improve fairness and inclusivity. These directions, together with continued improvements in accessibility and inference efficiency, will ensure that the platform remains robust, scalable, and trustworthy for security-critical applications.