1. Introduction
Artificial intelligence systems have displayed remarkable performance capabilities over a wide range of tasks, including image recognition and natural language processing. However, despite these developments, practical applications of artificial intelligence systems reveal inherent limitations that cannot be addressed through automation alone. In situations involving sophisticated decision-making processes, such as in healthcare, finance, legal processes, and autonomous systems, understanding contexts, making ethical judgments, and being accountable are requirements that artificial intelligence systems are not yet able to offer. This has led to the development of various Human-in-the-Loop (HITL) methods that involve the integration of human expertise into artificial intelligence systems during various phases of learning processes [
1].
The term human-in-the-loop (HITL) extends beyond the simple oversight of automated systems. Modern HITL systems enable a two-way interaction: human input shapes the model’s responses, while the AI system extends human capabilities by processing large volumes of data and surfacing patterns that humans would struggle to identify on their own. This model has proved particularly effective in environments with high error costs and a need for explainable decisions [
2]. The growing adoption of HITL approaches reflects a broader shift in AI research from pursuing full autonomy toward designing systems that enhance rather than replace human decision-making.
Several factors have contributed to the increased interest in HITL systems over the past decade. The deployment of machine learning models in sensitive domains such as medical diagnosis, criminal justice, and financial lending has raised concerns about algorithmic bias, lack of transparency, and the potential for harmful outcomes when humans are excluded from the decision process [
3]. Regulatory frameworks, including the European Union AI Act, now mandate human oversight for high-risk AI applications, creating both legal requirements and practical incentives for HITL design [
4]. At the same time, advances in interactive machine learning, active learning, and reinforcement learning from human feedback have provided technical foundations for building effective HITL systems that can learn efficiently from limited human input. Together, these regulatory, ethical, and technical developments motivate the Human-in-the-Loop design space summarized in
Figure 1, which outlines the main technical approaches for integrating human input and feedback across the AI pipeline.
The scope of HITL research spans multiple disciplines and application domains. In machine learning, HITL methods address challenges of data annotation, model training, and output validation through structured human involvement [
5]. Human–computer interaction research examines how interfaces and workflows can be designed to facilitate human–AI collaboration, manage cognitive load, and maintain user trust. AI ethics addresses the questions of responsibility, accountability, and value alignment that arise from human–AI co-decision-making. Domain-specific research, for instance in healthcare, autonomous vehicles, and cybersecurity, examines how human-in-the-loop principles can be adapted to each domain’s requirements and constraints [
6].
Yet despite the growing number of publications on human-in-the-loop (HITL) research, the field remains fragmented across disciplines, and its results are poorly integrated. Existing surveys tend to focus on specific technical approaches (e.g., active learning) or specific application domains (e.g., medical imaging). A unifying framework that connects theory, methods, and applications is absent, which prevents both researchers and practitioners from appreciating the full range of HITL methodologies and selecting the best approach for their needs [
7]. The present survey addresses this gap by providing a systematic review that synthesizes findings across technical and application domains.
This review makes five contributions to the human-in-the-loop (HITL) literature. First, we propose a taxonomy for HITL systems that considers the placement of the human in the loop, the granularity of interaction, and the temporal characteristics of the interaction. This taxonomy provides a framework for comparing the wide range of HITL techniques and for identifying the principles that underlie HITL system construction. Second, we survey the techniques that enable human–AI collaboration, including active learning, reinforcement learning from human feedback, and explainability. Third, we consider the ethical and governance implications of HITL systems, including fairness, bias, and legal requirements for human oversight [
8]. Fourth, we survey applications across multiple high-stakes domains, identifying common patterns and domain-specific adaptations. Fifth, we discuss open challenges and future research directions, including scalability of human oversight, management of conflicting human feedback, and the design of adaptive HITL architectures [
9]. To avoid ambiguity in overlapping terminology, Section 2.6 explicitly defines the scope boundaries among HITL AI, human–AI collaboration, IML, XAI, and RLHF.
Summary of contributions:
We introduce a unified taxonomy for HITL systems organized along three explicit dimensions—loop placement, interaction granularity, and temporal characteristics—and use it consistently to structure the survey.
We provide two reusable synthesis “anchor” tables: a method-family comparison (Table 1) and a domain-focused comparison (Table 2).
We consolidate major HITL method families (e.g., active learning, RLHF, interactive model steering, post-hoc validation/escalation, prompt-based workflows) into a single comparison view that specifies required human inputs, typical costs, risks, and failure modes (Table 1).
We synthesize cross-cutting deployment challenges—scalability of oversight, cognitive load, trust calibration, and security/adversarial manipulation—and connect them to concrete design concerns discussed throughout the paper.
We outline open research directions and articulate a design-oriented perspective for moving from static HITL configurations toward adaptive human–AI oversight architectures.
1.1. Distinct Contributions Relative to Prior HITL Surveys
To clarify the novelty of this manuscript relative to previous HITL reviews, we emphasize four distinguishing factors. First, we use an integrative three-dimensional taxonomy (loop placement, interaction granularity, temporal characteristics), which allows a more direct comparison of systems that share a HITL designation but differ in workload, latency tolerance, and oversight cost. Second, we synthesize a wide range of application domains (healthcare, autonomous systems, cybersecurity, finance, education, and industry) under a unified framework that allows cross-domain comparison while respecting domain-specific constraints. Third, whereas previous reviews focus primarily on describing the methods available for HITL design, we also synthesize the relevant trust calibration and ethical governance considerations. Finally, we take a design-focused approach that connects method families, failure modes, and ethical considerations to the configuration choices available in a practical HITL design.
1.2. Systematic Review Protocol
To clarify the review methodology, we followed a PRISMA-aligned workflow for evidence identification, screening, eligibility assessment, and synthesis. The core systematic corpus used for structured synthesis comprised 134 studies; additional references were included selectively for methodological background, policy context, or illustrative domain examples.
We searched Scopus and Google Scholar for studies published between January 2018 and January 2026. The search combined HITL and domain terms, including variants of “human-in-the-loop”, “human oversight”, “human-AI collaboration”, “human-on-the-loop”, “active learning”, “RLHF”, “AI governance”, and domain qualifiers (healthcare, autonomous systems, cybersecurity, finance, education, and manufacturing). Forward/backward citation tracking was then applied to capture high-relevance studies not retrieved in the initial database query.
Studies were included when they: (i) addressed explicit human involvement in AI system training, inference, supervision, or governance; (ii) reported conceptual, methodological, or empirical findings relevant to HITL design/evaluation; and (iii) were available in full text in English. Studies were excluded when they: (i) used HITL terminology without substantive human-role specification; (ii) were purely opinion/editorial pieces without analyzable technical or empirical contribution; (iii) duplicated substantially overlapping content; or (iv) fell outside the scope of AI-assisted decision systems.
Screening proceeded in three stages: title/abstract review, full-text eligibility assessment, and final synthesis coding. Figure 2 summarizes the resulting flow.
2. Theoretical Foundations
2.1. Anchor Tables: Methods and Domains
To help readers navigate the HITL design space, Table 1 and Table 2 summarize (i) common HITL method families and (ii) how HITL design choices typically manifest across high-stakes application domains. Table 1 organizes method families by required human input, indicative cost, key risks, and common failure modes; bullet-style entries in dense cells are used to improve scanability and support quick cross-row comparison. Table 2 compares application domains using consistent headings (human oversight points, regulation/standards pressure, evaluation metrics, and pitfalls) to make differences in oversight design and deployment constraints easier to interpret.
The intellectual underpinnings of Human-in-the-Loop AI are varied, drawing upon cybernetics, cognitive science, decision theory, as well as human factors engineering. An understanding of the intellectual underpinnings of HITL AI systems provides a context for the design principles and techniques that characterize modern HITL systems. This section follows the historical path of human–machine collaboration, the philosophical underpinnings of the current research, as well as the concept of hybrid intelligence, which structures much of the current research in the field.
2.2. Historical Evolution
The role of human judgment within automated systems predates modern artificial intelligence by several decades. Work by cybernetics researchers in the 1940s and 1950s established principles of feedback control that remain relevant to human-in-the-loop (HITL) design today. Norbert Wiener’s work on human–machine systems identified the need for mechanisms that enable humans to observe system behavior and act accordingly. These ideas contributed to the development of decision support systems in the 1960s and 1970s, which complemented rather than replaced human decision-makers in business management and military command scenarios [
5].
The shift from decision support paradigms to interactive machine learning can be seen as a major shift in the way the role of humans is conceptualized in intelligent systems. Instead of seeing humans as mere receivers of system-suggested decisions, interactive approaches emphasize the role of humans in the process of machine learning itself. This shift is also partly driven by the understanding that many problems in reality involve tacit knowledge, contextual information, and value judgments that are difficult to formally specify [
10]. The emergence of active learning algorithms in the 1990s formalized methods for systems to query human experts strategically, optimizing the use of scarce human attention and expertise.
Recent advances in deep learning have paradoxically reinforced the importance of human involvement despite dramatic improvements in automated performance. The opacity of neural network models, their susceptibility to distributional shift, and their potential for encoding harmful biases have made human oversight essential for responsible deployment [
11]. Present-day HITL research focuses both on using human knowledge to improve model performance and on providing effective human oversight and correction for AI systems that are integrated into complex environments. This reflects a maturing understanding: the goal is not to minimize human involvement but to make the best use of it alongside machine performance, based on the capabilities of each.
2.3. Philosophical and Cognitive Perspectives
The human-in-the-loop (HITL) design also gives rise to some fundamental issues related to the very nature of intelligence and human judgment. The proponents of human-centric views argue that some aspects of decision-making processes, such as morality and accountability, should remain exclusive to human capabilities and should not be entrusted to machines regardless of their capabilities [
12]. This view draws support from phenomenological traditions that highlight the embodied and situated nature of human understanding, which differs qualitatively from computational information processing.
The bounded rationality concept, which was first proposed by Herbert A. Simon, provides a different point of view that has major implications for the design of human-in-the-loop systems. According to Simon’s theory, decision-makers are faced with information constraints, cognitive limitations, and time constraints that force them to satisfice rather than optimize. In this regard, human-in-the-loop systems have the ability to deal with vast amounts of information, identify relevant choices, and present information in a format that is cognitively friendly [
12]. At the same time, designers must recognize that AI systems introduce their own forms of bounded rationality, including training data limitations, objective function misspecification, and inability to reason about situations outside their training distribution.
The question of cognitive load management is central to effective HITL design. Humans interacting with AI systems must process system outputs, maintain situational awareness, make decisions, and provide feedback, all while managing competing demands on attention and working memory [
13]. Research in human factors has established that poorly designed automation can actually degrade human performance by inducing complacency, reducing skill maintenance, or overwhelming operators with alerts and information. Effective HITL architectures must therefore balance the benefits of automation against these cognitive costs, designing interactions that keep humans appropriately engaged without exceeding their information processing capacity [
14].
2.4. Hybrid and Centaur Intelligence
The metaphor of the centaur, a mythological creature combining human and equine elements, has gained currency as a way of conceptualizing human–AI collaboration. Tang proposed the Chiron Imperative, a framework identifying six models for creating human–AI centaurs that combine the wisdom and ethical judgment of humans with the computational power of AI systems [
15]. This framework emphasizes that effective collaboration requires more than simply dividing tasks between humans and machines. Instead, it calls for designing systems where human and artificial intelligence amplify each other’s capabilities in ways that neither could achieve alone.
The idea of the centaur is rooted in competitive chess, where human–computer teams have demonstrated performance beyond that of either humans or computers alone. In freestyle chess tournaments, the winning teams were not those with the strongest human players or the strongest computers, but those that developed the best methods for interacting with one another. This suggests that the interface between humans and artificial intelligence systems can be as important as the capabilities of either party [
16]. Translating this insight to other domains requires understanding the specific forms of complementarity that exist between human and machine capabilities in each application context.
The hybrid intelligence concept extends beyond task allocation to consider how human and machine learning can co-evolve over time. In this view, HITL systems are not static configurations but dynamic partnerships where both participants adapt based on their interactions [
5]. Humans build mental models of the capabilities and limitations of artificial intelligence systems, learning when to trust the advice provided by the system and when to rely on their own judgment. At the same time, artificial intelligence systems can be designed to model human preferences, levels of expertise, and decision-making patterns in order to improve the effectiveness of collaboration [
8].
2.5. Terminology and Loop Configurations
The proliferation of terminology for different human–AI relations reflects both the diversity of methodological approaches and the lack of standardized terminology within the field. The most common distinction is between human-in-the-loop and human-on-the-loop relations. In the former, human participation is necessary for the system to function, often through active participation in decisions or authorization of system actions. In the latter, the system operates without requiring human approval for each action, while a human monitors its behavior and retains the ability to intervene [
17].
Singh and Szajnfarber proposed a more nuanced taxonomy that distinguishes Human-in-the-Loop, Human-on-the-Loop, Human-over-the-Loop, Human-under-the-Loop, and Human-along-the-Loop configurations [
17]. Each configuration implies different relationships in terms of power, responsibility, and interaction rate between humans and AI systems. Human-over-the-Loop suggests a configuration where humans are in a position of power with respect to system goals and constraints. Human-under-the-Loop describes a configuration where AI systems are used for controlling or influencing human behavior. Human-along-the-Loop suggests a configuration where humans and AI systems perform related tasks in parallel with lateral interaction. To clarify the operational roles of humans and AI in HITL settings,
Table 3 contrasts common loop configurations and their typical deployment contexts.
The choice among these configurations depends on multiple factors, including the stakes involved in decisions, the reliability of AI components, regulatory requirements, and the availability of qualified human operators [
6]. High-stakes applications with high harm potential usually require a Human-in-the-Loop or a Human-over-the-Loop configuration to ensure significant control. Conversely, low-stakes applications may be satisfied with a Human-on-the-Loop monitoring configuration. Understanding the different configuration options and their implications is important to the architects of human–AI systems to achieve a balance between performance, safety, and resource efficiency [
18].
Figure 3 illustrates these Human–AI loop configurations, highlighting how different placements of human involvement correspond to varying levels of oversight and autonomy.
2.6. Conceptual Scope and Term Boundaries
Because these terms are often used interchangeably in the literature, this review draws clear scope distinctions. HITL AI is the overarching design paradigm in which human input has operational impact on model development, deployment, supervision, or governance. Human–AI collaboration is a more general socio-technical construct that includes HITL AI but also spans configurations where humans and AI collaborate without a discernible loop-based control structure. Interactive machine learning (IML) is a methodological subset of HITL AI that prioritizes the iterative update of models based on ongoing human interaction. Reinforcement learning from human feedback (RLHF) is a particular HITL-based training paradigm that converts human preferences or critiques into reward functions for policy optimization. Explainable AI (XAI) is a supporting layer rather than a type of loop: it provides increased interpretability, trust calibration, or auditability without necessarily providing a basis for meaningful human control.
Consequently, the definitional hierarchy that is used in this survey is: (i) HITL AI as the overarching framework, (ii) method families such as IML and RLHF as specific instantiations of that framework, and (iii) cross-cutting enablers such as XAI and trust calibration that enable effective oversight of multiple methods and domains. This hierarchy is used uniformly in the following sections of this survey to identify: (i) the loop structure, (ii) the technical integration of human feedback, and (iii) the enablers that make oversight effective in practice.
2.7. Interaction Granularity and Temporal Characteristics
While loop placement describes where human authority is positioned, two additional dimensions are needed to characterize how collaboration unfolds in practice: interaction granularity and temporal characteristics. These dimensions affect annotation cost, cognitive load, latency, and safety, and therefore influence whether a HITL design remains feasible at deployment scale [
13,
18].
Interaction granularity refers to the level of detail of the human input that the system expects. Coarse-grained interaction involves sparse, high-level input, such as approval or rejection of a model’s output, escalation decisions, and batch-level quality assessment. Medium-grained interaction includes selective corrections, ranking of alternatives, and labeling of uncertain data in active learning. Fine-grained interaction requires detailed input, such as token-level corrections, corrections to feature attributions, trajectory guidance in reinforcement learning, and step-wise guidance in interactive generation. The finer the granularity, the greater the potential for precise alignment, but also the greater the burden and fatigue imposed on the human [
19,
20].
For example, a coarse-grained interaction may be defined as a physician simply accepting or rejecting the model’s suggested level of urgency for a given case under AI-assisted emergency triage. On the other hand, a fine-grained interaction may require the physician to edit specific components of the model’s rationale, such as the weighting of symptoms or risk factors, providing a more detailed corrective signal at a higher temporal and cognitive cost.
The temporal properties concern the rate at which human input is integrated into the model’s behavior. Synchronous interaction requires immediate human involvement in the decision-making process (e.g., confirmation of a clinical decision prior to taking an action), while asynchronous interaction allows delayed input to be used to inform future model behavior (e.g., periodic relabeling or retrospective audit feedback). Another important distinction lies along the dimension of the rate of updates. Continuous feedback flows provide rapid adaptability to changing conditions but may cause unstable model behavior given noisy feedback, while episodic feedback occurs on a pre-scheduled review cycle to provide improved traceability of governance interventions at the expense of response speed [
5,
8].
These two dimensions interact strongly with loop placement. For example, human-on-the-loop supervision is often paired with coarse-grained, asynchronous interaction for moderate-risk operations, whereas human-in-the-loop control for high-risk operations often requires finer-grained and more synchronous interaction. Likewise, human-over-the-loop governance can follow episodic temporal patterns, even for highly automated operation. This underscores that loop placement is not the only relevant axis of the taxonomy.
From a design perspective, this three-dimensional taxonomy supports explicit trade-off analysis. Systems that prioritize throughput may select coarser interaction and episodic review, then add targeted synchronous checkpoints for edge cases. Systems that prioritize accountability and value alignment may adopt finer-grained interventions at selected stages while constraining interaction frequency to preserve human attention. Throughout this survey, technical methods (Section 3), application deployments (Section 5), and governance mechanisms (Section 7) are interpreted through these trade-offs to clarify why similar HITL labels can correspond to very different operational realities.
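The three dimensions can be made concrete as a small data structure. The following Python sketch is illustrative only: the class names, the enum members chosen, and the blocks_on_human rule are assumptions introduced here, not part of the taxonomy as defined in this survey.

```python
from dataclasses import dataclass
from enum import Enum


class LoopPlacement(Enum):
    IN_THE_LOOP = "human-in-the-loop"
    ON_THE_LOOP = "human-on-the-loop"
    OVER_THE_LOOP = "human-over-the-loop"


class Granularity(Enum):
    COARSE = "coarse"    # approve/reject, escalation, batch-level checks
    MEDIUM = "medium"    # selective corrections, ranking, uncertain-item labels
    FINE = "fine"        # token-level edits, step-wise guidance


class Timing(Enum):
    SYNCHRONOUS = "synchronous"      # human input before the action is taken
    ASYNCHRONOUS = "asynchronous"    # delayed input informs future behavior


class UpdateRate(Enum):
    CONTINUOUS = "continuous"
    EPISODIC = "episodic"


@dataclass(frozen=True)
class HITLProfile:
    placement: LoopPlacement
    granularity: Granularity
    timing: Timing
    update_rate: UpdateRate

    def blocks_on_human(self) -> bool:
        # Illustrative rule (an assumption): only synchronous in-the-loop
        # designs make the system wait for a person before acting.
        return (self.placement is LoopPlacement.IN_THE_LOOP
                and self.timing is Timing.SYNCHRONOUS)


# Two systems that would both be labeled "human-in-the-loop".
triage_gate = HITLProfile(LoopPlacement.IN_THE_LOOP, Granularity.COARSE,
                          Timing.SYNCHRONOUS, UpdateRate.EPISODIC)
rationale_editing = HITLProfile(LoopPlacement.IN_THE_LOOP, Granularity.FINE,
                                Timing.SYNCHRONOUS, UpdateRate.CONTINUOUS)
# A monitoring-style configuration for contrast.
retrospective_audit = HITLProfile(LoopPlacement.ON_THE_LOOP, Granularity.COARSE,
                                  Timing.ASYNCHRONOUS, UpdateRate.EPISODIC)
```

Here triage_gate and rationale_editing share a loop placement yet differ on the other two axes, which is the sense in which one HITL label can hide very different workloads.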
3. Technical Approaches
The technical underpinnings of Human-in-the-Loop AI include a variety of techniques that enable meaningful human involvement in machine learning processes. These range from well-established approaches such as active learning and human annotation to more recent developments in reinforcement learning from human feedback and generative AI systems. This section examines the technical concepts that are integral to successful human–AI interaction, with a focus on how these techniques address challenges of efficiency, alignment, and reliability.
Table 4 summarizes the main technical approaches used to incorporate human input in HITL AI systems, describing the underlying mechanism, the required type of human contribution, and representative studies.
3.1. Cross-Method Comparative Analysis and Applicability Boundaries
To enable a rich horizontal comparison between methodological classes, this paper considers the primary human-in-the-loop (HITL) techniques along six different dimensions: typical use cases, type of human involvement, interaction costs, scalability, real-time capabilities, and robustness against distribution shift/adversarial pressure. Active learning and human annotation tend to be advantageous in data-scarce domains where label quality is the primary bottleneck; these techniques offer high controllability but can become prohibitively expensive in large-scale annotation scenarios. Reinforcement learning from human feedback and preference optimization can be effective in supporting behavior alignment in generative models but suffer from increased interaction costs and require stronger countermeasures against reward hacking, preference drift, and evaluator inconsistency.
Interactive machine learning and human guidance are best suited to settings where iteration is possible during development and domain specialists are readily available; they allow rapid adaptation to local tasks but are prone to non-stationary feedback and operator variance. Post hoc validation and escalation approaches are best suited to settings that demand scalability and ease of deployment, particularly when uncertainty triage is effective, but they cannot correct structural problems in the underlying data or model. Prompt-based human-in-the-loop workflows are best suited to cost-effective, rapid adaptation for generative problems but are brittle under adversarial inputs and need verification protocols to maintain factual accuracy.
Across all these dimensions, the fundamental trade-off concerns not just the accuracy of the model itself but also the human attention available over time. Methods that demand a high level of human interaction improve alignment accuracy at the cost of scalability; methods that require little or no human interaction depend heavily on the quality of confidence calibration, escalation, and governance. This comparative analysis is used throughout this section to highlight the strengths of each method, the scope within which it is applicable, and the safeguards it requires.
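The dependence of low-interaction methods on confidence calibration and escalation can be illustrated with a simple threshold rule. This is a minimal sketch under stated assumptions: the function names and the 0.9 threshold are hypothetical, and the rule only behaves as intended if the model's probabilities are reasonably well calibrated.

```python
def route(probabilities, threshold=0.9):
    """Route one prediction: ("auto", label) or ("human", label).

    probabilities: list of class probabilities for a single case.
    threshold: hypothetical confidence cut-off below which a person reviews.
    """
    label = max(range(len(probabilities)), key=probabilities.__getitem__)
    confidence = probabilities[label]
    decision = "auto" if confidence >= threshold else "human"
    return decision, label


def escalation_rate(batch, threshold=0.9):
    """Fraction of a batch that would be sent to human review."""
    decisions = [route(p, threshold)[0] for p in batch]
    return decisions.count("human") / len(decisions)


batch = [[0.95, 0.05], [0.60, 0.40], [0.20, 0.80], [0.01, 0.99]]
print(route(batch[1]))         # low confidence, so a person reviews it
print(escalation_rate(batch))  # share of cases consuming human attention
```

Raising the threshold sends more cases to people and lowering it sends fewer, so the threshold is effectively a budget for human attention.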
3.2. Active Learning and Human Annotation
Active learning represents one of the most mature and widely deployed approaches to Human-in-the-Loop machine learning. The fundamental principle underlying active learning is that machine learning algorithms can achieve better performance with fewer training examples if they are allowed to select the data from which they learn [
19]. Rather than training on randomly sampled data, active learning systems identify instances where human annotation would be most informative for improving model performance. This selective approach to data labeling addresses a persistent challenge in machine learning: the high cost and limited availability of human-annotated training data.
The query strategies employed in active learning systems determine which instances are presented to human annotators. Uncertainty sampling, one of the most common strategies, selects instances for which the current model has the least confidence in its predictions [
19]. Query by committee methods utilize a committee of models and select data points based on the strongest disagreement among the committee. Expected model change methods select data points that will result in the largest change in the current model. Huang et al. proposed a fast active learning method that optimizes the selection process for active learning while minimizing computational cost. This demonstrates that a well-designed algorithm can greatly improve the usability of active learning in constrained environments [
21]. Each strategy embodies different assumptions about what makes an instance informative, and the choice among them depends on the specific characteristics of the learning task and the available computational resources.
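Two of the query strategies above can be sketched in a few lines of Python. The sketch assumes the model exposes class probabilities for each unlabeled item and that committee members cast hard label votes; the function names are illustrative, and entropy is only one of several common uncertainty measures.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def uncertainty_sampling(pool_probs, k):
    """Indices of the k pool items whose predictions are least certain."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:k]


def vote_entropy(committee_votes, n_classes):
    """Query-by-committee disagreement for one item.

    committee_votes: hard labels (0..n_classes-1), one per committee member.
    """
    n = len(committee_votes)
    fractions = [committee_votes.count(c) / n for c in range(n_classes)]
    return entropy(fractions)


pool = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
to_label = uncertainty_sampling(pool, k=2)   # items sent to the annotator
```

In this toy pool, the item with probabilities [0.5, 0.5] is queried first because the model cannot distinguish its two classes.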
The design of annotation interfaces and workflows significantly affects both the quality and efficiency of human labeling efforts. Effective annotation systems must balance the need for detailed, accurate labels against the cognitive demands placed on human annotators [
34]. Research has shown that annotation quality can degrade substantially when annotators experience fatigue or when task complexity exceeds their working memory capacity. Modern annotation platforms therefore incorporate features such as adaptive task difficulty, real-time feedback on annotation consistency, and mechanisms for identifying and resolving disagreements among multiple annotators. Alla proposed an intelligent automation framework that integrates active learning with AI-driven feedback loops, enabling systems to adapt their query strategies based on annotator performance patterns [
22].
The emergence of crowdsourcing platforms has expanded the scale at which human annotation can be performed while introducing new challenges related to annotator expertise and quality control. Crowdsourced annotation enables rapid collection of large labeled datasets but requires careful attention to annotator selection, training, and quality assurance [
24]. Techniques such as the use of gold standard questions, inter-annotator agreement measures, and weighted aggregation of multiple annotations are helpful in maintaining the quality of labels. Wiethof et al. studied the gamification approach to boost the motivation of the annotators. Gamification can increase the quality of each annotation as well as the overall quality of all annotations by reducing the monotony of the tasks [
25]. The trade-offs between expert annotation and crowdsourced labeling depend on task complexity, the availability of domain expertise, and the acceptable level of label noise for the downstream application.
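As an illustration of an inter-annotator agreement measure, the following minimal sketch computes Cohen's kappa for two annotators. Kappa is assumed here as a representative measure; the text does not specify which measure the cited studies use, and the example labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both annotators labeled at random with their own label rates.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Degenerate case (both used one identical label); treated as perfect
        # agreement here by assumption, since kappa is formally undefined.
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Hypothetical labels from two annotators on four items.
annotator_a = ["defect", "defect", "ok", "ok"]
annotator_b = ["defect", "ok", "ok", "ok"]
kappa = cohens_kappa(annotator_a, annotator_b)
```

A low kappa on a batch can serve as a trigger for retraining annotators or revising the labeling guidelines before more data is collected.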
A more challenging case arises when disagreement stems from underlying ambiguity, poorly specified task definitions, or genuine differences in expert judgment rather than from random error. In these situations, majority voting can suppress relevant minority interpretations. Human-in-the-loop (HITL) systems can instead model the annotators explicitly, for instance through probabilistic voting, which estimates annotator reliability and bias, or through confusion-matrix-based methods, which distinguish systematic from random error. Uncertainty can then be propagated downstream as soft labels or label distributions rather than as single, discrete class targets. Operationally, the system can flag high-disagreement instances as candidates for additional processing, such as expert review, while low-disagreement, stable instances remain in the high-throughput annotation pipeline. Treating disagreement as a signal rather than as noise improves both the calibration of the model and the governance of the system, because it makes visible where human agreement is strong and where human judgment is contestable [
24, 26, 35].
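The soft-label and routing ideas in the preceding paragraph can be sketched as follows. This is a simplified illustration rather than a full probabilistic annotator model: the reliability weights are assumed to be given, whereas methods of the kind mentioned in the text would estimate them from data. The threshold, queue names, and annotator identifiers are hypothetical.

```python
import math
from collections import defaultdict

def soft_label(annotations, reliability):
    """Turn (annotator, label) pairs into a reliability-weighted label distribution."""
    weights = defaultdict(float)
    for annotator, label in annotations:
        # Unknown annotators get a neutral weight of 0.5 (an assumption).
        weights[label] += reliability.get(annotator, 0.5)
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

def normalized_entropy(dist, n_classes):
    """Entropy of the distribution, scaled to [0, 1] by its maximum log(n_classes)."""
    h = -sum(p * math.log(p) for p in dist.values() if p > 0)
    return h / math.log(n_classes)

def route(dist, n_classes, threshold=0.6):
    """Send contested instances to expert review; keep stable ones in the fast queue."""
    if normalized_entropy(dist, n_classes) >= threshold:
        return "expert_review"
    return "standard_queue"

# Hypothetical reliabilities and annotations for a binary task.
reliability = {"a1": 0.9, "a2": 0.6, "a3": 0.6}
stable = soft_label([("a1", "positive"), ("a2", "positive"), ("a3", "positive")], reliability)
contested = soft_label([("a1", "positive"), ("a2", "negative"), ("a3", "negative")], reliability)
stable_route = route(stable, n_classes=2)
contested_route = route(contested, n_classes=2)
```

In the contested example a majority vote would emit the hard label "negative", whereas the soft label keeps substantial probability mass on the minority reading and the instance is escalated.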
Domain-expert annotation presents distinct challenges and opportunities compared to crowdsourced approaches. In fields such as medical imaging, legal document analysis, and scientific research, annotations require specialized knowledge that cannot be readily obtained from general crowdsourcing platforms [
26]. Expert annotators can provide richer, more nuanced labels but are scarce and expensive resources. Chandler et al. examined human-in-the-loop methodologies for psychiatric applications, demonstrating how expert clinicians can be effectively integrated into machine learning workflows while respecting the constraints on their time and cognitive resources [
27]. Active learning becomes particularly valuable in expert annotation contexts because it maximizes the information gained from each expert interaction. Hybrid approaches that combine expert annotation for difficult cases with crowdsourced annotation for straightforward instances can achieve favorable trade-offs between cost and quality.
Specialized annotation tasks often require custom interfaces and protocols tailored to the specific characteristics of the data and the expertise of annotators. Butler et al. developed a human-in-the-loop system for analyzing facial expression labels, addressing the particular challenges of annotating affective data where ground truth is inherently subjective and context-dependent [
36]. Their work illustrates how annotation systems must be designed with careful attention to the nature of the labeling task and the cognitive processes involved in human judgment. Similarly, applications in industrial quality inspection have required annotation interfaces that present visual information in ways that support rapid and accurate defect identification by trained inspectors [
23].
Recent advances in generative AI have created new possibilities for human-in-the-loop annotation workflows. Large language models can generate candidate annotations or explanations that human annotators then verify, correct, or refine [
34]. This approach can substantially accelerate annotation throughput while maintaining human oversight of the final labels. Chen et al. demonstrated this framework in an educational context, developing a generative AI-based system for creating teaching materials where human educators review and refine AI-generated content [
37]. The human function changes from producing annotations de novo to evaluating and editing the proposals that machines make. The empirical results show that a verification-based process can reduce the time spent in annotation while preserving or improving the quality of the annotations, provided that annotators are aware of the potential for automation bias and do not over-rely on machine recommendations.
The integration of active learning with explainable AI techniques offers promising directions for improving annotation efficiency and quality. When active learning systems can explain why a particular instance was selected for annotation, human annotators gain insight into the model’s current limitations and can provide more targeted feedback [
23]. Explanations can also help annotators understand edge cases and ambiguous instances, leading to more consistent labeling decisions. Harris demonstrated how combining human-in-the-loop systems with AI fairness toolkits can help identify and mitigate biases in training data, particularly in sensitive applications such as job hiring algorithms where annotation decisions can have significant social consequences [
38]. This combination of active selection, explanatory context, and fairness awareness represents a more sophisticated form of human–AI collaboration than traditional active learning approaches that treat annotation as a simple labeling task.
3.3. Human-in-the-Loop Reinforcement Learning
Reinforcement learning from human feedback has emerged as a powerful approach for training AI systems that align with human preferences and values. Traditional reinforcement learning relies on reward functions that specify desired behavior through numerical signals, but designing appropriate reward functions for complex tasks proves extremely difficult in practice [
29]. Human-in-the-loop reinforcement learning addresses this challenge by incorporating human judgment directly into the learning process, either through explicit reward signals, demonstrations of desired behavior, or comparative preferences between alternative actions.
The most direct form of human involvement in reinforcement learning is human reward shaping, where human observers provide reward signals based on their evaluation of agent behavior. This approach has proven effective in domains where the objectives are clear to human observers but difficult to formalize mathematically [
28]. In autonomous driving scenarios, for example, humans can readily judge whether a driving action is safe and comfortable, even though it would be extremely challenging to define the safety and comfort criteria precisely within a reward function. Human-provided rewards enable agents to learn behaviors that are aligned with such implicit expectations.
Demonstration-based learning, also known as learning from demonstration or imitation learning, leverages human expertise by training agents to replicate observed human behavior. In this context, human experts perform tasks while the system records their actions, and the agent learns a policy that reproduces these demonstrated behaviors [
29]. The method proves particularly beneficial when the behavior is difficult to articulate but easy to demonstrate, a case that often arises in physical manipulation tasks, embodied skills such as artistic technique, and complex decision-making in dynamic environments. The quality of the learned behavior is heavily dependent on the proficiency of the human demonstrators and the number of demonstrations provided.
The idea of humans as mentors for artificial intelligence can be seen as an extension of demonstration-based learning since it enables continuous learning with human mentoring as opposed to only initial mentoring. Huang et al. introduced a framework that enables mentors to correct the behavior of an agent in real time, provide additional demonstrations for complex scenarios, and adjust learning based on performance [
28]. This model of mentorship recognizes that effective learning often depends on adaptive guidance that responds to the learner’s current capabilities and to the specific issues that emerge during training. A mentor exerts less direct control than a teleoperator but provides more feedback than demonstration collection alone.
Preference-based reinforcement learning represents a particularly influential approach that has enabled significant advances in language model alignment. Rather than providing explicit rewards or demonstrations, humans express preferences between pairs of agent behaviors, indicating which outcome they prefer [
30]. These preference comparisons are then used to train a reward model that represents human values, and the learned reward model guides the agent’s learning. The preference-based approach reduces the cognitive burden on human evaluators by replacing absolute judgments with relative comparisons, which humans tend to make more consistently.
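A minimal sketch of learning a reward model from pairwise preferences is given below. It assumes the Bradley-Terry formulation, in which the probability that behavior i is preferred to behavior j is sigmoid(r_i - r_j); this is a common choice and is an assumption here, not a description of the cited work. For simplicity the reward model is a table with one scalar per behavior, and the comparison data are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_rewards(n_items, preferences, lr=0.5, epochs=200):
    """Fit one scalar reward per behavior from (winner, loser) index pairs.

    Maximizes the Bradley-Terry log-likelihood sum(log sigmoid(r_w - r_l))
    with plain gradient ascent.
    """
    rewards = [0.0] * n_items
    for _ in range(epochs):
        grads = [0.0] * n_items
        for winner, loser in preferences:
            # d/dr_w log sigmoid(r_w - r_l) = 1 - sigmoid(r_w - r_l)
            g = 1.0 - sigmoid(rewards[winner] - rewards[loser])
            grads[winner] += g
            grads[loser] -= g
        rewards = [r + lr * g / len(preferences) for r, g in zip(rewards, grads)]
    return rewards

# Hypothetical comparisons: behavior 0 is usually preferred to 1, and 1 to 2,
# with some inconsistent judgments mixed in.
preferences = [(0, 1), (0, 1), (1, 0), (1, 2), (1, 2), (2, 1), (0, 2), (0, 2)]
rewards = fit_rewards(3, preferences)
```

The fitted rewards recover the ordering implied by the comparisons even though individual judgments conflict. In practice the table would be replaced by a parametric model over behavior features, and the learned reward would then drive policy optimization.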
The operational risks involved in reinforcement learning from human feedback are significant and must be addressed as first-class design considerations rather than auxiliary caveats. Reward models carry the risk of encoding evaluator bias, discounting minority opinions, and being vulnerable to reward hacking or specification gaming if policies over-optimize proxy reward signals. There are also risks of preference drift over time, brittleness in the face of distribution shift, and safety regressions that are only discovered post-deployment via interactions. These are the reasons for the importance of ongoing auditing, red teaming, and rollback planning in RLHF pipelines, in addition to optimization (see
Table 1).
Another important technical distinction is whether reinforcement learning from human feedback (RLHF) is conducted online or offline. Offline RLHF relies on a fixed dataset, which can improve reproducibility and pre-deployment governance, although it may capture rare threats less well than online RLHF. Online RLHF, by contrast, learns from live interaction with users, which can improve the system’s ability to correct its own behavior but makes it more susceptible to adversarial attacks, feedback loops, and rapid policy change without human intervention. Online RLHF is therefore more difficult to integrate with a high-assurance validation approach than offline RLHF [
29, 30].
Safety considerations are paramount in human-in-the-loop reinforcement learning, particularly for applications in autonomous systems and robotics. Learning agents may explore dangerous actions during training, and the consequences of unsafe behavior can be severe in physical environments [
28]. Human involvement in these environments has several safety-related functions: detection and prevention of potential hazardous actions before execution, provision of corrective feedback upon the occurrence of hazardous actions, and specification of safety constraints that restrict the action set for the agent. The design of human–AI interfaces for safety-critical reinforcement learning agents should allow for prompt human involvement while disturbing the learning process as little as possible.
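One of the safety functions listed above, restricting the action set through human-specified constraints, can be sketched as a simple action filter. The constraint, the state fields, the action names, and the fallback action are all hypothetical; the sketch shows only the filtering step, not a complete learning loop.

```python
def safe_actions(actions, state, is_unsafe):
    """Keep only the actions that the human-specified constraint permits."""
    return [a for a in actions if not is_unsafe(state, a)]

def choose_action(policy_scores, state, is_unsafe, fallback):
    """Pick the highest-scoring permitted action, or a fallback if none remains."""
    allowed = safe_actions(list(policy_scores), state, is_unsafe)
    if not allowed:
        return fallback
    return max(allowed, key=lambda a: policy_scores[a])

# Hypothetical constraint written by a human supervisor:
# never accelerate when the gap to the vehicle ahead is under 10 m.
def too_close_to_accelerate(state, action):
    return state["gap_m"] < 10 and action == "accelerate"

policy_scores = {"accelerate": 0.9, "hold": 0.5, "brake": 0.2}
near = choose_action(policy_scores, {"gap_m": 5}, too_close_to_accelerate, fallback="brake")
far = choose_action(policy_scores, {"gap_m": 50}, too_close_to_accelerate, fallback="brake")
```

Because the filter is applied before execution, the agent can still explore among permitted actions, which limits the disturbance to learning while excluding the hazardous choice.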
The application of human-in-the-loop reinforcement learning to autonomous driving has produced substantial research contributions and practical systems. Autonomous vehicles must navigate complex traffic environments while satisfying multiple objectives including safety, efficiency, passenger comfort, and compliance with traffic rules [
29]. Human-in-the-loop approaches enable these systems to learn driving behaviors that satisfy human expectations across these multiple dimensions. Real-time human guidance during training can help agents learn appropriate responses to rare but important situations that might be underrepresented in demonstration data or difficult to specify through reward engineering [
28]. Ahmad examined the broader question of how human-in-the-loop AI models can support trustworthy autonomous driving systems, emphasizing the importance of maintaining meaningful human oversight even as vehicle automation capabilities increase [
39].
Control room and industrial applications present distinctive requirements for human-in-the-loop reinforcement learning. Operators in process control environments must manage complex systems with multiple interacting variables, competing objectives, and significant consequences for errors [
40]. Reinforcement learning agents can assist operators by suggesting actions, predicting outcomes, or identifying anomalies, yet decision-making authority remains with the operators. Research in this area has investigated operator cognitive states, including fatigue, workload, and trust, that affect the effectiveness of human–AI collaboration. Emmanouilidis et al. studied the integration of human-in-the-loop AI systems into production environments and identified key factors that affect the effectiveness of that integration [
41].
Apart from industrial control systems, human-in-the-loop reinforcement learning has also found some applications in building management systems and environmental control systems. Liang et al. proposed a human-in-the-loop AI system for HVAC management that meets both efficiency and comfort requirements. This shows that reinforcement learning agents can be used to meet human requirements that differ from person to person [
42]. This application illustrates how human feedback can guide learning in domains where objectives are inherently subjective and where automated systems must adapt to diverse user preferences.
Adaptive learning systems in education represent another promising application domain for human-in-the-loop reinforcement learning. Tarun et al. explored how generative AI combined with human-in-the-loop feedback can create personalized learning experiences that adapt to individual student needs [
43]. In these systems, human educators provide feedback that guides the AI system. The reinforcement learning framework refines educational interventions based on both learning outcomes and the educators’ feedback.
Swarm intelligence approaches offer an alternative example for incorporating human input into collective AI systems. Rosenberg’s work on artificial swarm intelligence demonstrated that groups of humans connected through real-time feedback systems can function as unified intelligent systems that outperform both individual humans and traditional AI approaches on certain tasks [
44]. This approach turns the traditional human-in-the-loop concept on its head: rather than incorporating humans into artificial intelligence systems, it incorporates artificial intelligence into collective human processes. The result is a hybrid swarm concept that combines human intuition and understanding with machine-based aggregation and coordination.
3.4. Generative AI with Human-in-the-Loop Feedback
The rapid development of generative AI systems, particularly large language models, has enabled a number of emerging paradigms for human-in-the-loop interaction that differ substantially from traditional machine learning approaches. These generative models can create text, code, images, and other media at a quality that approaches or rivals human levels, yet they require human intervention to ensure alignment with user intent, factual correctness, and ethical appropriateness [
31]. The human role in generative AI systems encompasses prompt design, output evaluation, iterative refinement, and ongoing monitoring of system behavior across diverse use cases.
Prompt engineering has emerged as a key competence for successful human–AI collaboration with large language models, because the quality and precision of prompts strongly influence the relevance, accuracy, and utility of model outputs. Ranade et al. showed that rhetorical strategies can be applied systematically to prompt engineering, conceptualizing the interaction between humans and AI as a communicative process for which principles of effective discourse are well established [
31]. This perspective reframes prompt engineering from ad hoc experimentation to a principled practice grounded in communication theory. Effective prompts must convey not only the desired task but also relevant context, constraints, output format preferences, and quality criteria.
However, prompt quality alone does not address the underlying structural failure modes of generative models. Hallucination, factual inconsistency across generated responses, and instability under minor changes to the prompt remain core technical risks for HITL. These risks are serious because generated responses are fluent and plausible even when they are incorrect, which makes human over-trust a significant risk in high-throughput scenarios. HITL-style governance must therefore treat generated responses as claims to be verified rather than as text to be rewritten [
32, 45].
In a technically sound human-in-the-loop (HITL) pipeline, generation and verification are kept decoupled as a matter of course. Standard safeguards typically include retrieval-grounded generation, citation or evidence fields, and consistency checks across model versions. At scale, organizations may also employ triage, in which generated content is categorized into risk levels: low-risk content may be subject only to spot checks, whereas high-risk content, such as medical, legal, or financial material, may require structured review and sign-off accountability. Where verification is not layered in this manner, it can become a bottleneck, with human verifiers reverting to superficial approval behaviors that are inadequate for catching infrequent yet significant errors [
46, 47, 48].
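The triage logic described above can be sketched as a small routing function. The tier names and the rule that ungrounded low-risk content receives a full review are assumptions made for illustration; only the high-risk domains (medical, legal, financial) and the spot-check tier are taken from the text.

```python
from dataclasses import dataclass

# Domains the text names as requiring structured review and sign-off.
HIGH_RISK_DOMAINS = {"medical", "legal", "financial"}

@dataclass
class GeneratedItem:
    text: str
    domain: str
    has_evidence: bool  # whether the output carries citation or evidence fields

def review_tier(item):
    """Assign a review tier to one generated item."""
    if item.domain in HIGH_RISK_DOMAINS:
        return "structured_review_with_signoff"
    if not item.has_evidence:
        # Assumption: low-risk content without supporting evidence is still fully reviewed.
        return "full_review"
    return "spot_check"

# Hypothetical items flowing through the pipeline.
items = [
    GeneratedItem("Dosage summary for a clinician", "medical", True),
    GeneratedItem("Product FAQ answer with source links", "retail", True),
    GeneratedItem("Product FAQ answer without sources", "retail", False),
]
tiers = [review_tier(item) for item in items]
```

Keeping the routing rule separate from the generator makes the verification layer auditable on its own and lets reviewer capacity be concentrated on the highest tier.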
The iterative refinement of generative AI outputs represents a distinctive form of human-in-the-loop interaction. Unlike traditional machine learning where human input primarily occurs during training, generative AI systems enable continuous human feedback during inference [
32]. Users can judge the generated content, identify areas for improvement, and offer corrective guidance that steers subsequent outputs. This form of dialogue allows human users to guide the content toward the desired form without specifying all requirements a priori. The interaction resembles co-editing more than supervision, as the human user and the AI system create content together through a series of iterations.
Chain-of-thought prompting and related techniques have demonstrated that encouraging language models to articulate intermediate reasoning steps can substantially improve performance on complex tasks. Atkinson extended this approach through chain-of-code prompting, which integrates human validation at key points in multi-step reasoning processes [
33]. Human evaluators can validate intermediate conclusions, correct inaccuracies in the reasoning process, and provide guidance when the model is uncertain. This nested human-in-the-loop model improves reliability on complex tasks by combining the generative capabilities of language models with human judgment at the point of decision-making. Fu et al. extended the concept of combining language models with human judgment by incorporating non-monotonic logical reasoning, thereby creating assistive AI agents with more robust reasoning capabilities under uncertainty [
49].
The application of human-in-the-loop generative AI to professional domains has produced systems that augment expert capabilities while maintaining appropriate oversight. Bui examined the use of generative AI with human oversight for patent law applications, including AI-assisted drafting, prior art search, and multimodal intellectual property protection [
50]. These applications require high accuracy and must satisfy strict professional standards, making human validation essential despite the capabilities of underlying AI systems. Yuan et al. developed Alpha-GPT 2.0, a human-in-the-loop system for quantitative investment that combines language model capabilities with human trader expertise to generate and refine investment strategies [
51]. In both cases, the human role extends beyond simple approval to include substantive evaluation of AI-generated content against domain-specific criteria.
The healthcare applications of generative AI pose unique challenges in human-in-the-loop design due to the potential impact of erroneous outputs and the need for accountability. Fahad and Huang proposed a framework for continuous validation of generative AI outputs in healthcare applications, emphasizing that human involvement should be an integral part of the workflow rather than confined to its final stages [
32]. Their framework addresses the difficulty of sustaining diligent human review when AI output is usually of high quality, while acknowledging that occasional errors can have significant consequences in clinical settings. Robust human review for medical generative AI must take into account the cognitive burden on clinicians, time constraints, and the need to prevent the degradation of clinical skills that can result from over-reliance on AI.
Financial services represent another domain where generative AI is being deployed with human-in-the-loop safeguards. Singh proposed a five-step governance framework for generative AI in banking that operationalizes trust through structured human oversight at multiple stages [
48]. The model recognizes that regulatory demands, reputation, and fiduciary duty require a high level of human oversight for AI outputs in a financial setting. Anniciello et al. studied human-in-the-loop generative AI for insurance decision support. The authors created an explainable system that provides justifications for AI recommendations [
45].
The balance between the efficiency of automated systems and the effectiveness of human oversight is a key challenge in deploying generative AI systems. Verma examined whether generative AI could substitute for human-in-the-loop methods in urban design research and found that, although generative AI could speed up some tasks, human judgment remained necessary for evaluating the quality and contextual appropriateness of designs [
52]. This finding echoes broader concerns about maintaining meaningful human engagement as AI capabilities improve. Effective human-in-the-loop generative AI systems must be designed to keep humans cognitively engaged and capable of identifying AI errors, rather than reducing humans to passive approvers of AI outputs.
Content generation at scale introduces additional considerations for human-in-the-loop workflows. Nuotio investigated the impact of generative AI on journalistic processes, examining how human-in-the-loop approaches can maintain editorial standards while leveraging AI capabilities for content production [
46]. The study identified organizational factors relevant to successful integration, such as clear job definitions, training for human reviewers, and quality assurance approaches adapted to AI-based workflows. Kolagani and Vuppala examined related aspects in the context of enterprise customer service, proposing a hybrid approach that balances efficiency with human oversight to maintain service quality [
47].
3.5. Explainability, Interpretability, and Trust
The ability of humans to comprehend, evaluate, and appropriately rely on AI systems depends significantly on the explainability of those systems. Explainable AI comprises various methods that make AI model behavior comprehensible to humans, helping them decide when to trust AI suggestions and when not to [
4]. Without adequate explainability, human-in-the-loop oversight becomes superficial, as humans cannot meaningfully evaluate outputs they do not understand. The development of explainable AI methods is therefore not merely a technical convenience but a prerequisite for effective human–AI collaboration.
The distinction between interpretability and explainability, although sometimes blurred in practice, reflects two different ways of making an AI system understandable. Interpretability derives from the intrinsic understandability of a system’s structure, as in decision trees, rule-based systems, or linear models with a small number of features. Explainability, by contrast, refers to methods that provide explanations for models that are not intrinsically interpretable, such as deep neural networks [
53]. Assadi & Safaei discuss interpretable artificial intelligence in the context of product recommendation systems, showing that incorporating human feedback into the loop is more effective when users can grasp the rationale behind the system’s decisions. Both approaches aim to improve human understanding, but interpretability is built into the model itself, whereas explanations are produced for a model that remains opaque.
Factual explanations describe the features or patterns that led to a particular AI output, while counterfactual explanations describe what would need to change for the output to be different. Ibrahim et al. conducted an algorithm-in-the-loop analysis comparing these explanation types, finding that their effectiveness depends on the decision context and the expertise of human users [
54]. Counterfactual explanations proved particularly valuable for helping users understand decision boundaries and identify actionable changes. The choice between explanation types should be guided by the specific needs of human decision-makers and the characteristics of the decisions they face.
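The difference between the two explanation types can be illustrated on a toy linear scoring model, where a positive score means approval. The model, its weights, and the feature names are hypothetical, and the counterfactual search is restricted to changing a single feature; this is a deliberate simplification and not the procedure used in the cited analysis.

```python
def score(weights, bias, x):
    """Linear decision score; the decision is positive when the score exceeds 0."""
    return sum(w * v for w, v in zip(weights, x)) + bias

def factual_explanation(weights, x, names):
    """Per-feature contributions to the score for this particular input."""
    return {name: w * v for name, w, v in zip(names, weights, x)}

def single_feature_counterfactual(weights, bias, x, feature, margin=1e-6):
    """Smallest change to one feature that moves the score just across the boundary."""
    w = weights[feature]
    if w == 0:
        return None  # this feature cannot change the decision
    s = score(weights, bias, x)
    target = margin if s < 0 else -margin
    counterfactual = list(x)
    counterfactual[feature] += (target - s) / w
    return counterfactual

# Hypothetical two-feature model and applicant.
names = ["income", "debt"]
weights, bias = [0.5, -1.0], -1.0
x = [1.0, 0.5]
contributions = factual_explanation(weights, x, names)
cf = single_feature_counterfactual(weights, bias, x, feature=0)
```

The factual explanation reports how much each feature pushed the score up or down, while the counterfactual states how far income would have to rise, with debt held fixed, for the decision to flip.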
Trust calibration represents a critical challenge in human–AI systems where humans must learn to rely appropriately on AI capabilities. Both over-trust and under-trust can compromise system performance: over-trust leads humans to accept AI errors uncritically, while under-trust causes humans to reject valid AI recommendations [
20]. In practice, adoption is strongest when users experience consistently calibrated trust because they view the system as both useful and safe enough to incorporate into routine workflows. Tsiakas and Murray-Rust explored how explainable AI can help humans develop appropriate trust by providing insight into AI reasoning processes and limitations. Their work emphasizes that trust should not be unconditional but calibrated to the actual reliability of AI systems across different situations and task types.
The cognitive alignment between AI explanations and human mental models significantly affects whether explanations actually improve human decision-making. Explanations that are technically accurate but do not match how humans think about a problem may fail to improve understanding or may even introduce confusion [
13]. Kotsiopoulos et al. examined this issue in industrial defect recognition, developing explanations designed to align with the cognitive mechanisms that expert inspectors use when evaluating product quality. Their approach illustrates the importance of user-centered design in explainable AI, where explanation methods must be tailored to the knowledge and reasoning patterns of intended users.
The affective dimensions of human–AI interaction influence how explanations are received and whether they achieve their intended effects. Charoenrat developed an affective and explainable AI-driven model for adaptive learning that considers learner emotional states alongside cognitive factors [
14]. This work recognizes that human interactions with AI systems are not purely rational: people also respond emotionally to a system’s behavior, its explanations, and the interactive nature of the exchange. Explainable AI systems that consider affective factors may achieve more effective human–AI collaboration than systems designed solely around cognitive models of human users.
The practical implementation of explainable AI in human-in-the-loop systems requires careful attention to explanation timing, format, and level of detail. Explanations that interrupt workflow, require excessive cognitive effort to process, or provide irrelevant detail can reduce rather than enhance human performance [
4]. Effective explanation interfaces must balance completeness against usability, providing sufficient information for informed decisions without overwhelming users. Research on explanation design has identified principles such as progressive disclosure, where users can access additional detail on demand, and contrastive explanation, where systems highlight differences from typical cases rather than exhaustively describing all features.
Table 5 summarizes trust calibration states in human–AI interaction, outlining their defining characteristics, associated risks, and practical interventions for achieving appropriate reliance.
The relationship between explainability and human learning creates opportunities for AI systems that not only support individual decisions but also help humans develop expertise over time. When explanations reveal the patterns and relationships that underlie AI predictions, humans can internalize this knowledge and apply it in situations where AI assistance is unavailable [
20]. This educational role of explainable AI points to a design strategy that treats human learning as a priority alongside immediate decision support. Systems designed with this goal may prove more valuable in the long run because they strengthen human capabilities rather than deepen reliance on AI systems.
3.6. Trust Calibration and Human–AI Interaction Failures
The effectiveness of human-in-the-loop systems depends fundamentally on whether humans can develop and maintain appropriate levels of trust in AI components. Trust calibration refers to the alignment between a user’s confidence in an AI system and the system’s actual reliability [
11]. When trust is well-calibrated, people are generally able to trust AI recommendations in cases where the system performs well, and use their own judgment in cases where the system tends to perform poorly. The challenge in achieving well-calibrated trust lies in the need for people to build accurate mental models of how well AI systems perform in a wide range of cases.
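A simple way to make trust calibration measurable is to log, for each decision, whether the AI output was correct and whether the human accepted it. The sketch below derives an over-trust rate (erroneous outputs that were accepted) and an under-trust rate (correct outputs that were rejected) from such a log. The metric definitions and the log format are assumptions made for illustration and are not drawn from the cited studies.

```python
def reliance_summary(interactions):
    """Summarize reliance from a list of (ai_correct, user_accepted) boolean pairs."""
    wrong = [accepted for correct, accepted in interactions if not correct]
    right = [accepted for correct, accepted in interactions if correct]
    return {
        # Share of erroneous AI outputs that the user accepted anyway.
        "over_trust_rate": sum(wrong) / len(wrong) if wrong else 0.0,
        # Share of correct AI outputs that the user rejected.
        "under_trust_rate": sum(not a for a in right) / len(right) if right else 0.0,
    }

# Hypothetical interaction log: (ai_correct, user_accepted).
log = [
    (True, True),
    (True, True),
    (True, False),
    (False, True),
    (False, False),
]
summary = reliance_summary(log)
```

Well-calibrated trust corresponds to both rates being low. Tracking them separately by task type would show where a user's mental model of the system diverges from its actual reliability.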
Over-trust occurs when humans place excessive confidence in AI systems, leading them to accept erroneous outputs without adequate scrutiny. Agudo et al. conducted empirical studies examining how AI errors propagate through human-in-the-loop processes, finding that humans often fail to detect and correct AI mistakes even when they have the knowledge and ability to do so [
11]. This phenomenon, sometimes termed automation bias or automation complacency, is a major risk when AI is used in situations where errors could have serious consequences. The study also found that error detection rates decline as individuals become accustomed to high AI accuracy, which means that an AI system’s success can itself undermine the human oversight it still requires.
Under-trust presents the opposite problem, where humans discount valid AI recommendations due to skepticism, unfamiliarity, or negative prior experiences. Baroni et al. developed the AI-TAM model to investigate factors affecting user acceptance and collaborative intention in human-in-the-loop applications [
56]. Their study identified several determinants of trust, such as perceived usefulness, perceived ease of use, and social influence, indicating that trust development involves both a rational assessment of system capabilities and contextual factors. Under-trust may cause humans to turn away from AI assistance even in cases where AI systems perform significantly better than human judgment alone.
The dynamics of trust development over extended interaction periods introduce additional complexity. Lopes conducted studies on operator fatigue, trust, and workload demand in human-in-the-loop AI-enabled drone systems, revealing how trust evolves as operators gain experience and as their cognitive resources become depleted [
57]. The initial level of trust, whether high or low, tends to anchor later trust evaluations, so early interactions with AI have a strong impact. Fatigue was shown to affect trust calibration by reducing the cognitive resources available for monitoring and evaluation.
The influence of individual differences on trust calibration has received increasing research attention. Dores Cruz et al. demonstrated that political preferences can compromise human-in-the-loop oversight of AI, with individuals showing systematic biases in how they evaluate AI outputs depending on whether those outputs align with their prior beliefs [
58]. This result carries significant implications for applications in which AI systems are involved in politically or socially contested issues, suggesting that a variety of oversight bodies may be necessary to counteract individual biases. More generally, the finding underscores that trust in AI systems cannot be explained by system-related factors alone; it is also shaped by people's beliefs, values, and cognitive tendencies.
Transparency about the limits of artificial intelligence is one way of building calibrated trust, but the relationship between transparency and calibrated trust is not straightforward. Brooks argues that effective cooperation between humans and artificial intelligence depends on maintaining realistic expectations of the technology, neither overly optimistic nor dismissively skeptical [
55]. Communicating uncertainty and limitations can help humans develop accurate mental models; however, over-hedging can undermine confidence in assistance that is in fact valuable. Effective transparency therefore requires calibration in how limitations are communicated.
System failure and error recovery mechanisms can be identified as critical junctures for trust calibration. The actions taken by the AI system during such failure and its potential to help in the recovery from human errors can impact the overall level of trust. Alpay and Alpay examined the deficient human-in-the-loop oversight mechanisms in sophisticated AI systems and identified patterns that occur in such failure scenarios [
59]. Their results show that trust violations resulting from unforeseen system failures are difficult to mitigate, especially when humans are not provided with clear explanations for the reasons behind these unforeseen system failures. Designing for graceful degradation can aid in ensuring trust levels are maintained in spite of unforeseen system failures in AI systems.
The organizational context in which human–AI collaboration occurs shapes trust dynamics in ways that extend beyond individual user–system interactions. James examined human-in-the-loop architectures for trustworthy AI planning in mission-critical business intelligence systems, emphasizing how organizational structures, accountability mechanisms, and cultural factors influence whether humans exercise meaningful oversight [
60]. In an organizational setting, trust exists at various levels: individuals need to trust the AI system, individuals need to be trusted by the organization to provide adequate oversight, and the organization needs to trust that the overall human–AI system meets performance and safety requirements. Misalignment across these levels can create oversight problems even if each level appears to function correctly in isolation.
Responsibility attribution in human–AI systems creates complex dynamics that affect trust and oversight behavior. When errors occur in collaborative human–AI processes, questions arise about whether responsibility lies with the AI system, the human operator, the system designers, or the organization that deployed the system [
61]. Mellamphy discusses how different understandings of the relationship between humans and artificial intelligence, including humanistic and posthumanist understandings, imply different conceptions of responsibility. Unclear responsibility can lead to careless human intervention, when operators feel unaccountable for AI errors, or to obstructive behavior, when they feel blamed for AI errors they cannot control.
The scapegoat-in-the-loop concept captures situations where humans are nominally included in AI systems primarily to absorb responsibility rather than to provide meaningful oversight [
61]. In these configurations, human involvement may satisfy legal or regulatory requirements without actually improving system safety or performance. Ottun and Flores conducted a review of human oversight and human-in-the-loop approaches, identifying characteristics that distinguish meaningful oversight from superficial compliance [
2]. Meaningful oversight requires that humans have sufficient information, time, expertise, and authority to evaluate and override AI decisions, conditions that are not always met in practice despite nominal human-in-the-loop designs.
Adaptive methods for trust calibration attempt to control the behavior of a system by responding to patterns of trust from humans. In other words, rather than presenting the results of the AI systems in a uniform manner, it is possible for the presentation of the results or recommendations to be adjusted based on the trustworthiness of the results or the patterns of trust from individual users. This was demonstrated by Cho et al. in a wearable sensor for thermal comfort control [
62]. Such adaptive approaches can help correct both over-trust and under-trust by providing stronger endorsements when AI confidence is high and more hedged recommendations when uncertainty is elevated.
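The idea of adjusting presentation strength to AI confidence can be sketched as follows. This is an illustrative example only, not the approach used by Cho et al.; the function name, the thresholds `high` and `low`, and the message wording are assumptions, and the sketch presumes that the model's confidence score is itself reasonably calibrated.

```python
def present_recommendation(label: str, confidence: float,
                           high: float = 0.9, low: float = 0.6) -> str:
    """Phrase a recommendation more or less strongly depending on confidence.

    `high` and `low` are hypothetical thresholds that would need to be
    tuned per application; they are not taken from the literature.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    shown = f"model confidence {confidence:.0%}"
    if confidence >= high:
        # Strong endorsement when the model is confident.
        return f"Recommended: {label} ({shown})"
    if confidence >= low:
        # Hedged suggestion that explicitly asks for verification.
        return f"Suggested: {label}, please verify ({shown})"
    # Low confidence: flag the output as tentative.
    return f"Uncertain: {label} is tentative, independent review advised ({shown})"


message = present_recommendation("lower setpoint by 1 degree", 0.72)
```

A per-user variant could shift the thresholds based on observed reliance patterns, for example hedging more for users who rarely reject recommendations.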
The long-term sustainability of human-in-the-loop oversight requires attention to skill maintenance and engagement. When AI systems perform well consistently, human operators may experience skill decay in the tasks that AI has assumed, reducing their ability to detect errors or take over when AI systems fail [
57]. At the same time, fewer opportunities for substantive intervention could foster boredom, further impairing the quality of oversight. Designing human-in-the-loop systems for sustainable operation requires deliberately preserving human skills and engagement, for example through training exercises, task allocation, or system designs that sustain substantive human involvement even where the AI could perform the task autonomously.
5. Applications in High-Stakes Domains
The principles and methodologies of human-in-the-loop artificial intelligence are most relevant in areas that have significant implications for human well-being, safety, and human rights. Domains such as healthcare, autonomous systems, and cybersecurity provide the strongest motivation for human-in-the-loop artificial intelligence and, at the same time, the most challenging settings for implementing human–AI cooperation. This section discusses how human-in-the-loop artificial intelligence is adapted to the requirements of these domains.
Figure 4 provides an overview of these application domains, highlighting their associated risk levels, Human–AI loop configurations, and characteristic challenges.
The operational functions of human-in-the-loop feedback vary by domain: clinical validation and contextual judgment in healthcare; supervisory intervention and edge-case correction in autonomous systems; threat triage and decision-making under uncertainty in cybersecurity; compliance review and fairness and accountability assessment in finance; moderation of AI output against rubric-based pedagogical criteria in education; and defect adjudication and adaptation in manufacturing. This variation implies that the effectiveness of HITL processes may depend less on the presence of a human as such than on how well the feedback processes align with the primary risk profile of the domain.
5.1. Healthcare and Life Sciences
AI applications in healthcare call for a human-in-the-loop approach because clinical decisions directly affect patient health and because accountability requirements mandate that qualified individuals remain responsible for care. Bakken highlights the need for human involvement in health AI, arguing that the complexity of clinical reasoning and decision-making, the importance of the context in which patients receive care, and the ethical requirements of healthcare decision-making all require it [
78]. This perspective reflects broader consensus in medical informatics that AI should augment rather than replace clinical judgment, with systems designed to support rather than supplant the expertise of healthcare professionals.
Medical imaging represents one of the most active areas for HITL AI development in healthcare. Yu et al. developed PI-RADSAI, a human-in-the-loop model for prostate cancer diagnosis based on MRI that integrates radiologist expertise with machine learning capabilities [
26]. The system presents AI-generated assessments to radiologists who can confirm, modify, or reject the automated analysis based on their clinical judgment and additional patient information not available to the algorithm. Wu et al. demonstrated AI-accelerated structuring of radiology reports with human oversight, showing how AI can reduce documentation burden while maintaining the accuracy and completeness that clinical communication requires [
88]. These applications illustrate the pattern of AI handling routine processing while humans focus on interpretation, verification, and communication.
Neurological applications have demonstrated the potential for HITL systems to achieve performance that generalizes across diverse clinical settings. Yang et al. developed a human-in-the-loop AI system for clinical seizure recognition that achieved continental generalization, maintaining diagnostic accuracy across patient populations in different healthcare systems [
79]. The human-in-the-loop part was also essential in managing cases in which the automated detection was still in doubt. The study showed that human involvement can help address the generalization issues that automated detection systems face.
Clinical decision support systems represent another important application area where human-in-the-loop design principles inform system architecture. Steffny et al. developed a human-in-the-loop centered AI-based clinical decision support system for professional care planning, emphasizing the importance of designing AI assistance that aligns with clinical workflows and decision-making processes [
80]. Theilmann et al. examined success factors for AI in healthcare, identifying human-in-the-loop integration as a key determinant of whether AI systems achieve their intended benefits in clinical practice [
89]. These studies highlight that technical performance alone does not guarantee clinical value and that effective integration with human practitioners requires attention to workflow, interface design, and organizational factors.
Notably, the acceptance of human-in-the-loop (HITL) systems by medical professionals can differ substantially across medical domains and task types. This difference is often related more to workflow compatibility than to model accuracy. In many cases, deployment fails because AI results are presented at inappropriate points in the care pathway, because documentation burdens increase, or because clinicians face uncompensated verification demands under time pressure. Under these circumstances, clinicians may resort to 'rubber-stamping' AI outputs or bypass the system altogether, undermining both safety and acceptance. Conversely, acceptance can be facilitated by integration with existing decision points, explanations at a clinically relevant level of granularity, and clear escalation and accountability processes [
80,
89,
90]. This evidence suggests that workflow integration is not a secondary implementation detail but a primary determinant of whether healthcare HITL AI delivers real-world benefit.
Healthcare applications in resource-constrained settings present particular challenges and opportunities for HITL AI. Kabata and Thaldar examined human-in-the-loop requirements for AI healthcare applications in low-resource settings, where the scarcity of medical expertise makes AI assistance potentially more valuable but also raises concerns about appropriate oversight [
3]. This suggests that Human-in-the-Loop (HITL) design for low-resource contexts should take into consideration the lack of access to expert human evaluators and possibly involve alternative forms of oversight. Fahad and Huang suggested a framework for continuous validation of generative AI in healthcare. The need for this arose from the limitation of sustaining vigilant human oversight in the face of generally reliable AI outputs [
32].
The organizational dimensions of human-in-the-loop healthcare AI extend beyond individual clinical encounters to encompass institutional governance and quality assurance. Herrmann and Pfeiffer argued for keeping the organization in the loop as a general concept for human-centered AI, using medical imaging as an illustrative example [
90]. Their framework recognizes that effective human oversight requires not only capable individual practitioners but also organizational structures that support monitoring, feedback, and continuous improvement. Griffen and Owens proposed moving from traditional human-in-the-loop models to participatory systems of governance for AI in healthcare, envisioning patient and community involvement in shaping how AI systems are developed and deployed [
64].
Emerging applications in pathology and laboratory medicine demonstrate the expanding scope of HITL healthcare AI. Guo et al. evaluated cell AI foundation models in kidney pathology using human-in-the-loop enrichment, developing methods for pathologists to guide model improvement through targeted feedback on challenging cases [
91]. Lin et al. applied human-in-the-loop AI screening for hepatic porphyria diagnosis, demonstrating potential improvements over standard diagnostic approaches [
92]. Kandala et al. developed cross-lingual mental health ontologies for Indian languages using explainable AI and human-in-the-loop validation, addressing the challenge of extending AI capabilities to underserved linguistic communities [
93]. These applications illustrate continuing expansion of HITL approaches into new clinical domains.
5.2. Autonomous Systems and Robotics
Autonomous vehicles are a domain in which the balance between the capability of artificial intelligence and human oversight is a direct safety concern. Here, human-in-the-loop design should be considered along the spectrum of SAE levels of automation rather than as a binary concept. At lower levels of automation, humans typically remain continuously in the control loop with assistance, while at higher levels they take on a monitoring role with occasional fallback intervention. As levels of automation increase, the bottleneck in human performance shifts from vehicle control proficiency to attention, situational awareness, and preparedness for intervention under time pressure [
39].
A key conceptual distinction is between training-time and deployment-time HITL paradigms. Training-time approaches, such as human-guided reinforcement learning and mentor-style correction, use human input to shape policy learning before deployment and to reduce unsafe exploration during development [
28,
29]. Deployment-time paradigms, in contrast, concern supervisory control, takeover, and response in real-time operational traffic. The two paradigms are not interchangeable, since they differ in required cognitive resources, failure modes, and levels of regulatory interest. Treating them as equivalent obscures critical trade-offs in safety assurance.
Recent empirical work on occupant intervention behavior under extreme driving conditions further underscores this distinction. Xu et al. show that intervention decisions depend on perceived risk trajectory, cue timing, and human confidence in automation status, not only on objective hazard intensity [
94]. Related driver-in-the-loop evidence indicates that collaboration quality changes over time as users adapt, which affects both trust calibration and takeover performance [
95]. For HITL evaluation, this implies that autonomous-driving systems should be assessed with paradigm-specific metrics (e.g., intervention timing, missed versus unnecessary interventions, recovery quality after takeover) rather than only aggregate task success.
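The paradigm-specific metrics mentioned above can be illustrated with a small sketch over logged driving episodes. The `Episode` record, its field names, and the choice of mean takeover latency as the timing measure are assumptions made for illustration; they do not reproduce the evaluation protocol of the cited studies, and recovery quality after takeover is omitted because it requires domain-specific scoring.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Episode:
    hazard_time: Optional[float]    # seconds at which a real hazard began; None if none
    takeover_time: Optional[float]  # seconds at which the human took over; None if never


def intervention_metrics(episodes):
    """Count missed and unnecessary interventions and average takeover latency."""
    missed = 0         # hazard occurred, human never intervened
    unnecessary = 0    # human intervened although no hazard occurred
    latencies = []     # takeover time minus hazard onset, in seconds
    for e in episodes:
        if e.hazard_time is not None and e.takeover_time is None:
            missed += 1
        elif e.hazard_time is None and e.takeover_time is not None:
            unnecessary += 1
        elif e.hazard_time is not None and e.takeover_time is not None:
            # May be negative if the human anticipated the hazard.
            latencies.append(e.takeover_time - e.hazard_time)
    mean_latency = sum(latencies) / len(latencies) if latencies else None
    return {"missed": missed, "unnecessary": unnecessary,
            "mean_latency_s": mean_latency}


metrics = intervention_metrics([
    Episode(hazard_time=10.0, takeover_time=11.5),
    Episode(hazard_time=5.0, takeover_time=None),
    Episode(hazard_time=None, takeover_time=3.0),
    Episode(hazard_time=None, takeover_time=None),
    Episode(hazard_time=20.0, takeover_time=22.5),
])
```

Reporting missed and unnecessary interventions separately keeps the two error types visible, which an aggregate task-success score would hide.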
Self-driving laboratories represent an emerging application domain that combines autonomous experimentation with human scientific judgment. Hysmith et al. examined the future of self-driving laboratories, exploring the progression from human-in-the-loop interactive AI to gamification approaches that can engage broader communities in guiding automated scientific discovery [
96]. These systems largely automate the experimental procedures while still relying on human scientists to generate hypotheses, interpret results, and control the direction of the research. The human-in-the-loop component ensures that automated experiments align with scientific goals and that unexpected results are properly addressed.
Drone and unmanned aerial vehicle applications have motivated substantial HITL research due to their operational complexity and potential for harm. Lopes conducted studies on operator fatigue, trust, and workload demand in human-in-the-loop AI-enabled drone systems, revealing how extended operation affects human oversight quality [
57]. The study also shows that fatigue affects not only trust calibration but also performance, which is a clear indication of the need for effective workload management when it comes to sustaining effective human performance during operations. Inoguchi et al. proposed various workflows for roof damage detection using drones in a collaborative framework between humans and AI [
97].
Robotics applications in care and service contexts raise distinctive questions about the appropriate relationship between humans and AI systems. Liu examined human-in-the-loop ethical AI for care robots, drawing on Confucian virtue ethics to develop frameworks for robots that support human flourishing rather than merely performing assigned tasks [
75]. Ali explored how human-in-the-loop approaches can enhance safety and adaptability in interactive AI robotic systems, emphasizing the importance of mechanisms for humans to guide and correct robot behavior in dynamic environments [
98]. These applications require HITL designs that support nuanced human–robot interaction rather than simple supervisory oversight.
The transportation infrastructure applications of human-in-the-loop (HITL) autonomous systems expand upon the single-vehicle setting by considering broader traffic management objectives. Previati et al. created simulation frameworks for roundabout traffic scenarios that incorporate automated vehicles, artificial intelligence, edge computing, and human-in-the-loop components to study the integration of human oversight in complex traffic scenarios that involve multiple autonomous agents [
99]. Happer examined human-in-the-loop versus fully autonomous AI systems for crisis-driven defense electronics manufacturing, analyzing trade-offs between automation efficiency and human adaptability in high-pressure production environments [
100].
5.3. Cybersecurity and Critical Infrastructure
Cybersecurity is a domain wherein the adversarial nature of cyberthreats and the dynamic nature of cyberattacks present unique challenges for the functioning of AI systems without human intervention. Karunamurthy et al. examined human-in-the-loop intelligence for advancing AI-centric cybersecurity by arguing that cybersecurity requires the integration of AI-based pattern recognition with human expertise on attackers’ motivations and organizational context [
81]. Their analysis emphasizes that cybersecurity threats often involve social engineering and exploitation of human factors that AI systems struggle to model, making human judgment essential for comprehensive threat assessment.
The integration of human expertise into AI-driven cybersecurity operations requires careful attention to workflow design and decision support. Owen et al. developed approaches for proactive AI in cybersecurity with human-in-the-loop collaboration for intelligent threat detection and alerting [
18]. Their approach uses artificial intelligence to prioritize potential threats and provide relevant context to human analysts, while the human retains authority over response actions. Turner et al. studied human-in-the-loop decision-making for AI-based cyber defense, examining how security analysts interact with AI-based recommendations and which factors affect human judgment of those recommendations [
83].
Critical infrastructure protection presents high-stakes applications where the consequences of both successful attacks and false alarms can be severe. Campbell et al. developed human-in-the-loop adaptive AI cybersecurity frameworks for safety-critical infrastructure systems, addressing the challenge of maintaining security while avoiding disruptions caused by overly aggressive automated responses [
82]. da Silva examined AI-driven cybersecurity with a human-in-the-loop approach, proposing methods for integrating human expertise into automated security operations centers [
101]. These applications require HITL designs that support rapid human response while avoiding alert fatigue that could cause analysts to miss genuine threats.
Software development security has emerged as an application area where generative AI capabilities create both opportunities and risks. Sharma et al. developed cybersecurity-aware human-in-the-loop test orchestration for AI-powered DevSecOps, examining how human oversight can be integrated into automated development pipelines to catch security vulnerabilities before deployment [
102]. Konakanchi examined human-in-the-loop secure code synthesis, addressing the challenge of ensuring that AI-generated code does not introduce security vulnerabilities [
103]. These applications recognize that AI code generation capabilities must be paired with appropriate security review to avoid introducing new attack surfaces.
5.4. Finance, Education, and Industry
This subsection considers finance, education, and industry, which share the HITL label but differ in optimization objectives, error costs, and accountability structures. Finance focuses on legal defensibility and rights protection, education on pedagogical validity and equity, and industry on throughput, quality consistency, and safety. The analytical point is that the effectiveness of the “human-in-the-loop” is not guaranteed; it depends on how well the points of human intervention align with the dominant risks.
In finance, especially in lending and related decision processes, the primary concern is algorithmic accountability rather than the sheer number of human checks. Human override layers contribute to greater fairness only in combination with decision rationales, explainability for adverse actions, and monitoring for disparate impact across protected groups. Without these components, human checks can become little more than a liability buffer while biased decision patterns persist. The governance-focused frameworks developed by Joshi and Singh are meaningful only when implemented as institutional controls, such as documentation, escalation, and review governance [
48,
84]. This aligns with broader regulatory debates on meaningful human oversight and accountability design under high-risk AI governance, where policy compliance and substantive fairness can diverge [
65,
71,
72].
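Monitoring for disparate impact, as mentioned above, can be illustrated by comparing approval rates across groups after human review. The sketch below is a simplified assumption-laden example: the function names are hypothetical, the group labels are placeholders, and the ratio of lowest to highest approval rate is only one of several possible disparity measures. Any alert threshold would have to come from the applicable legal or institutional framework rather than from this example.

```python
from collections import defaultdict


def approval_rates(records):
    """records: iterable of (group, approved) pairs, one per final decision."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in records:
        totals[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / totals[g] for g in totals}


def disparate_impact_ratio(records):
    """Ratio of the lowest to the highest group approval rate (1.0 = parity)."""
    rates = approval_rates(records)
    if not rates:
        raise ValueError("no decisions to evaluate")
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group had any approvals, so no disparity is measurable
    return min(rates.values()) / highest


# Hypothetical decision log: group A approved 8 of 10, group B approved 4 of 10.
decisions = ([("A", True)] * 8 + [("A", False)] * 2
             + [("B", True)] * 4 + [("B", False)] * 6)
ratio = disparate_impact_ratio(decisions)
```

Computing the ratio on final, post-override decisions rather than on raw model outputs is what makes it a check on the combined human–AI process.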
The main challenge in education is maintaining the legitimacy of assessment while using artificial intelligence to scale. Human involvement can improve judgment, but it can also introduce instructor inconsistency or institutional bias if rubrics are poorly defined. Evidence across use cases involving grading and content generation suggests that human-in-the-loop design should be aligned with rubrics and moderation to achieve equitable outcomes for different student groups [
9,
85,
86,
87,
104]. In this setting, human oversight is effective when it is structured as calibrated academic judgment rather than discretionary exception handling.
In industrial or manufacturing settings, human-in-the-loop (HITL) systems are typically assessed for their reliability under production stress. The review process improves defect detection and adaptability to changing environments, but performance suffers when interfaces are overly burdensome cognitively or when operators are relegated to passive monitoring for exceptions. Empirical work on manufacturing and visual inspection settings suggests that successful implementation relies on strong interplay between explainability, training, and feedback mechanisms for continually refining predictive models and practices [
23,
41]. Across all three domains, the common lesson is that analytical review should focus on how oversight is operationalized—authority, timing, evidence, and feedback—rather than on whether a human is nominally present in the loop.
5.5. Cross-Scenario Common Challenges in High-Risk HITL Deployments
Across all these domains (healthcare, autonomous systems, cybersecurity, finance, education, and industry), recurrent problems follow a common pattern despite the varying objectives of each domain. Firstly, accountability is difficult to establish when it is distributed among different models, operators, and organizations; clear override privileges and decision logs are needed to avoid diffusion of responsibility. Secondly, trust calibration is easily disrupted: over-trust leads to automation bias, whereas under-trust reduces the effectiveness or use of the systems. Thirdly, cognitive load and fatigue limit how effectively operators can provide oversight in practice.
Fourthly, the timing of human interventions is just as important as the interventions themselves: delayed or untimely interventions may compromise safety even when formal human-in-the-loop structures are in place. Fifthly, feedback varies in quality and may contain noise, bias, or even strategic behavior that remains unresolved, which significantly affects model reliability. Lastly, institutional constraints affect the feasibility of different types of oversight. Overall, these cross-scenario challenges indicate that human-in-the-loop systems need to be viewed as part of an organizational control system rather than just a model–human interface.
8. Open Challenges and Future Directions
The previous sections have discussed the current status of human-in-the-loop AI systems with regard to their theoretical underpinnings, technical methodologies, applications, and governance frameworks. The following subsections discuss the challenges that limit the efficacy of HITL systems and present possible research directions for addressing them. The challenges span technical, cognitive, organizational, and societal aspects, reflecting the interdisciplinary nature of human–AI collaboration.
8.1. Layered Future Research Agenda Aligned with the HITL Taxonomy
To structure future work more systematically, four interrelated layers can be distinguished and aligned with the dimensions of the aforementioned taxonomy, namely loop placement and interaction granularity. The technical layer prioritizes uncertainty estimation, learning in the presence of disagreement, secure feedback mechanisms, and adaptive escalation strategies, which directly affect both loop placement and interaction granularity. The cognitive layer prioritizes trust calibration, workload-aware interface design, and mitigation of human bias and deskilling, which affect interaction granularity. The organizational layer prioritizes governance capabilities, staffing models, accountability and traceability, audit processes, and escalation processes, which affect the viability of human-in-the-loop design. The ethical and institutional layer prioritizes fairness in heterogeneous feedback mechanisms, transparency requirements, value pluralism management, and sector-specific regulations, which likewise affect viability. Taken together, these layers suggest that future research should evaluate human-in-the-loop methods not only by their predictive performance but also by how well they calibrate human control at different risk levels and interaction densities.
8.2. Scalability of Human Oversight
The scalability of human oversight is one of the key issues in the effective implementation of human-in-the-loop (HITL) technology. As AI is extended to more decision scenarios and contexts, human oversight capacity becomes increasingly limited. Current approaches to scaling oversight include active learning, tiered human oversight, and sampling-based audits. Future research should develop more sophisticated techniques for distinguishing scenarios that require human attention from those that can be processed by machines. Techniques that help AI systems accurately assess their own uncertainty can support the effective allocation of human resources across decision scenarios. Comparing different human–AI team configurations can help identify those that allocate human resources most effectively. AI systems that can explain themselves and flag potential issues can further help human overseers direct their attention where it is needed.
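Tiered oversight combined with sampling-based audits can be sketched as a simple routing rule. This is a minimal illustration under stated assumptions: the tier names, the two confidence thresholds, and the audit rate are hypothetical, and the sketch presumes a confidence score that reflects the model's actual uncertainty.

```python
import random


def route_case(confidence, rng, auto_threshold=0.95,
               review_threshold=0.7, audit_rate=0.05):
    """Assign a case to an oversight tier based on model confidence.

    All thresholds and the audit rate are illustrative assumptions.
    """
    if confidence < review_threshold:
        return "expert_review"       # most uncertain cases get the most scrutiny
    if confidence < auto_threshold:
        return "standard_review"     # moderately uncertain cases get a lighter check
    if rng.random() < audit_rate:
        return "audit_sample"        # random spot-check of confident decisions
    return "automated"


rng = random.Random(0)  # seeded so that the routing is reproducible
tiers = [route_case(c, rng) for c in (0.55, 0.80, 0.99, 0.97)]
```

The random audit of high-confidence cases is what allows overconfident errors to be detected at all, since such cases would otherwise never reach a human.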
8.3. Human Factors and Cognitive Limitations
The cognitive limitations of humans are fundamental constraints on the performance of human-in-the-loop systems that cannot be alleviated by technology alone. Fatigue, lapses of attention, cognitive biases, and the bounded nature of human rationality can all affect the performance of humans in an oversight role. These factors may be exacerbated by the characteristics of HITL tasks, such as the need to maintain vigilance, repetitive work, and the difficulty of staying engaged when the AI system is working well. Research on human factors in HITL systems has identified these performance issues, but how to resolve them remains an open question.
Future research on HITL systems should address both the design of systems that accommodate human limitations and the development of support tools that enhance human performance. Adaptive systems that respond to the cognitive state of the human operator are a promising direction, although the question of what remedial action to take when a problem is detected is still open. Training methods that maintain human skills are also important, especially given the risk of skill degradation when AI systems take over tasks. Finally, work schedules need to be designed to sustain human performance over long periods.
8.4. Conflicting Human Feedback
In human-in-the-loop (HITL) systems that combine feedback from multiple contributors, disagreement among contributors is a recurring problem. Annotators may assign different labels to similar cases, stakeholders may hold incompatible views on fairness criteria, and experts may draw different conclusions about the best course of action. Existing approaches include majority voting, quality-weighted aggregation of contributor inputs, and methods that preserve information about disagreement rather than forcing consensus. However, these approaches do not fully resolve how to establish ground truth when humans disagree.
Future research should distinguish disagreement that stems from ambiguity or plurality of values from disagreement that stems from error or lack of information. Tools for diagnosing the cause of a disagreement would allow each type to be handled appropriately. Research on deliberative procedures that help contributors reach consensus, or at least clarify the nature of their disagreement, may complement aggregation-based approaches that treat human opinions as given. The effect of conflicting feedback on system learning and on the validity of AI actions remains an important research question.
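The three aggregation strategies mentioned in this subsection can be contrasted on a single item. The sketch below is illustrative only: the function names, the annotator identifiers, and the quality weights are hypothetical, and real systems typically estimate annotator quality from data rather than fixing it by hand.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Return the most frequent label (ties are resolved arbitrarily)."""
    return Counter(labels).most_common(1)[0][0]


def weighted_vote(votes, weights):
    """Quality-weighted aggregation.

    votes: list of (annotator, label) pairs.
    weights: dict mapping annotator -> quality weight (assumed given).
    Annotators without a weight default to 1.0.
    """
    scores = defaultdict(float)
    for annotator, label in votes:
        scores[label] += weights.get(annotator, 1.0)
    return max(scores, key=scores.get)


def soft_label(labels):
    """Preserve disagreement as an empirical label distribution."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


votes = [("ann1", "toxic"), ("ann2", "ok"), ("ann3", "ok")]
weights = {"ann1": 0.9, "ann2": 0.3, "ann3": 0.4}  # hypothetical quality scores
labels = [label for _, label in votes]

hard = majority_vote(labels)               # "ok"
weighted = weighted_vote(votes, weights)   # "toxic" (0.9 vs 0.7)
soft = soft_label(labels)                  # {"toxic": 1/3, "ok": 2/3}
```

In this example the two hard-label strategies return different answers for the same votes, while the soft label records that the item is contested. None of the three indicates whether the disagreement reflects error or a genuine difference in values.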
8.5. Adversarial Manipulation and Security
Human-in-the-loop (HITL) systems incorporate human components that may be vulnerable to adversarial attacks, such as social engineering attacks on trust, cognition, or organizational factors. Attackers may aim to compromise training data by attacking human annotators, evade detection by exploiting human fatigue, or manipulate system behavior by targeting individuals who are involved in feedback provision or deployment decision-making. Hence, the security of HITL systems will not only depend on system security, but also on the robustness of human components against adversarial attacks.
To advance HITL security, future work should develop threat models that identify the vulnerabilities introduced by human involvement and propose countermeasures for different risk contexts. Monitoring for anomalous behavior or feedback may enable early detection of attacks on human components. Organizational factors that reduce susceptibility to social engineering attacks should receive more attention alongside technical security measures. Developing HITL systems with security guarantees even when some human components are compromised is a promising yet challenging direction for future research.
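One simple form of anomalous-feedback detection is to compare each annotator's agreement with the per-item majority against the typical agreement rate across annotators. The following sketch assumes a flat list of `(annotator, item, label)` records and a fixed `margin` below the median as the flagging rule. Both are hypothetical choices made for illustration, not a method proposed in the surveyed literature.

```python
from collections import Counter, defaultdict
from statistics import median


def agreement_rates(annotations):
    """Share of each annotator's labels that match the per-item majority.

    annotations: list of (annotator, item, label) tuples.
    """
    by_item = defaultdict(list)
    for _, item, label in annotations:
        by_item[item].append(label)
    majority = {item: Counter(labels).most_common(1)[0][0]
                for item, labels in by_item.items()}
    hits, totals = defaultdict(int), defaultdict(int)
    for annotator, item, label in annotations:
        totals[annotator] += 1
        hits[annotator] += int(label == majority[item])
    return {a: hits[a] / totals[a] for a in totals}


def flag_low_agreement(rates, margin=0.3):
    """Flag annotators whose agreement is more than `margin` below the median."""
    cutoff = median(rates.values()) - margin
    return sorted(a for a, r in rates.items() if r < cutoff)


# Three annotators label every item "x"; a fourth mostly labels "y".
annotations = [(a, i, "x") for a in ("a1", "a2", "a3") for i in range(5)]
annotations += [("a4", i, "y") for i in range(4)] + [("a4", 4, "x")]

rates = agreement_rates(annotations)
flagged = flag_low_agreement(rates)
```

A flag of this kind is only a trigger for human follow-up. Low agreement may reflect a legitimate minority view rather than compromise, and colluding attackers could shift the majority itself, which is one reason organizational safeguards remain necessary.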
8.6. Toward Adaptive and Self-Regulating Architectures
Existing human-in-the-loop systems are typically based on fixed configurations that specify the points in AI system execution at which human input is required. An adaptive system, in which human input is modulated based on risk level, AI confidence, or performance, could offer a better balance between oversight and system efficiency. However, a self-regulating system poses a distinctive problem: if the system itself is responsible for determining when oversight is required, errors in its self-assessment could lead to oversight being minimized or eliminated altogether. This creates a circularity that undermines the normative purpose of a HITL system unless it is bounded by external limits that are not represented within the model being overseen.
A robust solution is to supplement adaptive triggering with non-adaptive oversight measures. These include mandatory oversight floors for high-risk decisions, policy-based intervention rules developed by external agencies or organizations, randomized mandatory audits, sentinel models that are independent of the AI system, and conservative default levels of oversight when uncertainty estimates are unstable or the underlying probabilities are shifting. In this regard, AI-determined risk levels serve to identify where oversight is required, not to eliminate oversight altogether. Future research should focus on meta-oversight, in which humans govern how AI systems adapt, including trigger thresholds, override rules, and post-incident assessment.
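The layering of non-adaptive measures over an adaptive trigger can be sketched as an ordered rule check in which the model's self-reported confidence is consulted last. The `Case` fields, the threshold values, and the reason strings below are assumptions made for illustration. In particular, `risk_tier` is assumed to be assigned by an external policy and `drift_detected` by a monitor that is independent of the overseen model.

```python
import random
from dataclasses import dataclass


@dataclass
class Case:
    risk_tier: str        # "high" or "low"; assumed set by external policy
    confidence: float     # model's self-reported confidence in [0, 1]
    drift_detected: bool  # assumed signal from an independent monitor


def requires_human(case, confidence_threshold=0.9, audit_rate=0.05, rng=random):
    """Return (needs_review, reason). Non-adaptive rules are checked first."""
    # Hard floor: the model cannot waive review of high-risk cases.
    if case.risk_tier == "high":
        return True, "hard floor: high-risk tier"
    # Conservative default when conditions look unstable.
    if case.drift_detected:
        return True, "conservative default: drift detected"
    # Randomized audit, independent of the model's confidence.
    if rng.random() < audit_rate:
        return True, "randomized audit"
    # Adaptive trigger: only here does self-assessment matter.
    if case.confidence < confidence_threshold:
        return True, "adaptive trigger: low confidence"
    return False, "automated"


rng = random.Random(0)
high_risk = requires_human(Case("high", 0.99, False), rng=rng)
confident_low_risk = requires_human(Case("low", 0.99, False),
                                    audit_rate=0.0, rng=rng)
```

With this ordering, an overconfident model can reduce review only for low-risk, stable cases that escape the random audit. The thresholds themselves are parameters that humans would set and revise, which is where the meta-oversight described above would apply.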
9. Conclusions
This survey provides a systematic overview of the field of Human-in-the-Loop AI by discussing its underlying theory, technology, ethics, and practice. The integration of human judgment into the decision-making process in AI systems helps reduce the problems that occur due to the limitations of fully automated systems. At the same time, it presents new challenges for human–AI interaction, management of cognitive load, and designing efficient collaborative systems. A taxonomy for HITL systems is proposed that organizes them based on the position of the loop, granularity of interaction, and temporal characteristics. This helps in comparing different systems and identifying relevant design issues depending on the application domain.
The underlying technology for HITL AI systems has matured considerably: active learning, reinforcement learning from human feedback, and human-in-the-loop generative AI are well-developed fields supported by a substantial body of research. Explanation techniques help humans interact effectively with these systems, and trust calibration mechanisms have been developed to support appropriate reliance on them. Applications in domains such as healthcare, autonomous systems, and cybersecurity highlight the importance of human involvement in AI decision-making and the need to adapt these systems to each domain.
Figure 5 summarizes the trust calibration dynamics discussed above, illustrating how different levels of human trust influence interaction with AI systems and how calibration mechanisms can support effective oversight.
Ethical and governance considerations occupy an increasingly central position in human-in-the-loop research. Fairness, bias mitigation, and value alignment remain challenges that human-in-the-loop approaches can help alleviate but never completely resolve. The regulatory environment surrounding AI systems, including the European Union AI Act, the various AI governance initiatives of the United States government, and sector-specific guidance such as the FDA AI/ML SaMD guidelines and autonomous vehicle safety guidelines, points to requirements for human oversight and risk mitigation in AI systems that affect society. The organizational dimension of human-in-the-loop research deserves continued attention as the use of AI systems continues to grow rapidly.
Several challenges continue to limit the effectiveness of human-in-the-loop systems and also indicate avenues for future research. The scalability of human oversight, the management of human cognitive limitations and biases, the handling of conflicting human feedback, and the security of human-in-the-loop systems against adversarial attacks all deserve further attention. Adaptive architectures that modulate human oversight according to risk and uncertainty could help address the scalability problem.
We envision a future in which the field shifts away from the current dominant paradigm of “human-in-the-loop” toward more sophisticated “human-with-the-loop” partnerships. These partnerships will be marked by the dynamic distribution of roles, mutual adaptation between human and artificial intelligence partners, and a governance structure that provides accountability without stifling positive innovation. The fundamental design goal is to adjust the level of control given to the human partner according to risk and uncertainty while maintaining non-negotiable mechanisms to override the AI system and accountability points for high-impact decision-making. Achieving this vision will require sustained collaboration among the fields of computer science, cognitive science, organizational studies, ethics, and policy research. Given the stakes involved in high-impact AI applications, this interdisciplinary collaboration is necessary to ensure that AI systems are aligned with human values and remain under human control.