4.1. The Basic Architecture of VRAIO
The starting point of VRAIO lies not in attempting to elucidate the internal structure of AI systems, but in making the outputs that actually affect society the object of governance. However complex and black-box an AI system may be, what ultimately acts upon society is its outputs, namely recommendations, judgments, warnings, and administrative decision support. Accordingly, making it verifiable under what conditions outputs are generated, for what purpose, and to whom they are sent becomes the central task of AI governance [39, 40, 41].
The norm running through this governance is simple (lower part of Figure 2): AI outputs that can have a significant impact on society bear an obligation to argue that their purpose and content conform to the Rules, and an output that cannot be so argued must not be released in the first place. The architecture of VRAIO is an attempt to embed this norm not as a mere declaration but as a mechanism: the Valve described below is nothing other than the physical embodiment of this norm that “outputs that cannot be argued are not let through.”
The principal components of Figure 2 are as follows.
The Government Regulatory Agency for AI (GRA-AI) formulates the Rules on the basis of social deliberation, legal institutions, and democratic decision-making. The Rules are not merely ethical guidelines; they are definitions of the purposes, targets, conditions, and prohibitions under which AI outputs are permissible, specified in a form that can be mechanically verified. These Rules are shared with the AI system, the Recorder, and the Data Receiver.
The Outbound Firewall is the exit structure surrounding the AI system. Generated output candidates cannot be released externally without passing through this Firewall. Inside the Firewall, the AI system generates a GLO metadata declaration (described below) together with the output candidate and transmits this to the Recorder as an Application.
The Recorder functions as an independent third-party body. The role of the Recorder is, in principle, to confirm that the declared GLO metadata is formally consistent with the Rules; it is not necessary for the Recorder to read the content of the output at all times. When it determines conformity, the Recorder returns the release condition to the Valve inside the Firewall. When it does not determine conformity, the Valve remains closed and the output is blocked.
The ledger (such as a Blockchain network) records the determination process and its results in a tamper-resistant form. The information recorded comprises the identifying information of the output candidate, the GLO metadata, the Rules applied, the determination result, timestamp information, and the output hash value. The ledger is the institutional memory concerning AI outputs and forms the foundation for post hoc auditing [29, 39, 40].
Auditing by citizens and audit bodies is a structural feature that demonstrates that VRAIO is not merely an administrative surveillance apparatus. Researchers, citizens, and independent investigators can refer to the ledger to verify the effectiveness of the Rules, the compliance status of AI systems, and whether any violations have occurred [31, 32]. Auditing involves not only reading the metadata in the ledger but also, through spot audits, cross-referencing the preserved actual output against the GLO metadata and output hash value in the ledger, thereby confirming that the declaration contains no falsehood.
The operation of VRAIO can be organized into the following four stages.
Rule formulation: The Government Regulatory Agency for AI formulates the Rules through democratic procedures.
Output candidate generation and GLO metadata declaration: The AI system generates an output candidate and transmits the corresponding GLO metadata to the Recorder as an Application.
Verification, determination, and recording: The Recorder verifies the GLO metadata against the Rules and returns the determination result to the Valve while simultaneously recording it in the ledger. Only when the Valve opens is the output transmitted to the Data Receiver.
Auditing and institutional improvement: Citizens and audit bodies audit the ledger and verify the validity of the Rules and the state of compliance. Audit results are fed back into the updating of the Rules.
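The four stages can be sketched in a few lines of code. The following is a minimal illustration under stated assumptions, not the architecture's actual implementation: the `Ledger` class, the `RULES` table with a single minimum-L threshold, and the functions `recorder_decide` and `release` are all hypothetical names, and a SHA-256 hash chain stands in for the tamper-resistant ledger (the text mentions a Blockchain network only as one option).

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Append-only log; each entry commits to the hash of the previous entry."""
    entries: list = field(default_factory=list)

    def append(self, record: dict) -> str:
        prev = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        body = json.dumps(record, sort_keys=True)
        entry_hash = hashlib.sha256((prev + body).encode()).hexdigest()
        self.entries.append({"prev": prev, "record": record, "entry_hash": entry_hash})
        return entry_hash

    def verify_chain(self) -> bool:
        """Stage 4 (audit): recompute every link and detect later tampering."""
        prev = "0" * 64
        for entry in self.entries:
            body = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode()).hexdigest()
            if entry["prev"] != prev or entry["entry_hash"] != expected:
                return False
            prev = entry["entry_hash"]
        return True


# Stage 1 (assumed form of the Rules): a minimum L per output type.
RULES = {"missing_child_search": 0.9}


def recorder_decide(metadata: dict, ledger: Ledger) -> bool:
    """Stage 3: check the declaration against the Rules and record the result."""
    threshold = RULES.get(metadata["output_type"])
    conforms = threshold is not None and metadata["L"] >= threshold
    ledger.append({"metadata": metadata, "verdict": conforms})
    return conforms


def release(output: str, metadata: dict, ledger: Ledger):
    """Valve: the output reaches the Data Receiver only on a conforming verdict."""
    return output if recorder_decide(metadata, ledger) else None
```

In this sketch the Recorder never reads `output`; it sees only the declared metadata (stage 2), which mirrors the point that the Recorder need not read the content of the output at all times. Blocked declarations are recorded as well, so the ledger retains a trace of every Application.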
The core of VRAIO’s deterrent power does not lie in real-time blocking by the Valve. If an AI system operator declares GLO metadata that differs from the actual output, that false declaration leaves a trace in the ledger [29, 39]. The serious sanctions imposed when falsehood is discovered through unannounced spot audits, sanctions severe enough to destroy the incentive for false declaration, render false declaration an irrational choice [30, 32]. VRAIO is not a system that monitors outputs one by one; it is an architecture that institutionally destroys the incentive to make false declarations [39, 40, 47].
It should be noted that VRAIO is not limited to specific applications. The Data Receiver may be any of a diverse range of parties, including individuals, public institutions, other AI systems, and public infrastructure. VRAIO is applicable to a diverse range of AI outputs with social impact, including platform AI, generative AI, autonomous driving AI, public space camera AI, and administrative decision support AI. However, what is presented in this section is no more than the basic structure; how the legitimacy of each individual output is quantified and declared (Section 4.2 and Section 4.3) and how its verification is guaranteed (Section 4.4) are the keys to making VRAIO effective.
4.2. GLO Metadata: A Common Language for Declaring Output Legitimacy
GLO (Guide to the Expression of Legitimacy of Output) is a common language that defines the format of the metadata required to be attached to output candidates that can affect society. In VRAIO, GLO metadata functions as the declaration that the Recorder mechanically cross-references against the Rules, that is, against the permissible output range described in GLO format. It does not disclose the full text or raw data of the output; rather, it describes the information necessary for governing the output in a form that can be cross-referenced against the Rules. At its core, it treats the output candidate as a single “claim” and declares, on a fact-based footing, the argument that its purpose and content conform to the Rules (Section 4.3).
The items of GLO metadata are as follows.
Output ID: An identifier uniquely assigned to each output candidate. It is indispensable for recording in the ledger and for cross-referencing in post hoc auditing.
Timestamp: The date and time at which the output candidate was generated (UTC standard). Tampering is detected through cross-referencing with the time of receipt by the Recorder.
Operator ID: Information identifying the operator of the AI system that generates and transmits the output. Assurance of authenticity through means such as digital signatures is required, clarifying the attribution of responsibility for false declaration.
Receiver ID and Category: The identifying information of the Data Receiver and its attribute category (adult, minor, medical institution, public institution, other AI system, etc.). This directly affects the selection of Rules to be applied.
Output Type: The type of the output in question. It is structured in accordance with a predefined classification system (ontology); once the type is determined, the applicable Rules, the types of “facts” required, and the types of “primary sources” that should serve as their basis are thereby determined.
Risk Classification: The risk level of the output in question, declared in accordance with the classification system defined in the Rules (low, medium, high, emergency, etc.). It is reflected in the priority of the Recorder’s verification and in the control conditions of the Valve.
Applicable Rules Reference: The clause identifiers of the Rules applied to the output in question. Verification is performed on the basis of this reference.
Legitimacy Confidence L (L ∈ [0,1]): The probability that the purpose and content of the output in question conform to the Rules. It is composed as L = L_purpose × L_content, the product of the legitimacy of the purpose and the legitimacy of the content given that purpose (Section 4.3).
Legitimacy Budget: Information that decomposes and presents the computation process and grounds of L. It comprises the list of “facts” that constitute the argument, each fact and the “primary source” that serves as its basis, the reliability of each fact and the method of its evaluation, and the composition rule for combining them into L.
L does not circulate on its own; it must always be accompanied by the legitimacy budget. This forms the core of GLO metadata (Section 4.3).
Output Hash: A hash value generated from the output content. In spot audits, it is used to verify the identity between the declared GLO metadata and the actual output content. It is one of the technical foundations for deterring false declaration.
Domain-Specific Fields: Extension items according to the nature and use of the output. These include flags for the possible inclusion of personally identifying information, risk classifications for emotional or psychological impact, the presence or absence of emergency exception application, and a list of the multiple models involved in a multi-agent configuration.
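The items above can be pictured as a single record type. The following is a minimal sketch, assuming field names, types, and a SHA-256 output hash that the text does not specify; `Fact`, `LegitimacyBudget`, `GLOMetadata`, `hash_output`, and `matches_output` are hypothetical names introduced for illustration only.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Fact:
    statement: str          # e.g. "a search authorization exists"
    primary_source: str     # where an auditor can trace the fact
    reliability: float      # evaluated reliability of the fact
    evaluation_method: str  # how the reliability was evaluated


@dataclass(frozen=True)
class LegitimacyBudget:
    facts: tuple            # tuple of Fact
    composition_rule: str   # rule used to combine the facts into L


@dataclass(frozen=True)
class GLOMetadata:
    output_id: str
    timestamp: datetime     # generation time, UTC
    operator_id: str
    receiver_id: str
    receiver_category: str
    output_type: str
    risk_classification: str
    applicable_rules: tuple
    L: float
    budget: LegitimacyBudget  # L never travels without its budget
    output_hash: str
    domain_fields: dict = field(default_factory=dict)

    def __post_init__(self):
        if not 0.0 <= self.L <= 1.0:
            raise ValueError("L must lie in [0, 1]")
        if not self.budget.facts:
            raise ValueError("the legitimacy budget must list at least one fact")


def hash_output(content: str) -> str:
    """Assumed hash function: SHA-256 over the UTF-8 encoded output content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def matches_output(meta: GLOMetadata, preserved_output: str) -> bool:
    """Spot-audit identity check between the declaration and the actual output."""
    return meta.output_hash == hash_output(preserved_output)
```

Making the budget a required constructor argument is one way to express the requirement that L must always be accompanied by the legitimacy budget: a record carrying L alone cannot be built.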
The types of “facts” required and the “primary sources” that serve as their basis differ for each output type. For example, in the search for a missing child in an FMPS (Fully Monitored Public Space), discussed later (Section 4.5), facts such as a search authorization and the identification of the police officer who requested and confirmed the search are required to be traceable to primary sources such as the records of the issuing court or the police. Which facts and primary sources are required for which type is determined as part of the Rules through democratic procedures.
These items can be mechanically cross-referenced against the Rules. However, the depth of scrutiny is divided into two tiers. The Recorder’s constant cross-referencing covers the Timestamp, Operator ID, Receiver ID and Category, Output Type, and L, confirming that these are formally consistent with the Rules. The legitimacy budget is not an object of the Recorder’s constant scrutiny but is scrutinized at the time of unannounced spot audits. That is, an audit body inputs the “facts” in the budget into the same Rule-Judgment AI (Section 4.4) to recompute L, confirms its agreement with the declared L, and cross-references whether each “fact” is traceable to a primary source as declared. This two-tier configuration (formal verification up to L on a constant basis, and substantive scrutiny of the budget on a spot basis) is the core of VRAIO’s operation, which renders false declaration irrational without requiring constant monitoring of every output (Section 4.4).
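The first tier, the Recorder's constant formal check, might look as follows. This is a sketch under assumptions: the `RULES` table, the operator registry, the 30-second clock-skew tolerance, and the function name `formal_check` are all hypothetical, and the budget is deliberately not inspected, since it is scrutinized only in spot audits.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical Rules: per output type, permitted receiver categories and minimum L.
RULES = {
    "missing_child_search": {
        "allowed_receivers": {"public institution"},
        "min_L": 0.9,
    }
}
REGISTERED_OPERATORS = {"op-001"}        # assumed registry of known operators
MAX_CLOCK_SKEW = timedelta(seconds=30)   # assumed tolerance between declared and receipt time


def formal_check(meta: dict, received_at: datetime) -> list:
    """Return the names of formally inconsistent fields (empty list means conformity)."""
    problems = []
    # Timestamp: compare the declared generation time with the Recorder's receipt time.
    if abs(received_at - meta["timestamp"]) > MAX_CLOCK_SKEW:
        problems.append("timestamp")
    if meta["operator_id"] not in REGISTERED_OPERATORS:
        problems.append("operator_id")
    rule = RULES.get(meta["output_type"])
    if rule is None:
        # Without a known output type, no Rules can be selected for the rest.
        problems.append("output_type")
        return problems
    if meta["receiver_category"] not in rule["allowed_receivers"]:
        problems.append("receiver_category")
    if not (rule["min_L"] <= meta["L"] <= 1.0):
        problems.append("L")
    return problems
```

The check touches only the five constantly verified items (Timestamp, Operator ID, Receiver ID and Category, Output Type, and L) and never the output content, which keeps its cost independent of the size or sensitivity of the output.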
The Compliance Verdict is not a declared item but an item assigned by the adjudicating side upon receiving the declaration. The sealed, deterministically operating Rule-Judgment AI (Section 4.4) computes and verifies L, and the Recorder, upon receiving that result, determines conformity and records it in the ledger. The Recorder has the Rule-Judgment AI bear the substance of the determination while itself handling the recording of that result and the confirmation of formal consistency with the Rules. The Compliance Verdict becomes the institutional basis for ledger recording, auditing, and penalties.
A supplementary note on the flow of determination. The default operation of the Recorder is limited to confirming, upon receiving the result of L computation and verification by the Rule-Judgment AI, the formal consistency between the GLO metadata and the Rules; it is not necessary to read the output content itself at all times. Because the Rule-Judgment AI is public and deterministic, the AI system side can also predict the verification result in advance, and in normal operation, the Valve functions as a formal pass-through gate. Exceptionally, only in serious cases or when there is sufficient time for determination, a transition to a Substantive Review that enters into the output candidate itself is also institutionally envisaged. However, this is an exception to the basic operation of VRAIO, and its activation conditions, authority, and procedures must be explicitly defined in advance within the Rules. The determination mode is also recorded in the ledger as part of the determination result.
Although the specific items of GLO metadata vary by application domain, the basic structure—declaration of the argument and facts, accompaniment of L by the legitimacy budget, cross-referencing against the Rules, official determination, and identity assurance by hashing—is common. It is precisely this common structure that enables VRAIO to function as a general-purpose output governance infrastructure not limited to specific applications.
4.3. The Legitimacy Confidence L and Its Composition
The core of the GLO metadata introduced in Section 4.2 was the self-declaration of the legitimacy confidence L of the output in question. This section shows how L is defined and on what basis it is computed. Specifically, it formulates L as a continuous quantity L ∈ [0,1] representing the degree of legitimacy of an output candidate, and introduces the legitimacy budget, which decomposes and accompanies the grounds for its computation. These give concrete form, at the level of individual outputs, to the structural correspondence with the GUM discussed in Section 3.
Treating the output candidate as a “claim”
Before introducing L, the basic idea of GLO should be made clear. GLO does not require an explanation of the internal process of the AI system that led to the output, that is, of the weights and inference paths through which the output was generated. In large-scale AI systems, such an explanation is fundamentally difficult, and moreover, even if the internal process were explained, whether an individual output conforms to the Rules would remain a separate problem (Section 3).
GLO therefore treats the output candidate itself as a single “claim” and requires that, for that claim, an argument that its purpose and content conform to the Rules be constructed post hoc. The argument is constructed by combining “facts.” That is, for an output candidate to be permissible, it must be possible to argue, on the basis of facts, that its purpose and content conform to the Rules. An output that cannot be argued must not be released in the first place (the norm of Section 4.1). What is at issue in this conception is not how the output was generated (the process) but how the output can be justified by the Rules (justification).
Here, the consequence that this norm has for the behavior of the AI system should be noted. Because the Rules are public (in GLO format) and the Rule-Judgment AI that performs the conformity determination (Section 4.4) is deterministic, the AI system can itself determine, prior to declaration, whether the argument it has constructed yields an L within the range permitted by the Rules. Accordingly, the AI system attempts to construct an argument until one that yields a sufficient L is obtained, and it abandons the output if none is obtained. There is no incentive to make a declaration foreseeably destined to be rejected. As a result, the situations in which the Valve actually blocks an out-of-range declaration are exceptional, and in normal operation, the Valve is open. VRAIO’s deterrent power lies not in this pre-emptive blocking by the Valve but in the post hoc destruction of incentives through the recording of declarations, unannounced spot audits, and severe sanctions for false declaration (Section 4.1 and Section 4.4).
Definition and composition of L
The legitimacy confidence L expresses the plausibility of this argument as a value in [0,1]. The dimensions along which the Rules judge the permissibility of an output can be broadly divided into two: whether the purpose of the output is legitimate and whether the content of the output is legitimate. Corresponding to this, L is constructed as the product of two factors: L = L_purpose × L_content.
Here, L_purpose is the plausibility of the legitimacy of the purpose, and L_content is the plausibility of the legitimacy of the content given the declared purpose. Taking the form of a product has a clear meaning. First, if either the purpose or the content is not legitimate (if either factor is 0), the overall L is also 0. An output whose purpose is illegitimate is not permitted however appropriate its content, and vice versa. Second, if both factors are interpreted as “the probability of conforming,” their product can be interpreted probabilistically as “the probability that both the purpose and the content conform.”
The point that the legitimacy of the content is evaluated conditionally on the purpose is important. Whether a given content is appropriate is not determined on its own but in relation to the declared purpose. For example, an output that accesses a certain range of data may be appropriate under a legitimate purpose but excessive in the absence of that purpose. In this sense, L_content can be understood as the quantification of data minimization and proportionality in light of the declared purpose—content exceeding the range reasonably necessary to achieve the purpose lowers L.
It should be noted that whether to declare purpose and content composed into a single L or to declare two values (L_purpose, L_content) side by side, and how to define the composition rule are themselves matters to be determined as part of the Rules through democratic procedures. This paper presents composition by product as the basic form but does not fix it as the only form.
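A short worked example may make the product form concrete. The numerical values below are purely illustrative assumptions, not values taken from any Rules: case (a) is a legitimate purpose with proportionate content, case (b) is the same purpose with content that exceeds what the purpose reasonably requires, and case (c) is an illegitimate purpose with otherwise appropriate content.

```latex
\begin{align*}
L &= L_{\text{purpose}} \times L_{\text{content}} \\
\text{(a)}\quad L &= 0.98 \times 0.90 = 0.882 \\
\text{(b)}\quad L &= 0.98 \times 0.40 = 0.392 \\
\text{(c)}\quad L &= 0 \times 0.99 = 0
\end{align*}
```

Case (b) shows how excessive content lowers L even when the purpose is sound, and case (c) shows that no quality of content can compensate for an illegitimate purpose.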
The legitimacy budget: prohibiting the circulation of the value alone
L must not circulate on its own. A legitimacy budget that decomposes and presents the facts and assumptions, as well as the underlying composition, on which L was computed must always accompany L. This corresponds structurally to the way that, in metrology, the GUM does not permit a combined standard uncertainty to be presented as a single number but requires the accompaniment of an uncertainty budget that decomposes the evaluation method and contribution of each component.
The legitimacy budget includes the list of facts that constitute the argument, the reliability of each fact and the method of its evaluation, and the rule used for composition into L. Just as the GUM distinguishes components of uncertainty into statistical evaluation (Type A) and other evaluation (Type B), the reliability of each fact in the legitimacy budget is likewise envisaged to be classified according to its method of evaluation.
Here, the facts that constitute the argument must be traceable to primary sources. This is the criterion for the legitimacy budget to be auditable. For example, the fact that “a court warrant exists” must be traceable to a primary source, namely the records of the issuing court. A fact that cannot be traced to a primary source is not admitted as a constituent of the argument. This traceability is what makes possible the detection of factual falsity discussed in Section 4.4.
However, requiring the accompaniment of the legitimacy budget does not mean requiring its disclosure. The content of the budget (for example, the search range or the records referred to) may itself contain privacy-sensitive information. What is required is not disclosure but reachability from the audit channel. That is, the budget need only be preserved in a form that an authorized audit subject can verify as necessary. How to achieve this reconciliation of auditability and privacy protection is borne by the Rule-Judgment AI of Section 4.4.
The three-layer correspondence with GUM
By the above, the correspondence between GLO and GUM forms the following three layers at the level of individual quantities (complementing the institution-level correspondence discussed in Section 3). That is, just as a standard uncertainty accompanies a measurement value and an uncertainty budget accompanies the standard uncertainty, L accompanies an output candidate and a legitimacy budget accompanies L.
Like a standard uncertainty, L is not the primary value itself (here, the output candidate) but a second-order quantity assigned to it, and the legitimacy budget is third-order information that further decomposes the grounds for its computation. It is precisely this correspondence that distinguishes GLO from a merely declaratory classification system, for L is not a primitive, self-declared label but a derived quantity computed from a fact-based argument, the computation process of which is open to auditing.
The scope of the formalization of composition
This section has presented composition by the product of purpose and content as the basic form of L. In more complex outputs—for example, a policy judgment in which numerous facts are hierarchically combined to support a single conclusion—L is not a simple product of two factors but a derived quantity computed by propagation over an argument graph whose leaves are facts and whose root is the conclusion. In this case, the rule by which the reliability of each fact is combined to yield the L of the conclusion—for example, whether by composition analogous to the propagation of variance, or by composition dominated by the weakest argument—is a design matter that depends on the structure of the argument and the domain.
In this paper, we do not enter into the general formalization of this composition rule, but confine ourselves to establishing the structural requirement that L be “a derived quantity that is computed from a fact-based argument, accompanied by a legitimacy budget, and recomputable in auditing.” The formulation of specific composition rules and the verification of their validity for each domain are matters for future research. Indeed, the complexity of composition varies greatly by application domain, ranging from the relatively simple case in the FMPS discussed later (Section 4.5) to cases such as policy support in which numerous facts are deeply combined.
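Although the general composition rule is left open, the idea of propagation over an argument graph can be illustrated with a toy example. The sketch below assumes a tree-shaped argument and offers just two candidate rules, a product rule and a weakest-argument rule; the `Node` class, the rule names, and the reliabilities are hypothetical and are not proposed as the composition rule for any domain.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Node:
    name: str
    reliability: float = 1.0   # read only at leaves (facts)
    rule: str = "product"      # read only at internal nodes
    children: list = field(default_factory=list)


def propagate(node: Node) -> float:
    """Compute L at a node from the facts beneath it, deterministically."""
    if not node.children:
        return node.reliability
    values = [propagate(child) for child in node.children]
    if node.rule == "product":
        return prod(values)    # every sub-argument must hold
    if node.rule == "weakest":
        return min(values)     # the conclusion is only as strong as its weakest support
    raise ValueError(f"unknown composition rule: {node.rule}")


# A conclusion supported by two facts, evaluated under each rule.
facts = [Node("warrant exists", reliability=0.9),
         Node("officer identified", reliability=0.8)]
by_product = propagate(Node("search is legitimate", rule="product", children=facts))
by_weakest = propagate(Node("search is legitimate", rule="weakest", children=facts))
```

Because `propagate` is a pure function of the tree, an auditor holding the same legitimacy budget obtains the same L, which is the "recomputable in auditing" requirement stated above. The two rules give different values for the same facts, which illustrates why the choice of rule is a design matter for each domain.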
4.4. The Rule-Judgment AI: A Sealed Verifier
The GLO/VRAIO framework presented in Section 4.1, Section 4.2 and Section 4.3 imposes on output candidates the declaration of a legitimacy confidence L and a legitimacy budget, and guarantees their authenticity through spot audits. This framework stands on its own as a deterrent based on unannounced auditing and severe sanctions. However, two questions remain here that could further enhance the effectiveness of verification. First, how is it retroactively verified that the computation of L itself was performed correctly? Second, the computation of L requires entering into the content of an output candidate, which may contain privacy-sensitive information; how is that verification performed without leakage? This section presents the Rule-Judgment AI as a mechanism that answers these two questions simultaneously, and further shows the structure of incentives and responsibility for AI system operators that becomes possible only when this verification mechanism is in place. This verifier does not replace the deterrence of the preceding sections; it adds, on top of it, a further layer of reproducibility in the computation of L.
The sealed verifier
The Rule-Judgment AI is an independent component that receives an output candidate and its GLO metadata (facts and argument) as input and computes L in light of the Rules. Its first characteristic is that it is sealed. That is, while it may receive as input any information, including the output candidate itself, its output to the outside is limited to the single determination result (L and conformity), and it is designed and institutionally certified to expose no output channel other than the determination result.
This sealedness yields a decisive advantage. Even when the Rule-Judgment AI enters directly into the content of an output candidate, including privacy-sensitive information, to perform verification, it does not leak that content to the outside. It reads the maximum of information necessary for verification while limiting the information it outputs to the minimal determination, and this asymmetry is precisely what makes possible the reconciliation of auditability and privacy protection. The same kind of requirement, namely verifying a property without disclosing the content, is being technically realized in the financial sector through privacy-enhancing technologies such as zero-knowledge proofs, secure computation, and trusted execution environments (TEEs) [48]. The Rule-Judgment AI can be positioned as an application of this idea of sealed execution to the verification of AI output legitimacy.
Determinism and two-layer falsity detection
The second characteristic of the Rule-Judgment AI is that it is deterministic—its behavior does not change through learning, and it is guaranteed to return the same output for the same input. This property fundamentally strengthens the detection of false declaration.
Falsity relating to AI outputs can be divided into two layers.
First, computational falsity—the case in which an L that should properly be derived as low from the declared “facts” is falsely declared as high. This is reliably caught by the deterministic Rule-Judgment AI. In a spot audit, if the audit body inputs the same “facts” recorded in the metadata into the same Rule-Judgment AI and recomputes L, exactly the same L is obtained so long as the computation was performed correctly. If the AI system falsifies the computation of L, this is exposed as a discrepancy with the recomputed result. When an AI system behaves probabilistically, judging the truth of a declared L after the fact gives rise to a difficult gray zone; recomputation by a deterministic verifier turns this into a clear binary determination of agreement or disagreement.
Second, factual falsity: the case in which “a court warrant exists” is declared in the facts field while, in reality, no warrant exists. This cannot be caught by the Rule-Judgment AI, because the verifier merely computes on the premise that the given facts are true. This layer of falsity is therefore confirmed by cross-referencing against authoritative external records, for example, by confirming whether the declared warrant actually exists in the records of the issuing court. To make this cross-referencing possible, the “facts” constituting the argument must be traceable to primary sources (Section 4.3).
This two-layer separation limits the scope of the Rule-Judgment AI’s responsibility to “whether the computation from facts to L is correct” and entrusts “whether the facts are true” to external reference, which is also consistent with sealedness. If the verifier were also made to bear the determination of the truth of facts, it would require reference to all manner of external records, which could contradict sealedness. It is precisely because it adheres to the verification of computation that the verifier completes its task with the given input alone and can maintain its seal. This structure is, moreover, isomorphic to the auditing of an uncertainty budget in metrology: it corresponds to the two-stage verification of whether the computation from each component to the combined uncertainty is correct (confirmation by recomputation), and whether the value of each component itself is valid (reference to external grounds such as calibration certificates). By blocking both layers together, the pathways of false declaration are closed on both the computational and the factual sides. This is the technical underpinning of the deterrence that “renders false declaration an irrational choice” in VRAIO.
This division into two layers gives concrete form to the positioning of the verifier stated at the beginning of this section. What the introduction of the verifier newly guarantees is the bit-for-bit reproducibility of the reliability evaluation of each fact and the computation of L through their composition, whereby computational falsity is caught deterministically. On the other hand, the point that verification of the truth of the facts themselves is entrusted to spot audits, regardless of whether the verifier is present, is unchanged from the framework without the verifier.
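The two layers can be separated cleanly in code. The following sketch rests on loudly stated assumptions: `rule_judgment` is a trivial deterministic stand-in (a product of reliabilities) for the Rule-Judgment AI, not its actual computation, and the set of external records, the `source_ref` field, and the function `spot_audit` are hypothetical.

```python
from math import prod


def rule_judgment(facts: list) -> float:
    """Stand-in for the deterministic verifier: same facts in, same L out.

    It takes the declared facts as true and checks nothing about the world.
    """
    return prod(f["reliability"] for f in facts)


def spot_audit(declared_L: float, facts: list, external_records: set) -> dict:
    # Layer 1 (computational falsity): recompute L from the recorded facts.
    # Determinism allows an exact comparison rather than a tolerance band.
    computational_ok = rule_judgment(facts) == declared_L
    # Layer 2 (factual falsity): trace every fact to an authoritative record.
    # This lookup happens outside the verifier, which keeps the verifier sealed.
    factual_ok = all(f["source_ref"] in external_records for f in facts)
    return {"computational_ok": computational_ok, "factual_ok": factual_ok}


records = {"warrant-2024-017", "police-req-88"}   # hypothetical primary sources
facts = [
    {"statement": "a court warrant exists", "reliability": 0.99,
     "source_ref": "warrant-2024-017"},
    {"statement": "requesting officer identified", "reliability": 0.95,
     "source_ref": "police-req-88"},
]
honest = spot_audit(rule_judgment(facts), facts, records)
inflated = spot_audit(0.999, facts, records)
```

An inflated L fails the first check even though every fact is genuine, whereas a declaration citing a non-existent warrant would pass the first check and fail only the second. This is why neither layer alone suffices.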
Positioning as a reference standard
The fact that the Rule-Judgment AI is deterministic and that its specification is public and immutable has a clear correspondence with the certified reference standard in metrology. Just as in calibration, trust is generated by cross-referencing measured values against a public and immutable reference standard, the Rule-Judgment AI functions as a public and invariant standard against which legitimacy judgments of AI outputs are cross-referenced. With this, a new term is added to the structural correspondence between GUM and GLO shown in Section 3. That is, in addition to the correspondence between the standard format of the uncertainty budget (GUM) and the format of the legitimacy budget (GLO), there is correspondence between the certified reference standard and the Rule-Judgment AI. The operator of an AI system bears the obligation to make its own computation of L agree with the behavior of this public, invariant, and deterministic reference standard.
The limited role of the LLM, and the agreement gate
One concern must be answered here. If the Rule-Judgment AI adjudicates legitimacy, does this not, after all, reintroduce on the side of the verifier the very structure that this paper rejected at the outset—the inscrutability of the output process—namely “an opaque AI adjudicating legitimacy”?
To avoid this, the role of the machine-learning component in the Rule-Judgment AI is limited to mapping “facts” and “arguments” into the structured parameters defined by the Rules. As stated in Section 4.1, the Rules are definitions of the purposes, targets, conditions, and prohibitions under which AI outputs are permissible, in a form that can be mechanically verified, and that verification is performed in a quantitative and qualitative (0/1) parameter space. The final conformity determination is rendered as a mechanical cross-reference in this parameter space, not entrusted to the opaque discretion of an AI. The LLM (large language model)-type component bears the role of converting natural-language facts and arguments into structured parameters and does not hold the determination of legitimacy itself.
In this structure, the relationship between the AI system (the side that generates the output candidate and L, the prover) and the Rule-Judgment AI (the side that verifies it, the verifier) corresponds to the asymmetry between search and verification in computational complexity theory. Constructing a legitimate argument for an output candidate requires high computational cost, whereas verifying a given argument is far cheaper. Furthermore, by combining sampling-based spot audits and deterministic recomputation rather than verifying every output, the verification load is kept low independently of the load of output generation. This is why the present framework can hold at scale.
In operation, the release of an output candidate is permitted only when the L declared by the AI system and the L computed by the Rule-Judgment AI do not greatly diverge in the parameter space (the agreement gate). However, this agreement gate does not guarantee truth. If the AI system and the Rule-Judgment AI share the same kind of bias, the two may agree and yet still be in error (correlated error). The agreement gate is therefore a filter that reduces the release of inappropriate outputs in advance, not the ultimate guarantee. The prior agreement gate reduces the leakage of hallucinatory outputs, and the subsequent spot audit catches falsity—through this two-stage defense, the robustness of the framework is secured.
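As a rough illustration, the agreement gate can be written as a tolerance check. The text does not specify how divergence is measured; the sketch below assumes the simplest reading, an absolute difference on L itself with a hypothetical tolerance of 0.05, and the function name `agreement_gate` is likewise an assumption.

```python
def agreement_gate(declared_L: float, verifier_L: float, tolerance: float = 0.05) -> bool:
    """Permit release only if the two L values do not greatly diverge.

    Agreement is not truth: if both sides share the same bias, the gate
    still opens (correlated error), so spot audits remain necessary.
    """
    return abs(declared_L - verifier_L) <= tolerance


close_values = agreement_gate(0.90, 0.93)    # small divergence
far_values = agreement_gate(0.95, 0.60)      # large divergence
shared_bias = agreement_gate(0.95, 0.95)     # both may be wrong, yet they agree
```

The third call is the instructive one: the gate has no access to ground truth, so two identically biased estimates pass it. This is the sense in which the gate is a filter rather than the ultimate guarantee.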
The structure of incentives and responsibility derived from verification
The verification mechanism above gives a clear incentive structure to the behavior of AI system operators. In operation, the AI system constructs an “argument” combining “facts” for an output candidate and computes L using the Rule-Judgment AI. If an argument yielding a sufficiently high L is found, it outputs; if not, it abandons the output—this is nothing other than the prover’s search for a certificate described in the previous section.
If the deterrence described so far is the stick (“false declaration is invariably met with severe sanctions”), the carrot is its counterpart: unless the operator makes a false declaration, it is in principle exempted from liability for that output. This is a decisive advantage for operators. At present, AI operators face a dilemma: either hide behind opacity and invite distrust, or be exposed to unbounded liability for their outputs. VRAIO offers a third path: in exchange for declaring honestly and conforming to the Rules, the operator obtains a predictable exemption (safe harbor). Liability that was open-ended and unpredictable becomes bounded and predictable. This structure has precedents in existing legal institutions, such as platform safe-harbor provisions and the due-diligence defense in regulatory law. With both stick and carrot in place, honest declaration becomes a stable equilibrium.
However, one difficulty must be addressed here. Factual falsity (the second layer above) includes both intentional falsity and faultless misapprehension. For example, if an operator declared the existence of a warrant believing it to be true, but the record it referred to was itself erroneous, this cannot be condemned as an intentional lie. The two are indistinguishable in their outcome, namely “the declared fact was false”; the difference lies in intent, but intent cannot be observed externally.
This difficulty is handled by an observable criterion rather than by intent: whether the operator complied with the discipline of fact procurement, namely that the facts constituting the argument must be traceable to primary sources (Section 4.3). If a fact turns out to be false even though the prescribed procurement procedure was complied with, it is treated as faultless misapprehension and addressed through guidance for preventing recurrence and through the strengthening of spot audits for that operator (with the operator bearing the cost). By contrast, a sloppy declaration that neglected the procurement procedure is not granted the defense of misapprehension and is subject to sanction as negligence or falsity. Furthermore, if misapprehension recurs or becomes habitual even when the discipline is complied with, the response is escalated in stages from strengthened auditing to a review of exemption eligibility. This prevents the evasion of responsibility through the excuse that “it was an oversight” (moral hazard).
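The handling rule above depends only on observable inputs, so it can be written as a small decision function. The sketch below is illustrative: the names `Response` and `respond_to_false_fact` and the value of `HABITUAL_THRESHOLD` are assumptions, because the paper describes staged escalation without giving a number.

```python
from enum import Enum


class Response(Enum):
    GUIDANCE_AND_STRENGTHENED_AUDIT = "guidance and strengthened spot audits"
    EXEMPTION_REVIEW = "review of exemption eligibility"
    SANCTION = "sanction as negligence or falsity"


# Hypothetical threshold: the paper describes staged escalation but gives no number.
HABITUAL_THRESHOLD = 3


def respond_to_false_fact(procedure_followed: bool,
                          prior_misapprehensions: int) -> Response:
    """Choose the response to a declared fact that turned out to be false.

    Only observable inputs are used: whether the prescribed fact-procurement
    procedure was followed, and how often this operator has erred before.
    Intent is never an input.
    """
    if not procedure_followed:
        # No defense of misapprehension without the procurement discipline.
        return Response.SANCTION
    if prior_misapprehensions >= HABITUAL_THRESHOLD:
        # Recurrent misapprehension escalates even when the discipline was followed.
        return Response.EXEMPTION_REVIEW
    return Response.GUIDANCE_AND_STRENGTHENED_AUDIT
```

A real institution would likely escalate over more than two stages; the two-step version here only shows that the decision needs no judgment about intent.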
One further point: the relationship between exemption and victim compensation must be organized. The principle that “honest declaration is exempted” and the requirement that “compensation must be made when harm has actually occurred” are reconciled by dividing responsibility into two kinds. One is punitive and regulatory responsibility, which is exempted by honest declaration (safe harbor). The other is compensatory responsibility that restores the victim, which is not exempted. To satisfy the latter, the establishment of a compensation fund financed by operator contributions can be considered. Operators contribute to this fund together with the operating costs of VRAIO (including spot audits), and when harm occurs, the fund compensates the victim. This ensures that an honest operator is not exposed to catastrophic damages from a single misapprehension, while the relief of the victim is reliably achieved. If contributions are linked to track record, the incentive to neglect care is also curbed. This structure has precedents in existing institutions such as no-fault compensation funds and liability insurance pools.
Finally, there are cases in which an output, despite conforming to the Rules, has a harmful impact on society. This is not the operator’s fault but signifies a deficiency in the Rules themselves. The correct response is therefore not retroactive punishment of the operator but a review of the Rules (the feedback loop of Section 2.3 and Section 4.1). In this way, the pathways by which harm reaches an honest operator are limited to two, deficiency in the Rules (conforming yet harmful) and faultless misapprehension, and both are handled without unjustly punishing the operator. Taken together, this makes VRAIO not a mere surveillance apparatus but an institution in which operators have an incentive to participate.
The scope of implementation
What this section has presented is a specification of the properties that the Rule-Judgment AI must satisfy—sealedness, determinism, public specification, immutability, mechanical cross-referencing by means of mechanically verifiable parameters, and continuous evaluation and improvement through democratic procedures. These can be realized by existing technical elements such as trusted execution environments, deterministically configured local models, and cryptographic tamper-prevention techniques. Likewise, regarding the structure of exemption, compensation, and misapprehension handling shown in the latter half of this section, what this paper presents is a framework at the level of principle; the concrete design of legal institutions—the precise scope of exemption, the design of the compensation fund, and the procedures for determining misapprehension—is a matter for future work beyond the scope of this paper. The contribution of this paper lies in making explicit, as a structure, what kind of verification mechanism AI output governance requires and what kind of incentive and responsibility structure it requires on the basis of that mechanism.
4.5. An Application Example: Searching for a Missing Child in an FMPS
This section applies the framework discussed thus far to a concrete case. As the subject, we take the FMPS (Fully Monitored Public Space) [39], the context in which VRAIO was first proposed. An FMPS is an environment in which numerous cameras continuously record a public space and are connected to a central AI system. Whereas the original proposal [39] presented the basic conception of VRAIO, this section applies the legitimacy confidence L, the legitimacy budget, and verification by the Rule-Judgment AI introduced in Section 4.3 and Section 4.4 to the same subject in order to show their concrete operation.
As the setting, consider an output that searches for a missing child, identifies their current location, and reports it to the relevant parties. The central AI system accesses the footage recorded by each camera, traces the child’s movements to identify their current location, and attempts to output the information necessary for protection. This output is directly tied to the life and safety of the child, yet because it draws on highly sensitive public-space footage, its legitimacy must be rigorously examined in both purpose and content.
The legitimacy of the purpose, L_purpose
For the purpose of this output—searching for a missing child—to be legitimate, the facts underpinning that purpose must be declared as an argument. The Rules predetermine the types of facts that must be declared for this purpose class—for example, facts such as a court-issued search authorization or a search request from a guardian together with the identification of the police officer who confirmed it.
If these facts are appropriately declared and are traceable to primary sources—for example, if the declared search authorization actually exists in the records of the issuing court—the legitimacy of the purpose is high, and L_purpose takes a value close to 1.0. Conversely, a search lacking such underpinning facts, that is, a search with neither a legitimate request nor authorization, cannot argue the legitimacy of its purpose, and L_purpose remains at a low value.
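To make the dependence of L_purpose on declared facts concrete, the following sketch scores a purpose by how well traceable facts cover one of the fact combinations the Rules accept. Everything here is an assumption for illustration: the names `DeclaredFact`, `REQUIRED_FACTS`, and `purpose_legitimacy`, the fact-type labels, and the coverage-fraction scoring are not taken from the paper, which does not specify how L_purpose is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeclaredFact:
    kind: str        # e.g. "court_search_authorization"
    traceable: bool  # True if confirmed against a primary source


# Hypothetical Rules entry: each inner set is one acceptable combination of facts.
REQUIRED_FACTS = {
    "missing_child_search": [
        {"court_search_authorization"},
        {"guardian_search_request", "confirming_officer_id"},
    ],
}


def purpose_legitimacy(purpose_class: str, facts: list[DeclaredFact]) -> float:
    """Return an illustrative L_purpose in [0, 1].

    The score is the best coverage of any acceptable combination, counting
    only facts that are traceable to a primary source.
    """
    supported = {f.kind for f in facts if f.traceable}
    alternatives = REQUIRED_FACTS.get(purpose_class, [])
    return max((len(alt & supported) / len(alt) for alt in alternatives),
               default=0.0)
```

Under this scoring, a traceable court authorization alone yields 1.0, a guardian request without a confirming officer yields 0.5, and an authorization that cannot be traced contributes nothing.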
The legitimacy of the content, L_content
The legitimacy of the content is evaluated conditionally on the purpose. That is, the question is whether the scope of the search and the content of the output remain within the range reasonably necessary to achieve the purpose of protecting the child.
Legitimate content is, for example, the tracing of the child’s movements from leaving home up to their current location, and as output, the current location, related information in cases where abduction is suspected (information on vehicles involved, information on suspects, etc.), and the information necessary for the child’s protection. All of these directly serve the purpose of protection.
By contrast, information such as the current locations and past behavioral trajectories of all persons who happened to be near the child, and the further behavioral trajectories of persons the child came into contact with, can expand without limit if one attempts to search for it. Unless a reasonable ground directly serving the child’s protection is shown, such information is excessive information unnecessary for achieving the purpose. The inclusion of such excessive content lowers the legitimacy of the content and reduces L_content.
Here, the quantification of data minimization and proportionality described in Section 4.3 appears concretely. Even for the same act of accessing footage, content limited to the range necessary in light of the purpose of protection obtains a high L_content, whereas content expanded beyond the purpose loses L_content. The legitimacy of the content is determined only in relation to the purpose.
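The purpose-conditional character of L_content can be illustrated with a simple proportionality score. This is a sketch under assumptions: the category labels, the table `NECESSARY_CATEGORIES`, and the share-of-items scoring in `content_legitimacy` are hypothetical and stand in for whatever quantification Section 4.3 actually defines.

```python
# Hypothetical Rules entry: output categories that directly serve each purpose.
NECESSARY_CATEGORIES = {
    "missing_child_search": {
        "child_trajectory",
        "child_current_location",
        "involved_vehicle_info",
        "suspect_info",
    },
}


def content_legitimacy(purpose_class: str, output_categories: list[str]) -> float:
    """Return an illustrative L_content in [0, 1], conditional on the purpose.

    The score is the share of output items whose category is necessary for
    the declared purpose, so every excessive item lowers it.
    """
    if not output_categories:
        return 0.0  # assumption: an empty output argues nothing
    necessary = NECESSARY_CATEGORIES.get(purpose_class, set())
    within_scope = sum(1 for c in output_categories if c in necessary)
    return within_scope / len(output_categories)
```

The same list of output items scores differently under a different purpose class, which mirrors the point that content has no legitimacy of its own apart from the purpose it serves.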
The privacy of the budget and sealed verification
This case sharply illustrates the tension between auditability and privacy protection. The legitimacy budget that should accompany L—the search range, the child’s movements, and the search authorization referred to—is itself a mass of sensitive personal information that could identify the missing child. If the auditability of the budget were to be realized through its disclosure, the privacy of the very child who should be protected would, on the contrary, be exposed.
This tension is resolved by the principle of “auditable reachability rather than disclosure” described in Section 4.3 and by the Rule-Judgment AI of Section 4.4. The legitimacy budget is not disclosed. The sealed, deterministically operating Rule-Judgment AI enters directly into the budget, which contains sensitive content, to compute and verify L, yet it does not leak that content to the outside, outputting only L and the conformity determination. In a spot audit as well, the audit body can verify the truth of the declaration without exposing the content of the budget by recomputing L using the same Rule-Judgment AI. The same kind of requirement as in the privacy-enhancing technologies of the financial sector, namely verifying a property without disclosing the content [48], is satisfied here.
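The information flow described here, in which the budget goes in and only L and the conformity determination come out, can be sketched as an interface. The names `Verdict`, `make_sealed_judge`, and `toy_score`, the threshold, and the budget fields are hypothetical. The sketch shows only the shape of the interface: ordinary Python offers no actual sealing, which in practice would rest on mechanisms such as trusted execution environments.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Verdict:
    """The only data that leaves the sealed boundary."""
    legitimacy: float
    conforms: bool


def make_sealed_judge(score: Callable[[Mapping[str, object]], float],
                      threshold: float) -> Callable[[Mapping[str, object]], Verdict]:
    """Wrap a deterministic scoring function so callers receive only a Verdict."""
    def judge(budget: Mapping[str, object]) -> Verdict:
        value = score(budget)  # sensitive content is read only inside this call
        return Verdict(legitimacy=value, conforms=value >= threshold)
    return judge


def toy_score(budget: Mapping[str, object]) -> float:
    # Stand-in rule: legitimate only if an authorization reference is present.
    return 1.0 if budget.get("authorization_ref") else 0.0


judge = make_sealed_judge(toy_score, threshold=0.8)
budget = {"authorization_ref": "AUTH-001", "child_route": ["home", "park"]}
verdict = judge(budget)        # at release time
audit_verdict = judge(budget)  # the audit body recomputes with the same judge
```

Because the judge is deterministic, the audit verdict equals the original one whenever the declaration was computed honestly, and neither verdict carries any field of the budget.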
Detection of falsity
We show how the two-layer falsity detection discussed in Section 4.4 operates in this case.
Computational falsity—the case in which a high L is declared even though only a low L should properly be derived from the declared facts—is caught by recomputation with the deterministic Rule-Judgment AI. For example, if a high L_content is declared while including an excessive search range, recomputation at audit time with the same facts as input returns a different L, and this is exposed as a discrepancy.
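Detection by recomputation can be sketched as a spot audit over a log of past applications. The function name `spot_audit`, the log layout (`id`, `facts`, `declared_L`), and the sampling scheme are assumptions for illustration. The `recompute` argument stands in for the deterministic Rule-Judgment AI.

```python
import random
from typing import Callable, Sequence


def spot_audit(log: Sequence[dict],
               recompute: Callable[[dict], float],
               sample_rate: float,
               seed: int,
               tolerance: float = 1e-9) -> list[str]:
    """Recompute L for a random sample of logged applications.

    Each log entry is assumed to hold "id", "facts", and "declared_L".
    Returns the ids whose declared value differs from the recomputed one.
    """
    rng = random.Random(seed)  # seeded so the audit itself is reproducible
    flagged = []
    for entry in log:
        if rng.random() >= sample_rate:
            continue  # not selected in this audit round
        # Determinism means honest declarations reproduce exactly.
        if abs(recompute(entry["facts"]) - entry["declared_L"]) > tolerance:
            flagged.append(entry["id"])
    return flagged
```

With a sample rate of 1.0 every entry is recomputed. Lower rates reduce the audit load in proportion while leaving every false declaration with some probability of exposure.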
Factual falsity—the case in which a nonexistent search authorization is declared to “exist”—is caught by cross-referencing against authoritative external records. One need only confirm whether the declared search authorization actually exists in the records of the issuing court. Similarly, falsity that substitutes a different person as the search target is exposed through cross-referencing against the target’s identifying information, through traces of searching for multiple persons simultaneously, or through an indication of nonconformity from the side that received the report.
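Cross-referencing against an authoritative record reduces to a lookup and a field comparison. In the sketch below, `AuthorizationClaim`, `is_factually_supported`, and the dictionary `court_records` are hypothetical stand-ins. A real registry would be an external system, and its record format is not described in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorizationClaim:
    authorization_id: str
    issuing_court: str
    target_id: str  # identifier of the person the search is for


def is_factually_supported(claim: AuthorizationClaim,
                           court_records: dict[str, dict[str, str]]) -> bool:
    """Check a declared authorization against the issuing court's records.

    `court_records` maps authorization ids to the court's own entry and is a
    hypothetical stand-in for an authoritative external registry.
    """
    record = court_records.get(claim.authorization_id)
    if record is None:
        return False  # the declared authorization does not exist
    # The record must name the same court and the same search target.
    return (record["issuing_court"] == claim.issuing_court
            and record["target_id"] == claim.target_id)
```

The comparison on `target_id` covers the substitution case mentioned above: an authorization that exists but names a different person does not support the claim.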
It is worth noting here that this case is endowed with favorable conditions for falsity detection, because the essential facts underpinning the purpose (the search authorization, the confirming police officer) are all cross-referenceable against authoritative external records. More generally, VRAIO’s deterrence operates most strongly in high-stakes outputs with large social impact, where, as here, the essential facts are backed by authoritative records. Conversely, in domains of subjective judgment that lack external records against which to cross-reference, the detection of factual falsity becomes inherently difficult. This case is at once a typical example of the domain in which the framework functions most effectively and an indication of the boundary of its applicability.
Generalization to other high-stakes domains
The same structure applies to other high-stakes domains of AI output. In credit decisioning, for instance, the legitimacy of purpose is argued as conformity between the application and the lending rules, and the legitimacy of content as the restriction of the data used in the decision to the scope required by those rules (the exclusion of irrelevant attributes). In the output of findings for clinical decision support, likewise, the purpose is argued as the clinical indication, and the content as the validity of the findings and test values relied upon. In every case, the argument is anchored in auditable facts (lending rules, medical records, and the like) and is verified through recomputation of L by the Rule-Judgment AI and cross-referencing against external records. What differs from the FMPS case is the specific Rules applied and the type of required facts; the framework itself, the structuring and verification of legitimacy, is common to all.
Misapprehension and exemption
Finally, we apply the incentive and responsibility structure discussed in Section 4.4 to this case. The AI system operator constructs an argument based on the above facts and releases the output only when a sufficiently high L is obtained. So long as it declares honestly, the operator is in principle exempted from liability for this output.
Suppose that a declared search authorization did not in fact exist—even though the operator complied with the prescribed fact-procurement procedure—owing to an error in the record it referred to. This is treated not as intentional falsity but as faultless misapprehension, and is addressed through guidance for preventing recurrence and through strengthened auditing. On the other hand, if actual harm arose to the child or related parties as a result of this misapprehension, its compensation must be reliably made through the compensation fund financed by operator contributions. And if the output had a harmful consequence despite conforming to the Rules, this indicates a deficiency in the Rules themselves and is fed back into a review of the Rules.
In this way, even in the single case of searching for a missing child, all of the elements discussed in this section—the quantification of the legitimacy of purpose and content, the privacy protection of the legitimacy budget, two-layer falsity detection, and the structure of incentives and responsibility—operate in concert. The concrete context of the FMPS shows how the GLO/VRAIO framework can structure legitimacy not merely as an abstract principle but in real, life-and-death judgments.