The CMA Agentic Platform: Autonomous Asset Verification and Algorithmic Auditor Governance

Alhazmi, Abdulkarim Hamdan J.; Islam, Sardar M. N.; Prokofieva, Maria

doi:10.3390/fintech5020055

Systematic Review (Open Access)

by Abdulkarim Hamdan J. Alhazmi ¹,²,*, Sardar M. N. Islam ¹ and Maria Prokofieva ³

¹ Institute for Sustainable Industries & Liveable Cities (ISILC), Victoria University, Melbourne, VIC 3000, Australia
² Department of Administrative Sciences, College of Applied Sciences, Northern Border University, Arar 91431, Saudi Arabia
³ College of Business, Victoria University, Melbourne, VIC 3000, Australia
* Author to whom correspondence should be addressed.

FinTech 2026, 5(2), 55; https://doi.org/10.3390/fintech5020055

Submission received: 5 May 2026 / Revised: 12 June 2026 / Accepted: 15 June 2026 / Published: 17 June 2026

Abstract

Saudi Arabia’s audit market faces three governance challenges that existing frameworks may not fully address. These challenges concern a potential regulatory gap around autonomous AI accountability, a trust dimension that standard technology-adoption models may not fully capture, and limited mechanisms for independently verified ESG assurance under Vision 2030. This study adopts a conceptual design approach within the design science research tradition and proposes the CMA Agentic AI Platform as a practical response to these challenges. The platform comprises two segments. Segment 1 deploys autonomous drone swarms to verify corporate assets across four audit tasks (asset valuation, ESG compliance, anomaly detection and construction progress) using deep learning, thermal imaging and social-media cross-referencing. Segment 2 continuously monitors discretionary accruals and uses objective earnings-management data to inform auditor assignment and rotation decisions. This approach replaces subjective reputational assessments with transparent, quantifiable governance criteria. The platform is governed through the Triadic Agentic Framework, which extends classical agency theory by distributing authority across the Principal, the Human Agent and the AI Agent. The framework also operationalises Trust Expectancy as the primary adoption condition. The evidence base draws on two complementary streams: a PRISMA-guided systematic review and bibliometric analysis of thirty-nine peer-reviewed studies, and a documentary analysis of four national agentic-AI regulatory frameworks (SDAIA, MDDI/IMDA, NIST and ICO). The study contributes the concept of Algorithmic Accountability as a distinct governance domain, the Triadic Agentic Framework as an operational architecture for autonomous regulatory monitoring, and a reframing of the UTAUT trust construct for agentic-AI adoption in mature professional contexts. The platform converts theoretical governance into a regulatory architecture with direct implications for concentrated capital market regulators.

Keywords: agentic AI; asset verification; discretionary accruals; auditor rotation; algorithmic accountability; drone technology; ESG assurance; Saudi Arabia; CMA; UTAUT; Trust Expectancy; earnings management; Vision 2030

JEL Classification: M42

1. Introduction

1.1. Background

The Saudi Arabian audit market operates under a concentrated regulatory architecture in which the CMA permits only sixteen approved external audit firms to conduct engagements for listed companies. This concentration has created persistent governance challenges. Prolonged audit tenure contributes to familiarity bias between auditors and management [1,2]. Concentrated family-ownership structures heighten information asymmetry between principals and agents [3]. A persistent quality gap between Big Four and non-Big Four firms limits both the competitiveness and the coverage of the broader audit market. These structural vulnerabilities are not merely academic concerns; they represent systemic risks to the reliability of financial disclosures across the Saudi stock market.

The Saudi Arabian context warrants focused investigation for four reasons that extend beyond regional interest. First, Saudi Arabia operates the largest capital market in the MENA region by market capitalisation and is the only Gulf economy classified as an MSCI Emerging Market constituent at full inclusion weight. Its audit-governance architecture is therefore consequential for international institutional investors with exposure to the wider region. Second, the Kingdom’s audit market exhibits an unusually concentrated structure—sixteen CMA-approved firms serving all listed companies—that creates the precise governance conditions in which familiarity bias, related-party transactions and concentrated ownership most strongly interact. The structural vulnerabilities documented diffusely across the emerging-market audit literature are therefore observable in their most acute form. Third, Vision 2030 has positioned the Kingdom as a regional leader in AI adoption, with the SDAIA [4] establishing one of the world’s first mandatory national governance baselines for AI deployment. This regulatory environment allows agentic-AI architectures to be theorised within a real and enforceable governance frame rather than within a regulatory vacuum. Fourth, Saudi Arabia’s combination of rapid technological modernisation and conservative auditing institutions creates a natural laboratory for examining the Trust Expectancy barriers that this paper identifies as the primary obstacle to agentic-AI adoption. Nowhere else does an emerging market simultaneously deploy autonomous drone infrastructure at industrial scale [5] while maintaining the institutional caution that characterises mature audit professions. The lessons drawn from this study therefore extend directly to other concentrated capital markets, including the broader GCC, Singapore and emerging Asian financial centres facing comparable governance transitions.

This paper is a conceptual design study. It specifies the architectural and governance features of a proposed regulatory platform; it does not report pilot deployment, simulation results or empirical evaluation of the platform’s operational performance. The design science research tradition [6,7] frames this study as the artefact-specification stage of a multi-stage research programme. Pilot deployment, simulation and empirical validation are identified as future research directions in Section 5.2.

A critical and under-examined dimension of these risks is earnings management: the strategic manipulation of financial statements to disguise true income streams or profit levels. Traditional audit-quality frameworks have relied on discretionary accruals (DACC) as the primary empirical proxy for earnings management, with abnormally high or volatile DACC signalling managerial opportunism [8,9]. No existing governance framework within the CMA or SOCPA has systematically integrated DACC monitoring into auditor assignment or rotation decisions—a gap that the proposed platform directly addresses. Aramco’s adoption of drone technology for asset inspection [5] has demonstrated that autonomous physical verification is operationally feasible at industrial scale, inspecting seventy wells in a four-hour window compared with the previous rate of two wells per hour via land vehicles. However, as this paper documents throughout the literature review, no existing governance framework addresses what this research terms Algorithmic Accountability—the legal and professional standards governing autonomously generated audit evidence and the regulatory chain of responsibility when autonomous systems act without continuous human oversight.

1.2. The Proposed Platform

This study examines the design, governance architecture and implementation rationale for a CMA-administered Agentic AI Platform that performs two core regulatory functions, as summarised in Table 1.

Ref. [11] documents that AI agents and no-code tools have already entered accounting practice through specific applied case studies. This evidence confirms that the agentic-AI architecture proposed here is grounded in observed professional practice rather than abstract speculation. The platform operates on a principle derived from this paper’s reconceptualisation of UTAUT: the transition from passive AI assistance to agentic-AI governance requires not technical sophistication alone but the establishment of Trust Expectancy—the willingness of regulators, auditors and market participants to delegate legally consequential decisions to autonomous systems. The platform is not designed to replace the human judgement of CMA officials or external auditors. It is designed to elevate that judgement by ensuring that every decision is informed by independently verified, continuously updated and algorithmically auditable evidence that no human-directed procedure could replicate at equivalent speed or scale.

1.3. Research Questions

This study is guided by three interconnected research questions. Each question corresponds to a distinct governance gap within the existing audit governance framework that the proposed platform seeks to address. The first asks how the Triadic Agentic Framework can be operationalised within the CMA’s regulatory architecture to support Algorithmic Accountability, ensuring that autonomous audit evidence is subject to the same scrutiny as the financial records it examines. The second asks how continuous DACC monitoring can function as a primary mechanism within a composite signalling framework for preserving auditor independence. It further asks whether objective earnings-management data can address the theoretical tension between familiarity bias and the commitment-signalling value of long predecessor tenure. The third asks what technical and governance conditions are required for Agentic Drone Swarms to provide independently verified, real-time physical evidence of ESG compliance, replacing managerial self-reporting with continuously monitored, physically grounded sustainability data across Saudi Arabia’s highest-emission industrial sectors. Together, these questions move the research from theoretical proposition to practical specification: they ask not whether agentic-AI governance is desirable but whether it is achievable, and on what terms.

RQ1: How can the TAF be operationalised within the CMA’s regulatory architecture to support Algorithmic Accountability for autonomously generated audit evidence?
RQ2: How can continuous DACC monitoring complement or substitute for time-based mandatory rotation as part of a composite signalling framework for preserving auditor independence, and how might it address the theoretical tension between familiarity bias and predecessor-tenure commitment signalling?
RQ3: What technical and governance conditions would enable Agentic Drone Swarms to provide independently verified, real-time physical verification of ESG disclosures within Saudi Arabia’s Vision 2030 sustainability framework?

2. Methods

2.1. Research Design: Conceptual Design Study

This study adopts a conceptual design study approach grounded in the design science research tradition [6,7,12]. Design science research is the appropriate methodological frame when the research deliverable is not the analysis of an observed phenomenon but the specification of a designed artefact intended to address a documented practical problem. In this case, the artefact is a regulatory architecture designed to address governance gaps within Saudi Arabia’s concentrated audit market. The deliverable of this study is therefore the platform itself: its segments, governance architecture, decision protocols and threshold conditions, evaluated against empirically validated capability evidence drawn from the systematic bibliometric corpus (Section 2.5) and current international regulatory practice (Section 2.3.3).

The conceptual design study is distinct from an empirical case study in three respects that bear directly on the research design. First, the unit of analysis is the proposed artefact rather than an observed case. The study therefore aims for design adequacy—the demonstration that the proposed platform is internally consistent, empirically grounded and theoretically defensible—rather than for analytic generalisation from observed events. Second, the methodology emphasises problem relevance, artefact specification and evaluation against capability evidence [6] rather than within-case and cross-case analysis. Third, the study acknowledges that the platform does not yet exist as an operational system. The next research stages—pilot deployment, regulatory experimentation and empirical validation—fall outside the scope of the present conceptual specification and are identified as future research directions in Section 5.2.

This approach is methodologically consistent with the closest prior study in the audit-technology literature. Ref. [13] adopt a design science research methodology to specify and evaluate a drone-enabled inventory audit procedure. The present study extends that methodological precedent in two directions: from a single audit procedure to a complete regulatory governance platform, and from a single technology to an integrated agentic-AI architecture incorporating drone-based asset verification, multi-source data fusion and continuous discretionary-accruals monitoring.

The study is explanatory in purpose, seeking to explain how the TAF can be operationalised within the CMA’s regulatory architecture and why existing governance frameworks are insufficient to accommodate autonomous audit evidence. It is also evaluative, assessing the platform’s design adequacy against the documentary evidence base described below. Together, these two purposes structure the platform’s specification as a defensible, transparently reasoned regulatory proposition that subsequent empirical research can pilot, test and refine.

2.2. Data Collection Strategy

Evidence collection follows a multi-source triangulation strategy consistent with the rigour requirements of design science research [6] and draws on three complementary evidence streams. The first is the systematic bibliometric literature review conducted through the Web of Science Core Collection and detailed in Section 2.5 and Section 2.6. This review maps the conceptual structure of the field and confirms the governance gaps that the platform is designed to address. The second is a structured documentary content analysis of empirical studies on drone-inspection performance and AI-powered verification. This analysis provides the technical foundation for the platform’s capability claims. The third is the theoretical literature on agency theory and UTAUT, which provides the governance framework within which the platform’s design is justified. The integration of these three streams ensures that no single source drives the conclusions in isolation—a requirement for construct validity in a conceptual design study.

2.3. Documentary Content Analysis

2.3.1. Two-Stream Source Classification Framework

The documentary content analysis distinguishes two categories of sources by evidentiary function, following appraisal logic adapted from [14].

First, peer-reviewed empirical sources are primary academic studies indexed in Web of Science that report original quantitative performance data—accuracy rates, time-reduction figures and measurement-error margins—with clearly reported methodologies and replicable findings. These sources are drawn from the systematic bibliometric corpus described in Section 2.5 and form the empirical backbone of the platform’s technical-feasibility claims.

Second, government regulatory sources are national frameworks and standards initiatives published by named statutory authorities. They are used exclusively to anchor the platform’s governance architecture in current regulatory practice and are not used to support quantitative performance assertions. The four regulatory sources used in this study are listed in Section 2.3.3.

2.3.2. Document Selection Protocol

Peer-reviewed empirical sources entered the documentary analysis only after admission to the systematic corpus through the PRISMA workflow detailed in Section 2.5. Government regulatory sources were selected by three criteria applied in sequence: (i) publication by a named national statutory authority with explicit AI-governance jurisdiction; (ii) substantive engagement with the governance of autonomous or agentic-AI systems; and (iii) public availability at a verifiable institutional URL as of the documentary-analysis cut-off date (11 May 2026).

2.3.3. Documentary Evidence Base

The documentary content analysis draws on peer-reviewed empirical sources spanning drone-inspection performance, AI-powered detection accuracy and audit-technology efficiency, together with four government regulatory sources. These sources provide an independent foundation for each technical capability and governance claim within the platform’s design.

The most directly relevant audit evidence is provided by [13]. Their design science study compared drone-enabled inventory auditing against traditional manual procedures through controlled field testing in two agricultural settings—a sheep ranch and a cattle feedlot. The cattle-feedlot test documented a reduction in inventory-count time from 681 h to 19 h, alongside a simultaneous improvement in error rates from 0.15 per cent to 0.03 per cent, relative to the manual count performed by the company’s internal audit team. Ref. [13] also document the institutional barriers to adoption, including audit-firm concerns about being first movers and the inability of standard-setting bodies to provide guidance at the pace of technological development. These barriers directly inform the platform’s human-in-the-loop governance architecture and Trust Expectancy framing.
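The relative size of the improvements reported in [13] can be made explicit with a short calculation. The sketch below uses only the four figures quoted above; the function name `relative_reduction` is an illustrative choice and does not come from the cited study.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction when moving from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Figures quoted in the text for the cattle-feedlot test in [13].
count_hours_manual, count_hours_drone = 681.0, 19.0
error_pct_manual, error_pct_drone = 0.15, 0.03

time_cut = relative_reduction(count_hours_manual, count_hours_drone)
error_cut = relative_reduction(error_pct_manual, error_pct_drone)
speed_up = count_hours_manual / count_hours_drone

print(f"count-time reduction: {time_cut:.1%}")
print(f"error-rate reduction: {error_cut:.1%}")
print(f"speed-up factor: {speed_up:.1f}x")
```

On these figures the count time falls by roughly 97 per cent and the error rate by 80 per cent in relative terms.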

Ref. [15] validate autonomous drone inspection in industrial environments, documenting time and cost reductions of up to 50 per cent through thermal imaging and visual inspection across multiple sites. This evidence provides empirical grounding for the platform’s thermal-auditing capability across Saudi cement plants, petrochemical refineries and offshore infrastructure. Ref. [16] confirm reduced inspection time, lower costs, environmental benefits through a smaller operational footprint and accurate diagnosis of building envelope pathologies in hard-to-reach locations, directly supporting the platform’s ESG verification functions. Ref. [17], in their USDA Forest Service technical report, supply government-validated comparative evidence confirming significant time and cost reductions for infrastructure inspection applicable to Audit Task 4. Ref. [18] achieve classification accuracy exceeding 99 per cent in autonomous Scan-vs-BIM structural-discrepancy detection without manual intervention. Ref. [19] validate IoT-integrated multi-drone trajectory optimisation with F-measure performance above 88 per cent. Ref. [20] report an R² of 0.97 and ±1.02 per cent error in drone-based thermal-resistance measurement, setting the precision benchmark for the platform’s ESG physical-verification function.

For biological and agricultural asset verification, ref. [21] achieved 97.79 per cent accuracy in autonomous detection across a 22-hectare oil palm plantation using deep-learning architecture applied to UAV imagery. This evidence provides the empirical grounding for the platform’s biological asset detection capability across Saudi agricultural assets.

Four national regulatory frameworks anchor the platform’s governance architecture in current international practice. The SDAIA [4] sets a mandatory governance baseline for AI deployment in the Kingdom, covering data governance, model accountability, transparency, human oversight and risk management. Singapore’s Model AI Governance Framework for Agentic AI, the first national framework dedicated to agentic AI, specifies four governance dimensions: risk bounding, human accountability, technical controls and end-user responsibility [22]. The NIST framework [23] organises US agent-governance work around three pillars: industry-led standards, open-source protocols, and agent security and identity research. The ICO Tech Futures: Agentic AI report addresses data-protection compliance, supplier accountability and transparency requirements for agentic systems [24].

2.3.4. Data Extraction and Analysis Protocol

Each peer-reviewed empirical source was analysed through a structured extraction framework targeting quantitative performance metrics, methodological design, applicability to specific platform functions and limitations relevant to the Saudi industrial context. Each government regulatory source was analysed through a parallel framework targeting governance scope, accountability architecture and applicability to the Kingdom’s institutional environment. Cross-source validation required that every quantitative performance claim appearing in the manuscript be supported by at least one verified peer-reviewed empirical source, and that every governance-architecture claim be supported by at least one named regulatory source. Validity was assessed through criteria adapted from [14] (credibility, transferability to the Saudi context, dependability and confirmability), applied to each source before integration into the analysis.
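The cross-source validation rule can be read as a simple consistency check over a claim register. The following is a minimal sketch under stated assumptions: the `Source` and `Claim` records, the stream labels and the third example claim are hypothetical illustrations, not artefacts of the study.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    ref: str     # citation key, e.g. "[13]"
    stream: str  # "empirical" (peer-reviewed) or "regulatory" (government)


@dataclass
class Claim:
    text: str
    kind: str    # "quantitative" or "governance"
    sources: list[Source] = field(default_factory=list)


# Which evidence stream each kind of claim must draw on at least once.
REQUIRED_STREAM = {"quantitative": "empirical", "governance": "regulatory"}


def unsupported(claims: list[Claim]) -> list[Claim]:
    """Return claims lacking a source from their required evidence stream."""
    return [
        c for c in claims
        if not any(s.stream == REQUIRED_STREAM[c.kind] for s in c.sources)
    ]


claims = [
    Claim("Count time fell from 681 h to 19 h", "quantitative",
          [Source("[13]", "empirical")]),
    Claim("Human oversight is part of the national AI baseline", "governance",
          [Source("[4]", "regulatory")]),
    # Hypothetical failing case: a number backed only by a regulatory source.
    Claim("Inspection cost falls by half", "quantitative",
          [Source("[22]", "regulatory")]),
]

for claim in unsupported(claims):
    print("needs an eligible source:", claim.text)
```

A check of this kind flags only missing stream coverage; it says nothing about the quality of the supporting source, which the appraisal criteria address separately.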

2.4. Validity and Reliability

Construct validity is addressed by drawing on multiple evidence sources and by linking each platform-capability claim to its peer-reviewed empirical validation. Internal validity is addressed through pattern matching between the platform’s theoretical governance propositions and the empirical evidence base. This pattern matching tests whether the capabilities documented in the literature are consistent with the theoretical mechanisms that the platform proposes. Because the study specifies a designed artefact rather than analysing an observed case, the appropriate validity standard is design adequacy [6]: the demonstration that the platform’s capabilities, governance mechanisms and threshold conditions are empirically grounded, internally consistent and theoretically defensible. Findings are intended to inform subsequent pilot deployment and empirical evaluation rather than to claim statistical generalisation to a population of regulatory platforms. Reliability is supported through the two-stream source classification, the structured data-extraction protocol and the explicit exclusion criteria that prevent promotional or unverifiable material from entering the evidence base.

2.5. Web of Science Search and PRISMA Review

A systematic literature search was conducted on 29 May 2026 using the Web of Science Core Collection, following the PRISMA 2020 framework [25] to ensure transparency and reproducibility. The review protocol, search strategy, screening procedures, eligibility criteria and bibliometric analysis methods were registered on the Open Science Framework (OSF) prior to manuscript finalisation and are openly available at https://doi.org/10.17605/OSF.IO/892CH. The OSF registration includes the complete Web of Science search export, the screening log, the ten-criterion eligibility framework, the VOSviewer (version 1.6.20) thesaurus and the appraisal grid adapted from [14], providing a transparent and reproducible methodological record.

A single, fully parenthesised Boolean search was executed within the Web of Science platform, combining a technology block with an audit-domain block:

(“Agentic AI” OR “AI Agent*” OR “agentic artificial intelligence” OR “autonomous AI” OR “drone*”) AND (“audit*” OR “accounting” OR “financial report*”)
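For reproducibility, the two-block structure of the search string can be assembled programmatically. This is a sketch only: the helper `or_block` is a hypothetical name, straight quotation marks are used in place of the typeset ones, and no Web of Science field tag is added because the text above does not specify one.

```python
def or_block(terms: list[str]) -> str:
    """Quote each term and join the terms with OR inside one parenthesised block."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


# Terms copied from the search string reported in the text.
technology = [
    "Agentic AI",
    "AI Agent*",
    "agentic artificial intelligence",
    "autonomous AI",
    "drone*",
]
audit_domain = ["audit*", "accounting", "financial report*"]

query = f"{or_block(technology)} AND {or_block(audit_domain)}"
print(query)
```

Keeping the two term lists separate makes it straightforward to document later changes to either block without retyping the full string.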

The search returned 643 records. Application of English-language and Document Type = Article filters (Early Access included) reduced this to 466 records, with 177 records excluded by the filters. The excluded records included one book chapter and one retracted publication identified at the document-type stage. PRISMA 2020 screening was then applied in two rounds. Title and abstract screening of the 466 records yielded 51 candidate studies for full-text review, with 415 records excluded as not substantively addressing audit, accounting, financial reporting or assurance in conjunction with agentic-AI or drone technology. Full-text eligibility review then applied ten explicit exclusion criteria (recorded in Appendix B) to the 51 candidate studies, identifying records that pattern-matched on search keywords without substantive engagement with the review’s scope. Twelve studies were excluded at this stage. The resulting final synthesis corpus comprises 39 peer-reviewed studies, illustrated in Figure A1 in Appendix A.
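The record counts in this workflow can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the stage labels and variable names are illustrative.

```python
identified = 643  # records returned by the Web of Science search

# (stage label, records excluded at that stage), as reported in the text.
exclusions = [
    ("language and document-type filters", 177),
    ("title and abstract screening", 415),
    ("full-text eligibility review", 12),
]

remaining = identified
trail = [("records identified", remaining)]
for label, excluded in exclusions:
    if excluded > remaining:
        raise ValueError(f"{label}: cannot exclude more records than remain")
    remaining -= excluded
    trail.append((label, remaining))

for label, count in trail:
    print(f"{label}: {count}")
```

The running totals reproduce the 466, 51 and 39 records stated for the successive stages.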

Justifications for Using Web of Science

The Web of Science Core Collection was selected as the primary and sole formally searched database for this review. Ref. [26] confirm Web of Science as a reliable foundation for bibliometric research, citing its indexing standards and consistent citation metadata. The combination of the Science Citation Index Expanded (SCIE) and the Social Sciences Citation Index (SSCI) under unified metadata standards supports the interdisciplinary scope of this review.

Ref. [27] identify Web of Science as the more selective of the major scientific databases in social-science and management disciplines, with stricter inclusion criteria that raise the baseline quality of the candidate pool and reduce predatory-journal noise. Ref. [28] further establish Web of Science as a premier source for cross-disciplinary network analysis, providing the empirical basis for the keyword co-occurrence mapping applied in Section 2.7. The single-database approach is also consistent with prior bibliometric scholarship on AI and audit technology in the Saudi context, including [29], who relied on Web of Science as the sole search source in their study of AI-integrated drone technology and audit performance in emerging markets.

Multi-database searching, while standard in clinical systematic reviews, introduces deduplication complexity and metadata inconsistency that would have compromised the reproducibility of the bibliometric analysis. The single-database approach adopted here is methodologically appropriate for a conceptual bibliometric systematic review [26,27,29].

2.6. Bibliometric Analysis

Following data collection, bibliometric analysis was conducted using VOSviewer to map keyword co-occurrence patterns across the 39 studies retained at the eligibility stage of the PRISMA workflow described in Section 2.5. Prior to analysis, a systematic data-cleaning and label-normalisation protocol was applied to ensure terminological consistency across the corpus. Variant spellings and conceptually identical terms were unified through a structured thesaurus file. The terms AI, artificial-intelligence and artificial intelligence (ai) were standardised to artificial intelligence; audit to auditing; drones to drone; systems to system; and AI agents to agentic AI. Most consequentially, uav and unmanned aerial vehicle (uav) were merged with drone, since these terms are used interchangeably in the literature. Before this merge, the network was fragmented into two artificially separated technical hubs. This normalisation step was critical because inconsistent labelling artificially divides conceptually identical terms across separate nodes, inflating the number of clusters and undermining the validity of cluster-level interpretation [30].
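The label-normalisation step described above amounts to a lookup from variant labels to a canonical label. The sketch below is an illustration in Python rather than the thesaurus file itself: the mapping pairs are taken from the text, while lower-casing before lookup and removal of duplicates within a study are assumptions made here.

```python
# Variant label -> canonical label, following the merges listed in the text.
THESAURUS = {
    "ai": "artificial intelligence",
    "artificial-intelligence": "artificial intelligence",
    "artificial intelligence (ai)": "artificial intelligence",
    "audit": "auditing",
    "drones": "drone",
    "systems": "system",
    "ai agents": "agentic ai",
    "uav": "drone",
    "unmanned aerial vehicle (uav)": "drone",
}


def normalise(keywords: list[str]) -> list[str]:
    """Map each keyword to its canonical label, keeping first-seen order."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for keyword in keywords:
        label = keyword.strip().lower()       # assumption: case-insensitive match
        label = THESAURUS.get(label, label)   # unmapped labels pass through
        if label not in seen:                 # assumption: one node per study
            seen.add(label)
            cleaned.append(label)
    return cleaned


print(normalise(["UAV", "Drones", "AI", "audit", "trust"]))
```

In this example the two drone variants collapse into a single label, which is the effect that removes the artificially separated technical hubs.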

Following cleaning, VOSviewer generated a keyword co-occurrence network using a minimum co-occurrence threshold of two. The resulting overlay visualisation—illustrated in Figure A2 in Appendix A—plots each retained keyword as a node sized by occurrence frequency, with links representing co-occurrence within the same study and node colour mapped to the average publication year of the studies in which the keyword appears (range 2021–2025).
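The counting behind a keyword co-occurrence network can be sketched in a few lines. This is not VOSviewer's implementation: the four-study corpus is a hypothetical toy example, and the threshold of two is assumed here to apply to the number of studies in which a keyword occurs.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(studies: list[list[str]], min_occurrences: int = 2):
    """Return (node sizes, link weights) for keywords meeting the threshold."""
    # Node size: number of studies in which each keyword appears.
    occurrences = Counter(kw for kws in studies for kw in set(kws))
    kept = {kw for kw, n in occurrences.items() if n >= min_occurrences}
    # Link weight: number of studies in which both keywords appear together.
    links: Counter = Counter()
    for kws in studies:
        for a, b in combinations(sorted(set(kws) & kept), 2):
            links[(a, b)] += 1
    return {kw: occurrences[kw] for kw in kept}, dict(links)


# Hypothetical toy corpus of already-normalised keyword lists.
studies = [
    ["drone", "auditing", "trust"],
    ["drone", "agentic ai", "artificial intelligence"],
    ["drone", "auditing"],
    ["agentic ai", "drone", "cognition"],
]

nodes, links = cooccurrence(studies)
print(nodes)
print(links)
```

In this toy corpus the drone keyword is the only node linked to every other retained keyword, which illustrates how a hub can connect terms that never appear together in the same study.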

The keyword co-occurrence network reveals three analytically significant patterns. First, the merged drone node emerges as the largest and most highly connected hub in the network. This centrality confirms drone-based physical verification as the central organising concept across the corpus and provides direct empirical support for the manuscript’s proposition that drone evidence has matured from an ancillary tool into a foundational evidence-collection modality for autonomous audit. Second, the network displays a clear temporal gradient running from right to left. The oldest publications (blue, 2021–2022) cluster around the technical drone-engineering and energy-audit terms on the right (system, optimisation, data fusion, energy efficiency, energy audit and environment), while the newest publications (bright yellow, 2025) cluster tightly on the left around agentic AI, artificial intelligence and cognition. This right-to-left chronology mirrors the field’s intellectual evolution from physical drone engineering through audit-application research toward autonomous AI governance, which is the trajectory within which the present paper situates itself. Third, the bright-yellow agentic-AI cluster on the left and the audit-practice cluster in the upper centre (auditing, trust, farmer) do not co-occur directly. Both link inward to the central drone hub, but no direct edge connects agentic AI to auditing. This visual gap is the bibliometric representation of the governance vacuum that the manuscript identifies as its primary research problem: the technical literature on agentic AI has not yet been integrated with the professional and regulatory literature on audit accountability. It is precisely this integration that the TAF and the CMA platform proposed in this study seek to achieve. Ref. [31] provide an integrative framework and research agenda for AI in accounting that confirms this conceptual integration as a recognised priority within the broader AI-accounting research literature.

2.7. Keyword Occurrences

Figure A2 in Appendix A presents the keyword co-occurrence map generated by VOSviewer for the 39 retained studies. Each node represents a keyword, sized by occurrence frequency and connected by lines indicating co-occurrence within the same study. The colour gradient—dark blue through teal and green to bright yellow—represents the average publication year of the studies in which each keyword appears, spanning 2021 through 2025.

Four analytical observations emerge from the map. First, after merging drone, uav and unmanned aerial vehicle (uav), the drone node occupies the structural centre and is the single largest node by occurrence frequency. Its centrality reflects the manuscript’s core argument that drone-based physical verification is the organising substrate connecting the technical-engineering literature, the audit-application literature and the emerging AI-governance literature.

Second, three thematic groupings radiate outward from the drone node, connected by a set of bridge nodes. The left-hand cluster (bright yellow, 2025 publications) groups agentic AI, artificial intelligence and cognition and represents the frontier governance and cognitive-substrate literature—[10,32,33,34,35,36]. The tight clustering visually validates the position that agentic AI, AI governance and Algorithmic Accountability must be theorised together. The upper-centre cluster (teal to green, 2023–2024 publications) groups auditing, trust and farmer, capturing the audit-application and adoption literature represented by [37,38,39,40]. The position of the trust node directly between auditing and the central drone hub provides bibliometric confirmation that Trust Expectancy and adoption-readiness constructs—drawn from the UTAUT lineage—are the dominant explanatory vocabulary in this strand. This finding justifies the retention of UTAUT and Trust Expectancy as the governing theoretical lens of Section 4. The right-hand cluster (blue to teal, 2021–2023 publications) groups system, optimisation, data fusion, energy efficiency, energy audit and environment, representing the technical-engineering and energy-assurance foundation: [13,20,41,42,43,44,45,46]. The older temporal signature of this cluster, relative to the agentic-AI cluster on the left, captures the field’s progression from passive technology adoption toward autonomous algorithmic agents. The colour gradient from blue through teal to yellow makes this progression directly visible. A bridge subset of intermediate nodes—industry 4.0, design science, big data, challenges, future and logistics—sits between the peripheral clusters and the drone hub, acting as connective tissue where the literature is beginning to integrate the agentic-AI, audit-application and technical-drone strands. The teal colour of these bridge nodes (2023–2024) suggests this integration is recent and incomplete.

Third, the absence of a direct edge between the left-hand agentic-AI cluster and the upper-centre auditing cluster is analytically significant. Despite their conceptual interdependence (autonomous AI agents acting within audit contexts create direct accountability challenges), these two groups co-occur only through the central drone hub or the bridge nodes, never directly. This visual separation is the bibliometric expression of the governance vacuum identified in Section 1.3: the technical and governance literature on agentic AI [10,30] has not yet been substantively integrated with the professional and regulatory literature on audit accountability [13,37,40,47]. It is precisely this integration that the TAF introduced in Section 3 and the CMA platform proposed throughout the study seek to provide.

Fourth, consistent with [29] in the Saudi financial-reporting context, the map shows artificial intelligence as the highest-frequency conceptual cluster across the corpus. Within the present sample, however, artificial intelligence has been overtaken at the temporal frontier by its agentic variant. The brightest-yellow node is agentic AI itself, not artificial intelligence, confirming that the centre of gravity of the literature has shifted from generic AI capabilities toward the specific governance challenges of autonomous, goal-directed AI systems. This shift is the empirical anchor on which the manuscript’s autonomous-platform argument rests.

Methodological and technical limitations of the study are consolidated in Section 5.1.

3. Results

3.1. Segment 1: Autonomous Asset Verification

3.1.1. Operational Architecture

The Asset Verification segment operationalises Audit Tasks 1, 2, 3 and 4 as mapped in this paper’s TAF. The four tasks span asset existence and valuation verification, compliance and ESG monitoring, risk and anomaly detection, and construction progress assessment. When a company applies for listing on the Saudi Stock Exchange, or when a scheduled verification cycle is triggered by the platform’s continuous monitoring algorithms, the AI Agent autonomously initiates a multi-stage verification protocol. No human scheduling, mission planning or real-time supervision is required at any stage of execution. The architecture of this protocol is illustrated in Figure 1, which maps the four-stage workflow from financial-data extraction through autonomous drone deployment, multi-source data fusion and AI-powered verification to the final governance outcome.

Stage 1: Financial Data Reconciliation

The platform extracts asset data from the company’s financial records, cross-references the extracted figures against submissions to the Zakat, Tax and Customs Authority (ZATCA, hereafter “ZATCA Authority”), and identifies any discrepancies requiring physical verification. This stage implements what this paper terms Agentic Data Mining: the continuous, autonomous monitoring of big-data repositories for anomaly detection across multiple financial data sources simultaneously [48]. Unlike passive AI tools, which require an auditor to query a database and interpret results, the AI Agent monitors these repositories continuously. It identifies deviations between what firms report, what the ZATCA Authority holds on record, and what bank mortgage data confirms about the existence, condition and valuation of each registered asset. When a discrepancy is detected—for example, a building reported at a valuation materially higher than the average of three professional real-estate agent assessments held by the bank—the system does not flag the item for human scheduling. It autonomously advances the process to Stage 2 and triggers drone deployment without human instruction.
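As a minimal sketch of the Stage 1 trigger, the valuation example above can be expressed as a simple threshold test. The function name, the 10 per cent tolerance and the figures are illustrative assumptions; the paper does not specify a materiality threshold.

```python
from statistics import mean


def needs_physical_verification(reported_value, agent_valuations, tolerance=0.10):
    """Return True when the reported valuation exceeds the mean of the
    independent valuations by more than `tolerance` (a fraction).

    `tolerance` is a hypothetical materiality parameter, not a value
    taken from the paper.
    """
    if not agent_valuations:
        raise ValueError("at least one independent valuation is required")
    benchmark = mean(agent_valuations)
    return reported_value > benchmark * (1 + tolerance)


# Hypothetical figures in SAR: one reported value and three
# real-estate agent assessments held by the bank.
bank_assessments = [10_000_000, 10_400_000, 9_900_000]
advance_to_stage_2 = needs_physical_verification(12_500_000, bank_assessments)
```

In this sketch a True result stands for the autonomous hand-off to Stage 2; a production rule would presumably also weigh asset class and data quality.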

Stage 2: Autonomous Drone Deployment

On detection of a verification requirement, the AI Agent autonomously selects the appropriate drone platform based on asset type—industrial, agricultural, construction or offshore—and calculates optimal flight trajectories using fuzzy linear fractional transportation programming integrated with IoT-enabled multi-drone coordination [19]. This trajectory-optimisation model continuously adjusts flight altitude in real time based on terrain morphology, fuel constraints, flight restrictions and physical threats. It achieves precision, recall and F-measure performance metrics all exceeding 88 per cent across emergency and complex industrial environments. The platform deploys multiple drones simultaneously across large Saudi industrial facilities and transmits verified physical-asset data to the AI Agent for real-time reconciliation against financial ledger records. No human operator is required to schedule, supervise or interpret individual flights. As Figure 1 illustrates, this stage encompasses the full range of Saudi industrial geographies—from cement plants and petrochemical refineries to agricultural facilities and offshore oil infrastructure. Each setting presents distinct operational constraints that the autonomous trajectory system navigates without modification to its core logic.
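For readers less familiar with the three metrics cited above, the following sketch shows how precision, recall and F-measure are conventionally computed from detection counts. The counts are hypothetical and are not taken from [19].

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall and the balanced F-measure (F1)
    from raw detection counts."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts: 90 correct detections, 10 false alarms, 10 misses.
precision, recall, f1 = precision_recall_f1(90, 10, 10)
meets_88_per_cent = min(precision, recall, f1) > 0.88
```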

Stage 3: Multi-Source Data Fusion

The platform integrates drone-captured imagery with a comprehensive constellation of internal and external data sources, as depicted in the Data Fusion Hub in Figure 1. Internally, the platform draws on three categories of data. The first is bank mortgage data, including professional real-estate agent valuations, photographs, locations and asset-condition reports. The second is loan firm documentation, including income operations, cash flow evidence and financial reports. The third is firm-provided account data submitted directly to the platform. Externally, the platform crawls social-media content—company advertisements and real-estate agent postings with asset information and pricing—and cross-references these against formal declarations. This social-media verification mechanism directly addresses one of the most persistent forms of information asymmetry in the Saudi market: a manager cannot simultaneously maintain two materially different asset valuations across formal regulatory channels and informal social-media channels without the platform detecting the discrepancy. Published reports and academic literature further enrich the contextual benchmarking layer. The fused data stream is then passed to Stage 4 for AI-powered verification, a transition that Figure 1 marks with the label “To Stage 4: AI Verification”. This confirms that the entire progression from anomaly detection to physical evidence fusion is executed without human intervention at any intermediate decision point.
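One simple way to picture the cross-channel check described above is to compute the relative spread of valuations for a single asset across sources. This is only an illustrative sketch: the source labels and figures are hypothetical, and the paper does not define how fused values are compared.

```python
def valuation_spread(values_by_source):
    """Relative spread (max - min) / min of one asset's valuation
    across the data sources in which it appears."""
    values = list(values_by_source.values())
    if len(values) < 2:
        raise ValueError("at least two sources are needed for a comparison")
    low, high = min(values), max(values)
    if low <= 0:
        raise ValueError("valuations must be positive")
    return (high - low) / low


# Hypothetical valuations (SAR) for the same building in three channels.
asset_values = {
    "regulatory_filing": 8_000_000,
    "bank_mortgage_file": 8_200_000,
    "social_media_listing": 12_000_000,
}
spread = valuation_spread(asset_values)
```

A large spread would indicate that the formal and informal channels disagree materially, which is the condition the fusion stage is designed to surface.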

Stage 4: AI-Powered Verification

Deep-learning models process the fused multi-source data stream through three parallel analytical pipelines, each validated by independent empirical literature reviewed in this paper. First, the Faster R-CNN architecture achieves more than 97 per cent accuracy in asset detection and counting. It autonomously locates, counts and precisely maps individual physical assets across large and complex sites, including biological assets in agricultural settings, without human interpretation of imagery [21]. Second, transformer-based anomaly detection identifies deviations from expected operational patterns without requiring human operators to predefine what an anomaly looks like. This enables the system to distinguish between normal asset conditions and anomalous deviations across diverse industrial environments where the range of possible irregularities is too varied to be exhaustively catalogued in advance [36]. Third, thermal imaging validates the operational status of physical assets with a measurement error of only ±1.02 per cent. It produces location-specific evidence of energy inefficiency, structural deterioration and invisible hazards—including heat leakages and insulation failures that managers might present as fully operational infrastructure in their financial disclosures [20]. The right-hand panel of Figure 1 illustrates these three analytical outputs—Asset Detection and Counting, Anomaly Detection, and Thermal Imaging—converging on the determination of objective verification status and triggering the downstream audit workflow. Where discrepancies persist after data fusion and AI analysis, the platform escalates to CMA human review. Where assets are confirmed, the registry is updated, and the verification record is permanently archived for audit-trail purposes.
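The closing escalation rule of Stage 4 can be sketched as follows. The class and field names are hypothetical, and the rule that any single unresolved discrepancy triggers escalation is an assumption; the paper does not state how the three pipeline outputs are combined.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    CONFIRMED = "registry_updated_and_archived"
    ESCALATED = "escalated_to_cma_human_review"


@dataclass
class PipelineResult:
    name: str            # e.g. "asset_detection", "anomaly_detection", "thermal"
    discrepancy: bool    # True if the pipeline found an unresolved mismatch


def verification_status(results):
    """Assumed rule: escalate if any pipeline reports a discrepancy."""
    if any(r.discrepancy for r in results):
        return Status.ESCALATED
    return Status.CONFIRMED


outcome = verification_status([
    PipelineResult("asset_detection", False),
    PipelineResult("anomaly_detection", False),
    PipelineResult("thermal", True),
])
```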

3.1.2. Evidence Base

The technical feasibility of each stage of the Asset Verification segment is independently established by empirical studies reviewed across this paper’s literature, summarised in Table 2.

No single existing study integrates all four capabilities within an audit governance context. The platform proposed in this study is the first theoretical architecture to integrate them into a unified, autonomously triggered verification system operating under a formal Algorithmic Accountability framework.

3.1.3. Addressing the Greenwashing Risk

The Asset Verification segment directly addresses one of the three core research gaps identified in this paper: the absence of any existing study examining how Agentic Drone Swarms could provide objective, real-time physical verification of ESG disclosures within Saudi Vision 2030’s sustainability framework [50].

Saudi Arabia’s Vision 2030 agenda mandates a transition toward sustainable industrial practices across the Kingdom’s highest-emission sectors, including petrochemicals, cement and energy. Current ESG disclosure practices within these sectors rely on managerial self-reporting, which is structurally vulnerable to greenwashing—the presentation of sustainability claims that do not reflect operational reality. The platform addresses this vulnerability directly through the Greenwashing Risk Mitigation capability illustrated in the bottom-right panel of Figure 1.

Drone-mounted infrared cameras autonomously measure thermal resistance across physical asset surfaces under variable environmental conditions. This produces location-specific, independently verified evidence of whether a facility’s environmental performance matches what its management reports [20]. The thermal data captured by the drone swarm cannot be altered retrospectively by management, cannot be selectively presented to favour particular site conditions, and does not depend on the auditor’s physical presence at a remote or hazardous facility. It therefore provides what self-reported ESG disclosures structurally cannot: independently generated, continuously updated and physically grounded evidence of environmental compliance.

This mechanism directly operationalises Research Question 3 of this paper—the theoretical basis for proposing Agentic Drone Swarms as a mechanism for real-time ESG verification. It translates the conceptual proposition into a technically specified and empirically grounded governance mechanism with immediate applicability to Saudi Arabia’s Vision 2030 regulatory environment.

3.1.4. Audit Task Mapping: From Theoretical Framework to Platform Architecture

The Asset Verification segment does not operate as a generic inspection tool. Each of its four analytical pipelines maps directly onto specific audit tasks derived from the TAF developed in this paper, translating theoretical governance propositions into operationally specified platform functions. The four audit tasks addressed by Segment 1—asset verification and valuation (Task 1), compliance and ESG integrity (Task 2), risk and anomaly detection (Task 3), and construction progress and quality control (Task 4)—collectively cover the range of physical evidence required to close the information-asymmetry gaps that define the Saudi audit-quality problem. Each task is described below with its data inputs, analytical purpose, and agency-theory alignment as implemented within the CMA platform.

Audit Task 1: Asset Verification and Valuation

The first and most foundational audit task the platform executes is the autonomous verification of the existence, condition and valuation accuracy of corporate assets declared in financial statements submitted to the CMA. This task directly addresses the information-asymmetry construct at the core of agency theory: the gap between what managers know about the true condition and value of the assets they steward, and what shareholders and regulators can independently verify [51]. In the Saudi context, this gap is structurally amplified by concentrated family ownership and the prevalence of related-party transactions. These conditions create a setting in which asset overstatement (the declaration of ghost inventory, the reporting of impaired assets as fully operational, or the inflation of property valuations) represents a persistent and documented form of earnings management [2]. Ref. [52] extend the drone-enabled inventory observation precedent established in [13] by examining its implications for auditor liability, providing direct legal-context support for the platform’s Audit Task 1 architecture.

The platform addresses this task through four parallel data streams, each targeting a distinct dimension of asset verification. Image data collected by the drone swarm is processed through the Faster R-CNN deep-learning architecture. This architecture autonomously detects, counts and precisely locates individual physical assets across large and complex sites—including biological assets in agricultural settings—with accuracy consistently exceeding 97 per cent and false-positive rates tenfold lower than conventional machine-learning approaches [21]. This capability eliminates the sampling limitations of traditional manual verification, in which an auditor visits a subset of assets during scheduled site visits and extrapolates findings to the entire population. That procedure creates a predictable window of opportunity for managers to present favourable conditions during the inspection period while concealing unfavourable ones between visits. Video data provides continuous real-time monitoring of asset counts and movements, offering visual audit documentation that cannot be altered retrospectively and that the platform archives in its permanent evidence record. Geospatial data confirms that declared assets occupy the physical locations recorded in the asset registry, using GPS coordinates to verify that a building reported at a specific location in Riyadh or Jeddah actually exists at those coordinates and not merely in the financial statements. Thermal data provides the deepest layer of verification, detecting the operational status of physical assets through heat signatures. This reveals idle machinery presented as active, structural deterioration presented as sound infrastructure, and insulation failures presented as fully functional facilities. Table 3 summarises the complete data architecture of Audit Task 1 as implemented within the CMA platform.
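The geospatial check described above amounts to comparing a declared coordinate with a drone-observed one. A minimal sketch using the standard haversine great-circle formula is given below; the 50-metre tolerance and the coordinates are illustrative assumptions, not parameters from the paper.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points
    given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def location_matches(declared, observed, tolerance_m=50.0):
    """True when the observed position lies within `tolerance_m`
    (a hypothetical threshold) of the position in the asset registry."""
    return haversine_m(*declared, *observed) <= tolerance_m


# Approximate city-centre coordinates, used purely for illustration.
riyadh = (24.7136, 46.6753)
jeddah = (21.4858, 39.1925)
same_site = location_matches(riyadh, riyadh)
wrong_city = location_matches(riyadh, jeddah)
```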

Audit Task 2: Compliance and ESG Integrity

The second audit task the platform executes extends the physical verification function beyond financial asset accuracy into the domain of environmental, social and governance (ESG) compliance. This task directly addresses Research Question 3 of this paper and the ESG assurance deficit identified by [50] as one of the most significant unresolved gaps in the AI-accounting literature. In the Saudi context, Vision 2030 mandates a transition toward sustainable industrial practices across the Kingdom’s highest-emission sectors. As long as sustainability disclosures continue to rely on managerial self-reporting, the absence of independently verified ESG data creates a structurally uncloseable greenwashing risk. The platform addresses this risk by deploying the same drone swarm infrastructure used for financial asset verification to collect environmental sensor data. This provides objective, real-time physical evidence of whether industrial operations comply with the environmental, safety and boundary conditions to which listed companies attest in their sustainability disclosures.

Environmental sensor data collected by the drone swarm monitors pollutant levels, including CO₂ and methane, across industrial facilities. It provides the CMA with objective evidence of whether emissions comply with Saudi environmental regulations, directly mitigating the moral hazard of managers who might otherwise suppress or understate environmental costs to protect short-term profitability [50]. Ref. [53] further demonstrate that multi-agentic architectures can monitor carbon-related operational constraints across global supply chains. This evidence provides additional scholarly support for the platform’s CO₂ and methane emission-verification function under Vision 2030 sustainability mandates. Geospatial data uses high-precision GPS to verify that physical activities, including mining, drilling and industrial processing, remain within the authorised legal boundaries declared in operating licences. This provides the principal, comprising shareholders and the CMA as regulator, with objective proof that the agent is operating within legal mandates rather than encroaching on protected areas or exceeding permitted extraction zones [54]. Image and video data enable the platform to conduct autonomous safety audits via drone footage, verifying adherence to occupational health and safety standards across active construction and industrial sites. This visual evidence cannot be manipulated in a written management report, and the platform archives it for regulatory inspection on demand [55]. Thermal and infrared data detect invisible hazards, including overheated machinery, heat leakages, insulation failures and structural deterioration, that jeopardise the safety and operational integrity of industrial facilities. They produce a permanent audit trail that holds executives accountable for asset upkeep and creates the evidentiary basis for the concept of leadership maintenance accountability advanced in [10].
Table 4 summarises the complete data architecture of Audit Task 2 as implemented within the CMA platform.
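The boundary-compliance check in Audit Task 2 can be illustrated with a standard ray-casting point-in-polygon test. The sketch below treats longitude and latitude as planar x and y, which is an assumption that is reasonable only for small sites; the polygon and points are hypothetical.

```python
def inside_boundary(point, polygon):
    """Ray-casting test: True if `point` (x, y) lies inside `polygon`,
    given as a list of (x, y) vertices in order. Points exactly on an
    edge are not handled specially in this sketch."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does the horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical licensed area (a 10 x 10 square in arbitrary units).
licensed_area = [(0, 0), (10, 0), (10, 10), (0, 10)]
within_licence = inside_boundary((5, 5), licensed_area)
encroachment = not inside_boundary((15, 5), licensed_area)
```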

Audit Task 3: Risk and Anomaly Detection

The third audit task the platform executes is the most analytically sophisticated. It involves the autonomous identification of anomalous events, patterns and changes across the physical environments of Saudi industrial facilities that may signal fraud, misappropriation or governance failure before those signals appear in financial statements. This task operationalises what agency theory terms fraud deterrence: the reduction in the information gap that managers might exploit to misappropriate or strip company assets during the periods between scheduled human audit visits. In the Saudi market, prolonged audit tenure creates extended windows of auditor familiarity and reduced professional scepticism. The continuous nature of platform-based anomaly detection therefore provides precisely the neutral, non-relational monitoring that human audit procedures cannot sustain.

Time-series imagery enables the platform to detect unauthorised alterations or missing assets by comparing sequential drone imagery captured across multiple verification cycles. It identifies changes in asset composition, inventory levels or structural configuration that are not reflected in the financial records submitted to the CMA [56]. This change-detection function is directly relevant to the moral hazard problem in the Saudi market. A manager who systematically strips assets from a subsidiary, misappropriates inventory between audit cycles, or permits unauthorised construction or demolition cannot conceal these actions from a system that continuously compares current physical conditions against archived baseline imagery.
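At the level of asset counts, the change-detection logic described above reduces to comparing the observed change between two verification cycles with the change recorded in the ledger. The sketch below assumes that per-class counts have already been extracted from the imagery; all names and figures are hypothetical.

```python
def unexplained_changes(baseline, current, recorded_movements):
    """For each asset class, return the observed change in count minus
    the change recorded in the financial ledger. Only non-zero gaps
    are reported."""
    classes = set(baseline) | set(current) | set(recorded_movements)
    gaps = {}
    for asset_class in sorted(classes):
        observed = current.get(asset_class, 0) - baseline.get(asset_class, 0)
        gap = observed - recorded_movements.get(asset_class, 0)
        if gap != 0:
            gaps[asset_class] = gap
    return gaps


# Hypothetical counts from two drone cycles and the ledger movements.
baseline_counts = {"forklift": 12, "generator": 4}
current_counts = {"forklift": 9, "generator": 4}
ledger_movements = {"forklift": -1}   # one disposal was recorded
gaps = unexplained_changes(baseline_counts, current_counts, ledger_movements)
```

Here two forklifts are missing without a corresponding ledger entry, which is the kind of gap the platform would flag.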

Video data enables the platform to monitor live drone feeds for unusual activity patterns—workers at unexpected locations, machinery operating outside declared operational hours, or vehicles accessing restricted areas—and to flag these deviations for CMA investigation. Transformer-based anomaly detection provides the technical foundation for this capability. Deep-learning architectures can learn representations of normal operational patterns and autonomously identify deviations from those patterns without requiring human operators to predefine what an anomaly looks like. This unsupervised learning paradigm is essential for the Saudi industrial audit context, where the range of possible anomalies across cement plants, petrochemical refineries and offshore infrastructure is too diverse and context-specific to be exhaustively catalogued in advance [36]. Thermal sensors detect unexpected temperature changes indicating equipment malfunctions, safety risks and deferred maintenance. This independent verification confirms that the agent is not neglecting asset upkeep to inflate short-term cash flows by deferring necessary repairs. Machine learning applied to historical drone data enables predictive risk assessment. It identifies facilities and asset classes where the probability of irregularity is elevated based on patterns in previous verification cycles.
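The paper relies on transformer-based models for anomaly detection. As a much simpler stand-in that conveys the same idea of learning a normal range and flagging departures from it, the sketch below applies a z-score rule to thermal readings. The threshold and readings are hypothetical, and this is not the method used in [36].

```python
from statistics import mean, stdev


def thermal_anomalies(history, new_readings, z_threshold=3.0):
    """Flag readings whose distance from the historical mean exceeds
    `z_threshold` sample standard deviations."""
    if len(history) < 2:
        raise ValueError("need at least two historical readings")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        raise ValueError("historical readings show no variation")
    return [r for r in new_readings if abs(r - mu) / sigma > z_threshold]


# Hypothetical surface temperatures (degrees Celsius) for one machine.
normal_history = [60.0, 61.0, 59.0, 60.5, 59.5]
flagged = thermal_anomalies(normal_history, [60.2, 75.0])
```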

Multi-agent coordination systems demonstrate that AI organiser agents can autonomously evaluate environmental conditions, resource availability and task urgency to dispatch the most appropriate autonomous vehicle without human scheduling. This validated self-organisation logic underpins the predictive, trigger-based drone deployment mechanism of the TAF [49]. Ref. [57] further establish drone-based assessment within the Maqasid Shariah framework for Takaful operations. Their work provides culturally and institutionally proximate precedent for drone-enabled assurance in the Saudi market and the broader GCC. Table 5 summarises the complete data architecture of Audit Task 3.

Audit Task 4: Construction Progress and Quality Control

The fourth and final audit task executed by the Asset Verification segment addresses one of the most persistent and practically consequential forms of earnings management in the Saudi construction and infrastructure sector: the overstatement of project completion percentages to accelerate cash disbursements from project owners to contractors under long-term construction contracts. Under the percentage-of-completion method required by IFRS for long-term contract revenue recognition, the timing and quantum of revenue and cash flows depend entirely on the objective assessment of how much work has actually been completed at each reporting date. If a contractor overstates completion—reporting 70 per cent completion when the physical site reflects only 55 per cent—the resulting acceleration of cash disbursements represents a direct transfer of value from project owner to contractor. The financial statements obscure this transfer, and traditional periodic site visits cannot reliably detect it [58]. Ref. [13] further establish, in their framework of potential drone applications in auditing, that drone-based verification can support the measurement of progress toward completion of performance obligations under FASB ASC 606-10-25-27 and 606-10-55-17. This provides direct conceptual precedent for the platform’s Audit Task 4 construction-progress verification function.

The platform addresses this task by deploying the drone swarm to provide continuous, objective visual documentation of construction milestone completion. This enables the CMA and external auditors to independently verify the percentage of completion used for revenue recognition in long-term contracts without requiring physical auditor presence at the site [58]. Image and video data provide real-time visual updates on construction milestones. The platform compares current site conditions against the completion percentages reported in interim financial statements and flags material discrepancies for escalation.
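The 70 per cent versus 55 per cent example can be made concrete with a short calculation. The sketch assumes that revenue is recognised in direct proportion to the completion percentage; the contract value and the five-point materiality threshold are hypothetical.

```python
def completion_gap(reported_pct, observed_pct, contract_value, materiality_pts=5.0):
    """Compare reported and drone-observed completion percentages.

    Returns the gap in percentage points, the revenue overstated under
    a simple proportional recognition assumption, and whether the gap
    exceeds the (hypothetical) materiality threshold.
    """
    gap_pts = reported_pct - observed_pct
    overstated_revenue = contract_value * gap_pts / 100
    return gap_pts, overstated_revenue, gap_pts > materiality_pts


# Figures from the example in the text, with an assumed SAR 100 million contract.
gap_pts, overstated, escalate = completion_gap(70.0, 55.0, 100_000_000)
```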

The Scan-vs-BIM methodology developed by [18] provides the technical foundation for this capability at the highest level of precision. Ground drones equipped with terrestrial laser scanners collect three-dimensional point cloud data from active construction sites, which is compared against pre-existing Building Information Modelling plans to identify structural discrepancies between what was planned and what was physically built. The methodology detects structural translations, overhangs, section variations and construction voids with classification accuracy exceeding 99 per cent and a mean Intersection over Union of 0.75, without manual intervention at any stage of the data processing pipeline. Ref. [47] provide the foundational exploratory framework establishing drone use in internal and external audits, on which the present Audit Task 4 architecture builds.
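The mean Intersection over Union figure quoted above is a standard segmentation metric. A minimal sketch of its computation over per-point class labels follows; the labels are hypothetical and the code is not drawn from [18].

```python
def mean_iou(truth, predicted, classes):
    """Mean Intersection over Union across `classes`, computed from two
    equal-length sequences of per-point labels. Classes absent from both
    sequences are skipped."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have equal length")
    ious = []
    for cls in classes:
        pairs = list(zip(truth, predicted))
        intersection = sum(1 for t, p in pairs if t == cls and p == cls)
        union = sum(1 for t, p in pairs if t == cls or p == cls)
        if union:
            ious.append(intersection / union)
    if not ious:
        raise ValueError("none of the classes occur in the labels")
    return sum(ious) / len(ious)


# Hypothetical labels for four scanned points.
as_designed = ["wall", "wall", "void", "void"]
as_scanned = ["wall", "void", "void", "void"]
score = mean_iou(as_designed, as_scanned, ["wall", "void"])
```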

Geospatial data verifies that construction activity remains within the planned legal boundaries declared in the CMA project documentation. This protects the principal from legal liabilities and regulatory fines arising from the agent’s negligent or opportunistic encroachment. LiDAR and three-dimensional mapping assess structural dimensions against engineering specifications, ensuring that build quality meets contractually agreed standards. This independent quality verification confirms that contractors are not cutting corners on material quality to accelerate completion or protect performance bonuses.

Time-series data provides the platform’s most powerful construction governance capability. Sequential drone imagery compared against original construction blueprints through AI-enabled software produces an objective, time-stamped record of build quality and project advancement. This record cannot be altered retrospectively by management, ensuring that reported completion rates and associated financial disclosures accurately reflect physical reality at each stage of the construction lifecycle [58]. Table 6 summarises the complete data architecture of Audit Task 4 as implemented within the CMA platform.

3.1.5. Integrated Architecture: From Task to Platform

The four audit tasks described above do not operate as independent modules. Within the CMA platform, they function as an integrated, mutually reinforcing governance architecture in which findings from one task inform and trigger actions in another. A thermal anomaly detected under Task 2 compliance monitoring may simultaneously constitute an asset-condition discrepancy under Task 1 valuation verification and a risk signal under Task 3 anomaly detection. This triggers coordinated escalation across all three analytical pipelines simultaneously. A construction-progress discrepancy detected under Task 4 may correlate with an elevated DACC score calculated by Segment 2. This correlation suggests that the overstatement of completion percentage is part of a broader earnings-management strategy rather than an isolated reporting error, prompting the platform to recommend immediate auditor rotation alongside CMA enforcement review. This integration expresses the TAF’s core proposition. The AI Agent does not replicate human audit functions at greater speed. It generates new forms of governance intelligence by synthesising physical, financial, environmental and historical data simultaneously—a capability that no human-directed procedure can match within a regulatory timeline relevant to investor protection.

3.2. Segment 2: Auditor Assignment and Change

3.2.1. From Mandatory Rotation to Agentic Governance with DACC Monitoring

Traditional approaches to preserving auditor independence within the Saudi audit market have relied principally on mandatory rotation: the strategic changing of external auditors at prescribed intervals designed to break the familiarity bias that accumulates between auditors and management over prolonged engagements [1]. This approach, while intuitively appealing, rests on a simplifying assumption that tenure is inherently corrosive to independence. Recent empirical evidence has begun to complicate this assumption. Ref. [59] demonstrate that long predecessor auditor tenure can paradoxically enhance rather than undermine the independence of the incoming incumbent auditor among financially distressed firms by signalling the client’s genuine commitment to a rigorous audit process. Where a predecessor firm has maintained a long and professionally demanding engagement, the signal transmitted to the successor firm is not one of captured auditors but of a client that takes the audit relationship seriously. This dynamic supports rather than compromises the exercise of professional scepticism. The finding creates a theoretical tension that blanket rotation mandates cannot resolve: if predecessor tenure sometimes enhances independence and sometimes threatens it, the governance instrument must be capable of distinguishing between the two conditions rather than treating all long-tenure situations as equally problematic.

The CMA platform resolves this theoretical tension by shifting from rotation mandates to continuous performance-based monitoring. This approach replaces the blunt instrument of time-based rotation with a precision instrument grounded in observable, quantifiable evidence of audit quality outcomes. The primary metric through which this evidence is operationalised is discretionary accruals (DACC), the component of total accruals not explained by normal business operations and therefore attributable to managerial judgement in the preparation of financial statements. As documented in this paper’s agency theory framework, managers may collaborate with external auditors to manipulate earnings through the strategic exercise of accounting discretion, disguising specific income streams or profit levels in ways that are difficult for shareholders and regulators to detect from financial statement analysis alone [2]. DACC provides an empirically validated, quantitative measure of such manipulation. When DACC is abnormally high or volatile relative to industry benchmarks and historical trends, it signals that the discretionary component of accrual accounting has been exercised in ways that cannot be explained by legitimate business circumstances. This is a direct indicator of earnings-management risk that the platform monitors continuously rather than assessing periodically.

To execute this continuous monitoring function, the AI Agent evaluates six categories of evidence simultaneously for every auditor-client relationship within the CMA’s approved engagement registry (Table 7). These six categories—audit quality metrics, independence indicators, technological capability, asset verification accuracy, discretionary accruals and market competitiveness—collectively constitute a comprehensive governance profile of each engagement. The profile is updated in real time as new financial data, verification outcomes and market information become available. Table 7 summarises the six evaluation categories with their data sources, governance principles and DACC integration logic as implemented within the platform.

The introduction of DACC as the primary evaluation metric represents a fundamental departure from the subjective, reputationally weighted auditor-assessment processes that have historically governed the Saudi audit market. Previous assignment decisions have been shaped by firm size, brand recognition and relationship networks—factors that systematically advantage Big Four firms regardless of their actual performance on any given engagement. The platform substitutes objective, continuously updated and empirically grounded evidence of earnings-management outcomes. An audit firm that consistently delivers low DACC for its client portfolio is demonstrating, through observable financial statement outcomes, that it is exercising the professional scepticism and independent judgement that the audit function is designed to provide. An audit firm whose client portfolio shows persistently elevated or deteriorating DACC is demonstrating the opposite, and the platform responds accordingly, regardless of whether that firm carries a Big Four or non-Big Four designation.

3.2.2. The DACC Monitoring Framework

The platform operationalises discretionary-accrual monitoring using the modified Jones model [60], among the most widely accepted and empirically validated methodologies for earnings-management detection in the accounting research literature. The choice reflects both the model’s established methodological rigour and its adaptation to address a limitation of the original Jones model [61] in environments where revenue management is the primary manipulation mechanism. This context is directly relevant to the Saudi market, where revenue recognition under long-term construction contracts and related-party transactions provides particular scope for discretionary accounting choices.
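For reference, the modified Jones model named above is conventionally written as follows. This is the standard formulation from the accounting literature rather than an equation reproduced from this paper. Scaling by lagged total assets $A_{i,t-1}$ and the use of the gross level of property, plant and equipment $PPE_{i,t}$ are the usual conventions and are assumed here.

```latex
\begin{align}
TA_{i,t} &= NI_{i,t} - CFO_{i,t} \\
\frac{TA_{i,t}}{A_{i,t-1}} &= \alpha_1 \frac{1}{A_{i,t-1}}
  + \alpha_2 \frac{\Delta REV_{i,t}}{A_{i,t-1}}
  + \alpha_3 \frac{PPE_{i,t}}{A_{i,t-1}} + \varepsilon_{i,t} \\
NDA_{i,t} &= \hat{\alpha}_1 \frac{1}{A_{i,t-1}}
  + \hat{\alpha}_2 \frac{\Delta REV_{i,t} - \Delta REC_{i,t}}{A_{i,t-1}}
  + \hat{\alpha}_3 \frac{PPE_{i,t}}{A_{i,t-1}} \\
DACC_{i,t} &= \frac{TA_{i,t}}{A_{i,t-1}} - NDA_{i,t}
\end{align}
```

Here $TA$ is total accruals, $NI$ net income, $CFO$ operating cash flow, $\Delta REV$ the change in revenue, $\Delta REC$ the change in receivables and $NDA$ the estimated non-discretionary accruals. The coefficients are typically estimated by industry-year regression, and the receivables adjustment in the third line is what distinguishes the modified model from the original.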

The AI Agent executes the modified Jones model calculation autonomously and continuously for every listed company within the CMA’s regulatory perimeter, following a five-step computational protocol.

First, the platform calculates total accruals from each listed company’s financial statements by deriving the difference between net income and operating cash flows across the reporting period.

Second, it estimates non-discretionary accruals—the portion of total accruals attributable to normal business operations rather than managerial discretion—based on changes in revenue adjusted for changes in receivables, changes in property, plant and equipment, and other legitimate business factors that accounting standards require firms to reflect through accrual adjustments.

Third, it derives DACC as the residual between total accruals and non-discretionary accruals. This residual represents the portion of accrual accounting choices that cannot be explained by the company’s observable business circumstances and therefore reflects the exercise of discretionary judgement in financial statement preparation.

Fourth, it benchmarks the resulting DACC figure against three reference points simultaneously:

the company’s own historical DACC trend across previous reporting periods, providing a longitudinal baseline that identifies deteriorating or improving earnings-management behaviour;
the industry-sector DACC average for Saudi listed companies in the same sector, providing a cross-sectional reference that accounts for legitimate industry-specific accounting characteristics; and
the peer-group DACC average for companies of comparable size, ownership structure and profitability, providing the most precisely calibrated benchmark against which abnormal discretionary-accrual behaviour can be identified.

Fifth, the platform flags material deviations where DACC exceeds pre-defined materiality thresholds and triggers the governance actions specified in the platform’s escalation protocol.
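The five steps above can be illustrated with a minimal Python sketch. It is not the platform's implementation: the function names, the scaling by lagged total assets, the use of absolute DACC magnitudes, the mean as the benchmark statistic and the single `materiality` parameter are all assumptions introduced for illustration. Step 2 (estimating non-discretionary accruals) is taken as an input rather than re-estimated.

```python
from statistics import mean


def total_accruals(net_income: float, operating_cash_flow: float) -> float:
    """Step 1: total accruals as net income minus operating cash flows."""
    return net_income - operating_cash_flow


def discretionary_accruals(total_acc: float, non_discretionary: float,
                           lagged_assets: float) -> float:
    """Step 3: DACC as the residual, scaled by lagged total assets (assumed)."""
    if lagged_assets <= 0:
        raise ValueError("lagged_assets must be positive")
    return (total_acc - non_discretionary) / lagged_assets


def benchmark_deviations(dacc: float, own_history: list[float],
                         sector: list[float], peers: list[float]) -> dict[str, float]:
    """Step 4: gap between |DACC| and the mean |DACC| of each reference set."""
    magnitude = abs(dacc)
    references = {"own_history": own_history, "sector": sector, "peer_group": peers}
    return {name: magnitude - mean([abs(v) for v in values])
            for name, values in references.items()}


def is_flagged(deviations: dict[str, float], materiality: float) -> bool:
    """Step 5: flag when any benchmark gap exceeds the (hypothetical) threshold."""
    return any(gap > materiality for gap in deviations.values())


# Illustrative figures only; they are not drawn from any real company.
ta = total_accruals(120.0, 90.0)
dacc = discretionary_accruals(ta, 18.0, 400.0)
gaps = benchmark_deviations(dacc, [0.01, 0.02], [0.02, 0.02], [0.025, 0.035])
print(round(dacc, 4), is_flagged(gaps, materiality=0.01))
```

Keeping the three benchmark gaps separate, rather than collapsing them into one score, preserves the information about which reference point a company deviates from.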

These thresholds, derived from the audit quality literature and adapted for the Saudi market context [8,9], are structured as a four-band escalation ladder that maps DACC levels to increasingly assertive governance responses, as summarised in Table 8.
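A four-band ladder of this kind can be sketched as a simple lookup. The band boundaries below (2, 5 and 8 per cent) are inferred from the levels mentioned elsewhere in the text, and the action labels are hypothetical. Table 8 remains the authoritative specification and may differ.

```python
from bisect import bisect_left

# Assumed boundaries, expressed as fractions. Table 8 is authoritative.
BAND_BOUNDARIES = (0.02, 0.05, 0.08)

# Hypothetical labels for the four increasingly assertive responses.
BAND_ACTIONS = (
    "routine monitoring",
    "enhanced monitoring",
    "rotation review",
    "emergency escalation",
)


def escalation_band(dacc: float) -> int:
    """Return the band index 0..3 for a DACC value.

    bisect_left moves a value into a higher band only when it strictly
    exceeds the boundary, matching the 'exceeds the threshold' wording.
    The absolute value is used so that income-decreasing accruals are
    treated like income-increasing ones (an assumption).
    """
    return bisect_left(BAND_BOUNDARIES, abs(dacc))


def governance_action(dacc: float) -> str:
    """Map a DACC value to its (hypothetical) governance response label."""
    return BAND_ACTIONS[escalation_band(dacc)]


print(governance_action(0.06))
```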

The threshold architecture reflects a deliberate governance philosophy derived from the refusal-threshold concept of [10]: the platform does not merely flag concerns but takes graduated, pre-specified governance actions that ensure no DACC escalation goes unaddressed within the reporting cycle in which it is detected. This eliminates the window of opportunity between detection and response that characterises human-directed audit quality monitoring, in which elevated DACC in one year may not trigger governance consequences until a subsequent periodic review. By that time, additional earnings manipulation may have occurred and been compounded within the financial statements.

3.2.3. The Assignment and Change Protocol

When the platform determines that an auditor change is warranted—either because DACC thresholds have been breached, because trajectory analysis indicates developing familiarity bias, or because a newly listed company requires initial auditor assignment—the AI Agent executes a structured five-step decision protocol that integrates DACC findings with the full range of evaluation-category data described in Section 3.1. Each step in this protocol is fully auditable. The AI Agent’s reasoning chain, including all input data, calculation parameters, benchmark comparisons and weighting decisions, is recorded in the platform’s permanent governance archive and made available to the CMA oversight committee for review before any recommendation is executed.

Step 1: Candidate Identification. The platform queries its continuously updated database of the sixteen CMA-approved external audit firms, applying initial filters based on sector expertise, geographic coverage and demonstrated technological capability. Firms that have previously been sanctioned by the CMA for audit-quality failures, or that are currently under regulatory review, are automatically excluded from the candidate pool regardless of their other performance characteristics. This initial filtering step ensures that the subsequent performance modelling is applied only to firms that meet the minimum governance standards required for appointment to a CMA-regulated engagement.

Step 2: Predictive Performance Modelling Including DACC History. For each candidate firm that passes the initial filter, the platform constructs a predictive performance model. The model draws on the full history of the firm’s DACC outcomes across all previous engagements within the CMA registry. This modelling is grounded in the empirical finding of [2] that non-Big Four firms adopting AI-powered drones achieve measurable improvements in decision-making quality and reductions in audit fees. The platform operationalises this finding by incorporating each candidate firm’s technology-adoption status as a predictive variable alongside its historical DACC profile. Firms that consistently deliver low DACC for their clients across multiple sectors and engagement types receive higher assignment scores, on the basis that their historical track record provides the most reliable predictor of future audit-quality outcomes. The predictive model explicitly weights DACC trajectory—the direction and rate of change in DACC over successive engagements—more heavily than absolute DACC levels. A firm showing consistent improvement in earnings-management detection is therefore a stronger candidate than one showing stable but mediocre DACC performance.

Step 3: Familiarity Bias and DACC Trajectory Analysis. Before finalising any assignment recommendation, the AI Agent conducts a prospective trajectory analysis to assess whether the candidate auditor’s recent engagements display patterns of earnings-management deterioration consistent with developing familiarity bias. Drawing on the finding of [59] that the relationship between tenure and independence is more nuanced than blanket rotation mandates assume, the platform examines all of the candidate firm’s engagements of at least three years’ tenure and computes the proportion in which DACC has increased monotonically across the three most recent reporting years. Where this proportion exceeds a pre-specified governance threshold (calibrated by the CMA oversight committee against the audit-quality literature and updated annually), the platform recommends an alternative assignment, on the basis that the candidate firm displays a pattern of tenure-correlated quality deterioration that historical absolute DACC figures alone do not fully capture. Engagements with fewer than three years of available DACC data are excluded from this trajectory calculation, since the directional pattern cannot be reliably observed. The trajectory analysis is applied only to the candidate firm’s existing portfolio rather than to the prospective engagement, which has no historical DACC series.
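The trajectory test in Step 3 amounts to a proportion over eligible engagements, sketched below. The data layout (one list of yearly DACC values per engagement, oldest first), the reading of "monotonically" as strictly increasing, and the `threshold` value are assumptions. The text leaves the threshold to the CMA oversight committee.

```python
from typing import Optional


def rising_three_years(dacc_series: list[float]) -> bool:
    """True when DACC rose in each of the three most recent reporting years."""
    last3 = dacc_series[-3:]
    return len(last3) == 3 and last3[0] < last3[1] < last3[2]


def deteriorating_share(portfolio: dict[str, list[float]]) -> Optional[float]:
    """Share of eligible engagements with a monotonic three-year rise.

    Engagements with fewer than three years of DACC data are excluded,
    as the text specifies. Returns None when nothing is eligible.
    """
    eligible = [series for series in portfolio.values() if len(series) >= 3]
    if not eligible:
        return None
    return sum(rising_three_years(s) for s in eligible) / len(eligible)


def recommend_alternative(portfolio: dict[str, list[float]], threshold: float) -> bool:
    """Recommend another firm when the share exceeds the governance threshold."""
    share = deteriorating_share(portfolio)
    return share is not None and share > threshold


# Hypothetical candidate-firm portfolio; client names and values are invented.
portfolio = {
    "client_a": [0.01, 0.02, 0.03],
    "client_b": [0.03, 0.02, 0.04],
    "client_c": [0.02, 0.05],           # excluded: fewer than three years
    "client_d": [0.04, 0.01, 0.02, 0.03],
}
print(deteriorating_share(portfolio), recommend_alternative(portfolio, threshold=0.5))
```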

Step 4: DACC-Driven Rotation Triggers. In addition to the prospective analysis described above, the platform monitors all active engagements continuously for three specific DACC-based triggers that automatically initiate rotation recommendation processes without waiting for a scheduled review cycle.

The first trigger is activated when DACC exceeds the 5 per cent threshold for two consecutive reporting years, indicating that an elevated earnings-management risk has persisted through more than one reporting cycle and has not been corrected by the incumbent auditor. The platform treats this pattern as evidence of either auditor complicity or insufficient professional scepticism.
The second trigger is activated when DACC volatility—the absolute year-on-year change in DACC—exceeds 3 per cent without an operational justification recorded in the company’s financial statement disclosures. This indicates that the magnitude of discretionary-accrual exercise is changing at a rate inconsistent with stable business operations and suggests active manipulation of reported financial performance.
The third trigger is activated when a newly listed company’s first-year DACC exceeds 5 per cent, indicating the possibility of pre-listing earnings management designed to present an artificially favourable financial position to CMA reviewers and prospective investors at the point of initial public offering. The existing literature identifies this form of manipulation as particularly consequential for market integrity in concentrated capital markets [8].
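The three triggers can be expressed as simple rules over an engagement's DACC series. The sketch below uses the 5 per cent and 3 per cent figures stated above. The function signature, the trigger labels and the boolean `change_justified` flag (standing in for an operational justification recorded in the disclosures) are hypothetical.

```python
def rotation_triggers(dacc_by_year: list[float],
                      newly_listed: bool = False,
                      change_justified: bool = False,
                      level_threshold: float = 0.05,
                      volatility_threshold: float = 0.03) -> list[str]:
    """Return the labels of the Step 4 triggers that fire.

    dacc_by_year holds DACC as fractions, oldest reporting year first.
    """
    fired = []
    recent = dacc_by_year[-2:]

    # Trigger 1: DACC above 5% in two consecutive reporting years.
    if len(recent) == 2 and all(d > level_threshold for d in recent):
        fired.append("persistent_high_dacc")

    # Trigger 2: absolute year-on-year change above 3% with no justification.
    if (len(recent) == 2
            and abs(recent[1] - recent[0]) > volatility_threshold
            and not change_justified):
        fired.append("unexplained_volatility")

    # Trigger 3: a newly listed company's first-year DACC above 5%.
    if newly_listed and dacc_by_year and dacc_by_year[0] > level_threshold:
        fired.append("high_first_year_dacc")

    return fired


print(rotation_triggers([0.06, 0.10]))
```

Returning every trigger that fires, rather than only the first, keeps the complete evidence available to the oversight committee.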

Step 5: Human-in-the-Loop Governance. Consistent with the non-delegable executive accountability framework of [10] and with this paper’s reconceptualisation of the triadic agentic system, the platform’s rotation and assignment recommendations are reviewed by a designated CMA oversight committee before execution. This human-in-the-loop governance layer is not a concession to institutional conservatism but a deliberate architectural feature that addresses the trust gap identified in this paper’s UTAUT analysis. Auditors and regulators cannot be expected to delegate legally consequential decisions to autonomous systems unless those systems are demonstrably transparent in their reasoning and subject to meaningful human oversight at the point of consequential action. The platform supports this oversight by making the AI Agent’s complete reasoning chain, including all DACC calculations, benchmark comparisons, trajectory analyses and candidate scoring results, fully available to the oversight committee in a structured, auditable format. This directly addresses the black-box concern that this paper identifies as the primary professional-scepticism barrier to agentic-AI adoption in the Saudi audit context. The platform asks regulators not to trust an opaque algorithmic output but to evaluate a transparently reasoned recommendation that they can interrogate, challenge and override with documented justification.

3.2.4. Addressing Earnings Management Through DACC Monitoring

The integration of DACC monitoring into the auditor assignment and change protocol directly addresses the earnings-management governance gap that this paper’s agency theory framework identifies as one of the most consequential and persistently unaddressed threats to financial reporting integrity in the Saudi market. As this paper documents, the risk of manager–auditor collaboration in earnings management is structurally amplified by prolonged audit tenure. When an external auditor is retained for an extended period, the resulting partiality may lead to a conflict of interest in which auditors become complicit in disguising specific income streams or profit levels rather than detecting and reporting them [2]. The conventional response to this risk, mandatory rotation, addresses the tenure dynamic but eliminates the commitment signal that [59] demonstrate can enhance independence among distressed firms. The DACC monitoring framework resolves this dilemma by making independence observable rather than assumed. Rather than prescribing the length of auditor–client relationships, the platform monitors the quality outcome of those relationships through continuous DACC surveillance. It intervenes only when the evidence indicates that the relationship is producing earnings management rather than preventing it.

By continuously monitoring DACC across all CMA-registered engagements, the platform achieves three governance outcomes that no existing framework within the Saudi regulatory environment currently delivers.

First, it detects potential collaboration between managers and auditors through the identification of abnormal accrual patterns that cannot be explained by legitimate business circumstances. These patterns are invisible to shareholders and regulators relying on financial statement analysis alone, but become visible when DACC is calculated, benchmarked and trended continuously rather than assessed retrospectively.

Second, it provides objective, empirically grounded evidence for auditor rotation decisions. This evidence replaces the subjective regulatory assessments and reputational considerations that have historically shaped rotation policy in the Kingdom with a transparent, quantitative standard that applies equally to all sixteen CMA-approved firms regardless of size or market position.

Third, it enables early intervention before earnings management escalates to the level of material fraud. The platform activates governance responses at the 5 per cent DACC threshold rather than waiting for restatements, regulatory investigations or investor complaints that arise only after material misstatement has already entered the public financial record and damaged market confidence.

3.2.5. The Competitive Equaliser Function

This paper proposes that agentic AI may function as a competitive equaliser within the Saudi audit market. It enables non-Big Four firms to close the quality gap with their larger counterparts not by replicating the resource base or brand recognition of Big Four firms, but by accessing the same independently verified, continuously updated and algorithmically generated evidence base that the platform provides equally to all CMA-approved firms. The DACC-enhanced platform operationalises this proposition through three specific mechanisms that restructure the competitive dynamics of the Saudi audit market in favour of demonstrated quality rather than inherited market position.

The first mechanism is the standardisation of earnings-management detection across all firms. Under the platform’s DACC framework, every CMA-approved firm—whether Big Four or non-Big Four—is evaluated against the same empirically derived thresholds, the same benchmarking methodology and the same trajectory analysis protocols. Subjective assessments that have historically favoured larger firms through reputational heuristics, client relationship networks and regulatory familiarity are replaced by a single, objective, continuously updated quality metric that no firm can manipulate through brand positioning or stakeholder management. A non-Big Four firm such as BDO Saudi Arabia or Dr Mohamed Al-Amri and Co. that consistently delivers DACC outcomes below the 2 per cent threshold for its client portfolio will receive higher assignment scores than a Big Four firm whose client portfolio shows persistently elevated DACC. This outcome holds regardless of the relative market positions or institutional prestige of the two firms.

The second mechanism is the systematic revelation of performance outliers across the market. By calculating and archiving DACC for every listed company in the CMA registry on a continuous basis, the platform generates a comprehensive, longitudinal quality league table for all sixteen approved firms that is not available through any existing public or regulatory disclosure mechanism. Non-Big Four firms that consistently outperform their larger counterparts on DACC metrics receive regulatory recognition through higher assignment priority scores and the visible endorsement implicit in platform-generated rotation recommendations. This quality-signalling function may, over time, reshape client perceptions of relative firm quality and create genuine competitive pressure on Big Four incumbents to maintain or improve their DACC performance. This mechanism aligns directly with the empirical finding of [56] that non-Big Four firms adopting AI-powered drones achieve quality improvements that narrow the competitive gap with larger counterparts. The platform extends this finding from the adoption of drone technology specifically to the adoption of agentic-AI governance infrastructure more broadly.

The third mechanism is the establishment of transparent, objective and publicly defensible criteria for auditor rotation decisions. These criteria reduce the political and reputational risks that have historically made CMA enforcement action against Big Four firms difficult to sustain. Under the existing regulatory framework, recommending the rotation of a Big Four firm requires the CMA to make a qualitative judgement that may be challenged, litigated, or attributed to non-technical considerations. Under the DACC framework, the same recommendation is supported by an auditable analysis showing which thresholds were breached, over which periods, against which benchmarks and through what DACC trajectory. This evidentiary structure places the burden of justification on the rotated firm rather than on the regulator. As the broader audit quality literature confirms, DACC is a reliable predictor of audit quality regardless of firm size [8]. The platform’s operationalisation of DACC as the primary rotation trigger therefore provides the CMA with a governance instrument that is simultaneously more rigorous, more transparent and more defensible than any time-based rotation mandate that the existing regulatory framework could prescribe.

3.2.6. DACC Limitations and Composite Signalling

The DACC monitoring framework’s central role within Segment 2 should be understood with explicit acknowledgement of the accrual model’s documented limitations. The modified Jones model may misclassify legitimate business changes as discretionary earnings management. Such changes include legitimate revenue acceleration under construction contracts, write-downs reflecting genuine impairment, and accrual reversals tied to working-capital cycles. Empirical studies on emerging-market applications further document reduced reliability under high concentration, related-party transactions and rapid sector growth, all of which characterise the Saudi listed-company universe.

To address these limitations, the platform treats DACC as a primary signal within a composite signalling framework rather than as a standalone rotation trigger. The platform runs three accrual estimation models concurrently—the original Jones model, the modified Jones model and the performance-matched approach—and flags governance actions only where the three converge. Annual sector-specific recalibration further reduces misclassification specific to petrochemicals, cement, agriculture and construction. Most consequentially, every DACC-triggered recommendation is reviewed by the CMA oversight committee (Section 3.2.3, Step 5), with human-override authority preserved at all stages.
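The convergence rule can be sketched as follows. The text does not define "converge", so this sketch assumes it means that all three models' DACC estimates exceed the same threshold in absolute value. The model keys and the threshold value are hypothetical.

```python
REQUIRED_MODELS = ("jones", "modified_jones", "performance_matched")


def models_converge(estimates: dict[str, float], threshold: float) -> bool:
    """True only when every accrual model places |DACC| above the threshold.

    Raising on a missing estimate avoids silently flagging on partial
    evidence, in the spirit of escalating incomplete data to human review.
    """
    missing = [m for m in REQUIRED_MODELS if m not in estimates]
    if missing:
        raise ValueError(f"missing model estimates: {missing}")
    return all(abs(estimates[m]) > threshold for m in REQUIRED_MODELS)


# Illustrative estimates only.
agree = {"jones": 0.061, "modified_jones": 0.066, "performance_matched": 0.054}
split = {"jones": 0.061, "modified_jones": 0.066, "performance_matched": 0.031}
print(models_converge(agree, 0.05), models_converge(split, 0.05))
```

Requiring agreement lowers the false-positive rate at the cost of missing cases that only one model detects, which is the trade-off the composite design accepts.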

This positioning reframes DACC as a high-quality observable signal within a multi-input governance architecture rather than as the sole determinant of auditor assignment. The platform’s broader signal set includes the four audit-quality categories specified in Table 7—historical engagement quality, independence indicators, technological capability and asset verification accuracy—alongside DACC and market competitiveness. The eight rotation triggers across these categories function as mutually reinforcing rather than substitutive inputs. The composite signal structure is designed to absorb the noise documented in any single accrual model. The full Segment 2 workflow, integrating DACC monitoring with the broader six-category evaluation framework and the human-in-the-loop CMA oversight protocol, is presented in Figure 2.

3.3. The Big Data Link: Integration with National Data Infrastructure

3.3.1. Internal Data Sources

The platform’s verification accuracy depends on its integration with Saudi Arabia’s national big data ecosystem. Big data is defined through the Five Vs framework—volume, velocity, variety, veracity and value—which collectively characterise the scale, speed, diversity, reliability and analytical utility of contemporary information systems [62]. Passive access to this data is insufficient for governance purposes. The platform therefore implements Agentic Data Mining: the continuous, autonomous monitoring of data repositories that identifies anomalies without human instruction and acts on them without human scheduling. This transforms big data from a static archive into a live governance instrument. Ref. [48] demonstrates that database-native reasoning architectures can transform passive data repositories into active cognitive substrates for AI systems. This provides the technical foundation for the platform’s Agentic Data Mining operation across Saudi Arabia’s institutional data infrastructure.

Four internal data sources feed the platform directly, each contributing a distinct verification layer that the AI Agent cross-references autonomously.

ZATCA. The Zakat, Tax and Customs Authority (ZATCA) holds the financial reports of all Saudi firms and serves as the primary regulatory baseline against which company asset declarations are verified. Cross-referencing what firms report to the CMA against what they declare to the tax authority closes one of the most consequential information-asymmetry channels available to managers seeking to maintain inconsistent financial narratives across regulatory bodies.

Bank mortgage data. Bank mortgage data provides the platform’s most granular asset-level intelligence. For every mortgaged asset in the Kingdom, banks hold valuations prepared independently by three professional real-estate agents, alongside photographs, precise geographic locations and current asset status. The coverage extends across buildings, storage facilities, industrial premises, land parcels and residential properties. This three-valuation standard provides the platform with a market-tested, professionally certified benchmark against which any firm’s self-reported asset values can be tested with precision.

Loan firm data. Loan firm data supplies income operations records, cash flow evidence and financial reports that the platform uses for operational performance verification and going-concern assessment. These records provide the principal with independently sourced evidence of the agent’s financial health that is not filtered through the firm’s own accounting judgements.

Firm-provided account data. Firm-provided account data constitutes the self-reported baseline that the platform uses as the starting point for discrepancy detection. The platform treats this data not as a trusted input but as a declaration to be tested against all other data sources simultaneously.

Table 9 summarises the internal data linkages with their verification applications and agency theory alignments.

3.3.2. External Data Sources

The platform’s most distinctive data-integration capability lies in its incorporation of unstructured external data, a category that the existing audit literature identifies as substantially under-exploited despite its significant informational content. Three external data streams feed the platform, each providing verification intelligence that internal data sources cannot replicate. This intelligence originates outside the firm’s control and cannot be manipulated through internal accounting decisions.

Social-media content. Social-media content constitutes the most novel external data source. Company advertisements and real-estate agent postings routinely include asset information, photographs and pricing that are published publicly to reach potential buyers or investors. This information is generated independently of any regulatory reporting obligation. It therefore reflects commercial market valuations rather than strategically managed financial statement figures.

Published reports. Published reports—including academic literature, industry analyses and media coverage—provide contextual benchmarking and reputational assessment data. The platform incorporates this data into its industry-sector DACC benchmarks and peer-group comparisons.

Real-estate agent social accounts. Real-estate agent social accounts provide the most granular external asset intelligence. Agents routinely post full asset specifications, including precise measurements, condition assessments and asking prices. These specifications can be cross-referenced directly against the bank mortgage valuations held in the internal data layer. Table 10 summarises the external data linkages with their verification applications and technical precedents.

3.3.3. The Social Media Verification Mechanism

The platform’s use of social media as an independent verification channel addresses a governance gap that no existing regulatory framework within the CMA currently exploits. When a company lists an asset—a commercial building, an industrial facility or a land parcel—real-estate agents and company representatives routinely publish detailed information about that asset on social-media platforms to maximise commercial reach. These postings typically include photographs, location data, structural specifications and asking prices that reflect genuine market valuations rather than the accounting valuations appearing in financial statements.

The AI Agent exploits this public information through a four-step autonomous process.

First, it crawls relevant social-media accounts and real-estate listing platforms, identifying postings related to assets declared in the financial statements of CMA-regulated companies.

Second, it extracts structured data—location, size, condition and price—from unstructured social-media content using natural language processing. This converts informal commercial listings into comparable data points.

Third, it compares the extracted information against the bank mortgage valuations and firm declarations held in the internal data layer. The platform calculates the percentage divergence between what the firm reports and what the market reflects.

Fourth, it flags material discrepancies for CMA investigation. The platform generates an automatic alert when the gap between formal and informal valuations exceeds the materiality threshold.
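Steps three and four reduce to a divergence calculation and a threshold test, sketched below. The crawling and natural-language extraction steps are omitted. The function names, the use of the median of the three bank valuations as the reference value and the `materiality_pct` parameter are assumptions, since the text does not state the materiality threshold.

```python
from statistics import median


def percentage_divergence(reported: float, reference: float) -> float:
    """Absolute gap between reported and reference value, as a % of reference."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return abs(reported - reference) / abs(reference) * 100


def reconcile_asset(reported_value: float,
                    mortgage_valuations: list[float],
                    listing_price: float,
                    materiality_pct: float) -> dict:
    """Compare a firm's declared value with bank and social-media evidence."""
    bank_reference = median(mortgage_valuations)  # assumed summary statistic
    divergence = {
        "bank": percentage_divergence(reported_value, bank_reference),
        "listing": percentage_divergence(reported_value, listing_price),
    }
    flagged = any(gap > materiality_pct for gap in divergence.values())
    return {"divergence_pct": divergence, "flag_for_cma": flagged}


# Invented figures for one asset, in SAR.
result = reconcile_asset(12_000_000, [9_800_000, 10_000_000, 10_400_000],
                         listing_price=10_500_000, materiality_pct=10.0)
print(result["flag_for_cma"])
```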

The agency theory significance of this mechanism is direct and consequential. A manager cannot simultaneously declare an asset at one value in a financial statement submitted to the CMA, maintain a different value in the bank mortgage documentation, and publish a third value through a real-estate agent’s social-media account without the platform detecting the inconsistency across all three channels. The platform’s continuous, cross-source reconciliation is designed to remove the information asymmetry that makes such multi-channel valuation management possible under passive regulatory oversight. Human-directed audit procedures could not practically sustain this degree of evidence integration at the speed and coverage that the Saudi listed-company universe requires.

3.3.4. Technical Implementation Precedent

The technical feasibility of real-time multi-source data integration at the scale required for the CMA platform is established by [34]. Their EUROCOMPLY framework demonstrates that agentic swarm systems can simultaneously verify compliance across multiple regulatory frameworks using a dual-mode retrieval architecture. EUROCOMPLY’s vector-based retrieval module extracts precise regulatory clause references from structured documentation. Its graph-based retrieval module navigates complex interdependencies between standards, regulations and compliance requirements, producing structured, traceable compliance reports with explicit regulatory references rather than opaque algorithmic outputs. This dual-mode architecture achieves an average performance score of 4.3 out of 5 across twenty telecommunications use cases. This evidence confirms that agentic swarm systems can integrate heterogeneous data sources, apply multiple analytical frameworks simultaneously and produce auditable outputs that regulators can inspect and verify.

The CMA platform extends this architecture from telecommunications regulatory compliance to financial asset verification by applying the same dual-mode retrieval logic to the integration of internal financial data and external market data. Where EUROCOMPLY retrieves regulatory clauses and navigates standards interdependencies, the CMA platform retrieves asset valuation data from bank mortgage records and navigates the interdependencies between formal financial declarations, tax authority records and informal social-media valuations. The governance architecture is directly analogous. Both systems deploy specialised agents assigned to distinct data-processing roles, coordinate their outputs through a central reasoning layer, and produce findings that are explicitly referenced to authoritative source data rather than generated through inaccessible algorithmic processes. Most consequentially, ref. [34] demonstrate that the opacity commonly attributed to agentic-AI systems is not an inherent technical limitation but a governance design choice. The CMA platform explicitly rejects this opacity by making every data source, calculation parameter and reconciliation decision available for CMA oversight committee inspection on demand. The full integration architecture of the platform’s internal and external data sources, organised around the Five Vs framework and aligned with the EUROCOMPLY dual-mode retrieval logic, is presented in Figure 3.

3.4. Governance Framework for the CMA Platform

3.4.1. The Triadic Agentic Framework Applied

The platform’s governance architecture instantiates the TAF proposed in this paper, distributing authority, responsibility and accountability across three distinct roles.

The principal—comprising the CMA as regulator and capital market participants as ultimate beneficiaries—holds final authority over all listing, enforcement and auditor rotation decisions. The principal receives verified data and platform recommendations but retains non-delegable responsibility for consequential regulatory action.

The Human Agent—external audit firms and listed company management—remains legally responsible for financial reporting quality and earnings management prevention. The Human Agent is subject to continuous platform oversight rather than periodic regulatory review.

The AI Agent—the CMA platform itself—operates as the autonomous third pillar. It independently collects physical evidence, reconciles multi-source data, calculates DACC, flags discrepancies and generates auditor assignment recommendations grounded in earnings-management risk.

This triadic structure does not transfer responsibility from humans to machines. It makes human responsibility more enforceable by ensuring that every consequential decision is informed by independently generated, algorithmically auditable evidence that the human decision-maker can inspect but cannot manipulate.

This triadic architecture is consistent with the sociotechnical framework of [63] for agentic performance management, which positions human–AI accountability as a coordinated rather than substitutive arrangement. Ref. [64] provide complementary international evidence on how internal-control systems are organised in the digital era, offering a comparative reference point for the CMA platform’s internal-control architecture across enterprise-level digital transformation contexts.

3.4.2. Leadership Accountability and Refusal Thresholds

The platform implements the leadership-centred governance framework of [10] through three DACC-specific refusal thresholds that automatically halt platform action when data-integrity conditions are not met.

First, if required financial data is missing or unverifiable, the platform cannot compute DACC reliably. It therefore escalates immediately to human review rather than proceeding on incomplete inputs.

Second, if DACC exceeds 8 per cent but no non-conflicted CMA-approved firm with adequate capacity is available as a replacement, the platform flags an emergency CMA intervention rather than recommending retention. This response recognises that maintaining a severely underperforming auditor is a worse governance outcome than acknowledging the structural constraint that limits rotation options.

Third, if DACC for a given auditor–client engagement has increased in each of the three most recent reporting years—a monotonic upward trajectory in earnings-management indicators across the engagement’s recent history—the platform automatically halts any retention recommendation and mandates rotation. This directional indicator is applied only to engagements with at least three years of available DACC data. For engagements with fewer than three years of data, including newly listed companies and short-tenure engagements, the platform relies on the absolute DACC thresholds specified in Table 8 and on the rotation triggers specified in Section 3.2.3, Step 4. In these cases, the platform treats the directional trajectory indicator as inapplicable rather than as a refusal condition.
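
The three refusal thresholds above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the platform's implementation: the names `RefusalOutcome`, `is_rising` and `check_refusal`, the order in which the checks run, and the reading of the third threshold as a strictly increasing DACC across the three most recent years are all hypothetical choices made for the example. DACC is assumed to be an unsigned magnitude expressed as a fraction (0.08 for 8 per cent).

```python
from enum import Enum
from typing import Optional, Sequence


class RefusalOutcome(Enum):
    ESCALATE_TO_HUMAN_REVIEW = "escalate_to_human_review"
    EMERGENCY_CMA_INTERVENTION = "emergency_cma_intervention"
    MANDATE_ROTATION = "mandate_rotation"
    NO_REFUSAL = "no_refusal"


SEVERE_DACC = 0.08  # 8 per cent, taken from the second refusal threshold


def is_rising(history: Sequence[float]) -> Optional[bool]:
    """Directional indicator over the three most recent reporting years.

    Returns None when fewer than three years of DACC data exist, in which
    case the indicator is inapplicable rather than a refusal condition.
    """
    if len(history) < 3:
        return None
    first, second, third = history[-3:]
    return first < second < third  # assumption: strictly increasing


def check_refusal(
    history: Sequence[Optional[float]],  # yearly DACC, oldest first
    data_verified: bool,
    replacement_available: bool,
) -> RefusalOutcome:
    # Threshold 1: missing or unverifiable inputs go straight to human review.
    if not data_verified or not history or any(v is None for v in history):
        return RefusalOutcome.ESCALATE_TO_HUMAN_REVIEW
    # Threshold 2: severe DACC with no eligible replacement firm.
    if history[-1] > SEVERE_DACC and not replacement_available:
        return RefusalOutcome.EMERGENCY_CMA_INTERVENTION
    # Threshold 3: monotonic upward trajectory halts any retention advice.
    if is_rising(history):
        return RefusalOutcome.MANDATE_ROTATION
    return RefusalOutcome.NO_REFUSAL
```

An outcome of `NO_REFUSAL` means only that none of the halt conditions fired; the absolute DACC thresholds in Table 8 and the rotation triggers in Section 3.2.3 would still apply afterwards.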

These thresholds operationalise the concept of formally specified halt conditions in [10], ensuring that the platform cannot generate recommendations that would perpetuate governance gaps it has itself identified. Ref. [65] specifies the privacy-engineering requirements that agentic AI must satisfy at runtime. These requirements are directly applicable to the platform’s data-protection architecture under the Saudi Personal Data Protection Law (PDPL) and ICO transparency principles, and complementary to the refusal-threshold architecture specified above.

3.4.3. Preventing Algorithmic Drift in DACC Estimation

DACC models are subject to drift over time as business conditions, accounting standards and industry characteristics evolve. The platform addresses this through periodic recalibration of its DACC estimation parameters against validated external benchmarks. When the AI Agent’s DACC estimates consistently deviate from prior-year industry averages or peer-group benchmarks without a corresponding change in observable business conditions, the ethical monitoring dashboard generates an automatic alert to CMA executives identifying the nature, magnitude and duration of the drift. This implements the algorithmic drift detection mechanism of [10] within the specific context of earnings-management surveillance. The mechanism ensures that the governance instrument does not itself become a source of systematic error that managers could exploit once they identify its miscalibration pattern.
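
The alert logic described above can be sketched as follows. This is a hedged illustration only: the `DriftAlert` structure, the `detect_drift` function, the tolerance of one percentage point and the minimum run of three periods are hypothetical values chosen for the example, since the source does not specify how "consistently deviate" is quantified.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DriftAlert:
    direction: str    # nature of the drift: "above" or "below" the benchmark
    magnitude: float  # mean absolute gap over the drifting run
    duration: int     # number of consecutive periods in the run


def detect_drift(
    estimates: Sequence[float],
    benchmarks: Sequence[float],
    tolerance: float = 0.01,       # hypothetical tolerance band
    min_periods: int = 3,          # hypothetical persistence requirement
    conditions_changed: bool = False,
) -> Optional[DriftAlert]:
    """Return an alert when recent estimates sit persistently on one side
    of the benchmark, outside the tolerance band."""
    if conditions_changed:
        # A documented change in business conditions explains the gap.
        return None
    gaps = [e - b for e, b in zip(estimates, benchmarks)]
    run = []
    for gap in reversed(gaps):  # walk back from the most recent period
        if abs(gap) <= tolerance:
            break
        if run and (gap > 0) != (run[0] > 0):
            break  # the sign flipped, so the run of same-direction gaps ends
        run.append(gap)
    if len(run) < min_periods:
        return None
    direction = "above" if run[0] > 0 else "below"
    magnitude = sum(abs(g) for g in run) / len(run)
    return DriftAlert(direction, magnitude, len(run))
```

Requiring a same-signed run before alerting is one way to separate systematic miscalibration from ordinary period-to-period noise; other persistence tests would serve equally well.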

3.4.4. Legal and Institutional Feasibility Within the Saudi Regulatory Environment

Three legal frameworks shape the platform’s operational feasibility within Saudi Arabia.

Data protection. The Saudi Personal Data Protection Law (PDPL), enacted under Royal Decree M/19 (2021) and enforced by SDAIA, establishes the data-handling baseline for any AI-enabled regulatory infrastructure operating across listed-company data. The platform’s data flows, which draw on Zakat, Tax and Customs Authority (ZATCA) records, bank mortgage data, social-media content and firm-provided accounts, must satisfy PDPL data-minimisation, purpose-limitation and lawful-processing requirements. The platform’s Agentic Data Mining architecture (Section 3.3) is therefore designed to operate through privacy-preserving query architectures and access-control protocols rather than bulk data extraction.

Airspace and drone operation. Drone operations across Saudi industrial geographies are regulated by the General Authority of Civil Aviation (GACA) under the Drone Regulation Framework. The platform’s drone deployments (Section 3.1) require GACA registration, operator licensing and integration with restricted-airspace protocols, particularly for offshore and petrochemical facilities. These regulatory conditions are documented constraints requiring formal CMA–GACA coordination before deployment.

Auditor assignment authority. Under the CMA’s existing regulatory perimeter, auditor assignment decisions for listed companies are subject to shareholder approval and SOCPA professional standards. The platform’s algorithmic recommendations therefore operate as advisory input to existing assignment processes rather than as a replacement for them. Formal SOCPA validation of DACC as admissible evidence (Section 3.5, Trust Expectancy and DACC Transparency) and CMA rule-making to integrate platform recommendations into the existing assignment workflow are identified preconditions for full operational deployment.

Together, these three legal dimensions define the institutional pathway the platform must traverse before achieving full regulatory legitimacy. This pathway is addressed in the implementation discussion in Section 6.2.

3.5. Addressing the UTAUT Trust Gap

Trust Expectancy and DACC Transparency

The platform’s most significant adoption challenge is not technical complexity but professional willingness to delegate earnings-management detection to an autonomous system. This is precisely the Trust Expectancy barrier this paper identifies as the primary obstacle to agentic-AI adoption in the Saudi audit context. The platform addresses this barrier through four mechanisms that build trust specifically around DACC transparency.

Decision transparency is achieved through a complete, permanently archived audit trail of every DACC calculation, including source financial statements, Jones model parameters, industry benchmark values and materiality thresholds. This audit trail ensures that no calculation is a black box and that every figure the platform produces can be reconstructed and verified by human reviewers.

Technical credibility is established through the modified Jones model itself, which is the most widely cited earnings-management detection methodology in the accounting research literature and carries established legitimacy with both academic and practitioner audiences [8,60].

Human-in-the-loop governance ensures that CMA executives retain override authority over any DACC-driven rotation recommendation, with documented justification required for every override. This protocol preserves human accountability while making autonomous recommendations the default rather than the exception.

Regulatory recognition through SOCPA formal validation of DACC as admissible evidence for auditor performance assessment provides the institutional legitimacy that Saudi audit practitioners require before they will accept autonomous earnings-management detection as a basis for legally consequential governance decisions.

Ref. [66] further demonstrate that human–drone interaction characteristics influence the credibility of thermography-based inspection evidence. This finding supports the platform’s framing of drone-audit credibility as a function of both system performance and the institutional dialogue surrounding deployment. Ref. [67] extend this trust framing to the cultural dimension, demonstrating that trust in AI systems is shaped substantially by the ethical dialogue surrounding their deployment. This evidence supports the platform’s positioning of Trust Expectancy as a culturally and institutionally embedded construct rather than a purely technical one.

The four trust-building mechanisms operate at distinct measurement layers.

Decision transparency is measured through audit-trail completeness—specifically, the proportion of platform-generated recommendations for which the complete DACC calculation chain (including source statements, model parameters, benchmark values and threshold logic) is reproducible by an independent reviewer.

Technical credibility is measured through model accuracy on prior-year hold-out samples and through cross-validation against the three accrual estimation approaches described in Section 3.2.6.

Human-in-the-loop governance is measured through the proportion of platform recommendations confirmed, modified or overridden by the CMA oversight committee. This proportion provides a continuous calibration signal between algorithmic recommendation and institutional judgement.

Regulatory recognition is measured through SOCPA formal validation of DACC as admissible evidence for auditor performance assessment—a binary regulatory event rather than a continuous variable.

Together, these four measurement layers convert Trust Expectancy from an abstract adoption construct into an observable, monitored and managed governance condition.
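
Two of these measurement layers are simple proportions and can be illustrated directly. The sketch below is an assumption-laden example rather than a specification: the function names and the three outcome labels are hypothetical, and the inputs are assumed to be one record per platform recommendation.

```python
from collections import Counter
from typing import Dict, Sequence

OUTCOMES = ("confirmed", "modified", "overridden")


def audit_trail_completeness(reproducible: Sequence[bool]) -> float:
    """Share of recommendations whose full DACC calculation chain an
    independent reviewer was able to reproduce."""
    if not reproducible:
        raise ValueError("no recommendations to evaluate")
    return sum(reproducible) / len(reproducible)


def oversight_proportions(decisions: Sequence[str]) -> Dict[str, float]:
    """Share of recommendations confirmed, modified or overridden by the
    oversight committee."""
    if not decisions:
        raise ValueError("no committee decisions to evaluate")
    unknown = set(decisions) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unexpected outcome labels: {sorted(unknown)}")
    counts = Counter(decisions)
    return {label: counts[label] / len(decisions) for label in OUTCOMES}
```

For example, `oversight_proportions(["confirmed", "confirmed", "modified", "overridden"])` reports half of the recommendations confirmed and a quarter each modified and overridden. Tracking these shares over time is what would turn them into the calibration signal described above.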

4. Discussion: Practical Benefits of the CMA Agentic AI Platform

The evidence base for this study draws on two complementary streams. The first is the systematic bibliometric corpus of thirty-nine peer-reviewed studies retained after PRISMA 2020 screening and full-text eligibility review [25], described in Section 2.5. The second is a documentary set of four national regulatory frameworks and standards initiatives governing agentic-AI deployment in the Kingdom of Saudi Arabia [4], Singapore [22], the United States [23] and the United Kingdom [24]. Together, these two streams establish that the CMA Agentic Platform delivers four categories of practical benefit that no existing audit governance mechanism within the Saudi market currently achieves simultaneously. Each benefit aligns with the international regulatory frontier rather than departing from it.

4.1. Audit Efficiency at Scale

The first and most immediately quantifiable benefit is audit efficiency at scale. Ref. [13], through a design-science field test in two agricultural settings, document that drone-enabled inventory audit procedures reduce inventory count time from 681 h to 19 h. They also document a reduction in error rates from 0.15 per cent to 0.03 per cent relative to manual procedures. This evidence provides the foundational empirical grounding that drone-based asset verification can be operationalised within professional audit contexts. Ref. [19] validate IoT-integrated multi-drone trajectory coordination, achieving F-measure performance exceeding 88 per cent across emergency and complex industrial environments. This capability enables simultaneous drone deployment across geographically dispersed Saudi industrial assets without human scheduling intervention. Ref. [68] extend this evidence base by validating drone routing optimisation for stocktaking applications directly comparable to the asset-verification function performed by Segment 1 of the proposed platform. Ref. [39] further confirm, through a mixed-method study of AI-enabled drones in manufacturing process audits, that operational adoption is feasible when accompanied by appropriate enabling conditions. These efficiency gains are not additive within the platform architecture. They are multiplicative: Stage 1 financial reconciliation, Stage 2 autonomous drone deployment, Stage 3 multi-source data fusion and Stage 4 AI-powered verification operate as a continuous, integrated pipeline rather than as sequential human-directed procedures.
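
For orientation, the relative size of the reductions reported in [13] follows directly from the figures quoted above. The calculation below is simple arithmetic on those figures, not an additional result from the source.

```latex
\[
\frac{681 - 19}{681} = \frac{662}{681} \approx 0.972,
\qquad
\frac{681}{19} \approx 35.8
\]
\[
\frac{0.15 - 0.03}{0.15} = 0.80
\]
```

Count time therefore falls by roughly 97 per cent (a factor of about 36), and the error rate falls by 80 per cent in relative terms, or 0.12 percentage points in absolute terms.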

4.2. Closure of Information Asymmetry Channels

The second practical benefit is the closure of information asymmetry channels that the existing audit framework structurally cannot close. Ref. [21] confirm Faster R-CNN biological asset detection accuracy exceeding 97 per cent across UAV imagery of large plantations, eliminating sampling limitations that allow managers to present favourable asset subsets during scheduled inspections. Ref. [20] validate drone thermal-imaging measurement error within ±1.02 per cent and an R² of 0.97 in green-building energy audits. This capability provides continuous, non-relational monitoring that cannot be socially influenced by the long-standing auditor–client relationships identified in [2] as the primary familiarity bias mechanism in the Saudi market. Refs. [37,38] further document, through farmer and regulator perspectives, respectively, that the credibility of drone-derived audit evidence depends substantially on the trust architecture surrounding its deployment. This finding directly motivates the platform’s Trust Expectancy framing. The platform’s social-media cross-verification mechanism, comparing formal regulatory declarations against publicly posted real-estate valuations, closes a dual-channel manipulation pathway that no existing CMA instrument addresses. This integrated data-governance architecture aligns directly with the Saudi national regulatory environment, where the SDAIA framework [4] establishes a mandatory governance baseline for AI deployment in the Kingdom, covering data governance, model accountability, transparency, human oversight and risk management as core pillars of any AI-enabled regulatory instrument.

4.3. Objective Earnings Management Detection

The third benefit is objective, transparent earnings management detection through continuous DACC monitoring. The modified Jones model [60], benchmarked against thresholds established in the foundational audit-quality literature [8,9], transforms auditor assignment from a subjective, reputationally weighted process into a transparent, data-driven governance mechanism. Ref. [59] confirm that this approach is more nuanced than blanket rotation mandates. It enables the platform to distinguish tenure conditions that enhance independence from those that compromise it, a distinction that time-based rotation cannot make. Ref. [33] provide independent corroboration of the broader risk-assessment framework approach to cognitive process automation in audit. Ref. [36] demonstrate that retrieval-augmented generation architectures can produce auditable reasoning chains across audit workflows. The platform’s treatment of the AI Agent as an auditable, non-human identity with traceable reasoning chains aligns with the three-pillar approach adopted by the NIST Centre for AI Standards and Innovation [23], whose AI Agent Standards Initiative is designed to ensure that next-generation autonomous agents can “function securely on behalf of [their] users” [23]. The reasoning-chain transparency that the initiative [23] identifies as foundational to agent accountability is precisely the architectural feature that the CMA platform operationalises through its permanently archived DACC calculation trail. Refs. [30,35] further establish the institutional conditions (Chief Data Officer governance and verifiable AI bills of materials, respectively) that underpin this accountability architecture. These conditions demonstrate that the auditable agentic system is a documented frontier of agentic-AI governance research rather than a speculative proposition. Ref. [69] provide independent empirical evidence that multi-agent AI systems can match professional analyst performance, supporting the platform’s positioning of agentic AI as a quality-comparable rather than quality-displacing audit instrument.
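
As a reminder of what the DACC figure measures, the modified Jones model is commonly stated in the earnings-management literature in the form below. The symbols are the conventional ones and are an assumption here: the platform's own specification in Section 3.2 is not reproduced in this part of the paper and takes precedence wherever the notation differs.

```latex
% Estimation step: coefficients fitted per industry-year
\[
\frac{TA_{it}}{A_{i,t-1}}
  = \alpha_1 \frac{1}{A_{i,t-1}}
  + \alpha_2 \frac{\Delta REV_{it}}{A_{i,t-1}}
  + \alpha_3 \frac{PPE_{it}}{A_{i,t-1}}
  + \varepsilon_{it}
\]
% Non-discretionary accruals, adjusting revenue growth for receivables
\[
NDA_{it}
  = \hat{\alpha}_1 \frac{1}{A_{i,t-1}}
  + \hat{\alpha}_2 \frac{\Delta REV_{it} - \Delta REC_{it}}{A_{i,t-1}}
  + \hat{\alpha}_3 \frac{PPE_{it}}{A_{i,t-1}}
\]
% Discretionary accruals are the unexplained remainder
\[
DACC_{it} = \frac{TA_{it}}{A_{i,t-1}} - NDA_{it}
\]
```

Here $TA_{it}$ denotes total accruals for firm $i$ in year $t$, $A_{i,t-1}$ lagged total assets, $\Delta REV_{it}$ and $\Delta REC_{it}$ the changes in revenue and receivables, and $PPE_{it}$ gross property, plant and equipment. Each of these inputs, together with the fitted coefficients, is the kind of item an archived calculation trail would need to record for a DACC value to be reconstructed.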

4.4. Independently Verified ESG Assurance

The fourth benefit is independently verified ESG assurance. Drone-based building thermal imaging and aerial energy auditing studies in the corpus [20,42,43,44,45] collectively establish the technical foundation for the objective, continuously updated physical evidence that [50] identify as the most significant unmet need in sustainable finance assurance. This need is directly applicable to Saudi Vision 2030 sustainability reporting. Ref. [46] provide independent evidence that UAV-based verification can be integrated with blockchain-anchored evidence records to produce immutable physical-audit trails, addressing the chain-of-custody concerns that ISA-admissibility frameworks raise. Operational viability across the Kingdom’s most challenging industrial geographies (desert heat, offshore humidity and dust) is identified in the manuscript’s methodology section as a context-specific calibration requirement rather than as a resolved technical condition. The supplier-accountability principle that underpins this physical-verification architecture is consistent with the conclusion of the UK ICO [24], which finds that “organisations remain responsible for data protection compliance of the agentic AI they develop, deploy or integrate” [24]. This principle maps directly onto the CMA platform’s Trust Expectancy framing, in which the platform operator, rather than the audited entity or the human auditor, bears non-delegable responsibility for the integrity of the autonomous evidence the platform generates.

4.5. Alignment with International Regulatory Frontier

Finally, these four benefits address all three research questions this paper poses: closing the governance vacuum through Algorithmic Accountability architecture, resolving the trust gap through transparent DACC reasoning chains, and eliminating the ESG assurance deficit through autonomous physical verification. Together, they convert theoretical governance propositions into a practically attainable, empirically grounded regulatory instrument. This alignment is not coincidental. The platform’s four-stage decision architecture (risk assessment, human accountability checkpoints, technical controls throughout the agent lifecycle and end-user transparency) mirrors the four dimensions of the world’s first dedicated governance framework for agentic AI, launched at the World Economic Forum in January 2026 by Singapore’s Ministry of Digital Development and Information and the Infocomm Media Development Authority [22]. That framework explicitly warns that increased agent autonomy creates “challenges for effective human accountability”. Ref. [70] further document the institutional conditions under which generative AI integrates into corporate decision processes, providing comparable case-study evidence for the institutional-readiness conditions the proposed platform requires for adoption. The leadership-centred accountability framework of [10], the dual-mode retrieval architecture for zero-touch regulatory compliance demonstrated in [34], and the bibliometric finding in [56] that artificial intelligence is the highest-frequency cluster across Saudi financial-reporting research collectively confirm that the proposed CMA Agentic Platform represents not a departure from emerging international regulatory practice but a practical and Kingdom-specific application of it to the structural governance gaps of Saudi Arabia’s concentrated capital market.

5. Limitations and Future Research

5.1. Limitations and Mitigations

The study has four categories of limitations. Firstly, the methodological limitations relate to the bibliometric source coverage and the design science framing. The bibliometric analysis draws solely on Web of Science and does not incorporate Scopus, Google Scholar or other databases. This choice is justified by the database’s metadata consistency and reproducibility (see the subsection Justifications for Using Web of Science), but it may exclude relevant studies indexed elsewhere. Most peer-reviewed empirical sources originate outside Saudi Arabia, limiting direct contextual transferability. The conceptual design approach [6] precludes empirical testing of governance outcomes, and the documentary evidence reflects technology and regulatory developments current to May 2026. Secondly, the DACC model carries documented limitations. The modified Jones model may misclassify legitimate business changes as discretionary accruals, particularly in environments characterised by concentrated ownership, related-party transactions and rapid sector growth. This limitation is addressed through the composite signalling framework specified in Section 3.2.6, which runs three accrual estimation models concurrently and flags governance actions only where the three converge. Thirdly, the platform faces a manipulation displacement risk. Firms anticipating DACC triggers may shift toward real earnings management (operational decisions that affect cash flows rather than accruals), which accrual-based detection cannot identify. This is a documented limitation of any accrual-based monitoring system and a priority area for future research. Finally, the platform carries technical and operational constraints. Cyber-attack vulnerabilities, the unsettled legal status of autonomous DACC under ISA and GAAS, training-data bias, weather constraints and the absence of pilot deployment all remain unresolved. These constraints require further empirical and regulatory research before full operational legitimacy is achieved.
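
The convergence rule mentioned under the second limitation can be sketched as follows. This is a minimal illustration with labelled assumptions: the function name `composite_signal` and the model labels are hypothetical, each estimate is assumed to be an unsigned DACC magnitude, and "converge" is read here as all three estimates exceeding the same threshold, which may differ from the rule in Section 3.2.6.

```python
from typing import Mapping


def composite_signal(estimates: Mapping[str, float], threshold: float) -> bool:
    """Flag a governance action only when all three accrual models agree
    that DACC exceeds the threshold."""
    if len(estimates) != 3:
        raise ValueError("expected estimates from exactly three models")
    return all(value > threshold for value in estimates.values())


# Hypothetical model labels; only the first is named in the source.
agree = {"modified_jones": 0.09, "model_b": 0.10, "model_c": 0.085}
split = {"modified_jones": 0.09, "model_b": 0.05, "model_c": 0.085}
```

With a threshold of 0.08, `composite_signal(agree, 0.08)` raises a flag while `composite_signal(split, 0.08)` does not. Requiring unanimity trades some sensitivity for fewer false alarms caused by one model misreading a legitimate business change.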

5.2. Future Research Directions

The platform’s design and acknowledged limitations generate five priority research directions, summarised in Table 11. Each direction targets a specific unresolved question that must be answered before the platform achieves full regulatory legitimacy within the Saudi audit ecosystem. Firstly, the legal liability question concerns the judicial validation of autonomously calculated DACC as admissible evidence for mandating auditor rotation. No existing Saudi regulatory instrument currently addresses this question, and resolving it requires coordinated standard-setting between the CMA and SOCPA. Secondly, the auditor-market effects question requires an empirical pre-post study to determine whether platform-based DACC monitoring genuinely reduces earnings management across the market, or simply displaces it toward real activities that accrual analysis cannot detect. This study would compare DACC trajectories before and after platform deployment across a representative sample of CMA-regulated engagements. Thirdly, the algorithmic bias question requires cross-sectional validation to establish whether DACC models systematically disadvantage particular sectors or ownership types. This validation must be completed before uniform threshold application across the full Saudi listed-company universe can be justified. Fourthly, the trust calibration question requires survey and experimental research with CMA officials and SOCPA members. The research aim is to determine whether practitioner-acceptable DACC thresholds align with academic benchmarks or require market-specific adjustment for the Saudi audit context. Fifthly, the DACC and drone verification correlation question requires a longitudinal study linking drone deployment to DACC trends. The study would test whether autonomous physical verification closes the gaps that managers previously exploited for accrual manipulation.

6. Conclusions

6.1. Theoretical Contributions

This study advances three theoretical contributions to the literature on audit governance, agentic AI, and capital-market regulation. Firstly, the study introduces the concept of Algorithmic Accountability as a distinct governance domain: the legal, professional and evidentiary standards that apply to autonomously generated audit evidence and to the chain of responsibility when AI agents act without continuous human oversight. Existing audit-governance frameworks address human auditor accountability (ISA 200; PCAOB AS 1015) and AI-tool accountability under data-protection regimes [24], but no current framework theorises the accountability of autonomous audit agents specifically. The Algorithmic Accountability concept fills this gap by extending the same professional scrutiny applied to financial records to the autonomous systems that examine them, and by specifying the regulatory chain through which that scrutiny operates.

Secondly, the study develops the TAF as a governance architecture that distributes authority, responsibility and accountability across three roles: the Principal, the Human Agent and the AI Agent. This framework extends classical agency theory [71], which models two-party principal–agent relationships, to accommodate a third autonomous actor whose decisions are consequential, traceable and subject to its own accountability standards. The framework is not a metaphorical extension but an operational specification. The AI Agent’s role, authority limits, refusal thresholds and reasoning-chain transparency requirements are each defined as architectural features rather than aspirational principles. This makes the framework applicable to other regulatory domains beyond audit (including environmental compliance verification, healthcare quality assurance and supply-chain due diligence) in which autonomous monitoring is operationally feasible but governance frameworks remain undeveloped.

Thirdly, the study reframes the UTAUT trust construct for the agentic-AI context by positioning Trust Expectancy, defined as the willingness to delegate legally consequential decisions to autonomous systems, as the primary adoption barrier in professional contexts. Standard UTAUT models [72] treat trust as a moderator of behavioural intention. The present study repositions trust as a precondition for institutional adoption and identifies the architectural mechanisms (reasoning-chain transparency, refusal thresholds, human override authority and regulatory recognition) through which Trust Expectancy can be built rather than merely measured. This contribution is particularly relevant to research on AI adoption in mature professions where the consequences of error are legally and reputationally significant, including auditing, medicine, law and financial advisory services.

6.2. Practical Implications

The study generates direct implications for four constituencies in the Saudi capital-market ecosystem and, by extension, for regulators in comparable concentrated markets. Firstly, for the Capital Market Authority (CMA), the platform specifies an implementable governance architecture for autonomous asset verification and DACC-driven auditor rotation that can be piloted within the existing sixteen-firm regulatory perimeter without requiring new primary legislation. The DACC threshold ladder (Section 3.2.2, Table 8), the refusal-threshold architecture (Section 3.4.2), and the human-in-the-loop oversight protocol (Section 3.2.3, Step 5) are designed for direct adoption as CMA regulatory instruments. The platform’s social-media cross-verification mechanism (Section 3.3.3) addresses a dual-channel manipulation pathway that no current CMA instrument detects.

Secondly, for the Saudi Organisation for Chartered and Professional Accountants (SOCPA), the study identifies the standard-setting decisions required for autonomously generated audit evidence to achieve admissibility under Saudi auditing standards. Most consequentially, formal SOCPA validation of DACC as admissible evidence for auditor performance assessment is identified in Section 3.5 (Trust Expectancy and DACC Transparency) as the institutional precondition for practitioner acceptance of autonomous earnings-management detection.

Thirdly, for Saudi audit firms—both Big Four and non-Big Four—the platform reframes competitive positioning. The previous positioning organised around firm size and reputation is replaced by one organised around continuously monitored, algorithmically verified audit-quality outcomes (Section 3.2.5). Non-Big Four firms that consistently deliver low DACC outcomes gain regulatory recognition through higher assignment scores. Big Four firms whose client portfolios show elevated DACC face the same scrutiny regardless of market position. The practical implication for firms is that investment in AI-integrated audit infrastructure becomes a quality-signalling mechanism rather than a discretionary efficiency choice.

Lastly, for international regulators in comparable markets—including the broader GCC, Singapore and emerging Asian financial centres—the platform offers a transferable architecture for integrating autonomous physical verification with quantitative earnings-management monitoring within concentrated capital-market environments. The four government regulatory frameworks anchoring the platform’s governance design [4,22,23,24] provide the international reference points against which Kingdom-specific implementation can be benchmarked, and against which other jurisdictions can position their own agentic-AI audit-governance instruments.

6.2.1. Implementation Risks

The platform’s pathway to operational deployment must address five categories of implementation risk identified across the documentary evidence base and the Saudi regulatory environment. Firstly, the technical integration risk concerns the platform’s connection to institutional data infrastructure. Connecting the platform to the data infrastructure of the Zakat, Tax and Customs Authority, the banking sector and the CMA’s own listed-company records requires cross-agency data-sharing memoranda, privacy-preserving query architectures and access-control protocols compatible with the Saudi Personal Data Protection Law (PDPL). Inadequate integration design at this stage could either compromise data security or fragment the verification evidence base in ways that undermine the platform’s analytical accuracy.

Secondly, the training and oversight capacity risk concerns the CMA oversight committee’s interpretive readiness. The committee responsible for reviewing the platform’s recommendations under the human-in-the-loop protocol (Section 3.2.3, Step 5) requires specialised training in algorithmic-recommendation interpretation, DACC trajectory analysis and override-justification documentation. Without sufficient committee capacity, the platform’s transparent reasoning chains risk being treated as opaque outputs rather than as the auditable evidence the architecture is designed to provide.

Thirdly, the regulatory coordination risk concerns the multi-authority rule-making required for operational deployment. The platform’s deployment requires coordinated rule-making across the CMA, SOCPA, the General Authority of Civil Aviation (GACA) and SDAIA. CMA rule-making must authorise platform recommendations as advisory input to existing assignment processes. SOCPA must formally validate DACC as admissible evidence for auditor performance assessment. GACA must register drone operations and integrate restricted-airspace protocols for offshore and petrochemical sites. SDAIA must confirm the platform’s compliance with its AI governance framework [4]. Coordinated rule-making across four authorities is the most significant non-technical risk to deployment.

Fourthly, the cyber-security and operational risk concerns the platform’s attack surfaces and adversarial-robustness requirements. The platform’s drone systems, cross-source data-fusion infrastructure and DACC computational pipelines each present cyber-attack surfaces requiring continuous security monitoring, incident-response protocols and adversarial-robustness testing of the underlying machine-learning models. These operational protocols must be specified in cooperation with SDAIA’s cyber-security guidance before pilot deployment.

Finally, the data-protection and PDPL compliance risk concerns the platform’s processing of personal and commercially sensitive data. Personal and commercially sensitive data flowing through the platform—including bank mortgage valuations, real-estate agent valuations, social-media content and firm-level financial records—must be processed under PDPL data-minimisation, purpose-limitation and lawful-processing principles. Privacy-impact assessments are required at each integration stage. Failure to satisfy PDPL conditions would invalidate the platform’s evidentiary outputs regardless of their technical accuracy.

6.2.2. Pathway to Pilot Deployment

The platform’s translation from conceptual specification to operational deployment is structured as a five-stage trajectory, with each stage producing a defined deliverable that conditions progression to the next.

Stage 1: CMA stakeholder consultation. The first stage convenes the CMA, SOCPA, GACA, SDAIA and a sample of approved external audit firms to review the platform’s architecture, refine the governance protocols and confirm the institutional preconditions for pilot deployment. The deliverable is a multi-agency memorandum of understanding establishing the regulatory authority and data-sharing terms required for Stage 2.

Stage 2: Limited pilot within selected audit firms. The platform is deployed in a controlled pilot covering two to three CMA-approved audit firms—ideally a mix of Big Four and non-Big Four firms—across a small selection of listed companies in different industrial sectors (for example, one petrochemical, one cement and one construction). DACC monitoring runs on archival financial data; drone-based asset verification is tested on consenting client sites; and platform recommendations are generated but not enforced. The deliverable is a pilot evaluation report assessing technical performance, recommendation accuracy and stakeholder acceptance.

Stage 3: Regulatory experimentation through CMA sandboxing. The CMA’s regulatory sandbox framework is extended to the platform, enabling controlled application of platform recommendations to a defined population of listed companies under modified regulatory authority. Sandboxed operation generates evidence on how platform recommendations interact with existing CMA decision processes, where human oversight is most consequential, and how firms and auditors respond to algorithmically generated governance signals. The deliverable is a sandbox evaluation report.

Stage 4: SOCPA standard-setting and CMA rule-making. SOCPA conducts a standard-setting consultation on the admissibility of autonomously generated DACC calculations as evidence for auditor performance assessment, drawing on Stage 3 sandbox findings. In parallel, the CMA conducts rule-making to integrate platform recommendations into the formal auditor-assignment process under the existing listed-company governance framework. The deliverable is a SOCPA standard and a CMA regulation authorising operational deployment.

Stage 5: Full operational deployment. The platform enters operational service across the full sixteen-firm regulatory perimeter and the complete listed-company universe. Continuous monitoring, periodic model recalibration, drift detection and ongoing CMA oversight committee review constitute the steady-state operational architecture.
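
The gating logic of this pathway, in which each stage's deliverable conditions progression to the next, can be illustrated with a minimal Python sketch. The stage names and deliverables are taken from the text above; the class name `DeploymentPathway` and its methods are hypothetical and are not part of the platform specification.

```python
# Stage names and deliverables follow the five-stage pathway described above.
# Stage 5 is the steady state, so no gating deliverable is attached to it.
STAGES = [
    ("CMA stakeholder consultation", "multi-agency memorandum of understanding"),
    ("Limited pilot within selected audit firms", "pilot evaluation report"),
    ("Regulatory experimentation through CMA sandboxing", "sandbox evaluation report"),
    ("SOCPA standard-setting and CMA rule-making", "SOCPA standard and CMA regulation"),
    ("Full operational deployment", None),
]


class DeploymentPathway:
    """Hypothetical tracker: a stage opens only once the previous deliverable is accepted."""

    def __init__(self) -> None:
        self.completed = 0  # number of stages whose deliverable has been accepted

    @property
    def current_stage(self) -> int:
        """1-based number of the stage currently in progress."""
        return min(self.completed + 1, len(STAGES))

    def accept_deliverable(self, stage_number: int) -> str:
        """Accept the deliverable of the current stage and unlock the next one."""
        if stage_number != self.current_stage:
            raise ValueError(
                f"Stage {stage_number} cannot be closed while stage "
                f"{self.current_stage} is in progress"
            )
        _, deliverable = STAGES[stage_number - 1]
        if deliverable is None:
            raise ValueError("The final stage is the steady state and has no deliverable")
        self.completed += 1
        return deliverable


pathway = DeploymentPathway()
pathway.accept_deliverable(1)          # memorandum of understanding accepted
print(STAGES[pathway.current_stage - 1][0])
```

Skipping a stage, for example closing Stage 4 while Stage 2 is still in progress, raises an error in this sketch, which mirrors the sequential conditioning described in the text.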

6.3. Concluding Reflections

Through a conceptual design study supported by a PRISMA-guided bibliometric review and a documentary analysis of national agentic-AI regulatory frameworks, this research has translated the governance challenges of Saudi Arabia’s concentrated audit market into a specified regulatory architecture. The CMA Agentic AI Platform operationalises the TAF, distributing authority, accountability and refusal thresholds across the Principal, the Human Agent and the AI Agent. It supports Algorithmic Accountability for autonomously generated audit evidence within the CMA’s existing regulatory perimeter. Its continuous DACC monitoring framework complements time-based rotation with an objective, evidence-based mechanism that preserves auditor independence by calculating DACC continuously, benchmarking it against industry and peer norms, and applying graduated governance thresholds that engage directly with the theoretical tension between familiarity bias and predecessor-tenure commitment signalling. Agentic Drone Swarms are specified as a verification mechanism for ESG disclosures, supported by independently validated technical capabilities and aligned with the international regulatory frameworks anchoring the platform’s governance design (SDAIA, MDDI/IMDA, NIST, ICO). Together, these components convert what began as a set of governance challenges into a transparently reasoned, empirically grounded and internationally aligned proposal. The platform suggests that autonomous audit governance in concentrated capital markets may be not merely conceivable but also specifiable, defensible and ready for regulatory engagement.

Author Contributions

Conceptualisation, S.M.N.I., M.P. and A.H.J.A.; introduction, A.H.J.A.; literature review, A.H.J.A.; methodology, A.H.J.A.; results, A.H.J.A.; discussion, A.H.J.A.; formal analysis, A.H.J.A.; data curation, A.H.J.A.; conclusions, A.H.J.A.; writing—original draft preparation, A.H.J.A.; writing—review and editing, A.H.J.A. All authors have read and agreed to the published version of the manuscript.

Funding

The authors extend their appreciation to the Deanship of Scientific Research at Northern Border University, Arar, KSA, for funding this research work through the project number “NBUSAFIR-2026”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analysed in this study. Data sharing is not applicable.

Acknowledgments

During the preparation of this manuscript, the corresponding author (A.H.J.A.) used Grammarly Premium (web version, accessed June 2026) for editing and proofreading of the text, and FigureLabs (web version, accessed June 2026), an AI-assisted figure-generation tool, to support the visual rendering and layout of Figure 1, Figure 2 and Figure 3. The authors developed the conceptual content, system architecture and analytical structure of each figure. The authors have reviewed and edited all outputs and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

TAF	Triadic Agentic Framework
AAI	Agentic Artificial Intelligence
AI	Artificial Intelligence
ANDT	Anomaly Detection with Transformers
ANFIS	Adaptive Neuro-Fuzzy Inference System
AUC	Area Under Curve
BIM	Building Information Modelling
CASP	Critical Appraisal Skills Programme
CMA	Capital Market Authority
CNN	Convolutional Neural Network
DACC	Discretionary Accruals
ESG	Environmental, Social, and Governance
FC	Facilitating Conditions
ISA	International Standards on Auditing
LiDAR	Light Detection and Ranging
LLM	Large Language Model
AP	Average Precision
MRR	Mean Reciprocal Rank
NLP	Natural Language Processing

Appendix A

Figure A1. PRISMA 2020 flow diagram for the systematic literature search conducted on 29 May 2026 within the Web of Science Core Collection. The asterisk (*) within the search string denotes the Web of Science truncation wildcard, which matches all word-stems beginning with the indicated prefix (for example, audit* matches audit, audits, audited, auditing, auditor and auditors). The dashed-line frame indicates pathways from the standard PRISMA 2020 template that are not applicable to this review, because the search drew on a single database and did not include records identified from other sources such as citation searching, registers or grey literature. Source: Authors’ own work, adapted from the PRISMA 2020 template [25].

Figure A2. Keyword Co-occurrence Network. Source: Generated by the authors using VOSviewer software.

Appendix B

Table A1. Eligibility Exclusion Criteria.

N	Exclusion Criterion	Excluded
1	Telecommunications/wireless network applications (5G, 6G, multi-LLM networks) with no audit, accounting, financial reporting, or assurance anchor	3
2	Manufacturing process control or zero-defect production systems without audit/assurance application	1
3	Civil, structural, or seismic engineering applications without audit context	1
4	Medical or clinical coding (e.g., ICD-10), where “audit” refers to clinical-record review rather than financial or assurance audit	1
5	Marketing, branding, or consumer-behaviour applications with no audit, accounting, or assurance anchor	1
6	Scientific experimentation (e.g., physics, instrumentation) outside accounting, audit, or financial-reporting domains	1
7	Public-sector registration, property transfer, or asset-tokenisation systems without audit or financial-reporting context	1
8	Pure formal-logic, theoretical, or computer-science methodology papers without application to audit or accounting	1
9	Methodological datasets or benchmarking papers studying AI agents in general, with no audit, accounting, or assurance application	1
10	Information-retrieval or chatbot-performance studies (e.g., bibliographic accuracy tests) outside audit/accounting domains	1
	Total excluded at Stage 5	12

References

1. Al-Ajmi, J. Audit firm, corporate governance, and audit quality: Evidence from Bahrain. Adv. Account. 2009, 25, 64–74.
2. Alhazmi, A.H.J.; Islam, S.; Prokofieva, M. The impact of changing external auditors, auditor tenure, and audit firm type on the quality of financial reports on the Saudi Stock Exchange. J. Risk Financ. Manag. 2024, 17, 407.
3. Aljifri, K.; Moustafa, M. The impact of corporate governance mechanisms on the performance of UAE firms. J. Econ. Adm. Sci. 2007, 23, 71–93.
4. Saudi Data and Artificial Intelligence Authority. AI Adoption Framework; SDAIA: Riyadh, Saudi Arabia, 2025. Available online: https://sdaia.gov.sa (accessed on 11 May 2026).
5. Rahman, F.R. Flying high: How drones are optimizing Aramco’s operations. Elements Magazine, 25 April 2024. Available online: https://www.aramco.com/en/news-media/elements-magazine/2024/flying-high-how-drones-are-optimizing-aramcos-operations (accessed on 25 April 2024).
6. Hevner, A.R.; March, S.T.; Park, J.; Ram, S. Design science in information systems research. MIS Q. 2004, 28, 75–105.
7. Peffers, K.; Tuunanen, T.; Rothenberger, M.A.; Chatterjee, S. A design science research methodology for information systems research. J. Manag. Inf. Syst. 2007, 24, 45–77.
8. Alzoubi, E.S.S. Audit quality and earnings management: Evidence from Jordan. J. Appl. Account. Res. 2016, 17, 170–189.
9. Abbott, L.J.; Daugherty, B.; Parker, S.; Peters, G.F. Internal audit quality and financial reporting quality: The joint importance of independence and competence. J. Account. Res. 2016, 54, 3–40.
10. Henderson, M.D. Agentic AI and the ethics of leadership maintenance: Rethinking responsibility in algorithmic organizations. Leadersh. Organ. Dev. J. 2026, 47, 294–308.
11. Resende, M. AI agents and no-code tools in accounting: A case study. FinTech 2025, 4, 65.
12. Van Aken, J.E. Management research as a design science: Articulating the research products of mode 2 knowledge production in management. Br. J. Manag. 2005, 16, 19–36.
13. Christ, M.H.; Emett, S.A.; Summers, S.L.; Wood, D.A. Prepare for takeoff: Improving asset measurement and audit quality with drone-enabled inventory audit procedures. Rev. Account. Stud. 2021, 26, 1323–1343.
14. Critical Appraisal Skills Programme. CASP Qualitative Studies Checklist; CASP UK: Oxford, UK, 2018. Available online: https://casp-uk.net/casp-tools-checklists/ (accessed on 11 May 2026).
15. Nooralishahi, P.; Ibarra-Castanedo, C.; Deane, S.; López, F.; Pilla, S.; Tison, F.; Maldague, X.P.V. Drone-based non-destructive inspection of industrial sites: A review and case studies. Drones 2021, 5, 106.
16. Falorca, J.F.; Lanzinha, J.C.G. Facade inspections with drones—Theoretical analysis and exploratory tests. Int. J. Build. Pathol. Adapt. 2021, 39, 235–258.
17. Seo, J.; Duque, L.; Wacker, J. Drone-Based Close-Range Sensing for Assessment of Timber Bridges; General Technical Report FPL-GTR-261; USDA Forest Service, Forest Products Laboratory: Madison, WI, USA, 2018.
18. Guerrero-Sevilla, D.; Rodríguez-Gómez, R.; Morcillo-Sanz, A.; Gonzalez-Aguilera, D. Optimising construction site auditing: A novel methodology integrating ground drones and building information modelling (BIM) analysis. Drones 2025, 9, 277.
19. Al-Janabi, S.; Seyhood, N.G. Optimizing UAV performance with IoT and fuzzy linear fractional transportation models. Results Eng. 2024, 24, 103306.
20. Khan, O.; Parvez, M.; Alansari, M.; Farid, M.; Devarajan, Y.; Thanappan, S. Application of artificial intelligence in green building concept for energy auditing using drone technology under different environmental conditions. Sci. Rep. 2023, 13, 8200.
21. Liu, X.; Ghazali, K.H.; Han, F.; Mohamed, I.I. Automatic detection of oil palm tree from UAV images based on the deep learning method. Appl. Artif. Intell. 2021, 35, 13–24.
22. Ministry of Digital Development and Information; Infocomm Media Development Authority. Model AI Governance Framework for Agentic AI; MDDI/IMDA: Singapore, 2026. Available online: https://www.mddi.gov.sg/newsroom/singapore-launches-new-model-ai-governance-framework-for-agentic-ai--/ (accessed on 11 May 2026).
23. National Institute of Standards and Technology. AI Agent Standards Initiative; CAISI, NIST: Gaithersburg, MD, USA, 2026. Available online: https://www.nist.gov/artificial-intelligence/ai-agent-standards-initiative (accessed on 11 May 2026).
24. Information Commissioner’s Office. ICO Tech Futures: Agentic AI; ICO: Wilmslow, UK, 2026. Available online: https://ico.org.uk/about-the-ico/research-reports-impact-and-evaluation/research-and-reports/technology-and-innovation/tech-horizons-and-ico-tech-futures/ico-tech-futures-agentic-ai/ (accessed on 11 May 2026).
25. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
26. Bartolacci, F.; Caputo, A.; Soverchia, M. Sustainability and financial performance of small and medium-sized enterprises: A bibliometric and systematic literature review. Bus. Strategy Environ. 2020, 29, 1297–1309.
27. Mongeon, P.; Paul-Hus, A. The journal coverage of Web of Science and Scopus: A comparative analysis. Scientometrics 2016, 106, 213–228.
28. Boyack, K.W.; Klavans, R.; Börner, K. Mapping the backbone of science. Scientometrics 2005, 64, 351–374.
29. Alhazmi, A.H.J.; Islam, S.; Prokofieva, M. The impact of AI-integrated drone technology and big data on external auditing performance, sustainability, and financial reporting quality in an emerging market. Account. Audit. 2025, 1, 8.
30. van Eck, N.J.; Waltman, L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 2010, 84, 523–538.
31. Stratopoulos, T.C.; Wang, V.X. Artificial intelligence and accounting research: A framework and agenda. Int. J. Account. Inf. Syst. 2025, 56, 100760.
32. Beulen, E.; Dans, M. Artificial intelligence governance mechanisms—The Chief Data Officer perspective with a focus on agentic AI governance. Information 2026, 17, 336.
33. Cheong, A.; Kassar, M.; Li, H.X. A risk assessment framework for cognitive process automation in audit. J. Emerg. Technol. Account. 2026, 1–12.
34. Ameur, M.; Brik, B.; Ksentini, A. EUROCOMPLY: Enabling zero-touch AI compliance auditing via LLM-based agentic AI. IEEE Commun. Mag. 2026, 1–7.
35. Radanliev, P.; Santos, O.; Maple, C.; Atefi, K. Operationalising artificial intelligence bills of materials for verifiable AI provenance and lifecycle assurance. Front. Comput. Sci. 2026, 8, 1735919.
36. Xiong, F.B.; Han, Q.H.; Zhang, C.N. Design AI agent for auditing: Applying large language models (LLMs) and retrieval-augmented generation (RAG) to audit workflows. J. Emerg. Technol. Account. 2026, 23, 189–198.
37. Lucock, X.; Westbrooke, V. Trusting in the eye in the sky? Farmers’ and auditors’ perceptions of drone use in environmental auditing. Sustainability 2021, 13, 13208.
38. Westbrooke, V.; Lucock, X.; Greenhalgh, I. Drone use in on-farm environmental compliance: An investigation of regulators’ perspectives. Sustainability 2023, 15, 2153.
39. Shankar, A.; Behl, A.; Pereira, V.; Chavan, M.; Chirico, F. Exploring enablers and inhibitors of AI-enabled drones for manufacturing process audits: A mixed-method approach. Bus. Strategy Environ. 2024, 33, 3749–3768.
40. Musa, A.M.H. Detecting the effect of artificial intelligence on internal audit performance: Empirical study in Saudi Arabia. Decis. Sci. Lett. 2024, 13, 967–976.
41. Hou, Y.; Volk, R.; Chen, M.D.; Soibelman, L. Fusing tie points’ RGB and thermal information for mapping large areas based on aerial images: A study of fusion performance under different flight configurations and experimental conditions. Autom. Constr. 2021, 124, 103554.
42. Hou, Y.; Chen, M.D.; Volk, R.; Soibelman, L. Investigation on performance of RGB point cloud and thermal information data fusion for 3D building thermal map modeling using aerial images under different experimental conditions. J. Build. Eng. 2022, 45, 103380.
43. Bayomi, N.; Nagpal, S.; Rakha, T.; Fernandez, J.E. Building envelope modeling calibration using aerial thermography. Energy Build. 2021, 233, 110648.
44. Daffara, C.; Muradore, R.; Piccinelli, N.; Gaburro, N.; de Rubeis, T.; Ambrosini, D. A cost-effective system for aerial 3D thermography of buildings. J. Imaging 2020, 6, 76.
45. Mayer, Z.; Epperlein, A.; Vollmer, E.; Volk, R.; Schultmann, F. Investigating the quality of UAV-based images for the thermographic analysis of buildings. Remote Sens. 2023, 15, 301.
46. Fernández-Caramés, T.M.; Blanco-Novoa, O.; Froiz-Míguez, I.; Fraga-Lamas, P. Towards an autonomous Industry 4.0 warehouse: A UAV and blockchain-based system for inventory and traceability applications in big data-driven supply chain management. Sensors 2019, 19, 2394.
47. Appelbaum, D.; Nehmer, R.A. Using drones in internal and external audits: An exploratory framework. J. Emerg. Technol. Account. 2017, 14, 99–113.
48. Ahmed, I. Database-native reasoning: Treating the database as the cognitive substrate for AI systems. IEEE Access 2026, 14, 54912–54921.
49. Montero, A.; Rodríguez, S.; Sánchez, F.; Yébenes, A. Self-organization through a multi-agent system for orders distribution in large companies. ADCAIJ Adv. Distrib. Comput. Artif. Intell. J. 2018, 7, 65–71.
50. Bawack, R.E.; Bawack, E.B.; Seny Kan, K.A. Artificial intelligence in sustainable finance and accounting: A bibliometric analysis and future research agenda. Inf. Syst. Front. 2026, 1–32.
51. Leung, P.; Ilsever, J. Auditing and Assurance Services in Australia, 5th ed.; McGraw-Hill Education: Sydney, Australia, 2013.
52. Kim, S.; Downen, T.; Kang, H. Better sooner than later? Effects of adopting drone-enabled inventory observation on auditor liabilities. Manag. Audit. J. 2026, 41, 557–578.
53. Kaur, R.; Kundu, T.; Sharma, B.; Park, K.M.; Pinsky, E. Operational resilience under carbon constraints: A socio-technical multi-agentic approach to global supply chains. Systems 2026, 14, 374.
54. Alharbi, Y. Audit Data Analytics, the Transformation within the Audit Profession: Perspectives from the Kingdom of Saudi Arabia. Ph.D. Thesis, RMIT University, Melbourne, Australia, 2023.
55. Islam, M.A.; Somu, S.; Aldaihani, F.M.F. The rise of agentic AI: Synthesis of current knowledge and future research agenda. Glob. Bus. Organ. Excel. 2026, 45, 402–416.
56. Alhazmi, A.H.J.; Islam, S.M.N.; Prokofieva, M. The impact of artificial intelligence adoption on the quality of financial reports on the Saudi Stock Exchange. Int. J. Financ. Stud. 2025, 13, 21.
57. Muhamat, A.A.; Zulkifli, A.F.; Sulaiman, S.; Subramaniam, G.; Mohamad, S. Development of social cost and benefit analysis (SCBA) in the Maqasid Shariah framework: Narratives on the use of drones for takaful operators. J. Risk Financ. Manag. 2021, 14, 387.
58. Qasim, A.; El Refae, G.A.; Eletter, S. A proposed model to integrate drone technology in accounting for long-term contracts: A cash flow management perspective. Int. Arab J. Inf. Technol. 2023, 20, 488–495.
59. Burnett, B.M.; Martin, G.W.; Reppenhagen, D.A.; Tanyi, P. Incumbent auditor independence and predecessor auditor tenure. J. Account. Audit. Financ. 2026, 41, 287–316.
60. Dechow, P.M.; Sloan, R.G.; Sweeney, A.P. Detecting earnings management. Account. Rev. 1995, 70, 193–225.
61. Jones, J.J. Earnings management during import relief investigations. J. Account. Res. 1991, 29, 193–228.
62. IBM. The Four V’s of Big Data. 2014. Available online: http://www.ibmbigdatahub.com/infographic/four-vs-big-data (accessed on 1 January 2014).
63. Basu, P. Governing agentic performance management: A sociotechnical framework for organisational effectiveness. Int. J. Organ. Anal. 2026, 1–18.
64. Klius, Y.; Ivchenko, Y.; Izhboldina, A.; Ivchenko, Y. International approaches to organizing an internal control system at an enterprise in the digital era. Econ. Ann.-XXI 2020, 185, 145–158.
65. Navaie, K. From rights to runtime: Privacy engineering for agentic AI. AI Mag. 2025, 46, e70036.
66. Liu, P.K.; Tang, P.B.; Liu, J.P.; Hou, Y. Quantifying personality in human–drone interactions for building heat loss inspection with virtual reality training. Adv. Eng. Inform. 2026, 64, 104127.
67. Lee, J.C.K.; Dede, C.; Wang, M.J.; Li, X.F. Building trust in AI through dialogues with Eastern ethics: Toward ethical partnerships in education. IEEE Trans. Learn. Technol. 2025, 18, 833–841.
68. Vichitkunakorn, P.; Emde, S.; Masae, M.; Glock, C.H.; Grosse, E.H. Locating charging stations and routing drones for efficient automated stocktaking. Eur. J. Oper. Res. 2024, 316, 1129–1145.
69. Francis, A.; Zhang, C.X. Can AI match professional analysts? Evidence from a multi-agent system. Financ. Res. Lett. 2026, 103, 110127.
70. Biedova, O.; Junglas, I.; Villafranca, E.; Ives, B. Innovating with generative AI at CVPCorp. Commun. Assoc. Inf. Syst. 2025, 57, 445–457.
71. Jensen, M.C.; Meckling, W.H. Theory of the firm: Managerial behavior, agency costs and ownership structure. J. Financ. Econ. 1976, 3, 305–360.
72. Venkatesh, V.; Morris, M.G.; Davis, G.B.; Davis, F.D. User acceptance of information technology: Toward a unified view. MIS Q. 2003, 27, 425–478.

Figure 1. Autonomous Asset Verification: four-stage workflow of Segment 1 of the CMA Agentic Platform (Stages 1–4: financial data reconciliation, autonomous drone deployment, multi-source data fusion, and AI-powered verification). Arrows indicate the directional flow of data and triggering events; colour bands distinguish the four sequential stages. The Legend of Empirical Validations cites [18,19,20,21,36]. Source: Authors’ own work, prepared with the assistance of FigureLabs (see Acknowledgments).

Figure 2. Agentic AI for Auditor Assignment or Change: the platform’s five-step decision protocol integrating the six evaluation categories specified in Table 7 with the four-band DACC threshold ladder specified in Table 8. Source: Authors’ own work, prepared with the assistance of FigureLabs (see Acknowledgments).

Figure 3. Integration of Internal and External Data Sources in the Agentic AI Platform. The figure illustrates the platform’s multi-source data integration architecture, organised around the Five Vs framework (volume, velocity, variety, veracity, value) [62]. Blue arrows from the left panels show the four internal data sources (ZATCA Authority, Bank Mortgage Data, Loan Firm Data, and Firm Accounts) feeding into the central Agentic AI Platform, summarised in Table 9. Pink arrows from the right panels show the three external data sources (Social Media Content, Published Reports, and Real-Estate Agent Social Accounts), summarised in Table 10. The platform integrates these heterogeneous data streams through continuous autonomous data patrols (Agentic Data Mining), generating outputs that feed the Verified Asset Registry and Audit Dashboard. The annotation boxes around the central platform identify the governance functions delivered by the integration architecture: continuous autonomous data patrols, reduction of information asymmetry, mitigation of moral hazard and manipulation risks, and early anomaly detection. Source: Authors’ own work, prepared with the assistance of FigureLabs (see Acknowledgments).

Table 1. CMA Agentic AI Platform—Core Segments.

Segment	Primary Function	Governance Mechanism
Segment 1: Asset Verification	Autonomous physical verification of corporate assets using drone technology, AI image recognition, and big data reconciliation	TAF with leadership-centred accountability [10]
Segment 2: Auditor Assignment and Change	Data-driven recommendation and execution of external auditor appointments and rotations based on objective performance metrics, asset-verification outcomes and discretionary-accruals monitoring	Algorithmic Accountability framework with refusal thresholds and ethical dashboards

Table 2. Empirical Validation of Segment 1 Capabilities.

Capability	Empirical Validation	Key Finding
Autonomous biological asset counting	[21]	97.79 per cent accuracy across a 22-hectare plantation; full audit completed in 1.5 h, compared with days of manual inspection
Multi-drone coordination without human scheduling	[49]	Organiser agent autonomously assigns tasks across heterogeneous vehicle types; dynamic environmental adaptation without human instruction
Automated construction site auditing	[18]	More than 99 per cent classification accuracy; detects structural translations, overhangs and voids without manual intervention
Thermal auditing under variable conditions	[20]	R² of 0.97; ±1.02 per cent measurement error; identifies invisible structural deterioration that managers may present as fully operational

Table 3. Audit Task 1—Asset Verification and Valuation Data.

Data Type	Purpose	Analytical Use Case	Agency Theory Alignment
Image data	Asset verification	Autonomous detection, counting and location mapping of physical assets via drone imagery cross-referenced against financial records; accuracy exceeding 97 per cent [21]	Reduces information asymmetry: prevents managers from overstating assets or concealing ghost inventory
Video data	Inventory count	Real-time monitoring of asset counts and movements provides visual audit documentation permanently archived in the platform record	Mitigates moral hazard: continuous visual presence discourages managers from manipulating physical stock levels between scheduled inspections
Geospatial data	Location tracking	GPS coordinates confirm that declared assets occupy the registered locations recorded in the CMA asset registry	Monitoring costs: lowers the cost for the principal to verify the agent’s stewardship of remote assets across Saudi Arabia’s diverse industrial geography
Thermal data	Condition assessment	Heat signatures verify the operational status of machinery and infrastructure, detecting idle or deteriorating assets that managers present as fully functional	Adverse selection: ensures auditors identify impaired assets that managers conceal to avoid valuation adjustments under IFRS
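
The geospatial check in Table 3, confirming that a declared asset occupies its registered location, can be sketched as a great-circle distance test. This is an illustrative sketch only: the function names and the 50-metre tolerance are assumptions introduced here and are not specified by the platform.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def location_matches(registered: tuple, observed: tuple, tolerance_m: float = 50.0) -> bool:
    """True if the drone-observed position lies within tolerance_m of the registered one.

    The 50 m default is a hypothetical tolerance chosen for illustration.
    """
    return haversine_m(*registered, *observed) <= tolerance_m


# Hypothetical coordinates: the observed fix is about 11 m north of the registry entry.
registry_entry = (24.7136, 46.6753)
drone_fix = (24.7137, 46.6753)
print(location_matches(registry_entry, drone_fix))
```

In practice the tolerance would need to reflect both GPS accuracy and the physical footprint of the asset class being verified.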

Table 4. Audit Task 2—Compliance and ESG Integrity Data.

Data Type	Purpose	Analytical Use Case	Agency Theory Alignment	Key Reference
Environmental sensor data	Emission tracking	Monitor CO₂ and methane levels across industrial facilities to ensure compliance with Saudi environmental regulations and Vision 2030 sustainability mandates	Mitigates moral hazard: prevents managers from suppressing environmental costs to artificially inflate short-term profits	[50,53]
Geospatial data	Boundary compliance	High-precision GPS confirms that physical activities remain within authorised legal operating boundaries declared in CMA filings	Contractual alignment: provides the principal with objective proof that the agent is operating within regulatory mandates	[54]
Image and video data	Safety compliance	Autonomous safety audits via drone footage verify adherence to occupational health and safety standards on active industrial and construction sites	Reduces information asymmetry: provides first-hand visual evidence that cannot be manipulated in a written management report	[55]
Thermal and infrared data	Safety and hazard inspection	Detects invisible hazards, including overheated machinery, heat leakages, and structural deterioration that jeopardise facility safety and operational integrity	Audit trail for accountability: creates a permanent record for leadership maintenance, ensuring executives are held responsible for asset upkeep under the framework in [10]	[10]

Table 5. Audit Task 3—Risk and Anomaly Detection Data.

Data Type	Purpose	Analytical Use Case	Agency Theory Alignment	Key Reference
Time-series imagery	Change detection	Identifies unauthorised alterations or missing assets by comparing sequential drone imagery across verification cycles against financial records	Fraud deterrence: reduces the information gap that managers exploit to misappropriate or strip company assets between scheduled audit visits	[2]
Video data	Real-time anomaly detection	Transformer-based AI models monitor live drone feeds across diverse industrial environments, identifying deviations from normal operational patterns without requiring predefined anomaly categories	Direct monitoring: the autonomous, unsupervised nature of transformer-based anomaly detection eliminates the human observation gap—the system acts as a permanently vigilant, non-human observer that cannot be socially influenced, fatigued, or manipulated by managerial relationships	[36]
Thermal sensors	Heat anomaly detection	Detects unexpected temperature changes indicating equipment malfunctions, safety risks, and deferred maintenance across industrial infrastructure	Verification of maintenance: ensures the agent is not neglecting asset upkeep to inflate short-term cash flows by concealing deterioration from auditors	[5]
Machine learning	Predictive risk assessment	Historical drone data trains predictive models identifying facilities and asset classes with elevated irregularity probability, enabling proactive rather than reactive governance	Proactive governance: shifts auditing from reactive sampling to predictive continuous monitoring, limiting the window for managerial opportunism	[49,50]
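
The change-detection row of Table 5 compares what sequential verification cycles observe against the financial records. A minimal sketch of that comparison is given below, assuming that an upstream image-recognition step (not shown) has already resolved drone imagery into asset identifiers. The function name, the dictionary keys and the identifiers are hypothetical.

```python
def reconcile_cycle(ledger_ids, previous_cycle_ids, current_cycle_ids):
    """Compare recorded assets with those observed in two verification cycles.

    All three arguments are iterables of asset identifiers. How imagery is
    mapped to identifiers is outside this sketch and is assumed to be reliable.
    """
    ledger = set(ledger_ids)
    previous = set(previous_cycle_ids)
    current = set(current_cycle_ids)
    return {
        # Recorded in the financial records but not observed in this cycle.
        "recorded_not_observed": sorted(ledger - current),
        # Observed on site but absent from the financial records.
        "observed_not_recorded": sorted(current - ledger),
        # Observed in the previous cycle but not in the current one.
        "lost_since_previous_cycle": sorted(previous - current),
    }


# Hypothetical identifiers for illustration.
findings = reconcile_cycle(
    ledger_ids=["A1", "A2", "A3", "A4"],
    previous_cycle_ids=["A1", "A2", "A3"],
    current_cycle_ids=["A1", "A3", "X9"],
)
print(findings)
```

Each non-empty list is a candidate anomaly for human review rather than a finding in itself, since an asset may be legitimately relocated, disposed of or temporarily obscured between cycles.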

Table 6. Audit Task 4—Construction Progress and Quality Control Data.

Data Type	Purpose	Analytical Use Case	Agency Theory Alignment
Image and video data	Progress monitoring	Continuous visual documentation of construction milestone completion, enabling independent verification of the percentage of completion for revenue recognition without requiring physical auditor site presence	Progress entrapment prevention: prevents contractors and managers from overstating construction progress to accelerate cash disbursements from project owners
Geospatial data	Boundary verification	High-precision GPS confirms that construction activity remains within the planned legal boundaries declared in the CMA project documentation	Compliance stewardship: protects the principal from legal liabilities and regulatory fines caused by the agent’s negligent or opportunistic boundary encroachment
LiDAR and 3D mapping	Structural integrity assessment	Maps structural dimensions against engineering specifications to assess build quality and confirm that construction meets contractually agreed standards [18]	Quality verification: ensures the agent is not cutting corners on material quality to save costs or protect performance bonuses at the expense of build integrity
Time-series data	Comparative quality analysis	Sequential drone imagery compared against original blueprints produces an objective, time-stamped, unalterable record of project advancement, ensuring reported completion rates reflect physical reality at each reporting date [58]	Continuous monitoring: provides a transparent and unalterable construction history that eliminates the risk of retrospective data manipulation by management or contractors
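
The boundary-verification row in Table 6 can be illustrated with a short sketch. The example below is not the platform's implementation: it assumes the declared legal boundary is available as a simple polygon, treats GPS fixes as planar coordinates (a reasonable approximation only for small sites), and uses the standard ray-casting test. The names `point_in_polygon` and `find_encroachments` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_encroachments(fixes: List[Point], boundary: List[Point]) -> List[Point]:
    """Return the GPS fixes that fall outside the declared boundary."""
    return [p for p in fixes if not point_in_polygon(p, boundary)]


# Hypothetical declared boundary (a 10 x 10 plot) and two drone GPS fixes.
boundary = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
fixes = [(5.0, 5.0), (15.0, 5.0)]
outside = find_encroachments(fixes, boundary)
```

A production system would work in a projected coordinate system and allow for GPS measurement error near the boundary; both are outside the scope of this sketch.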

Table 7. AI Agent Evaluation Categories for Auditor Assignment and Change.

Evaluation Category	Data Source	Governance Principle	DACC Integration
Audit quality metrics	Historical engagement data, restatement rates and regulatory findings	Performance Expectancy (UTAUT)	Includes historical DACC levels as a primary quality indicator across all previous engagements
Independence indicators	Tenure duration, fee ratios, non-audit service proportions	Familiarity bias detection [2]	Correlates DACC volatility with tenure length to detect whether long engagements are associated with a measurable increase in earnings management
Technological capability	AI, big data, and drone adoption status	Facilitating Conditions (UTAUT)	Assesses whether adoption of AI-powered audit tools is associated with lower DACC outcomes for the firm’s client portfolio
Asset verification accuracy	Reconciliation of platform findings versus auditor-reported assets	Algorithmic Accountability	Flags discrepancies between drone-verified asset values and auditor-accepted financial statement values that may indicate earnings management via asset overstatement
Discretionary accruals (DACC)	Historical financial statements; cross-firm DACC benchmarks; year-on-year DACC changes	Direct earnings management detection	Primary metric for auditor effectiveness: the platform treats DACC trajectory as the most reliable observable proxy for audit quality in the Saudi market context
Market competitiveness	Big Four versus non-Big Four quality gap	Competitive equaliser function	Compares DACC levels across firm types to determine whether non-Big Four adoption of platform capabilities narrows the quality differential
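
As a minimal illustration of the independence indicator in Table 7, the sketch below correlates DACC volatility with tenure length. It rests on assumptions not specified in the article: volatility is taken to be the population standard deviation of each engagement's yearly DACC ratios, and association is measured with a plain Pearson coefficient. The function names and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def tenure_volatility_correlation(
    engagements: List[Tuple[int, List[float]]]
) -> float:
    """Each engagement is (tenure in years, yearly DACC / total assets)."""
    tenures = [float(tenure) for tenure, _ in engagements]
    volatilities = [pstdev(series) for _, series in engagements]
    return pearson(tenures, volatilities)


# Hypothetical engagements: longer tenure paired with more volatile DACC.
engagements = [
    (2, [0.01, 0.01]),
    (5, [0.01, 0.03]),
    (8, [0.00, 0.04]),
]
r = tenure_volatility_correlation(engagements)
```

A coefficient near 1 in real data would be consistent with, but would not by itself establish, a familiarity effect; sector and firm-size controls would be needed before drawing governance conclusions.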

Table 8. DACC Thresholds and Governance Actions.

DACC Level (Absolute Value)	Interpretation	Platform Action
Less than 2 per cent of total assets	Normal earnings management risk—DACC within the range consistent with legitimate accounting discretion	Routine monitoring; no governance action initiated; engagement profile updated in real time
2 per cent to 5 per cent of total assets	Elevated risk—DACC above normal range but below the threshold associated with significant managerial opportunism	Flag for CMA review; platform requests auditor’s explanation of specific accrual items driving the elevation
5 per cent to 8 per cent of total assets	Significant risk—DACC at a level historically associated with material earnings management in the audit quality literature	Automatic recommendation for auditor rotation; CMA oversight committee notified; platform generates full reasoning chain for human review
Greater than 8 per cent of total assets	Severe risk—DACC at a level that the literature associates with active and material manipulation of reported financial performance	Immediate escalation to CMA enforcement; mandatory auditor change initiated; platform suspends routine monitoring and activates emergency governance protocol
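
The bands in Table 8 can be expressed as a simple lookup, sketched below. Two points are assumptions rather than statements from the article: the table does not say which band a value of exactly 5 per cent belongs to, so this sketch assigns each shared boundary to the lower band; and the names `GovernanceAction` and `classify_dacc` are hypothetical. Action strings abbreviate the table entries.

```python
from typing import NamedTuple


class GovernanceAction(NamedTuple):
    risk_level: str
    platform_action: str


def classify_dacc(dacc: float, total_assets: float) -> GovernanceAction:
    """Map |DACC| scaled by total assets to a Table 8 band."""
    if total_assets <= 0:
        raise ValueError("total_assets must be positive")
    ratio = abs(dacc) / total_assets  # the table uses the absolute value
    if ratio < 0.02:
        return GovernanceAction("normal", "routine monitoring")
    if ratio <= 0.05:  # assumption: exactly 5% stays in the elevated band
        return GovernanceAction("elevated", "flag for CMA review")
    if ratio <= 0.08:  # the table reserves "severe" for values above 8%
        return GovernanceAction("significant", "recommend auditor rotation")
    return GovernanceAction("severe", "escalate to CMA enforcement")


# Hypothetical firm: income-decreasing accruals of 6.5 against assets of 100.
result = classify_dacc(-6.5, 100.0)
```

Because the absolute value is used, income-increasing and income-decreasing accruals of the same magnitude fall in the same band.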

Table 9. Internal Data Linkages.

Data Source	Key Information	Verification Application	Agency Theory Alignment
Zakah Authority	Financial reports of all Saudi firms	Cross-referencing asset declarations against tax records to identify inconsistent reporting across regulatory bodies	Reduces information asymmetry between CMA and regulated entities by eliminating the possibility of maintaining separate narratives for different regulatory audiences
Bank mortgage data	Asset valuations by three professional real-estate agents; photographs; locations; asset status, covering buildings, storage facilities, industrial sites, land, and apartments	Independent pricing verification; condition assessment; existence confirmation	Mitigates the moral hazard of managers overstating collateral values to support excessive borrowing or inflate reported asset bases
Loan firm data	Income operations; cash flow evidence; financial reports	Operational performance verification; going concern assessment	Reduces monitoring costs for the principal by providing independently sourced financial performance evidence not filtered through firm accounting discretion
Firm accounts	Self-reported financial data submitted directly to the platform	Baseline for discrepancy detection; early warning of manipulation when compared against all other data sources	Establishes contractual alignment between firm declarations and regulatory requirements, with every deviation automatically flagged
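
The cross-referencing logic implied by Table 9 can be sketched as a comparison of each firm-declared asset value against independently sourced values. The 10 per cent tolerance, the data layout, the source labels, and the function name `flag_discrepancies` are illustrative assumptions; the article does not specify them.

```python
from typing import Dict, List, Tuple


def flag_discrepancies(
    declared: Dict[str, float],
    independent: Dict[str, Dict[str, float]],
    tolerance: float = 0.10,  # assumed materiality tolerance, not from the article
) -> List[Tuple[str, str, float]]:
    """Return (asset_id, source, relative deviation) for each mismatch."""
    flags = []
    for asset_id, declared_value in declared.items():
        for source, value in independent.get(asset_id, {}).items():
            if value <= 0:
                continue  # skip unusable reference values
            deviation = (declared_value - value) / value
            if abs(deviation) > tolerance:
                flags.append((asset_id, source, round(deviation, 4)))
    return flags


# Hypothetical declarations and independently sourced values.
declared = {"A1": 120.0, "A2": 100.0}
independent = {
    "A1": {"bank_mortgage": 100.0, "zakah": 118.0},
    "A2": {"bank_mortgage": 98.0},
}
flags = flag_discrepancies(declared, independent)
```

Here only asset A1 is flagged, because its declared value exceeds the bank mortgage valuation by 20 per cent. Where three agent valuations are available, as in the bank mortgage row, their median could serve as the reference value.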

Table 10. External Data Linkages.

Data Source	Information Captured	Verification Application	Technical Precedent
Social media content	Company advertisements; real-estate agent postings with asset information, photographs, and pricing	Market-based asset valuation; detection of undisclosed asset transactions; identification of assets declared at values inconsistent with publicly posted market prices	Natural language processing for structured data extraction from unstructured text
Published reports	Academic literature; industry analyses; media coverage	Reputation assessment; industry benchmark construction; peer-group DACC calibration	Bibliometric analysis methodology [50]
Real-estate agent social accounts	Full asset listings with specifications, condition assessments, and asking prices	Cross-verification of bank mortgage valuations; detection of valuation discrepancies between formal and informal channels	Machine learning for price anomaly detection across listing databases
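
Table 10 refers to machine learning for price anomaly detection across listing databases without naming a method. As a stand-in, the sketch below uses a simpler robust statistic, the modified z-score based on the median absolute deviation (MAD), to flag listings whose asking price departs sharply from comparable listings. The 3.5 cut-off is a commonly used convention, and the function name and sample prices are hypothetical.

```python
from statistics import median
from typing import List, Sequence


def mad_outliers(prices: Sequence[float], threshold: float = 3.5) -> List[int]:
    """Return indices of prices whose modified z-score exceeds the threshold."""
    centre = median(prices)
    mad = median(abs(p - centre) for p in prices)
    if mad == 0:
        return []  # no spread among comparables, so nothing can be ranked
    return [
        i
        for i, p in enumerate(prices)
        if abs(0.6745 * (p - centre) / mad) > threshold
    ]


# Hypothetical asking prices for comparable assets; the last one is suspect.
prices = [100.0, 102.0, 98.0, 101.0, 99.0, 250.0]
suspect = mad_outliers(prices)
```

A flagged listing is a prompt for review rather than evidence of misstatement, since genuine differences in asset condition or location can also produce outlying prices.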

Table 11. Future Research Directions.

Research Direction	Core Questions	DACC Relevance
Legal liability	Under what conditions should CMA accept DACC calculations as legally sufficient for mandating auditor rotation?	DACC thresholds need judicial validation
Auditor-market effects	Does platform-based DACC monitoring reduce earnings management across the Saudi market, or does it simply shift manipulation to real activities?	Empirical pre-post study required
Algorithmic bias	Do DACC models systematically penalise certain sectors (e.g., high-growth vs stable industries) or ownership types (family vs institutional)?	Cross-sectional validation needed
Trust calibration	What DACC thresholds do CMA officials find acceptable for mandating rotation, and how do these compare to academic benchmarks?	Survey and experimental studies
DACC and drone verification correlation	Does autonomous asset verification reduce DACC by closing physical verification gaps that managers previously exploited?	Longitudinal study linking drone deployment to DACC trends

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
