1. Introduction
1.1. AI-Enhanced ESG Reporting: Market Context and Technological Innovation
The integration of artificial intelligence into environmental, social, and governance (ESG) disclosure practices represents a paradigmatic shift in corporate transparency and sustainability reporting [1, 2, 3]. As global stakeholders increasingly demand comprehensive and accurate sustainability information, organizations face mounting pressure to deliver high-quality ESG disclosures that effectively communicate their environmental impact, social responsibility initiatives, and governance practices [4, 5]. The convergence of AI technologies with ESG reporting emerges at a critical juncture where traditional manual reporting processes struggle to meet the complexity, frequency, and precision requirements of modern sustainability disclosure frameworks [6, 7].
The global AI in ESG and sustainability market, valued at approximately USD 1.24 billion in 2024, is projected to reach USD 14.87 billion by 2034, reflecting a compound annual growth rate of 28.20% [8]. This exponential growth signals the transformative potential of AI technologies in addressing long-standing challenges in sustainability reporting, including data aggregation complexity, materiality assessment accuracy, and stakeholder engagement transparency [9]. The technological revolution in ESG reporting extends beyond mere automation, encompassing sophisticated applications such as natural language processing for disclosure statement optimization, machine learning algorithms for risk prediction, and advanced analytics for stakeholder sentiment analysis [10, 11].
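As an arithmetic check on the market figures quoted above, the following minimal sketch recomputes the compound annual growth rate implied by the two endpoints. It assumes ten compounding periods (2024 to 2034); the function name `cagr` is illustrative and is not taken from the cited market report.

```python
def cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate as a fraction (0.28 means 28%)."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# Endpoints quoted in the text (USD billions); ten periods is an assumption.
implied_cagr = cagr(1.24, 14.87, 10)
print(f"Implied CAGR: {implied_cagr:.2%}")
```

Under the ten-period assumption, the implied rate is approximately 28.2%, in line with the reported figure.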
Contemporary ESG reporting faces fundamental challenges related to standardization inconsistencies, data quality concerns, and measurement methodologies that vary significantly across frameworks and jurisdictions [12, 13, 14]. The absence of unified reporting standards and the subjective nature of ESG data interpretation have created an environment where disclosure quality varies substantially, potentially undermining investor confidence and stakeholder trust [15]. Against this backdrop, AI technologies offer unprecedented opportunities to enhance reporting precision, reduce human error, and provide real-time insights that support strategic decision-making processes [16].
For definitional clarity, this study focuses specifically on conventional artificial intelligence technologies, including machine learning algorithms, natural language processing systems, automated data collection and analysis tools, and predictive analytics platforms. While generative artificial intelligence (GenAI) technologies, particularly large language models, emerged prominently during the latter portion of the study period (2021–2024), the analysis concentrates on the implementation and effectiveness of traditional AI applications that were more widely adopted across Saudi-listed companies during this timeframe. This scope distinction is critical for accurate interpretation of findings and their applicability to different AI technology categories in ESG reporting contexts.
Recent scholarship has extensively documented the emerging applications of AI in ESG disclosure processes, revealing a rapidly evolving landscape of technological solutions designed to enhance reporting efficiency and accuracy [17]. Seneca ESG’s implementation of large language models to create AI-powered ESG assistants demonstrates the practical application of natural language processing in identifying similarities across disclosure frameworks, thereby improving reporting consistency and reducing administrative burden [12]. Similarly, advanced machine learning algorithms now facilitate automated data collection from diverse sources, including satellite imagery for environmental monitoring, social media sentiment analysis for social impact assessment, and corporate governance databases for risk evaluation [18, 19, 20]. The integration of AI technologies in ESG reporting encompasses multiple dimensions of technological advancement, including predictive analytics for future risk assessment, pattern recognition for anomaly detection in sustainability data, and automated compliance monitoring systems [21].
Research indicates that organizations utilizing AI-driven ESG reporting tools experience significant improvements in data accuracy, with some studies documenting gains of up to 40% compared to traditional manual reporting methods [22]. Furthermore, AI applications extend to sophisticated materiality assessment processes, where machine learning algorithms analyze stakeholder feedback, industry benchmarks, and regulatory requirements to identify and prioritize the most significant sustainability issues for specific organizations [23]. The technological infrastructure supporting AI-enhanced ESG reporting continues to evolve, with developments in blockchain technology for data verification, Internet of Things (IoT) sensors for real-time environmental monitoring, and cloud computing platforms for scalable data processing [24, 25, 26].
1.2. Literature Review and Research Gaps
The measurement of ESG disclosure quality remains a complex and contested area within sustainability research, with multiple methodological approaches competing for academic and practical acceptance [27]. Contemporary research has established sophisticated frameworks for evaluating disclosure quality, incorporating dimensions such as completeness, accuracy, relevance, timeliness, and comparability across organizations and time periods [28]. The Global Reporting Initiative (GRI) Standards, revised in 2021, emphasize materiality-focused reporting that prioritizes significant impacts on economy, environment, and people, representing a shift from stakeholder-centric to impact-centric disclosure approaches [29, 30].
Empirical investigations into the relationship between ESG disclosure quality and financial performance have yielded mixed but increasingly positive results, with recent studies documenting significant correlations between high-quality sustainability reporting and improved financial outcomes [15, 31, 32]. Research utilizing Bloomberg ESG ratings as quasi-natural experiments demonstrates that enhanced ESG disclosure quality correlates with improved market valuations, reduced cost of capital, and enhanced investor confidence [33, 34]. The moderating role of institutional investors with ESG preferences further amplifies the positive relationship between disclosure quality and financial performance, suggesting that markets increasingly value transparency and accountability in sustainability reporting [35, 36]. The financial implications of ESG disclosure extend beyond immediate market reactions to encompass long-term performance metrics, including operational efficiency improvements, risk mitigation benefits, and access to ESG-focused investment capital [37].
Table 1 presents a comprehensive comparison of recent literature investigating AI applications in ESG reporting, highlighting methodological approaches, key findings, and research contributions across different geographical and sectoral contexts.
This comparative analysis, including the systematic identification of methodological limitations, reveals several critical patterns in the current literature that justify the present study’s methodological approach. First, the majority of empirical studies focus on developed markets, particularly the United States, United Kingdom, and China, with limited attention to emerging markets in the Middle East and North Africa region [48]. Second, while technological capabilities and market potential receive substantial attention, there remains a paucity of rigorous empirical investigations into the effectiveness of AI-enhanced ESG reporting compared to traditional methods [49]. Third, the relationship between AI adoption in ESG processes and subsequent financial performance lacks comprehensive investigation, particularly in the context of developing economies undergoing economic diversification [50].
Despite the promising theoretical foundations and growing market interest in AI-enhanced ESG reporting, several critical research gaps persist that limit understanding of the practical effectiveness and financial implications of these technological innovations [51]. The current literature predominantly consists of industry reports, case studies, and theoretical frameworks, with limited peer-reviewed empirical research examining the causal relationships between AI adoption in ESG processes and measurable improvements in disclosure quality or financial performance [52]. The lack of uniform measures for assessing AI effectiveness in ESG contexts creates methodological challenges that impede comparative research across organizations, industries, and geographical regions [53].
The overwhelming focus of existing research on developed markets, particularly the United States and the European Union, creates significant knowledge gaps regarding AI-enhanced ESG reporting in emerging economies. Current research lacks sophisticated methodological approaches for measuring the quality improvements attributable specifically to AI adoption in ESG reporting processes. Existing studies often conflate general technological adoption with AI-specific implementations, making it difficult to isolate the unique contributions of artificial intelligence technologies to disclosure quality enhancement. The absence of validated instruments for measuring AI adoption intensity in ESG contexts further compounds these methodological challenges [54, 55]. The relationship between AI-enhanced ESG reporting and financial performance requires more nuanced investigation that considers mediating variables, temporal dynamics, and sector-specific factors.
1.3. Saudi Arabia as a Natural Experimental Setting
Saudi Arabia’s Vision 2030 represents one of the most ambitious national transformation programs globally, with sustainability and economic diversification serving as foundational pillars for the Kingdom’s future development trajectory. The integration of ESG principles into this national vision reflects a strategic commitment to transitioning from a hydrocarbon-dependent economy to a diversified, knowledge-based economy that prioritizes environmental stewardship, social progress, and governance excellence. The Saudi Green Initiative, launched in 2021 with targets to achieve net-zero emissions by 2060 and plant 10 billion trees, demonstrates the Kingdom’s commitment to environmental sustainability at an unprecedented scale [56].
The establishment of the Saudi Data and Artificial Intelligence Authority (SDAIA) and the National Strategy for Data and AI (NSDAI) positions Saudi Arabia as a regional leader in AI adoption and implementation [57]. The Kingdom’s USD 40 billion AI investment fund, led by the Public Investment Fund in partnership with international venture capital firms, signals substantial commitment to technological innovation that extends to sustainability and ESG applications [58]. This convergence of sustainability priorities and AI investment creates unique opportunities for investigating the effectiveness of AI-enhanced ESG reporting in a rapidly transforming economy.
The Saudi Exchange (Tadawul) issued comprehensive ESG disclosure guidelines in 2021, aligning with the UN Sustainable Stock Exchanges (SSE) model guidance and establishing voluntary reporting standards for listed companies [59]. These guidelines emphasize materiality assessment, stakeholder engagement, and comprehensive disclosure across ESG dimensions, creating a structured framework for investigating AI applications in sustainability reporting. The guidelines’ voluntary nature during the initial implementation period provides a natural experimental setting for examining the factors that influence corporate adoption of enhanced disclosure practices. The Saudi Central Bank (SAMA) and the Capital Market Authority (CMA) have implemented supporting regulatory frameworks that encourage ESG integration and transparency in financial markets [60].
The urgency of understanding AI’s role in ESG transformation is amplified in markets experiencing concurrent regulatory evolution and economic diversification. Saudi Arabia’s trajectory illustrates this convergence. Before the Saudi Exchange introduced its voluntary ESG disclosure guidelines in October 2021, sustainability disclosure among Saudi-listed companies was primarily voluntary, inconsistent, and fragmented: environmental reporting was concentrated in energy sectors, standardized social and governance metrics were limited, fewer than 12% of non-financial companies published integrated sustainability reports, environmental (31%) and social (18%) disclosure practices were patchy, and disclosure quality varied significantly across sectors [61]. The simultaneous introduction of structured ESG guidelines and substantial AI infrastructure investments creates unique conditions for examining whether technological capabilities can compress the typical maturation timeline for sustainability reporting practices observed in developed markets.
To contextualize the significance of this transformation, Figure 1 presents comparative data illustrating the Saudi ESG landscape before and after the 2021 guidelines implementation. The baseline conditions reveal a fragmented sustainability reporting environment with only 12% of companies publishing integrated sustainability reports and minimal technological integration in ESG processes. Figure 1a demonstrates that environmental reporting was concentrated among energy companies (31% adoption rate), while social disclosure remained limited across all sectors (18% adoption rate), and integrated sustainability reporting reached only 12% of listed companies. Figure 1b shows that AI technology adoption in sustainability processes was virtually absent, with data collection automation implemented by fewer than 8% of companies, report automation by only 3%, and stakeholder analysis tools utilized by merely 5% of listed entities. Figure 1c illustrates that the market capitalization distribution favored companies with lower disclosure standards, with high-disclosure companies representing only 43% of total market value. The post-2024 landscape demonstrates dramatic transformation across all measured dimensions, with Figure 1a showing integrated sustainability reporting reaching 67% adoption and environmental reporting expanding to 78% of companies. Figure 1b reveals comprehensive AI integration in ESG processes exceeding 40% across most categories, with data collection automation reaching 52% adoption. Figure 1d demonstrates the redistribution toward transparency, with high-disclosure companies commanding 67% of market capitalization.
This transformation trajectory illustrates why Saudi Arabia provides an ideal natural experimental setting for investigating AI-enhanced ESG reporting effectiveness. The simultaneity of regulatory structure introduction and technological capability expansion reduces the confounding factors present in markets where these developments occurred sequentially, enabling cleaner identification of causal relationships between AI adoption, disclosure quality improvements, and financial performance outcomes.
1.4. Research Aims and Objectives
This study aims to investigate how artificial intelligence adoption in ESG disclosure processes enhances organizational legitimacy and strengthens stakeholder relationships, thereby creating measurable financial value among Saudi-listed companies within the transformative context of Vision 2030.
Drawing from stakeholder theory and legitimacy theory, the research pursues four theoretically grounded objectives that address identified research gaps while advancing theoretical understanding of technology-mediated transparency effects. First, to examine whether AI-enhanced ESG disclosure quality reduces legitimacy gaps by improving transparency and accountability mechanisms, addressing legitimacy theory’s proposition that organizations seek social acceptance through enhanced disclosure practices. Second, to investigate how AI adoption in ESG reporting strengthens stakeholder relationships and reduces information asymmetries, extending stakeholder theory’s predictions about the value-creating potential of improved stakeholder communication. Third, to assess whether the financial performance benefits of AI-enhanced ESG disclosure operate primarily through stakeholder-mediated pathways rather than direct operational efficiencies, testing the theoretical mechanism through which transparency creates value in stakeholder capitalism frameworks. Fourth, to examine how institutional and industry contexts moderate the legitimacy and stakeholder relationship benefits of AI-enhanced ESG reporting, recognizing that theoretical predictions may vary across different stakeholder environments and regulatory contexts.
These theoretically informed objectives address the identified research gaps by providing the first comprehensive investigation of how technological innovation in sustainability reporting creates value through legitimacy enhancement and stakeholder relationship strengthening in an emerging market context. The research contributes to stakeholder theory by examining the mechanisms through which improved information quality affects stakeholder relationships and subsequent financial outcomes. Furthermore, the study extends legitimacy theory by investigating whether technological sophistication in disclosure processes serves as a legitimacy-building signal that reduces information asymmetries and enhances organizational credibility with key stakeholders, including investors, regulators, and civil society organizations.
1.5. Research Contributions
This study makes several distinct and significant contributions to the literature on corporate sustainability, technological innovation, and financial performance. First, it provides a crucial theoretical contribution by empirically testing and validating the mechanisms through which AI-enhanced transparency creates value. By demonstrating that approximately 73% of AI’s financial impact is mediated through improved ESG disclosure quality, the findings extend stakeholder and legitimacy theories, showing that the primary value-creation pathway is the reduction in information asymmetries and strengthening of stakeholder relationships, rather than direct operational efficiencies.
Second, the research offers a significant methodological contribution. Leveraging the unique context of Saudi Arabia’s Vision 2030—where regulatory and technological changes occurred simultaneously—mitigates confounding variables common in other settings. Furthermore, the use of a System GMM estimator, complemented by Panel VAR and a formal bounding analysis to test for unobserved macroeconomic confounding, provides a rigorous framework for addressing complex endogeneity concerns, setting a higher standard for causal inference in this domain.
Third, this paper makes a critical empirical contribution by providing the first comprehensive analysis of the AI–ESG-performance nexus in the rapidly emerging MENA region. The findings reveal that AI-enhanced ESG disclosure yields a financial performance premium that is 32% to 92% larger than that reported in prior studies on traditional ESG disclosures. This quantification of the incremental value of AI establishes a new, higher empirical benchmark and demonstrates the transformative potential of advanced technology in corporate reporting.
Finally, the study has direct practical and policy implications. It presents a clear business case for corporate managers and boards, justifying investment in AI technologies to improve stakeholder engagement and financial returns. For policymakers and regulators, particularly in emerging economies, the findings provide compelling evidence that national strategies promoting both technological adoption and sustainability disclosure can create a powerful synergistic effect, accelerating economic diversification and enhancing market transparency and valuation.
3. Results
3.1. Descriptive Statistics and Sample Characteristics
To establish a foundational understanding of the dataset and validate the representativeness of the sample, comprehensive descriptive analysis was conducted across all key variables. The analysis encompasses distributional characteristics, temporal trends, and cross-sectional variations that inform subsequent econometric modeling. Table 2 presents the summary statistics for all primary variables used in the analysis, while Figure 2 illustrates the temporal evolution of key metrics across the study period.
The descriptive statistics reveal substantial variation in both ESG disclosure quality and AI adoption intensity across the sample, indicating sufficient heterogeneity for meaningful econometric analysis. ESG disclosure quality scores range from 8.50 to 89.20 with a mean of 47.32, suggesting that while some companies have achieved high disclosure standards, significant room for improvement exists across the sample. AI adoption intensity demonstrates even greater variation, with scores ranging from zero (no AI implementation) to 95.60 (comprehensive AI integration), and a mean of 31.45, indicating that most companies remain in early stages of AI adoption for ESG processes.
Time-series analysis of mean values across the study period was conducted to examine the temporal evolution of key variables and identify potential trends that may influence the analysis. Figure 2 displays the annual progression of ESG disclosure quality, AI adoption intensity, and average financial performance metrics from 2021 to 2024.
The temporal analysis reveals several important patterns. ESG disclosure quality exhibits a steady upward trend from a mean of 39.8 in 2021 to 54.1 in 2024, representing a 36% improvement over the study period. This progression aligns with the implementation timeline of Saudi Exchange ESG disclosure guidelines and suggests increasing corporate attention to sustainability reporting quality. AI adoption intensity demonstrates more dramatic growth, increasing from a mean of 18.2 in 2021 to 44.9 in 2024, indicating rapid technology diffusion across the sample companies.
Financial performance metrics show mixed temporal patterns. Average ROA remains relatively stable at around 8–9% throughout the period, while ROE rises from 12.1% in 2021 to 16.8% in 2024, a gain of 4.7 percentage points. Tobin’s Q also improves markedly, increasing from 1.42 in 2021 to 1.89 in 2024 (about 33%), suggesting that market valuations have responded positively to corporate sustainability and technology initiatives during the study period.
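The growth rates discussed in the temporal analysis can be reproduced from the quoted 2021 and 2024 means. The sketch below is a simple arithmetic check; the helper `pct_change` and the dictionary layout are illustrative choices rather than part of the study’s code.

```python
def pct_change(start: float, end: float) -> float:
    """Relative change from start to end, in percent."""
    return (end - start) / start * 100.0


# Annual means quoted in the text for 2021 and 2024.
series = {
    "ESG disclosure quality": (39.8, 54.1),
    "AI adoption intensity": (18.2, 44.9),
    "ROE (%)": (12.1, 16.8),
    "Tobin's Q": (1.42, 1.89),
}

changes = {name: pct_change(a, b) for name, (a, b) in series.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

On these figures, ESG disclosure quality rises by about 36% and AI adoption intensity by about 147%, while ROE and Tobin’s Q rise by roughly 39% and 33% in relative terms.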
3.2. AI Adoption Measurement Validation
As detailed in the methodology, the AI adoption intensity instrument underwent a rigorous, multi-stage validation process to ensure that it validly and reliably measures the intended construct. This section presents the results of these validation procedures. The instrument’s construct validity was assessed using principal component analysis (PCA), its internal consistency reliability using Cronbach’s alpha, and its convergent validity by correlating its scores with those derived from the separate content analysis.
The validation process followed a structured sequence to ensure the psychometric soundness of the AI adoption instrument. First, to establish construct validity, a PCA with Varimax rotation was performed on the survey items to identify the underlying structure of AI adoption readiness. The analysis, detailed in Table 3, confirmed a clear and theoretically grounded three-dimensional structure, with all retained eigenvalues exceeding 1.0 and a cumulative explained variance of 70.9%. All individual factor loadings were above the 0.60 threshold, indicating that the items converge on their intended latent constructs.
Following the establishment of this factor structure, the internal consistency reliability of each construct was assessed. Although the high and relatively clustered factor loadings suggest that Cronbach’s alpha is a reasonable estimator, both Cronbach’s alpha and Composite Reliability (CR) were calculated to address the potential for minor heterogeneity in loadings. As reported in Table 3, both sets of coefficients demonstrate excellent reliability for all three factors. Finally, to establish convergent validity, the survey-based instrument scores were correlated with independently derived scores from a systematic content analysis of corporate disclosures. The strong, statistically significant Pearson correlation shown in Figure 3 (r = 0.78, p < 0.001) indicates that both measurement approaches converge on the same underlying construct, providing robust support for the instrument’s validity.
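For readers less familiar with the two reliability coefficients, the following minimal sketch shows how each is commonly computed. The item scores and loadings are hypothetical and are not the study’s data; the composite reliability formula assumes standardized loadings with uncorrelated measurement errors, which may differ from the exact procedure used for Table 3.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item columns (one score per respondent)."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1.0 - item_variance_sum / variance(totals))


def composite_reliability(loadings):
    """Composite reliability from standardized loadings (uncorrelated errors assumed)."""
    loading_sum = sum(loadings)
    error_sum = sum(1.0 - value * value for value in loadings)
    return loading_sum ** 2 / (loading_sum ** 2 + error_sum)


# Hypothetical data: three survey items answered by six respondents.
items = [
    [4, 5, 3, 2, 4, 5],
    [4, 4, 3, 2, 5, 5],
    [5, 5, 2, 3, 4, 4],
]
# Hypothetical standardized loadings for the same three items.
loadings = [0.72, 0.80, 0.68]

alpha = cronbach_alpha(items)
cr = composite_reliability(loadings)
print(f"alpha = {alpha:.3f}, CR = {cr:.3f}")
```

Cronbach’s alpha treats all items as equally weighted indicators, whereas CR weights items by their loadings, which is why the two are typically reported together when loadings are not identical.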
To further assess the measurement approach, an analysis of the distribution of AI adoption scores across different sectors was conducted, and a test for convergent validity was performed. Figure 3 visually summarizes this analysis. Panel (a) illustrates the significant variation in AI adoption intensity across major GICS sectors, with the Technology and Energy sectors showing the highest adoption rates. Panel (b) provides strong evidence of convergent validity by plotting the survey-based scores against the content analysis-derived scores. The resulting Pearson correlation coefficient of 0.78 (p < 0.001) confirms a high degree of agreement between the two independent measurement methods, significantly strengthening confidence in the validity of the AI adoption instrument.
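The convergent validity check rests on a Pearson correlation between two sets of scores for the same firms. The sketch below illustrates the computation on hypothetical paired scores, together with the usual t statistic for testing a zero correlation. The sample size of 100 used in the last step is an assumption for illustration only, since the number of paired observations is not stated in this section.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1.0 - r * r))


# Hypothetical paired scores for six firms (not the study's data).
survey_scores = [10, 25, 40, 55, 70, 85]
content_scores = [15, 20, 45, 50, 75, 80]
r_example = pearson_r(survey_scores, content_scores)

# Reported r = 0.78 combined with an assumed n = 100.
t_reported = t_statistic(0.78, 100)
print(f"example r = {r_example:.3f}, t for r = 0.78 at n = 100: {t_reported:.2f}")
```

The t statistic makes explicit that the significance of a given correlation depends on the number of paired observations as well as on its magnitude.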
Taken together, the results presented in Table 3 and Figure 3 support the psychometric soundness of the AI adoption instrument. The factor analysis reveals a clear and reliable three-factor structure, while the industry analysis in Figure 3a shows significant sectoral variation that aligns with theoretical expectations of technology readiness and regulatory pressures. The agreement between the survey scores and the independently derived content analysis scores (r = 0.78, p < 0.001) provides evidence of convergent validity, supporting the instrument’s use for the primary empirical analysis of this study.
3.3. ESG Disclosure Quality Assessment
The comprehensive assessment of ESG disclosure quality across the sample reveals substantial heterogeneity in corporate sustainability reporting practices and provides insights into the factors driving disclosure improvements. Analysis was conducted across the three primary dimensions of the disclosure quality framework: GRI standards compliance, materiality assessment depth, and stakeholder engagement transparency. Table 4 presents detailed statistics for each disclosure quality dimension, while Figure 4a shows the distribution of overall disclosure quality scores, and Figure 4b illustrates the relationship between disclosure quality dimensions.
The dimensional analysis reveals that GRI standards compliance achieves the highest mean score (48.72), followed closely by stakeholder engagement transparency (48.57), while materiality assessment depth demonstrates the lowest average performance (44.67). This pattern suggests that companies find compliance with established frameworks and stakeholder communication more manageable than developing sophisticated materiality assessment processes.
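As a consistency check, the dimension means can be combined and compared with the overall mean of 47.32 reported in Section 3.1. The sketch below assumes equal weights across the three dimensions; the actual weighting scheme is defined in the methodology and is not reproduced here, so equal weighting is an assumption.

```python
# Dimension means quoted in the text.
dimension_means = {
    "GRI standards compliance": 48.72,
    "Stakeholder engagement transparency": 48.57,
    "Materiality assessment depth": 44.67,
}

# Equal weighting is an assumption made for this check only.
equal_weight_mean = sum(dimension_means.values()) / len(dimension_means)
lowest_dimension = min(dimension_means, key=dimension_means.get)

print(f"Equal-weight mean: {equal_weight_mean:.2f}")
print(f"Lowest-scoring dimension: {lowest_dimension}")
```

The equal-weight average matches the reported overall mean, which is consistent with, although it does not by itself establish, equal weighting of the three dimensions.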
To examine the distribution characteristics of overall ESG disclosure quality and understand the relationships between different quality dimensions, distributional analysis and correlation assessment were conducted. Figure 4a presents the frequency distribution of overall ESG disclosure quality scores across the sample, while Figure 4b shows the correlation matrix and scatter plot relationships between the three primary disclosure quality dimensions.
The distributional analysis reveals a broadly bell-shaped distribution of ESG disclosure quality scores with slight positive skewness (skewness = 0.34), suggesting that more companies achieve below-average than above-average disclosure quality. At the same time, the distribution exhibits some evidence of bimodality, with peaks around scores of 35 and 65, suggesting potential clustering of companies into distinct disclosure quality categories.
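To make the link between positive skewness and the share of below-average firms concrete, the sketch below computes the moment-based skewness coefficient on a small hypothetical sample with a longer right tail. The scores are invented for illustration, and the reported value of 0.34 may have been obtained with a different (for example, bias-adjusted) estimator.

```python
def skewness(values):
    """Moment-based sample skewness, g1 = m3 / m2**1.5."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


def share_below_mean(values):
    """Fraction of observations strictly below the sample mean."""
    mean = sum(values) / len(values)
    return sum(v < mean for v in values) / len(values)


# Hypothetical disclosure quality scores with a longer right tail.
scores = [22, 30, 33, 35, 36, 38, 41, 47, 64, 88]
print(f"skewness = {skewness(scores):.2f}")
print(f"share below mean = {share_below_mean(scores):.0%}")
```

In this example, a few high scores pull the mean upward, so most observations fall below it; this is the pattern that a positive skewness coefficient typically, though not invariably, signals.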
The correlation analysis between disclosure quality dimensions reveals strong positive relationships, with the highest correlation between GRI compliance and materiality assessment (r = 0.73, p < 0.001), followed by GRI compliance and stakeholder engagement (r = 0.68, p < 0.001). The relationship between materiality assessment and stakeholder engagement demonstrates moderate correlation (r = 0.59, p < 0.001), indicating that while these dimensions are related, they capture distinct aspects of disclosure quality.
3.4. Main Regression Results: AI Adoption and ESG Disclosure Quality
The primary empirical analysis examines the causal relationship between AI adoption intensity and ESG disclosure quality using the System GMM estimator to address endogeneity concerns. The analysis progresses from baseline specifications to comprehensive models incorporating control variables, industry effects, and temporal dynamics. Table 5 presents the main regression results across different model specifications, while Figure 5a illustrates the predicted relationship between AI adoption and ESG quality, and Figure 5b shows the marginal effects across different AI adoption levels.
The regression results provide strong evidence for a positive and statistically significant relationship between AI adoption intensity and ESG disclosure quality across all model specifications. The coefficient on AI adoption intensity ranges from 0.276 to 0.312, indicating that a one-unit increase in AI adoption intensity is associated with approximately a 0.28–0.31 unit increase in ESG disclosure quality score. This relationship remains robust to the inclusion of control variables, fixed effects, and alternative specifications.
The diagnostic tests support the validity of the System GMM estimation approach. The AR(2) test fails to reject the null hypothesis of no second-order serial correlation in all specifications (p-values ranging from 0.156 to 0.234), while the Hansen test of overidentifying restrictions shows no evidence of instrument invalidity (p-values from 0.356 to 0.423). The number of instruments remains below 100 and well below the number of groups, in line with the common rule of thumb that the instrument count should not exceed the number of groups, which limits the risk of instrument proliferation.
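The decision rules behind these diagnostics can be summarized in a small checker. The sketch below is illustrative only: the class and function names are hypothetical, the p-values are the lower bounds of the ranges quoted above, and the instrument and group counts are placeholders, because the actual counts appear in Table 5 rather than in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GmmDiagnostics:
    ar2_p: float        # p-value of the AR(2) serial correlation test
    hansen_p: float     # p-value of the Hansen overidentification test
    n_instruments: int  # number of instruments used
    n_groups: int       # number of cross-sectional units (firms)


def diagnostic_warnings(d: GmmDiagnostics, alpha: float = 0.05) -> List[str]:
    """Return a list of warnings; an empty list means no rule was violated."""
    warnings = []
    if d.ar2_p < alpha:
        warnings.append("AR(2): second-order serial correlation detected")
    if d.hansen_p < alpha:
        warnings.append("Hansen: overidentifying restrictions rejected")
    if d.n_instruments > d.n_groups:
        warnings.append("instrument count exceeds number of groups")
    return warnings


# p-values: lower bounds of the reported ranges; counts: hypothetical placeholders.
reported = GmmDiagnostics(ar2_p=0.156, hansen_p=0.356, n_instruments=42, n_groups=150)
print(diagnostic_warnings(reported))
```

A specification that fails any of these checks would call for a more parsimonious instrument set or a revised lag structure before its coefficients are interpreted.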
To visualize the relationship between AI adoption and ESG disclosure quality and examine potential non-linearities in the effect, predictive analysis and marginal effects estimation were conducted. Figure 5a presents the predicted ESG disclosure quality scores across the range of AI adoption intensity values, while Figure 5b shows the marginal effects of AI adoption at different levels of initial adoption.
The predictive analysis reveals a positive, approximately linear relationship between AI adoption intensity and ESG disclosure quality, with some evidence of diminishing marginal returns at very high adoption levels. The relationship is strongest for companies transitioning from low to moderate AI adoption levels (scores 20–60), where marginal effects decline only slightly, from 0.32 to 0.29. For companies with already high AI adoption (scores above 70), the marginal effects decline to approximately 0.18, suggesting that additional AI investments yield smaller incremental improvements in disclosure quality.
The confidence intervals remain relatively narrow throughout the adoption spectrum, indicating precise estimation of the relationship. However, the intervals widen slightly at the extremes of the distribution, reflecting the smaller number of observations with very low or very high AI adoption scores.
3.5. Financial Performance Implications
The analysis of financial performance implications examines whether AI-enhanced ESG disclosure quality translates into measurable business value through improved profitability and market valuation. The investigation employs multiple financial performance measures and addresses potential endogeneity through instrumental variable approaches.
Table 6 presents the regression results for financial performance outcomes, while
Figure 6a shows the relationship between ESG disclosure quality and financial performance, and
Figure 6b illustrates the industry variation in this relationship.
The financial performance analysis reveals statistically significant positive relationships between ESG disclosure quality and all three performance measures. The coefficients indicate that a one-unit increase in ESG disclosure quality is associated with a 0.089 percentage point increase in ROA, a 0.142 percentage point increase in ROE, and a 0.0067 unit increase in Tobin’s Q. These effects are economically meaningful, representing approximately 1.0%, 1.0%, and 0.4% improvements relative to sample means for ROA, ROE, and Tobin’s Q, respectively.
The first-stage F-statistics exceed the rule-of-thumb value of 10, indicating that the instruments are sufficiently strong and mitigating concerns about weak instrument bias. The Wu–Hausman tests reject the null hypothesis of exogeneity at the 5% level, supporting the use of instrumental variable estimation to address endogeneity concerns.
To examine the relationship between ESG disclosure quality and financial performance in greater detail and assess industry heterogeneity, sector-specific analysis and visualization of the quality-performance relationship were conducted.
Figure 6a presents scatter plots showing the relationship between ESG disclosure quality and each financial performance measure, while
Figure 6b illustrates how this relationship varies across major industry sectors.
The scatter plot analysis confirms positive relationships between ESG disclosure quality and financial performance measures, with correlation coefficients of 0.31 for ROA, 0.28 for ROE, and 0.35 for Tobin’s Q. The relationships demonstrate some non-linearity, with steeper slopes observed for companies with moderate disclosure quality scores (40–70) compared to those at the extremes.
The industry analysis reveals significant heterogeneity in the financial performance benefits of ESG disclosure quality. Energy and materials companies demonstrate the strongest relationships (coefficients of 0.142 for ROA and 0.187 for ROE), while consumer discretionary and industrials show more moderate effects (coefficients of 0.067 for ROA and 0.089 for ROE). This pattern aligns with stakeholder theory predictions regarding differential ESG materiality across sectors.
3.6. Mediation Analysis: AI Adoption Pathways
The mediation analysis investigates whether the relationship between AI adoption and financial performance operates through ESG disclosure quality improvements or through alternative pathways. This analysis addresses the mechanism question and quantifies the relative importance of different causal channels.
Table 7 presents the mediation analysis results using the causal mediation framework, while
Figure 7 illustrates the decomposition of total effects into direct and indirect components.
The mediation analysis reveals that ESG disclosure quality serves as the primary mechanism through which AI adoption affects financial performance. Approximately 73% of the total effect operates through the indirect pathway via ESG disclosure quality improvements, while only 27% represents direct effects of AI adoption on financial performance. This finding provides strong support for the theoretical framework suggesting that AI creates value primarily through enhanced transparency and stakeholder relationships rather than direct operational efficiency gains.
The Sobel test statistics confirm the statistical significance of the indirect effects across all performance measures (z-statistics > 4.0, p < 0.01). Bootstrap confidence intervals for the proportion mediated exclude zero and one, confirming partial mediation relationships where both direct and indirect pathways contribute to the total effect.
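The two quantities reported here, the Sobel z-statistic and the proportion mediated, can be illustrated with a minimal sketch. The function names, the path coefficients passed to `sobel_z`, and the direct effect of 0.034 are hypothetical values chosen for illustration (the latter only so that the share is close to the reported 73%); they are not taken from Table 7, and the study's causal mediation framework may compute these quantities differently.

```python
import math


def sobel_z(a, se_a, b, se_b):
    """First-order Sobel z-statistic for the indirect effect a*b.

    a, se_a: effect of treatment on mediator and its standard error.
    b, se_b: effect of mediator on outcome and its standard error.
    """
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return (a * b) / se_ab


def proportion_mediated(indirect, direct):
    """Share of the total effect (indirect + direct) that is indirect."""
    return indirect / (indirect + direct)


# Hypothetical path estimates and standard errors (assumptions, not from the study).
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)

# Indirect ROA effect of 0.092 is reported in the text; the direct effect of
# 0.034 is a hypothetical value used only to illustrate the calculation.
share = proportion_mediated(indirect=0.092, direct=0.034)

print(f"Sobel z = {z:.2f}, proportion mediated = {share:.2f}")
```

A proportion strictly between zero and one, as in this example, corresponds to the partial mediation described above.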
To visualize the mediation relationships and illustrate the relative magnitudes of direct and indirect effects, pathway analysis and effect decomposition were conducted.
Figure 7 presents a comprehensive diagram showing the mediation pathways, effect sizes, and confidence intervals for each causal relationship in the model.
The pathway diagram confirms the dominance of the indirect effect through ESG disclosure quality, with effect sizes of 0.092, 0.146, and 0.0073 for ROA, ROE, and Tobin’s Q, respectively. The direct effects remain positive but substantially smaller, indicating that while AI adoption may provide some operational benefits, the primary value creation mechanism operates through improved stakeholder relationships and reduced information asymmetries resulting from enhanced disclosure quality.
3.7. Panel Vector Autoregression Results
The Panel VAR analysis examines dynamic relationships and feedback effects between AI adoption, ESG disclosure quality, and financial performance over time. This analysis addresses questions about temporal ordering, adjustment speeds, and potential reverse causality.
Table 8 presents the Panel VAR coefficient estimates, while
Figure 8a shows impulse response functions and
Figure 8b displays variance decomposition results.
The Panel VAR results reveal significant dynamic relationships between all three variables. AI adoption demonstrates strong persistence (coefficient of 0.687 on first lag) and positive effects on ESG quality (0.124) and financial performance (0.089). ESG quality shows high persistence (0.734) and positive feedback to AI adoption (0.045) and financial performance (0.067). Financial performance exhibits moderate persistence (0.623) with positive effects on both AI adoption (0.234) and ESG quality (0.187).
Impulse response analysis and variance decomposition were conducted to examine the dynamic responses of variables to shocks and to assess the relative importance of each variable in explaining forecast error variance.
Figure 8a presents impulse response functions showing the response of each variable to one-standard-deviation shocks in other variables, while
Figure 8b shows the forecast error variance decomposition over a 10-period horizon.
The impulse response analysis reveals that positive shocks to AI adoption generate persistent positive responses in ESG disclosure quality, reaching maximum impact after 3–4 periods before gradually declining. Similarly, AI adoption shocks produce positive responses in financial performance, though with smaller magnitude and faster decay. ESG quality shocks generate positive responses in financial performance, with peak effects occurring after 2–3 periods.
The variance decomposition analysis indicates that AI adoption shocks explain approximately 35% of forecast error variance in ESG quality after 10 periods, while ESG quality shocks explain 18% of variance in financial performance. Own-variable shocks remain the dominant source of variance explanation, accounting for 65–75% of forecast error variance across variables, consistent with the persistent nature of these organizational characteristics.
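As a rough illustration of how impulse responses follow from the lag coefficients, the sketch below arranges the first-lag point estimates quoted above into a coefficient matrix and iterates it forward. It assumes a first-order system with the variable ordering AI adoption, ESG quality, financial performance, and it traces simple (non-orthogonalized) unit shocks; the estimated model behind Figure 8a may use additional lags, fixed effects, and orthogonalized one-standard-deviation shocks, so this sketch is not expected to reproduce the figure.

```python
import numpy as np

# Rows: equations for AI adoption, ESG quality, financial performance.
# Columns: first lags of the same three variables, in the same order.
# Values are the point estimates quoted in the text; a VAR(1) structure is assumed.
A = np.array([
    [0.687, 0.045, 0.234],  # AI_t  on lagged AI, ESG, FP
    [0.124, 0.734, 0.187],  # ESG_t on lagged AI, ESG, FP
    [0.089, 0.067, 0.623],  # FP_t  on lagged AI, ESG, FP
])


def impulse_responses(coef, horizon):
    """Return an array irf[h, i, j]: response of variable i at horizon h
    to a unit shock in variable j at horizon 0 (non-orthogonalized)."""
    k = coef.shape[0]
    irf = np.empty((horizon + 1, k, k))
    irf[0] = np.eye(k)
    for h in range(1, horizon + 1):
        irf[h] = coef @ irf[h - 1]
    return irf


irf = impulse_responses(A, horizon=10)

# The system is stable when all eigenvalues lie inside the unit circle.
spectral_radius = max(abs(np.linalg.eigvals(A)))

# Response path of ESG quality (index 1) to a unit AI adoption shock (index 0).
esg_response_to_ai = irf[:, 1, 0]
print("spectral radius:", round(float(spectral_radius), 3))
print("ESG response to AI shock:", np.round(esg_response_to_ai, 3))
```

A forecast error variance decomposition would additionally require the covariance matrix of the shocks, which is not reported in the text, so it is omitted here.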
3.8. Robustness and Sensitivity Analysis Results
To ensure the reliability and generalizability of the main findings, comprehensive robustness testing was conducted across multiple dimensions, including alternative specifications, sample variations, and measurement approaches.
Table 9 summarizes the robustness check results across different model specifications and sensitivity tests, while
Figure 9 illustrates the stability of key coefficients across various analytical approaches.
The robustness testing confirms the stability of main findings across alternative specifications and methodological approaches. The AI adoption effect on ESG disclosure quality remains positive and statistically significant in all alternative specifications, with coefficients ranging from 0.267 to 0.312. Similarly, the ESG disclosure quality effect on financial performance demonstrates consistency across specifications, with coefficients between 0.076 and 0.108.
Sample restriction analysis reveals that results are not driven by outliers, as excluding the top and bottom 5% of observations yields nearly identical coefficients. The relationship appears stronger among large firms, consistent with resource-based explanations for technology adoption and disclosure sophistication. Temporal sensitivity analysis indicates that effects have strengthened over time as AI technologies have matured and ESG reporting standards have evolved.
The comparison across estimation methods confirms the importance of addressing endogeneity concerns, as simpler methods (fixed effects, random effects, pooled OLS) yield systematically lower coefficient estimates. This pattern supports the use of System GMM and instrumental variable approaches in the main analysis.
To visualize the stability of key relationships across different analytical approaches and demonstrate the robustness of findings, coefficient stability analysis and sensitivity testing were conducted.
Figure 9 presents forest plots showing the range of coefficient estimates across different specifications, methodological approaches, and sample restrictions.
The forest plot analysis demonstrates remarkable consistency in the AI adoption effect on ESG disclosure quality, with all coefficient estimates falling within a narrow range (0.25–0.32) and confidence intervals overlapping substantially. The ESG quality effect on financial performance shows slightly greater variation but maintains statistical significance across all specifications.
The sensitivity analysis confirms that the main findings are not artifacts of specific methodological choices or sample characteristics but represent robust empirical relationships that persist across alternative analytical approaches. This robustness provides confidence in the practical implications and policy relevance of the research findings.
3.9. Validation Against Benchmark Studies
To contextualize the findings within the broader literature and assess the external validity of results, comparative analysis was conducted against benchmark studies examining ESG disclosure quality and financial performance relationships in emerging markets.
Table 10 presents comparisons with relevant benchmark studies, while the analysis addresses both consistency with prior findings and novel contributions of the current research.
The benchmark comparison reveals that the effect sizes identified in this study are substantially larger than those reported in previous research examining ESG disclosure-performance relationships. The AI-enhanced ESG disclosure quality demonstrates 32–81% larger effects on ROA compared to traditional ESG measures used in prior studies. Similarly, the Tobin’s Q effects are 62–92% larger than benchmark studies.
These larger effect sizes provide empirical support for the value creation potential of AI-enhanced ESG disclosure processes. The comparison suggests that technological innovation in sustainability reporting generates incremental benefits beyond traditional disclosure approaches, consistent with the theoretical framework emphasizing the role of information quality and stakeholder engagement in value creation.
The validation analysis confirms that the methodology and findings contribute novel insights to the ESG-performance literature while remaining consistent with established theoretical relationships. The larger effect sizes reflect both the methodological rigor of addressing endogeneity concerns and the substantive innovation of examining AI-enhanced disclosure processes in an emerging market context undergoing rapid economic transformation.
3.10. Validation Against Unobserved Confounding
A central challenge in this analysis is the potential for omitted variable bias, particularly given that the 2021–2024 study period coincided with significant global macroeconomic shocks, including the aftermath of the COVID-19 pandemic and subsequent inflationary pressures. These events could plausibly influence both a firm’s propensity to invest in advanced technologies like AI and its financial performance, creating a confounding effect.
To formally address this concern and assess the robustness of the core findings to the influence of unobservables, a bounding analysis is conducted based on the methodology proposed by Oster (2019) [
83], which is a variant of the approach used in Dantas et al. (2023) [
84]. This test evaluates how strong selection on unobserved factors would need to be, relative to selection on the observed controls, to explain away the estimated effect of the treatment variable (ESG disclosure quality).
The procedure is operationalized by first estimating the main financial performance model from
Table 6 (the baseline model) and then re-estimating it after augmenting it with a set of powerful, time-varying macroeconomic controls. Consistent with recent literature examining this period [
85], the analysis includes the Global Economic Policy Uncertainty (EPU) Index to capture macro-level volatility and the log of the U.S. Federal Reserve’s total assets as a proxy for global liquidity conditions. Following the guidance in Oster (2019) [
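83] for short panels, year fixed effects are removed in favor of these explicit time-varying controls. Because both controls vary only over time and not across firms, they would otherwise be collinear with the year fixed effects and their coefficients could not be identified.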
By comparing the coefficient of interest and the model’s R-squared from the baseline model (β₀, R₀²) with those from the model including controls (β̃, R̃²), the test estimates the value of the coefficient (β*) that would be obtained if unobservables were controlled for. This provides a quantitative assessment of whether the results are likely driven by the variables of interest or by unobserved confounding factors.
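The logic of the comparison can be sketched with the widely used closed-form approximation from Oster (2019), in which the bias-adjusted coefficient equals the controlled coefficient minus the coefficient movement scaled by the ratio of remaining to realized R-squared movement. In the sketch below, the two coefficients (0.094 and 0.081) are those reported for the baseline and controlled models, while the R-squared values, the proportional-selection parameter of 1, and the maximum R-squared of 1.3 times the controlled R-squared are assumptions made purely for illustration; they are not taken from Table 11, and the study may have used the exact estimator rather than this approximation.

```python
def oster_beta_star(beta_base, r2_base, beta_ctrl, r2_ctrl, r2_max, delta=1.0):
    """Approximate bias-adjusted coefficient following Oster (2019).

    beta_base, r2_base: coefficient and R-squared without the added controls.
    beta_ctrl, r2_ctrl: coefficient and R-squared with the added controls.
    r2_max: hypothetical R-squared if unobservables were also included.
    delta: assumed ratio of selection on unobservables to selection on observables.
    """
    if r2_ctrl <= r2_base:
        raise ValueError("controls must raise R-squared for the bound to be defined")
    movement = beta_base - beta_ctrl
    scale = (r2_max - r2_ctrl) / (r2_ctrl - r2_base)
    return beta_ctrl - delta * movement * scale


# Coefficients reported in the text for ROA.
beta_base, beta_ctrl = 0.094, 0.081

# Hypothetical R-squared values (placeholders, NOT from Table 11).
r2_base, r2_ctrl = 0.30, 0.40
r2_max = min(1.3 * r2_ctrl, 1.0)  # common rule of thumb, assumed here

beta_star = oster_beta_star(beta_base, r2_base, beta_ctrl, r2_ctrl, r2_max)
print(f"bias-adjusted coefficient: {beta_star:.4f}")
```

The bound moves further from the controlled coefficient when the added controls shift the coefficient a lot while explaining little additional variance, which is why both the coefficient and the R-squared from each model enter the calculation.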
Table 11 presents the results of this bounding analysis for the effect of ESG disclosure quality on the primary financial performance metric, ROA.
Figure 10 provides a visual representation of the bounding analysis detailed in
Section 3.10. The plot illustrates the coefficient on ESG Disclosure Quality from the baseline model (β = 0.094), the model with macroeconomic controls (β = 0.081), and the estimated lower bound for the coefficient (β* = 0.065). The results show that while including powerful macroeconomic controls modestly attenuates the coefficient, the estimated lower bound remains positive, economically meaningful, and statistically significant. This analysis significantly strengthens confidence that the documented findings are not an artifact of omitted macroeconomic variables from the turbulent 2021–2024 period.
4. Discussion
The findings provide evidence that artificial intelligence can substantially enhance both ESG disclosure practices and financial performance among Saudi-listed entities. The robust positive relationship between AI adoption intensity and ESG disclosure quality (β = 0.289, p < 0.001) indicates that technological innovation improves transparency mechanisms beyond conventional reporting approaches. This finding is particularly significant given the 146.7% growth in AI adoption observed during the study period, suggesting that early adopters are realizing competitive advantages in stakeholder communication and legitimacy building.
The financial performance implications reveal economically meaningful returns to AI-enhanced ESG reporting, with effect sizes substantially exceeding those documented in prior literature. While previous studies reported ROA improvements of 0.067 for traditional ESG ratings in Chinese markets [
79], and 0.052 effects across Gulf countries [
80], this study’s coefficient of 0.094 represents a 40–81% premium attributable to AI enhancement. Similarly, the Tobin’s Q effects (0.0073) substantially exceed global findings (0.0045) [
86] and MENA results (0.0038) [
87]. These differentials indicate that AI technologies may create incremental value through superior data processing capabilities, real-time stakeholder sentiment analysis, and automated compliance monitoring that traditional manual processes cannot replicate.
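The percentage premiums quoted in this comparison follow from simple ratios of the coefficients, as the short check below illustrates. The helper name `premium_pct` is introduced here for illustration only; the coefficients are those cited in the preceding paragraph.

```python
def premium_pct(estimate, benchmark):
    """Percentage by which `estimate` exceeds `benchmark`."""
    return (estimate / benchmark - 1.0) * 100.0


# ROA: this study's coefficient versus the two cited benchmarks.
roa_study = 0.094
roa_benchmarks = {"Chinese markets [79]": 0.067, "Gulf countries [80]": 0.052}

# Tobin's Q: this study's coefficient versus the two cited benchmarks.
q_study = 0.0073
q_benchmarks = {"global [86]": 0.0045, "MENA [87]": 0.0038}

for label, value in roa_benchmarks.items():
    print(f"ROA vs {label}: {premium_pct(roa_study, value):.0f}% larger")
for label, value in q_benchmarks.items():
    print(f"Tobin's Q vs {label}: {premium_pct(q_study, value):.0f}% larger")
```

These ratios compare point estimates only and do not account for differences in samples, measurement scales, or estimation methods across the cited studies.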
The mediation analysis, which attributes 73% of the total effect to the indirect pathway through ESG quality improvements, is consistent with stakeholder theory predictions and identifies disclosure quality, rather than direct operational efficiency, as the primary value creation mechanism. This finding requires careful interpretation of the underlying mechanisms through which AI-enhanced ESG disclosure generates financial returns.
The dominance of the indirect pathway operates through several interconnected mechanisms grounded in information economics and stakeholder theory. First, AI-enhanced ESG disclosure reduces information asymmetries between management and stakeholders by providing more accurate, timely, and comprehensive sustainability information. The superior data processing capabilities of AI systems enable real-time monitoring of ESG metrics, automated compliance tracking, and sophisticated materiality assessments that would be prohibitively expensive through manual processes. This enhanced information quality directly addresses investor concerns about greenwashing and ESG performance verification, thereby reducing the risk premium demanded by capital providers.
Second, the transparency mechanism operates through improved stakeholder trust and legitimacy. AI-powered disclosure systems demonstrate organizational commitment to transparency through technological sophistication, signaling credible dedication to ESG principles beyond mere compliance. The algorithmic consistency and reduced human bias in AI-generated reports enhance perceived reliability, while automated stakeholder sentiment analysis enables more responsive engagement strategies. These factors collectively strengthen stakeholder relationships, leading to improved access to capital, enhanced customer loyalty, and reduced regulatory scrutiny.
The smaller direct effect (27%) likely captures operational benefits such as process automation and efficiency gains from AI implementation. However, these direct benefits are constrained by the specific application domain—ESG reporting systems primarily focus on information processing and communication rather than core operational activities. Unlike AI applications in manufacturing or supply chain management, ESG-focused AI systems generate value predominantly through their information and communication functions rather than direct cost reduction or productivity enhancement.
This interpretation aligns with legitimacy theory’s emphasis on organizational actions that enhance social acceptance and stakeholder approval. The empirical dominance of the transparency pathway suggests that in the ESG domain, signaling and communication effects substantially outweigh pure operational efficiencies, consistent with the stakeholder-oriented nature of sustainability initiatives where value creation depends critically on external perceptions and relationships.
Crucially, the core findings are validated against the potential influence of unobserved macroeconomic confounders prevalent during the 2021–2024 study period. The formal bounding analysis [
83], detailed in
Section 3.10, explicitly tests the relationship’s robustness to such factors. The results (
Table 11 and
Figure 10) demonstrate that even after controlling for global economic policy uncertainty and monetary policy shifts, the identified lower bound of the effect of ESG quality on ROA remains positive and economically meaningful (β* = 0.065). This provides strong quantitative evidence that the core relationships identified in this study are not artifacts of the unique macroeconomic environment, lending significant credence to the interpretation of a causal link between AI-enhanced transparency and firm value.
The industry heterogeneity observed, with energy and materials companies demonstrating stronger relationships than consumer sectors, aligns with materiality-based explanations where ESG disclosure carries greater stakeholder salience in environmentally intensive industries. This pattern corroborates findings in emerging markets [
54], though the magnitude of effects suggests that AI amplifies these sector-specific benefits. The temporal dynamics revealed through Panel VAR analysis indicate bidirectional relationships with significant feedback loops, contrasting with predominantly unidirectional assumptions in earlier cross-sectional studies.
The scope limitation to conventional AI technologies represents both a methodological strength and a boundary condition for interpreting results. By focusing on traditional machine learning and automated analytics approaches that were predominantly implemented during the study period, the findings provide robust evidence for established AI applications in ESG reporting. However, the rapid emergence and adoption of generative AI technologies, particularly large language models, since late 2022 suggests that future research should investigate whether the positive relationships documented in this study extend to or are amplified by GenAI implementations in sustainability reporting processes.
Several limitations warrant acknowledgment. The sample restriction to non-financial Saudi companies limits generalizability across institutional contexts and regulatory frameworks. The voluntary nature of ESG disclosure guidelines during the study period may introduce selection bias, as companies choosing enhanced disclosure practices likely possess unobserved characteristics favoring both AI adoption and superior performance. The AI adoption measurement, while validated through multiple approaches, relies partially on self-reported survey data subject to social desirability bias. Additionally, the relatively short observation period (2021–2024) may not capture long-term sustainability of identified relationships or potential diminishing returns as AI technologies mature and become commoditized across industries.
Future research should extend geographical scope to examine cross-country variations in AI–ESG relationships, particularly comparing mandatory versus voluntary disclosure regimes. Longitudinal investigations spanning longer time horizons could illuminate sustainability of competitive advantages and identify optimal AI investment strategies. Micro-level studies examining specific AI applications (natural language processing, machine learning algorithms, automated data collection) would provide granular insights for managerial decision-making. Finally, stakeholder-specific analyses could investigate differential responses from investors, regulators, and civil society organizations to AI-enhanced disclosure practices, informing targeted communication strategies and resource allocation decisions for maximizing the identified performance benefits.
5. Conclusions
This empirical investigation offers a comprehensive analysis of the impact of artificial intelligence on ESG disclosure quality and financial performance, addressing a notable gap in the literature concerning emerging economies undergoing rapid, state-led transformation. By leveraging the natural experimental setting of Saudi Arabia’s Vision 2030, this study moves beyond the correlational findings of prior research to rigorously examine the causal mechanisms through which technological innovation in sustainability reporting creates measurable business value. The findings offer a clear and compelling narrative: AI is not merely an incremental tool but a transformative enabler of corporate transparency that generates significant financial returns.
The core discovery is that AI adoption robustly enhances the quality of ESG disclosure. This improvement is not trivial; it represents a fundamental shift in a firm’s ability to communicate its sustainability efforts with greater accuracy, timeliness, and depth. This AI-enhanced transparency, in turn, translates directly into superior financial performance, yielding economically meaningful improvements in profitability and market valuation that substantially exceed the benchmarks reported in studies of traditional ESG disclosure. Our findings thus present a clear business case for the strategic allocation of capital toward AI technologies in sustainability functions, reframing such expenditures as value-creating investments rather than compliance-driven costs.
Crucially, this study illuminates the primary pathway through which this value is created. The mediation analysis reveals that the financial benefits of AI adoption are overwhelmingly channeled through the enhancement of ESG disclosure quality. Approximately 73% of AI’s total effect on financial performance is indirect, operating through improved stakeholder relationships and reduced information asymmetries. The smaller direct effect underscores a critical insight: in the ESG domain, the value of AI lies less in direct operational efficiencies and more in its power to build organizational legitimacy and trust. This validates predictions from both stakeholder and legitimacy theories, providing strong empirical evidence that in the modern economy, transparency is a tangible asset.
Furthermore, the analysis reveals important dynamics that carry strategic implications. The accelerating rate of AI adoption (a 146.7% increase during the study period) signals a closing window of competitive advantage for early adopters. The findings also confirm that these benefits are not uniform across industries; sectors such as energy and materials, where ESG concerns are most material, derive the greatest financial rewards from AI-enhanced transparency. This aligns with materiality-based explanations and suggests that the return on investment in AI for ESG is highest where stakeholder scrutiny is most intense.
In synthesizing these discoveries, this research offers a generalized insight of significant importance for policymakers and corporate leaders, particularly in other emerging economies. The synergistic success observed in Saudi Arabia demonstrates that national strategies promoting parallel advancements in technological infrastructure and sustainability reporting standards can create a virtuous cycle, accelerating economic diversification and enhancing market integrity. For corporations, the message is unequivocal: integrating advanced technology into the core of sustainability strategy is no longer a peripheral activity but a central driver of long-term shareholder value. While this study’s scope was limited to conventional AI and a specific national context, it lays a robust foundation, validated against potential unobserved confounding variables, for future inquiries into the role of generative AI and the cross-country replicability of these powerful findings.