Systematic Review

Using Data Analytics in Financial Statement Fraud Detection and Prevention: A Systematic Review of Methods, Challenges, and Future Directions

Department of Economics, International Hellenic University, 62124 Serres, Greece
*
Author to whom correspondence should be addressed.
J. Risk Financial Manag. 2025, 18(11), 598; https://doi.org/10.3390/jrfm18110598
Submission received: 1 September 2025 / Revised: 13 October 2025 / Accepted: 21 October 2025 / Published: 24 October 2025
(This article belongs to the Section Applied Economics and Finance)

Abstract

Reliable financial reporting is critical for maintaining market confidence and guiding stakeholders’ decision-making, yet traditional audit methods often fail to detect sophisticated fraud schemes that are hidden within large volumes of transactional data. This systematic literature review synthesizes 43 empirical and theoretical studies published between 2010 and 2024 that utilize data analytics techniques for the prevention and detection of fraud in financial statements. Following the PRISMA guidelines, we conducted a four-phase review—identification, screening, eligibility assessment, and inclusion—to ensure transparency and reproducibility. Our analysis categorizes techniques into supervised machine learning classifiers (e.g., decision trees and neural networks), statistical anomaly detection methods, network-based analyses, and real-time monitoring frameworks. We evaluate each approach’s comparative effectiveness, highlight persistent challenges such as data imbalance, model interpretability, and governance constraints, and also trace evolving methodological trends over time. The review reveals that integrating predictive analytics and continuous monitoring into accounting information systems can transform audits from reactive investigations into proactive fraud prevention mechanisms. We conclude by proposing a future research agenda focusing on developing explainable AI models for audit applications, establishing robust data governance frameworks to support automated monitoring, and conducting longitudinal field studies to assess the real-world impact of analytics-driven controls.

1. Introduction

The aim of this study is to systematically review and critically synthesize the existing academic and professional literature on the use of data analytics in the detection and prevention of financial statement fraud, with a focus on evaluating methodological developments, practical applications, and associated challenges. The accuracy of financial reporting forms a crucial foundation of trust in modern economic systems, which is essential to support well-informed decision-making by investors, regulators, and other stakeholders (Dechow et al., 2010). However, recurring instances of fraud in financial statements present a major impediment to the credibility and operations of capital markets (ACFE, 2022). Despite continuous improvements in regulatory measures and auditing policies, the detection and deterrence of financial misrepresentation meet serious challenges when they are solely based on conventional mechanisms due to the increasing complexity, magnitude, and opaqueness of financial information (Rezaee, 2002).
In the modern era, the growing use of advanced data analytics techniques has brought about a major revolution in fraud detection and prevention. Data analytics combines several techniques, such as machine learning, statistical anomaly detection, text mining, and network analysis, and thus provides a powerful framework for analyzing vast and dissimilar datasets in a more thorough and accurate manner than conventional auditing procedures (Islam et al., 2024). These new technologies are crucial in highlighting subtle patterns and anomalies that might indicate fraudulent transactions, often before the occurrence of major financial irregularities or regulatory infractions (Warren et al., 2015). Apart from mere detection, data analytics enables the proactive development of plans for fraud prevention, thus augmenting organizational resistance to financial wrongdoing (Kokina & Davenport, 2017).
This literature review seeks to carry out a critical analysis of the current academic and professional discourse on the use of data analytics in financial statement fraud prevention and detection. Its aim is to synthesize the underlying theories, methodological developments, and empirical findings that define present practice while carefully evaluating the limitations, challenges, and ethical implications of such application. Through a systematic literature synthesis of relevant work, the review seeks to explain the potential contributions of data analytics to revolutionizing practices in this area, as well as to highlight potential lines of future research aimed at enhancing the integrity of financial reporting.
This study involves a systematic exploration of the current literature to determine the extent to which data analytics methods have enhanced the early detection of financial statement fraud compared with conventional auditing practices. A thorough understanding of the comparative effectiveness of such methods is imperative when measuring the impact of analytics on the development of fraud risk management. The review also covers pertinent methodological issues and challenges stemming from the use of data analytics to prevent fraud in financial reports. Through an in-depth inquiry into such challenges, this paper endeavors to clarify both the benefits and limitations of the analytical approaches that are currently in use. Finally, the review determines the degree to which the incorporation of advanced analytics tools, including machine learning algorithms and network analysis, improves the effectiveness of organizational strategies for fraud prevention. Taken together, these research questions form a coherent framework for investigating how data analytics can be used to maintain the integrity and reliability of financial reporting.
A systematic review provides a comprehensive analysis of a clearly defined research question, using systematic and explicit methods to select, appraise, and synthesize all relevant studies, as well as to collect and analyze data from the chosen studies (Moher et al., 2009). This study is centered around the following questions:
  • How have data analytics methods enhanced the identification of fraudulent activities in financial statements compared with traditional auditing methods?
  • What are the key methodological challenges and limitations concerning the use of data analytics to prevent financial reporting fraud?
  • How do sophisticated data analytics techniques, including machine learning and network analytics, strengthen the effectiveness of organizational strategies to avert fraud?
The structure of this manuscript is as follows: Section 2 describes the research process used to examine the role of data analytics in the detection and prevention of fraud in financial statements. Section 3 summarizes our findings, with the aim of answering the research questions. Sections 4 and 5 present a synthesis of the conclusions drawn from the literature review.
This review contributes to the literature by providing a comprehensive and up-to-date synthesis of the application of data analytics in financial statement fraud detection and prevention. Unlike prior systematic reviews that have focused more broadly on financial fraud, forensic accounting, or cybersecurity, this study adopts an exclusive focus on financial statement fraud, in accordance with its inclusion and exclusion criteria. Furthermore, it not only maps the evolution of analytical methods—including machine learning, anomaly detection, and network analysis—but also critically evaluates the methodological challenges, ethical implications, and organizational factors that shape their effectiveness. By integrating insights from accounting, data science, and fraud risk management, this review differentiates itself by offering a cross-disciplinary perspective and identifying future research directions for enhancing the transparency, interpretability, and practical applicability of fraud analytics tools.

2. Methodology

The main aim of a literature review is to systematically gather and assimilate the current body of knowledge that is relevant to a given research question, thereby forming the basis for further investigation. A well-executed literature review points out gaps in the existing research, evaluates the quality and limitations of previous work, and positions the new study within a broader academic debate (Tranfield et al., 2003). Within the context of fraud detection and prevention with the help of data analytics, a literature review allows for the observation of methodological advancements, evaluation of the effectiveness of analytical tools, and critical examination of both theoretical advancements and empirical findings. In addition, a review ensures that the research is based on sound evidence, prevents duplication, and evaluates the significance and novelty of existing research (Booth et al., 2021). Through the systematic synthesis of the prevailing research, a review not only helps in the development of a coherent theoretical model but also provides methodological guidance for future research.

2.1. Literature Review Protocol

To secure the required rigor and transparency for a thorough literature review, a systematic protocol was used. This review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Page et al., 2021), thus ensuring that each stage of the review process was carefully recorded and feasible for replication. An extensive search strategy was first developed, focusing on relevant databases, including Scopus, Web of Science, ScienceDirect, and Google Scholar. Multiple keywords and Boolean operators were applied, such as “data analytics,” “financial statement fraud,” “fraud detection,” “fraud prevention,” “machine learning,” and “forensic accounting.” The search criteria were limited to peer-reviewed journal articles, conference proceedings, and trustworthy reports published between 2010 and 2024, thus recognizing the impact of recent advancements in technology on fraud analytics. The duplicates were removed after the initial search, and the remaining articles were examined based on their titles and abstracts. Thorough full-text assessments were then performed to evaluate the relevance and quality of the studies based on established inclusion and exclusion criteria.

2.2. Analysis of the Literature Search Strategy

A highly structured strategy for performing a literature search is crucial for the effectiveness of a systematic review, since it ensures the identification, in-depth assessment, and integration of all relevant studies within the synthesis. The literature search process in this review was formulated to respond to the research questions regarding the use of data analytics in the detection and prevention of financial statement fraud, according to the best practice guidelines set within the systematic review methodology (Booth et al., 2021; Page et al., 2021).

2.2.1. Defining the Research Question and Search Terms

The first step of the literature review was the clarification of the research question and determination of crucial terminology relating to the topic of the study. The review aimed to examine the role of data analytics in detecting and preventing financial statement fraud; therefore, it was essential to include terminology relating to fraud detection and analytical techniques. Search terms utilized in the review included the following:
  • Fraud Identification and Mitigation: Terms such as “financial statement fraud,” “fraud detection,” “fraud mitigation,” “forensic accounting,” and “earnings manipulation.”
  • Data Analysis Methodologies: Terms like “data analysis,” “large collections of data,” “algorithmic learning methods,” “artificial intelligence,” “predictive analytics,” “quantitative methods,” and “outlier identification.”
The combination of these terminologies was made possible by using Boolean operators (AND, OR), thus increasing the effectiveness and accuracy of the search process. In addition, the use of Boolean operators made it possible to broaden or narrow search results to include terms that were relevant to misleading tactics and analytical methodologies.
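The way the two term groups combine can be sketched in a few lines of Python. This is only an illustration, assuming that terms within a group are joined with OR and the groups are joined with AND; the helper name `build_query` is hypothetical, only a subset of the listed terms is shown, and the exact field syntax differs between databases.

```python
def build_query(term_groups):
    """OR the terms inside each group, then AND the groups together."""
    clauses = []
    for terms in term_groups:
        quoted = [f'"{term}"' for term in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Subset of the search terms reported in the review (illustrative selection).
fraud_terms = ["financial statement fraud", "fraud detection", "forensic accounting"]
method_terms = ["machine learning", "predictive analytics", "artificial intelligence"]

query = build_query([fraud_terms, method_terms])
print(query)
```

Adding a term inside a group broadens the result set, while adding a whole new group narrows it.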

2.2.2. Database Selection

The databases selected for this review were chosen for their breadth, quality, and relevance to the research topic. Scopus is a comprehensive multidisciplinary database that provides access to peer-reviewed academic papers, conference proceedings, and scientific journals across a wide range of subject fields, including business, finance, and computer science. Web of Science, known for its rigorous indexing and citation tracking, provides access to a wide variety of high-impact journals and ensures inclusion of only high-quality, peer-reviewed studies, making it valuable for systematic reviews. ScienceDirect serves as a key resource for research in science and technology, hosting leading journals in areas such as data analytics, machine learning, and fraud detection. Finally, Google Scholar complements these databases by offering a broad and inclusive search of scholarly materials, including journal articles, conference papers, theses, and books, which helps capture relevant studies that may not be indexed in more specialized databases.

2.2.3. Search Limitation Criteria

To keep the reviewed research relevant to current practice, the scope was limited to studies published between 2010 and 2024. This time frame was chosen to cover the latest developments in data analytics, especially in machine learning and artificial intelligence, which have significantly changed the methods used in fraud detection and prevention (Kokina & Davenport, 2017). Omitting research published before 2010 allowed us to focus on the latest technological advancements and methods.
In addition to the publication date, the search was limited to English-language studies to maintain consistency and accessibility across the literature. Studies published in non-English languages were excluded, as they could not be adequately assessed without translation, which would introduce potential bias or inaccuracies in the synthesis of findings.

2.2.4. Inclusion and Exclusion Criteria

To ensure rigor and transparency, inclusion and exclusion criteria were carefully defined to identify the most relevant and high-quality studies for this review. Studies were eligible if they focused specifically on the application of data analytics—including techniques such as machine learning, artificial intelligence, and statistical analysis—in the detection or prevention of financial statement fraud. Eligible works included empirical research, theoretical assessments, or systematic reviews published in peer-reviewed journals or reputable conference proceedings, written in English, and published between 2010 and 2024. In contrast, studies were excluded if they addressed general fraud detection without a direct focus on financial statement fraud, examined types of fraud that were unrelated to financial reporting, or lacked methodological clarity and transparency. Articles that were not peer-reviewed, highly biased, anecdotal in nature, or primarily descriptive with minimal empirical or theoretical contribution were also excluded. These stringent criteria ensured that the final body of research was closely aligned with our research questions, maintained a high standard of academic integrity, and offered meaningful insights into the role of data analytics in enhancing fraud detection and prevention (Petticrew & Roberts, 2008).

2.2.5. Data Management and Screening Protocols

After the initial search, the screening and selection of studies were carried out in several phases to ensure transparency and replicability. First, duplicate entries were removed, followed by a preliminary screening of titles and abstracts. At this stage, studies that were clearly irrelevant (such as those focusing on cybersecurity fraud rather than financial reporting) were discarded. The remaining studies underwent full-text review to determine whether they met the inclusion criteria. Studies that did not meet these criteria were excluded at this stage, and the reasons for their exclusion were documented.
The final set of studies was then examined to confirm that it formed a comprehensive and balanced sample. The research followed the PRISMA 2020 guidelines, which promote an open and reproducible methodology for search strategies and screening processes (Page et al., 2021).
There were four basic phases of this analysis: (1) identification, (2) screening, (3) eligibility, and (4) inclusion. The tasks performed in each phase are outlined below:
  • Identification
    • Records retrieved through database queries (Scopus, ScienceDirect, Web of Science, Google Scholar): n = 585;
    • Duplicates identified and removed: n = 30;
    • Records remaining for evaluation after duplicate removal: n = 555.
  • Screening
    • Records screened by title and abstract: n = 555;
    • Records excluded (irrelevant topic, not peer-reviewed, wrong fraud type, etc.): n = 417.
  • Eligibility
    • Full-text articles assessed for eligibility: n = 138;
    • Full-text articles excluded, with reasons (e.g., lack of methodological transparency, anecdotal data, focus on cybersecurity fraud only): n = 95.
  • Included
    • Studies included in qualitative synthesis (systematic review): n = 43.
Figure 1 illustrates the strategy used to select studies from the database searches.
Table 1 presents the total number of journal articles included in this research.

3. Refining and Elaborating on the Results

Following a systematic review and the subsequent selection process, a final set of articles was created for analysis. This process yielded 43 relevant studies that met our defined criteria, providing a wide-ranging list of studies on the application of data analytics for the detection and prevention of fraud in financial statements. The findings of the review highlight the widespread and dynamic nature of the subject matter, with special focus given to significant innovations in analytical methods, most prominently those surrounding artificial intelligence and machine learning.

3.1. A Review of the Study Selection Process

The initial search returned a total of 585 records from the selected databases, which were then subjected to a rigorous screening process. The removal of duplicates reduced the number of records to 555, which were then screened according to their titles and abstracts.
During the screening process, 417 papers were excluded after title and abstract review, because they did not meet the predefined inclusion criteria. The most common reasons for exclusion included a focus on general cybersecurity fraud rather than financial statement fraud, a lack of methodological transparency, the absence of peer review, or minimal empirical/theoretical contribution. A further 95 papers were excluded at the full-text review stage for similar reasons, such as anecdotal evidence, a descriptive rather than analytical focus, or irrelevance to financial reporting fraud. These exclusions were conducted according to the criteria outlined in the inclusion and exclusion section and documented to ensure transparency and reproducibility. Although the number of excluded articles appears high, this reflects the deliberately broad initial search strategy, which was designed to maximize coverage and minimize the risk of missing relevant studies.
The application of systematic screening criteria ensured that only the most relevant and methodologically rigorous studies were included in the final synthesis (n = 43). The process followed the PRISMA 2020 guidelines, and the detailed flow of records through identification, screening, eligibility, and inclusion is presented in Figure 1.
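As a quick consistency check, the counts reported for the four PRISMA phases can be reconciled arithmetically. The short sketch below uses only the figures stated in this review; the variable names are illustrative.

```python
# Counts reported in the review's PRISMA flow.
identified = 585
duplicates_removed = 30
excluded_title_abstract = 417
excluded_full_text = 95

# Records screened by title and abstract.
screened = identified - duplicates_removed
# Full-text articles assessed for eligibility.
full_text_assessed = screened - excluded_title_abstract
# Studies included in the final synthesis.
included = full_text_assessed - excluded_full_text

print(screened, full_text_assessed, included)
```

Each derived value matches the corresponding figure reported in the text (555, 138, and 43).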

3.2. Distribution of Articles Across Methodological Approaches

The final sample comprised 43 studies, which used a wide range of research approaches. Most studies were empirical, representing 63% (n = 27) of all articles reviewed. The empirical studies used a range of research methods, including case studies, surveys, experiments, and secondary data analysis. The remaining studies (37%, n = 16) were theoretical publications, literature reviews, or conceptual models that provided critical insights into the theoretical underpinnings of fraud detection and prevention using data analytics.
Among the empirical studies, 17 used machine learning classifiers such as decision trees, neural networks, and support vector machines, reflecting a trend towards predictive analytics in the field. In addition, 10 empirical studies relied on statistical anomaly detection, showing that traditional methods coexist with more recent techniques. This range of methodologies indicates that data analytics tools are being applied in both research settings and practical fraud detection projects. Figure 2 illustrates the split: 63% of the studies are empirical, while 37% are theoretical or conceptual.
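The reported shares follow directly from the study counts. The sketch below recomputes them; treating the machine learning and anomaly detection counts as a partition of the empirical studies is an assumption made here because the two counts sum to 27, not something the review states explicitly.

```python
total_studies = 43
counts = {"empirical": 27, "theoretical_or_conceptual": 16}

# Percentage share of each category, rounded to the nearest whole percent.
shares = {name: round(100 * n / total_studies) for name, n in counts.items()}
print(shares)

# Assumption: the two empirical sub-groups do not overlap.
ml_classifiers = 17
statistical_anomaly_detection = 10
empirical_subtotal = ml_classifiers + statistical_anomaly_detection
print(empirical_subtotal == counts["empirical"])
```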

3.3. Key Themes and Findings from the Selected Studies

3.3.1. Artificial Intelligence and Machine Learning

A substantial number of studies (n = 20) examined the use of artificial intelligence (AI) and machine learning algorithms for fraud detection in financial statements. These studies highlighted the ability of machine learning algorithms to identify complex patterns in large datasets, an ability that often exceeds that of traditional techniques. For example, decision trees and neural network models have shown high levels of predictive accuracy in anticipating fraudulent behavior in financial reporting (Warren et al., 2015). In addition, AI-based tools, particularly those using natural language processing (NLP), have been used to scan the textual contents of financial disclosures, such as audit reports and footnotes, for signs of possible fraudulent behavior (Islam et al., 2024).
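To make the idea of a decision tree classifier concrete, the following minimal sketch fits a single split (a "decision stump") on one feature by minimizing weighted Gini impurity. It is not taken from any reviewed study: the accruals-to-assets ratios and labels are invented for illustration, and real models use many features and deeper trees.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_threshold(values, labels):
    """Return (threshold, impurity) for the split minimizing weighted Gini."""
    pairs = sorted(zip(values, labels))
    xs = [x for x, _ in pairs]
    best_t, best_score = None, float("inf")
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # no split point between identical values
        t = (xs[i] + xs[i - 1]) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Hypothetical accruals-to-assets ratios; 1 = labeled fraudulent, 0 = not.
ratios = [0.02, 0.03, 0.05, 0.04, 0.15, 0.18, 0.21, 0.06]
labels = [0, 0, 0, 0, 1, 1, 1, 0]

threshold, impurity = best_threshold(ratios, labels)


def predict(ratio):
    """Classify a firm-year as suspicious if its ratio exceeds the threshold."""
    return int(ratio > threshold)


predictions = [predict(r) for r in ratios]
print(threshold, impurity, predictions)
```

On this toy sample the classes separate cleanly, which is rarely the case with real financial data.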

3.3.2. Statistical and Anomaly Detection Methodologies

One of the most prominent themes to emerge from the review was the application of statistical models and anomaly detection algorithms (n = 12) that are specifically designed to recognize deviations from set benchmarks in financial data. The studies often highlighted the effectiveness of such models in detecting outliers or unusual patterns in financial reports that might indicate fraudulent operations (Kokina & Davenport, 2017). Techniques such as regression analysis, principal component analysis (PCA), and cluster analysis were widely used to identify and explain irregularities in financial reports, including revenue recognition and asset valuation anomalies.
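The notion of a deviation from a benchmark can be illustrated with a much simpler technique than those named above: a robust outlier score based on the median and the median absolute deviation (MAD). The sketch below is a stand-in for illustration only; the revenue growth figures are invented, and the 0.6745 scaling constant and 3.5 cutoff are conventional choices for the modified z-score rather than values from the reviewed studies.

```python
from statistics import median


def modified_z_scores(values):
    """Modified z-scores using the median and median absolute deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; scores are undefined for this sample")
    return [0.6745 * (v - med) / mad for v in values]


# Hypothetical quarterly revenue growth rates in percent.
growth = [4.1, 3.8, 4.5, 3.9, 4.2, 4.0, 19.5, 4.3]

scores = modified_z_scores(growth)
CUTOFF = 3.5  # conventional cutoff; an assumption, tune for the data at hand
flagged = [i for i, s in enumerate(scores) if abs(s) > CUTOFF]
print(flagged)
```

Because the median and MAD are barely affected by the extreme value, the outlier does not mask itself the way it can with a mean-and-standard-deviation score.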

3.3.3. Real-Time Fraud Prevention

A number of studies (n = 8) examined the use of data analytics for proactive fraud prevention in real-time environments. These studies highlighted the use of predictive analytics and real-time monitoring systems by financial institutions and corporate auditors to stop fraudulent activities before they occur. By applying real-time data analysis, supported by big data infrastructure, organizations can pre-emptively identify high-risk transactions or individuals and intervene before substantial losses are incurred (Dechow et al., 2010).

3.3.4. Obstacles and Constraints

Despite the expected benefits of the application of data analytics for fraud prevention and detection, many studies (n = 13) reported that hurdles occurred in the actual application of these technologies. These include data quality concerns, the need for large datasets, and the complexities involved with algorithmic models (Petticrew & Roberts, 2008). Other concerns related to the ethical implications of the application of automated decision-making systems for detecting fraud, most notably the threat of algorithmic bias (Rezaee, 2002). These concerns highlight the need for more scholarly investigation of the ethical concerns and pragmatic limitations that are experienced when implementing data analytics for fraud detection in real-life applications. Figure 3 presents analytical techniques and themes identified in the literature.

3.3.5. Summary of Results

The overall analysis of the 43 studies revealed a significant development in the use of data analytics for fraud detection and prevention. A clear trend of moving towards the use of machine learning and artificial intelligence is depicted through the findings. However, the studies also point out the importance of resolving issues of data quality, methodological rigor, and ethical considerations to realize the full potential of these technologies. The wide range of methodologies used and the complexity of different contexts involved in fraud detection reported in the literature are a clear indication of the increasing recognition of data analytics for maintaining the integrity of financial reports.

3.4. Analysis of Literature Review Results Against Research Questions

3.4.1. Research Question 1

How have data analytics methods enhanced the identification of fraudulent activities in financial statements compared with traditional auditing methods?
The literature indicates that data analytics techniques have markedly enhanced the early detection of fraudulent transactions in audited financial reports compared with standard manual auditing. Traditional audits, which rely mainly on sampling, ratio analysis, and subjective judgment, are inherently limited in their ability to detect complex or well-hidden fraudulent activities (Dechow et al., 2010; Rezaee, 2002). In comparison, data analytics techniques, especially those based on machine learning algorithms, systematically process large and complex datasets and can therefore detect patterns and anomalies that easily escape human auditors (Perols, 2011; Islam et al., 2024).
The studies by Kokina and Davenport (2017) and Warren et al. (2015) indicate that predictive models and anomaly detection can flag abnormal transactions and fraud patterns considerably earlier than conventional manual auditing. In addition, artificial intelligence (AI) methods, including neural networks and support vector machines, have been reported to achieve higher prediction accuracy, with some models attaining classification accuracy rates of more than 90% in financial statement fraud detection (Gaganis, 2009). Furthermore, real-time monitoring systems that utilize big data analytics (Ngai et al., 2012) have added a proactive element to fraud detection. Instead of reacting only after fraud has occurred, organizations can detect suspicious activities in real time, enabling timely intervention. Overall, the available evidence suggests that data analytics not only enhances the accuracy and speed of fraud detection but also promotes a proactive, rather than solely reactive, approach to fraud threats.

3.4.2. Research Question 2

What are the key methodological challenges and limitations to the use of data analytics in fraud prevention in financial reporting?
Despite these significant benefits, the literature identifies various methodological issues and limitations in using data analytics to prevent fraud. One of the most common issues concerns data quality and availability. Fraudulent instances are rare events, so datasets tend to be imbalanced, with legitimate instances far outnumbering fraud cases. This imbalance can bias machine learning models towards the majority class, so that fraudulent transactions are misclassified as legitimate.
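The effect of class imbalance can be shown with a small worked example. In the sketch below, a trivial classifier that labels every record as legitimate reaches 99% accuracy yet detects no fraud at all. The 990/10 split is invented for illustration, and the inverse-frequency weighting shown at the end is one common mitigation heuristic, not a method attributed to the reviewed studies.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall on the fraud class) for 0/1 labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_pos = sum(y_true)
    accuracy = correct / len(y_true)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    classes = sorted(set(y))
    return {c: len(y) / (len(classes) * y.count(c)) for c in classes}


# Hypothetical sample: 990 legitimate (0) and 10 fraudulent (1) records.
y_true = [0] * 990 + [1] * 10
always_legitimate = [0] * len(y_true)

accuracy, recall = accuracy_and_recall(y_true, always_legitimate)
weights = balanced_class_weights(y_true)
print(accuracy, recall, weights)
```

This is why recall, precision, or similar class-sensitive metrics are usually more informative than raw accuracy when fraud cases are rare.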
In addition, many models essentially depend on past history, which might not reflect the changing patterns of fraudulent activity (Ravisankar et al., 2011). Fraudsters often change tactics, which means that models based on outdated trends are likely to be less effective over time, unless they are updated with fresh data on a regular basis.
The interpretability of models poses a significant limitation. Even though a very high degree of precision can be obtained with sophisticated models like deep neural networks, these systems often behave as “black boxes,” which causes problems for auditors and regulators, who aim to understand the basis of fraud prediction (Warren et al., 2015). This lack of transparency is a danger for the integrity and equitable application of these systems for auditing and regulatory purposes.
Additionally, the deployment of big data analytics raises ethical and privacy concerns, particularly regarding the real-time monitoring of employees' work or of customer interactions with staff (Petticrew & Roberts, 2008). Careful oversight is necessary to ensure that fraud prevention activities do not infringe on people’s rights.
In summary, while data analytics methodologies offer powerful tools for the detection of fraudulent activities, their deployment is limited by data quality-related challenges, model bias issues, transparency concerns, and ethical hurdles. All these present a compelling case for responsible supervision to ensure their effectiveness and ethical application.

3.4.3. Research Question 3

To what extent does the addition of complex data analysis tools (like machine learning and network analysis) strengthen the effectiveness of fraud deterrence systems in organizations?
The integration of advanced data analytics tools into organizations’ fraud prevention measures has led to considerable improvements in their effectiveness; however, the magnitude of such improvements is driven by a combination of organizational factors, with data infrastructure, cultural environment, and skill sets being important examples. Extensive empirical research suggests that organizations that adopt machine learning and network analysis approaches have achieved notable improvements in their ability to forecast, detect, and control fraud risks (Ngai et al., 2012; Kokina & Davenport, 2017; Islam et al., 2024).
When properly implemented, machine learning algorithms allow organizations to detect complex fraudulent conspiracies that are beyond the reach of traditional oversight systems (Gaganis, 2009). Of particular importance is network analysis, which enables visualization and comprehension of the relationships between transactions, customers, and staff, thus exposing hidden collusion networks that are probably undetectable through normal auditing procedures (Goel & Gangolly, 2012).
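One elementary form of network analysis is to link entities that share an attribute and then look for connected groups. The sketch below finds connected components with a breadth-first search; all entity names and links are hypothetical, and the rule of flagging groups containing two or more employees is an illustrative assumption rather than a published detection criterion.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Group nodes of an undirected graph into connected components."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        queue, group = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(group)
    return components


# Hypothetical links: approvals and shared bank accounts.
edges = [
    ("emp_A", "vendor_1"),
    ("vendor_1", "account_X"),
    ("emp_B", "account_X"),
    ("emp_C", "vendor_2"),
]

components = connected_components(edges)
# Illustrative rule: a group tying together two or more employees merits review.
suspicious = [
    group for group in components
    if sum(name.startswith("emp_") for name in group) >= 2
]
print(suspicious)
```

Here two employees turn out to be connected through a vendor and a shared account, a relationship that would not be visible when each transaction is examined in isolation.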
Additionally, integrating predictive analytics with continuous monitoring shifts fraud prevention from reactive analysis to a proactive approach. Real-time anomaly detection systems supported by big data platforms enable continuous measurement of fraud risk, allowing timely intervention before significant economic losses occur (Hernandez Aros et al., 2024).
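A continuous monitoring loop of this kind can be sketched with an online estimate of the mean and standard deviation (Welford's algorithm), so that each incoming value is scored against the history seen so far without storing it. The transaction amounts, the warm-up length, and the cutoff of 3 standard deviations are all assumptions chosen for illustration.

```python
import math


class RunningStats:
    """Online mean and sample standard deviation (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def monitor(stream, z_cutoff=3.0, warmup=5):
    """Return (index, value) pairs that deviate strongly from the baseline."""
    stats = RunningStats()
    alerts = []
    for i, x in enumerate(stream):
        if stats.n >= warmup and stats.std > 0:
            z = (x - stats.mean) / stats.std
            if abs(z) > z_cutoff:
                alerts.append((i, x))
                continue  # keep the flagged value out of the baseline
        stats.update(x)
    return alerts


# Hypothetical stream of transaction amounts.
amounts = [100, 102, 98, 101, 99, 100, 103, 97, 500, 101]
alerts = monitor(amounts)
print(alerts)
```

Excluding flagged values from the baseline is a design choice: it keeps one extreme transaction from inflating the variance and hiding later anomalies, at the cost of adapting more slowly to genuine shifts in behavior.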
However, the extent of its effectiveness depends on the readiness of the organization. The current literature suggests that organizations need to have adequate data governance mechanisms in place, employ competent staff, and develop a data-driven decision-making culture in order to maximize the benefits that can be derived through sophisticated analytics (Innan et al., 2024). Without these elements, even sophisticated analytical tools can produce less-than-optimal results or be underutilized.
In summary, the literature supports the claim that sophisticated data analysis tools further improve the proactive character, effectiveness, and efficiency of fraud prevention measures in organizations. However, the level of improvement depends on the organization’s data sophistication levels, as well as its commitment to the proper use of analytic techniques. Figure 4 shows the number of reviewed studies that specifically confirm each of the three research questions.

3.5. Annual Publication Trends

The temporal trend of scholarly studies on the use of data analytics for financial statement fraud detection, as illustrated in Figure 5, indicates a sharp rise in academic interest in this topic over the last ten years. Between 2010 and 2014, the number of relevant research articles was low and stable, with annual output never exceeding three. This period was dominated by initial studies and early attempts at applying data mining methods to detect irregularities in financial reporting. However, starting from approximately 2015, a steady increase can be observed, which occurred in tandem with the wider adoption of big data technologies and machine learning by the disciplines of accountancy and auditing.
Since 2018, the number of scholarly publications has risen markedly, peaking at around 18 studies in 2024. This rise can be attributed to better access to complex analytical tools, heightened regulatory pressure to prevent fraud in the wake of corporate governance scandals, and greater confidence in the effectiveness of algorithmic audit assistance systems. Furthermore, the rise in publications aligns with improvements in computational power and cloud-based analytics platforms, which have made real-time fraud detection more credible. This trend shows that data analytics has progressed from a specialized skill set to a critical component of contemporary fraud risk management and highlights the significance of, and demand for, continued scholarly investigation of the subject. Figure 5 shows the number of publications per year.

3.6. Journal Dissemination

A review of the journals that publish studies on the use of data analytics to detect financial statement fraud shows that such studies are concentrated in several premier interdisciplinary journals. Decision Support Systems stands out especially, both in the number of published papers (n = 6) and its CiteScore (12.5), indicating its central position in sharing advanced analytical and decision-making techniques relating to fraud detection. The Journal of Accounting and Economics, although it has fewer publications (n = 3), has the highest CiteScore (15.3), reflecting the sound theoretical and empirical basis of its published work on fraud.
Expert Systems with Applications has become a leading publication outlet (n = 4), with a clear focus on the application of artificial intelligence and machine learning to fraud detection. Furthermore, publications such as Accounting Horizons and the Journal of Emerging Technologies in Accounting, which are both affiliated with the American Accounting Association, reflect an institutional realignment within auditing toward the implementation of cutting-edge technologies.
The Managerial Auditing Journal and the Journal of Financial Crime, which are both published by Emerald, provide insights that are immediately relevant to practitioners’ needs, often focusing on organizational and compliance dimensions of fraud analytics. These publications reflect a cross-disciplinary approach—from accounting to information systems and applied artificial intelligence—thereby emphasizing the complex nature of fraud detection and highlighting the need for multiple analytical perspectives.
This analysis highlights that important contributions are coming not only from traditional accounting journals, but also from interdisciplinary fields like data science and decision support systems. This reflects a convergence of technological innovation with financial regulation in academic publishing. Table 2 summarizes our publication analysis grouped by journal, including each journal’s ranking, CiteScore (2023), number of papers identified during the literature review, and publisher.

3.7. Analysis of Citation Counts

Our analysis of citation counts indicated the broad impact of a number of foundational research studies on the use of data analytics to detect financial fraud. Of the studies reviewed, the study by Perols (2011), published in Auditing: A Journal of Practice & Theory, is the most cited, with more than 500 citations. This study rigorously compares traditional statistical methods to machine learning techniques, thus setting a performance benchmark that has had considerable influence across numerous subsequent studies.
The next most cited studies are those by Ngai et al. (2011) and Ravisankar et al. (2011), both published in Decision Support Systems, with citation counts of 450 and 380, respectively. Together, these studies contribute a formal classification of fraud detection methods and demonstrate the potential of data mining approaches, such as decision trees and neural networks, for detecting fraudulent financial activities. The large number of citations for each publication reflects their cross-disciplinary influence, linking fields of application that include accounting, data science, and information systems.
Kokina and Davenport’s (2017) article in the Journal of Emerging Technologies in Accounting is also highly cited, with over 300 citations. The article explores how artificial intelligence is transforming auditing and has been widely referenced in both academic and professional circles. Similarly, research by Warren et al. (2015) on the applications of machine learning and big data in accounting illustrates the growing use of computational methods in conventionally conservative auditing environments.
Citation patterns reveal that the most influential studies in this area combine methodological rigor, high practical significance, and an interdisciplinary scope. Articles linking artificial intelligence with core accounting concepts or proposing innovative fraud detection frameworks tend to attract the most scholarly attention. This finding suggests a high demand for consequential and methodologically sound research that addresses theoretical as well as practical issues of financial statement fraud analytics. Table 3 shows the most cited articles in the literature review, including the author(s), article title, number of citations, research area, and journal.

3.8. Distribution of Publications by Research Area

An analysis of the topics investigated in the published works shows a multidisciplinary intersection of traditional accounting scholarship with the emerging field of data science. Most of the articles (n = 15) fall under accounting and auditing, reflecting the dominant role of financial regulation in studies on fraudulent behavior. Such studies often highlight the role of data analytics techniques in the framework of auditing and their significance for enhancing the quality and accuracy of reported data.
The second largest group of articles falls under machine learning and artificial intelligence (n = 12) and documents the rapid application of these technologies to fraud detection. These studies cover a range of models, including decision trees, neural networks, and support vector machines, that are reported to be more effective at identifying anomalous patterns than more standard methods.
The disciplines of statistics and data mining account for a considerable proportion of the reviewed literature (n = 10). These studies usually present considerable methodological contributions, investigating the extent to which clustering, regression, and anomaly detection uncover hidden relationships in complex financial datasets.
Smaller but notable contributions stem from forensic and financial criminology (n = 5), which frames fraud as a behavioral and legal problem rather than a statistical anomaly. Research on information systems (n = 5) considers the critical convergence of infrastructure and technology that is required for the timely detection of fraudulent schemes, while the area of corporate governance and ethics (n = 3) explores the impacts of organizational culture and ethical oversight on the occurrence and detection of fraud.
This distribution validates the claim that addressing financial statement fraud through data analytics is inherently interdisciplinary, requiring input from both quantitative modeling and the institutional environment in which it is applied. It further emphasizes the increasing need for collaborative efforts between data scientists and accounting professionals to develop integrated and flexible frameworks for fraud detection. Figure 6 is a horizontal bar chart showing the distribution of papers included in the literature review when classified by subject area.

4. Comparative Insights

While the prior sections outlined the descriptive patterns emerging from the reviewed studies, a comparative assessment provides deeper understanding of how different approaches contribute to fraud detection and prevention. A key contrast emerges between machine learning techniques (such as decision trees, neural networks, and support vector machines) and statistical anomaly detection models (including regression, cluster analysis, and principal component analysis). Machine learning approaches generally demonstrate superior predictive accuracy, in some cases exceeding 90% classification success. However, they often lack transparency, functioning as “black boxes” that limit their interpretability for auditors and regulators. In contrast, anomaly detection and statistical models, while less powerful in predictive performance, provide clearer interpretability and remain more aligned with the needs of auditors, who must justify fraud assessments in regulatory and legal contexts. This trade-off underscores the tension between accuracy and interpretability when adopting advanced analytics for fraud detection.
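A short worked example may help in reading headline accuracy figures such as those above, given that data imbalance is one of the persistent challenges identified in this review. The sketch below computes accuracy, recall, and precision from confusion-matrix counts. The sample of 1000 firms with 50 fraud cases, and both sets of counts, are hypothetical numbers chosen for illustration and are not taken from any reviewed study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, recall, and precision from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Share of actual fraud cases that were caught.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of flagged cases that were actually fraud.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical sample: 1000 firms, 50 of them fraudulent.
# Model A labels every firm as non-fraudulent.
always_clean = classification_metrics(tp=0, fp=0, fn=50, tn=950)

# Model B flags 140 firms and catches 40 of the 50 fraud cases.
screening = classification_metrics(tp=40, fp=100, fn=10, tn=850)

for name, metrics in [("always_clean", always_clean), ("screening", screening)]:
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Under these assumed counts, the model that never flags anything scores higher on accuracy than the model that catches most fraud cases, so accuracy alone is a weak basis for comparing approaches when fraud is rare. Recall and precision give auditors a clearer view of missed cases and false alarms.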
Another comparative insight arises when examining empirical versus theoretical contributions. The majority of empirical studies validate the practical feasibility of using machine learning and predictive analytics to identify fraudulent transactions earlier and more reliably than using traditional audit methods. However, theoretical and conceptual papers frequently caution against over-reliance on these tools without adequate governance structures, citing risks of algorithmic bias, ethical dilemmas, and the erosion of professional judgment. Thus, while empirical findings highlight technological potential, theoretical contributions emphasize contextual limitations, suggesting that a balanced integration of analytics with professional oversight remains essential.
Disciplinary perspectives also diverge. Accounting and auditing research typically focuses on the integration of analytics into existing assurance frameworks, stressing compliance, reporting standards, and professional accountability. Data science contributions, on the other hand, emphasize algorithmic efficiency, methodological innovation, and computational scalability, often without addressing practical constraints in auditing environments. Meanwhile, forensic criminology and governance studies frame fraud not only as a data anomaly but also as a behavioral and organizational issue, highlighting cultural, ethical, and governance mechanisms that influence the occurrence and detection of fraud. Together, these disciplinary contrasts reveal that effective fraud detection requires not only technological advances but also interdisciplinary collaboration to bridge methodological rigor with real-world applicability.
In summary, these comparative insights reinforce our answers to the three research questions guiding this review. First, while machine learning models outperform traditional and statistical approaches in terms of predictive accuracy, their interpretability challenges limit their full adoption in auditing practice. Second, the contrast between empirical validation and theoretical critique underscores the methodological and ethical challenges associated with analytics-based fraud detection. Finally, the disciplinary differences highlight that the effectiveness of advanced tools depends not only on technological innovation but also on organizational readiness, governance frameworks, and ethical oversight. Taken together, the evidence suggests that the future of financial statement fraud detection lies in hybrid approaches that combine the predictive power of advanced analytics with the transparency, accountability, and contextual understanding required in professional and regulatory settings.

5. Conclusions

This systematic literature review critically examined 43 peer-reviewed studies published between 2010 and 2024 on the application of data analytics in the detection and prevention of financial statement fraud. The synthesis demonstrates a clear shift from traditional, manual auditing techniques toward advanced, data-driven approaches such as machine learning, anomaly detection, and network analysis. These methods have been shown to enhance both the timeliness and accuracy of fraud detection, enabling organizations to transition from reactive responses to proactive fraud risk management.
Beyond summarizing existing work, this review provides comparative insights that highlight important trade-offs and challenges. Machine learning models offer superior predictive accuracy but often lack interpretability, limiting their regulatory acceptance. Statistical and anomaly detection models, while less accurate, remain more transparent and therefore more usable by auditors. Empirical studies validate the effectiveness of these techniques in practice, whereas theoretical contributions emphasize the risks of algorithmic bias, ethical dilemmas, and the need for governance. Disciplinary differences also emerged: accounting research stresses compliance and professional accountability, data science emphasizes algorithmic innovation, and criminology highlights the cultural and behavioral dimensions of fraud. Together, these insights demonstrate that effective fraud detection requires hybrid approaches that combine technological innovation with ethical oversight and professional judgment.
The impact of this study lies in its ability to consolidate fragmented research into a coherent framework and translate it into actionable knowledge. For researchers, this review identifies gaps in explainable AI, industry-specific fraud applications, and the development of standardized fraud risk scoring models that combine traditional ratio analysis with advanced analytics. For practitioners, it provides guidance on integrating data analytics into audit processes, investing in data infrastructures and cross-disciplinary expertise, and adopting proactive monitoring tools. For regulators, it emphasizes the need to establish ethical guidelines and auditability standards that encourage responsible innovation while safeguarding transparency and fairness.
In conclusion, this review contributes to both academic scholarship and professional practice by offering a balanced, interdisciplinary perspective on how data analytics can strengthen the integrity of financial reporting. The findings confirm that while technology has transformed fraud detection, its success ultimately depends on aligning analytical power with governance, accountability, and ethical application.

6. Future Research and Practical Applications

This review highlights several important directions for future research while also offering practical implications for auditors, regulators, and firms. From a research perspective, longitudinal studies are needed to evaluate the sustained effectiveness of anti-fraud systems over time and across industries with different risk profiles. Equally important is advancing the interpretability and transparency of sophisticated AI models to ensure that they can be trusted and adopted by auditors and regulators. A key gap in the literature is the absence of standardized frameworks that allow stakeholders to systematically evaluate fraud risks. While machine learning models achieve impressive predictive performance, their “black box” nature raises concerns about accountability and regulatory acceptance. Future research should therefore focus on developing explainable, user-friendly tools that can be integrated into auditing practice without requiring deep technical expertise.
A promising avenue is the design of fraud risk scoring systems that combine traditional financial ratios with analytics-based indicators. Such systems could generate composite fraud risk scores—expressed in intuitive percentage terms—that auditors and internal control professionals can readily apply in decision-making. This approach would bridge the gap between conventional ratio analysis and advanced machine learning, producing tools that are both empirically robust and operationally accessible.
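One possible shape for such a scoring system is sketched below: standardized indicators are combined as a weighted sum and mapped through a logistic function to a value between 0 and 100. This is an illustrative design only and not a validated model. The indicator names, the weights, the intercept, and the assumption that inputs are already standardized are all hypothetical; in practice the weights would have to be estimated from labeled data and validated.

```python
import math

# Hypothetical weights for standardized indicators (illustrative, not estimated).
WEIGHTS = {
    "receivables_growth_vs_sales": 1.2,  # traditional ratio-based indicator
    "accruals_to_assets": 0.9,           # traditional ratio-based indicator
    "anomaly_score": 1.5,                # output of an analytics model
}
INTERCEPT = -2.0  # assumed baseline; sets the score when all indicators are zero


def contributions(indicators):
    """Per-indicator contribution to the linear score, for explanation purposes."""
    return {name: WEIGHTS[name] * indicators[name] for name in WEIGHTS}


def fraud_risk_percent(indicators):
    """Map the weighted sum of indicators to a 0-100 score via a logistic function."""
    z = INTERCEPT + sum(contributions(indicators).values())
    return 100.0 / (1.0 + math.exp(-z))


# Hypothetical firms with standardized indicator values.
typical_firm = {"receivables_growth_vs_sales": 0.0, "accruals_to_assets": 0.0, "anomaly_score": 0.0}
unusual_firm = {"receivables_growth_vs_sales": 1.5, "accruals_to_assets": 1.0, "anomaly_score": 2.0}

print(round(fraud_risk_percent(typical_firm), 1))
print(round(fraud_risk_percent(unusual_firm), 1))
print(contributions(unusual_firm))
```

Because the score is a monotone function of a weighted sum, each indicator's contribution can be reported next to the final percentage, which is one way to keep the result explainable to auditors.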
On a practical level, several concrete recommendations emerge. Auditors should integrate advanced analytics into their standard procedures, supplementing traditional sampling and ratio analysis with anomaly detection and machine learning models, while also investing in training to interpret outputs responsibly. Regulators should establish guidelines for the ethical use of AI in fraud detection, with particular attention to issues of algorithmic bias, model transparency, and auditability. They can also promote the industry-wide adoption of standardized fraud risk scoring frameworks to ensure comparability and fairness. Firms should prioritize investments in high-quality data infrastructures, hire or upskill staff with cross-disciplinary expertise in accounting and data science, and foster a culture that supports data-driven decision-making. Collaborative efforts between firms, auditors, and regulators are essential to ensure that fraud analytics technologies are implemented responsibly, effectively, and in ways that strengthen the integrity of financial reporting.

Author Contributions

Conceptualization, M.G. and D.K.; methodology, M.G.; software, M.G.; validation, M.G., D.K. and M.P.; formal analysis, M.G.; investigation, M.G.; resources, M.G.; data curation, M.G.; writing—original draft preparation, M.G.; writing—review and editing, M.G.; visualization, M.G.; supervision, D.K. and M.P.; project administration, D.K. and M.P.; funding acquisition, M.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Aboud, A., & Robinson, B. (2022). Fraudulent financial reporting and data analytics: An explanatory study from Ireland. Accounting Research Journal, 35(1), 21–36.
  2. Albashrawi, M. (2016). Detecting financial fraud using data mining techniques: A decade review from 2004 to 2015. Journal of Data Science, 14(3), 553–569.
  3. Ashtiani, M. N., & Raahemi, B. (2021). Intelligent fraud detection in financial statements using machine learning and data mining: A systematic literature review. IEEE Access, 10, 72504–72525.
  4. Association of Certified Fraud Examiners (ACFE). (2022). Report to the nations: 2022 global study on occupational fraud and abuse. ACFE.
  5. Banarescu, A. (2015). Detecting and preventing fraud with data analytics. Procedia Economics and Finance, 32, 1827–1836.
  6. Bello, O. A., Folorunso, A., Ejiofor, O. E., Budale, F. Z., Adebayo, K., & Babatunde, O. A. (2023). Machine learning approaches for enhancing fraud prevention in financial transactions. International Journal of Management Technology, 10(1), 85–108.
  7. Booth, A., Martyn-St James, M., Clowes, M., & Sutton, A. (2021). Systematic approaches to a successful literature review. Sage Publications.
  8. Brown, N. C., Crowley, R. M., & Elliott, W. B. (2020). What are you saying? Using topic to detect financial misreporting. Journal of Accounting Research, 58(1), 237–291.
  9. Chen, Y. J., Liou, W. C., Chen, Y. M., & Wu, J. H. (2019). Fraud detection for financial statements of business groups. International Journal of Accounting Information Systems, 32, 1–23.
  10. Cheng, C. H., Kao, Y. F., & Lin, H. P. (2021). A financial statement fraud model based on synthesized attribute selection and a dataset with missing values and imbalanced classes. Applied Soft Computing, 108, 107487.
  11. Craja, P., Kim, A., & Lessmann, S. (2020). Deep learning for detecting financial statement fraud. Decision Support Systems, 139, 113421.
  12. Dechow, P., Ge, W., & Schrand, C. (2010). Understanding earnings quality: A review of the proxies, their determinants and their consequences. Journal of Accounting and Economics, 50(2–3), 344–401.
  13. Dutta, I., Dutta, S., & Raahemi, B. (2017). Detecting financial restatements using data mining techniques. Expert Systems with Applications, 90, 374–393.
  14. Fotoh, L. E., & Lorentzon, J. I. (2023). Audit digitalization and its consequences on the audit expectation gap: A critical perspective. Accounting Horizons, 37(1), 43–69.
  15. Gaganis, C. (2009). Classification techniques for the identification of falsified financial statements: A comparative analysis. Intelligent Systems in Accounting, Finance & Management: International Journal, 16(3), 207–229.
  16. Gepp, A., Kumar, K., & Bhattacharya, S. (2021). Lifting the numbers game: Identifying key input variables and a best-performing model to detect financial statement fraud. Accounting & Finance, 61(3), 4601–4638.
  17. Goel, S., & Gangolly, J. (2012). Beyond the numbers: Mining the annual reports for hidden cues indicative of financial statement fraud. Intelligent Systems in Accounting, Finance and Management, 19(2), 75–89.
  18. Gupta, S., & Mehta, S. K. (2024). Data mining-based financial statement fraud detection: Systematic literature review and meta-analysis to estimate data sample mapping of fraudulent companies against non-fraudulent companies. Global Business Review, 25(5), 1290–1313.
  19. Hamal, S., & Senvar, Ö. (2021). Comparing performances and effectiveness of machine learning classifiers in detecting financial accounting fraud for Turkish SMEs. International Journal of Computational Intelligence Systems, 14(1), 769–782.
  20. Hasan, M. M., Popp, J., & Oláh, J. (2020). Current landscape and influence of big data on finance. Journal of Big Data, 7(1), 21.
  21. Hernandez Aros, L., Bustamante Molano, L. X., Gutierrez-Portela, F., Moreno Hernandez, J. J., & Rodríguez Barrero, M. S. (2024). Financial fraud detection through the application of machine learning techniques: A literature review. Humanities and Social Sciences Communications, 11(1), 1–22.
  22. Innan, N., Khan, M. A. Z., & Bennai, M. (2024). Financial fraud detection: A comparative study of quantum machine learning models. International Journal of Quantum Information, 22(02), 2350044.
  23. Islam, T., Islam, S. M., Sarkar, A., Obaidur, A., Khan, R., Paul, R., & Bari, M. S. (2024). Artificial intelligence in fraud detection and financial risk mitigation: Future directions and business applications. International Journal for Multidisciplinary Research, 6(5), 1–23.
  24. Ismail, M. M., & Haq, M. A. (2024). Enhancing enterprise financial fraud detection using machine learning. Engineering, Technology & Applied Science Research, 14(4), 14854–14861.
  25. Kassem, R., & Omoteso, K. (2024). Effective methods for detecting fraudulent financial reporting: Practical insights from Big 4 auditors. Journal of Accounting Literature, 46(4), 587–610.
  26. Kokina, J., & Davenport, T. H. (2017). The emergence of artificial intelligence: How automation is changing auditing. Journal of Emerging Technologies in Accounting, 14(1), 115–122.
  27. Leonov, P., Kozhina, A., Leonova, E., Epifanov, M., & Sviridenko, A. (2020). Visual analysis in identifying a typical indicators of financial statements as an element of artificial intelligence technology in audit. Procedia Computer Science, 169, 710–714.
  28. Lin, C. C., Chiu, A. A., Huang, S. Y., & Yen, D. C. (2015). Detecting the financial statement fraud: The analysis of the differences between data mining techniques and experts’ judgments. Knowledge-Based Systems, 89, 459–470.
  29. Lokanan, M. E., & Sharma, K. (2022). Fraud prediction using machine learning: The case of investment advisors in Canada. Machine Learning with Applications, 8, 100269.
  30. Lokanan, M. E., Tran, V., & Vuong, N. H. (2019). Detecting anomalies in financial statements using machine learning algorithm: The case of Vietnamese listed firms. Asian Journal of Accounting Research, 4(2), 181–201.
  31. Lu, Q., Fu, C., Nan, K., Fang, Y., Xu, J., Liu, J., & Lee, B. G. (2023). Chinese corporate fraud risk assessment with machine learning. Intelligent Systems with Applications, 20, 200294.
  32. Moher, D., Liberati, A., Tetzlaff, J., & Altman, D. G. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. BMJ, 339, b2535.
  33. Ngai, E. W., Hu, Y., Wong, Y. H., Chen, Y., & Sun, X. (2011). The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature. Decision Support Systems, 50(3), 559–569.
  34. Ngai, E. W., Leung, T. K. P., Wong, Y. H., Lee, M. C., Chai, P. Y. F., & Choi, Y. S. (2012). Design and development of a context-aware decision support system for real-time accident handling in logistics. Decision Support Systems, 52(4), 816–827.
  35. Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, R., Glanville, J., Grimshaw, J. M., Hróbjartsson, A., Lalu, M. M., Li, T., Loder, E. W., Mayo-Wilson, E., McDonald, S., … Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372, n71.
  36. Papík, M., & Papíková, L. (2022). Detecting accounting fraud in companies reporting under US GAAP through data mining. International Journal of Accounting Information Systems, 45, 100559.
  37. Perols, J. (2011). Financial statement fraud detection: An analysis of statistical and machine learning algorithms. Auditing: A Journal of Practice & Theory, 30(2), 19–50.
  38. Petticrew, M., & Roberts, H. (2008). Systematic reviews in the social sciences: A practical guide. John Wiley & Sons.
  39. Pizzi, S., Venturelli, A., Variale, M., & Macario, G. P. (2021). Assessing the impacts of digital transformation on internal auditing: A bibliometric analysis. Technology in Society, 67, 101738.
  40. PRISMA Statement. (2021). PRISMA transparent reporting of systematic reviews and meta analysis. Available online: http://www.prisma-statement.org/prisma-2020-flow-diagram (accessed on 14 April 2025).
  41. Putra, I., Sulistiyo, U., Diah, E., Rahayu, S., & Hidayat, S. (2022). The influence of internal audit, risk management, whistleblowing system and big data analytics on the financial crime behavior prevention. Cogent Economics & Finance, 10(1), 2148363.
  42. Qatawneh, A. M. (2024). The role of artificial intelligence in auditing and fraud detection in accounting information systems: Moderating role of natural language processing. International Journal of Organizational Analysis, 33(6), 1391–1409.
  43. Rahman, M. J., & Zhu, H. (2023). Predicting accounting fraud using imbalanced ensemble learning classifiers—Evidence from China. Accounting & Finance, 63(3), 3455–3486.
  44. Ravisankar, P., Ravi, V., Rao, G. R., & Bose, I. (2011). Detection of financial statement fraud and feature selection using data mining techniques. Decision Support Systems, 50(2), 491–500.
  45. Rezaee, Z. (2002). Financial statement fraud: Prevention and detection. John Wiley & Sons.
  46. Rezaee, Z., Dorestani, A., & Aliabadi, S. (2018). Application of time series analyses in big data: Practical, research, and education implications. Journal of Emerging Technologies in Accounting, 15(1), 183–197.
  47. Rosnidah, I., Johari, R. J., Hairudin, N. A. M., Hussin, S. A. H. S., & Musyaffi, A. M. (2022). Detecting and preventing fraud with big data analytics: Auditing perspective. Journal of Governance and Regulation, 11(4), 8–15.
  48. Roszkowska, P. (2021). Fintech in financial reporting and audit for fraud prevention and safeguarding equity investments. Journal of Accounting & Organizational Change, 17(2), 164–196.
  49. Sadgali, I., Sael, N., & Benabbou, F. (2019). Performance of machine learning techniques in the detection of financial frauds. Procedia Computer Science, 148, 45–54.
  50. Siering, M., Clapham, B., Engel, O., & Gomber, P. (2017). A taxonomy of financial market manipulations: Establishing trust and market integrity in the financialized economy through automated fraud detection. Journal of Information Technology, 32(3), 251–269.
  51. Sood, P., Sharma, C., Nijjer, S., & Sakhuja, S. (2023). Review the role of artificial intelligence in detecting and preventing financial fraud using natural language processing. International Journal of System Assurance Engineering and Management, 14(6), 2120–2135.
  52. Sun, T. (2019). Applying deep learning to audit procedures: An illustrative framework. Accounting Horizons, 33(3), 89–109.
  53. Tang, J., & Karim, K. E. (2019). Financial fraud detection and big data analytics–implications on auditors’ use of fraud brainstorming session. Managerial Auditing Journal, 34(3), 324–337.
  54. Tranfield, D., Denyer, D., & Smart, P. (2003). Towards a methodology for developing evidence-informed management knowledge by means of systematic review. British Journal of Management, 14(3), 207–222.
  55. Warren, J. D., Moffitt, K. C., & Byrnes, P. (2015). How big data will change accounting. Accounting Horizons, 29(2), 397–407.
  56. Yi, Z., Cao, X., Chen, Z., & Li, S. (2023). Artificial intelligence in accounting and finance: Challenges and opportunities. IEEE Access, 11, 129100–129123.
Figure 1. Review protocol. Source: (PRISMA Statement, 2021).
Figure 2. Distribution of methodological approaches in included studies. Source: own elaboration.
Figure 3. Analytical techniques and themes identified in the literature. Source: own elaboration.
Figure 4. Literature support for each research question theme. Source: own elaboration.
Figure 5. Publications per year. Source: own elaboration.
Figure 6. Distribution of articles by subject area. Source: own elaboration.
Table 1. Journal articles included in the research. Source: own elaboration.

| No. | Author(s) | Title | Journal |
|---|---|---|---|
| 1 | (Islam et al., 2024) | Artificial Intelligence in Fraud Detection and Financial Risk Mitigation: Future Directions and Business Applications | International Journal for Multidisciplinary Research |
| 2 | (Gupta & Mehta, 2024) | Data mining-based financial statement fraud detection: Systematic literature review and meta-analysis to estimate data sample mapping of fraudulent companies against non-fraudulent companies | Global Business Review |
| 3 | (Kassem & Omoteso, 2024) | Effective methods for detecting fraudulent financial reporting: practical insights from Big 4 auditors | Journal of Accounting Literature |
| 4 | (Qatawneh, 2024) | The role of artificial intelligence in auditing and fraud detection in accounting information systems: moderating role of natural language processing | International Journal of Organizational Analysis |
| 5 | (Ismail & Haq, 2024) | Enhancing enterprise financial fraud detection using machine learning | Engineering, Technology & Applied Science Research |
| 6 | (Fotoh & Lorentzon, 2023) | Audit digitalization and its consequences on the audit expectation gap: A critical perspective | Accounting Horizons |
| 7 | (Yi et al., 2023) | Artificial intelligence in accounting and finance: Challenges and opportunities | IEEE Access |
| 8 | (Sood et al., 2023) | Review the role of artificial intelligence in detecting and preventing financial fraud using natural language processing | International Journal of System Assurance Engineering and Management |
| 9 | (Rahman & Zhu, 2023) | Predicting accounting fraud using imbalanced ensemble learning classifiers–evidence from China | Accounting & Finance |
| 10 | (Lu et al., 2023) | Chinese corporate fraud risk assessment with machine learning | Intelligent Systems with Applications |
| 11 | (Bello et al., 2023) | Machine learning approaches for enhancing fraud prevention in financial transactions | International Journal of Management Technology |
| 12 | (Aboud & Robinson, 2022) | Fraudulent financial reporting and data analytics: an explanatory study from Ireland | Accounting Research Journal |
| 13 | (Putra et al., 2022) | The influence of internal audit, risk management, whistleblowing system and big data analytics on the financial crime behavior prevention | Cogent Economics & Finance |
| 14 | (Papík & Papíková, 2022) | Detecting accounting fraud in companies reporting under US GAAP through data mining | International Journal of Accounting Information Systems |
| 15 | (Rosnidah et al., 2022) | Detecting and preventing fraud with big data analytics: Auditing perspective | Journal of Governance and Regulation |
| 16 | (Lokanan & Sharma, 2022) | Fraud prediction using machine learning: The case of investment advisors in Canada | Machine Learning with Applications |
| 17 | (Cheng et al., 2021) | A financial statement fraud model based on synthesized attribute selection and a dataset with missing values and imbalanced classes | Applied Soft Computing |
| 18 | (Hamal & Senvar, 2021) | Comparing performances and effectiveness of machine learning classifiers in detecting financial accounting fraud for Turkish SMEs | International Journal of Computational Intelligence Systems |
| 19 | (Gepp et al., 2021) | Lifting the numbers game: identifying key input variables and a best-performing model to detect financial statement fraud | Accounting & Finance |
| 20 | (Roszkowska, 2021) | Fintech in financial reporting and audit for fraud prevention and safeguarding equity investments | Journal of Accounting & Organizational Change |
| 21 | (Pizzi et al., 2021) | Assessing the impacts of digital transformation on internal auditing: A bibliometric analysis | Technology in Society |
| 22 | (Ashtiani & Raahemi, 2021) | Intelligent fraud detection in financial statements using machine learning and data mining: a systematic literature review | IEEE Access |
| 23 | (Hasan et al., 2020) | Current landscape and influence of big data on finance | Journal of Big Data |
| 24 | (Brown et al., 2020) | What are you saying? Using topic to detect financial misreporting | Journal of Accounting Research |
| 25 | (Leonov et al., 2020) | Visual analysis in identifying a typical indicators of financial statements as an element of artificial intelligence technology in audit | Procedia Computer Science |
| 26 | (Craja et al., 2020) | Deep learning for detecting financial statement fraud | Decision Support Systems |
| 27 | (Sun, 2019) | Applying deep learning to audit procedures: An illustrative framework | Accounting Horizons |
| 28 | (Chen et al., 2019) | Fraud detection for financial statements of business groups | International Journal of Accounting Information Systems |
| 29 | (Tang & Karim, 2019) | Financial fraud detection and big data analytics–implications on auditors’ use of fraud brainstorming session | Managerial Auditing Journal |
| 30 | (Sadgali et al., 2019) | Performance of machine learning techniques in the detection of financial frauds | Procedia Computer Science |
| 31 | (Lokanan et al., 2019) | Detecting anomalies in financial statements using machine learning algorithm: The case of Vietnamese listed firms | Asian Journal of Accounting Research |
| 32 | (Rezaee et al., 2018) | Application of time series analyses in big data: practical, research, and education implications | Journal of Emerging Technologies in Accounting |
| 33 | (Dutta et al., 2017) | Detecting financial restatements using data mining techniques | Expert Systems with Applications |
| 34 | (Kokina & Davenport, 2017) | The emergence of artificial intelligence: How automation is changing auditing | Journal of Emerging Technologies in Accounting |
| 35 | (Siering et al., 2017) | A taxonomy of financial market manipulations: establishing trust and market integrity in the financialized economy through automated fraud detection | Journal of Information Technology |
| 36 | (Albashrawi, 2016) | Detecting financial fraud using data mining techniques: A decade review from 2004 to 2015 | Journal of Data Science |
| 37 | (Warren et al., 2015) | How big data will change accounting | Accounting Horizons |
| 38 | (Lin et al., 2015) | Detecting the financial statement fraud: The analysis of the differences between data mining techniques and experts’ judgments | Knowledge-Based Systems |
| 39 | (Banarescu, 2015) | Detecting and preventing fraud with data analytics | Procedia Economics and Finance |
| 40 | (Goel & Gangolly, 2012) | Beyond the numbers: Mining the annual reports for hidden cues indicative of financial statement fraud | Intelligent Systems in Accounting, Finance and Management |
| 41 | (Perols, 2011) | Financial statement fraud detection: An analysis of statistical and machine learning algorithms | Auditing: A Journal of Practice & Theory |
| 42 | (Ngai et al., 2011) | The application of data mining techniques in financial fraud detection: A classification framework | Decision Support Systems |
| 43 | (Ravisankar et al., 2011) | Detection of financial statement fraud and feature selection using data mining techniques | Decision Support Systems |
Table 2. Journal publication analysis. Source: own elaboration.

| Rank | Journal Title | CiteScore (2023) | Number of Papers | Publisher |
|---|---|---|---|---|
| 1 | Decision Support Systems | 12.5 | 6 | Elsevier |
| 2 | Journal of Accounting and Economics | 15.3 | 3 | Elsevier |
| 3 | Expert Systems with Applications | 10.2 | 4 | Elsevier |
| 4 | Accounting Horizons | 6.4 | 2 | American Accounting Association |
| 5 | Journal of Emerging Technologies in Accounting | 4.3 | 2 | American Accounting Association |
| 6 | Managerial Auditing Journal | 3.6 | 3 | Emerald Publishing |
| 7 | Journal of Financial Crime | 2.1 | 2 | Emerald Publishing |
Table 3. Analysis of publications by number of citations. Source: own elaboration.

| Author & Year | Title | Citations | Domain | Journal |
|---|---|---|---|---|
| (Perols, 2011) | Financial statement fraud detection: An analysis of statistical and machine learning algorithms | 500 | Accounting, Machine Learning | Auditing: A Journal of Practice & Theory |
| (Ngai et al., 2011) | The application of data mining techniques in financial fraud detection: A classification framework | 450 | Data Mining, Fraud Detection | Decision Support Systems |
| (Ravisankar et al., 2011) | Detection of financial statement fraud and feature selection using data mining techniques | 380 | AI, Financial Fraud | Decision Support Systems |
| (Kokina & Davenport, 2017) | The emergence of artificial intelligence: How automation is changing auditing | 320 | Auditing, AI | Journal of Emerging Technologies in Accounting |
| (Warren et al., 2015) | How big data will change accounting | 250 | Big Data, Accounting | Accounting Horizons |
| (Goel & Gangolly, 2012) | Beyond the numbers: Mining the annual reports for hidden cues indicative of financial statement fraud | 200 | Text Mining, Financial Reporting | Intelligent Systems in Accounting, Finance and Management |
| (Craja et al., 2020) | Deep learning for detecting financial statement fraud | 165 | Machine Learning, Forensic Accounting | Decision Support Systems |

Share and Cite

MDPI and ACS Style

Gkegkas, M.; Kydros, D.; Pazarskis, M. Using Data Analytics in Financial Statement Fraud Detection and Prevention: A Systematic Review of Methods, Challenges, and Future Directions. J. Risk Financial Manag. 2025, 18, 598. https://doi.org/10.3390/jrfm18110598

