Article

Artificial Intelligence in Literature Review Synthesis: A Step-by-Step Methodological Approach for Researchers and Academics

by
Matolwandile M. Mtotywa
*,
Jeri-Lee J. Mowers
,
Wavhudi Ndou
,
Thabang V. Q. Moleko
and
Matsobane J. Ledwaba
Faculty of Commerce, Rhodes Business School, Rhodes University, Makhanda 6139, South Africa
*
Author to whom correspondence should be addressed.
Informatics 2026, 13(3), 43; https://doi.org/10.3390/informatics13030043
Submission received: 20 December 2025 / Revised: 10 March 2026 / Accepted: 10 March 2026 / Published: 13 March 2026

Abstract

The integration of artificial intelligence (AI) in literature reviews aims to transform research by potentially automating processes, enhancing rigour, and improving quality. The study proposes a structured step-by-step approach to integrate AI tools into the literature review synthesis process. The developed methodological approach has five steps. The first step, planning and readiness, involves scoping, understanding practices, and defining boundaries of AI use. Next is selecting AI tools and aligning their capabilities with the literature needs through a matrix. The third step focuses on using AI to conduct the review, followed by validation and cross-referencing of AI-generated results. The final step is disclosing AI use in line with ethical and reporting standards. The approach is demonstrated through five scenarios: emerging or fragmented literature, large or saturated fields, interdisciplinary domains, methodologically diverse studies, and under-researched topics. This approach is designed to enhance transparency, potentially reduce bias, and support reproducibility by aligning AI functions with research goals. It also addresses ethical considerations and promotes human–AI collaboration. For researchers and academics, it aims to provide a practical roadmap for the responsible adoption of AI in literature reviews, supporting efficiency, ethical tool use, transparency, and the balance between machine assistance and academic judgment.

1. Introduction

The application of Artificial Intelligence (AI) in literature review improves efficiency and reduces errors associated with manual reviews, marking a significant advance in research execution [1,2,3]. As such, AI in research processes has the potential to improve decision-making, operational efficiency, and the overall quality of research output [4,5]. The use of AI has been strengthened by natural language processing (NLP), such as large language models (LLMs) [1,6] and machine learning [7], which can automate stages of the literature review process [5]. With the growing use of AI for research, there is also an increasing risk of misuse and unethical practices that threaten academic integrity and the foundational principles of knowledge creation. This is because of the AI’s ability to mimic human cognitive functions, which makes it difficult to distinguish between genuine creativity and AI-generated content [8]. AI tools can generate texts that academics or researchers can submit as their own work, blurring the lines between assistance and academic dishonesty [9]. This challenge is exacerbated by the questionable detection capabilities of tools that purport to detect AI-generated text [10,11,12]. This has resulted in an ongoing debate [13], with some institutions deciding to switch off AI detection tools due to their unreliability [14,15]. The unreliability of AI-generated text detection tools is a multifaceted issue influenced by several key factors. These factors include the inherent biases and limitations of detection algorithms, the sophistication of AI-generated text, and the impact of content obfuscation techniques [12]. As AI models become more advanced, the challenge of distinguishing between human-authored and AI-generated text becomes increasingly complex, leading to significant implications for academic and other professional settings [16].
Worsening this situation is the growing trend toward overreliance on AI in academic research and writing, which presents significant challenges to educational integrity and intellectual development. The existing literature converges around two interrelated issues. First, several scholars argue that excessive dependence on AI tools can disrupt the formation of robust knowledge systems and weaken researchers’ independent learning processes [17]. Rather than actively constructing understanding, researchers may outsource key cognitive tasks to AI systems, thereby limiting opportunities for meaningful knowledge consolidation. Second, multiple studies emphasise the cognitive consequences of such dependence, particularly the erosion of higher-order thinking skills. Research by Chavez et al. [18] and Zhai et al. [19] suggests that habitual reliance on AI-generated content may reduce engagement in analytical reasoning and critical evaluation. When researchers assume AI outputs are inherently accurate, they may develop confirmation biases and diminish their scrutiny of sources, ultimately impairing critical thinking development. These findings highlight two broader issues: (1) knowledge development concerns, and (2) cognitive and critical thinking impacts. Simply put, while AI offers efficiency and accessibility, uncritical or excessive use may compromise core academic competencies. This underscores the urgent need for research that moves beyond documenting risks to identifying frameworks for the effective, pedagogically sound integration of AI in academic research and writing.
Recent scholarship examining the integration of artificial intelligence (AI) into the literature review process can be grouped into several interrelated thematic strands. First, multiple studies focus on the operational integration of AI within discrete stages of the systematic review workflow. For example, Bolaños et al. [1] and Wagner et al. [3] analyse AI-assisted tools for screening, study selection, and data extraction, concluding that AI is particularly effective in accelerating labour-intensive phases while still requiring human oversight to ensure methodological rigour. Second, Khalifa and Albadawy [20] offer a broader functional perspective, identifying six domains in which AI enhances academic functions, including literature review and synthesis. These findings position AI not merely as a task-specific tool but as a cross-cutting augmentation mechanism that supports knowledge discovery and analytical productivity. Third, applied and tool-focused contributions, such as Molopa [21], catalogue existing AI platforms and describe their potential uses in literature reviews. While valuable for practical orientation, this strand of work remains largely descriptive and does not provide structured procedural guidance.
Finally, more recent research has shifted attention toward effectiveness evaluation and ethical governance. Mogoale et al. [22], through an analysis of IEEE and MDPI journal publications from 2020 to 2024, highlight concerns regarding bias, transparency, reproducibility, and responsible disclosure. Although these studies acknowledge ethical considerations, few provide comprehensive, step-by-step frameworks that embed validation, quality assurance, and clearly defined boundaries for AI use throughout the review lifecycle. The literature demonstrates growing recognition of AI’s potential to enhance efficiency and synthesis in academic reviews; however, structured methodological guidance that integrates ethical safeguards and validation mechanisms remains underdeveloped.
The study aimed to develop a structured step-by-step approach to integrate AI tools into the literature review synthesis process. The first objective is to design the structured approach, while the second objective is to demonstrate its application across multiple use-case scenarios. The rest of the paper starts with the theoretical positioning and lens of the research (theoretical context of the study). This is followed by the design of the methodological approach, which integrates AI tools into the literature review synthesis process step by step. Flowing from this are the use-cases and applications. The paper then presents the limitations and future research directions, followed by the conclusion.

2. Theoretical Context of the Study

2.1. Growing Challenges of Effective Literature Review with Traditional Processes

Traditional literature reviews may at times lack the systematic approach necessary to ensure a comprehensive and unbiased synthesis of research findings. This is mainly attributed to the sheer volume of available literature, inherent researcher biases, and time constraints [5,23,24]. The exponential increase in scholarly publications makes it difficult for researchers to keep up with the latest findings, leading to potential gaps in knowledge integration. Additionally, the multidisciplinary nature of fields such as business and management further complicates the literature review process, as researchers must navigate diverse theories and complex subject areas [24]. More rigorous methodologies, such as systematic or systematised literature reviews, which are designed to address these challenges by maximising transparency, objectivity, and repeatability, have their own challenges. They are not always feasible due to resource constraints, prompting the need for improved traditional review processes that incorporate systematic elements. Emerging solutions and methodologies can improve the effectiveness of traditional reviews. Integration of systematic review principles, technological tools, and comprehensive training can help mitigate issues related to volume, bias, and time constraints. However, it is important to recognise that these solutions may not be universally applicable, and researchers must adapt their approaches based on the specific context and available resources. The solutions can involve a generative pre-trained transformer (GPT), which is a deep learning model constructed on a transformer architecture [25]. The model performs NLP tasks, is trained on vast quantities of text data, and is designed to produce human-like text and responses [26]. LLMs are NLP models built on a transformer architecture that can handle large datasets.

2.2. Benefits of Integrating AI in Literature Review

Integration of AI provides multiple benefits, such as speed and scale of screening, where it can rapidly process and screen large volumes of information far beyond human capability. It also provides cognitive enhancement (not replacement) by enabling users to focus on higher-level reasoning by automating routine or repetitive tasks. AI thus becomes a tool for enhancing, not substituting, human intellectual capacity [27]. AI also enhances discovery by uncovering latent patterns and relationships between various documents, facilitating novel insights and connections that might elude human attention. This ability to surface ‘hidden’ links is invaluable for exploratory research and discovery [28]. It can also provide visual mapping and synthesis by clustering information by topic, highlighting knowledge gaps, and helping users grasp complex content at a glance. Although research specifically on co-pilots and GPTs is nascent, the principle aligns with broader AI visualisation capabilities observed in cognitive screening systems [26,28]. GPTs use LLMs’ rapid processing speeds to streamline the literature review process, although challenges such as hallucinations (false or fabricated outputs) in AI outputs must be addressed [29].

2.3. AI Co-Piloting-Hallucination Paradox

The “AI co-piloting-hallucination paradox” can be theoretically understood through the lens of Sociotechnical Systems Theory. This theory posits that any production system, including knowledge production in academic research, comprises both a social subsystem (the researcher’s critical judgment, ethical considerations, and interpretive skills) and a technical subsystem (the AI tool’s algorithms, data, and functionalities) [30]. According to this perspective, optimal outcomes are not achieved by maximising technology alone, but by jointly optimising the interaction between the human and the machine [31]. The paradox, therefore, describes the fundamental tension within this sociotechnical system: the value AI brings as a supportive partner (a co-pilot for thinking, research and decision-making) versus the risks posed by its technical limitations (hallucinations: when AI produces false, misleading or fabricated outputs) and the potential for misuse in the social system. The paradox is further compounded by the most limiting factor being the human element, with all the biases that can be transferred in the process, such as decision-making, cultural diversity, algorithmic, data representation, and historical and societal perspectives [32,33,34].
Hallucinations occur when AI produces plausible-sounding but incorrect or fabricated outputs. These can include fictitious citations, invented facts, or misleading information that undermines credibility and reliability. Overreliance on AI summaries without verification can also propagate misinformation. Users may blindly trust AI-generated content, particularly when summaries appear authoritative, even when they are not [19]. Furthermore, AI is constrained by its training data [35]. Biases or gaps, especially when models lack access to paywalled or proprietary sources, have the potential to skew outputs or inadvertently omit critical information, limiting accuracy and completeness [28]. There are also reproducibility and transparency challenges, as AI systems often lack transparency in their decision pathways. This undermines reproducibility and accountability, especially in high-stakes settings such as healthcare or legal research. There are also ethical concerns regarding plagiarism. AI-generated content may inadvertently plagiarise or produce new synthetic text that raises authorship and attribution issues. There is also concern that over-automation could erode human skills, diminishing critical thinking and creative analysis [19,36].
The theoretical analysis highlighted the fundamental tension in contemporary literature review practice. Traditional approaches are increasingly untenable amid the exponential expansion of scholarly outputs and escalating demands for transparency and methodological rigour. While AI offers substantial gains in efficiency, screening precision, and synthesis capacity, its integration generates a parallel epistemic risk captured in the AI co-piloting–hallucination paradox, where augmented cognition coexists with the possibility of fabrication, bias, and uncritical reliance. Although prior scholarship recognises these opportunities and vulnerabilities, the area remains underdeveloped with a dearth of studies that articulate a procedurally grounded framework that embeds validation, accountability, and ethical boundaries across the review lifecycle. Addressing this gap requires a systematic, stage-aligned methodological approach capable of operationalising AI augmentation without compromising scholarly integrity. The following section advances such a framework.

3. Design of the Methodological Approach

The research offers a structured, step-by-step approach to integrating AI tools into the research literature review synthesis process (Figure 1). It is designed to enhance methodological rigour, foster ethical transparency, and ensure that AI complements, not replaces, human scholarly judgment.

3.1. Scoping, Understanding Intertwined Practices and Confirming Boundaries of AI Use (Planning and Readiness)—Step 1

This initial step is where the foundational architecture of the sociotechnical system is designed. Before any tool is chosen or a single query is run, the researcher, acting as the core of the social subsystem, must define the rules of engagement with the future technical subsystem (the AI tool). Articulating the scope, establishing ethical boundaries, and understanding the intertwined practices is not merely a planning exercise; it is the conscious act of defining how human judgment will guide, constrain, and collaborate with machine assistance [37]. This includes articulating clear research questions and selecting an appropriate methodological framework (e.g., systematic, scoping, narrative) where applicable. This step ensures the system is optimised from the outset, not just for technical efficiency, but for scholarly rigour and ethical compliance, preventing a scenario where the technical subsystem’s logic overrides the research goals. The researcher must also identify, understand, anticipate and resolve any biases that may arise in the sociotechnical system. The five proposed practices are as follows:
Practice 1: Use AI tools as collaborators, not as authorities. Using AI tools as collaborators rather than unquestioned authorities ensures that technology supports human goals while avoiding overreliance [38]. This perspective frames AI as an aid to decision-making rather than the final arbiter. This foundational mindset directly informs the actions taken in later steps. In Step 2 (Tool Selection), this means choosing a tool that fits the collaborative role required, like picking a meticulous research assistant (e.g., ASReview for screening) versus a creative brainstorming partner (e.g., ChatGPT-5.4 for ideation). This principle is most critical in Step 4 (Validation), where the researcher’s role is to treat the AI’s output as a draft to be audited, not a final verdict to be accepted.
Practice 2: Combine AI with human critical judgment. Combining AI with human critical judgment safeguards against blind acceptance of machine output. Human expertise provides the interpretive lens that contextualises AI-generated insights. This practice is the engine of the entire model, representing the researcher’s essential and non-delegable role. It is applied during Step 3 (Use of AI) when crafting prompts and interpreting the initial outputs generated by the AI tool. It becomes the entire focus of Step 4 (Validation). The iterative loop between using the AI and validating its output is the structured application of human critical judgment, ensuring that technology enhances, rather than replaces, scholarly intellect.
Practice 3: Maintain documentation for transparency. Maintaining thorough documentation promotes transparency and accountability in planning processes. Clear records enable teams to track how AI contributions were evaluated and integrated. This practice creates a direct and practical bridge to the final step of the model. The act of maintaining thorough documentation throughout Steps 2, 3, and 4, i.e., logging which tools were used, the prompts that were engineered, and the results of validation checks, provides the essential raw material for Step 5 (Disclosure). Without this continuous documentation, creating a credible and replicable transparency statement as required by journals and ethical bodies like the Committee on Publication Ethics (COPE) would be nearly impossible.
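The documentation this practice calls for can be kept as simple structured records. The following Python sketch is illustrative only: the field names, tool name, and log format are assumptions, not part of the proposed model. It shows one way a research team might log each AI interaction so that a Step 5 disclosure statement can later be generated from the log.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date

@dataclass
class AIUseRecord:
    """One logged interaction with an AI tool during the review."""
    step: str        # model step, e.g. "Step 3: Use of AI"
    tool: str        # tool name and version
    purpose: str     # what the tool was asked to do
    prompt: str      # the prompt or query issued
    validation: str  # how the output was checked (Step 4)
    logged_on: str   # ISO date of the interaction

log = []
log.append(AIUseRecord(
    step="Step 3: Use of AI",
    tool="ExampleLLM v1",  # hypothetical tool name
    purpose="Draft thematic grouping of screened abstracts",
    prompt="Group these 40 abstracts into candidate themes...",
    validation="Themes manually checked against full texts by two reviewers",
    logged_on=date(2026, 1, 15).isoformat(),
))

# Serialise the log so it can accompany the Step 5 disclosure statement.
disclosure = json.dumps([asdict(r) for r in log], indent=2)
```

Because every record names both the prompt and the validation applied, the log doubles as an audit trail linking Practices 2, 3, and 4.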
Practice 4: Cross-validate AI outputs using verified sources. Cross-validating AI outputs against verified sources reduces the risks of misinformation or bias. This practice strengthens readiness by grounding technological assistance in established evidence [39]. This principle is so fundamental that it is given its own dedicated step in the model. Step 4 (Validation) is the direct and systematic operationalisation of this practice. The approaches outlined, such as parallel manual screening and iterative error review, are the specific methods by which researchers can fulfil the mandate to “cross-validate AI outputs using verified sources”. This is intended to help ensure the final results are robust and trustworthy.
Practice 5: Respect copyright and use open-access filters. Respecting copyright while prioritising open-access materials balances ethical responsibility with equitable knowledge use. Such practices ensure compliance while broadening the reliability of shared information [40]. This ethical practice is implemented during the most active phases of the literature review. It guides Step 2 (Tool Selection), prompting the researcher to consider whether a potential tool can filter for open-access materials or if its training data respects paywalled and copyrighted sources. It is then put into action in Step 3 (Use of AI), where the researcher actively applies these filters during the search and retrieval process, ensuring the review is built upon an ethically sourced foundation of literature.
These practices are not isolated, but deeply intertwined, treating AI as a collaborator, exercising human judgment, documenting processes, validating outputs, and respecting ethical use together create a robust approach to planning and readiness. Each practice reinforces the others, ensuring that technological innovation is grounded in human responsibility, transparency, and integrity. This is followed by confirming the limits of AI use. Figure 2 illustrates the continuum of ethical to non-ethical use of AI across different steps of academic research support. At the lowest end of the spectrum (green zone), the foundational support is where grammar and translation are coded in green, reflecting high acceptability and minimal ethical concerns. This aligns with the general consensus that linguistic correction and accessibility support are ethically permissible uses of AI [38,41]. Brainstorming and clarification follow, which is also largely acceptable [41,42].
With increased use of AI for the other elements of research, the academic may migrate to the yellow and amber zones. In the yellow zone, the use of AI is primarily for structural assistance, which involves outline development and structured concept extraction, introducing moderate ethical tension. In this step, reliance on AI begins to intersect with intellectual originality, necessitating clear boundaries and documentation of contributions. Data analysis of unstructured themes and analytics falls further into the yellow-orange zone, raising questions about potential ethical issues.
This step and the previous step require awareness of AI’s role as a collaborator rather than a substitute for human reasoning. This reflects challenges in transparency, accuracy, and the interpretive role of human researchers in contextualising machine-driven insights [39]. The yellow and amber zones (outline development and structured concepts/themes extraction and data analysis—unstructured themes and analytics) are in the contextual zone of uncertainty and contradictions. This is because different institutions, researchers and academics can perceive this grey area differently. Some agree on the use of AI tools in this zone, while others question the use or deem it unethical. As such, institutional policy should guide whether this type of use is acceptable. Finally, the authorship of content, which entails drafting and writing research, falls within the red zone, indicating unethical practice. Delegating core scholarly writing to AI compromises academic integrity and raises concerns about authorship, originality, and plagiarism [40]. The gradient progression emphasises that while AI can be a valuable aid, ethical responsibility increases proportionally with its role in shaping intellectual content. Figure 2 serves as a reminder that human critical judgment, proper attribution, and cross-validation with credible sources are essential safeguards.
While these practices are introduced during the planning and readiness phase (Step 1), they are not static; they are systematically intertwined with the subsequent operational steps of the model. To visualise these relationships, Table 1 provides a mapping of how each sociotechnical practice is enacted across the methodological workflow, ensuring that human responsibility and integrity remain embedded throughout the review lifecycle. Returning to Figure 1, the five steps are not merely sequential; they are embedded within a continuous framework of sociotechnical practices. While the iterative loop between Step 3 (Active Use) and Step 4 (Validation) remains the engine of the review, the overarching practices—such as documentation (P3) and human judgment (P2)—provide the necessary governance to ensure the system’s integrity from scoping to final disclosure. Similarly, the ‘Grey Zone’ identified in Figure 2 represents the area of highest conceptual tension. It is precisely in this zone where Practice 2 (Human Critical Judgment) and Practice 4 (Cross-validation) must be most rigorously applied to prevent the technical subsystem from overriding scholarly originality.

3.2. Select AI Tool(s) and Link AI Capability with Literature Requirements (Matrix)—Step 2

This step represents the deliberate coupling of the social and technical subsystems. The selection of an AI tool is not a search for the “best” technology in isolation, but for the most compatible partner for the researcher and their specific task. The capability-requirement matrix is the key instrument for achieving this joint optimisation at the selection phase. By mapping research needs (the goals of the social subsystem) to the specific functionalities of a tool (the capabilities of the technical subsystem), the researcher ensures the chosen technology is fit-for-purpose. This prevents sociotechnical friction, such as using a highly creative generative AI for a task that requires factual precision and citation tracking, which would place an excessive validation burden on the researcher.
AI tools have shown varying levels of effectiveness in automating the screening process for systematic reviews, with some achieving significant time savings and workload reductions [43,44,45]. Researchers should select AI tools that align with the objectives of the literature review. A capability-requirement matrix should be developed to map the tasks involved in the literature review against the specific functionalities of selected AI tools (see illustrative example in Table 2). This ensures a strategic alignment between the demands of research and the technical capacity of AI systems.
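A capability-requirement matrix of this kind can be represented as a simple mapping from candidate tools to their declared capabilities, which can then be queried per review task. In the sketch below, the capability sets are illustrative assumptions for demonstration only, not assessments of the actual products.

```python
# Hypothetical capability-requirement matrix: each candidate tool is mapped
# to the review functions it is assumed (for illustration) to support.
capabilities = {
    "ASReview":         {"screening", "prioritisation"},
    "Elicit":           {"search", "extraction", "summarisation"},
    "Connected Papers": {"citation_mapping"},
    "Scholarcy":        {"summarisation", "extraction"},
}

# Tasks demanded by a particular review (the social subsystem's goals).
required_tasks = {"screening", "summarisation", "citation_mapping"}

def tools_for(task, matrix):
    """Return the tools whose declared capabilities include the task."""
    return sorted(name for name, caps in matrix.items() if task in caps)

# Map every required task to the tools that could serve it, and flag gaps.
coverage = {task: tools_for(task, capabilities) for task in required_tasks}
uncovered = sorted(task for task, tools in coverage.items() if not tools)
```

Any task left in `uncovered` signals sociotechnical friction: a research requirement with no matching technical capability, which would otherwise be absorbed as extra validation burden by the researcher.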
Table 3 presents 20 AI tools that can be used for a literature review. The research acknowledges that while this list is broad, it is not exhaustive and will need to be updated periodically given the rapid pace of AI tool development. The AI tools included in this matrix were purposively selected to represent the main categories currently relevant to literature review workflows. Selection was based on four considerations: (1) relevance to one or more stages of evidence synthesis, such as search, screening, citation mapping, summarisation, or synthesis support; (2) visibility and adoption within academic or review-orientated practice; (3) diversity of underlying approaches, including machine learning, large language models, semantic search and citation-network analysis; and (4) availability of sufficient public information to enable functional comparison. The tools were not intended to constitute an exhaustive inventory, but rather a representative cross-section of currently influential and accessible AI-supported review technologies. The tools were descriptively evaluated using a comparative matrix that considered tool type, primary review use-case, main strength, and key limitation. Where relevant, the matrix also considered whether the tool primarily supported discovery, screening, summarisation, citation analysis, or workflow management. This approach was intended to highlight functional fit rather than to generate a formal performance ranking, as performance may vary substantially by discipline, dataset, review question, and reviewer expertise.
Screening and review workflow tools. Tools in this group include Abstrackr, ASReview, EPPI-Reviewer, and Rayyan. These tools support title or abstract selection, prioritisation, tagging, and review workflow management. Abstrackr and ASReview are machine learning-based systems designed for systematic reviews, with strengths in semi-automated screening and active learning, although both require careful alignment and training of the data sets [44,45,46]. Systematic review platforms, including Rayyan and EPPI-Reviewer, focus on workflow management, screening, and meta-analysis, but require user training and offer limited advanced AI features [44].
Case excerpt for this group: In the screening stage, tools such as Abstrackr, ASReview, Rayyan, and EPPI-Reviewer were used to prioritise or organise records for reviewer assessment. However, records highlighted as relevant by these tools were not automatically accepted. Each suggested study was manually screened against predefined inclusion and exclusion criteria. For example, a paper ranked highly by ASReview or tagged as potentially relevant in Rayyan was still reviewed for study design, population, intervention, and outcome fit. Where discrepancies arose between tool recommendations and reviewer judgment, the final decision was based on the review protocol and manual assessment rather than the AI-assisted ranking.
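The principle in this excerpt, that an AI tool's relevance ranking never substitutes for protocol-based eligibility checks, can be illustrated with a minimal sketch. The record fields, criteria values, and example records below are hypothetical and serve only to show the pattern.

```python
# Each candidate record carries minimal metadata plus the AI tool's rank.
records = [
    {"id": 1, "year": 2022, "design": "RCT",     "population": "adults",   "ai_rank": 1},
    {"id": 2, "year": 2015, "design": "opinion", "population": "adults",   "ai_rank": 2},
    {"id": 3, "year": 2023, "design": "cohort",  "population": "children", "ai_rank": 3},
]

# Predefined protocol criteria (illustrative values).
MIN_YEAR = 2018
ELIGIBLE_DESIGNS = {"RCT", "cohort"}
TARGET_POPULATION = "adults"

def meets_protocol(rec):
    """Eligibility check applied to every record, regardless of AI ranking."""
    return (rec["year"] >= MIN_YEAR
            and rec["design"] in ELIGIBLE_DESIGNS
            and rec["population"] == TARGET_POPULATION)

included = [r["id"] for r in records if meets_protocol(r)]
# A high AI rank does not guarantee inclusion: record 2 is ranked second
# by the tool but fails both the design and the publication-year criteria.
```

The point of the pattern is that the inclusion decision is a pure function of the protocol criteria; the `ai_rank` field only orders the queue for human assessment.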
Generative AI assistants for reasoning and synthesis. Tools in this group include ChatGPT, Claude, and Gemini. These AI tools mainly generate summaries, suggest themes, explain concepts, and support exploratory synthesis. Generative LLMs such as ChatGPT [47], Claude and Gemini support narrative, scoping, and exploratory reviews by providing text generation, contextual reasoning, and multimodal synthesis, though they face challenges such as hallucination, lack of citation awareness, and limited academic integration.
Case excerpt for this group: ChatGPT, Claude, and Gemini were used as idea generation and summarisation aids. Their outputs were treated as provisional rather than authoritative. For example, when ChatGPT or Claude generated a summary of a paper, reviewers compared that summary with the article’s abstract, results, and conclusion sections to confirm that the reported claims accurately reflected the original source. Similarly, when Gemini suggested thematic groupings or interpretive links between articles, those suggestions were manually checked against the included studies before being incorporated into the manuscript. Any unsupported, overgeneralised, or inaccurate statements were revised or discarded.
Semantic search and evidence discovery tools. Tools in this group include Consensus, Elicit, Semantic Scholar, and Perplexity. These tools assist in retrieving the literature semantically, identify relevant studies, surface answers from papers, and provide high-level evidence overviews.
Case excerpt for this group: For evidence discovery, Consensus, Elicit, Semantic Scholar, and Perplexity were used to identify potentially relevant studies and surface conceptually related literature. These outputs were used only as a starting point for the selection of the review. For example, papers recommended by Elicit or Semantic Scholar were manually checked against the predefined eligibility criteria, and records surfaced by Perplexity or Consensus were cross-referenced with source metadata and abstracts to confirm that they were peer reviewed, topically relevant, and methodologically appropriate. This validation step helped prevent reliance on highly ranked but out-of-scope articles, editorials, or nonempirical sources.
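The metadata cross-referencing described in this excerpt can be expressed as a simple filter over the surfaced records. In the sketch below, the DOIs, field names, and the set of eligible source types are purely illustrative assumptions.

```python
# Records surfaced by a discovery tool, with metadata pulled from the
# source database (all values here are made up for illustration).
surfaced = [
    {"doi": "10.1000/a1", "type": "journal-article", "peer_reviewed": True},
    {"doi": "10.1000/b2", "type": "editorial",       "peer_reviewed": True},
    {"doi": "10.1000/c3", "type": "journal-article", "peer_reviewed": False},
]

# Source types accepted as empirical evidence in this hypothetical review.
EMPIRICAL_TYPES = {"journal-article", "conference-paper"}

def passes_metadata_check(rec):
    """Retain only peer-reviewed records of an eligible empirical type."""
    return rec["peer_reviewed"] and rec["type"] in EMPIRICAL_TYPES

vetted = [r["doi"] for r in surfaced if passes_metadata_check(r)]
# Editorials and non-peer-reviewed items are excluded here, before any
# full-text screening, so highly ranked but out-of-scope items never
# reach the synthesis stage.
```

A check like this only gates entry to manual screening; topical relevance and methodological fit still require the reviewer judgment described above.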
Citation mapping and literature network tools. Tools in this group include Connected Papers, Litmaps, Research Rabbit, and Scite. These tools mainly map citation relationships, identify foundational and derivative studies, and explore the citation context. Citation-mapping and data visualisation tools like Connected Papers, Litmaps, and Research Rabbit [48,49] visualise scholarly networks to support scoping and meta-reviews, but are limited by static data and a lack of deep analytical features.
Case excerpt for this group: To expand the literature set and identify influential studies, Connected Papers, Litmaps, Research Rabbit, and Scite were used for citation-based exploration. The outputs of these tools were validated through manual inspection rather than accepted at face value. For example, articles appearing in a Connected Papers or Research Rabbit network were individually checked by reviewing titles, abstracts, publication year, and relevance to the review question. Citation relationships highlighted by Litmaps were similarly verified to ensure that they reflected topical relevance rather than loose conceptual association. Where Scite was used to inspect the citation context, reviewers examined whether the cited claims were being supported, contrasted, or only mentioned superficially before using that information in interpretation.
Summarisation and document-level reading tools. Tools in this group include NotebookLM, Scholarcy, SciSpace, SciSummary, and Paper Digest. These tools summarise uploaded papers or PDFs, extract key points, simplify technical content, and support close reading.
Case excerpt for this group: For document-level analysis, NotebookLM, Scholarcy, SciSpace, SciSummary, and Paper Digest were used to generate summaries or highlight key concepts from individual papers. These summaries were checked directly against the full text before use. For example, if Scholarcy or SciSpace identified the main findings of a study, reviewers verified those findings against the article’s methods, results, and conclusion sections. When NotebookLM or SciSummary proposed interpretations or condensed technical arguments, these outputs were cross-referenced with the original article to ensure that no important qualifiers, limitations, or methodological details were lost. Paper Digest outputs were also treated as preliminary summaries and were not used without manual confirmation.
Broad review support and mixed-function tools form a final, cross-cutting group, as some tools overlap across functions.
Case excerpt for this cross-cutting group: Some tools, including EPPI-Reviewer, Elicit, Perplexity, NotebookLM, and Scite, supported multiple stages of the review process. In these cases, validation involved checking each output according to its function. The results of the search were checked against eligibility criteria, the summaries were compared with the full texts, and the citation insights were cross-referenced with the metadata of the article and source context. This function-specific verification ensured that AI support remained bounded by reviewer oversight at each stage of the workflow.
Despite their utility, AI tools used in literature reviews introduce important methodological risks. Large language models may generate inaccurate or fabricated statements, omit nuance, or present unsupported claims convincingly. Semantic search and recommendation systems can amplify existing publication biases, language biases, or citation inequalities embedded in their training data or indexed corpora. Many tools also rely on incomplete or non-transparent databases, which may result in missing studies, uneven disciplinary coverage, or unclear ranking logic. Reproducibility is another concern, particularly for generative systems whose outputs may vary over time with prompt wording, software versions, or model updates. For these reasons, AI-assisted output should be interpreted cautiously and used to augment, rather than replace, reviewer judgment, protocol-based screening, and direct verification against original sources. These tools have the potential to streamline various review processes, but each is best suited to particular review types and must be applied with awareness of its limitations.
Table 3. AI tools with their suggested use on literature review type, their main function, and their limitations.
AI Tool | Application Identification | Official Website | Tool Type | Best for Review Type | Main Function/Strength | Limitations
Abstrackr [45] | Abstrackr (Brown University, Providence, RI, USA) | https://abstrackr.com/ [accessed on 13 February 2026] | Machine Learning | Systematic Reviews | Semi-automated screening | Accuracy may vary by dataset
ASReview [46,47,50] | ASReview (Utrecht University, Utrecht, The Netherlands) | https://asreview.nl/ [accessed on 13 February 2026] | Machine Learning for Screening | Systematic Reviews | Active learning for abstract screening | Requires setup and training data
ChatGPT [44,51,52] | ChatGPT (OpenAI, San Francisco, CA, USA) | https://chat.openai.com/ [accessed on 25 February 2026] | Generative AI (LLM) | Narrative, Scoping | Text generation, summarisation | May hallucinate facts; not citation-aware
Claude [53] | Claude (Anthropic PBC, San Francisco, CA, USA) | https://claude.ai/ [accessed on 25 February 2026] | Generative AI (LLM) | Narrative, Scoping | Contextual reasoning, summarisation | Limited access to academic databases
Connected Papers [54,55] | Connected Papers (Connected Papers Ltd., Tel Aviv, Israel) | https://www.connectedpapers.com/ [accessed on 13 February 2026] | Citation Mapping | Scoping Reviews | Generates a graph of related papers, important for prior and derivative works | Not updated in real time; static data
Consensus [56] | Consensus (Consensus AI, Inc., Boston, MA, USA) | https://consensus.app/ [accessed on 13 February 2026] | AI Semantic Search | Evidence-based Reviews | Summarises consensus from the literature | Limited database; surface-level responses
Elicit [57] | Elicit (Ought, Inc., San Francisco, CA, USA) | https://elicit.com/ [accessed on 13 February 2026] | AI Semantic Search | Systematic, Rapid Reviews | Finds and extracts answers from papers | Limited database; still in beta
EPPI-Reviewer [45,58] | EPPI-Reviewer (EPPI-Centre, University College London, London, UK) | https://eppi.ioe.ac.uk/cms/Default.aspx?tabid=2914 [accessed on 25 February 2026] | Systematic Review Platform | Systematic Reviews | Advanced tagging and meta-analysis | Complex UI; requires training
Gemini [59,60] | Gemini (Google LLC, Mountain View, CA, USA) | https://gemini.google.com/ [accessed on 25 February 2026] | Generative AI (LLM) | Narrative, Exploratory | Multimodal reasoning, summarisation | Limited PDF handling; less tuned to academic use
Litmaps [61,62] | Litmaps (Litmaps Ltd., Wellington, New Zealand) | https://www.litmaps.com/ [accessed on 12 February 2026] | Citation Tracking Tool | Exploratory, Narrative, Meta-Review (review of reviews) | Tracks citation networks over time, aggregates citation networks, and surfaces meta-level insights | Limited analysis depth
NotebookLM [63,64] | NotebookLM (Google LLC, Mountain View, CA, USA) | https://notebooklm.google.com/ [accessed on 12 February 2026] | AI-Powered Document Assistant | Narrative, Scoping, Critical Review | Supports deep reading and synthesis from user-uploaded content | Limited integration with live databases
Paper Digest [65] | Paper Digest (Paper Digest LLC, Brookline, MA, USA) | https://www.paperdigest.org/ [accessed on 12 February 2026] | AI Summary Tool | Rapid Reviews | Summarises abstracts and conclusions | Surface-level summaries
Perplexity [66] | Perplexity (Perplexity AI, Inc., San Francisco, CA, USA) | https://www.perplexity.ai/ [accessed on 12 February 2026] | AI Summary and Insight Extraction Tool | Scoping and Rapid Reviews | Aggregates multiple sources into structured summaries; identifies themes, trends, and contrasting viewpoints; literature mapping | Incomplete academic coverage; aggregates mixed sources (peer-reviewed papers, blogs, news)
Rayyan [58] | Rayyan (Rayyan Systems Inc., Cambridge, MA, USA) | https://www.rayyan.ai/ [accessed on 12 February 2026] | Systematic Review Platform | Systematic Reviews | Collaborative screening and tagging | No built-in AI summarisation
Research Rabbit [48,49] | Research Rabbit (Research Rabbit Inc., Brooklyn, NY, USA) | https://www.researchrabbit.ai/ [accessed on 12 February 2026] | Citation Discovery | Scoping, Narrative | Visual citation networks, paper discovery | No full-text analysis
Scholarcy [67] | Scholarcy (Scholarcy Ltd., London, UK) | https://www.scholarcy.com/ [accessed on 13 February 2026] | AI Summarisation | Narrative, Scoping | Summarises PDFs, highlights key points | No database search; summaries not always nuanced
SciSpace [57,66] | SciSpace (Typeset Technologies Pvt. Ltd., Bangalore, India) | https://typeset.io/ [accessed on 13 February 2026] | AI Summarisation | Narrative, Rapid | Summarises PDFs, explains key concepts | Accuracy varies; limited reasoning
Scite [68] | Scite (Scite Inc., Brooklyn, NY, USA) | https://scite.ai/ [accessed on 25 February 2026] | Citation AI | Systematic, Citation Mapping, Integrative Review | Smart citation context and classification; supports combining diverse literature | No full-text access; mainly metadata
SciSummary—AI [69] | SciSummary (SciSummary, San Francisco, CA, USA) | https://scisummary.com/ [accessed on 25 February 2026] | Search and Discovery, Summariser for Scientific Articles | Exploratory, Narrative | Automated summarisation of scientific and academic papers; simplification of technical content | Context compression; limited critical appraisal
Semantic Scholar [70,71] | Semantic Scholar (Allen Institute for AI, Seattle, WA, USA) | https://www.semanticscholar.org/ [accessed on 25 February 2026] | AI-Powered Semantic Search Engine | All types (starting point) | Search and filtering with AI | Missing some publisher content

3.3. Use AI in Literature Review—Step 3

In this step, the sociotechnical system becomes active and dynamic. This step is characterised by the direct, moment-to-moment interplay between the researcher (social) and the AI (technical). The researcher provides prompts, adjusts parameters, and guides the search, while the AI processes data, identifies patterns, and generates outputs. This is the “co-piloting” phase in its most literal sense, where the efficiency of the technical subsystem (e.g., screening thousands of articles) is leveraged under the constant cognitive oversight of the social subsystem. The success of this step depends entirely on the quality of the collaboration designed in Steps 1 and 2.
AI tools can be employed at various steps of the literature review process to enhance efficiency, accuracy, and comprehensiveness. One of the primary applications is in automated or semi-automated literature searches, where AI-driven databases and NLP algorithms can retrieve relevant sources more effectively than traditional keyword searches [72]. For example, AI-driven semantic search tools can identify conceptually related studies that do not share identical keywords, automatically screen large volumes of abstracts for relevance, and recommend influential or thematically connected articles through citation network analysis. These capabilities are particularly valuable in emerging or fragmented literature as well as interdisciplinary research contexts, where relevant literature is often dispersed across multiple domains (see also, use-cases and applications for details).
These tools not only save time but also reduce the risk of overlooking important publications. For instance, platforms such as Abstrackr support semi-automated abstract screening by learning from initial inclusion and exclusion decisions made by reviewers. As screening progresses, the system prioritises records predicted to be relevant, thereby helping researchers identify potentially eligible studies earlier in the process and reducing the likelihood that important publications remain unscreened in large datasets. Similarly, EPPI-Reviewer incorporates machine-assisted coding and relevance ranking features that assist with document classification and thematic tagging, supporting consistency across reviewers and improving transparency in decision trails. Other generative AI–enhanced environments, such as NotebookLM and SciSpace, can assist during the synthesis phase by summarising full-text articles, extracting key methodological features, and enabling semantic search across uploaded documents. For example, a reviewer examining intervention effectiveness may query a corpus for specific outcome measures or population characteristics and retrieve contextually relevant passages even when terminology varies. Such functionality supports more comprehensive evidence mapping and reduces the risk that relevant findings are overlooked due to keyword limitations.
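The prioritisation behaviour described above can be illustrated with a deliberately simplified sketch. The word-overlap scoring rule below is a toy stand-in, not the actual classifiers used by Abstrackr or ASReview, and the function names and example abstracts are hypothetical.

```python
from collections import Counter

def train_counts(labelled):
    """Build word counters from reviewer decisions.
    labelled: list of (abstract_text, included: bool) pairs."""
    inc, exc = Counter(), Counter()
    for text, included in labelled:
        (inc if included else exc).update(text.lower().split())
    return inc, exc

def relevance(text, inc, exc):
    # Crude score: +1 for each word previously seen in included
    # records, -1 for each word seen in excluded records.
    return sum((w in inc) - (w in exc) for w in text.lower().split())

def prioritise(unlabelled, labelled):
    """Order unscreened records so that those predicted relevant
    surface first, mimicking active-learning triage."""
    inc, exc = train_counts(labelled)
    return sorted(unlabelled, key=lambda t: relevance(t, inc, exc), reverse=True)

labelled = [
    ("ai assisted screening for systematic reviews", True),
    ("fertiliser effects on maize crop yield", False),
]
queue = prioritise(
    ["soil nutrients and crop yield", "screening tools for systematic reviews"],
    labelled,
)
# The screening-related record is promoted to the front of the queue.
```

In a real workflow the model would be retrained after each batch of reviewer decisions, so the queue ordering improves as screening progresses.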
AI can also be used to support relevance screening and document classification, allowing researchers to filter large datasets of articles based on predetermined inclusion and exclusion criteria. This approach minimises manual screening fatigue while improving consistency in decision-making [1]. Furthermore, topic modelling and clustering techniques can be employed to group articles into thematic categories, enabling researchers to identify emerging trends, conceptual frameworks, and research gaps within the literature. Another valuable function is metadata extraction, where AI systems extract bibliographic details, methodological information, and key variables from articles in a structured format, supporting systematic synthesis and bibliometric analysis. In addition, citation mapping and network analysis can reveal intellectual linkages between studies, highlight seminal works, and track the evolution of specific research domains [73]. Despite these advantages, maintaining academic rigour remains crucial. To ensure reliability, human-in-the-loop validation checkpoints must be incorporated throughout the process. This means that while AI tools can suggest patterns, classifications, or relevance scores, researchers must critically review and confirm these outputs to avoid algorithmic bias or misinterpretation. Ultimately, AI may be most effectively viewed as a complementary assistant in the literature review, enhancing human judgment rather than replacing it.
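The thematic-grouping idea can be sketched in miniature. Production tools use topic models or learned embeddings; the greedy keyword-overlap grouping below is an illustrative assumption only, with hypothetical titles, and does not represent any specific tool's algorithm.

```python
def jaccard(a, b):
    """Word-overlap similarity between two titles."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

def greedy_cluster(titles, threshold=0.2):
    """Assign each title to the first cluster whose seed title is
    sufficiently similar; otherwise start a new cluster."""
    clusters = []
    for title in titles:
        for cluster in clusters:
            if jaccard(title, cluster[0]) >= threshold:
                cluster.append(title)
                break
        else:
            clusters.append([title])
    return clusters

titles = [
    "ai screening for systematic reviews",
    "systematic reviews with ai screening tools",
    "renewable energy storage batteries",
    "battery storage for renewable energy",
]
themes = greedy_cluster(titles)
# Two thematic groups emerge: AI-assisted screening and energy storage.
```

Even this crude heuristic shows why human review of cluster labels matters: the threshold choice, not the data alone, determines how many "themes" appear.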

3.4. Validate and Cross-Reference AI Outcomes—Step 4

This step is the critical regulatory loop where joint optimisation is actively managed and enforced. It represents the most important function of the social subsystem: auditing, correcting, and refining the outputs of the technical subsystem to safeguard the integrity of the entire process. When a researcher manually corroborates a sample of AI-screened articles or verifies the facts of an AI-generated summary against source texts, they are performing a vital act of sociotechnical governance. This iterative cycle acknowledges the inherent fallibility of the technical subsystem (e.g., its potential for bias or hallucination) and institutionalises the researcher’s critical expertise as the ultimate arbiter of validity, ensuring the final output is a product of robust human–AI collaboration, not uncritical automation.
All AI-generated output should be critically evaluated before integration into the literature review process. The core of the methodological approach is centred on a tight, iterative cycle between the AI application (Step 3) and researcher-led validation (Step 4). These steps should not be viewed as linear but as a continuous loop. In Step 3, AI tools are initially deployed for tasks such as relevance screening or thematic clustering. During cross-checking, there are four conceptually possible outcomes: true positives, false positives, false negatives, and true negatives (Table 4).
True positives refer to cases where both the AI tool and the human reviewer identify an article as relevant (decision: accept), while false positives occur when the AI includes an article that is later excluded by the human reviewer (decision: review criteria). False negatives arise when the AI excludes an article that the human reviewer considers relevant (decision: adjust or retrain the AI filtering parameters), and true negatives describe instances where both the AI and the human reviewer exclude an article (decision: accept).
For example, a researcher may use ASReview to screen 1000 abstracts, generating a preliminary list of 100 potentially relevant articles. During the subsequent validation step, the researcher manually reviews a subset (sample) of both included and excluded records. An article on AI in the workplace that is selected by both ASReview and the human reviewer would be classified as a true positive, whereas an article included by ASReview but found to be unrelated to the research question upon manual inspection would represent a false positive. Conversely, a theoretically relevant study excluded by ASReview but identified during manual screening would constitute a false negative, signalling the need to refine the AI model or screening criteria. Articles excluded by both approaches would be treated as true negatives. This validation process, summarised in Table 4, can be conducted through parallel manual screening, dual search strategies, redundancy checks, or iterative error review to improve screening accuracy.
Following the principle of iterative error review, the researcher would manually screen a random sample (e.g., 10%) of both the included and excluded articles to identify false positives and false negatives. Any discrepancies are logged and used to refine the AI’s parameters or the search criteria itself. For instance, if the validation reveals the AI is consistently excluding articles that use a specific synonym, the search query is updated. The process then cycles back to Step 3, where the refined query or model is re-run on the existing selected sample to ensure refined queries are responding appropriately. At this step, the sample may also be expanded, especially if revisions to the queries were major. This iterative loop continues until the AI’s output reaches a pre-defined level of reliability (e.g., >95% agreement with the manual check). This embedded validation ensures that the final AI-generated dataset is rigorously vetted and that the researcher maintains control over the analytical process, aligning with the principles of human-in-the-loop validation throughout the review lifecycle.
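The audit logic of this validation loop reduces to a small confusion-matrix computation. The sketch below uses hypothetical sample data and the illustrative 95% agreement threshold mentioned above; the function name and data are assumptions, not part of any tool's API.

```python
def screening_audit(pairs):
    """pairs: (ai_included, human_included) decisions for a validation sample.
    Returns outcome counts and the AI-human agreement rate."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for ai, human in pairs:
        if ai and human:
            counts["TP"] += 1   # both include: accept
        elif ai and not human:
            counts["FP"] += 1   # AI over-inclusive: review criteria
        elif human:
            counts["FN"] += 1   # AI missed it: adjust or retrain filters
        else:
            counts["TN"] += 1   # both exclude: accept
    agreement = (counts["TP"] + counts["TN"]) / len(pairs)
    return counts, agreement

# Hypothetical 10% validation sample of 20 records.
sample = [(True, True)] * 8 + [(True, False)] + [(False, True)] + [(False, False)] * 10
counts, agreement = screening_audit(sample)
rerun_needed = agreement < 0.95  # below threshold: refine the query, cycle back to Step 3
```

Here agreement is 90%, so the loop would continue: the logged false positive and false negative would inform the next refinement of the query or model parameters.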
Comparing outputs from multiple AI tools or platforms can further enhance reliability, as different algorithms may highlight distinct but complementary aspects of the literature [47]. Researchers should also remain attentive to inconsistencies, duplications, or hallucinated data, which may arise from limitations in training datasets or algorithmic bias. Incorporating theoretical and conceptual frameworks as benchmarks ensures that the extracted themes or classifications are not only technically accurate but also academically relevant. To strengthen methodological credibility, strategies such as peer debriefing, triangulation of data sources, and inter-rater reliability assessments should be integrated throughout the review. For instance, two or more reviewers can independently evaluate AI-screened results and reconcile discrepancies through discussion and documentation, thereby increasing consensus and reducing subjectivity. Regular audits of AI-generated outputs are crucial, including checks for accuracy in metadata extraction, verification of citation relationships, and alignment with the scope of the research question [74]. Documentation of these validation processes enhances transparency and reproducibility, allowing other scholars to follow and critique the approach. Ultimately, AI validation is not a one-off task, but a continuous monitoring process embedded throughout the review lifecycle. By combining AI efficiency with human scholarly judgement, researchers can balance innovation with rigour, ensuring that findings are both robust and trustworthy.

3.5. Disclose AI Use as per Relevant Requirements During Reporting—Step 5

The final step is an act of making the entire sociotechnical system transparent and accountable. By documenting which tools were used, for what purpose, and how their outputs were validated, the researcher (social subsystem) takes full responsibility for the process and its outcomes. This disclosure is not just an ethical formality; it is a methodological commitment to reproducibility. It allows peers to scrutinise the interaction between the human and the machine, understand the potential influence of the technical subsystem on the findings, and ultimately trust the scholarly work produced by the system. This final act reaffirms the principle that accountability always resides within the social subsystem.
Researchers should clearly document the tools used, their specific roles, and any known limitations; outline the extent of human oversight and quality control; and provide a replicability appendix detailing tool configurations, prompts, and workflows to enable reproducibility and accountability. Compliance with relevant ethical guidelines related to AI (e.g., COPE, APA, ICMJE) should be strictly observed (Table 5). An AI transparency statement should also be provided (a sample is included in Appendix A). This is to ensure disclosure and to reaffirm that human authorship and intellectual contributions remain central to the review process.

4. Use-Cases and Applications

This article offers a roadmap for early-career and seasoned scholars seeking to harness AI’s potential while preserving critical academic rigour. A use-case application is presented, based on five scenarios: emerging or fragmented literature, large-scale or highly saturated literature, interdisciplinary or cross-domain literature, methodologically diverse literature, and under-researched or neglected topics.

4.1. Scenario 1: Emerging or Fragmented Literature

In domains where the body of literature is emerging or fragmented, integrating AI tools into the review process can help consolidate dispersed insights. Researchers in such fields often face challenges due to inconsistent terminology, limited studies, or unevenly distributed publications. The research “picture” is characterised by disintegrated or emerging themes, leaving the researcher with either a large quantity of literature to search through or a dispersed set of sources. The researcher’s initial high-level analysis of the discovered citations and abstracts often appears contradictory and confusing. The intention is to draw this fragmented literature into a picture, then review and prune it until it forms a clear, connected, systematic image. Clustering and semantic analysis can uncover hidden connections, map conceptual links, and identify recurring themes in otherwise disjointed work. This structured synthesis enables a more coherent understanding of the trajectory of the field. Using AI, researchers can reduce bias, ensure broader coverage of sources, and generate new conceptual frameworks to guide future research directions. Several AI tools can be used to resolve these challenges, with a workflow as follows.
Strategic Approach: For this scenario, the primary goal would be discovery and consolidation. A researcher should employ a workflow that moves from broad, visual exploration of fragmented and dispersed sources to deep synthesis of a small corpus of visually mapped clusters, networks and relationships.
Step 1: Planning and Readiness. The researcher defines the scope by identifying the “disintegrated” nature of the research landscape as a primary boundary. The objective is to prune a large quantity of confusing or contradictory citations into a clear, connected image.
Step 2: Tool Selection and Matrix Mapping. Following the capability-requirement matrix, the researcher selects citation mapping tools like Research Rabbit or Litmaps for initial discovery. These tools are chosen for their ability to uncover hidden citation networks and “neighbourhoods” of research not linked by common keywords.
Step 3: Active Use in Review. The researcher uses the selected tools to algorithmically downscale complex relationships into simplified dimensions, facilitating readability at high retrieval speeds [48,49,78]. Once small clusters of relevant papers are identified, a document assistant like NotebookLM is employed to identify common theoretical threads from the disparate full-text articles.
Step 4: Validation and Cross-Referencing. The social subsystem (the researcher) validates the technical results to detect potential biases or hallucinations. This involves checking the AI-generated visual clusters against known foundational “seed papers” to ensure the network analysis has not omitted critical but obscure studies.
Step 5: Reporting and Disclosure. The final report provides a replicability appendix detailing the “seed” articles used to generate the citation networks and the specific parameters of the visual mapping. Disclosure includes an AI transparency statement confirming that human judgment was used to interpret the mapped relationships.

4.2. Scenario 2: Large-Scale or Highly Saturated Literature

Complex issues are difficult for non-specialist readers to understand, because experts’ statements are often obscure to those unfamiliar with the field. Established domains with vast amounts of academic output—such as cancer research, renewable energy, or machine learning—present unique scale challenges. Researchers must contend with thousands of publications, risking information overload and superficial syntheses. AI applications, particularly those that take advantage of NLP and topic modelling, enable efficient filtering, thematic clustering, and summarisation of large datasets. This structured reduction in complexity allows reviewers to focus on the most influential trends, critical debates, and research gaps. Thus, AI integration supports more rigorous, transparent, and scalable literature reviews in saturated fields, improving both efficiency and analytical depth.
Strategic Approach: The strategy for saturated fields is characterised as triage and conquer. The goal is to systematically reduce a large, unmanageable dataset into thematic clusters that can be synthesised effectively.
Step 1: Planning and Readiness. The researcher identifies that the sheer volume of literature risks information overload and superficial synthesis. The boundaries of AI use are defined to leverage NLP and machine learning specifically for efficient filtering and thematic clustering.
Step 2: Tool Selection and Matrix Mapping. Using the capability-requirement matrix, the researcher selects machine learning-based screening tools like ASReview or collaborative platforms like Rayyan for high-throughput abstract screening. Semantic Scholar is identified for its ability to perform semantic clustering, while Elicit is chosen for targeted data extraction within identified clusters.
Step 3: Active Use in Review. The technical subsystem is activated to perform high-volume screening of thousands of titles and abstracts based on active learning principles. Once the dataset is reduced, the researcher employs semantic clustering to group the remaining articles into distinct, manageable thematic categories.
Step 4: Validation and Cross-Referencing. To safeguard against algorithmic bias or misinterpretation, the researcher performs a “human-in-the-loop” validation check. This involves parallel manual screening of a random sample (e.g., 10%) of both included and excluded articles to identify and correct false negatives or positives.
Step 5: Reporting and Disclosure. The final reporting includes a detailed account of the tool configurations used (e.g., version numbers and screening parameters) and the results of the validation audit. An AI transparency statement confirms that while AI facilitated the triage, final interpretations were verified by human reviewers.

4.3. Scenario 3: Interdisciplinary or Cross-Domain Literature

Interdisciplinary research often faces the challenge of integrating diverse terminology, frameworks, and methodologies. For example, topics such as climate change and artificial intelligence in education span multiple domains, requiring synthesis between different knowledge traditions. AI tools offer a systematic way of bridging conceptual gaps by recognising semantic similarities and clustering information across fields. This capacity reduces the cognitive burden on reviewers and helps avoid disciplinary silos. By supporting integration across domains, AI fosters the development of holistic conceptual models and opens pathways for novel insights that might otherwise remain obscured.
Strategic Approach: The key proposed strategy can involve translation and conceptual bridging, designed to overcome the “Tower of Babel” problem, where different academic disciplines use varied terminology for similar concepts. A multi-phase AI-powered workflow is highly effective.
Step 1: Planning and Readiness. The researcher acknowledges the challenge of integrating diverse terminology and frameworks. The plan focuses on recognising semantic similarities to reduce cognitive burden and avoid disciplinary silos.
Step 2: Tool Selection and Matrix Mapping. The researcher selects semantic search tools like Elicit for initial conceptual discovery. Citation analysis tools such as Scite are chosen to analyse engagement across disciplines, while NotebookLM is selected for the final integrated synthesis.
Step 3: Active Use in Review. The technical subsystem is used to ask high-level, cross-domain questions to identify core concepts and their variants across fields. The researcher then employs Litmaps or Connected Papers to visually map citation networks and identify influential “boundary spanner” articles cited across multiple domains.
Step 4: Validation and Cross-Referencing. Using NotebookLM, the researcher performs a robust synthesis by asking complex questions across the curated library (e.g., comparing management vs. sociology approaches). This step ensures that the final narrative respects the nuances of each discipline while building a holistic understanding.
Step 5: Reporting and Disclosure. The disclosure details how different disciplinary terminologies were reconciled during the search process. A reproducibility appendix provides the list of “boundary spanner” articles and the cross-domain queries used in the AI tools.

4.4. Scenario 4: Methodologically Diverse Literature

Some research domains contain studies employing highly diverse methods, ranging from qualitative ethnographies to quantitative experiments and big data analyses. Traditional synthesis approaches often struggle to reconcile these methodological differences into a coherent narrative. AI tools can support structured coding of methodological features, detect commonalities between epistemological divides, and facilitate the integration of mixed methods. By providing automated classification and comparison, AI helps researchers synthesise heterogeneous designs more systematically. This not only reduces subjectivity but also strengthens the methodological rigour of literature reviews, particularly in complex social sciences and applied fields.
Strategic Approach: A potentially effective strategy is to structure first and synthesise second, a workflow designed to respect the epistemological diversity of the source material before attempting to integrate findings. The primary challenge is not just the volume of studies but the heterogeneity of their designs, ranging from quantitative surveys to qualitative case studies.
Step 1: Planning and Readiness. The researcher establishes a plan for methodological triage, focusing on the heterogeneity of designs. The goal is to facilitate the integration of mixed methods while reducing subjectivity.
Step 2: Tool Selection and Matrix Mapping. Tools with strong, structured data extraction capabilities, such as Elicit, are prioritised for pulling details like sample size and analytical techniques. EPPI-Reviewer is selected for its advanced tagging and meta-analysis features designed for mixed-methods reviews.
Step 3: Active Use in Review. The technical subsystem maps the methodological landscape by pulling and tabulating key details from each paper. This structured data informs the creation of a detailed coding framework within the review platform.
Step 4: Validation and Cross-Referencing. The researcher performs a nuanced synthesis by first grouping findings by methodology. Human judgment is then applied to the higher-order task of integrating these separate syntheses into a coherent, overarching narrative.
Step 5: Reporting and Disclosure. The final reporting includes the specific coding framework and the logic used for the methodological triage. Authors disclose how AI-assisted extraction supported the human-led synthesis of heterogeneous designs.
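The methodological triage described in Steps 2–4 can be illustrated with a minimal sketch. All study records, design labels, and triage rules below are hypothetical stand-ins for the structured output a tool such as Elicit might export; the point is only the mechanism of grouping heterogeneous designs before synthesis, with unclassified cases flagged for the human judgment applied in Step 4.

```python
from collections import defaultdict

# Hypothetical extracted records, mimicking the structured fields an
# AI extraction tool might return for each included study.
studies = [
    {"id": "S1", "design": "ethnography", "n": 12},
    {"id": "S2", "design": "randomised experiment", "n": 240},
    {"id": "S3", "design": "survey", "n": 1050},
    {"id": "S4", "design": "case study", "n": 3},
]

# Simple rule-based triage: map reported designs onto broad families.
QUALITATIVE = {"ethnography", "case study", "interview study"}
QUANTITATIVE = {"randomised experiment", "survey", "big data analysis"}

def triage(study):
    if study["design"] in QUALITATIVE:
        return "qualitative"
    if study["design"] in QUANTITATIVE:
        return "quantitative"
    return "unclassified"  # escalated to human review (Step 4)

# Group findings by methodology before any cross-group synthesis.
groups = defaultdict(list)
for s in studies:
    groups[triage(s)].append(s["id"])

print(dict(groups))
```

In a real review the coding framework would be far richer (sampling strategy, analytical technique, quality appraisal), but the grouping-before-integration logic is the same.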

4.5. Scenario 5: Under-Researched or Neglected Topics

In areas where scholarship is scarce, such as indigenous knowledge systems or the adoption of low-resource technology, literature reviews face the opposite challenge of data sparsity. Conventional review methods may neglect relevant but obscure studies, particularly in less visible or non-indexed sources. AI can expand retrieval by scanning multiple repositories, preprint servers, and grey literature, uncovering overlooked materials. Semantic similarity detection allows AI to identify potentially relevant work even when terminology differs. This enhances inclusivity and ensures that underrepresented perspectives are systematically captured, offering a stronger knowledge base for fields in need of academic attention.
Strategic Approach: A potential strategy here is casting a wide net followed by deep analysis. Search tools that index a broad range of sources are critical.
Step 1: Planning and Readiness. The researcher defines the boundaries of AI use to prioritise the expansion of retrieval across multiple repositories, including preprint servers and grey literature. The plan emphasises the use of semantic similarity detection to identify relevant work even when terminology is inconsistent.
Step 2: Tool Selection and Matrix Mapping. Semantic Scholar may serve as a useful starting point, as it indexes preprints and conference proceedings. Consensus is selected for its ability to find expert consensus in reports and white papers. Additionally, Research Rabbit, Connected Papers, or Litmaps are chosen to build visual citation trees from a “seed paper” to identify related work that may not be directly cited.
Step 3: Active Use in Review. The researcher builds a visual citation network to reveal sparse regions and underexplored themes. This technical process allows for the identification of patterns through the inspection of clusters, highlighting areas in need of academic attention.
Step 4: Validation and Cross-Referencing. Using NotebookLM, the researcher performs a deep reading of the gathered (likely small) collection. This tool is intended to support the identification of what is missing in terms of theories, contexts, or methods, ensuring the review is grounded exclusively in the specific evidence found in these hard-to-reach areas.
Step 5: Reporting and Disclosure. The reporting phase documents the search strategies used across broad repositories and the specific “seed” papers used for citation mapping. An AI transparency statement confirms that human domain expertise was used to interpret the “white spaces” identified by the AI tools.
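The semantic similarity detection mentioned above can be sketched in miniature. Real retrieval tools rely on dense embeddings learned by language models, which is what lets them match differing terminology; the toy below substitutes simple word-count vectors for embeddings to show only the underlying vector-similarity ranking mechanism (all paper titles are invented).

```python
import math
from collections import Counter

def vectorise(text):
    """Toy bag-of-words vector; real systems use dense semantic embeddings."""
    return Counter(text.lower().split())

def cosine(a, b):
    # Counter returns 0 for missing terms, so iterating over `a` suffices.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

query = "low resource technology adoption in rural communities"
candidates = {
    "P1": "adoption of low resource technology in rural communities",
    "P2": "deep learning for image classification benchmarks",
}

# Rank candidate papers by similarity to the query, most similar first.
ranked = sorted(candidates,
                key=lambda p: cosine(vectorise(query), vectorise(candidates[p])),
                reverse=True)
print(ranked)
```

With embedding vectors in place of word counts, the same ranking step is what surfaces relevant work even when its terminology differs from the query.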

5. Limitations and Future Research Direction

5.1. Limitations

The developed structured step-by-step approach is not without limitations. First, there is variability in AI tools: differences in platform algorithms, training data, and functionality may affect the consistency and generalisability of results. Second, academics and researchers may develop an over-reliance on AI outputs; scholars risk undervaluing critical human judgment if AI-generated syntheses are accepted uncritically. Third, there are ethical and disclosure challenges: guidelines for the use of AI in research are still evolving, which can create uncertainty about disclosure standards, bias mitigation, and academic integrity. Finally, a noteworthy limitation is that the use-cases and applications presented in Section 4 are primarily conceptual and illustrative. While these workflows are grounded in sociotechnical theory and documented tool capabilities, they have not yet been subjected to primary empirical testing or controlled comparative evaluations against traditional manual review processes.

5.2. Directions for Future Research

Systematic reviews reveal a growing body of knowledge on AI applications, indicating a trend toward interdisciplinary research and the need for further exploration of under-researched areas [79,80]. Building on these developments, the scope of the approach can be expanded along several directions for future research.
First, there is the integration of AI into bibliographic analysis. Future research should explore how artificial intelligence can improve bibliographic methodologies: machine learning algorithms can automate citation parsing, detect emerging research fronts, and improve author disambiguation. Such integration would enable scholars to track scientific influence more precisely and provide richer insights into knowledge diffusion across disciplines.
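Author disambiguation, one of the tasks mentioned above, can be illustrated with a deliberately naive sketch. The blocking key below (surname plus first initial) is a common first stage; production systems add affiliations, co-authorship graphs, and learned models on top of it. All names and paper labels are invented.

```python
from collections import defaultdict

def disambiguation_key(name):
    """Toy blocking key: (surname, first initial) for 'Surname, Given' strings."""
    surname, given = [part.strip() for part in name.split(",", 1)]
    return (surname.lower(), given[0].lower())

# Invented records: the first two name variants should collapse to one author.
records = [
    ("Smith, J. A.", "Paper A"),
    ("Smith, John", "Paper B"),
    ("Doe, A.", "Paper C"),
]

clusters = defaultdict(list)
for name, paper in records:
    clusters[disambiguation_key(name)].append(paper)

print(dict(clusters))
```

The interesting research questions begin where this sketch ends: resolving distinct authors who share a blocking key, and variants that escape it.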
Second, AI-assisted theory-building is another promising direction, involving the use of AI to support the development of theories. NLP and generative models can analyse large bodies of literature, identify conceptual gaps, and propose novel hypotheses. Research should examine how AI-driven pattern recognition can complement human intuition and facilitate the creation of new theoretical frameworks.
Third, personalised AI review agents tailored to individual scholars have the potential to significantly alter academic workflows. These agents can filter the literature according to a researcher’s interests, summarise findings, and suggest relevant methodologies. Future work should evaluate the effectiveness of such personalised systems in improving research productivity and reducing information overload.
Further, interoperability with academic databases remains crucial: AI tools must integrate seamlessly with major scholarly databases. Research should focus on standardising metadata formats, developing interoperable APIs, and addressing data privacy. Such interoperability would enhance discoverability, reproducibility, and cross-platform collaboration in scholarly communication [81]. In the past year, Scopus and Web of Science have introduced AI research assistants, and this capability can be expanded to other platforms.
Finally, there is a clear need for controlled studies that compare the efficiency and accuracy of the AI-integrated 5-step approach against traditional manual methods. Future research should measure specific outcomes, such as screening precision (false positives/negatives) and the depth of thematic synthesis, to provide empirical grounding for these digital workflows.
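The screening metrics proposed here reduce to simple confusion-matrix arithmetic. The sketch below, using invented decisions, shows how precision and recall (and the false positive/negative counts behind them) could be computed when AI screening decisions are compared against a human gold standard.

```python
def screening_metrics(ai_labels, gold_labels):
    """Compare AI screening decisions (True = include) with a human gold standard."""
    tp = sum(a and g for a, g in zip(ai_labels, gold_labels))          # correct inclusions
    fp = sum(a and not g for a, g in zip(ai_labels, gold_labels))      # wrongly included
    fn = sum((not a) and g for a, g in zip(ai_labels, gold_labels))    # missed studies
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # in reviews, missed studies are the costly error
    return {"precision": precision, "recall": recall,
            "false_positives": fp, "false_negatives": fn}

# Hypothetical decisions over eight screened records.
ai   = [True, True, False, True, False, False, True, False]
gold = [True, False, False, True, True, False, True, False]
print(screening_metrics(ai, gold))
```

A controlled comparison would report these figures for the AI-integrated workflow and the manual baseline over the same corpus, alongside time-to-completion and depth-of-synthesis measures.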

6. Conclusions

While existing frameworks provide procedural guidance for using AI in literature reviews (e.g., Bolaños et al. [1], Khalifa and Albadawy [20], Wagner et al. [3], Molopa [21], and Mogoale et al. [22]), our proposed model makes a unique contribution by being explicitly grounded in a sociotechnical systems lens. This approach moves beyond a simple workflow to foreground the critical, ongoing tension between AI’s technical capabilities and its inherent limitations, such as hallucination. By embedding an iterative validation loop (Steps 3 and 4) as the core engine of the review process, our model proposes a practical methodology for researchers to actively manage this ‘co-piloting-hallucination paradox,’ ensuring that human critical judgment remains central to knowledge production.
The integration of AI tools into systematic review processes represents a noteworthy development in literature screening methodologies, offering potential for improved efficiency and accuracy. However, careful consideration of the performance, user experience, and specific needs of each review is essential to maximise the benefits of these tools while minimising the risks associated with their use. The robustness of the proposed framework is derived from the systematic integration of the five sociotechnical practices with the five procedural steps. Rather than treating ethical boundaries or human judgment as isolated preparatory tasks, this model ensures they permeate every phase of the review, from the strategic mapping in Step 2 to the rigorous auditing in Step 4. As illustrated by the mapping matrix, the social subsystem (the researcher’s judgment and ethics) acts as the continuous governance layer that directs the technical subsystem (AI functionalities). This alignment ensures that as the role of AI in shaping intellectual content increases, human responsibility and critical oversight increase proportionally, safeguarding the originality and integrity of the scholarly output. A fine balance will have to be struck between the need for data privacy, creativity, and originality on the one hand and the potential of LLMs and NLP applied to a fully accessible dataset of research literature on the other. A model with open data access and closed data analysis may be viable.

Author Contributions

Conceptualisation, M.M.M.; methodology, M.M.M.; validation, M.M.M. and T.V.Q.M.; investigation, M.M.M.; data curation, J.-L.J.M., W.N., T.V.Q.M. and M.J.L.; writing—original draft, M.M.M., J.-L.J.M., W.N., T.V.Q.M. and M.J.L.; writing—review and editing, M.M.M., J.-L.J.M., W.N., T.V.Q.M. and M.J.L.; visualisation, M.M.M.; supervision, M.M.M.; project administration, J.-L.J.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI: Artificial intelligence
APA: American Psychological Association
COPE: Committee on Publication Ethics
GPT: Generative pre-trained transformer
ICMJE: International Committee of Medical Journal Editors
LLMs: Large language models
NLP: Natural language processing

Appendix A

Sample of AI Transparency Statement

In conducting this systematic review, the research team used artificial intelligence (AI) tools to support specific phases of the process. In line with the Committee on Publication Ethics (COPE) guidelines on transparency and integrity, the International Committee of Medical Journal Editors (ICMJE) recommendations on authorship and disclosure, and the American Psychological Association (APA, 7th ed.) standards for responsible reporting, we provide the following details:
  • Tools Used
Tools X, Y, Z: Assisted in prioritising articles during the title/abstract screening stage.
Tools P, Q: Supported the brainstorming and refinement of objectives and suggested ways to improve clarity.
  • Scope of AI Involvement
AI tools were used to enhance researcher-led processes, not to replace human judgment. All inclusion/exclusion decisions, data extractions, thematic syntheses, and final interpretations were performed and verified by human reviewers.
  • Quality Control and Validation
AI outputs were systematically cross-checked against manual screening results and validated using peer debriefing, inter-rater reliability checks, and triangulation with multiple data sources. Any AI-generated text was critically reviewed, edited, and fact-checked by the research team before inclusion.
  • Limitations of AI Tools
AI tools may produce incomplete, inaccurate, or fabricated outputs (“hallucinations”). They do not replace domain expertise and cannot assume responsibility for the content; accountability remains with the authors.
  • Reproducibility and Transparency
Full details of the AI tools used, including software versions, prompts, parameter settings, and decision rules, are provided in Replicability Materials. Additionally, all data sets and search strategies are documented to allow replication by other researchers.
  • Ethical Compliance
This study follows the COPE guidelines on transparency in publication, the ICMJE recommendations on authorship and disclosure of AI use, and the APA 7th Edition guidelines on responsible use of emerging technologies in research and writing.

References

  1. Bolaños, F.; Salatino, A.; Osborne, F.; Motta, E. Artificial Intelligence for Literature Reviews: Opportunities and Challenges. Artif. Intell. Rev. 2024, 57, 259. [Google Scholar] [CrossRef]
  2. Ilegbusi, P.H. The Integration of Artificial Intelligence (AI) in Literature Review and Its Potentials to Revolutionize Scientific Knowledge Acquisition. AfricArXiv 2024. [Google Scholar] [CrossRef]
  3. Wagner, G.; Lukyanenko, R.; Paré, G. Artificial Intelligence and the Conduct of Literature Reviews. J. Inf. Technol. 2022, 37, 209–226. [Google Scholar] [CrossRef]
  4. Mwogosi, A.; Mambile, C. AI Integration in EHR Systems in Developing Countries: A Systematic Literature Review Using the TCCM Framework. Inf. Discov. Deliv. 2025, IDD-07-2024-0097. [Google Scholar] [CrossRef]
  5. Zala, K.; Acharya, B.; Mashru, M.; Palaniappan, D.; Gerogiannis, V.C.; Kanavos, A.; Karamitsos, I. Transformative Automation: AI in Scientific Literature Reviews. Int. J. Adv. Comput. Sci. Appl. 2024, 15, 1246–1255. [Google Scholar] [CrossRef]
  6. Robinson, A.; Thorne, W.; Wu, B.P.; Pandor, A.; Essat, M.; Stevenson, M.; Song, X. Bio-SIEVE: Exploring Instruction Tuning Large Language Models for Systematic Review Automation. arXiv 2023, arXiv:2308.06610. [Google Scholar] [CrossRef]
  7. Guler, N.; Kirshner, S.N.; Vidgen, R. A Literature Review of Artificial Intelligence Research in Business and Management Using Machine Learning and ChatGPT. Data Inf. Manag. 2024, 8, 100076. [Google Scholar] [CrossRef]
  8. Kotsis, K.T. Artificial Intelligence Creates Plagiarism or Academic Research? Eur. J. Arts Humanit. Soc. Sci. 2024, 1, 169–179. [Google Scholar] [CrossRef]
  9. Chen, Z.; Chen, C.; Yang, G.; He, X.; Chi, X.; Zeng, Z.; Chen, X. Research Integrity in the Era of Artificial Intelligence: Challenges and Responses. Medicine 2024, 103, e38811. [Google Scholar] [CrossRef]
  10. Ayub, T.; Ahmad Malla, R.; Khan, M.Y.; Ganaie, S.A. The Art of Deception: Humanizing AI to Outsmart Detection. Glob. Knowl. Mem. Commun. 2024, ahead of print. [Google Scholar] [CrossRef]
  11. Crockett, R.; Howe, R. The Inherent Uncertainties of AI-Text Detection and the Implications for Education Institutions: An Overview. In Advances in Educational Marketing, Administration, and Leadership; Mahmud, S., Ed.; IGI Global: Hershey, PA, USA, 2024; pp. 175–198. ISBN 979-8-3693-0240-8. [Google Scholar]
  12. Weber-Wulff, D.; Anohina-Naumeca, A.; Bjelobaba, S.; Foltýnek, T.; Guerrero-Dib, J.; Popoola, O.; Šigut, P.; Waddington, L. Testing of Detection Tools for AI-Generated Text. Int. J. Educ. Integr. 2023, 19, 26. [Google Scholar] [CrossRef]
  13. Bowen, J.A.; Watson, C.E. Teaching with AI; Johns Hopkins University Press: Baltimore, MD, USA, 2024; ISBN 978-1-4214-4923-4. [Google Scholar]
  14. Blommerde, T.; Bright, W.; Musgrave, E.; Mitchell, R.; Heselton, R. AI Detectors in Universities: Time to Turn Them off and Embrace AI for Enhanced Learning. Educ. Dev. 2024, 25, 8–11. [Google Scholar]
  15. McKenna, S.; Kramm, N. Turning off AI Detection Software Is the Right Call for SA Universities. Daily Maverick, 25 July 2025. [Google Scholar]
  16. Eslit, E. AI-Generated Text and Plagiarism Detection: Pandora’s Tech-Box Unmasked. Soc. Sci. 2025, preprint. [Google Scholar] [CrossRef]
  17. Cui, P.; Alias, B.S. Opportunities and Challenges in Higher Education Arising from AI: A Systematic Literature Review (2020–2024). J. Infrastruct. Policy Dev. 2024, 8, 8390. [Google Scholar] [CrossRef]
  18. Chavez, J.V.; Cuilan, J.T.; Mannan, S.S.; Ibrahim, N.U.; Carolino, A.A.; Radjuni, A.; Albani, S.E.; Garil, B.A. Discourse Analysis on the Ethical Dilemmas on the Use of AI in Academic Settings from ICT, Science, and Language Instructors. Forum Linguist. Stud. 2024, 6, 349–363. [Google Scholar] [CrossRef]
  19. Zhai, C.; Wibowo, S.; Li, L.D. The Effects of Over-Reliance on AI Dialogue Systems on Students’ Cognitive Abilities: A Systematic Review. Smart Learn. Environ. 2024, 11, 28. [Google Scholar] [CrossRef]
  20. Khalifa, M.; Albadawy, M. Using Artificial Intelligence in Academic Writing and Research: An Essential Productivity Tool. Comput. Methods Programs Biomed. Update 2024, 5, 100145. [Google Scholar] [CrossRef]
  21. Molopa, S.T. Artificial Intelligence-Based Literature Review Adaptation. S. Afr. J. Libr. Inf. Sci. 2024, 90, 1–18. [Google Scholar] [CrossRef]
  22. Mogoale, P.D.; Pretorius, A.B.; Mogase, R.C.; Segooa, M.A. Evaluating the Efficacy of AI Tools in Systematic Literature Reviews: A Comprehensive Analysis. Journalisi 2025, 7, 870–888. [Google Scholar] [CrossRef]
  23. Haddaway, N.R.; Bethel, A.; Dicks, L.V.; Koricheva, J.; Macura, B.; Petrokofsky, G.; Pullin, A.S.; Savilaakso, S.; Stewart, G.B. Eight Problems with Literature Reviews and How to Fix Them. Nat. Ecol. Evol. 2020, 4, 1582–1589. [Google Scholar] [CrossRef]
  24. Mitchell, A.; Rich, M. Challenges of Writing an Effective Literature Review for Students and New Researchers of Business. Eur. Conf. Res. Methodol. Bus. Manag. Stud. 2022, 21, 141–148. [Google Scholar] [CrossRef]
  25. Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training; OpenAI: San Francisco, CA, USA, 2018. [Google Scholar]
  26. Yenduri, G.; Ramalingam, M.; Selvi, G.C.; Supriya, Y.; Srivastava, G.; Maddikunta, P.K.R.; Raj, G.D.; Jhaveri, R.H.; Prabadevi, B.; Wang, W.; et al. GPT (Generative Pre-Trained Transformer)—A Comprehensive Review on Enabling Technologies, Potential Applications, Emerging Challenges, and Future Directions. IEEE Access 2024, 12, 54608–54649. [Google Scholar] [CrossRef]
  27. Gunning, D.; Stefik, M.; Choi, J.; Miller, T.; Stumpf, S.; Yang, G.-Z. XAI—Explainable Artificial Intelligence. Sci. Robot. 2019, 4, eaay7120. [Google Scholar] [CrossRef] [PubMed]
  28. Sirilertmekasakul, C.; Rattanawong, W.; Gongvatana, A.; Srikiatkhachorn, A. The Current State of Artificial Intelligence-Augmented Digitized Neurocognitive Screening Test. Front. Hum. Neurosci. 2023, 17, 1133632. [Google Scholar] [CrossRef]
  29. Saied, M.; Mokhtar, N.; Badr, A.; Adel, M.; Boles, P.; Khoriba, G. AI in Literature Reviews: A Survey of Current and Emerging Methods. In Proceedings of the 2024 International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), Cairo, Egypt, 13–14 November 2024; pp. 61–65. [Google Scholar]
  30. Bawack, R.; Bawack, R. “Hey Librarian, What Can AI and Analytics Do for You”: A Systematic Literature Review and Sociotechnical Perspective. Aslib J. Inf. Manag. 2025, 77, 124–145. [Google Scholar] [CrossRef]
  31. Taxén, L. Reviving the Individual in Socio-Technical Systems Thinking. J. Complex Syst. Inform. Model. Q. 2020, 22, 39–48. [Google Scholar] [CrossRef]
  32. Ferrara, E. Fairness and Bias in Artificial Intelligence: A Brief Survey of Sources, Impacts, and Mitigation Strategies. Sci 2023, 6, 3. [Google Scholar] [CrossRef]
  33. Schwartz, R.; Vassilev, A.; Greene, K.; Perine, L.; Burt, A.; Hall, P. Towards a Standard for Identifying and Managing Bias in Artificial Intelligence; National Institute of Standards and Technology (U.S.): Gaithersburg, MD, USA, 2022; p. NIST SP 1270. [Google Scholar]
  34. Shahbazi, N.; Lin, Y.; Asudeh, A.; Jagadish, H.V. Representation Bias in Data: A Survey on Identification and Resolution Techniques. ACM Comput. Surv. 2023, 55, 1–39. [Google Scholar] [CrossRef]
  35. Naudé, W. Artificial Intelligence vs COVID-19: Limitations, Constraints and Pitfalls. AI Soc. 2020, 35, 761–765. [Google Scholar] [CrossRef]
  36. Gerlich, M. AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking. Societies 2025, 15, 6. [Google Scholar] [CrossRef]
  37. Schemmer, M.; Hemmer, P.; Kühl, N.; Benz, C.; Satzger, G. Should I Follow AI-Based Advice? Measuring Appropriate Reliance in Human-AI Decision-Making. arXiv 2022, arXiv:2204.06916. [Google Scholar]
  38. Johnson, J. The AI Commander Problem: Ethical, Political, and Psychological Dilemmas of Human-Machine Interactions in AI-Enabled Warfare. J. Mil. Ethics 2022, 21, 246–271. [Google Scholar] [CrossRef]
  39. Mueller, S.T.; Hoffman, R.R.; Clancey, W.; Emrey, A.; Klein, G. Explanation in Human-AI Systems: A Literature Meta-Review, Synopsis of Key Ideas and Publications, and Bibliography for Explainable AI. arXiv 2022, arXiv:2204.06916. [Google Scholar]
  40. Patel, P. AI Voice Enters the Copyright Regime: Proposal of a Three-Part Framework. Fordham Intellect. Prop. Media Entertain. Law J. 2024, 34, 451–513. [Google Scholar]
  41. Foltynek, T.; Bjelobaba, S.; Glendinning, I.; Khan, Z.R.; Santos, R.; Pavletic, P.; Kravjar, J. ENAI Recommendations on the Ethical Use of Artificial Intelligence in Education. Int. J. Educ. Integr. 2023, 19, 12. [Google Scholar] [CrossRef]
  42. Lin, Z. Beyond Principlism: Practical Strategies for Ethical AI Use in Research Practices. AI Ethics 2025, 5, 2719–2731. [Google Scholar] [CrossRef]
  43. Dos Reis, A.H.S.; De Oliveira, A.L.M.; Fritsch, C.; Zouch, J.; Ferreira, P.; Polese, J.C. Usefulness of Machine Learning Softwares to Screen Titles of Systematic Reviews: A Methodological Study. Syst. Rev. 2023, 12, 68. [Google Scholar] [CrossRef]
  44. Schmidt, L.; Cree, I.; Campbell, F. Digital Tools to Support the Systematic Review Process: An Introduction. Eval. Clin. Pract. 2025, 31, e70100. [Google Scholar] [CrossRef]
  45. Pijls, B.G. Machine Learning Assisted Systematic Reviewing in Orthopaedics. J. Orthop. 2024, 48, 103–106. [Google Scholar] [CrossRef]
  46. Van De Schoot, R.; De Bruin, J.; Schram, R.; Zahedi, P.; De Boer, J.; Weijdema, F.; Kramer, B.; Huijts, M.; Hoogerwerf, M.; Ferdinands, G.; et al. An Open Source Machine Learning Framework for Efficient and Transparent Systematic Reviews. Nat. Mach. Intell. 2021, 3, 125–133. [Google Scholar] [CrossRef]
  47. Nguyen-Trung, K.; Saeri, A.K.; Kaufman, S. Applying ChatGPT and AI-Powered Tools to Accelerate Evidence Reviews. Hum. Behav. Emerg. Technol. 2024, 2024, 8815424. [Google Scholar] [CrossRef]
  48. Fallico, N. Exploring Research Rabbit: Your New Favourite Reference Manager. Med. Writ. 2025, 34, 67–69. [Google Scholar] [CrossRef]
  49. Sharma, R.; Gulati, S.; Kaur, A.; Sinhababu, A.; Chakravarty, R. Research Discovery and Visualization Using ResearchRabbit: A Use Case of AI in Libraries. COLLNET J. Scientometr. Inf. Manag. 2022, 16, 215–237. [Google Scholar] [CrossRef]
  50. Van Dijk, S.H.B.; Brusse-Keizer, M.G.J.; Bucsán, C.C.; Van Der Palen, J.; Doggen, C.J.M.; Lenferink, A. Artificial Intelligence in Systematic Reviews: Promising When Appropriately Used. BMJ Open 2023, 13, e072254. [Google Scholar] [CrossRef] [PubMed]
  51. Nepal, T.K. Exploring the Applications and Challenges of ChatGPT in Research and Academia: A Comprehensive Review. West Sci. Interdiscip. Stud. 2024, 2, 1043–1050. [Google Scholar] [CrossRef]
  52. Prajith, J.; Dhirajlal, V.B. Exploring the Role of ChatGPT in Assisting Research Work and Writing Research Papers: A Study on ChatGPT AI Integration in Academic Writing. J. Inform. Educ. Res. 2025, 5, 3682–3691. [Google Scholar] [CrossRef]
  53. Maghsoudlou, P.; Fennely-Barnwell, J.; Abraham, A.; Kapacee, Z.; Allen, C.; Totenhofer, A.; Mamtora, S.; Keane, P.A.; Denniston, A.; Liu, X.; et al. Evaluating Large Language Models for Quality Control in Research Ethics Review. Res. Sq. 2025, rs.3.rs-5142171.v1. [Google Scholar] [CrossRef]
  54. Behera, P.K.; Jain, S.J.; Kumar, A. Visual Exploration of Literature Using Connected Papers: A Practical Approach. Issues Sci. Technol. Librariansh. 2023, 104. [Google Scholar] [CrossRef]
  55. Tay, A. 3 New Tools to Try for Literature Mapping—Connected Papers, Inciteful and Litmaps. Available online: https://aarontay.medium.com/3-new-tools-to-try-for-literature-mapping-connected-papers-inciteful-and-litmaps-a399f27622a (accessed on 11 November 2025).
  56. Faix, A. Consensus: Using AI to Analyze Scientific Literature. Libr. Trends 2025, 73, 344–354. [Google Scholar] [CrossRef]
  57. Tomczyk, P.; Brüggemann, P.; Mergner, N.; Petrescu, M. Are AI Tools Better than Traditional Tools in Literature Searching? Evidence from E-Commerce Research. J. Librariansh. Inf. Sci. 2024, 58, 135–145. [Google Scholar] [CrossRef]
  58. Waffenschmidt, S.; Sieben, W.; Jakubeit, T.; Knelangen, M.; Overesch, I.; Bühn, S.; Pieper, D.; Skoetz, N.; Hausner, E. Increasing the Efficiency of Study Selection for Systematic Reviews Using Prioritization Tools and a Single-Screening Approach. Syst. Rev. 2023, 12, 161. [Google Scholar] [CrossRef] [PubMed]
  59. Alsajri, A.; Salman, H.A.; Steiti, A. Generative Models in Natural Language Processing: A Comparative Study of ChatGPT and Gemini. Babylon. J. Artif. Intell. 2024, 2024, 134–145. [Google Scholar] [CrossRef]
  60. Supriyadi, E. Exploring Google Bard’s (Gemini) Role in Enhancing Research Articles in Computational Thinking and Mathematics Education. Papanda J. Math. Sci. Res. 2024, 3, 28–37. [Google Scholar] [CrossRef]
  61. Foley, K.; McLean, C.; De Zylva, R.; Asa, G.; Maio, J.; Batchelor, S.; Dzando, G.; Dimassi, A. Developing a Critical Imagination for How Researchers Can Use Artificially Intelligent Tools Reflexively and Responsibly During Qualitative Literature Reviews. Int. J. Qual. Methods 2025, 24, 16094069251316249. [Google Scholar] [CrossRef]
  62. Michalak, R.; Ellixson, D. AI-Driven Discovery: How Litmaps Shapes Research and Teaching & Learning. Ser. Libr. 2024, 85, 117–129. [Google Scholar] [CrossRef]
  63. Reyna, J. The Potential of Google NotebookLM for Teaching and Learning. In Proceedings of the E-Learn 2025 World Conference on E-Learning; Association for the Advancement of Computing in Education (AACE): Bangkok, Thailand, 2025; pp. 88–95. [Google Scholar]
  64. Shor, R.; Greene, E.A.; Sumberg, L.; Weingrad, A.B. AI Tools in Academia: Evaluating NotebookLM as a Tool for Conducting Literature Reviews. Psychiatry 2025, online ahead of print. [Google Scholar] [CrossRef]
  65. Tay, A. Paper Digest, Elicit and Auto-Generation of Literature Review. Front Matter. 2022. Available online: https://aarontay.medium.com/paper-digest-elicit-and-auto-generation-of-literature-review-f14d9b4b4b4b (accessed on 9 March 2026).
  66. Khattak, A.N.; Bhatti, G.A.; Patrick, N. The Role of Artificial Intelligence (AI) in Literature Reviews in Business and Social Sciences Research Studies in Pakistan: Qualitative Exploratory Expert Interviews. Glob. Manag. Sci. Rev. 2025, 10, 22–29. [Google Scholar] [CrossRef]
  67. Bui, T.X.H.; Bui, V.H. Decoding Scholarcy Website: A Study on Its Research Summarization Efficiency. Proc. AsiaCALL Int. Conf. 2024, 6, 71–80. [Google Scholar] [CrossRef]
  68. Basumatary, B.; Basumatary, N.; Vivekavardhan, J.; Verma, M.K. Tracing the Footprints of Scholarly Influence in Academia: A Contextual Smart Citation Analysis of Highly Cited Articles Using Scite. Glob. Knowl. Mem. Commun. 2024, 73, 542–560. [Google Scholar] [CrossRef]
  69. Fajaria, N.H. Optimizing sci summary usage for summarizing research article. In Useful AI Tools for English Teachers, 1st ed.; Alamsyah, A., Faozan, A., Hasbi, M., Suryaningsih, N.L.S., Ning, P., Astawa, S.P., Fauzi, A.R., Utomo, H.Y., Devi, A.P., Nor, H., Eds.; Rizquna: Banyumas, Indonesia, 2024; Chapter 21; pp. 255–263. [Google Scholar]
  70. Hingnekar, V. An Empirical Evaluation of a Multi-Agent Framework for Retrieval-Augmented Academic Research. TechRxiv 2025. [Google Scholar] [CrossRef]
  71. Williamson, J.M.; Fernandez, P. “Through the Looking Glass: Envisioning New Library Technologies” Academic Search Using Artificial Intelligence Tools. Libr. Hi Tech News 2025, 42, 1–5. [Google Scholar] [CrossRef]
  72. Zhang, Y.; Long, J.; Zhao, W. The Curvilinear Relationships Between Relational Embeddedness and Dynamic Capabilities: The Mediating Effect of Ambidextrous Learning. Front. Psychol. 2022, 13, 830377. [Google Scholar] [CrossRef]
  73. Mohd, N.I.; Ariffin, H.L.T.; Kamaruddin, T.; Shukery, N.M.; Ismail, F.; Omar, S.R.; Kamarden, H.; Abd Rahman, N.H.; Mustaffa, N.E. Mapping Research Trends in Artificial Intelligence in Higher Education: A Bibliometric Analysis. Int. J. Mod. Educ. 2025, 7, 190–203. [Google Scholar] [CrossRef]
  74. Delgado-Chaves, F.M.; Jennings, M.J.; Atalaia, A.; Wolff, J.; Horvath, R.; Mamdouh, Z.M.; Baumbach, J.; Baumbach, L. Transforming Literature Screening: The Emerging Role of Large Language Models in Systematic Reviews. Proc. Natl. Acad. Sci. USA 2025, 122, e2411962122. [Google Scholar] [CrossRef]
  75. Committee on Publication Ethics. Artificial Intelligence (AI) in Decision Making; Committee on Publication Ethics: Eastleigh, UK, 2021. [Google Scholar]
  76. American Psychological Association. APA Journals’ Policy on Generative AI: Additional Guidance (Updated August 2025). Available online: https://www.apa.org/pubs/journals/resources/publishing-tips/policy-generative-ai#disclosure-guidance (accessed on 10 November 2025).
  77. International Committee of Medical Journal Editors. Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals (Updated April 2025); International Committee of Medical Journal Editors: Vancouver, BC, Canada, 2025. [Google Scholar]
  78. Lowe, J.; Matthee, M. Requirements of Data Visualisation Tools to Analyse Big Data: A Structured Literature Review. In Responsible Design, Implementation and Use of Information and Communication Technology; Hattingh, M., Matthee, M., Smuts, H., Pappas, I., Dwivedi, Y.K., Mäntymäki, M., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2020; Volume 12066, pp. 469–480. ISBN 978-3-030-44998-8. [Google Scholar]
  79. Kumar, D.; Ratten, V. Artificial Intelligence and Family Businesses: A Systematic Literature Review. J. Fam. Bus. Manag. 2025, 15, 373–392. [Google Scholar] [CrossRef]
  80. Mariani, M.M.; Perez-Vega, R.; Wirtz, J. AI in Marketing, Consumer Research and Psychology: A Systematic Literature Review and Research Agenda. Psychol. Mark. 2022, 39, 755–776. [Google Scholar] [CrossRef]
  81. Huang, Y.-F.; Chen, P.-H. Fake News Detection Using an Ensemble Learning Model Based on Self-Adaptive Harmony Search Algorithms. Expert Syst. Appl. 2020, 159, 113584. [Google Scholar] [CrossRef]
Figure 1. Structured step-by-step methodological approach for integrating AI tools into the literature review synthesis process.
Figure 2. Continuum of function/support of AI use in the literature review.
Table 1. Mapping of Sociotechnical Practices across Methodological Steps.
Practice (from Step 1) | Step 2: Tool Selection | Step 3: Active Use | Step 4: Validation | Step 5: Disclosure
P1: AI as Collaborator | Selecting tools based on their specific collaborative role (e.g., ASReview for screening). | Maintaining “co-piloting” oversight where the AI supports human-led goals. | Auditing AI outputs as initial drafts to be vetted rather than final verdicts. | Confirming that human authorship and responsibility remain central.
P2: Human Judgment | Evaluating tool “fit-for-purpose” to ensure alignment with research objectives. | Exercising judgment during prompt engineering and the interpretation of initial outputs. | Serving as the non-delegable arbiter of validity and analytic rigour. | Contextualising machine-driven insights within the broader theoretical landscape.
P3: Documentation | Logging specific tool configurations and selection justifications. | Maintaining thorough records of specific prompts and search queries used. | Logging validation results, error reviews, and iterative refinements. | Providing the verified raw materials required for formal transparency statements.
P4: Cross-Validation | Identifying tools with built-in features for cross-referencing or citation mapping. | Monitoring for patterns, hidden links, and consistency during active review. | Operationalising parallel manual screening and dual-search redundancy checks. | Detailing specific validation methods and findings to support reproducibility.
P5: Ethical/Copyright | Prioritising tools that offer open-access filters and respect proprietary data. | Actively applying ethical filters during the literature search and retrieval phases. | Monitoring outputs for plagiarism risks or the unauthorised generation of synthetic text. | Adhering to established ethical reporting guidelines such as COPE, APA, and ICMJE.
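To make the mapping in Table 1 operational, a team might encode it as a simple lookup structure and generate a per-step checklist. The sketch below is purely illustrative: the practice labels (P1, P3) and step names follow the table, but the dictionary layout and the `checklist()` helper are our own additions, not part of the paper's method (only two practices are shown for brevity).

```python
# Illustrative encoding of Table 1's practice-to-step mapping.
# Each practice maps a methodological step to the action it prescribes.
PRACTICES = {
    "P1: AI as Collaborator": {
        "Tool Selection": "Select tools based on their specific collaborative role.",
        "Active Use": "Maintain 'co-piloting' oversight where AI supports human-led goals.",
        "Validation": "Audit AI outputs as initial drafts to be vetted.",
        "Disclosure": "Confirm that human authorship and responsibility remain central.",
    },
    "P3: Documentation": {
        "Tool Selection": "Log tool configurations and selection justifications.",
        "Active Use": "Record the specific prompts and search queries used.",
        "Validation": "Log validation results, error reviews, and refinements.",
        "Disclosure": "Provide verified raw materials for transparency statements.",
    },
}

def checklist(step: str) -> list[str]:
    """Return the action each practice prescribes for one methodological step."""
    return [f"{practice}: {steps[step]}"
            for practice, steps in PRACTICES.items() if step in steps]
```

Printing `checklist("Validation")` would list one audit action per practice, which a review team could tick off before moving to disclosure.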
Table 2. Illustrative Capability-Requirement Matrix.
Review Task (Justification) | Required AI Capability | Potential Tools | Sociotechnical Consideration
1. Intellectual Discovery (Scenario 1: Fragmented Literature) | Network analysis; visual mapping; algorithmic dimension reduction. | Research Rabbit, Litmaps. | The technical system reveals “neighbourhoods” of research, while human cross-validation (P4) ensures seminal work is not missed.
2. High-Volume Screening (Scenario 2: Saturated Fields) | Active learning, high throughput, relevance ranking. | ASReview, Rayyan. | Efficiency is maximised, but the researcher (P1) must govern the dataset reduction to prevent algorithmic bias.
3. Conceptual Bridging (Scenario 3: Interdisciplinary Domains) | Semantic similarity detection, cross-domain Q&A, citation context analysis. | Elicit, Scite. | Human judgment (P2) is required to “translate” and interpret terminological variants across different knowledge traditions.
4. Methodological Triage (Scenario 4: Diverse Designs) | Structured data extraction; advanced tagging; epistemological clustering. | Elicit, EPPI-Reviewer. | Robust documentation (P3) is essential to track how diverse methodological contributions were systematically evaluated.
5. Deep Reading & Synthesis (Scenario 5: Neglected Topics) | Multi-repository indexing; source-grounded summarisation; gap identification. | Consensus, NotebookLM. | Ethical use (P5) and human judgment guide the synthesis of sparse data to ensure inclusive and accurate representation.
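The capability-requirement matrix lends itself to a simple lookup: given the scenario a review falls under, retrieve the capabilities to demand and the candidate tools to evaluate. The sketch below is a minimal illustration of that idea; the tool and capability entries come from Table 2, but the data layout and the `tools_for()` helper are hypothetical conveniences, not a prescribed implementation (two scenarios shown for brevity).

```python
# Illustrative capability-requirement matrix keyed by review scenario.
MATRIX = {
    "fragmented literature": {
        "capabilities": ["network analysis", "visual mapping", "dimension reduction"],
        "tools": ["Research Rabbit", "Litmaps"],
    },
    "saturated fields": {
        "capabilities": ["active learning", "high throughput", "relevance ranking"],
        "tools": ["ASReview", "Rayyan"],
    },
}

def tools_for(scenario: str) -> list[str]:
    """Return candidate tools for a scenario, or an empty list if unmapped."""
    return MATRIX.get(scenario.lower(), {}).get("tools", [])
```

The empty-list fallback matters in practice: an unmapped scenario should prompt the researcher to extend the matrix deliberately, not fall back silently on a default tool.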
Table 4. Approaches to cross-checking AI outputs.
Approach | Action | Enhancement Strategy
Parallel manual screening | Select a sample subset of articles that the AI marked as relevant (and irrelevant). | Have human reviewer(s) manually apply the inclusion/exclusion criteria to the same subset *.
Dual search or redundancy check | Run the same query in two or more AI-powered platforms. | Compare lists of retrieved articles and flagged themes. Disagreements should be manually examined by a reviewer to see which tool aligns more closely with the inclusion criteria.
Iterative error review | Maintain a validation log of where the AI made mistakes, such as misclassifying a qualitative study as quantitative. | Feed this back into the process (adjust prompts, retrain filters, refine criteria).
* True positives → AI and human both include (decision: accept); false positives → AI includes but human excludes (decision: review criteria); false negatives → AI excludes but human includes (decision: adjust AI filter); true negatives → AI and human both exclude (decision: accept).
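The parallel manual screening check reduces to a standard confusion matrix over the shared sample, following the four-cell scheme in the table footnote. The sketch below assumes include/exclude decisions stored as booleans per article; the function name and data layout are our own illustration, not a tool or procedure the paper specifies.

```python
# Cross-tabulate AI and human screening decisions (True = include).
def screening_audit(ai: dict[str, bool], human: dict[str, bool]) -> dict[str, int]:
    """Count TP/FP/FN/TN over articles screened by both reviewers."""
    cells = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for article in ai.keys() & human.keys():  # only the shared subset
        a, h = ai[article], human[article]
        if a and h:
            cells["TP"] += 1  # both include -> accept
        elif a and not h:
            cells["FP"] += 1  # AI includes, human excludes -> review criteria
        elif h:
            cells["FN"] += 1  # AI excludes, human includes -> adjust AI filter
        else:
            cells["TN"] += 1  # both exclude -> accept
    return cells

# Toy example: four articles screened by both.
ai_calls = {"a1": True, "a2": True, "a3": False, "a4": False}
human_calls = {"a1": True, "a2": False, "a3": True, "a4": False}
cells = screening_audit(ai_calls, human_calls)
agreement = (cells["TP"] + cells["TN"]) / sum(cells.values())
```

A low agreement rate, or a skew toward false negatives, signals that the AI filter is excluding relevant work and should be retuned before full-scale screening, which feeds naturally into the iterative error review above.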
Table 5. Expectation of disclosure on the use of AI in the literature. ** Ethical standards mandate AI policies, authorship roles, and publication procedures for recommended conduct to ensure transparency and best practice in research.
Ethical Guidelines ** | Expectations | Reference
Committee on Publication Ethics (COPE) | Authors should disclose AI tool usage in the materials and methods (or similar section) of the paper, how the AI tool was used, and which tool was used. Authors are fully responsible for the content of their manuscript, even those parts produced by an AI tool, and are thus liable for any breach of publication ethics. | [75]
American Psychological Association (APA) | APA advises transparent disclosure of AI use, proper citation of AI-generated content, and ensuring AI outputs are validated and aligned with ethical research practice. Authors need to be transparent, maintain ethical approaches, take responsibility, and provide attribution. | [76]
International Committee of Medical Journal Editors (ICMJE) | At submission, the journal should require authors to disclose whether they used AI-assisted technologies (such as LLMs, chatbots, or image creators) in the production of the submitted work. Authors who use such technology should describe, in both the cover letter and the appropriate section of the submitted work, how they used it. Researchers must explicitly confirm that humans, not AI, meet authorship criteria, and disclose potential biases introduced by AI. | [77]