Publisher-Built Generative AI Assistants in U.S. Higher Education: A Critical Review and a Reproducible TRIAD–JTBD Evaluation Framework

Leon, Maikel

doi:10.3390/a19060492

by

Maikel Leon

Department of Business Technology, Miami Herbert Business School, University of Miami, Coral Gables, FL 33146, USA

Algorithms 2026, 19(6), 492; https://doi.org/10.3390/a19060492 (registering DOI)

Submission received: 11 May 2026 / Revised: 9 June 2026 / Accepted: 15 June 2026 / Published: 19 June 2026

(This article belongs to the Special Issue Artificial Intelligence Algorithms and Generative AI in Education (2nd Edition))

Abstract

Artificial intelligence (AI) has reshaped higher education over six decades, evolving from drill-and-practice programs to adaptive cognitive tutors and, most recently, transformer-based generative models. This article presents a critical review of publisher-built generative AI assistants, adopting an explicitly socio-technical perspective that combines a technological lens with a pedagogical one. It makes three contributions. First, it synthesizes the technical and algorithmic evolution of educational AI, from rule-based and expert systems through knowledge tracing and learning analytics to large language models and retrieval-augmented generation, and organizes these mechanisms into a taxonomy. Second, it introduces a reproducible evaluation framework that couples the TRIAD rubric (Trust, Relevance, Impact, Adoption, and Design) with a Jobs-to-Be-Done (JTBD) lens, complete with anchored scoring criteria, an evidence-and-confidence grading scheme, and reported inter-rater reliability. Third, it applies the framework to eleven assistants released by U.S. publishers, distinguishing peer-reviewed evidence from institutional reports and commercial claims. The analysis reflects a mid-2025 snapshot and is presented as a reusable template rather than a static ranking. Findings reveal substantial variation in privacy safeguards, curricular alignment, documented impact, adoption, and usability. The review identifies application scenarios and recommendations for researchers and institutional leaders seeking to guide the responsible integration of AI in higher education.

Keywords:

artificial intelligence in higher education; intelligent tutoring systems; adaptive learning; learning analytics; knowledge tracing; retrieval-augmented generation; generative AI; critical review; evaluation framework

1. Introduction

Artificial Intelligence (AI) has been a recurrent catalyst for pedagogical change in universities. Across successive technological waves, it has also reconfigured core instructional paradigms. Early deployments relied mainly on mechanistic drill-and-practice routines that emulated rote learning. Subsequent generations of intelligent tutoring systems (ITSs) incorporated explicit cognitive architectures and adaptive algorithms. These systems modeled students’ knowledge states and tailored feedback accordingly. The most recent phase centers on transformer-based large language models (LLMs) that support natural language interaction, enabling dialogic learning experiences at unprecedented scale.

This article is a critical review. Its purpose is twofold. The first aim is to provide an integrated historical and technical synthesis that situates contemporary educational technologies within a research continuum spanning more than six decades. Without that context, it is easy to overlook how earlier advances, such as knowledge-space theory and cognitive modeling, inform current design practices. The second aim is to examine the wave of AI-driven products launched by major U.S. educational publishers during the 2024–2025 period. These products raise pressing questions about strategic positioning, academic integrity, and effects on learning outcomes across diverse student populations. Addressing them requires a structured analysis that goes beyond marketing narratives.

Guiding perspective: The review adopts an explicitly socio-technical perspective that integrates two lenses. The technological lens traces the algorithms and architectures that make each generation of tools possible. The educational lens evaluates how those mechanisms serve teaching and learning. Neither lens is sufficient on its own: a technically sophisticated assistant that ignores instructional theory tends to underperform, and a pedagogically sound design constrained by weak underlying mechanisms cannot scale. Treating the assistants as socio-technical systems makes their value contingent on both technical reliability and the educational roles they fulfill.

Terminology: Several near-synonyms are used loosely in industry materials. To keep the analysis precise, the following conventions are adopted throughout. AI assistant is the umbrella term for any publisher-deployed system that supports learning or instruction through AI. An intelligent tutoring system is a system with explicit domain, student, and tutoring models. A tutor is an assistant whose primary function is guided, often Socratic, instruction. A chatbot is a conversational interface, typically built on an LLM, that may or may not be grounded in vetted content. Adaptive system denotes a tool that sequences content using a learner model. Where a generic reference is intended, the term assistant is used.

To frame the investigation, a theoretical perspective that synthesizes lessons from expert systems and ITSs is adopted. Research shows that features such as immediate feedback, guided practice, and adaptivity are central to effective AI tutors. Systematic reviews further indicate that AI-driven tutoring can enhance learning when implemented under appropriate conditions. Building on this foundation, three research questions are posed:

How have AI tutors evolved across successive technological waves, and how can the underlying mechanisms be organized into a taxonomy?
How do publisher-developed generative assistants perform across the TRIAD dimensions when assessed using a reproducible rubric with documented evidence and reliability?
To what extent do these tools fulfill core educational jobs (understanding complex content, generating assessment questions, and providing timely feedback) as conceptualized by the JTBD framework?

Contributions: Relative to prior surveys of AI in education, this review offers three contributions to the state of the art. (i) It consolidates six decades of educational AI into a taxonomy of algorithmic families and maps those families onto educational applications. (ii) It formalizes a transparent, reproducible evaluation framework (TRIAD coupled with JTBD) that includes anchored scoring criteria, an explicit evidence-and-confidence grading scheme, and reported inter-rater reliability, so that the procedure can be reapplied as tools evolve. (iii) It delivers the first comparative, evidence-graded assessment of the 2024–2025 cohort of publisher-built assistants, from which application scenarios and institutional recommendations are derived.

Structure

The remainder of this article proceeds as follows. Section 2 introduces the theoretical foundations, integrating the literature on expert systems, ITSs, and responsible AI, and synthesizes the technical evolution of educational AI into a taxonomy of algorithmic families. Section 3 is a dedicated Materials and Methods Section that describes the search protocol, inclusion and exclusion criteria, the tool-selection procedure, the evidence-and-confidence grading scheme, and the reproducible TRIAD and JTBD scoring methodologies, including inter-rater reliability. Section 4 traces the historical evolution of AI in U.S. higher education from the 1950s through the 2020s. Section 5 catalogs generative AI tools developed by leading academic publishers and organizes them by interface and architecture. Section 6 presents a comparative analysis using the TRIAD framework and the JTBD lens. Section 7 reports the scored results, the mechanism-by-application cross-analysis, and derived application scenarios, and discusses emerging technical and pedagogical trends. Section 8 synthesizes the findings, contrasts publisher-provided tools with campus-specific solutions, and reflects on ethical and societal implications. The article concludes with evidence-based recommendations for researchers, educators, and institutional leaders.

2. Theoretical Foundations and Technical Evolution

This evaluation is grounded in the literature on expert systems, intelligent tutoring systems (ITSs), and responsible artificial intelligence (AI). Decades of ITS research emphasize that the educational effectiveness of AI tutors depends not merely on technical sophistication but on alignment with instructional theory. A recent analysis notes that key features, namely immediate feedback, guided practice, and adaptivity, are grounded in cognitive theory and yield positive learning outcomes only when implemented under the right conditions [1]. Likewise, systematic reviews of AI-driven tutoring systems report generally positive effects on learning. Those reviews also stress that the gains often require sustained interventions and careful attention to context [2].

Expert systems historically focused on knowledge representation and inference. In educational domains, rule-based tutors such as Cognitive Tutor Algebra I employ explicit domain models to deliver individualized instruction. Longitudinal trials show that such systems can produce significant improvements in student proficiency when deployed over multiple years [3]. More recent developments merge deep learning and generative modeling with human-authored content, which raises questions about transparency, fairness, and privacy. Ethical guidance from the United Nations Educational, Scientific, and Cultural Organization (UNESCO), the U.S. National Institute of Standards and Technology (NIST), and the EdSAFE AI Alliance calls for human-centered design, data minimization, accountability, and explainability [4,5,6].

2.1. Technical and Algorithmic Evolution: A Taxonomy of Mechanisms

Because the target venue emphasizes algorithms, it is useful to make the technical lineage of educational AI explicit before evaluating present-day products. The mechanisms that power educational AI fall into six families that emerged in overlapping waves. Figure 1 arranges them along a trajectory of increasing capability and, notably, decreasing transparency.

The first family, rule-based and expert systems, encodes domain knowledge as production rules and semantic networks. Systems such as SCHOLAR and GUIDON, and later ACT-R cognitive tutors, used model tracing to compare student actions against an explicit expert model. Their behavior is fully inspectable, but authoring is labor-intensive and brittle outside well-formalized domains. The second family, adaptive and knowledge-space methods, represents a learner’s mastery as a state in a partially ordered knowledge space and sequences content toward the next learnable topic; ALEKS is the canonical example [7]. The third family, statistical and probabilistic methods, includes Bayesian knowledge tracing, educational data mining (EDM), and learning analytics, which infer latent mastery and predict outcomes from interaction logs [8,9]. The fourth family, deep learning, applies neural networks to tasks such as deep knowledge tracing, affect detection, and student modeling [10]. The fifth family, transformer-based large language models (LLMs), uses self-attention to support open-ended dialogic tutoring; models such as BERT (Bidirectional Encoder Representations from Transformers) and the Generative Pre-trained Transformer (GPT) series anchor most current products [11,12,13]. The sixth family, retrieval-augmented generation (RAG) and hybrid architectures, grounds an LLM in vetted course content to reduce hallucination and to align responses with learning objectives. This last family is the dominant design pattern among the publisher tools reviewed here, and it directly addresses the transparency deficit introduced by pure LLMs.

These families are not mutually exclusive. Modern assistants frequently combine an LLM front end with retrieval over a curated corpus and analytics for progress monitoring so that a single product may instantiate several mechanisms at once. Table 1 summarizes each family, its representative technique, the principal educational function it enables, and its characteristic transparency profile. This taxonomy provides the technical vocabulary used in the comparative analysis and in the mechanism-by-application cross-analysis presented in Section 7.
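
To make the third family concrete, the following is a minimal sketch of the standard Bayesian knowledge tracing update. It is an illustration only and is not drawn from any reviewed product. The class name `BKTParams`, the function names, and the parameter values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BKTParams:
    """Illustrative parameter values; real systems fit these from interaction logs."""
    p_init: float   # prior probability that the skill is already mastered
    p_learn: float  # probability of acquiring the skill at each practice opportunity
    p_slip: float   # probability of an incorrect answer despite mastery
    p_guess: float  # probability of a correct answer without mastery


def bkt_update(p_mastery: float, correct: bool, params: BKTParams) -> float:
    """Return the mastery estimate after observing one response."""
    if correct:
        numerator = p_mastery * (1 - params.p_slip)
        denominator = numerator + (1 - p_mastery) * params.p_guess
    else:
        numerator = p_mastery * params.p_slip
        denominator = numerator + (1 - p_mastery) * (1 - params.p_guess)
    posterior = numerator / denominator  # Bayes' rule given the observed response
    # The learner may also acquire the skill during this opportunity.
    return posterior + (1 - posterior) * params.p_learn


def trace(responses, params: BKTParams):
    """Run the update over a sequence of correct/incorrect responses."""
    p = params.p_init
    history = []
    for correct in responses:
        p = bkt_update(p, correct, params)
        history.append(p)
    return history


# Hypothetical parameters and response sequence, for illustration only.
params = BKTParams(p_init=0.3, p_learn=0.2, p_slip=0.1, p_guess=0.2)
for step, estimate in enumerate(trace([True, False, True, True], params), start=1):
    print(f"after response {step}: P(mastery) = {estimate:.3f}")
```

The estimate rises after a correct response and falls after an incorrect one, which is the latent-mastery inference the statistical family relies on.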

2.2. The TRIAD Dimensions

The TRIAD framework is a pragmatic rubric derived from responsible AI principles and educational technology evaluation. It operationalizes those principles alongside curricular relevance and usability across five dimensions. The conceptual meaning of each dimension is given below; the anchored scoring criteria used to apply it are presented in the Materials and Methods Section (Section 3).

Trust refers to the degree to which the tool protects student privacy, provides transparency and explainability, mitigates bias, and complies with legal standards such as the Family Educational Rights and Privacy Act (FERPA). International guidance for generative AI in education emphasizes a human-centered approach that protects data privacy and ensures ethical validation. Risk-management guidance from NIST incorporates trustworthiness into the design, development, use, and evaluation of AI systems [4]. The SAFE Benchmarks framework similarly emphasizes safety, accountability, fairness, and transparency in educational technology (edtech) [5]. Tools that provide clear response provenance, limit hallucinations, and allow human oversight score higher.
Relevance describes alignment with curriculum standards and the extent to which the assistant integrates with the publisher’s content and learning platforms. This dimension assesses whether the AI enhances or detracts from instructional objectives.
Impact refers to evidence of improved learning outcomes, engagement, or efficiency. When empirical studies are unavailable, impact is inferred from product claims, user uptake, and alignment with best practices such as Socratic guidance and adaptive feedback.
Adoption captures the extent of institutional and user uptake, including ease of onboarding. Adoption is influenced by self-efficacy, subjective norms, perceived enjoyment, facilitating conditions, and system accessibility [14]. Widespread usage, positive feedback, and institutional support increase the score.
Design concerns usability, accessibility, and responsiveness. Inclusive design draws on frameworks such as Universal Design for Learning (UDL), which encourages multiple means of engagement, representation, and expression to remove barriers for diverse learners [15]. Interfaces that offer intuitive interactions, support learner variability, and provide high-quality feedback score higher.

Each dimension is scored on a 1–10 scale. The scores are relative: a higher score indicates stronger performance than peers.

2.3. The Jobs-to-Be-Done Lens

The analysis also draws on the Jobs-to-Be-Done (JTBD) framework from innovation theory. JTBD posits that individuals and organizations “hire” products or services when specific circumstances arise, and that each job has functional, social, and emotional dimensions [16]. The framework focuses on the underlying job rather than on demographic characteristics. In an educational context, students hire an AI assistant to understand complex concepts, to generate practice questions, or to receive immediate feedback outside class. Instructors hire one to streamline assessment creation and to personalize instruction. This lens complements TRIAD: TRIAD measures the quality and responsibility of an assistant, whereas JTBD measures how well it fits the tasks users actually need to accomplish.

3. Materials and Methods

This section documents how the review was conducted and how the assistants were evaluated, so that the procedure can be audited and reapplied. It describes the review type and protocol; the search strategy and source selection; the criteria used to include tools; the scheme for grading evidence and confidence; the operational TRIAD and JTBD scoring procedures; the rater process and reliability analysis; and the temporal scope of the findings.

3.1. Review Type and Protocol

This work is a critical review with a structured search component. It does not propose a new algorithm; its contributions are the synthesis of educational AI, the formalization of a reproducible evaluation framework, and the evidence-based application of that framework to current products. The review followed a written protocol with four stages adapted from PRISMA: identification, screening, eligibility, and inclusion. The protocol defined the search sources, the date range, the inclusion and exclusion criteria, the tool selection rule, and the scoring procedure before data extraction began. The PRISMA structure is borrowed to make the search transparent and reproducible. The article is nonetheless positioned as a critical review with a structured scoping-style search, not as a systematic review in the strict PRISMA sense: it does not register a review protocol, does not restrict its evidence base to study-level findings, and does not perform a formal risk-of-bias meta-synthesis. Readers should therefore interpret Figure 2 as a transparent account of how the literature was assembled rather than as the basis of an aggregated quantitative synthesis.

3.2. Search Strategy and Source Selection

Peer-reviewed articles, conference proceedings, monographs, and government reports addressing AI in higher education from 1956 to mid-2025 were sought in five databases: Scopus, Web of Science, IEEE Xplore, the ACM Digital Library, and ERIC, supplemented by Google Scholar for citation chaining. Search strings combined a technology facet with an education facet, for example: (“intelligent tutoring” OR “adaptive learning” OR “knowledge tracing” OR “large language model” OR “generative AI”) AND (“higher education” OR university OR undergraduate). Inclusion criteria were: (i) relevance to AI mechanisms or tools used in post-secondary teaching and learning; (ii) English language; (iii) full-text availability; and (iv) publication between 1956 and mid-2025. Exclusion criteria were: off-topic records, opinion pieces without methodological or empirical content, and redundant secondary coverage of a primary source already included. Priority was given to high-impact journals and to seminal, widely cited works. Figure 2 reports the flow of records through the four stages and the counts retained at each step.
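
The faceted search string can be assembled programmatically so that the same query is reissued unchanged across databases. The sketch below is one possible way to do this. The helper names are hypothetical, and database-specific field tags or syntax variations are not handled.

```python
def quote(term: str) -> str:
    """Wrap multi-word phrases in double quotes; leave single words bare."""
    return f'"{term}"' if " " in term else term


def or_group(terms) -> str:
    """Join the terms of one facet with OR inside parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_query(*facets) -> str:
    """Combine facets with AND so that every facet must match."""
    return " AND ".join(or_group(f) for f in facets)


# Facet terms taken from the example search string in the text.
TECHNOLOGY = [
    "intelligent tutoring",
    "adaptive learning",
    "knowledge tracing",
    "large language model",
    "generative AI",
]
EDUCATION = ["higher education", "university", "undergraduate"]

query = build_query(TECHNOLOGY, EDUCATION)
print(query)
```

Keeping the facets as lists makes it straightforward to log the exact terms used and to extend a facet when the search is repeated later.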

3.3. Tool Inclusion and Selection

A tool was eligible for the comparative evaluation if it satisfied four conditions: it was (i) an AI assistant for teaching or learning, (ii) built or commissioned by a U.S. academic publisher or a closely comparable provider, (iii) generally available or in a documented pilot as of mid-2025, and (iv) documented in enough detail to support scoring on every TRIAD dimension. Applying these conditions yielded eleven assistants from eight providers (Cengage, Khan Academy, Macmillan Learning, McGraw Hill, Pearson, Wiley, Quizlet, and Chegg). Khan Academy is included as an influential reference point even though it is not a traditional textbook publisher; this choice is noted so that readers can weigh it.

3.4. Evidence and Confidence Grading

Because the available information ranges from peer-reviewed trials to vendor marketing, each tool’s evidence base was graded by source type, and a confidence level was attached to its overall evaluation. Four source types were distinguished in descending order of evidential weight: independent peer-reviewed research; institutional or governmental reports; vendor product documentation; and commercial or news claims. Confidence was rated High when independent research substantiated the central claims, Moderate when product documentation was corroborated by at least one independent source or by early empirical data, and Low when the evidence rested largely on vendor or news material. Table 2 records the primary evidence basis and the confidence level for each tool. This grading directly distinguishes ratings anchored in independent research, such as ALEKS, from those inferred largely from marketing, such as the Wiley AI Tutor, and it should be read alongside the scores in Section 7.
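
The confidence rule above can be written as a small decision function. The sketch below is a simplified reading of that rule, not the procedure actually used to build Table 2. It assumes that peer-reviewed research and institutional or governmental reports both count as independent sources, and all names are hypothetical.

```python
from enum import Enum


class Source(Enum):
    """The four source types, listed in descending order of evidential weight."""
    PEER_REVIEWED = "independent peer-reviewed research"
    INSTITUTIONAL = "institutional or governmental report"
    VENDOR_DOCS = "vendor product documentation"
    COMMERCIAL_NEWS = "commercial or news claim"


# Assumption: these two source types are treated as independent of the vendor.
INDEPENDENT = {Source.PEER_REVIEWED, Source.INSTITUTIONAL}


def grade_confidence(sources, research_substantiates_claims=False,
                     early_empirical_data=False) -> str:
    """Map a tool's evidence base to High, Moderate, or Low confidence."""
    sources = set(sources)
    # High: independent research substantiates the central claims.
    if Source.PEER_REVIEWED in sources and research_substantiates_claims:
        return "High"
    # Moderate: some independent corroboration or early empirical data.
    if sources & INDEPENDENT or early_empirical_data:
        return "Moderate"
    # Low: the evidence rests largely on vendor or news material.
    return "Low"


# Hypothetical evidence profiles, for illustration only.
print(grade_confidence({Source.PEER_REVIEWED, Source.VENDOR_DOCS},
                       research_substantiates_claims=True))
print(grade_confidence({Source.VENDOR_DOCS, Source.INSTITUTIONAL}))
print(grade_confidence({Source.VENDOR_DOCS, Source.COMMERCIAL_NEWS}))
```

Judging whether research actually substantiates a tool's central claims remains a rater decision; the function only makes the downstream mapping explicit.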

3.5. TRIAD Scoring Procedure

Each TRIAD dimension was rated on a 1–10 scale using fixed anchors. Scores of 1–3 indicate limited evidence or non-compliance; 4–6 indicate moderate performance with notable gaps; 7–8 indicate strong alignment with responsible AI and instructional quality; and 9–10 indicate exemplary practice. Dimension-specific anchors were defined as follows. For Trust, the anchor was the presence of explicit privacy safeguards, explainable outputs, bias mitigation through content grounding, and human oversight; tools lacking published safeguards scored 1–3, and tools with comprehensive adherence to international AI-ethics guidance scored 9–10. For Relevance, the anchor was the depth of curricular alignment and integration with vetted content; generic support scored low, whereas deep integration with publisher content and learning management systems (LMSs) scored high. For Impact, the anchor was the strength of evidence for improved outcomes; no published evidence scored 1–3, pilot or anecdotal benefits 4–6, survey or small-scale studies 7–8, and peer-reviewed evaluations of significant learning gains 9–10. For Adoption, the anchor was the breadth of institutional and user uptake; limited pilots scored low, and sustained multi-institution use scored high. For Design, the anchor was usability, accessibility, and responsiveness; minimal interfaces scored low, and exemplary universal design scored high. As a worked example, the Cengage Student Assistant received a Trust score of 8 because it limits hallucination risk through discipline-specific grounding, emphasizes academic integrity, and provides oversight controls; its Relevance and Impact scores of 8 and 7 reflect close alignment with course content and early engagement evidence, while moderate Adoption (6) and solid Design (7) yield a total of 36.
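
The arithmetic of the worked example can be checked with a short script. The sketch below encodes the four general score bands and sums the five dimension scores. The function names are hypothetical, and the unweighted sum is inferred from the worked example, not from a stated weighting rule.

```python
DIMENSIONS = ("Trust", "Relevance", "Impact", "Adoption", "Design")


def band(score: int) -> str:
    """Return the general anchor band for a 1-10 score."""
    if not 1 <= score <= 10:
        raise ValueError(f"score must be between 1 and 10, got {score}")
    if score <= 3:
        return "limited evidence or non-compliance"
    if score <= 6:
        return "moderate performance with notable gaps"
    if score <= 8:
        return "strong alignment"
    return "exemplary practice"


def triad_total(scores: dict) -> int:
    """Sum the five dimension scores (maximum 50), validating each one."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for d in DIMENSIONS:
        band(scores[d])  # raises if a score is out of range
    return sum(scores[d] for d in DIMENSIONS)


# Scores for the Cengage Student Assistant as given in the worked example.
cengage = {"Trust": 8, "Relevance": 8, "Impact": 7, "Adoption": 6, "Design": 7}

for dimension in DIMENSIONS:
    print(f"{dimension}: {cengage[dimension]} ({band(cengage[dimension])})")
print("Total:", triad_total(cengage))
```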

3.6. JTBD Scoring Procedure

For the JTBD matrix, operational criteria rated each assistant’s ability to fulfill three core jobs: understanding complex content, generating assessment questions, and providing timely feedback. A rating of High indicates explicit, dedicated features that directly accomplish the job, such as step-by-step explanations, automated question generation, or around-the-clock chat support. Moderate indicates partial or auxiliary support. Low indicates minimal or no functionality relative to the job. Ratings were assigned from documented features, usage policies, and product demonstrations, so that the High/Moderate/Low labels rest on transparent decision rules rather than impressions.

3.7. Raters, Reliability, and Reproducibility

Two researchers with expertise in educational technology and machine learning independently scored every tool on all TRIAD dimensions and JTBD jobs using the anchored criteria above. Independent ratings were recorded before any discussion. Agreement was then quantified, and remaining differences were reconciled by consensus to produce the values reported in Section 7. Because the scores are bounded ordinal ratings that cluster in a narrow band, agreement is reported with several complementary statistics rather than a single coefficient: exact percent agreement, percent agreement within one point, quadratic-weighted Cohen’s kappa, and a two-way intraclass correlation coefficient, ICC(2,1). Table 3 reports these per dimension and overall. Across all dimensions, raters agreed within 1 point in 100% of cells and exactly in 72.7% of cells; the overall quadratic-weighted kappa was 0.86, and ICC(2,1) was 0.87, indicating strong reliability. For the Impact dimension, weighted kappa is lower (0.45) despite 100% within-one agreement. This is the well-known base-rate paradox of kappa: when ratings have little marginal variance, as they do for Impact (scores cluster tightly at 7–8), kappa is deflated even when raters agree closely. The percent-agreement and ICC values are therefore the more informative indicators for that dimension.
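
The four agreement statistics can be computed from two raters' score lists with standard formulas. The following is a minimal standard-library sketch and not the author's script. The function names are hypothetical and the ratings are invented for illustration. ICC(2,1) is taken here in the two-way random-effects, absolute-agreement, single-rater form.

```python
def _check(a, b):
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("both raters must score the same two or more cells")


def exact_agreement(a, b):
    _check(a, b)
    return sum(x == y for x, y in zip(a, b)) / len(a)


def within_one_agreement(a, b):
    _check(a, b)
    return sum(abs(x - y) <= 1 for x, y in zip(a, b)) / len(a)


def quadratic_weighted_kappa(a, b):
    """1 minus observed over chance-expected mean squared disagreement."""
    _check(a, b)
    n = len(a)
    observed = sum((x - y) ** 2 for x, y in zip(a, b)) / n
    # Chance expectation pairs every score from rater A with every score from rater B.
    expected = sum((x - y) ** 2 for x in a for y in b) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when all scores are identical")
    return 1 - observed / expected


def icc_2_1(a, b):
    """Two-way random-effects, absolute-agreement, single-rater ICC for two raters."""
    _check(a, b)
    rows = list(zip(a, b))
    n, k = len(rows), 2
    grand = (sum(a) + sum(b)) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(a) / n, sum(b) / n]
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)  # between cells
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)  # between raters
    sse = sum((x - row_means[i] - col_means[j] + grand) ** 2
              for i, r in enumerate(rows) for j, x in enumerate(r))
    mse = sse / ((n - 1) * (k - 1))                               # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Invented ratings clustered at 7-8, not the study's data.
rater_a = [7, 8, 7, 8]
rater_b = [7, 8, 8, 8]
print("exact:", exact_agreement(rater_a, rater_b))
print("within one:", within_one_agreement(rater_a, rater_b))
print("weighted kappa:", quadratic_weighted_kappa(rater_a, rater_b))
print("ICC(2,1):", round(icc_2_1(rater_a, rater_b), 3))
```

With these invented scores, within-one agreement is 100% while weighted kappa is only 0.5, which illustrates how a narrow rating band deflates kappa even when raters agree closely.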

Having only two expert raters is a limitation, and small panels can introduce bias even when reliability is high. Three design choices mitigate this risk. First, the anchored rubric in Sections 3.5 and 3.6 constrains judgment to documented criteria, which reduces idiosyncratic scoring. Second, the evidence-and-confidence grading in Table 2 makes explicit where a score rests on independent research versus vendor material so that a reader can discount low-confidence ratings. Third, and most important, the framework is presented as a reusable template rather than as a definitive ranking: the rubric, anchors, and scoring sheet are specified in enough detail that other evaluators can reapply them, expand the panel, and update the scores as tools evolve. All quantitative procedures, including the reliability statistics, were implemented in Python 3.12.6. Replacing the independent rating sheet with additional raters’ scores regenerates Table 3 without further changes.

3.8. Temporal Scope

Generative AI products change rapidly, often on a monthly basis. Every finding in this article reflects a mid-2025 snapshot, and all tables carry that snapshot date. The contribution is therefore the evaluation method and the analysis of a defined cohort at a defined time, not a durable leaderboard. Readers applying the framework later should re-extract the underlying evidence before reusing any specific score.

4. Historical Evolution of AI in Higher Education

The decade-by-decade account below traces how the algorithmic families of Section 2.1 reached the classroom. Two caveats frame it. First, the decade headings are an organizing convenience, not strict boundaries: techniques develop continuously and diffuse slowly, so the same idea often spans several periods, as the entries for cognitive modeling and for learning analytics in the tables below illustrate. Second, the historical material is not offered as a direct cause of present-day publisher strategy. Its purpose is to show that today’s generative assistants recombine long-standing components, namely explicit domain models, adaptive sequencing, knowledge tracing, and dialogic interaction, now delivered through LLMs. That lineage explains why retrieval-augmented designs, which graft transparency-enhancing grounding onto otherwise opaque models, have become the preferred architecture among the publisher tools examined in Section 5. The historical record thus motivates the evaluation criteria rather than the commercial timing of any single product.

4.1. 1950s and 1960s: Foundations and Early Experiments

Modern AI research was formally inaugurated at the 1956 Dartmouth Summer Research Project, organized by pioneers who defined the goal of making machines simulate human intelligence. This workshop is widely regarded as the birth of the field [17]. In the 1960s, computer-assisted instruction (CAI) emerged. The PLATO system, created in 1960 at the University of Illinois at Urbana-Champaign, provided time-shared computer terminals through which students could access instructional materials [18]. By the early 1970s, PLATO supported 1000 simultaneous users over 1200 bps connections and fostered one of the first online communities [18]. Table 4 summarizes key milestones of the 1950s and 1960s.

4.2. 1970s: Intelligent Tutoring Emerges

The limitations of ad hoc CAI led researchers to explore AI techniques to develop more adaptive and interactive systems. In 1970, the SCHOLAR system was introduced as a pioneering intelligent tutoring system (ITS) that utilized a semantic network to store domain knowledge and engage in mixed-initiative dialogue [23]. The information-structure-oriented CAI approach, based on an information network rather than preprogrammed frames, enabled the system to generate questions, answers, and feedback on the fly. SCHOLAR demonstrated that an ITS could detect misspellings, answer students’ questions, and dynamically adapt content [23].

Another strand of research applied cognitive psychology to education. A seminal 1984 paper reported that students tutored one-on-one achieved performance two standard deviations better than that of students receiving conventional classroom instruction, highlighting the “two sigma” problem and motivating researchers to build systems that emulate human tutors [24]. Throughout the 1970s, early ITS prototypes such as GUIDON (based on medical diagnostics) and BIP experimented with expert systems to teach domain knowledge. Table 5 lists the major developments of the 1970s in AI tutoring.

4.3. 1980s: Cognitive Models and Rule-Based Tutors

By the early 1980s, advances in cognitive psychology and artificial intelligence converged to produce more sophisticated ITSs. Researchers introduced student models that represented learners’ knowledge states and misconceptions. ACT theory laid the groundwork for cognitive tutors that model procedural skills through production rules. Early systems, such as LISP Tutor and Geometry Tutor, incorporated rule-based models and adaptive feedback. An influential 1990 overview summarized the architecture of ITSs and identified four main components: domain model, student model, tutoring model, and user interface [29]. The 1980s also saw the emergence of adaptive hypermedia and rule-based system shells such as HERACLES and MENO. Table 6 summarizes highlights of AI and tutoring research in the 1980s.

4.4. 1990s: Commercial Adaptive Platforms

The 1990s marked a transition from research prototypes to scalable educational products. Cognitive Tutors, developed at Carnegie Mellon University, applied ACT-R production rules to secondary mathematics and were commercialized by Carnegie Learning. Empirical studies demonstrated that students using Cognitive Tutor algebra curricula achieved significant learning gains compared with conventional instruction. Another major contribution was ALEKS (Assessment and Learning in Knowledge Spaces), launched in 1996. ALEKS applies knowledge space theory to adaptively diagnose student knowledge and select the next best topic, allowing students to progress efficiently [7]. The platform has been continuously developed for over 25 years and remains widely used. Early attempts at intelligent student assistants also emerged, such as AutoTutor, which engages learners in natural language dialogue. Table 7 summarizes key developments in AI tutoring during the 1990s.

4.5. 2000s: Learning Analytics and MOOCs

The early 2000s saw the confluence of web technologies, data mining, and online education. Researchers leveraged large datasets from learning management systems (LMSs) to identify at-risk students and personalize interventions. The term “learning analytics” gained prominence. Researchers argued that data-driven analytics could transform higher education by enabling evidence-based decision-making [9]. A survey of educational data mining and learning analytics noted that distance education generates rich, traceable data that can be used to model engagement and predict persistence [8]. Massive open online courses (MOOCs) emerged in 2012, bringing scale but also challenges of attrition and engagement. During this period, companies like Knewton and Smart Sparrow introduced adaptive learning platforms for higher education. Table 8 summarizes the major developments in AI and learning analytics during the 2000s.

4.6. 2010s: Intelligent Assistants and Deep Learning

During the 2010s, AI tutors began to incorporate natural language processing and deep learning. A case study of an online AI course introduced a virtual teaching assistant named Jill Watson in 2016; students did not realize it was an AI until the end of the semester [49]. Subsequent descriptions detailed how Jill Watson responded autonomously to introductions and FAQs and posted announcements [50]. The decade also saw advances in affective computing and multimodal analytics. Researchers developed models to detect students’ emotions from facial expressions and physiological signals, enabling adaptive interventions. Deep neural networks were applied to knowledge tracing and student modeling, yielding algorithms such as Deep Knowledge Tracing. Table 9 presents key AI developments in higher education during the 2010s.

4.7. 2020s: Generative AI and LLMs

The current decade has witnessed a surge in the development of generative AI tools. Transformer models, such as BERT and GPT-3, paved the way for large language models (LLMs) capable of generating coherent text, code, and dialogue. Public awareness of generative chatbots rose sharply when ChatGPT became widely accessible in late 2022 and early 2023 [53,54]. Government reports have noted that generative AI can write essays, create lesson plans, and personalize assignments, while also raising concerns about surveillance and algorithmic discrimination [54,55,56]. Publishers quickly incorporated LLMs into their platforms to offer chat-based tutoring, content generation, and study aids. Section 5 examines these tools in detail. Table 10 highlights AI developments in higher education from 2020 through 2025.

Recent research in “Computers and Education: Artificial Intelligence” highlights both the potential and the complexity of deploying generative AI tools in higher education. One study, conducted in Ghana, investigated how interacting with ChatGPT influences undergraduate cognitive skills using a mixed-methods pretest-posttest design with a control group [67]. It found that using ChatGPT significantly improved students’ critical, creative, and reflective thinking skills, illustrating the capacity of conversational models to scaffold higher-order cognition [67].

A separate 2024 survey of 5894 students across Swedish universities evaluated perceptions and usage of AI chatbots [68]. The survey reported high awareness and generally positive attitudes toward ChatGPT, yet noted significant differences across gender, academic level, and field of study, with female and humanities students expressing greater skepticism and concern about the role of AI [68]. These findings underscore the need for context-sensitive adoption strategies and suggest that demographic factors should inform the design and deployment of AI-driven educational tools.

5. AI-Powered Educational Tools by Academic Publishers

This section catalogs the AI-driven educational tools offered by major U.S. academic publishers as of mid-2025. Each subsection summarizes a publisher’s products, including launch dates, target markets, key features, and limitations, and a table records the salient details. Before the per-publisher catalog, it is useful to organize the cohort along the two dimensions that most sharply distinguish the tools, as they vary widely in their interfaces, underlying algorithms, motivations, and foundation models.

5.1. Organizing Dimensions: Interface and Architecture

The eleven assistants differ along two axes. The first is the user interface and delivery modality. The most common pattern, followed by five of the eleven tools, is a conversational chatbot embedded inside the publisher’s own platform: Cengage’s Student Assistant in MindTap, Macmillan’s Achieve AI Tutor, Pearson’s AI Study Tools in MyLab and Pearson+, McGraw Hill’s AI Reader in Connect and GO, and CheggMate in Chegg Study all follow this in-platform conversational model. Two student-facing tools depart from it. The Wiley AI Tutor is delivered through a consumer messaging app (WhatsApp) rather than an LMS, and Quizlet’s Q-Chat is woven into a flashcard study workflow rather than presented as a free-form chat. A further distinct modality is the instructor-facing generator, exemplified by the Macmillan iClicker AI Question Creator and the Pearson AI Instructor Tool (Section 5.6), whose interfaces are authoring tools rather than student tutors. Khanmigo and ALEKS represent, respectively, a standalone conversational tutor and an adaptive assessment engine.

The second axis is the underlying architecture. ALEKS is not generative; it is an adaptive engine grounded in knowledge-space theory. Khanmigo, CheggMate, and Q-Chat are built on general-purpose LLMs (the GPT family or the OpenAI application programming interface, API). The remaining publisher tools follow the retrieval-augmented pattern, pairing an LLM with retrieval over vetted publisher content so that responses stay aligned with course material. Table 11 summarizes this classification, which the comparative analysis in Section 6 builds upon.
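To make the retrieval-augmented pattern concrete, the following minimal Python sketch illustrates its retrieval and prompt-assembly steps. It is not drawn from any publisher's implementation: the passages, the bag-of-words cosine similarity (production systems typically rely on learned embeddings), and the function names `retrieve` and `build_prompt` are illustrative assumptions, and the call to an LLM is deliberately omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for vetted publisher passages (not real product content).
PASSAGES = [
    "Opportunity cost is the value of the next best alternative that is given up.",
    "Classical conditioning pairs a neutral stimulus with an unconditioned stimulus.",
    "A production possibilities frontier shows the output combinations an economy can produce.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, passages, k=1):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True
    )
    return ranked[:k]


def build_prompt(question, passages, k=1):
    """Assemble a prompt that restricts the model to the retrieved passages."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return (
        "Answer using only the course passages below. "
        "Guide the student with questions rather than giving the final answer.\n"
        f"Passages:\n{context}\n"
        f"Student question: {question}"
    )


prompt = build_prompt("What is opportunity cost?", PASSAGES)
print(prompt)
```

Because the prompt carries only retrieved passages, the model's answer is steered toward vetted material, which is the alignment property this architecture is meant to provide.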

5.2. Cengage

Cengage entered the generative AI space with the Student Assistant, an in-platform chatbot integrated into the MindTap learning environment. The tool is discipline-specific: the underlying model is trained on course content and pedagogy so that it provides prompts and feedback without simply giving away solutions [61]. The Student Assistant emphasizes critical thinking and academic integrity through Socratic questioning and is available 24/7 for just-in-time help [61]. A beta launch in Fall 2024 targeted four courses in Management, Organizational Behavior, Psychology, and Economics, with expansion across disciplines planned for 2025 [61]. Table 12 outlines the features and limitations of the Cengage Student Assistant.

5.3. Khan Academy

Although not a traditional publisher, Khan Academy’s Khanmigo provides an influential reference for generative AI tutoring. The GPT-4-powered assistant, launched in March 2023, converses with learners using Socratic questions and hints across various subjects, including mathematics, science, and humanities. Khanmigo can role-play historical figures to enrich engagement and integrates teacher tools for lesson planning, rubric creation, and progress summaries. Safety measures include logging all interactions with minors and using a second AI to filter inappropriate content. Access remains limited to pilot schools and paid subscribers and is primarily oriented toward K-12 learners [58]. Table 13 summarizes the highlights and limitations of Khanmigo.

5.4. Macmillan Learning

Macmillan offers two AI tools: the Achieve AI Tutor and the iClicker AI Question Creator. The Achieve AI Tutor, launched in 2023 and rolled out widely in 2024–2025, serves as an on-demand homework helper that provides step-by-step guidance through Socratic questioning. It is available in roughly 80 courses, predominantly in science, technology, engineering, and mathematics (STEM) disciplines, and must be enabled at the instructor’s discretion. Surveys indicate that the tool increases student confidence and engagement. However, the tutor avoids giving direct answers, and its scope remains limited to courses in which it has been trained [62,63]. Table 14 compares Macmillan’s AI tools.

The iClicker AI Question Creator, launched in February 2024, generates up to 50 customized quiz or polling questions based on instructor-specified topics, difficulty, and taxonomy levels. By producing unique, non-searchable questions, it aims to enhance academic integrity and promote active learning. Instructors must review AI-generated content for accuracy; the tool currently functions best for formative assessments rather than high-stakes exams [63].

5.5. McGraw Hill

McGraw Hill’s contributions span decades. Its long-standing ALEKS platform, launched in 1996, remains a pioneering adaptive learning and assessment system based on knowledge space theory. ALEKS diagnoses what a student knows and selects the next appropriate topic, enabling self-paced mastery. Research shows it can reduce assessment time and improve learning efficiency [7]. In 2024, McGraw Hill announced two generative AI tools: the AI Reader and enhancements to ALEKS. The AI Reader allows students to highlight text in eBooks and request simplified explanations or practice questions. It integrates into the Connect and GO platforms and is intended to promote active reading. Availability is currently limited to select textbook titles [64]. As part of the same announcement, McGraw Hill described future generative capabilities within ALEKS, though details remain sparse. Table 15 describes McGraw Hill’s AI tools.
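The idea of selecting the next appropriate topic can be illustrated with the outer fringe from knowledge space theory: the items that, added one at a time to a learner's current knowledge state, yield another feasible state. The sketch below is a toy illustration under stated assumptions; the three items and five states are hypothetical, and it does not reproduce ALEKS's production algorithms, which are not described in the sources reviewed here.

```python
# Toy knowledge structure: each feasible knowledge state is a set of mastered items.
# Items and states are hypothetical; here both "fractions" and "integers" must be
# mastered before "linear_equations" becomes learnable.
STATES = {
    frozenset(),
    frozenset({"fractions"}),
    frozenset({"integers"}),
    frozenset({"fractions", "integers"}),
    frozenset({"fractions", "integers", "linear_equations"}),
}

# The full item domain is the union of all feasible states.
ITEMS = frozenset().union(*STATES)


def outer_fringe(state, states=STATES):
    """Items the learner is ready to learn next.

    An item q belongs to the outer fringe of a state K when q is not in K
    and K plus q is itself a feasible state.
    """
    state = frozenset(state)
    return sorted(q for q in ITEMS - state if state | {q} in states)


print(outer_fringe({"fractions"}))              # ['integers']
print(outer_fringe({"fractions", "integers"}))  # ['linear_equations']
```

An adaptive engine of this kind needs no text generation: once the learner's state has been diagnosed, the candidate topics follow from the structure itself.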

5.6. Pearson

Pearson launched AI study tools in 2023 and expanded them in 2024 to dozens of courses. These tools embed generative AI into e-textbooks and homework platforms, providing personalized Q&A, step-by-step problem-solving, syllabus-based study plans, interactive video assistants, and AI-generated practice problems. The AI draws exclusively on vetted Pearson content to ensure accuracy and includes features such as the ability to upload syllabi for custom study schedules. Pearson emphasizes responsible AI use by monitoring interactions and adjusting the tool’s tone in response to feedback. Rollout began with a handful of titles and widened through 2024 [60]. Table 16 summarizes Pearson’s AI study tools.

Alongside the student-facing study tools, Pearson offers an instructor-facing counterpart, referred to here as the Pearson AI Instructor Tool. It supports educators in generating and curating assessment content, including practice questions and assignment items, drawn from vetted Pearson material and aligned to course learning outcomes. Like the study tools, it operates on the publisher’s platforms rather than as a free-standing application, and it is documented primarily through product materials, as reflected in its Low confidence rating in Table 2.

5.7. Wiley

Wiley partnered with eFlow to pilot the Wiley AI Tutor via mobile messaging platforms, including WhatsApp. Announced in 2024, the service provides on-demand micro-tutoring sessions for subjects such as physics, accounting, and statistics. Students send natural-language questions and receive step-by-step explanations and practice problems. By leveraging a familiar chat interface, the tool lowers barriers to access. However, it is early in development: only a few subjects are covered, and the messaging platform may struggle with complex mathematical notation [65]. Table 17 highlights attributes and limitations of the Wiley AI Tutor.

5.8. Quizlet

Quizlet’s Q-Chat, built on OpenAI’s API, offers an AI-driven study companion within the popular flashcard platform. Launched in March 2023, Q-Chat utilizes Quizlet’s user-generated content to quiz students, adjust difficulty based on their responses, and provide hints. Additional features, such as “Magic Notes” and “Quick Summary,” summarize or explain content using AI. The tool has been restricted to users aged 18+ during beta testing and may propagate errors when underlying flashcards contain inaccuracies [66]. Table 18 details the functionality and limitations of Quizlet’s Q-Chat.

5.9. Chegg

Chegg’s CheggMate, announced in April 2023 and launched in beta shortly thereafter, combines GPT-4 with Chegg’s proprietary database of textbook solutions, expert Q&A, and practice exam pathways. Students can submit questions or photos of problems and receive step-by-step explanations and additional practice tailored to their level. The service is exclusive to Chegg subscribers, which differentiates it from free alternatives. Chegg has been criticized in the past for enabling cheating, and there are concerns that generative AI could exacerbate misuse [59,69]. Table 19 provides an overview of CheggMate.

6. Competition Among Publishers

To understand the strategic positioning of generative AI tools across the U.S. academic publishing landscape, Table 20 and Table 21 provide a comparative overview. These tools differ in modality (e.g., chatbot, flashcard interface, messaging app), integration depth, disciplinary focus, and monetization model. While many rely on large language models and emphasize Socratic dialogue, implementation strategies diverge significantly. Some tools are embedded within proprietary platforms tightly aligned with curriculum content, whereas others are offered as standalone or cross-platform solutions.

These comparisons reveal several key dynamics. First, publishers with established platforms (e.g., McGraw Hill’s ALEKS, Pearson’s MyLab) are leveraging AI to augment existing ecosystems, emphasizing integration and instructional alignment. By contrast, new tools such as Wiley’s WhatsApp-based tutor and Quizlet’s Q-Chat focus on accessibility and scale through consumer-facing delivery.

Second, discipline coverage varies widely, from narrowly scoped pilots (Cengage, Wiley) to tools spanning dozens of subjects (Pearson, Chegg). Most generative tools are embedded in proprietary platforms and rely on subscription models, while others offer freemium access (e.g., Quizlet) or pilot deployments (e.g., Wiley, Khan Academy).

Finally, despite converging on LLM-powered features and Socratic methods, publishers differ in the degree of oversight, transparency, and customization they provide. The competitive advantage increasingly depends not just on AI capabilities, but on curricular relevance, instructional design, and safeguards for privacy and academic integrity. To remain viable against open-access alternatives like ChatGPT, publisher-provided tools must deliver context-aware, evidence-aligned support tailored to educational outcomes.

7. Current State of the Art and Emerging Trends

Advances in machine learning are rapidly altering educational technology. Large language models with billions of parameters, trained on large text corpora, can generate coherent explanations, answer questions, and simulate dialogue. Recent models, such as GPT-4, Claude, and Gemini, incorporate multimodal capabilities, enabling image-based reasoning and code generation. The integration of such models into educational platforms enables on-demand tutoring, automated question generation, and interactive video assistance.

Research is also exploring hybrid systems that combine deep learning with knowledge bases to ground responses in vetted content. For instance, generative tutoring assistants may retrieve textbook passages before generating answers, reducing the risk of hallucination and aligning with course objectives. Another trend is the combination of real-time learning analytics with AI tutors; by continuously monitoring student behavior and affect, systems can adjust feedback and content accordingly. At the same time, there is growing attention to fairness, transparency, and explainability in AI models, particularly when they make recommendations that affect students’ learning trajectories.

7.1. Scored Results: TRIAD and JTBD

The two instruments measure different things and were therefore developed separately, but they describe the same eleven assistants and are most useful side by side. TRIAD reports the quality and responsibility of each assistant on five numeric dimensions, whereas JTBD reports how well each assistant fits three concrete user tasks on a categorical scale. To allow quality and task fit to be read together and to keep each score tied to its evidence base, the two instruments are combined in Table 22, whose final column repeats the confidence level from Table 2. The three core jobs, derived from the functional, social, and emotional needs of students and instructors, are: understanding complex content (students seek explanations and scaffolding); generating assessment questions (instructors and students need quiz items or practice problems that reinforce learning and support integrity); and receiving timely feedback or support (both groups need on-demand assistance outside class).

On the TRIAD dimensions, the highest totals belong to Khanmigo and McGraw Hill’s ALEKS (42 each), followed by Pearson’s AI Study Tools (41) and the Pearson AI Instructor Tool (40). These tools combine privacy safeguards, curricular relevance, and broad adoption. Early-stage pilots such as the iClicker AI Question Creator and the Wiley AI Tutor score lower (33 each), reflecting limited adoption data and a narrower scope. The confidence column tempers this ranking: ALEKS is the only assistant whose high score rests on independent peer-reviewed research, whereas the lower-confidence entries depend largely on vendor documentation and should be read as provisional.

For JTBD jobs, most assistants are well-positioned to help students understand complex content and provide timely feedback, with High ratings across nearly all tools. Fewer excel at generating assessment questions, a job concentrated in a small group of tools: the instructor-facing iClicker AI Question Creator and Pearson AI Instructor Tool, and the student-facing CheggMate. The pattern indicates that development effort has favored student-facing comprehension and support over assessment authoring, which marks an opportunity for innovation in instructionally aligned content generation.

7.2. Mechanism-by-Application Cross-Analysis and Scenarios

Combining the mechanism taxonomy of Section 2.1 with the educational jobs above clarifies where current technology is strong and where it is thin. Figure 3 maps the six mechanism families against six educational applications, rating the maturity of each pairing. Transformer LLMs and retrieval-augmented designs dominate conceptual explanation, assessment-item generation, and just-in-time feedback. In contrast, progress analytics and early-warning functions remain the province of statistical and deep-learning methods. Adaptive practice and mastery sequencing are still served best by knowledge-space and adaptive engines such as ALEKS. No single mechanism is strong across all applications, which is why hybrid architectures predominate among the reviewed tools.

Reading the cross-analysis together with the scored results yields three concrete application scenarios for institutions. In a gateway STEM course with high enrollment and well-defined problem domains, an adaptive engine (knowledge-space mastery) paired with a retrieval-augmented explanation assistant best fits the dominant jobs of adaptive practice and conceptual explanation; ALEKS combined with an in-platform tutor is the closest match. In a writing-intensive or discussion-based humanities course, where the dominant job is dialogic explanation and feedback rather than mastery sequencing, a guardrailed conversational tutor grounded in course readings is more appropriate, with assessment authoring handled by an instructor-facing generator. In a large blended program concerned with retention, the binding constraint is progress analytics and early warning, so a learning-analytics layer should be prioritized over a chat interface. These scenarios illustrate how the framework translates scores into design choices, and the institutional recommendations in Section 8 align with them.

8. Discussion and Implications

This comparative evaluation of publisher-built AI assistants reveals a heterogeneous landscape shaped by divergent design philosophies, integration strategies, and evidence bases. Tools that embed clear privacy guardrails and provide explainable outputs (such as Khanmigo, ALEKS, and Pearson’s AI study tools) achieve higher scores on the Trust dimension because they foreground transparency, data protection, and human oversight. By contrast, chatbots built atop general-purpose models, such as Quizlet’s Q-Chat and CheggMate, offer broad functionality but fewer public assurances, leading to lower trust ratings. Curricular relevance is strongest when assistants are tightly coupled with vetted content and integrated into existing learning platforms, as exemplified by Cengage’s Student Assistant, Pearson’s MyLab, McGraw Hill’s AI Reader, and ALEKS. Cross-disciplinary services remain valuable for learners seeking supplemental support but may be misaligned with specific syllabi and learning outcomes. Impact and adoption data indicate that long-standing adaptive systems and deeply integrated assistants have robust evidence of improved learning and widespread uptake.

In contrast, newer pilots, such as the iClicker AI Question Creator and the Wiley AI Tutor, show limited but promising results. Usability is enhanced by intuitive interfaces and inclusive design (e.g., Khanmigo and Pearson), while paywalls or constrained interfaces (e.g., Q-Chat and CheggMate) hamper user experience. Viewed through the Jobs-to-Be-Done lens, most assistants excel at helping learners understand complex concepts and providing just-in-time feedback; comparatively few offer sophisticated question-generation capabilities, highlighting an opportunity for innovation.

Beyond the publisher landscape, campus-specific AI platforms (including Nectir and Element451) illustrate an alternative model in which institutions train assistants on their syllabi, policies, and support resources. These institutionally controlled systems score highly on relevance because they mirror local curricula and administrative processes and may better accommodate unique course designs and regulatory requirements. However, bespoke development demands significant investment, expertise, and robust data governance, which limits scalability. By leveraging economies of scale and curated content, publisher-provided assistants can reach larger audiences but may struggle to adapt to local policies. The trade-off between general-purpose and customized AI underscores the need for flexible architectures that allow institutions to combine vetted publisher content with institution-specific data under clear governance frameworks.

Overall, the assistants occupy different niches along the continuum of generative AI capabilities. Tools built on mature adaptive platforms (e.g., ALEKS) and integrated into well-established ecosystems (Pearson’s MyLab, Khan Academy) combine strong privacy safeguards with curricular alignment and broad adoption. Early-stage generative services (iClicker AI Question Creator, Wiley AI Tutor) address specific needs but currently lack robust evidence of learning impact and remain limited in reach. Beta systems like CheggMate and Q-Chat offer personalized study pathways, yet must strengthen transparency and compliance to build trust. Students primarily “hire” AI assistants to decode complex concepts and obtain immediate feedback, while instructors “hire” them to generate assessments and reduce administrative workload. Tools that combine explanation with context-aware question generation (Khanmigo, ALEKS, Pearson) deliver greater perceived value. The success of campus platforms such as Nectir and Element451 highlights a growing demand for institutionally governed AI that aligns with local policies and curricula. Policymakers and educators should remain vigilant about transparency, bias, and privacy. Guidance from the U.S. Department of Education calls for AI systems to be inspectable, explainable, and aligned to a vision for high-quality learning [70]. Future research should rigorously measure learning outcomes, assess user perceptions across diverse populations, and explore how generative assistants can support inclusive pedagogies.

8.1. Recommendations for Institutions

The recommendations below follow directly from the application scenarios in Section 7.2, so that guidance is tied to identified use cases rather than offered in the abstract. First, institutions should select tools by job rather than by brand. A gateway STEM course should prioritize an adaptive mastery engine plus a retrieval-augmented explanation assistant; a discussion-based humanities course should prioritize a guardrailed conversational tutor grounded in course readings, with assessment authoring delegated to an instructor-facing generator; and a retention-focused blended program should invest first in a learning analytics and early-warning layer. Second, procurement should explicitly weigh evidence and confidence: a high TRIAD total backed only by vendor documentation (Table 2) warrants a local pilot with pre-registered outcomes before scale-up, whereas an independently validated engine such as ALEKS can be adopted with greater confidence. Third, because the strongest current designs are retrieval-augmented hybrids, institutions should favor architectures that let vetted institutional content be combined with publisher content under clear data-governance terms. Fourth, contracts should require inspectability and human oversight consistent with the responsible-AI guidance cited above, and should mandate a re-evaluation cadence given the mid-2025 snapshot nature of any assessment.
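The second recommendation can be read as a simple decision rule. The sketch below encodes it in Python as an illustration only: the cut-off of 40 for a "high" total, the function name `procurement_action`, and the wording of the returned actions are assumptions, since the article defines no numeric threshold. The totals are those reported in Section 7.1, the Low label for the Pearson AI Instructor Tool is stated in Section 5.6, and the High label for ALEKS is an assumption based on its peer-reviewed evidence base.

```python
def procurement_action(triad_total, confidence, high_total=40):
    """Map a TRIAD total and an evidence-confidence label to a next step.

    high_total is a hypothetical cut-off for a "high" score.
    """
    if triad_total >= high_total and confidence != "High":
        # High score resting on vendor documentation: verify locally first.
        return "local pilot with pre-registered outcomes before scale-up"
    if triad_total >= high_total:
        # High score resting on independent validation.
        return "adopt with greater confidence; schedule re-evaluation"
    # The recommendation does not address lower totals.
    return "outside the rule: review case by case"


for name, total, confidence in [
    ("ALEKS", 42, "High"),                      # confidence label assumed
    ("Pearson AI Instructor Tool", 40, "Low"),  # label from Section 5.6
]:
    print(f"{name}: {procurement_action(total, confidence)}")
```

The point of the rule is that the score and the confidence label are read jointly: two tools with nearly identical totals can warrant different procurement paths.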

8.2. Ethical and Societal Considerations

The ethical implications of AI deployment in education extend well beyond privacy, and a responsible evaluation must weigh several risks together. Government and non-governmental reports caution that generative systems may amplify surveillance, infringe on student privacy, or exacerbate inequities [54,55,56]. Algorithmic bias arises when training data reflect historical inequities, which can yield unequal support across demographic groups; the Swedish survey discussed earlier already shows uneven trust across gender and field of study [68]. Student deskilling is a distinct concern: assistants that supply fluent answers can erode the productive struggle through which durable skills form, so designs that withhold direct answers and scaffold reasoning are preferable to those optimized for convenience. The environmental impact of large models is non-trivial, since training and serving LLMs consume substantial energy and water, which argues for right-sizing models, caching, and retrieval over repeated generation where feasible. Labor displacement for instructors and teaching assistants is a further risk: tools that automate explanation, grading, and question generation can reduce demand for human instructional roles, and institutions should treat these tools as augmentation rather than replacement and involve faculty in adoption decisions. Across all of these, publishers and institutions must adopt robust safeguards, including rigorous content review, data minimization, transparency about training data and algorithms, and opt-out mechanisms that preserve student agency. Academic integrity remains prominent: educators should design assessments that value reasoning over recall and should integrate AI literacy into curricula so that students use these tools responsibly. Addressing these considerations alongside technical and pedagogical factors enables generative AI to catalyze equitable, effective, and human-centered education.

8.3. Limitations

Three limitations qualify the findings. First, the panel comprised two expert raters; although reliability was high (Section 3.7) and the anchored rubric constrains judgment, a larger and more diverse panel would reduce residual bias. The framework is published as a reusable template precisely so that others can broaden the panel. Second, the evaluation is a mid-2025 snapshot of a fast-moving market, so individual scores will date even as the method persists; every table carries the snapshot date for this reason. Third, the cohort is limited to U.S. publisher-built assistants and one comparator (Khan Academy), so the conclusions should not be generalized to campus-built systems or to non-U.S. markets without re-extraction of evidence.
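Readers who broaden the panel will need to recompute agreement. As one possibility, the sketch below computes Cohen's kappa for two raters in plain Python. Section 3.7 reports the reliability measure actually used in this review, which may differ, and the scores shown are hypothetical values for a single dimension, not the panel's ratings.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item list")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree exactly.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical scores for one dimension across eleven assistants.
rater_a = [9, 8, 8, 7, 9, 8, 7, 6, 7, 8, 9]
rater_b = [9, 8, 8, 7, 9, 8, 7, 7, 7, 8, 9]
print(round(cohen_kappa(rater_a, rater_b), 3))  # 0.869
```

Unweighted kappa treats a one-point disagreement the same as a large one; for ordinal rubric scores a weighted variant may be a better fit, which is a design choice left to the replicating team.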

9. Conclusions

The integration of AI within the educational sector has progressed from basic computer-assisted instruction to the deployment of advanced generative models. These models offer scalable and personalized academic support, with the potential to transform pedagogical practice and improve learning outcomes. The history of AI in higher education reveals a recurring theme: technological breakthroughs inspire educational innovation, but meaningful improvement ultimately depends on sound pedagogy, ethical deployment, and rigorous evaluation. The decade-by-decade account underscores how early pioneers laid the conceptual foundations that later found expression in commercial platforms.

The recent entry of major U.S. publishers into the AI tutoring market signals a competitive race to integrate generative technologies into educational ecosystems. Cengage, Macmillan, McGraw Hill, Pearson, Wiley, Quizlet, and Chegg have each launched products with distinctive modalities and coverage. However, their success will hinge on delivering verifiable value beyond generic chatbots, addressing privacy and integrity concerns, and aligning with instructors’ needs. Future research should systematically evaluate learning outcomes with these tools, explore hybrid models that blend generative AI with established cognitive frameworks, and investigate how AI can support equity and accessibility in higher education.

AI assistants are becoming integral to higher education’s digital learning ecosystems. Using the TRIAD and JTBD frameworks, this analysis demonstrates that well-designed tools can enhance comprehension, streamline assessment creation, and offer continuous support. However, adoption should proceed at the “speed of trust,” emphasizing transparency, privacy, and human oversight. Publishers and institutions should collaborate with researchers and educators to collect evidence on effectiveness, address equity and bias, and design interfaces that empower rather than replace human judgment. As AI continues to mature, aligning technological innovation with pedagogical purpose will determine whether these assistants become transformative partners in teaching and learning or remain supplemental tools.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Lin, C.C.; Huang, A.Y.Q.; Lu, O.H.T. Artificial intelligence in intelligent tutoring systems toward sustainable education: A systematic review. Smart Learn. Environ. 2023, 10, 41.
2. Létourneau, A.; Deslandes Martineau, M.; Charland, P.; Karran, J.A.; Boasen, J.; Léger, P.M. A systematic review of AI-driven intelligent tutoring systems (ITS) in K-12 education. npj Sci. Learn. 2025, 10, 29.
3. Wang, T.; Zheng, J.; Tan, C.; Lajoie, S.P. Computer-based scaffoldings influence students’ metacognitive monitoring and problem-solving efficiency in an intelligent tutoring system. J. Comput. Assist. Learn. 2023, 39, 1652–1665.
4. Tabassi, E. Artificial Intelligence Risk Management Framework (AI RMF 1.0); U.S. Department of Commerce: Washington, DC, USA, 2023.
5. Ruiz, P.; Mills, K.; Lee, K.w.; Coenraad, M.; Fusco, J.; Roschelle, J.; Weisgrau, J. AI Literacy: A Framework to Understand, Evaluate, and Use Emerging Technology; Digital Promise: Washington, DC, USA, 2024.
6. Holmes, W.; Miao, F. Guidance for Generative AI in Education and Research; UNESCO: Paris, France, 2023.
7. Smith, D. ALEKS. Distance Learn. 2013, 10, 51–56.
8. Khine, M.S. Educational Data Mining and Learning Analytics. In Artificial Intelligence in Education; Springer: Singapore, 2024; pp. 1–159.
9. Dritsas, E.; Trigka, M. Big data analytics in e-learning: Ethical challenges and opportunities for engineering education. AI Ethics 2025, 6, 4.
10. Minn, S.; Yu, Y.; Desmarais, M.C.; Zhu, F.; Vie, J.J. Deep Knowledge Tracing and Dynamic Student Classification for Knowledge Tracing. In Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM), Singapore, 17–20 November 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1182–1187.
11. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All You Need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates, Inc.: Red Hook, NY, USA, 2017; Volume 30.
12. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019; pp. 4171–4186.
13. Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models are Few-Shot Learners. In Proceedings of the Advances in Neural Information Processing Systems, Virtual, 6–12 December 2020; Curran Associates, Inc.: Red Hook, NY, USA, 2020; Volume 33, pp. 1877–1901.
14. Granić, A. Educational Technology Adoption: A systematic review. Educ. Inf. Technol. 2022, 27, 9725–9744.
15. Rogers-Shaw, C.; Carr-Chellman, D.J.; Choi, J. Universal Design for Learning: Guidelines for Accessible Online Instruction. Adult Learn. 2017, 29, 20–31.
16. Bayaga, A. Leveraging AI-enhanced and emerging technologies for pedagogical innovations in higher education. Educ. Inf. Technol. 2024, 30, 1045–1072.
17. McCarthy, J.; Minsky, M.L.; Rochester, N.; Shannon, C.E. A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence, August 31, 1955. AI Mag. 2006, 27, 12.
18. Cope, B.; Kalantzis, M. A little history of e-learning: Finding new ways to learn in the PLATO computer education system, 1959–1976. Hist. Educ. 2023, 52, 905–936.
19. Minsky, M. Steps toward Artificial Intelligence. Proc. IRE 1961, 49, 8–30.
20. Atkinson, R.C.; Hansen, D.N. Computer-Assisted Instruction in Initial Reading: The Stanford Project. Read. Res. Q. 1966, 2, 5.
21. Sleeman, D.H. Inferring Student Models for Intelligent Computer-Aided Instruction. Mach. Learn. 1983, I, 483–510.
22. Jones, S.; Latzko-Toth, G. Out from the PLATO cave: Uncovering the pre-Internet history of social computing. Internet Hist. 2017, 1, 60–69.
23. Carbonell, J. AI in CAI: An Artificial-Intelligence Approach to Computer-Assisted Instruction. IEEE Trans. Man Mach. Syst. 1970, 11, 190–202.
24. Bloom, B.S. The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring. Educ. Res. 1984, 13, 4–16.
25. Yu, V.L. Antimicrobial Selection by a Computer: A Blinded Evaluation by Infectious Diseases Experts. JAMA 1979, 242, 1279.
26. Clancey, W.J. From Guidon to Neomycin and Heracles in Twenty Short Lessons. AI Mag. 1986, 7, 40.
27. Self, J.A. Student models in computer-aided instruction. Int. J. Man-Mach. Stud. 1974, 6, 261–276.
28. Brown, J.; Burton, R. Diagnostic models for procedural bugs in basic mathematical skills. Cogn. Sci. 1978, 2, 155–192.
29. Nwana, H. Intelligent tutoring systems: An overview. Artif. Intell. Rev. 1990, 4, 251–277.
30. Anderson, J.R.; Boyle, C.F.; Reiser, B.J. Intelligent Tutoring Systems. Science 1985, 228, 456–462.
31. Anderson, J.R.; Bothell, D.; Byrne, M.D.; Douglass, S.; Lebiere, C.; Qin, Y. An Integrated Theory of the Mind. Psychol. Rev. 2004, 111, 1036–1060.
32. Self, J. Theoretical foundations for intelligent tutoring systems. J. Artif. Intell. Educ. 1990, 1, 3–14.
33. Mousavinasab, E.; Zarifsanaiey, N.; R. Niakan Kalhori, S.; Rakhshan, M.; Keikha, L.; Ghazi Saeedi, M. Intelligent tutoring systems: A systematic review of characteristics, applications, and evaluation methods. Interact. Learn. Environ. 2018, 29, 142–163.
34. Bardach, L.; Moeller, K.; Ruiz-Garcia, M.; Strittmatter, Y.; Meyer, J.; Musslick, S.; Spitzer, M. Intelligent Tutoring Systems Need Teachers. J. Comput. Assist. Learn. 2025, 42, e70159.
35. Koedinger, K.R.; Anderson, J.R.; Hadley, W.H.; Mark, M.A. Intelligent Tutoring Goes To School in the Big City. Int. J. Artif. Intell. Educ. 1997, 8, 30–43.
36. Anderson, J.R.; Corbett, A.T.; Koedinger, K.R.; Pelletier, R. Cognitive Tutors: Lessons Learned. J. Learn. Sci. 1995, 4, 167–207.
37. Oueini, S. The Impact of Intelligent Tutoring Software on Geometry Students. Doctoral Dissertation, University of South Carolina, Columbia, SC, USA, 2019.
38. Doignon, J.P.; Falmagne, J.C. Spaces for the assessment of knowledge. Int. J. Man-Mach. Stud. 1985, 23, 175–196.
39. Graesser, A.C.; VanLehn, K.; Rose, C.P.; Jordan, P.W.; Harter, D. Intelligent Tutoring Systems with Conversational Dialogue. AI Mag. 2001, 22, 39.
40. Steenbergen-Hu, S.; Cooper, H. A meta-analysis of the effectiveness of intelligent tutoring systems on college students’ academic learning. J. Educ. Psychol. 2014, 106, 331–347.
41. Mitrovic, A. An Intelligent SQL Tutor on the Web. Int. J. Artif. Intell. Educ. 2003, 13, 173–197.
42. Hoecker, D.G.; Elias, G.S. User Evaluation of the Lisp Intelligent Tutoring System. Proc. Hum. Factors Soc. Annu. Meet. 1986, 30, 182–186.
43. Baker, R.S.; Yacef, K. The State of Educational Data Mining in 2009: A Review and Future Visions. J. Educ. Data Min. 2009, 1, 3–17.
44. Romero, C.; Ventura, S. Data mining in education. WIREs Data Min. Knowl. Discov. 2012, 3, 12–27.
45. Aydemir Arslan, M.; Ata, A.; Kucuk, S. MOOCs Reshaping Undergraduate Health Education: A Systematic Review. Int. Rev. Res. Open Distrib. Learn. 2026, 27, 235–264.
46. Howard, E.; White, A.; Wyse, J. Contextualizing the research problem: Improving cluster analysis insights into student learning. Int. J. Res. Method Educ. 2026, 49, 184–206.
47. Sarun, H.; Nara, I.; Hun, H. The Role of Artificial Intelligence in Transforming Higher Education Quality and Effectiveness: A Nano Review. J. Agric. Environ. 2026, 3, 32–34.
48. Subrahmanyam, S.; Habchi, A. The Future of Strategic Leadership in Higher Education. In Strategic Leadership for International Collaboration in Higher Education; IGI Global Scientific Publishing: Hershey, PA, USA, 2025; pp. 189–224.
49. Goel, A.K.; Polepeddi, L. Jill Watson: A virtual teaching assistant for online education. In Proceedings of the Learning Engineering for Online Education; Routledge: London, UK, 2018; pp. 120–143.
50. Taneja, K.; Maiti, P.; Kakar, S.; Guruprasad, P.; Rao, S.; Goel, A.K. Jill Watson: A Virtual Teaching Assistant Powered by ChatGPT. In Artificial Intelligence in Education; Springer: Cham, Switzerland, 2024; pp. 324–337.
51. Li, M.; Peng, J.; Zhu, Y.; Cheng, W. Emotional Recognition in Affective Tutoring System: A Systematic Review. In Proceedings of the 2025 International Conference on Distance Education and Learning (ICDEL), Kunming, China, 13–16 June 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 344–348.
52. Van, U.T.; Tieu, B.H.; Tran, D.H. A Student Performance Prediction Model Using Machine Learning Models in Multimodal Learning Analytics. In Proceedings of the 21st International Conference on Computing and Information Technology (IC2IT 2025); Springer: Cham, Switzerland, 2025; pp. 11–20.
53. Perera, P.; Kulathunga, V.K.; Perera, N.N.; Gunarathna, A.; Jayasinghe, P.S.K.; Rajamanthri, L. Adoption of ChatGPT in higher education: A systematic literature review of lecturers’ perspective based on UTAUT2. Asian Educ. Dev. Stud. 2025, 15, 89–113.
54. Adair, A. Teaching and Learning with AI: How Artificial Intelligence is Transforming the Future of Education. XRDS Crossroads ACM Mag. Stud. 2023, 29, 7–9.
55. Dai, K.; Liu, Y.; Zhang, X. Generative AI in higher education: A bibliometric review of emerging trends, power dynamics, and global research landscapes. Comput. Educ. Artif. Intell. 2026, 10, 100544.
56. Ng, S.H.S.; Chan, H.Y.; Wong, J.H.K.; Sam, L.; Privitera, A.J. Mapping the landscape: Generative AI in higher education assessment (2020–2024)—A scoping review. Interact. Learn. Environ. 2026, 1–33.
57. Leon, M. GPT-5 and open-weight large language models: Advances in reasoning, transparency, and control. Inf. Syst. 2026, 136, 102620.
58. Shetye, S. An Evaluation of Khanmigo, a Generative AI Tool, as a Computer-Assisted Language Learning App. Stud. Appl. Linguist. TESOL 2024, 24, 38–53.
59. Manoharan, S.; Speidel, U.; Ward, A.E.; Ye, X. Contract Cheating – Dead or Reborn? In Proceedings of the 2023 32nd Annual Conference of the European Association for Education in Electrical and Information Engineering (EAEEIE), Eindhoven, The Netherlands, 14–16 June 2023; IEEE: Piscataway, NJ, USA, 2023.
60. Arefian, M.H. AI-enhanced professional learning communities: A new era of personalized teacher education. Cogent Educ. 2026, 13, 2626629.
61. Li, L. The Inspiration and Reference of International Digital Educational Publishing: Leveraging AI and Edge Computing for a Future-Ready Ecosystem. Int. J. Hous. Sci. Its Appl. 2025, 47, 610–618.
62. Davenport, T.H.; Bean, R. Building AI Capabilities Into Portfolio Companies at Apollo. MIT Sloan Manag. Rev. 2025, 1–3. Available online: https://www.proquest.com/openview/963093930fc02e5e13600227b4d817e2/1?pq-origsite=gscholar&cbl=6831990 (accessed on 9 June 2026).
63. Rojas, C.; Rong, R.; Bloomfield, L. Allowing Generative AI in Class: Evidence from a Semester-Long Controlled Teaching Study. SSRN 2025. Preprint.
64. Follin, T.A. First Aid Forward Review. J. Electron. Resour. Med. Libr. 2025, 22, 165–177.
65. Ofgang, E. It’s Not a Chatbot and That Might be This AI Tutor’s Superpower. Tech & Learning, 2024. Online News Report. Available online: https://www.techlearning.com/news/its-not-a-chatbot-and-that-might-be-this-ai-tutors-superpower (accessed on 21 February 2026).
66. Rashidat, A.N. Qualitative Study on Integration of Artificial Intelligence Technology in the classroom: Does it Help or Hinders learning? Fed. Univ. Gusau Fac. Educ. J. 2025, 5, 240–247.
67. Essel, H.B.; Vlachopoulos, D.; Essuman, A.B.; Amankwa, J.O. ChatGPT effects on cognitive skills of undergraduate students: Receiving instant responses from AI-based conversational large language models (LLMs). Comput. Educ. Artif. Intell. 2024, 6, 100198.
68. Stöhr, C.; Ou, A.W.; Malmström, H. Perceptions and usage of AI chatbots among students in higher education across genders, academic levels and fields of study. Comput. Educ. Artif. Intell. 2024, 7, 100259.
69. Liu, M. Learn Generative AI with PyTorch: Build GANs, Transformers, and Diffusion Models; Manning: Shelter Island, NY, USA, 2024.
70. Louis, M.; ElAzab, M. The Future of Teaching and Learning in Artificial Intelligence era (part II). Int. J. Internet Educ. 2023, 22, 1–8.

Figure 1. Taxonomy of algorithmic families underpinning AI in higher education. Capability tends to increase and transparency to decrease from left to right.

Figure 2. Structured, PRISMA-style search flow through identification, screening, eligibility, and inclusion. The diagram documents a structured (scoping-style) search supporting a critical review; it is not the flow of a full systematic review. The 70 included sources (52 peer-reviewed; 18 institutional, product, or news) correspond to the reference list of this article, and the screening-stage counts summarize that search.

Figure 3. Cross-analysis of AI mechanisms against educational applications. Cell labels denote maturity: H (high), M (moderate), L (low), or none.

Table 1. Taxonomy of AI mechanisms in higher education and their characteristics (snapshot: mid-2025).

Mechanism Family	Representative Techniques	Educational Function	Transparency
Rule-based/expert systems	Production rules, semantic networks, model tracing	Step-level diagnosis and individualized instruction	High
Adaptive/knowledge-space	Knowledge-space theory, overlay and bug models	Mastery sequencing and adaptive practice	High
Statistical/probabilistic	Bayesian knowledge tracing, EDM, learning analytics	Outcome prediction and early-warning analytics	Moderate
Deep learning	Deep knowledge tracing, affect-aware models	Sequence-aware student modeling	Low
Transformer LLMs	Self-attention, pretrained language models	Open-ended dialogic tutoring; content generation	Low
Retrieval-augmented/hybrid	RAG, LLM grounded in a knowledge base, guardrails	Grounded explanation and assessment generation	Moderate

Table 2. Primary evidence basis and confidence level for each assistant (snapshot: mid-2025).

Assistant	Primary Evidence Basis	Confidence
McGraw Hill ALEKS	Independent peer-reviewed research on knowledge-space theory and learning gains	High
Khanmigo	Vendor documentation with emerging independent evaluations	Moderate
Pearson AI Study Tools	Product documentation with vendor-reported usage	Moderate
Cengage Student Assistant	Product documentation with early engagement surveys	Moderate
Macmillan Achieve AI Tutor	Vendor surveys and product documentation	Moderate
McGraw Hill AI Reader	Product documentation	Low
Pearson AI Instructor Tool	Product documentation	Low
Macmillan iClicker AI Question Creator	Product documentation (beta)	Low
Quizlet Q-Chat	Product documentation	Low
CheggMate	Product documentation and news reports	Low
Wiley AI Tutor	News reports and marketing material	Low

Table 3. Inter-rater reliability for the TRIAD ratings, computed in Python from the independent pre-consensus scores (n = 11 tools). Within-one agreement was 100% on every dimension. QWK = quadratic-weighted Cohen’s kappa; ICC(2,1) = two-way random-effects, absolute-agreement, single-rater intraclass correlation.

Dimension	Exact Agreement (%)	Within-One (%)	QWK	ICC(2,1)
Trust	72.7	100.0	0.85	0.86
Relevance	81.8	100.0	0.77	0.78
Impact	63.6	100.0	0.45	0.47
Adoption	63.6	100.0	0.90	0.91
Design	81.8	100.0	0.73	0.75
Overall (pooled)	72.7	100.0	0.86	0.87
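
The agreement statistics in Table 3 can be illustrated with a minimal sketch. The function name `agreement_stats` and the two rating lists below are hypothetical and are not the article's pre-consensus data; the sketch only assumes two raters scoring the same tools on the 1–10 scale and uses the standard definition of quadratic-weighted Cohen's kappa. ICC(2,1) is omitted to keep the example short.

```python
from collections import Counter


def agreement_stats(rater_a, rater_b, min_score=1, max_score=10):
    """Return (exact agreement, within-one agreement, quadratic-weighted kappa)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(rater_a)
    k = max_score - min_score + 1  # number of ordinal categories

    pairs = list(zip(rater_a, rater_b))
    exact = sum(a == b for a, b in pairs) / n
    within_one = sum(abs(a - b) <= 1 for a, b in pairs) / n

    observed = Counter(pairs)   # joint counts of (score_a, score_b)
    marg_a = Counter(rater_a)   # marginal counts for rater A
    marg_b = Counter(rater_b)   # marginal counts for rater B

    weighted_observed = 0.0  # disagreement actually observed
    weighted_expected = 0.0  # disagreement expected if raters were independent
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            w = ((i - j) ** 2) / ((k - 1) ** 2)  # quadratic disagreement weight
            weighted_observed += w * observed.get((i, j), 0) / n
            weighted_expected += w * (marg_a.get(i, 0) / n) * (marg_b.get(j, 0) / n)

    # Kappa is undefined when both raters give one identical constant score.
    if weighted_expected == 0:
        return exact, within_one, float("nan")
    return exact, within_one, 1.0 - weighted_observed / weighted_expected


# Hypothetical ratings for six tools on one dimension (illustration only).
rater_a = [8, 9, 7, 6, 8, 5]
rater_b = [8, 8, 7, 6, 9, 5]
exact, within_one, qwk = agreement_stats(rater_a, rater_b)
print(f"exact={exact:.1%}  within-one={within_one:.1%}  QWK={qwk:.2f}")
```

Because the quadratic weights penalize large disagreements much more than adjacent ones, a dimension can show modest exact agreement yet a high QWK when raters rarely differ by more than one point, and a low QWK when scores cluster in a narrow range, which is one plausible reading of the Impact row.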

Table 4. Key developments in AI and higher education during the 1950s–1960s. References correspond to numbered citations in the bibliography.

Year/Period	Development	References
1956	Dartmouth workshop proposes the study of artificial intelligence as a discipline, launching the AI research community	[17,19]
1960	PLATO (Programmed Logic for Automatic Teaching Operations) built at the University of Illinois; provided time-sharing educational terminals	[18]
Late 1960s	Early CAI systems deliver programmed instruction through frames; focus on drill and practice	[20,21]
1967–1969	Development of TUTOR language and PLATO Notes fostered one of the earliest online communities	[22]

Table 5. Major developments of the 1970s in AI tutoring.

Year/Period	Development	References
1970	The SCHOLAR system demonstrates information-structure-oriented CAI using semantic networks; first ITS to support mixed-initiative dialogue	[23]
1974	GUIDON uses MYCIN expert system techniques to teach medical diagnostics	[25,26]
Late 1970s	Research explores student modeling and error diagnosis; foundations for cognitive modeling laid	[27,28]
1984	A two-sigma study underscores the superiority of one-to-one tutoring and motivates ITS development	[24]

Table 6. Highlights of AI and tutoring research in the 1980s.

Year/Period	Development	References
Early 1980s	ACT theory informs cognitive modeling; LISP Tutor and Geometry Tutor developed at Carnegie Mellon University	[30,31]
1983–1988	Student modeling techniques such as bug libraries and overlay models introduced	[28,32]
1989–1990	A comprehensive overview of ITS architectures published	[29]
Late 1980s	Rule-based shells (e.g., HERACLES) enable domain experts to build ITSs without deep AI expertise	[33,34]

Table 7. Developments in the 1990s.

Year/Period	Development	References
1991–1995	Cognitive Tutors for algebra and geometry commercialized; evidence for learning gains accumulates	[35,36,37]
1996	ALEKS adaptive learning platform launched; uses knowledge space theory to model student readiness	[7,38]
Mid-1990s	AutoTutor prototypes demonstrate conversational tutoring using natural language	[39,40]
Late 1990s	Intelligent tutoring spreads to engineering and programming domains (e.g., SQL Tutor, Lisp Tutor)	[41,42]

Table 8. Developments in the 2000s.

Year/Period	Development	References
2001–2007	Educational data mining emerges; methods for predicting student performance and detecting disengagement	[8,43,44]
2010	Learning analytics movement gains momentum; adoption of analytics to support evidence-based decision making	[9]
2012–2013	MOOCs launched by Coursera, edX and Udacity; researchers study attrition and engagement	[45,46]
2000s	Adaptive learning companies Knewton and Smart Sparrow bring personalized pathways to university courses	[47,48]

Table 9. Key AI developments in higher education during the 2010s.

Year/Period	Development	References
2011	Launch of the learning analytics and educational data mining communities (LAK and EDM conferences)	[9,43]
2012	MOOCs reach millions of learners; research on engagement and dropout	[45,46]
2015	Deep Knowledge Tracing applies recurrent neural networks to predict future performance	[10]
2016	Virtual teaching assistant named Jill Watson deployed in an online AI class	[49,50]
Late 2010s	Affect-aware and multimodal tutoring systems incorporate speech, gestures, and biosignals	[51,52]

Table 10. Highlights of AI developments in higher education (2020–2025).

Year/Period	Development	References
2020	Transformer models enable language generation; GPT-3 demonstrates few-shot capabilities	[11,12,13]
2022–2023	ChatGPT and other generative chatbots gain public visibility, prompting educational debate	[54,57]
2023–2024	First wave of publisher-integrated generative AI tools (e.g., Khanmigo, CheggMate, Pearson AI Study Tools)	[58,59,60]
2024–2025	Rapid expansion of generative AI assistants across higher education (Cengage Student Assistant, Macmillan AI Tutor, and others)	[61,62,63,64,65,66]

Table 11. Classification of publisher assistants by interface modality and architecture (snapshot: mid-2025).

Interface/Modality	Assistants	Typical Architecture
Embedded in-platform chatbot (most common)	Cengage Student Assistant, Macmillan Achieve AI Tutor, Pearson AI Study Tools, McGraw Hill AI Reader, CheggMate	LLM, mostly retrieval-augmented over vetted content
Stand-alone conversational tutor	Khanmigo	General-purpose LLM with guardrails
Adaptive assessment engine	McGraw Hill ALEKS	Knowledge-space theory (non-generative)
Messaging-app micro-tutoring	Wiley AI Tutor	LLM via external chat interface
Flashcard-based study chat	Quizlet Q-Chat	LLM over user-generated content
Instructor-facing item generator	Macmillan iClicker AI Question Creator, Pearson AI Instructor Tool	LLM for assessment generation

Table 12. Cengage Student Assistant: capabilities and limitations (snapshot: mid-2025).

Aspect	Capabilities/Highlights	Limitations/Concerns
Core purpose	AI chatbot guiding students through coursework via personalized prompts and feedback	Does not provide direct answers; uses Socratic questioning, which may frustrate students seeking quick solutions
Launch timeline	Beta launched Fall 2024; expansion planned for Spring/Fall 2025	Initially limited to four courses and select student cohorts
Target market	Higher education courses, initially Management, Organizational Behavior, Psychology, and Economics	Only available within MindTap; not yet cross-platform
Distinctive features	Discipline-tuned training, emphasis on critical thinking and integrity, 24/7 availability	Confined to course content; cannot answer questions outside the prescribed curriculum

Table 13. Khan Academy’s Khanmigo (snapshot: mid-2025).

Aspect	Capabilities/Highlights	Limitations/Concerns
Core purpose	GPT-4-based virtual tutor and teaching assistant employing Socratic dialogue	Access restricted to approved pilots and paying subscribers; primarily K-12 focus with less advanced college content
Launch timeline	Pilot began 15 March 2023	High operational costs due to GPT-4; sustainable funding remains a concern
Target market	K-12 and early college students; includes teacher-assistant features for lesson planning	Requires internet access and district approval; limited availability in early phases
Unique features	Role-playing as historical or literary characters; teacher tools for generating lesson plans and rubrics; conversation safety measures	High reliance on user-generated content could propagate errors; generative nature may lead to hallucinations if not grounded in vetted curriculum

Table 14. Macmillan Learning AI tools (snapshot: mid-2025).

Tool/Component	Capabilities/Highlights	Limitations/Concerns
Achieve AI Tutor	On-demand homework helper providing Socratic guidance and step-by-step explanations; students heavily use it at night; early research shows increased engagement and confidence	Available only in ~80 courses (mainly STEM); must be activated by instructors; does not provide direct answers
iClicker AI Question Creator	Generates up to 50 customized questions based on topic, difficulty, and learning objectives; questions support active learning and are not easily searchable online	Limited to 5000 instructors and 50 questions each in beta; AI-generated questions require instructor review; best suited for formative assessments

Table 15. McGraw Hill AI tools (snapshot: mid-2025).

Tool/Component	Capabilities/Highlights	Limitations/Concerns
ALEKS (Adaptive Learning Platform)	Knowledge space theory-based adaptive assessment; over 25 years of use; supports mathematics and chemistry courses; reduces assessment time	Not generative; limited to problem domains with well-defined answers; knowledge checks can be stressful for some students
AI Reader	Allows students to request simplified explanations and practice questions based on highlighted text in eBooks; promotes active reading	Available only in select textbooks; responses may oversimplify or miss nuances; limited to content within the book
Future generative enhancements	Plans for integrating generative AI into ALEKS and other platforms (announced 2024)	Specific features and timelines not yet published; success will depend on alignment with adaptive algorithms

Table 16. Pearson AI study tools (snapshot: mid-2025).

Aspect	Capabilities/Highlights	Limitations/Concerns
Personalized Q&A	Students ask questions within the eText and receive structured explanations drawn from Pearson content	Responses limited to material in the textbook; external questions are not addressed
Study plan generator	Allows students to upload a syllabus to generate a customized learning schedule	Early implementation; may not account for individual pacing differences or unexpected schedule changes
Interactive video assistant	AI answers questions during instructional videos, enabling interactive viewing	Dependent on accurate transcription and alignment; may not handle complex diagrams or equations
Practice problem generation	Produces additional practice problems and explanations based on student needs	Instructors must vet AI-generated problems for alignment; available only in select courses in 2023–2024
Responsible use	Pearson monitors usage data and modifies the AI’s guidance tone to ensure constructive support	System relies on accurate underlying content; lacks transparency into training data and algorithms

Table 17. Wiley AI Tutor (snapshot: mid-2025).

Aspect	Capabilities/Highlights	Limitations/Concerns
Medium	Tutor operates via WhatsApp or similar messaging apps, offering convenient mobile access	Messaging interface may not support complex equations or diagrams
Subject coverage	Initial pilot includes physics, accounting, business statistics, and computer science	Limited subject coverage; expansion dependent on feedback and partnership with eFlow
Personalization	Provides guided explanations and interactive activities based on student questions	Not integrated into course learning management systems; alignment with specific textbooks is ad hoc

Table 18. Quizlet Q-Chat (snapshot: mid-2025).

Aspect	Capabilities/Highlights	Limitations/Concerns
Functionality	Adaptive AI tutor using Quizlet’s flashcard database to engage students in question-answer dialogue	Accuracy depends on user-generated flashcards; potential propagation of misinformation
Launch	Beta launched March 2023; initial access to U.S. users aged 18+	Age restrictions and paid subscription limit accessibility; privacy considerations for minors
Supplementary features	Magic Notes and Quick Summary summarize and explain materials; interactive study modes	Generative responses may hallucinate or oversimplify content; not intended for high-stakes assessment

Table 19. CheggMate overview (snapshot: mid-2025).

Aspect	Capabilities/Highlights	Limitations/Concerns
Features	Integrates GPT-4 with Chegg’s solutions database; provides personalized step-by-step explanations and generates practice quizzes	Available only to paying subscribers; overlaps with free generative models like ChatGPT; risk of facilitating cheating
Launch timeline	Announced April 2023; beta access from May 2023 with ongoing enhancements	Full public release timeline not disclosed; adoption depends on value beyond freely available AI models
Target market	College students seeking homework help across many subjects	Subscriber numbers have declined due to free alternatives; success depends on added value and academic integrity safeguards

Table 20. Comparative overview of publisher AI tools, Part 1 (Cengage to Pearson; snapshot: mid-2025).

Publisher	Tool	Modality	Integration	Discipline Coverage	Business Model
Cengage	Student Assistant	Chatbot within MindTap	Embedded; course-specific tuning	Limited to four initial courses	Included with MindTap
Khan Academy	Khanmigo	Conversational tutor	Web-based; integrates with Khan Academy	Broad K-12 and early college	Pilot access and subscription
Macmillan	Achieve AI Tutor	Homework helper (Socratic)	Embedded in Achieve	STEM and economics	Included with Achieve; instructor opt-in
Macmillan	iClicker AI Question Creator	Instructor question generation	Plug-in for iClicker	Cross-disciplinary	Beta; pricing TBD
McGraw Hill	ALEKS	Adaptive assessment	LMS-integrated	Math, chemistry, statistics	Institutional licensing
McGraw Hill	AI Reader	eBook-embedded chatbot	Embedded in Connect/GO eBooks	Select textbooks	Included with eBooks
Pearson	AI Study Tools	Generative Q&A, study planner	Integrated into MyLab and Pearson+	Dozens of courses	Platform subscription
Pearson	AI Instructor Tool	Assessment generation	Pearson platforms	Cross-disciplinary	Platform/institutional

Table 21. Comparative overview of publisher AI tools, Part 2 (Wiley to Chegg; snapshot: mid-2025).

Publisher	Tool	Modality	Integration	Discipline Coverage	Business Model
Wiley	Wiley AI Tutor	Messaging-based micro-tutoring	External via WhatsApp	Physics, business, statistics	Pilot; pricing TBD
Quizlet	Q-Chat	Flashcard-based chat	Within Quizlet platform	User-generated, broad content	Freemium; age-restricted beta
Chegg	CheggMate	Chat-based tutor (GPT-4)	Within Chegg Study	Broad subject coverage	Paid subscription

Table 22. Combined results: TRIAD scores (1–10) and JTBD task fit (H/M/L) for each assistant, with confidence from Table 2 (snapshot: mid-2025). TRIAD: T = Trust, R = Relevance, I = Impact, A = Adoption, D = Design. JTBD jobs: UC = understand complex content, GA = generate assessment questions, TF = timely feedback. Confidence: Hi = High, Md = Moderate, Lo = Low.

Assistant	T	R	I	A	D	Total	UC	GA	TF	Conf.
Cengage Student Assistant	8	8	7	6	7	36	H	L	H	Md
Khanmigo	9	9	8	8	8	42	H	M	H	Md
Macmillan Achieve AI Tutor	7	8	7	6	7	35	H	L	H	Md
Macmillan iClicker AI Question Creator	6	8	7	5	7	33	L	H	M	Lo
McGraw Hill ALEKS	8	9	8	9	8	42	H	M	H	Hi
McGraw Hill AI Reader	8	8	7	6	8	37	H	M	H	Lo
Pearson AI Study Tools	8	9	8	8	8	41	H	M	H	Md
Pearson AI Instructor Tool	7	9	8	8	8	40	L	H	H	Lo
Wiley AI Tutor	7	7	7	5	7	33	H	L	M	Lo
Quizlet Q-Chat	6	8	7	8	7	36	H	H	M	Lo
CheggMate	6	8	7	7	8	36	H	H	H	Lo
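
The Total column in Table 22 is consistent with an unweighted sum of the five TRIAD scores (maximum 50). The sketch below recomputes the totals from the table rows and lists the assistants alongside their confidence grade; the variable names are illustrative, and equal weighting of the five dimensions is an assumption inferred from the table rather than a rule stated in this section.

```python
# (T, R, I, A, D) scores and confidence grades transcribed from Table 22.
triad_scores = {
    "Cengage Student Assistant": ((8, 8, 7, 6, 7), "Md"),
    "Khanmigo": ((9, 9, 8, 8, 8), "Md"),
    "Macmillan Achieve AI Tutor": ((7, 8, 7, 6, 7), "Md"),
    "Macmillan iClicker AI Question Creator": ((6, 8, 7, 5, 7), "Lo"),
    "McGraw Hill ALEKS": ((8, 9, 8, 9, 8), "Hi"),
    "McGraw Hill AI Reader": ((8, 8, 7, 6, 8), "Lo"),
    "Pearson AI Study Tools": ((8, 9, 8, 8, 8), "Md"),
    "Pearson AI Instructor Tool": ((7, 9, 8, 8, 8), "Lo"),
    "Wiley AI Tutor": ((7, 7, 7, 5, 7), "Lo"),
    "Quizlet Q-Chat": ((6, 8, 7, 8, 7), "Lo"),
    "CheggMate": ((6, 8, 7, 7, 8), "Lo"),
}

# Totals as printed in the table, used only to cross-check the arithmetic.
published_totals = {
    "Cengage Student Assistant": 36,
    "Khanmigo": 42,
    "Macmillan Achieve AI Tutor": 35,
    "Macmillan iClicker AI Question Creator": 33,
    "McGraw Hill ALEKS": 42,
    "McGraw Hill AI Reader": 37,
    "Pearson AI Study Tools": 41,
    "Pearson AI Instructor Tool": 40,
    "Wiley AI Tutor": 33,
    "Quizlet Q-Chat": 36,
    "CheggMate": 36,
}

# Assumption: Total = T + R + I + A + D with equal weights.
totals = {name: sum(scores) for name, (scores, _) in triad_scores.items()}
mismatches = {n: (totals[n], published_totals[n])
              for n in totals if totals[n] != published_totals[n]}

# Sort by total (descending), then name, and show confidence next to each score.
for name in sorted(totals, key=lambda n: (-totals[n], n)):
    confidence = triad_scores[name][1]
    print(f"{totals[name]:>2}/50  {confidence}  {name}")
```

Printing the confidence grade next to each total keeps the evidence basis visible, in line with the article's framing of the scores as a reusable template rather than a static ranking: a high total graded Lo rests mainly on product documentation.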

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Leon, M. Publisher-Built Generative AI Assistants in U.S. Higher Education: A Critical Review and a Reproducible TRIAD–JTBD Evaluation Framework. Algorithms 2026, 19, 492. https://doi.org/10.3390/a19060492

