1. Introduction
Massive Open Online Courses (MOOCs) have expanded access to education worldwide, but they continue to face persistent challenges with learner engagement and success. One contributing factor is the lack of individualized feedback to support self-regulated learning (SRL), the iterative process of planning, monitoring, and evaluating one’s learning, which is strongly associated with improved achievement [1,2,3]. Although SRL is widely recognized as beneficial, many MOOC learners struggle to engage in it effectively [4,5].
Learning Analytics Dashboards (LADs) have been proposed as tools to foster SRL at scale by visualizing indicators of performance, progress, and engagement [6,7]. When designed well, dashboards can help learners reflect on their behavior and adapt strategies [8,9]. Yet their effectiveness remains contested. Reviews have shown that many LADs lack theoretical grounding or contextual alignment [7,10], and others caution that poorly designed dashboards may increase cognitive load, encourage unhelpful social comparison, or even discourage learners [11,12]. The evidence to date has been mixed, with many studies relying on small-scale pilots or lab settings that limit generalizability [13,14]. This has led to ongoing debate about whether LADs have truly lived up to their promise [11].
A recurring critique of LAD design is the absence of strong theoretical foundations. Matcha et al. [7] found that most dashboards lacked explicit connections to SRL theory, while more recent reviews suggest incremental progress but continued inconsistency. Paulsen et al. [15] argue that the field is moving “from analytics to learning” but note that theoretical integration is often partial. Similarly, Masiello [16] highlights that dashboards remain largely descriptive, translating data into visualizations but rarely into actionable, pedagogically meaningful guidance.
Another concern is the use of peer-referenced indicators. While showing learners how they compare to their peers can provide useful benchmarks, it can also trigger counterproductive forms of social comparison. Classic theory suggests that upward comparison often undermines motivation [17,18], and empirical studies confirm this risk in online learning settings [12,19]. To mitigate such effects, researchers have proposed using prior cohorts rather than real-time peers as reference points [20] or offering multiple benchmarks (e.g., passing, certificate-ready, mastery) aligned with diverse learner goals [21,22].
A third critique relates to inference cost, the difficulty of interpreting and acting on dashboard information. Complex or abstract visualizations have been shown to impose high cognitive demands [23,24], with their benefits often skewed toward more educated learners [20]. Recent studies emphasize that explanatory, goal-oriented designs reduce extraneous cognitive load [25], while poor designs can undermine motivation by eroding learners’ sense of competence [26]. Reducing inference costs requires not only careful visualization design but also the inclusion of actionable feedback. The ARCS model [27], which structures feedback to capture Attention, highlight Relevance, build Confidence, and foster Satisfaction, has proven effective in online learning environments [28,29].
Finally, LAD evaluation practices have often been limited. Many studies rely on small pilots, lab settings, or usability testing, restricting generalizability [13,14]. Field studies in authentic learning contexts are rarer but crucial, as they reveal confounding factors and provide stronger evidence for both researchers and practitioners [12,30,31,32]. Reviews consistently call for more controlled evaluations of dashboards in real courses with diverse learners [11].
This study addresses these gaps by designing a theory- and context-grounded LAD for a credit-bearing MOOC in supply chain management and evaluating it through a randomized controlled trial (RCT) with 8745 learners. Guided by the COPES model of SRL [33,34] and informed by historical course data, the dashboard incorporated pacing and progress indicators, with or without actionable ARCS-framed feedback. Our findings show that dashboards without feedback offered no measurable benefits, while dashboards with feedback significantly increased learners’ verification rates (a marker of commitment) but had mixed effects on engagement and no effect on final performance. These results suggest that dashboards are not inherently beneficial; their impact depends on specific design choices. By combining design principles with experimental evidence, this work contributes to ongoing debates about the value of LADs and offers practical guidance for building dashboards that support self-regulated learning at scale.
2. Materials and Methods
This section describes the context, design, and evaluation of the study. We begin with an overview of the MOOC that served as the research setting, then detail the development of the Learning Analytics Dashboard (LAD), and finally outline the experimental design used to assess its impact.
2.1. Course Context
The intervention was implemented in a 14-week MOOC in supply chain analytics, part of a credential-bearing online program in supply chain management hosted on the edX platform (edx/2U, Arlington, VA, USA). The course ran from April to August 2023 and enrolled 8745 learners, the majority of whom were working professionals. All enrolled learners were included in the sampling frame at the start of the course; no exclusions were applied beyond standard platform registration criteria, ensuring that the randomized groups represented the full course population. Learners could audit the course for free or enroll as verified learners by paying a fee. Only verified learners obtained access to graded assignments and the final exam and were eligible for a certificate if they achieved a grade of 60% or higher.
The course consisted of five instructional modules followed by a final exam. Each module included lecture videos, practice problems, and a graded assignment. Practice problems permitted three attempts with immediate automated feedback. After either answering correctly or exhausting all attempts, learners could access detailed explanations. These problems were designed to scaffold concepts that would be assessed in subsequent graded assignments. Graded assignments contributed 10% of the final grade, while the exam contributed 90%. Survey data from previous course runs (2021–2022), implemented using Qualtrics XM (Qualtrics LLC, Provo, UT, USA), indicated that verified learners typically pursued two main goals: earning a certificate for professional purposes or achieving a high grade as a pathway toward graduate credit. A recurring theme in these surveys was anxiety about pacing and uncertainty about progress, which motivated the development of a learner-facing dashboard to scaffold self-regulated learning.
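For illustration, a minimal sketch of this grading scheme is shown below; the function and example scores are hypothetical, while the 10%/90% weighting and the 60% certificate threshold follow the course description above.

```python
# Minimal sketch of the course grading scheme; names and example scores are
# hypothetical, weights and threshold follow the course description.
ASSIGNMENT_WEIGHT = 0.10
EXAM_WEIGHT = 0.90
CERTIFICATE_THRESHOLD = 0.60

def final_grade(assignment_avg: float, exam_score: float) -> float:
    """Combine graded-assignment average and exam score (both on a 0-1 scale)."""
    return ASSIGNMENT_WEIGHT * assignment_avg + EXAM_WEIGHT * exam_score

grade = final_grade(assignment_avg=0.80, exam_score=0.65)
print(f"Final grade: {grade:.2f}, certificate earned: {grade >= CERTIFICATE_THRESHOLD}")
```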
2.2. Dashboard Design
The Learning Analytics Dashboard (LAD) was designed using the COPES model of self-regulated learning [33,34], which conceptualizes regulation as cycles of conditions, operations, products, evaluations, and standards. From this perspective, dashboards act as external feedback systems that can supplement or correct learners’ often biased self-assessments.
To contextualize the design, we conducted exploratory analyses on clickstream data from earlier runs of the course (2021–2022). These analyses were performed using Python 3.10 in Google Colab (Google LLC, Mountain View, CA, USA), relying on the pandas, numpy, and statsmodels libraries. Multiple linear regression was used to predict final grades from behavioral traces (see Table 1). Indicators were selected if they showed statistical significance in the linear model (). This process yielded three indicators: (1) the number of unique lecture videos completed; (2) the number of unique practice problems submitted; and (3) the number of practice problem solutions viewed.
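A minimal sketch of this screening step is shown below; the dataframe is a synthetic stand-in for the per-learner aggregates from the 2021–2022 runs, column names are illustrative, and the 0.05 significance cutoff is an assumption rather than a restatement of the original threshold.

```python
# Sketch of the indicator-screening regression; the dataframe is a synthetic
# stand-in for prior-run aggregates, and 0.05 is an assumed significance cutoff.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 500
traces = pd.DataFrame({
    "videos_completed": rng.integers(0, 60, n),
    "problems_submitted": rng.integers(0, 40, n),
    "solutions_viewed": rng.integers(0, 40, n),
})
traces["final_grade"] = (
    0.004 * traces["videos_completed"]
    + 0.006 * traces["problems_submitted"]
    + 0.003 * traces["solutions_viewed"]
    + rng.normal(0.3, 0.1, n)
).clip(0, 1)

# Multiple linear regression predicting the final grade from behavioral traces.
model = smf.ols(
    "final_grade ~ videos_completed + problems_submitted + solutions_viewed",
    data=traces,
).fit()

# Keep indicators whose coefficients reach statistical significance.
selected = [term for term, p in model.pvalues.items() if term != "Intercept" and p < 0.05]
print(model.summary())
print("Selected indicators:", selected)
```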
Consultations with the teaching staff underscored concerns about learners’ pacing behaviors and highlighted the critical role of structured study planning in online learning. These insights informed the inclusion of time-related features in the dashboard to scaffold effective pacing strategies. The decision was further motivated by evidence on the spacing effect, which demonstrates that learning is more effective when study sessions are distributed over time rather than massed together [35,36].
Two dashboard variants were developed to examine the role of feedback in shaping learners’ interpretation and use of these indicators. Both dashboards presented the indicators with basic visualizations and a space to display messages. The dashboard shown to Group A contained generic, static messages. The dashboard shown to Group B contained personalized and actionable feedback messages. These messages were drafted using Keller’s ARCS model [27]: Attention (capture learner interest), Relevance (connect learning to learner goals and needs), Confidence (build belief in ability to succeed), and Satisfaction (reinforce accomplishment). Each message was linked to the learner’s current progress and automatically selected from a message bank.
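The selection logic can be sketched as follows; the message texts, pacing thresholds, and data structures are hypothetical and serve only to illustrate how ARCS-framed messages could be matched to a learner’s current progress, not the deployed message bank.

```python
# Hypothetical sketch of progress-based selection from an ARCS-framed message
# bank; thresholds, texts, and structures are illustrative only.
from dataclasses import dataclass

@dataclass
class LearnerProgress:
    videos_completed: int
    videos_total: int
    problems_submitted: int
    problems_total: int

MESSAGE_BANK = {
    "behind_pace": (
        "You've watched {videos} of {videos_total} lectures so far (Attention). "
        "Catching up this week keeps your certificate goal within reach (Relevance). "
        "Learners who spread their study over the week tend to do well, and you can too (Confidence)."
    ),
    "on_pace": (
        "Great pacing! You've completed {videos} lectures and {problems} practice "
        "problems (Satisfaction). Keep your weekly rhythm going (Confidence)."
    ),
}

def select_message(progress: LearnerProgress, week: int, total_weeks: int = 14) -> str:
    """Pick a message key by comparing actual coverage to the expected pace."""
    expected = week / total_weeks
    actual = progress.videos_completed / max(progress.videos_total, 1)
    key = "on_pace" if actual >= expected else "behind_pace"
    return MESSAGE_BANK[key].format(
        videos=progress.videos_completed,
        videos_total=progress.videos_total,
        problems=progress.problems_submitted,
    )

print(select_message(LearnerProgress(10, 40, 5, 30), week=6))
```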
2.3. Experimental Design
We conducted a three-arm randomized controlled trial (RCT) to evaluate the LAD’s impact on learner outcomes. Upon enrollment, learners were randomly assigned into one of three groups: (1) Group C (Control): no dashboard; (2) Group A (LAD without feedback): dashboard with indicators and generic messages; (3) Group B (LAD with feedback): dashboard with indicators plus ARCS-framed actionable feedback. Randomization occurred automatically at the time of enrollment, without a separate consent procedure, as learners participated under the platform’s standard terms of use.
Learners in Groups A and B accessed their dashboards through an “Engagement Dashboard” button on the course landing page. To ensure consistency in the interface, control learners saw a similar button leading to a survey. Engagement with the dashboard was voluntary, with no incentives to click. As learners were randomly assigned to conditions at enrollment, analyses followed an intent-to-treat design; dashboard usage frequency was not recorded, as the focus was on the impact of dashboard availability rather than self-selected engagement.
Three outcome variables were analyzed: verification status, engagement, and performance. Verification status was a binary variable indicating whether a learner upgraded to the verified track (i.e., paid for access to graded assignments, the exam, and the option to earn a certificate). Engagement was operationalized as the total time spent in the course, computed from clickstream logs as the sum of session durations, where each session comprised consecutive events separated by less than 10 min of inactivity. Performance was measured as the final course grade, expressed on a 0–1 scale, and available only for verified learners. Verification status was analyzed using logistic regression, with Group C (no dashboard) as the reference category. Engagement and performance were analyzed using one-way ANOVAs, followed by Tukey’s HSD tests for post hoc comparisons. ANOVA was chosen to maintain comparability with prior learning analytics research and because our primary goal was to test mean differences between groups.
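A minimal sketch of the engagement-time computation is shown below; the log format and column names (learner_id, timestamp) are assumptions, while the 10 min session gap follows the definition above.

```python
# Sketch of the engagement measure: consecutive events less than 10 minutes
# apart form a session; total time is the sum of session durations per learner.
# The log format (columns `learner_id`, `timestamp`) is an assumption.
import pandas as pd

SESSION_GAP = pd.Timedelta(minutes=10)

def total_engagement_seconds(events: pd.DataFrame) -> pd.Series:
    """Total engagement time per learner, in seconds."""
    events = events.sort_values(["learner_id", "timestamp"])
    gap = events.groupby("learner_id")["timestamp"].diff()
    # A new session starts at the first event or after >= 10 minutes of inactivity.
    new_session = gap.isna() | (gap >= SESSION_GAP)
    events = events.assign(session_id=new_session.groupby(events["learner_id"]).cumsum())
    durations = events.groupby(["learner_id", "session_id"])["timestamp"].agg(
        lambda t: (t.max() - t.min()).total_seconds()
    )
    return durations.groupby("learner_id").sum()

# Tiny illustrative log: a 12 min gap splits one learner's activity into two sessions.
log = pd.DataFrame({
    "learner_id": ["u1"] * 4,
    "timestamp": pd.to_datetime(
        ["2023-05-01 10:00", "2023-05-01 10:05", "2023-05-01 10:17", "2023-05-01 10:20"]
    ),
})
print(total_engagement_seconds(log))  # 300 s + 180 s = 480 s
```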
Because the engagement data were positively skewed, assumption checks were performed on both raw and log-transformed values. These checks indicated that log transformation improved distributional properties: Levene’s test confirmed homogeneity of variances across groups for the log-transformed data (, ) but not for the raw data (, ), and although the Shapiro–Wilk tests remained significant () due to the large sample size (), normality improved substantially (mean W increased from 0.43 to 0.77). Given these improvements, and to assess robustness, the engagement analyses were conducted on both the raw and log-transformed values.
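The assumption checks can be sketched as follows; the dataframe is a synthetic stand-in with assumed column names, and log1p is used here only to handle possible zero values (SciPy warns that Shapiro–Wilk p-values are approximate for very large samples).

```python
# Sketch of the assumption checks on raw vs. log-transformed engagement;
# the dataframe is a synthetic stand-in with assumed column names.
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": np.repeat(["C", "A", "B"], 300),
    "engagement_seconds": rng.lognormal(mean=7, sigma=1.5, size=900),  # skewed, like real data
})
df["log_engagement"] = np.log1p(df["engagement_seconds"])

def check_assumptions(data: pd.DataFrame, value_col: str):
    groups = [g[value_col].to_numpy() for _, g in data.groupby("group")]
    levene_stat, levene_p = stats.levene(*groups)                       # homogeneity of variances
    mean_w = np.mean([stats.shapiro(g).statistic for g in groups])      # normality (mean W)
    return levene_stat, levene_p, mean_w

print("raw:", check_assumptions(df, "engagement_seconds"))
print("log:", check_assumptions(df, "log_engagement"))
```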
Assumption checks for performance data indicated deviations from normality (Shapiro–Wilk ), but homogeneity of variances was satisfied (Levene’s ). Given the bounded nature of grade data and the large sample size, the ANOVA was considered robust to these deviations, and analyses were conducted on raw values.
Effect sizes were reported as odds ratios (ORs) for logistic regression and eta squared (η²) for ANOVAs. Statistical significance was set at . All analyses were performed in Python 3.12 in Google Colab (Google LLC, Mountain View, CA, USA), using the pandas, numpy, statsmodels, and scipy libraries.
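A condensed sketch of the outcome analyses is shown below; the dataframe is again a synthetic stand-in with assumed column names, and the η² computation follows the standard between-over-total sum-of-squares definition.

```python
# Sketch of the outcome analyses: logistic regression for verification (Group C
# as reference), one-way ANOVA with Tukey's HSD, and ORs / eta squared as effect
# sizes. The dataframe is synthetic and its column names are assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "group": np.repeat(["C", "A", "B"], 300),
    "verified": rng.binomial(1, 0.2, size=900),
    "log_engagement": rng.normal(8, 2, size=900),
})

# Verification: logistic regression with the control group (C) as reference category.
logit = smf.logit("verified ~ C(group, Treatment(reference='C'))", data=df).fit(disp=False)
odds_ratios = np.exp(logit.params)  # effect sizes reported as odds ratios

# Engagement (or performance): one-way ANOVA followed by Tukey's HSD post hoc tests.
samples = [g["log_engagement"].to_numpy() for _, g in df.groupby("group")]
f_stat, p_value = stats.f_oneway(*samples)
tukey = pairwise_tukeyhsd(df["log_engagement"], df["group"])

# Eta squared: between-group sum of squares over total sum of squares.
grand_mean = df["log_engagement"].mean()
ss_between = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
ss_total = ((df["log_engagement"] - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total

print(odds_ratios, f_stat, p_value, eta_squared, sep="\n")
print(tukey.summary())
```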
4. Discussion
This study set out to design and evaluate a Learning Analytics Dashboard (LAD) grounded in theory and contextualized in a MOOC, with the goal of enhancing self-regulated learning (SRL) and improving learner outcomes. By combining insights from SRL theory, data-driven indicator selection, and prior critiques of LAD design, we proposed a set of design principles and tested their impact in an online course field experiment. The results highlight both the promise and the pitfalls of LADs: dashboards with actionable, ARCS-framed feedback increased learners’ likelihood of verification (paying for the option to get a certificate) and showed tentative signs of boosting engagement, while dashboards without feedback provided no measurable benefit and may have imposed additional cognitive costs. Across all conditions, no significant effects were observed on final performance.
These findings extend the ongoing debate captured by Kaliisa et al. [11], who noted that the evidence for positive effects of LADs remains mixed. Our study underscores that dashboards are not uniformly beneficial: simply visualizing learner data, without clear interpretive support, can increase inference costs and fail to motivate action. In contrast, dashboards that integrate low-inference visualizations with actionable feedback can foster motivation and persistence, leading to higher levels of commitment. In this sense, the question is not whether LADs “work” but under what design conditions they provide value.
The results also align with research on cognitive load and motivation. Explanatory, goal-oriented designs have been shown to reduce extraneous cognitive load [25], while poorly structured feedback can undermine learners’ sense of competence [26]. Our findings fit this dual mechanism: learners who received indicators without guidance saw little benefit, whereas those who received ARCS-framed feedback showed stronger commitment to the course. Feedback thus appears essential for transforming dashboard data into actionable strategies while also sustaining motivation.
A further contribution of this study concerns pacing. Building on evidence that distributed engagement predicts certification more strongly than total time-on-task [38], we incorporated features such as weekly streaks and time-on-task indicators. While these elements alone were insufficient to produce significant effects, their integration with actionable feedback may explain why the feedback condition produced more favorable outcomes. Consistent with this, Group B exhibited higher verification odds than the control group and a modest raw engagement advantage over Group A that attenuated under log transformation (see Table 5, Table 7 and Table 8); however, no grade differences were observed (Table 10). This pattern suggests that pacing supports are most effective when combined with guidance that helps learners interpret and act on them.
Taken together, our findings emphasize that the impact of LADs depends not on their mere presence but on their theoretical grounding, contextualization, and feedback design. Effective dashboards need to reduce inference costs, support motivational needs, and provide meaningful pacing scaffolds. Future research should continue to explore which combinations of features influence different learner outcomes and how learner characteristics (e.g., prior achievement, goals, self-efficacy) moderate dashboard effectiveness.
Limitations and Future Work
This study has three main limitations. First, it was conducted in a single MOOC on supply chain management, primarily targeting working professionals, which constrains the generalizability to other subjects, learner populations, or formats such as instructor-paced or non-credit-bearing courses. Second, the dashboard was evaluated as a bundled intervention, making it impossible to isolate the specific contributions of individual components, for example, whether the increase in verification rates was driven by ARCS-framed messages, pacing indicators, or their interaction. Third, this study focused on short-term outcomes within a single course; we did not examine whether exposure to dashboards fostered lasting SRL practices or longer-term academic gains.
Future work should therefore test LADs across more diverse settings, employ experimental designs that isolate the contribution of individual features, and track learners longitudinally to capture sustained impacts. In parallel, our ongoing research agenda aims to refine the dashboard through component-level A/B tests and explore the integration of agentic AI systems capable of delivering adaptive, context-aware feedback in real time. These developments, together with multi-course replications, will help clarify how actionable, personalized feedback can strengthen self-regulated learning at scale.
5. Conclusions
This study designed and evaluated a Learning Analytics Dashboard (LAD) grounded in SRL theory and contextualized in a MOOC, testing its impact in an online course field experiment. The results show that dashboards are not inherently beneficial: a LAD without actionable feedback offered no measurable advantages and in some cases was associated with negative outcomes, while a LAD with ARCS-framed feedback increased learners’ verification rates and showed tentative benefits for engagement. No differences were found in final course performance across groups.
These findings suggest that LADs are most effective when they combine low-inference visualizations with actionable, motivationally framed feedback and when they make pacing strategies explicit. For practitioners, these results caution against deploying dashboards that simply display learner data while highlighting the value of embedding interpretive support that helps learners translate indicators into concrete actions. For researchers, this study underscores the need for designs that isolate the contribution of dashboard features and for evaluations across diverse contexts to determine when and for whom LADs are effective.
Ultimately, the question is not whether LADs have lived up to the hype but under what conditions they can deliver on their promise. By grounding dashboards in theory, contextualizing them with course-specific data, and embedding feedback that supports both cognition and motivation, future work can move toward LADs that reliably foster self-regulated learning and learner success at scale.