1. Introduction
Nowadays, various devices such as personal computers, smartphones, tablets, smart watches, and smart eyeglasses are used with different software platforms and operating systems, including Windows, Linux, Android, and iOS [1,2]. As a result, demand for cross-platform development has been growing strongly in IT industries around the world [3]. Traditionally, building applications for different platforms required separate implementations in native programming languages, since each platform has its own structure and features [2]. For example, Android development typically uses Java or Kotlin, iOS uses Swift or Objective-C, and web applications are developed using HTML and JavaScript. Under such circumstances, a service vendor needs to develop and maintain separate codebases in different programming languages for each target platform [4]. This typically results in higher development costs, broader team skill requirements, and increased maintenance overhead [5].
To resolve this undesirable situation, developers around the world have started using cross-platform development frameworks that allow a single codebase to run on different devices and platforms [6]. With such a framework, a service vendor implements and updates one codebase in one language for each service, and the framework generates the runnable code for each specific device/platform, which makes application development in heterogeneous environments much easier. One of the most popular frameworks is Flutter [7].
Developed by Google, Flutter provides a solution for fast cross-platform application development without sacrificing performance or visual quality. Flutter is powered by the Dart programming language [8], also developed by Google, and offers several developer-friendly features, such as a unified codebase for Android, iOS, web, and desktop applications, a flexible widget system, and a hot reload mechanism for real-time iteration [9,10].
However, the adoption of Flutter in academic settings has been limited, although it provides many benefits for developers. Most university curricula around the world have not yet included Flutter and Dart in their programming courses. As a result, students face challenges in gaining practical experience with Flutter for application development. To address this gap, self-learning tools that offer proper guidance and hands-on practice are an efficient solution, enabling students to study Flutter/Dart effectively on their own.
Aiming to support self-directed learning, we have developed the Programming Learning Assistant System (PLAS) as a web-based exercise platform for independent programming study. PLAS provides various types of exercise problems to help students build both conceptual understanding and practical skills. We then extended PLAS to create the Flutter Programming Learning Assistant System (FPLAS) [11], which focuses on helping students practice Flutter/Dart-based application development.
FPLAS integrates multiple learning components. Its environment is set up using the Docker container platform to ensure consistent execution across systems [12]. In FPLAS, the Grammar-Concept Understanding Problem (GUP) focuses on syntax and grammar fundamentals in Dart [11], while the Code Modification Problem (CMP) allows students to practice editing existing Flutter code [13].
During the initial use of FPLAS, unfortunately, we observed an unexpected pattern among the evaluation participants. Certain students were able to answer questions correctly despite lacking a full understanding of fundamental Flutter/Dart concepts. This was particularly common among those with prior experience in JavaScript, Java, or object-oriented programming (OOP).
To provide structured support, we developed beginner-oriented documentation that introduces essential concepts with simple explanations and example code to help students understand the fundamentals of Flutter/Dart [14]. Based on this documentation, we designed a new set of problem instances called the Integrated Introductory Problem (INT) and implemented them in FPLAS. The INT problems adopt selected formats from PLAS, such as the Grammar Understanding Problem (GUP) and the Element Fill-in-Blank Problem (EFP). In addition, the INT provides interface screenshots to help students relate the code to the interface layout. The INT aims to strengthen students' conceptual understanding by linking learning contents with hands-on coding practice, thereby helping bridge the gap between theory and implementation.
An INT instance combines documentation, code editing, visual output, and interactive exercise problems into a single platform. This setup enables students to read explanations, modify the code, and see the visual screen results. The INT helps them connect abstract concepts with practical implementations, which is especially important in User Interface (UI)-focused frameworks, including Flutter.
To evaluate the proposal, we generated INT instances covering the key concepts in the developed documentation and conducted two rounds of user testing. In the first round, we prepared 16 instances and assigned them to 23 master's students at Okayama University, Japan, who had no prior experience with Flutter or Dart. The purpose of this round was to assess the system's usability and educational effectiveness. These students were generally able to complete the assignments correctly, indicating a basic understanding of the targeted programming concepts. However, the System Usability Scale (SUS) results suggested that the usability of the system could be improved [15].
Based on the feedback in the first round, we revised both the documentation and the system interface. Improvements were made to the clarity of explanations, navigation structure, and problem instructions to make the learning process smoother and more intuitive for beginners.
In the second round, the revised version of the system was evaluated with a different group of 25 fourth-year undergraduate students. The SUS scores improved, showing better user satisfaction, and the solutions remained consistently accurate. Additional feedback in this round highlighted further areas for improvement, which we used to make additional refinements to the learning environment and instructional flow.
To systematically evaluate the proposal, we address the following research questions.
RQ1: Does the integration of structured documentation, code, visual output, and INT problems improve students’ understanding of Flutter/Dart concepts?
RQ2: How does the revised system affect the system’s usability and user satisfaction, as measured by the System Usability Scale (SUS)?
The structure of the paper is outlined as follows:
Section 2 contains a review of relevant research on the approach.
Section 3 introduces the beginner-oriented documentation.
Section 4 describes the integrated learning design and the rationale for the
Integrated Introductory Problem (INT).
Section 5 presents the design and implementation of the proposed learning assistant system.
Section 6 presents the experimental results.
Section 7 discusses student feedback and system improvements.
Section 8 discusses the findings, limitations, and implications.
Section 9 concludes this paper with future work.
2. Related Works
In this section, we review existing literature associated with our research topic.
2.1. Flutter for Learning
Several studies have explored the use of Flutter in educational contexts, highlighting its technical capabilities, architectural clarity, and suitability for cross-platform development.
Nagaraj et al. [16] emphasized Flutter's widget system and hot reload functionality, which support fast iteration and interactive learning. Zou et al. [17] focused on developer experiences, noting that asynchronous behavior, animation memory management, and network image handling often confuse novice learners. These studies point to the need for guided scaffolding when introducing Flutter to beginners.
Boukhary et al. [18] proposed a clean architecture model in Flutter to separate UI components from business logic, aiming for maintainable and scalable educational projects. Radadiya et al. [19] further elaborated on Flutter's rendering pipeline, widget layering, and integration with IDEs, which are relevant when teaching structured UI development. The architectural overview in [20] offers a systematic explanation of Flutter's framework and engine layers that can serve as instructional material for foundational learning.
Sharma et al. [21] demonstrated how Flutter enables students to build mobile applications for multiple platforms without maintaining separate codebases. This efficiency makes it an attractive option for introductory courses with limited time or resources.
While these works demonstrate Flutter’s educational potential, most of them focus on developer experiences or advanced use cases. They often assume prior programming knowledge and do not address the conceptual difficulties faced by beginners in transitioning from imperative programming styles to declarative UI paradigms.
In contrast, our proposed system focuses on supporting beginners by emphasizing fundamental concepts such as widget hierarchy, Dart syntax, reusable widget components, and code structure patterns. The learning materials are carefully aligned with these core elements to provide a structured and progressive learning path that mirrors how Flutter is introduced in academic settings.
2.2. Programming Educational Platforms
In addition to studies that focus on the educational use of Flutter, various programming learning platforms have been developed to support structured instruction, automated code assessment, and learner feedback across different contexts.
Wang et al. [22] introduced CodingHere, a browser-based programming system that emphasizes automatic code assessment with error-specific feedback. Their platform enhances student engagement by providing flexible code checking and diagnostic messages tailored to common mistakes.
Hinrichs et al. [23] developed OPPSEE, an online platform equipped with full IDE support and automated feedback. Built on a microservice architecture and integrated with Gitpod, it supports multi-file assignments and real-time collaborative programming, offering flexibility for instructors and learners.
To better understand the landscape of such platforms, Combéfis et al. [24] conducted a broad classification and comparison of automated code assessment tools. Their analysis identified dimensions such as feedback granularity, language coverage, and analysis methods, providing a taxonomy useful for educational tool designers.
Other studies have emphasized conceptual understanding. Jevtić et al. [25] employed Self-Organizing Maps (SOMs) to analyze the density and distribution of programming concepts in instructional material. This method offers visualization-based feedback on curriculum design and concept progression.
A different approach was proposed by Hsueh et al. [26], who developed Pytutor, a platform that evaluates both student code and the quality of generated test cases. Their results suggest that students with stronger testing habits achieve better conceptual outcomes, highlighting the value of integrating test-driven learning.
Emerging directions explore accessibility and mobile learning. Hasan et al. [27] introduced a mobile-focused evaluation platform that supports app development in Java and Kotlin. Their system uses test case matching and automated result comparison to offer real-time feedback on student submissions, aligning well with mobile development scenarios.
Togni et al. [28] proposed an early-stage system that integrates real-time speech recognition, YOLOv5-based object detection, and a cross-platform Flutter interface. Although this work is currently available as a preprint and under peer review, it reflects the growing interest in adaptive, Artificial Intelligence (AI)-driven educational tools for inclusive mobile learning environments.
While these systems offer a variety of functionalities, from automatic grading and feedback to visual analytics and mobile support, many of them focus on general-purpose programming or system scalability. Their instructional designs, although effective, are not always tailored to the specific challenges of UI-focused, declarative frameworks such as Flutter.
In contrast to these systems, the proposed INT format integrates documentation, code, output images, and interactive exercises into a unified instructional instance. This design draws inspiration from the pedagogical principles of worked examples [29], while extending them through syntax-level scaffolding (GUP) and immediate application via fill-in-the-blank programming tasks (EFP).
Unlike conventional worked examples, which often present complete solutions for passive observation, the INT format promotes active engagement and progressive concept acquisition by guiding learners from recognition to production. It embodies principles of scaffolded learning and just-in-time instruction, helping students build both conceptual and procedural knowledge with reduced cognitive load.
4. Integrated Learning Design for Flutter Self-Study
In this section, we present a new learning format called the Integrated Introductory Problem (INT). This format builds on the documentation introduced in Section 3 and combines conceptual guidance, source code, an interface visual output, and interactive exercises into a unified format. The goal of the INT is to provide a structured way for students to move from reading to practice while reinforcing key programming principles.
4.1. Rationale for Improvement
Our previous FPLAS includes the Code Modification Problem (CMP) and the Code Writing Problem (CWP). These formats provide mobile application screenshots and problem instructions that guide students in completing the given programming challenges.
Although many students successfully solved the problems, we observed that students with prior experience in OOP languages such as JavaScript or Java solved them by relying on existing coding habits. This suggests that the previous approach in FPLAS might allow them to answer questions without fully understanding how Flutter/Dart structures programs and manages user interfaces. As a result, they often showed a limited grasp of the underlying concepts and faced difficulties when tackling more complex and practical programming exercises.
4.2. Overview of INT Framework
To better connect conceptual learning with hands-on coding exercises, in this paper, we design the Integrated Introductory Problem (INT) format. Each INT instance integrates the following four elements:
Beginner-friendly explanations or summaries based on the guide documentation;
Source code that directly reflects the explained concept;
A screenshot of the output screen that visually represents the expected app output;
Two interactive task types:
- Grammar Understanding Problem (GUP): questions designed to test understanding of structure, logic, and key terms in Flutter/Dart;
- Element Fill-in-Blank Problem (EFP): partial code with blanks that students must complete to match the expected code.
This INT format connects written explanations with code behavior and visual outcomes, encouraging students to understand both how the code works and why it behaves that way. While the GUP and EFP offer valuable hands-on experience, the INT format is designed to further support conceptual understanding by explicitly linking the documentation contents, the source code, and the resulting interfaces. Rather than focusing only on implementation, this alignment helps learners form a clearer connection between what they read, what they write, and what they observe on the screen.
4.3. Curriculum Design and Topic Selection
The 16 topics listed in Table 1 were derived from the 11 foundational topics covered in the beginner-oriented documentation introduced in Section 3. These topics were based on difficulties observed in previous FPLAS exercises, such as Code Modification Problems (CMPs) and Code Writing Problems (CWPs), where students struggled to understand Flutter's declarative structure and widget hierarchy. To address these issues, the 11 fundamental topics were expanded and reorganized into 16 finer-grained sections. This restructuring enables a step-by-step learning experience, helping students build conceptual understanding before moving on to more advanced topics.
Each topic was then implemented as an INT instance that integrates the documentation, visual outputs, and interactive exercises. The sequence starts with the basic structural concepts of a Flutter/Dart application and progressively moves to more advanced UI behaviors and state management. Table 1 summarizes the topics, the key concepts, and their pedagogical roles in the system.
The order of these topics was carefully designed to reflect the dependencies among the concepts. Topics such as state changes and widget rebuilding are placed later in the sequence, after learners have gained familiarity with widget structure and composition. This sequencing maintains a logical flow of learning and prevents students from being overwhelmed by advanced behaviors before mastering the foundational syntax and patterns.
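For instance, the later topics on state changes and widget rebuilding involve patterns like the following counter sketch (a generic Flutter example of ours, not code taken from the system's materials), where calling setState() triggers a rebuild of the widget with the updated value:

```dart
import 'package:flutter/material.dart';

// A minimal StatefulWidget: pressing the button calls setState(),
// which asks the framework to rebuild this widget's subtree.
class CounterPage extends StatefulWidget {
  const CounterPage({super.key});

  @override
  State<CounterPage> createState() => _CounterPageState();
}

class _CounterPageState extends State<CounterPage> {
  int _count = 0;

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: Center(child: Text('Count: $_count')),
      floatingActionButton: FloatingActionButton(
        onPressed: () => setState(() => _count++),
        child: const Icon(Icons.add),
      ),
    );
  }
}
```

Learners who first master stateless widget composition can then see that the only new ingredients here are the mutable field and the setState() call, which is why this topic is sequenced later.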
By combining structured guidance, example codes, visual results, and targeted questions, the INT format enables students to follow a consistent learning iteration that reinforces understanding through repeated application. This integrated design offers a more effective alternative to traditional code-only programming tasks.
5. Integrated Introductory Problem for Flutter/Dart Programming
In this section, we present how the Integrated Introductory Problem (INT) is implemented within the PLAS for self-learning of Flutter/Dart. Each INT instance is designed to combine conceptual explanation, code analysis, and structured practice problems in a cohesive and accessible format.
5.1. System Architecture and Workflow
Each INT instance integrates two types of exercise problems in PLAS: the Grammar Understanding Problem (GUP) and the Element Fill-in-Blank Problem (EFP). The GUP focuses on conceptual recognition by asking learners to identify key elements such as keywords, widget names, or method names in given code samples. The EFP, on the other hand, requires students to complete partially hidden code based on both the surrounding syntax and the intended output. To solve an INT problem, students work through all of these components within a single interface.
This workflow combines different forms of learning, such as reading explanations, writing code, and observing the resulting application interface, into a single interactive process. Although students may rely on trial and error at times, the system encourages them to examine the code carefully and refer to the documentation when necessary.
5.2. Procedure for Creating INT Instances
Each INT instance can be generated through the following structured authoring process:
A complete Flutter/Dart source code is first prepared, targeting one of the selected instructional topics.
The source code is executed, and the screenshots of the rendered mobile application interface are captured.
Specific keywords, expressions, or syntax elements are manually selected to be blanked out for the GUP and EFP formats.
These blanks and their positions are recorded in the configuration file, which is then processed by a custom Python script to generate the problem set.
The script produces a self-contained interface using HTML, CSS, and JavaScript, embedding the source code with blanks, input fields, and screenshots in a layout suitable for web-based use.
Since the current blank selection process is manual, this step offers flexibility in targeting specific learning objectives. In each INT instance, the blanked elements in the EFP were chosen to match a keyword, a method, or a grammar element introduced in the preceding Grammar Understanding Problem (GUP). Future work will explore automating this step using rule-based or context-aware blanking strategies.
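Although the generator in our system is implemented as a Python script, the core substitution step of the authoring process can be sketched in Dart for consistency with the rest of the paper's code (an illustrative snippet of ours, not the system's actual implementation): a regular expression finds each double-dollar marker, records the hidden answer, and replaces the marker with an HTML input field.

```dart
// Illustrative sketch: turn "$main$"-style markers into HTML input
// fields and collect the hidden answers. (The real tool is in Python.)
Map<String, dynamic> parseBlanks(String code) {
  final answers = <String>[];
  final html = code.replaceAllMapped(RegExp(r'\$(\w+)\$'), (m) {
    answers.add(m.group(1)!);
    return '<input id="blank${answers.length}" size="${m.group(1)!.length}">';
  });
  return {'html': html, 'answers': answers};
}

void main() {
  final result = parseBlanks('void \$main\$() => runApp(const MyApp());');
  print(result['html']);    // void <input id="blank1" size="4">() => runApp(const MyApp());
  print(result['answers']); // [main]
}
```

The collected answers would then be hashed and embedded alongside the generated HTML, while the input fields take the place of the blanked tokens.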
5.3. Example Instance: Topic #2 – Flutter Code Structure
An example instance is shown in Listing 1, where students are introduced to the basic structure of a Flutter/Dart application code. This code demonstrates the use of a stateless widget class and how the application starts using the main() function. Its related GUP questions ask students to identify class names, override annotations, and lifecycle methods.
Listing 1. Source code for INT instance on project structure.
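As a point of reference, a minimal Flutter/Dart program with the structure this instance targets (a main() entry point and a stateless widget class) can be sketched as follows; this is an illustrative example of ours, not necessarily the exact code of Listing 1:

```dart
import 'package:flutter/material.dart';

// Entry point: main() hands the root widget to the framework.
void main() => runApp(const MyApp());

// A stateless widget: its build() method describes the UI declaratively.
class MyApp extends StatelessWidget {
  const MyApp({super.key});

  @override
  Widget build(BuildContext context) {
    return const MaterialApp(
      home: Scaffold(
        body: Center(child: Text('Hello, Flutter')),
      ),
    );
  }
}
```

In such a program, the class name, the @override annotation, and the build() method are exactly the kinds of elements the GUP questions target.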
Table 2 shows sample GUP questions derived from this code. These questions were manually authored by the researchers to reflect common keyword-level misunderstandings encountered in early-stage Flutter learning. No automatic question generation was used in this version.
The corresponding EFP input file (Listing 2) replaces key elements with double-dollar markers such as $main$. These are parsed into input fields.
Listing 2. Input code with blanks for EFP format.
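To illustrate the marker format (again a hypothetical sketch of ours, not the exact content of Listing 2), the EFP counterpart of a minimal program could wrap the hidden tokens in double-dollar markers like so; the file is intentionally non-compilable, since the markers are replaced by input fields at rendering time:

```dart
import 'package:flutter/material.dart';

// Tokens wrapped in $...$ become input fields in the web interface.
void $main$() => $runApp$(const MyApp());

class MyApp extends $StatelessWidget$ {
  const MyApp({super.key});

  @override
  Widget $build$(BuildContext context) {
    return const MaterialApp(
      home: Scaffold(body: Center(child: Text('Hello, Flutter'))),
    );
  }
}
```

Each blanked token here corresponds to a keyword or method introduced in the preceding GUP, following the pairing principle described in Section 5.2.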
5.4. Interface Design and Scoring Mechanism
As shown in Figure 1, the INT web answer interface is divided into two panels: the left panel displays the source code with input fields, and the right panel shows the expected visual output. Students enter their answers directly into the blanks and click a button to receive feedback.
The validation is performed by a client-side JavaScript function that checks each input against the correct answer using exact string matching. A correct answer turns the input field green, while an incorrect one turns it red. Students may revise and resubmit their answers as many times as needed.
To prevent access to the correct answers, all answers are hashed using SHA-256 rather than stored in plain text. This enables both online and offline use while preserving fairness in open-access environments.
8. Discussion
In this section, we analyze the evaluation results to assess the effectiveness, usability, and learning support of the proposed system, including the impact of the feedback mechanism.
8.1. Interpretation and Answers to Research Questions
We examine whether the integration of documentation, source code, visual output, and INT problems improved conceptual understanding and system usability, based on the results of the two evaluation rounds.
For RQ1, the high average correct answer rates observed in both evaluations suggest that the students were able to understand and apply Flutter/Dart programming concepts effectively using the proposed system. In the first round, 17 out of 23 students achieved perfect scores, and in the second round, 21 out of 25 students did so. The consistent correctness, even after the modifications, indicates that the integration of documentation and interactive elements helped reinforce students' conceptual learning.
For RQ2, the average SUS score increased from 61.67 in the first round to 71.56 in the second round. To verify whether this difference was statistically significant, we conducted an independent-samples t-test, which yielded a p-value of 0.035, below the 0.05 threshold. This indicates a statistically significant improvement in system usability. The effect size (Cohen's d) suggests a medium-to-large effect. These findings imply that modifications such as an improved documentation structure, clearer instructions, and interface enhancements contributed meaningfully to a more satisfactory and intuitive learning experience.
These findings demonstrate that the proposed learning system effectively supports both the learning and usability goals.
8.2. Limitations
Our study in this paper has several limitations that should be acknowledged.
First, the problem instance creation relied on manual authoring decisions, although the topics and blank selections were carefully designed following the documentation contents and the grammar-focused principles of the GUP format. Future work should focus on standardizing the blank selection process to improve accuracy, fairness, and scalability. For example, we will define rule-based criteria tied to syntax type, widget category, or difficulty level so that instance generation becomes more consistent and automated across problems.
Second, the evaluation was conducted with students sharing a similar background, and the number of participants and the institutional scope were limited. The participants had prior experience in object-oriented programming (OOP) (e.g., Java, JavaScript, C++, or Python) but no prior knowledge of Flutter/Dart, which matches the system's intended learner profile. However, the use of two distinct participant groups, one consisting of master's students and the other of fourth-year undergraduate students, may introduce variability due to differences in academic levels, motivations, or prior exposure to related concepts. Future evaluations with more controlled or homogeneous participant groups will be necessary to improve the validity and comparability of the results.
Third, the evaluation was conducted in a short-term setting, involving detailed analysis and active feedback collection from the participants. Long-term retention and knowledge transfer beyond the immediate learning sessions were not evaluated. Evaluating these aspects remains an important direction for future work.
Fourth, this study did not incorporate pre/post testing or baseline comparison groups, such as documentation-only or conventional-material conditions. As an initial exploratory deployment to students in real educational settings with formal classes, our evaluation prioritized practical usage and feedback collection over a formal measurement of learning gains.
While the high correct answer rates indicate that students successfully completed the tasks, they may not fully capture deeper conceptual understanding or transferability. Without follow-up assessments, it remains possible that some students relied on answer pattern recognition rather than genuine comprehension of the concepts.
8.3. Implications and Future Works
The results of this study offer several implications for the design of programming learning environments, particularly for introductory Flutter/Dart programming. The proposed integration of beginner-oriented documentation with the two exercise problem formats in the INT demonstrated that structured, context-aware guidance can significantly enhance understanding of introductory Flutter/Dart programming, especially in independent study environments.
The combination of source code, a visual output screenshot, and an interactive solving process in the INT format helped students connect abstract syntax with concrete interface behavior. This approach is particularly valuable when learning UI-focused frameworks such as Flutter. Furthermore, the grammar-focused learning design of the GUP and the structured completion tasks of the EFP proved effective in bridging the gap between students' prior object-oriented programming (OOP) experience and Flutter's declarative model.
Based on these findings, several directions are planned for future development. First, we aim to expand the set of INT problems to cover more advanced Flutter/Dart topics, such as state management patterns and widget composition. Alongside this, we will explore automating the blank selection process for creating new INT instances using a rule-based method, reducing the manual workload and ensuring consistency in problem generation.
Second, we plan to enhance the system's adaptability by incorporating features that respond to individual learning progress or error patterns. In addition, we will explore developing a mobile application version of the system using Flutter/Dart to promote greater accessibility and enable students to engage in Flutter/Dart learning anytime and anywhere.
Furthermore, we intend to adopt more rigorous evaluation designs in future studies, including the use of pre/post testing and baseline comparison groups, to more accurately assess conceptual learning gains and further validate the educational effectiveness of the proposed system.
In particular, we aim to include methods for assessing conceptual transfer, such as follow-up tasks of writing new code or explaining learned concepts. To improve the internal validity of the evaluations, more homogeneous participant groups (e.g., students of the same academic level and background) will be considered to reduce potential confounding factors when comparing different evaluation rounds.