1. Introduction
Nowadays, various devices such as personal computers, smartphones, tablets, smart watches, and smart eyeglasses are used with different software platforms and operating systems, including Windows, Linux, Android, and iOS [1,2]. As a result, demand for cross-platform development has been growing strongly in IT industries around the world [3]. Traditionally, building applications for different platforms required separate implementations in native programming languages, since each platform has its own structure and features [2]. For example, Android development typically uses Java or Kotlin, iOS uses Swift or Objective-C, and web applications are developed using HTML and JavaScript. Under such circumstances, a service vendor needs to develop and maintain separate codebases in different programming languages for each target platform [4]. This typically results in higher development costs, broader team skill requirements, and increased maintenance overhead [5].
To resolve this undesirable situation, developers around the world have started using cross-platform development frameworks that allow a single codebase to run on different devices and platforms [6]. With such a framework, a service vendor implements and updates one codebase in one language for each service, and the framework generates the runnable code for each specific device/platform, which makes application development in heterogeneous environments much easier. One of the most popular frameworks is Flutter [7].
Developed by Google, Flutter provides a solution for fast cross-platform application development without sacrificing performance or visual quality. Flutter is powered by the Dart programming language [8], also developed by Google, and offers several developer-friendly features, such as a unified codebase for Android, iOS, web, and desktop applications, a flexible widget system, and a hot reload mechanism for real-time iteration [9,10].
However, the adoption of Flutter in academic settings has been limited, although it provides many benefits for developers. Most university curricula around the world have not yet included Flutter and Dart in their programming courses. As a result, students face challenges in gaining practical experience with Flutter for application development. To address this gap, self-learning tools that offer proper guidance and hands-on practice are an efficient solution, enabling students to study Flutter/Dart effectively on their own.
Aiming to support self-directed learning, we have developed the Programming Learning Assistant System (PLAS) as a web-based exercise platform for independent programming study. PLAS provides various types of exercise problems to help students build both conceptual understanding and practical skills. We then extended PLAS to create the Flutter Programming Learning Assistant System (FPLAS) [11], which focuses on helping students practice Flutter/Dart-based application development.
FPLAS integrates multiple learning components. Its environment is set up using the Docker container platform to ensure consistent execution across systems [12]. In FPLAS, the Grammar-Concept Understanding Problem (GUP) focuses on syntax and grammar fundamentals in Dart [11], while the Code Modification Problem (CMP) allows students to practice editing existing Flutter code [13].
During the initial use of FPLAS, unfortunately, we observed an unexpected pattern among the evaluation participants. Certain students were able to answer questions correctly despite lacking a full understanding of fundamental Flutter/Dart concepts. This was particularly common among those with prior experience in JavaScript, Java, or object-oriented programming (OOP).
To provide structured support, we developed beginner-oriented documentation that introduces essential concepts with simple explanations and example code to help students understand the fundamentals of Flutter/Dart [14]. Based on this documentation, we designed a new set of problem instances called the Integrated Introductory Problem (INT) and implemented them in FPLAS. The INT problems adopt selected formats from PLAS, such as the Grammar Understanding Problem (GUP) and the Element Fill-in-Blank Problem (EFP). In addition, the INT provides interface screenshots to help students relate the code to the interface layout. The INT aims to strengthen students' conceptual understanding by linking learning contents with hands-on coding practice, thereby helping bridge the gap between theory and implementation.
An INT instance combines documentation, code editing, visual output, and interactive exercise problems into a single platform. This setup enables students to read explanations, modify the code, and see the visual screen results. The INT helps them connect abstract concepts with practical implementations, which is especially important in User Interface (UI)-focused frameworks, including Flutter.
To evaluate the proposal, we generated INT instances covering the key concepts in the developed documentation and conducted two rounds of user testing. In the first round, we prepared 16 instances and assigned them to 23 master's students at Okayama University, Japan, who had no prior experience with Flutter or Dart. The purpose of this round was to assess the system's usability and educational effectiveness. These students were generally able to complete the assignments correctly, indicating a basic understanding of the targeted programming concepts. However, the System Usability Scale (SUS) results suggested that the usability of the system could be improved [15].
Based on the feedback in the first round, we revised both the documentation and the system interface. Improvements were made to the clarity of explanations, navigation structure, and problem instructions to make the learning process smoother and more intuitive for beginners.
In the second round, the revised version of the system was evaluated with a different group of 25 fourth-year undergraduate students. The SUS scores improved, showing better user satisfaction, and the solutions remained consistently accurate. Additional feedback in this round highlighted further areas for improvement, which we used to make additional refinements to the learning environment and instructional flow.
To systematically evaluate the proposal, we address the following research questions.
RQ1: Does the integration of structured documentation, code, visual output, and INT problems improve students’ understanding of Flutter/Dart concepts?
RQ2: How does the revised system affect the system’s usability and user satisfaction, as measured by the System Usability Scale (SUS)?
The structure of the paper is outlined as follows:
Section 2 contains a review of relevant research on the approach.
Section 3 introduces the beginner-oriented documentation.
Section 4 describes the integrated learning design and the rationale for the
Integrated Introductory Problem (INT).
Section 5 presents the design and implementation of the proposed learning assistant system.
Section 6 presents the experimental results.
Section 7 discusses student feedback and system improvements.
Section 8 discusses the findings, limitations, and implications.
Section 9 concludes this paper with future work.
2. Related Works
In this section, we review existing literature associated with our research topic.
2.1. Flutter for Learning
Several studies have explored the use of Flutter in educational contexts, highlighting its technical capabilities, architectural clarity, and suitability for cross-platform development.
Nagaraj et al. [16] emphasized Flutter's widget system and hot reload functionality, which support fast iteration and interactive learning. Zou et al. [17] focused on developer experiences, noting that asynchronous behavior, animation memory management, and network image handling often confuse novice learners. These studies point to the need for guided scaffolding when introducing Flutter to beginners.
Boukhary et al. [18] proposed a clean architecture model in Flutter to separate UI components from business logic, aiming for maintainable and scalable educational projects. Radadiya et al. [19] further elaborated on Flutter's rendering pipeline, widget layering, and integration with IDEs, which are relevant when teaching structured UI development. The architectural overview in [20] offers a systematic explanation of Flutter's framework and engine layers that can serve as instructional material for foundational learning.
Sharma et al. [21] demonstrated how Flutter enables students to build mobile applications for multiple platforms without maintaining separate codebases. This efficiency makes it an attractive option for introductory courses with limited time or resources.
While these works demonstrate Flutter’s educational potential, most of them focus on developer experiences or advanced use cases. They often assume prior programming knowledge and do not address the conceptual difficulties faced by beginners in transitioning from imperative programming styles to declarative UI paradigms.
In contrast, our proposed system focuses on supporting beginners by emphasizing fundamental concepts such as widget hierarchy, Dart syntax, reusable widget components, and code structure patterns. The learning materials are carefully aligned with these core elements to provide a structured and progressive learning path that mirrors how Flutter is introduced in academic settings.
2.2. Programming Educational Platforms
In addition to studies that focus on the educational use of Flutter, various programming learning platforms have been developed to support structured instruction, automated code assessment, and learner feedback across different contexts.
Wang et al. [22] introduced CodingHere, a browser-based programming system that emphasizes automatic code assessment with error-specific feedback. Their platform enhances student engagement by providing flexible code checking and diagnostic messages tailored to common mistakes.
Hinrichs et al. [23] developed OPPSEE, an online platform equipped with full IDE support and automated feedback. Built on a microservice architecture and integrated with Gitpod, it supports multi-file assignments and real-time collaborative programming, offering flexibility for instructors and learners.
To better understand the landscape of such platforms, Combéfis et al. [24] conducted a broad classification and comparison of automated code assessment tools. Their analysis identified dimensions such as feedback granularity, language coverage, and analysis methods, providing a taxonomy useful for educational tool designers.
Other studies have emphasized conceptual understanding. Jevtić et al. [25] employed Self-Organizing Maps (SOMs) to analyze the density and distribution of programming concepts in instructional material. This method offers visualization-based feedback on curriculum design and concept progression.
A different approach was proposed by Hsueh et al. [26], who developed Pytutor, a platform that evaluates both student code and the quality of generated test cases. Their results suggest that students with stronger testing habits achieve better conceptual outcomes, highlighting the value of integrating test-driven learning.
Emerging directions explore accessibility and mobile learning. Hasan et al. [27] introduced a mobile-focused evaluation platform that supports app development in Java and Kotlin. Their system uses test case matching and automated result comparison to offer real-time feedback on student submissions, aligning well with mobile development scenarios.
Togni et al. [28] proposed an early-stage system that integrates real-time speech recognition, YOLOv5-based object detection, and a cross-platform Flutter interface. Although this work is currently available as a preprint and under peer review, it reflects the growing interest in adaptive, Artificial Intelligence (AI)-driven educational tools for inclusive mobile learning environments.
While these systems offer a variety of functionalities, from automatic grading and feedback to visual analytics and mobile support, many of them focus on general-purpose programming or system scalability. Their instructional designs, although effective, are not always tailored to the specific challenges of UI-focused, declarative frameworks such as Flutter.
In contrast to these systems, the proposed INT format integrates documentation, code, output images, and interactive exercises into a unified instructional instance. This design draws inspiration from the pedagogical principles of worked examples [29], while extending them through syntax-level scaffolding (GUP) and immediate application via fill-in-the-blank programming tasks (EFP).
Unlike conventional worked examples, which often present complete solutions for passive observation, the INT format promotes active engagement and progressive concept acquisition by guiding learners from recognition to production. It embodies principles of scaffolded learning and just-in-time instruction, helping students build both conceptual and procedural knowledge with reduced cognitive load.
4. Integrated Learning Design for Flutter Self-Study
In this section, we present a new learning format called the Integrated Introductory Problem (INT). This format builds on the documentation introduced in Section 3 and combines conceptual guidance, source code, an interface visual output, and interactive exercises into a unified format. The goal of the INT is to provide a structured way for students to move from reading to practice while reinforcing key programming principles.
4.1. Rationale for Improvement
Our previous FPLAS includes the Code Modification Problem (CMP) and the Code Writing Problem (CWP). These formats provide mobile application screenshots and problem instructions that guide students in completing the given programming challenges.
Although many students successfully solved the problems, we observed that students with prior experience in OOP languages such as JavaScript or Java solved them by relying on existing coding habits. This suggests that the previous approach in FPLAS might allow them to answer questions without fully understanding how Flutter/Dart structures programs and manages user interfaces. As a result, they often showed a limited grasp of the underlying concepts and faced difficulties when tackling more complex and practical programming exercises.
4.2. Overview of INT Framework
To better connect conceptual learning with hands-on coding exercises, in this paper, we design the Integrated Introductory Problem (INT) format. Each INT instance integrates the following four elements:
Beginner-friendly explanations or summaries based on the guide documentation;
Source code that directly reflects the explained concept;
A screenshot of the output screen that visually represents the expected app output;
Two interactive task types:
- Grammar Understanding Problem (GUP): questions designed to test understanding of structure, logic, and key terms in Flutter/Dart;
- Element Fill-in-Blank Problem (EFP): partial code with blanks that students must complete to match the expected code.
This INT format connects written explanations with code behavior and visual outcomes, encouraging students to understand both how the code works and why it behaves that way. While the GUP and EFP offer valuable hands-on experience, the INT format is designed to further support conceptual understanding by explicitly linking the documentation contents, the source code, and the resulting interfaces. Rather than focusing only on implementation, this alignment helps learners form a clearer connection between what they read, what they write, and what they observe on the screen.
4.3. Curriculum Design and Topic Selection
The 16 topics listed in Table 1 were derived from the 11 foundational topics covered in the beginner-oriented documentation introduced in Section 3. These topics were based on difficulties observed in previous FPLAS exercises, such as Code Modification Problems (CMPs) and Code Writing Problems (CWPs), where students struggled to understand Flutter's declarative structure and widget hierarchy. To address these issues, the 11 fundamental topics were expanded and reorganized into 16 finer-grained sections. This restructuring enables a step-by-step learning experience, helping students build conceptual understanding before moving on to more advanced topics.
Each topic was then implemented as an INT instance that integrates the documentation, visual outputs, and interactive exercises. The sequence starts with the basic structural concepts of a Flutter/Dart application and progressively moves to more advanced UI behaviors and state management. Table 1 summarizes the topics, the key concepts, and their pedagogical roles in the system.
The order of these topics was carefully designed to reflect the dependencies among the concepts. Topics such as state changes and widget rebuilding are placed later in the sequence, after learners have gained familiarity with widget structure and composition. This sequencing maintains a logical flow of learning and prevents students from being overwhelmed by advanced behaviors before mastering the foundational syntax and patterns.
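For instance, the later topics on state changes and widget rebuilding involve patterns like the following counter sketch (a generic Flutter example of ours, not code taken from the system's materials), where calling setState() triggers a rebuild of the widget with the updated value:

```dart
import 'package:flutter/material.dart';

// A minimal StatefulWidget: pressing the button calls setState(),
// which asks the framework to rebuild this widget's subtree.
class CounterPage extends StatefulWidget {
  const CounterPage({super.key});

  @override
  State<CounterPage> createState() => _CounterPageState();
}

class _CounterPageState extends State<CounterPage> {
  int _count = 0;

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      body: Center(child: Text('Count: $_count')),
      floatingActionButton: FloatingActionButton(
        onPressed: () => setState(() => _count++),
        child: const Icon(Icons.add),
      ),
    );
  }
}
```

Learners who first master stateless widget composition can then see that the only new ingredients here are the mutable field and the setState() call, which is why this topic is sequenced later.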
By combining structured guidance, example codes, visual results, and targeted questions, the INT format enables students to follow a consistent learning iteration that reinforces understanding through repeated application. This integrated design offers a more effective alternative to traditional code-only programming tasks.
5. Integrated Introductory Problem for Flutter/Dart Programming
In this section, we present how the Integrated Introductory Problem (INT) is implemented within the PLAS for self-learning of Flutter/Dart. Each INT instance is designed to combine conceptual explanation, code analysis, and structured practice problems in a cohesive and accessible format.
5.1. System Architecture and Workflow
Each INT instance integrates two types of exercise problems in PLAS: the Grammar Understanding Problem (GUP) and the Element Fill-in-Blank Problem (EFP). The GUP focuses on conceptual recognition by asking learners to identify key elements such as keywords, widget names, or method names in given code samples. The EFP, on the other hand, requires students to complete partially hidden code based on both the surrounding syntax and the intended output. To solve an INT problem, students work through all of these components within a single interface.
This workflow combines different forms of learning, such as reading explanations, writing code, and observing the resulting application interface, into a single interactive process. Although students may rely on trial and error at times, the system encourages them to examine the code carefully and refer to the documentation when necessary.
5.2. Procedure for Creating INT Instances
Each INT instance can be generated through the following structured authoring process:
A complete Flutter/Dart source code is first prepared, targeting one of the selected instructional topics.
The source code is executed, and the screenshots of the rendered mobile application interface are captured.
Specific keywords, expressions, or syntax elements are manually selected to be blanked out for the GUP and EFP formats.
These blanks and their positions are recorded in the configuration file, which is then processed by a custom Python script to generate the problem set.
The script produces a self-contained interface using HTML, CSS, and JavaScript, embedding the source code with blanks, input fields, and screenshots in a layout suitable for web-based use.
Since the current blank selection process is manual, this step offers flexibility in targeting specific learning objectives. In each INT instance, the blanked elements in the EFP were chosen to match a keyword, a method, or a grammar element introduced in the preceding Grammar Understanding Problem (GUP). Future work will explore automating this step using rule-based or context-aware blanking strategies.
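Although the generator in our system is implemented as a Python script, the core substitution step of the authoring process can be sketched in Dart for consistency with the rest of the paper's code (an illustrative snippet of ours, not the system's actual implementation): a regular expression finds each double-dollar marker, records the hidden answer, and replaces the marker with an HTML input field.

```dart
// Illustrative sketch: turn "$main$"-style markers into HTML input
// fields and collect the hidden answers. (The real tool is in Python.)
Map<String, dynamic> parseBlanks(String code) {
  final answers = <String>[];
  final html = code.replaceAllMapped(RegExp(r'\$(\w+)\$'), (m) {
    answers.add(m.group(1)!);
    return '<input id="blank${answers.length}" size="${m.group(1)!.length}">';
  });
  return {'html': html, 'answers': answers};
}

void main() {
  final result = parseBlanks('void \$main\$() => runApp(const MyApp());');
  print(result['html']);    // void <input id="blank1" size="4">() => runApp(const MyApp());
  print(result['answers']); // [main]
}
```

The collected answers would then be hashed and embedded alongside the generated HTML, while the input fields take the place of the blanked tokens.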
5.3. Example Instance: Topic #2 – Flutter Code Structure
An example instance is shown in Listing 1, where students are introduced to the basic structure of a Flutter/Dart application code. This code demonstrates the use of a stateless widget class and how the application starts using the main() function. Its related GUP questions ask students to identify class names, override annotations, and lifecycle methods.
Listing 1. Source code for INT instance on project structure.
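As a point of reference, a minimal Flutter/Dart program with the structure this instance targets (a main() entry point and a stateless widget class) can be sketched as follows; this is an illustrative example of ours, not necessarily the exact code of Listing 1:

```dart
import 'package:flutter/material.dart';

// Entry point: main() hands the root widget to the framework.
void main() => runApp(const MyApp());

// A stateless widget: its build() method describes the UI declaratively.
class MyApp extends StatelessWidget {
  const MyApp({super.key});

  @override
  Widget build(BuildContext context) {
    return const MaterialApp(
      home: Scaffold(
        body: Center(child: Text('Hello, Flutter')),
      ),
    );
  }
}
```

In such a program, the class name, the @override annotation, and the build() method are exactly the kinds of elements the GUP questions target.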
Table 2 shows sample GUP questions derived from this code. These questions were manually authored by the researchers to reflect common keyword-level misunderstandings encountered in early-stage Flutter learning. No automatic question generation was used in this version.
The corresponding EFP input file (Listing 2) replaces key elements with double-dollar markers such as $main$. These are parsed into input fields.
Listing 2. Input code with blanks for EFP format.
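To illustrate the marker format (again a hypothetical sketch of ours, not the exact content of Listing 2), the EFP counterpart of a minimal program could wrap the hidden tokens in double-dollar markers like so; the file is intentionally non-compilable, since the markers are replaced by input fields at rendering time:

```dart
import 'package:flutter/material.dart';

// Tokens wrapped in $...$ become input fields in the web interface.
void $main$() => $runApp$(const MyApp());

class MyApp extends $StatelessWidget$ {
  const MyApp({super.key});

  @override
  Widget $build$(BuildContext context) {
    return const MaterialApp(
      home: Scaffold(body: Center(child: Text('Hello, Flutter'))),
    );
  }
}
```

Each blanked token here corresponds to a keyword or method introduced in the preceding GUP, following the pairing principle described in Section 5.2.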
5.4. Interface Design and Scoring Mechanism
As shown in Figure 1, the INT web answer interface is divided into two panels: the left panel displays the source code with input fields, and the right panel shows the expected visual output. Students enter their answers directly into the blanks and click a button to receive feedback.
The validation is performed by a client-side JavaScript function that checks each input against the correct answer using exact string matching. A correct answer turns the input field green, while an incorrect one turns it red. Students may revise and resubmit their answers as many times as needed.
To prevent access to the correct answers, all answers are hashed using SHA-256 rather than stored in plain text. This enables both online and offline use while preserving fairness in open-access environments.
8. Discussion
In this section, we analyze the evaluation results to assess the effectiveness, usability, and learning support of the proposed system, including the impact of the feedback mechanism.
8.1. Interpretation and Answers to Research Questions
We examine whether the integration of documentation, source code, visual output, and INT problems improved conceptual understanding and system usability, based on the results of the two evaluation rounds.
For RQ1, the high average correct answer rates observed in both evaluations suggest that the students were able to understand and apply Flutter/Dart programming concepts effectively using the proposed system. In the first round, 17 out of 23 students achieved perfect scores, and in the second round, 21 out of 25 students did so. The consistent correctness, even after the modifications, indicates that the integration of documentation and interactive elements helped reinforce students' conceptual learning.
For RQ2, the average SUS score increased from 61.67 in the first round to 71.56 in the second round. To verify whether this difference was statistically significant, we conducted an independent-samples t-test, which yielded a p-value of 0.035, below the 0.05 threshold. This indicates a statistically significant improvement in system usability. The effect size (Cohen's d) suggests a medium-to-large effect. These findings imply that modifications such as an improved documentation structure, clearer instructions, and interface enhancements contributed meaningfully to a more satisfactory and intuitive learning experience.
These findings demonstrate that the proposed learning system effectively supports both the learning and usability goals.
8.2. Limitations
Our study in this paper has several limitations that should be acknowledged.
First, the problem instance creation relied on manual authoring decisions, although the topics and blank selections were carefully designed following the documentation contents and the grammar-focused principles of the GUP format. Future work should focus on standardizing the blank selection process to improve accuracy, fairness, and scalability. For example, we will define rule-based criteria tied to syntax type, widget category, or difficulty level so that instance generation becomes more consistent and automated across problems.
Second, the evaluation was conducted with students sharing a similar background, and the number of participants and the institutional scope were limited. The participants had prior experience in object-oriented programming (OOP) (e.g., Java, JavaScript, C++, or Python) but no prior knowledge of Flutter/Dart, which matches the system's intended learner profile. However, the use of two distinct participant groups, one consisting of master's students and the other of fourth-year undergraduate students, may introduce variability due to differences in academic levels, motivations, or prior exposure to related concepts. Future evaluations with more controlled or homogeneous participant groups will be necessary to improve the validity and comparability of the results.
Third, the evaluation was conducted in a short-term setting, involving detailed analysis and active feedback collection from the participants. Long-term retention and knowledge transfer beyond the immediate learning sessions were not evaluated. Evaluating these aspects remains an important direction for future work.
Fourth, this study did not incorporate pre/post testing or baseline comparison groups, such as documentation-only or conventional-material conditions. As an initial exploratory deployment to students in real educational settings with formal classes, our evaluation prioritized practical usage and feedback collection over a formal measurement of learning gains.
While the high correct answer rates indicate that students successfully completed the tasks, they may not fully capture deeper conceptual understanding or transferability. Without follow-up assessments, it remains possible that some students relied on answer pattern recognition rather than genuine comprehension of the concepts.
8.3. Implications and Future Works
The results of this study offer several implications for the design of programming learning environments, particularly for introductory Flutter/Dart programming. The proposed integration of beginner-oriented documentation with the two exercise problem formats in the INT demonstrated that structured, context-aware guidance can significantly enhance understanding of introductory Flutter/Dart programming, especially in independent study environments.
The combination of source code, a visual output screenshot, and an interactive solving process in the INT format helped students connect abstract syntax with concrete interface behavior. This approach is particularly valuable when learning UI-focused frameworks such as Flutter. Furthermore, the grammar-focused learning design of the GUP and the structured completion tasks of the EFP proved effective in bridging the gap between students' prior object-oriented programming (OOP) experience and Flutter's declarative model.
Based on these findings, several directions are planned for future development. First, we aim to expand the set of INT problems to cover more advanced Flutter/Dart topics, such as state management patterns and widget composition. Alongside this, we will explore automating the blank selection process for creating new INT instances using a rule-based method, reducing the manual workload and ensuring consistency in problem generation.
Second, we plan to enhance the system's adaptability by incorporating features that respond to individual learning progress or error patterns. In addition, we will explore developing a mobile application version of the system using Flutter/Dart to promote greater accessibility and enable students to engage in Flutter/Dart learning anytime and anywhere.
Furthermore, we intend to adopt more rigorous evaluation designs in future studies, including the use of pre/post testing and baseline comparison groups, to more accurately assess conceptual learning gains and further validate the educational effectiveness of the proposed system.
In particular, we aim to include methods for assessing conceptual transfer, such as follow-up tasks of writing new code or explaining learned concepts. To improve the internal validity of the evaluations, more homogeneous participant groups (e.g., students of the same academic level and background) will be considered to reduce potential confounding factors when comparing different evaluation rounds.