1. Introduction
Augmented reality (AR) is a technology that expands the physical world with additional digital information such as sounds, images, and models [1]. The central value of AR is that the components of the digital world blend into a person’s perception of the real world. Beyond showing the data, it allows the integration of immersive sensations, which are perceived as natural parts of an environment. In recent years, the growth of AR applications can be attributed to solutions that focus on contextualizing information (e.g., annotating different parts of a physical object [2], displaying artifacts at a given place [3], and aligning virtual objects with the real world [4], for example by automatically positioning an object on a detected table or floor). In an educational setting, AR technology can be incorporated in the classroom to enhance teaching/learning efficiency and the motivation of both educators and learners [5,6].
There are many available tools and libraries, such as ARCore, ARKit, Vuforia, Unity [7], and ARToolkit [8], that developers can use to create and manipulate AR applications. ARCore is a framework from Google for building augmented reality applications for both Android and iOS devices. Apple provides ARKit for making AR applications and games for its iOS devices. Both of these frameworks use a handheld device’s sensors for motion tracking, light estimation, and environmental understanding. Unlike ARCore and ARKit, ARToolkit and Vuforia are computer tracking libraries that overlay virtual objects on the real world based on markers. The above frameworks and libraries can be integrated in Unity to port an AR application to devices with different operating systems (e.g., Android or iOS). To use these frameworks, it is necessary to have an integrated development environment (IDE; e.g., Xcode for iOS, Android Studio for Android, or Unity for both), a compatible device (e.g., ARCore requires a device running Android 7.0 or later, and ARKit requires a device running iOS 11 with an A9, A10, or A11 processor), and knowledge of a specific programming language (e.g., Objective-C, C#, C/C++, or Java). These requirements are obstacles for beginners who want to create an AR experience on their own. In their study, Nguyen et al. [9] showed that API version incompatibility in the IDE is a major obstacle students face when coding and deploying an application. The study also indicated that students had difficulty analyzing “800+ line scripts”.
To alleviate the difficulty of having an IDE and a compatible device, web-based AR (e.g., WebVR or WebXR) is an alternative approach that lets users experience virtual objects in the real world using only a web browser on their handheld device. A web browser engine supports Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), and scripting languages such as JavaScript, all of which can be written in a simple text editor, thereby releasing users from the need for an IDE. Furthermore, web-based AR toolkits such as ARToolkit for web [8] are compatible with a wider range of devices (e.g., those running Android 4.0.3 or higher or iOS 7.0 or higher). ARToolKit has been widely adopted in other libraries such as ThreeJS [10] and A-Frame.io [11]. Currently, web-based AR has some limitations: its frame rate is low and it does not leverage the full capacity of in-application AR. Nevertheless, the continual development of other technologies will help web-based AR keep growing in the future. For example, the open binary instruction format WebAssembly [12] would allow the browser engine to run native code in a browser, which provides the capability to access the in-application AR features through a web-based VR.
Recent research has produced some insights into how to lessen the burden of mastering a particular programming language for young learners and enthusiasts. Block-based programming [13] is a type of programming language in which instructions are mainly represented as blocks (or visual cues), and users drag and drop the cues to form a set of instructions. This programming paradigm enables developers to focus on programming logic rather than memorizing coding syntax. Scratch [14] is an example of using this visual paradigm in K-12 education. By using Scratch, K-12 students are able to program a 2D game and experience it immediately in a web browser. Radu and Blair [15] extended Scratch so that it can be used to create AR applications. However, their work stopped at rendering 2D images on the screen; 3D objects and spatial position manipulation were not implemented. CoSpaces [16] is a commercial product created for education; it enables students to create virtual reality (VR) applications through a context-based language. It can also be used to create a simple AR application by superimposing 3D objects onto the physical environment. However, the interactions between 3D objects are limited. In addition, Laine [17] pointed out that the majority of AR applications were developed with the Vuforia SDK, with a few occurrences of ARToolkit, which suits expert programmers but not non-programmers. Furthermore, Wu et al. [18] indicated that in some AR systems, the learning content and the teaching sequence are rather “fixed”, such that teachers are not able to make changes to accommodate students’ learning needs. As such, an authoring/storytelling tool is highly needed for teachers and students to create AR applications [19].
2. Motivation and Research Aim
Motivated by the block-based visual programming paradigm and AR for the web, we aim to bridge the gap between existing technologies. Our intention is to provide a generic web environment and let users freely create AR applications that match their interests. To the best of our knowledge, there is no tool in the literature that offers these features, which makes this research a unique contribution. To address this gap, this paper introduces BlocklyAR, a novel visual programming interface for creating and generating a web-based AR application. The goal of our work is to help learners create and use AR without having to memorize coding syntax. Consequently, this paper contributes to current research as it:
Helps young learners and enthusiasts express their programming ideas without memorizing syntax;
Enables learners to evaluate their coding promptly;
Allows learners to generate an AR application with minimal effort;
Supports users’ ability to share newly created AR applications with others;
Shows the applicability of BlocklyAR by recreating an existing AR application through a visual programming interface;
Evaluates the BlocklyAR tool using the technology acceptance model with the following hypotheses:
Hypothesis 1 (H1). Perceived visual design positively influences perceived task–technology fit.
Hypothesis 2 (H2). Perceived task–technology fit positively influences perceived ease-of-use.
Hypothesis 3 (H3). Perceived visual design positively influences perceived usefulness.
Hypothesis 4 (H4). Perceived ease-of-use positively influences perceived usefulness.
Hypothesis 5 (H5). Perceived ease-of-use positively influences intention to use.
Hypothesis 6 (H6). Perceived usefulness positively influences intention to use.
The rest of this paper is organized as follows. Section 3 summarizes existing research that is close to our paper. Section 4 presents methods for constructing our proposed tool and describes the tool’s architecture in detail. Section 5 evaluates the tool’s application using the technology acceptance model. Some challenging issues are discussed in Section 6. Our paper is concluded in Section 7.
3. Related Work
In this section, we briefly review the use of block-based visual programming, using Blockly [20] as an example. While Blockly has been used in many different domains, we only discuss the research that is relevant to our study, particularly in making AR applications.
Blockly [20] is a client-side library, a project of Google, for creating block-based visual programming languages and editors (as depicted in Figure 1). One interesting feature of Blockly is that it can run in a web browser. Visual cues in Blockly can be linked together to make writing code easier. The central value of Blockly is the ability to generate code in many different languages, such as JavaScript, Lua, Dart, Python, and PHP. A typical application that uses Blockly is Scratch [14]. This web-based visual programming language is targeted primarily at children who are between the ages of 8 and 16.
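To make the idea of generating code from linked blocks concrete, the following TypeScript sketch walks a small tree of blocks and emits JavaScript text. It is a minimal illustration under stated assumptions: the `Block` type and the `generateValue` and `generateStatements` functions are hypothetical names introduced here, and they neither use nor mirror the actual Blockly API.

```typescript
// Hypothetical block model: NOT the Blockly API, only an illustration of
// how stacked and nested blocks can be translated into source text.
type Block =
  | { kind: "number"; value: number }
  | { kind: "print"; text: string }
  | { kind: "repeat"; times: Block; body: Block[] };

// Generate JavaScript source for a block that produces a value.
function generateValue(block: Block): string {
  if (block.kind === "number") {
    return String(block.value);
  }
  throw new Error(`Block "${block.kind}" does not produce a value`);
}

// Generate JavaScript source for a vertical stack of statement blocks.
function generateStatements(blocks: Block[], indent = ""): string {
  return blocks
    .map((block) => {
      switch (block.kind) {
        case "print":
          // JSON.stringify quotes and escapes the text safely.
          return `${indent}console.log(${JSON.stringify(block.text)});`;
        case "repeat": {
          const times = generateValue(block.times);
          const body = generateStatements(block.body, indent + "  ");
          return `${indent}for (let i = 0; i < ${times}; i++) {\n${body}\n${indent}}`;
        }
        default:
          throw new Error(`Block "${block.kind}" is not a statement`);
      }
    })
    .join("\n");
}

// A "repeat 3 times" block that contains one "print" block.
const program: Block[] = [
  {
    kind: "repeat",
    times: { kind: "number", value: 3 },
    body: [{ kind: "print", text: "Hello, AR!" }],
  },
];

const generated = generateStatements(program);
console.log(generated);
```

Running the sketch prints a three-line `for` loop whose body is `console.log("Hello, AR!");`. Supporting another output language would, in this simplified model, amount to writing a second pair of generator functions over the same block tree.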
Radu and Blair [15] developed a first AR Scratch tool that allows children to create programs that mix real and virtual spaces. In their work, they customized the Scratch environment by adding an AR feature to the interface. The ARToolkitPlus library was used for detecting markers’ position and orientation. Once the markers were found, 2D actor sprites (or images) were overlaid onto the corresponding markers. The pilot study result showed that young learners were enthusiastic and returned to interact with the tool even after the study finished. As indicated by the authors, the main drawback of this tool was the lack of a third dimension in the Scratch environment, which was omitted because of the complexity of specifying relationships and interactions between objects.
Mota et al. [21] built an in-application AR tool called VEDILS (which stands for Visual Environment for Designing Interactive Learning Scenarios). Android users can leverage this tool to create AR applications. VEDILS’s AR components were developed using Vuforia for image recognition and tracking. Like Scratch, VEDILS relied on the Blockly library for generating visual blocks.
The work most similar to ours in the literature was presented recently by Clarke [22], who extended VEDILS to work on iOS devices. His tool enables users to work with 20 augmented reality primitive components, including basic shapes such as boxes, capsules, cones, spheres, text, and 3D models. A tutorial for using the tool was provided to help participants get familiar with the AR components and the visual interface. The pilot study result showed that participants felt empowered by working with the AR components and that they could build AR applications after using them. As the author pointed out, API incompatibility was one of the main issues in the study since the tool required iOS 12+ to run. In addition, features such as animations and movements in the AR environment had not been developed and were left for future work.
Although there are a number of other tools in the literature for generating AR applications, they are outside the scope of this paper since we focus on block-based programming languages. The aforementioned studies still faced the same issues as their predecessors when deploying on a device, or they lacked interactions in the AR environment. Our work overcomes these limitations by implementing the tool in a web-based environment. In addition, animations of the 3D models and interactions in the AR scene are added.
6. Discussion
Our study has several limitations that should be addressed in future research. The first limitation is the procedure used to collect user responses. This was due to the COVID-19 pandemic, which prevented us from conducting the study in a face-to-face fashion. In addition, by only watching the video, participants were unable to use the toolkit directly, which may have reduced their motivation to take part in the survey. As such, more rigorous research would be needed to evaluate the adoption and use of BlocklyAR, even though it is not uncommon to collect user responses after participants watch videos [44,45,46,47]. Second, BlocklyAR does not support arbitrary or user-defined actions; we acknowledge that the action space is huge and that users may be interested in only certain actions depending on a given domain. In fact, BlocklyAR can be considered an abstract, high-level programming interface for A-Frame combined with AR.js, so we only defined the elements that are most commonly used in an AR application, with an extension for controlling the animations and movement of a 3D object. Enthusiasts can refer to the technical details in Section 4.1 to replicate and extend the work. Third, the current version of BlocklyAR only supports the marker-based approach, meaning that users have to prepare a marker and put it in front of the camera. We have not taken advantage of WebXR yet, due to the unavailability of a stable WebXR API as well as our lack of compatible devices for conducting the experiment. Fourth, privacy concerns were not taken into account in this study. Rauschnabel et al. [48] noted that information captured by a device’s sensors might threaten the privacy of both users and other people, thereby creating an obstacle to using AR technology. Lastly, other factors contributing to the adoption and use of technology, which have been discussed in the unified theory of acceptance and use of technology (UTAUT) and its extensions [49,50], might be considered in future studies. That is, future work could extend the technology adoption framework used in this study by evaluating the influences of performance expectancy, effort expectancy, social influence, facilitating conditions, hedonic motivation, price value, and habit on the adoption and use of BlocklyAR, as well as the moderating effects of individual differences (age, gender, and experience) on the constructs.
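The second and third limitations describe BlocklyAR as a high-level layer over A-Frame and AR.js in marker-based mode. The TypeScript sketch below illustrates what such a layer could look like: it turns a small scene description into A-Frame markup with an AR.js marker. It is only an illustration under stated assumptions and is not the BlocklyAR implementation: the `MarkerScene` interface, the `toAFrameMarkup` function, and the model URL are hypothetical. The `animation-mixer` attribute is assumed to come from the separate aframe-extras library, and the host page is assumed to load A-Frame, AR.js, and aframe-extras before this markup is used.

```typescript
// Hypothetical high-level description of a marker-based AR scene.
// Field names are illustrative and are not taken from BlocklyAR.
interface MarkerScene {
  markerPreset: string; // e.g. "hiro", a preset marker known to AR.js
  modelUrl: string; // placeholder URL of a glTF model
  scale: number; // uniform scale applied on all three axes
  playAnimation: boolean; // whether to play the model's embedded animation
}

// Translate the description into A-Frame markup with an AR.js marker.
function toAFrameMarkup(scene: MarkerScene): string {
  const s = scene.scale;
  // animation-mixer is provided by aframe-extras (assumed to be loaded).
  const animation = scene.playAnimation ? " animation-mixer" : "";
  return [
    `<a-scene embedded arjs>`,
    `  <a-marker preset="${scene.markerPreset}">`,
    `    <a-entity gltf-model="url(${scene.modelUrl})" scale="${s} ${s} ${s}"${animation}></a-entity>`,
    `  </a-marker>`,
    `  <a-entity camera></a-entity>`,
    `</a-scene>`,
  ].join("\n");
}

const markup = toAFrameMarkup({
  markerPreset: "hiro",
  modelUrl: "models/robot.glb", // hypothetical path
  scale: 0.5,
  playAnimation: true,
});
console.log(markup);
```

In this sketch the model is nested inside the marker element, so it is rendered only while the printed marker is visible to the camera, which matches the marker-based constraint noted in the third limitation.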
7. Conclusions
This paper introduced BlocklyAR, a novel web-based visual programming interface for creating and generating an augmented reality application. By integrating the A-Frame and AR.js toolkits into Blockly, BlocklyAR enables young learners and enthusiasts to create AR experiences. The proposed toolkit can be generalized and extended to many other domains for pedagogical and instructional design with animated 3D models, such as demonstrating the fundamentals of electric circuits, testing/simulating robot movements, or assembling hardware components. Following this approach, users can download free 3D models from the internet [51] and then apply animations to them with an intermediate tool presented in [3]. We demonstrated BlocklyAR with a use case in which the toolkit replicates existing work with less programming effort. Data collected from users’ responses indicated that BlocklyAR was useful in learning and making an AR application, with particular relevance for new learners. We used the technology acceptance model to assess users’ behavior toward using the toolkit in terms of visual design, task–technology fit, perceived usefulness, perceived ease-of-use, and intention to use. Our findings showed that visual design had a statistically significant and positive influence on task–technology fit; task–technology fit had a statistically significant and positive influence on perceived ease-of-use; and both perceived usefulness and perceived ease-of-use had statistically significant and positive effects on intention to use. However, hypotheses H3 (visual design → perceived usefulness) and H4 (perceived ease-of-use → perceived usefulness) were rejected. Future work will focus on replacing the printed map with a virtual real-world map (e.g., Mapbox, OpenStreetMap) to increase the fidelity of the AR scene. In this regard, real-world locations, expressed as latitude and longitude, will be used as a substitute for markers.