Article

General Software Platform and Content Description Format for Assembly and Maintenance Task Based on Augmented Reality

Nara Institute of Science and Technology (NAIST), Ikoma 630-0192, Nara, Japan
* Author to whom correspondence should be addressed.
Information 2023, 14(2), 100; https://doi.org/10.3390/info14020100
Submission received: 27 December 2022 / Revised: 30 January 2023 / Accepted: 30 January 2023 / Published: 6 February 2023
(This article belongs to the Collection Augmented Reality Technologies, Systems and Applications)

Abstract

Augmented reality (AR) support systems have proven effective in supporting various tasks, but they are not yet widely used in real-world applications because of their high development costs. Although several AR authoring tools have been proposed to lower the cost of content development, most AR systems built with these tools are not well matched to industrial environments and task types and offer inflexible visualization styles. This study provides a systematic solution that combines an AR task support software platform with a general contents description format, expanding the degree of freedom of the visualization methods and improving the design process. The proposed solution adapts to different activities and environments and optimizes content implementation and the user experience. A user study was conducted to evaluate task performance (processing time and errors) and user experience (perceived cognitive load and usability) by comparing our solution with a conventional system. The results show that the proposed software platform improves compatibility with industrial environments and tasks and reduces the workload and cognitive effort, although task performance was the same as with the conventional AR system.

1. Introduction

In conventional industrial installation and maintenance operations, industrial products have numerous components with complicated interactions and combinations among them. Ordinary workers who want to perform such activities accurately and effectively must acquire the relevant knowledge and complete professional, comprehensive training. Such training is tedious, site-specific, and abstract, and it is not only time-consuming but also cognitively demanding for the user [1]. In addition, the conventional method of referring to instruction books consisting of text, images, and videos makes it difficult for the user to map the two-dimensional representation of an object to the relevant components and movements and to perform the necessary actions. This creates a high learning cost and cognitive burden for the user, and relying on a conventional guidebook alone can be even more difficult when certain tasks raise personal safety concerns [2].
Augmented reality (AR) is a physical space-based information enhancement technology that enables spatial localization in the real world and the display of digital multimedia content such as text, photos, audio, videos, 3D models, and animations [3,4,5,6,7,8]. The use of AR technology in industrial installation and maintenance allows end-users to be in both real and virtual environments, to visually observe the correspondence between virtual and real objects, and to understand more efficiently and clearly the complex, abstract, spatial relationships between the many components of industrial equipment [9]. Integrating the rich instructional formats, efficient information transfer, and interactivity of AR into task support instructional systems can increase productivity, reduce training time and costs, ensure quality, reduce errors, and improve the end-user's experience [10,11,12].
AR authoring tools are crucial tools that contribute to the mainstream adoption of AR in industry [13]. They can be classified as programming or content design tools, with the latter reducing the need for programming skills to construct AR solutions [14,15,16]. Such tools are designed to read and execute AR solutions as data files on a software platform operating on an end-user's device. For task support content, the data files interpreted by the software platform are often structured in an XML-based format. The software platform unifies all the functional components required to produce a complete AR experience and is reusable, considerably decreasing the cost of recurrent system development and, because the same system is used for different tasks, the end-user's cost of learning the system [17]. However, as general task support software platforms, existing tools lack flexibility for realistic industrial environments: they provide limited support for complicated, multi-typed environments and tasks, which constrains the end-user's task support experience and outcomes. For instance, dynamic multi-component combinations of installation or maintenance operations in single-step scenarios are not supported. In the design of the system's overall interaction interface, there is also a lack of attention to the environment and the task. A low degree of freedom in the interaction between actual and virtual items likewise results in a poor user experience and an increased cognitive load for the user.
This research improves the adaptability of the software platform to various industrial environments and task types by extending the flexibility of the visualization methods and improving the design process of the software platform interaction interface, thereby increasing task support capabilities and the user experience. The task content description format that accompanies the software platform must be able to describe multiple types of task content. Therefore, we focus on developing both a software platform and a content description format, and our research questions are as follows:
  • R1 How to implement a general software platform that meets the following requirements:
    The system can be sufficiently adaptable to different types of environments and tasks. (Discussion: Section 6.9.4 and Section 7)
    The system provides more flexibility in the visualization methods so that the user gets a good task performance and experience. (Evaluation: Section 6; Discussion: Section 6.8)
  • R2 How to define an AR content description format that meets the following requirement:
    The format has enough degree of freedom to describe different assembly and maintenance tasks. (Evaluation: Section 5.3)

2. Related Works

2.1. Industrial AR and AR Task Instruction

A task support system based on AR helps a user work intuitively by overlaying visual information on real objects. Previous research has demonstrated that using AR systems for maintenance and assembly operations improves performance, curtails task duration, and reduces mistake rates [2,18]. In the manufacturing industry, AR systems have emerged as a potential tool for training and assisting mechanical production operations [19,20]. Reiners et al. [21] described an AR demonstrator assisting the user in assembling the door lock to the car door. Salonen et al. [22] suggested the use of an AR-based system for supporting assembly tasks. To create such AR task support systems, integrated development environments such as Unity3D [23], Unreal [24], or Amazon Sumerian [25], together with development kits such as ARToolkit [26], Vuforia [27], or MRTK [28], can be used to accelerate design and development. However, this traditional development model requires developers to have the necessary coding ability [20] and strong knowledge of AR, graphic design, and interaction design, and it suffers from problems such as high development cost, a high entry threshold, and a long development cycle. This also hinders the widespread application of AR in industrial fields [17].

2.2. AR Authoring Tool

Some research adopts video as the main medium to accelerate the development of AR industrial assistance systems. For example, Vuforia Expert Capture [29] is a commercial AR application for in situ AR content authoring with support for first-person-view (FPV) video recording. Vuforia Chalk [30] renders 2D annotations in physical space or captures 2D video to embed in the real world; it takes video as the main content and does not support a wide variety of virtual asset content.
Another direction is to create an AR authoring tool following the idea of Web Service [31], where the web page content and the browser platform can be separated by defining and using the content description format HyperText Markup Language (HTML).
The service provider creates content written in a standardized format, and the user prepares a browser serving as a platform. In an AR authoring tool, AR solutions are first written as contents data (commonly structured in XML-based formats) with an editing tool and then read and executed as a data file on the software platform running on the end-user's device [17]. In this way, developers only need to edit the content, and the platform can be reused.
Zhao et al. [32] proposed a standard and rapid development method using Extensible Markup Language (XML) files and an HMD-based platform. Services provided by companies such as BUNDLAR [33], ScopeAR with WorkLink [34], and Dynamics 365 Guides [35] allow creators with little or no coding knowledge to build and deploy task training and assistance for many types of tasks. Generally, the AR authoring tool in the form of "Content + Platform" lowers the barrier of system development compared with the traditional method. Moreover, the program can be reused to lessen the capital and time cost of development. Such AR authoring tools are crucial instruments for the widespread application of AR [17]. However, although such general AR authoring tools can quickly create AR task support systems for different types of tasks, the resulting systems are not sufficiently adaptable to variable and complex industrial environments and tasks.

2.3. Adaptability of AR Authoring Tool

Zhu et al. [36] developed a context-aware AR system for assessing maintenance situations and rendering relevant and valuable information to operators based on real equipment and environment.
Geng et al. [37] proposed a system that can adapt to different people, environmental objects, and processes of complex industrial operations at runtime. The system, based on natural feature tracking, provides better environmental adaptability and a support level suited to the end-user's preference.
Although the above studies have enhanced adaptability to the environment and tasks, the adaptability of the system interface to different types of environments and tasks is not fully considered. In addition, there is no in-depth research on user acceptance and software learning costs, resulting in system interface designs whose functions and styles are either too complex or too simple.
Dynamics 365 Guides [35] exhibits a good software system interface design, which is easy for end-users to learn and use. However, in the content display panel, only one video or one picture can be shown in the same step, and the system cannot adjust the type, position, and number of content items within a single step, leaving the auxiliary content insufficient in some situations. Additionally, the position and size of virtual objects presented in the system are fixed, and the virtual objects have no interactive capabilities, so the potential of virtual content for assisting tasks is not fully exploited.

3. Overall System and Our Ideas

In the following section, this study first presents how to separate the platform and content parts of the AR task support system to achieve cost-effective and rapid development of AR task support systems. We then present our method for improving the adaptability of the AR software platform to industrial environments and different types of tasks, as well as the usability of the system and the user experience. These methods are implemented from both the platform and the content perspectives.

3.1. Overall System

According to Roberto et al. [17], AR authoring tools can follow a stand-alone approach: stand-alone augmented reality software has all the components necessary for developing complete AR experiences. Based on this approach, the AR authoring tool is used to develop AR solutions by first writing the contents data (typically structured in XML-based formats) with an editing tool and then reading and executing it as a data file on a software platform running on the end-user's device. The parts that rarely change and can be reused, i.e., functions and visual components implemented by programs, form the platform, while elements that cannot be reused between tasks, e.g., the different virtual assets needed for different tasks, are treated as content. Once the structural content functions supported by the platform and the visual elements of the task support system are defined, each visual element must be analyzed to determine the required data, and then the customized content format for the platform is developed. This enables standardization and repeatability of assembly instructions and helps to convert graphical elements into data that can be stored and transmitted. The relationship between the platform and content in the AR task support system is shown in Figure 1.

3.1.1. Platform

The platform provides a set of customized interface and interaction components. According to the logical relationship of content, the platform can call the resource library to present the visual resources to appear in the corresponding steps and provide the corresponding attributes according to the description file. The end-users can realize the functions through various interaction methods. The most basic function of the platform is the step-by-step content switching function. In addition, the platform also needs to provide some information support according to the cognitive needs of users.

3.1.2. Content

The content mainly includes (1) the structural organization of the visualization assets, (2) the narrative logic of the tutorial, (3) the source path of visualization assets, and the relevant characteristics of each type of visualization resource, as well as (4) control instructions for specific components in the software platform.

3.2. Our Idea

In the following section, we present the details of our ideas.

3.2.1. Idea 1: Increasing the Degree of Freedom of the Visualization Methods

Conventional AR authoring tools for the design and development of content and platform tend to support mostly fixed scenes. Conventional AR tools determine the position of visual assets in the scene by scanning simultaneous localization and mapping (SLAM) coordinates, object recognition, marker recognition, or manual alignment. If the real scene does not have the specified conditions, no optional alignment method can be performed, rendering the system useless. Therefore, it is necessary to provide multiple registration options. In this paper, we define each of the related words as follows.
  • World-Pre Defined Coordinate System: The AR system has already stored the fixed environment information needed for registration (e.g., a 3D map built with SLAM). The position and orientation of virtual contents are defined in this coordinate system. The AR system automatically aligns a runtime coordinate system (e.g., a new SLAM coordinate system) with the stored coordinate system after the AR system is turned on. However, if the environment changes, the virtual content cannot be matched with the real-world environment.
  • World-On Demand Coordinate System: The AR system does not store the environment information, and the virtual content is displayed in the runtime coordinate system (e.g., a new SLAM coordinate system). The user may need to adjust the location of the virtual content manually.
  • Screen Coordinate System: The virtual content presented by the AR system is fixed to the HMD screen.
  • Root Object Coordinate System: Object coordinate system with defined relationship to the World-On Demand Coordinate System.
  • Root Object-World Independent/Fixed Coordinate System: Root Object exists in the SLAM coordinate system, but can freely change its position and fixation mode in the current environment space, without affecting the coordinate positions of other virtual objects in the environment.
  • Object-World Independent/Fixed Coordinate System: The object exists in the SLAM coordinate system and becomes a subcoordinate system of the Root Object coordinate system; its position changes with the Root Object coordinate system, but its own coordinate position and fixation mode can be adjusted separately.
Related research [35] on the design and development of contents and platforms tends to support mainly tasks in which a single static object serves as the aid. The virtual content is anchored to this single object and its coordinate system, and the assisted content is also sized and positioned based on this single coordinate system. First, certain tasks often have multiple primary objects within a single step, for which static, single-object-centric support is not applicable. Second, if the static objects in the task do not remain in a constant, steady state during the task, the virtual content can be displayed in the wrong location. Therefore, multiple independent root object coordinates and both dynamic and static tracking methods need to be supported in single-step scenes, so that changes in the position and pose of individual coordinate systems, as the task or environmental factors in the scene change, do not affect the object coordinate settings and virtual content assistance for the entire scene.
Conventional systems [35] rely on the system's main content display panel, the main-panel, to display text, images, and videos, and the limited interface space on the main-panel leads to a lack of single-step content assistance. We therefore added a number of separate panels, sub-panels, to present images and videos that complement single-step scenes. Virtual content is not only an indication of the position of real objects but is often used by end-users as a reference for completing tasks. However, the degree of freedom of interaction is low and the end-user cannot personalize the presentation of the virtual content, which makes it difficult to interact with it, e.g., by using virtual measurement tools in the scene. When the user cannot zoom, move, or rotate a visual asset, interaction with it is limited and the user needs to adjust his or her viewpoint more frequently, which degrades the user experience and knowledge acquisition efficiency and increases the end-user's cognitive difficulty and task time.
AR task support systems provide task assistance to the end-user through visual means. The above shortcomings of conventional systems can be summarized as a lack of freedom in the visualization methods. It is this low degree of freedom of the visualization methods that limits the adaptability of the system to changes in the task environment and type and limits the effectiveness of virtual content in assisting end-users. In summary, it is critical to improve the freedom of the visualization methods. We analyze and improve three aspects based on related research:
(1) The types of coordinate systems and the number of independent coordinate systems supported by the system. (2) The ways and means of registering independent coordinate systems. (3) The interactivity of the visual assets in the available coordinate systems. For specific comparison points and improvements, see Figure 2.
Improving the degrees of freedom of the visualization methods of the AR task support system generated by the software platform requires not only improving the software platform that directly faces the end-user, but also changing the contents for content description, i.e., enabling the AR content description format to describe tasks with a high degree of freedom of the visualization methods.

3.2.2. Idea 2: Improving the Design Process of the Software Platform

The end-user is assisted through direct interaction with the system and through the interactive information and knowledge it provides. The system must support the user as much as possible from the technical perspective, but the design of the system's interactive interface often lacks adaptability to the user's environment and task. Moreover, unfriendly user interfaces and interaction methods not only reduce the effectiveness of the AR task support system for the user, but also lead to insufficient task quality and a higher cognitive load.
Current design and development of corresponding software platforms for AR task support systems only consider more conventional environments and task types. Given complex and changing industrial environments and tasks, the existing system interaction interfaces are difficult to adapt. For example, the inability to move freely on the interaction interface limits the end-user’s actions to a single, fixed position and angle. In small, non-ideal industrial environments, fixed sizes and rotation angles make it difficult for the user to interact smoothly with the system. The design of the system interface is often simply a display of content, and even if the end-user does not use the virtual content, the limited viewing area is always obscured by the virtual content, so the end-user does not have a sufficient view of the real environment, which also poses a significant safety risk to the end-user. Different tasks also require the end-user to adopt different body postures, such as lying down, squatting, leaning, etc. The lack of a system design for these types of tasks greatly increases the difficulty in using the system.
To the best of our knowledge, there are no unified design principles for AR support systems, so the design of AR task support systems is often based on the developer's personal vision of the interaction interface. Because the user's background is not known, learning to use the system interface often involves significant effort for the end-user. A lack of knowledge about AR also makes it difficult for the user to judge the current state of the system, for example, whether the marker has been successfully scanned and whether it is correctly aligned with the object. Unlike conventional computer and mobile app development, the design of AR systems based on HMD devices often does not consider the user's perception. For example, tangible interaction interfaces often lack physical haptic feedback, so the interaction process requires appropriate feedback through other perceptual channels; otherwise, the user cannot accurately determine whether an interaction has succeeded. Second, the limited display angle of current AR devices and the lack of reasonable line-of-sight guidance make it difficult for the user to find the target object or location, which significantly increases the physical and cognitive burden on the user.
The design of general AR task support software platforms often either uses UI components to implement only the most basic functions of the guidance system or follows a strategy of endlessly adding functionality and buttons, both of which make it difficult to provide the user with reasonable usability during industrial guidance. Therefore, we believe that the design of the software platform should follow an environment-, task-, and user-centered design strategy, which will improve the adaptability of the system to different industrial environments and tasks, as well as the system usability and user experience.

User-Centered Design and Development

The use of user-centered design and development in system design is critical to better meeting users' needs [38]. First, it is important to consider, at the outset of system design, the user's understanding of the software and the cost of learning it, as well as the explicit and implicit cognitive needs that arise from the task. Second, end-users are the ultimate users of the software platform, and being user-centered means involving them throughout the process of planning, designing, implementing, and testing the system [39].

Environment-Centered Design and Development

The software platform must be adapted to different types of industrial scenarios and take into account how the user can interact more flexibly with the system interface in such scenarios, and minimize the inconvenience of the system for the user when performing tasks.

Task-Centered Iteration Design and Development

A software platform for general tasks requires adaptability to different types of tasks and environments. It is not possible to provide all required functionality in the first release, nor to immediately satisfy every design requirement and achieve stable usability in the face of unknown task and environment requirements. By iteratively testing and improving the software prototype with a task-centered approach, the adaptability of the software platform is continuously improved.

4. Development of General AR Task Support Software Platform

In this section, a software platform is developed. Compared with conventional general AR software platforms, this study emphasizes two points to improve the software platform design process: (1) Optimize the freedom of the visualization methods; (2) Increase the adaptability to different tasks and environments. The details of the general AR software platform are offered, comprising workflow, user analysis, contents, and UI components. After the software prototype was completed, the prototype was iterated in accordance with user performance and feedback in real tasks.
HoloLens 2 [40] was used as the platform for running the software. This device supports multiple registration functions and input methods, providing good support for developing a software platform with a high degree of freedom of the visualization methods and a general contents description format. The visual assets and controls of the general AR software platform follow the design guidelines for mixed reality applications [41]. One of the purposes of a graphical interface is to enhance the user's understanding of the data content and its associated information. The ideal interface is designed to provide the user with sufficient information and smooth interaction without making the user feel that it is overly complex [42,43].

4.1. Software Platform Design Process

The AR task support system presents visuals as a collection of interactable objects in an industrial scenario. The floating main-panel and the UI components help the user to acquire intuitive and easy-to-understand information about the task. The abstract augmented reality instruction software platform is concretized and detailed according to the following design process.
In the first step, user analysis is performed to understand the user’s needs; in the second step, the functionality of the software is identified and functional hierarchy is completed; in the third step, graphics are designed and added to the software to implement the functionality; finally, the software is iterated. The detailed process is illustrated in Figure 3.

4.2. User, Environment and Task Analysis

Employing user-centered design and development in task support system design is critical to designing systems that better meet user needs [38].
Since software platforms are ultimately developed to better assist users in performing their tasks, user-centered design requires that users’ needs be incorporated into the system design at the beginning of software development. From a socio-technical perspective, Goodrum et al. [44] argued that designers must consider the dynamics of people, environment, work practices, and technology to develop rich learning and information environments.
The end-user is the direct user of the augmented reality instruction system. Two aspects should be considered in user analysis:
1. Users need to understand and learn to operate the unknown system before the task begins. For example, the augmented reality interface is different from the traditional interaction with a physical screen, and the learning cost of users should be stressed in this process.
2. In the process of task implementation, due to the uncertainty of the environment, task content, and physical movements, it must be ensured that the software operation does not threaten the safety of the user and can assist the user to complete the task efficiently. Therefore, user analysis should be combined with the specific analysis of the environment they are in and the tasks they face.
The specific analysis is described as follows. A total of 30 videos were collected through the video site YouTube [45] by searching for keywords such as “Object Name + Assembly/Repair/Maintenance Task” and “Factory”.
Moreover, 23 videos were collected by searching the keywords “AR task support system”, “AR maintenance”, and “AR industry”. Particularly, user analysis was performed by observing construction videos using AR systems as assistance.
In the subsequent development process, a user-centered design process was maintained, and the software was iterated through continuous communication and feedback. Although not all situations can be covered and some specific tasks require special system customization, these analyses allowed the design process to incorporate realistic situations into the design of the software platform as much as possible, and thus to obtain an ideal, user-friendly software platform that can be adapted to different types of environments and tasks. The following generalizations about the user, environment, and task characteristics are presented.
About Users
  • There is a need for single-site implementation tasks and multi-site mobility within the work area;
  • Different body postures and body angles occur when users are handling different tasks;
  • In some tasks, such as computer or cell phone installation tasks, the object is very close to the user’s head;
  • There are tasks requiring users to look down or up for long periods of time;
  • The users need to hold the tool for a long time;
  • The users need to wear gloves;
  • Factory users have specialized areas to manage parts. Individual users can manage the parts themselves, while the lack of sequential guidance makes it easy to confuse the parts.
About Environment
  • The environment settings are not fixed and may change during the task;
  • Some user construction environment spaces are narrow;
  • Some environments have a mix of equipment and are not suitable for users to move frequently;
  • Some environments do not have a constant background color and ambient light;
  • Some environments are noisy.
About Task
  • There are both non-fixed object tasks and fixed object tasks;
  • Objects need to be rotated during the task;
  • Objects can change shape during the task;
  • There are many similar parts in some task objects and it is easy to get confused;
  • Objects in the task will exist in a single location or multiple locations with multiple processes;
  • Some objects can be rotated or reversed directly using the hand, while others require users to change positions to continue the task;
  • Some tasks require users to go inside the object to construct it.
Considerations for system design based on the above analysis
  • The system interface should support user following and fixing;
  • The system interface should be user-definable to adjust various angles;
  • The system interface should be user-definable in size;
  • The system interface should be minimized temporarily;
  • The system interface should support various interaction methods;
  • Users sometimes wear gloves, and the touch buttons and the key spacing should be larger to avoid input errors;
  • The system can support the configuration of simple parts management;
  • The system can support multiple object coordinate changes and real-time tracking in the task;
  • The system interface should occupy as little visual area as possible;
  • The color and brightness of the system interface should interfere with the user’s observation of the working environment as little as possible.

4.3. Related Software Analysis

A preliminary and abstract idea for AR instruction is presented in this paper, based on user analysis and previous knowledge of some AR task support systems. The specific functionality and details of the interface style and UI components should be determined from the abstract idea to the final implementation of the software platform.
Our strategy is to first analyze the relevant software, summarize the functionality, interface design, and use of the UI components, and add our own solution by taking into account the results of the user analysis in the preliminary session.
Notably, the purpose of analyzing related software is not to incorporate all the functions that can be used in the system design, but to discover the functions in line with the common functions of general task support and the related functional structure. An example of content analysis for related software is presented in Figure 4. Functional and hierarchical analysis examples are exhibited in Figure 5.

4.4. Contents, Functions, and UI Component

Concerning the design of an AR task support system, the following factors should be considered when selecting and determining the content, functionality, and UI components: (1) the most basic requirements of a general task support system; (2) new requirements that arise after enhancing the freedom of the visualization methods; (3) functional and implicit requirements for the system based on the results of task, environment, and user analysis; (4) design rules that differ from those of traditional desktop and 2D mobile interfaces with limited screen space.

4.4.1. Contents Determination

In the system, the content is roughly divided into two parts: visual elements (such as text, images, videos, 3D models, and 3D symbols used to construct the content) and auxiliary information (such as the table of contents, guide content framework and logic, title, and page number), as detailed in Table 1.

4.4.2. Functions Determination

The determination of system functions is mainly through three methods:
  • Analyze and use relevant software, and summarize important and frequently used functions
  • Collect subjective opinions through online questionnaires
  • Carry out on-site observation and video analysis of actual tasks, and the analysis content is the functions and frequencies used by users in tasks
Since a general task support system cannot satisfy every demand, and too many functions increase the learning workload, features are classified as Necessary or Useful according to their importance. Necessary features are indispensable for task execution. Useful features can further enhance the system's auxiliary effect but are not indispensable. The functions are further divided into constant and on-demand levels according to their frequency of use. Given that the main-panel is limited in size and too many icons cause unfocused tasks and low user acceptance, the constant functions are placed on the surface level of the main-panel, and the on-demand functions are placed on the second or third level. The functions are provided in Table 2.

4.4.3. UI Component Determination

To reduce the user's cognitive and physical workload (learning how to use the system interface and UI components, as well as understanding the implementation details of the tasks), the UI components were designed by referring to elements familiar to the user from computers and mobile phones. In this way, the user can focus more on the tasks themselves. Compared with the interface and UI component design of traditional computer and mobile phone software, the following aspects should be considered.
1. The input technology provided by the development platform. For example, HoloLens 2 provides Speech, Gaze, Near/Far Pointer, and other input methods. UI components should be combined with these system input technologies, taking into account the user's personal preferences and both near and far interactions, and the multiple input technologies that each component offers should be determined so that the end-user can choose among them. For AR platforms, manipulating components without physical objects differs from physical interaction with tactile feedback, so tactile feedback is replaced with visual and auditory feedback.
2. Convenience. Unlike interaction with traditional terminals, the distance between the user and the AR interactive interface and UI components is usually not constant. Therefore, how users can still interact smoothly when they are far away from the interface and UI components must be considered. Our strategy is to allocate a larger area to commonly used UI components. If the area of the interface is limited, then with either the Near Pointer or the Far Pointer, the area of some UI components is automatically enlarged when the user approaches them, allowing the user to interact with them easily. Another strategy is to attach a hand controller component containing a commonly used feature set to the user's hand when the user is far away from the interaction interface and UI components. These strategies help users interact faster and more easily and improve the user experience.
3. The UI components of the AR task support software platform are divided into four functional groups: information display, user input, content and space navigation, and component management. Basic components such as simple buttons are provided by the MRTK development kit, while most of the system-specific buttons and components were developed by us.
The four functional component groups are detailed as follows.
1. Informational components. The information display component is used to display various types of visual asset content and auxiliary information in the task. To solve the problem of insufficient information displayed in the single-step main-panel of traditional guidance systems, a sub-panel was designed that can indicate the working position on the object more accurately and can provide picture information independent of the main step diagram on the main-panel.
Figure 6 exhibits a sketch of the informational components that are used.
2. Input control components. The main purpose of the user input components is to provide the user with content switching and other system controls. In our augmented reality software platform, the system offers two ways to align virtual objects with real objects: marker alignment and manual alignment. Figure 7 illustrates a sketch of the input control components that are used.
3. Navigational Components. The AR task support system provides a task environment where the virtual and real worlds merge. The tasks are completed in the real world (the basic environment). The virtual tools are adopted as assistance to facilitate immersive operations, such as using virtual navigation to guide the location of parts and tasks. Meanwhile, the virtual content can be easily lost due to the limitations of the display range of the head-mounted display device. Hence, we also need to navigate the virtual world.
Task content navigation: Users need to understand not only the details of the instruction in each step but also the overall structure of the task. In our interface design, the information area of the task frame structure is employed to demonstrate the task structure of sections and steps. Meanwhile, the system utilizes a slider to present the current progress. To improve the user's control over the task process, a slider function is added to the component, allowing the user to quickly adjust the progress. In the video part, the same component is added to display and adjust the playback progress.
Task space navigation: Different from traditional guidebooks, the system can provide unbounded task space navigation and positioning by UI components, such as directions for places, tools, parts, and content panels. The system uses a triangle pattern as a guide for mixed space navigation and adopts virtual dynamic lines to connect virtual panels with real objects.
Figure 8 provides a sketch of the navigational components that are used.
4. Parts Management Components.
As revealed in our analysis of users, users need to manage multiple parts for tasks executed in a non-professional factory environment. The AR virtual part management UI component can be adopted to manage multiple areas of the part. Moreover, virtual arrows in step-by-step guidance indicate the location of the part. The system we developed supports multiple independent root objects in one step, enabling users to customize the location of components in accordance with the environment. The relevant virtual parts management components are illustrated in Figure 9.

4.5. Prototype Design and Test

4.5.1. Paper Diagram Design and Test

After the content, functions, and UI components are determined, all components are integrated into the system interaction interface to achieve graphical representation. Graphical interfaces need to arrange and combine scattered information and functions based on logic and hierarchy to enhance users’ understanding of data and related information.
Black-and-white line drawings enable rapid and low-cost prototype testing. The main method is to show the line draft of the system interface to the end-user. The end-user judges the system interface style and layout and imagines the real use situation following the developer's description. This is performed to test whether the layout of content and functions, the display of information frames, and the switching and exiting of hierarchical interfaces work as intended. Based on the opinions of end-users, the page information layout, functional structure, and hierarchy were further adjusted, and the interaction process was optimized. The main-panel example is illustrated in Figure 10.

4.5.2. Prototype Implementation and Test

The prototype is based on the line drawing design to implement the function. The main-panel after implementation is presented in Figure 11. Considering the user’s actual postural state and the inconvenience of a single interaction caused by factors in the operating environment, the prototype provides two types of interaction methods based on the HoloLens default interaction technology.
Near Distance Interactive Methods: Direct manipulation with hands [46].
Far Distance Interactive Methods: Point and commit with hands [47], Voice Command [48].
Once the basic functions were available, the interactive interface and UI components were examined and iterated mainly with respect to user experience and environment adaptability.
1. Color. A crucial issue in the design of graphical interfaces is that users cannot see and respond to all content at the same time [49]. Whether the color of the main interface page conforms to user preferences was tested by users. Additionally, in an AR environment, the system must ensure that virtual content does not hinder the implementation of tasks as much as possible, since virtual content is superimposed on the real environment. Considering that the color of the main interface blocks the real environment, two modes were designed. Specifically, the main interface is colored when the user operates the main-panel; the space of the main-panel outside the content area becomes transparent when the user does not operate. White border lines are adopted as visual assistance to unify the interface in the same panel (see Figure 12 and Figure 13).
2. Size of interface and UI components. The AR system is tested by multiple users to determine whether the interface can present relatively complete content and whether the size of UI components can be easily controlled within the normal interaction range. Regarding complex and changeable environments and different user preferences, the interface size is set to a non-constant proportion, and users can use voice commands or gestures to interactively adjust the main interface size.
3. Fixed mode of the interface. Considering the possibility of multi-location movement during tasks, the main interface was given a fixed mode and a follow mode. The fixed mode allows the user to adjust the interface angle for convenient viewing from above or below. In the follow mode, the interface automatically follows the user's position, reducing the frequency of manual operation. The details are provided in Figure 14.

4.6. Iteration Design by Case Study

Our software prototypes were optimized through iterative design. This process allows us to obtain feedback from users and thus ensure that the software meets their requirements after modifications. Different types of tasks were used for iterative testing, allowing us to continuously update and polish the prototype to fit multiple types of tasks and environments.
Secondly, the functional settings of the software and the design of interaction methods and appearance styles are mostly based on the developer’s subjective thoughts, and cannot be perfected in one go. They need to be tested in different cases to obtain more objective and rational design results centered on users, tasks, and environments.
Finally, instead of testing all the functions and user interface components at once, they were implemented in batches in multiple stages. In this way, the software can be modified and improved more accurately.
In this section, task-centered iterative software design is illustrated with examples.

4.6.1. Detail of Our Different Cases

Iteration was performed through case studies to upgrade the software prototype. The first test evaluation of the software was conducted on a coffee machine maintenance task (see Figure 15). This case study is detailed in Table 3.
Following the evaluation results and user feedback, the software was upgraded, and a laptop HDD (hard disk drive) replacement task (see Figure 16) was then performed to evaluate the new version of the software. The details of this case study are presented in Table 4.

4.6.2. Detail of Evaluation Process

After basic HoloLens training for users, researchers introduced the current beta program and let users practice how to use it. Then, the real task test was conducted, and the overall system and the functions and UI components of each test were subjectively evaluated with a questionnaire. Users were interviewed and asked how they felt about the system interface, functions, and user interface components, as well as their shortcomings and suggestions for improvement. Additionally, software problems were discovered through further analysis of task execution video. The evaluation process is illustrated in Figure 17.

4.6.3. Feedback from Each Case Study

After the first case study, the system was improved based on the feedback from the participants. Then, the second case study was performed to check the effectiveness of our improvements. Some examples of user feedback are provided as follows.
Case No. 1
Feedback from Participants:
  • When moving or zooming the main-panel, the panel is also unintentionally rotated;
  • When zooming the panel with one hand, it is difficult to adjust the size (the area where the cursor can be recognized in the four corners of the main-panel is too small);
  • Participants hope the system’s previous and next buttons can be set on the left side of the main-panel as well;
  • The participant expects the system to give feedback after all steps are completed;
  • The icon of the Lock/Unlock button is not easy to understand.
Case No. 2
Feedback from Participants:
  • Parts Management Panel component appears in front of the user’s eyes, obscuring the user’s view of the main-panel content;
  • 3D animation indication icon is too small to be easily noticed by the user, and dynamic line is too thin;
  • When performing tasks, hand rays move the main-panel by mistake;
  • The icon of the re-manual-alignment tool is not easy to understand;
  • When resizing the panel with one hand, the collider size of the corners should be larger and easier to identify.

4.6.4. Summary of Software Platform Design Process

In this section, the detailed process of software platform development was used to describe how to improve the design process with specific user-, task-, and environment-centered design strategies.
After the software prototype was developed, the software was developed iteratively with case studies of different types of tasks. The iterative results, such as the function list and the final style of the software, are displayed in each process.
Meanwhile, the method of increasing the degree of freedom of the visualization methods (see Figure 2) is also implemented at the software development stage.

5. Contents: Development of General Contents Description Format

The stand-alone type of AR authoring tool consists mainly of a platform and content, a combination that can reduce the cost of system development for a specific task [17]. Content is defined as the specific task elements that correspond to the step and chapter structure of the software platform, as well as the parameters of the task support information used in each step. The structure of the content needs to represent generically all elements of an industrial operation, with their wide range of properties and interrelationships; their properties, semantic relationships, and dynamic changes must be carefully considered so that the dispersed resources are connected into a logical whole. At the data level, the content description format uses XML to express and store the content structure and to index the tracking data and asset files. In this section, we first describe the general task content formatting strategy. Then, following the design of the software platform and the idea of high-degree-of-freedom visualization methods, the specific details of the format design are introduced. Finally, the general content description format is evaluated.

5.1. Strategy of General Contents Description Format Design

In the format design, following the strategy for enhancing the freedom of the visualization methods shown in Figure 2, we designed the format with the following aspects.
1: Based on an analysis of numerous paper instruction manuals and their task structures, we divided the tutorial content description into two levels: Section and Step. Branch elements are provided to support a selective, non-linear tutorial narrative.
2: Supported assets types were divided into five types: text, image, video, 3D model, and audio. Supported attributes include element name, index path, spatial pose data, color, etc.
3: The coordinate system supports multiple independent object coordinates within single-step scenes, and also supports two kinds of coordinate registration (manual and marker) and static/dynamic coordinate registration. The main-panel supports the switch of registered coordinate types.
4: Visual assets resources support position change and spatial position, scaling, and rotation pose adjustment.
5: Visual assets resources support multiple elements in a single step; pictures and videos are not only presented in the main-panel, but are also presented in multiple sub-panels.

5.2. Detail of General Contents Description Format

Figure 18 shows the structure of the system content. In the Information section, the title and task description of the content are described. The following Global elements describe the content available across different sections or steps. The content of each section and step is described by the Section element and the Step element. Since a section corresponds to the level above steps, the Step element is described one level below the Section element in the system content. The Section and Step elements should be set according to the target task procedure. When the content consists of steps without sections, the Section element is omitted and the Step element is placed at the same hierarchy level as the Global element. In addition, if a section or step has branches, they are described using the Branch element, explained below.
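For illustration, a minimal sketch of this overall structure is given below. The element names follow the description above, while the root element name, the Information sub-elements, and the placeholder values are assumptions made for illustration only and do not reproduce the exact notation shown in Figure 18.

  <Content>
    <Information>
      <Title>Laptop HDD replacement</Title>
      <Description>Replace the internal hard disk drive of the laptop.</Description>
    </Information>
    <Global>
      <!-- main-panel settings and object coordinate system information (see Section 5.2.1) -->
    </Global>
    <Section>
      <SectionName>Remove the back cover</SectionName>
      <Step>
        <!-- per-step content (see Section 5.2.2) -->
      </Step>
    </Section>
    <Branch>
      <!-- branched sections or steps (see Section 5.2.3) -->
    </Branch>
  </Content>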

5.2.1. Global Elements

The Global element describes the settings of the main-panel and the information about the object coordinate system. The content of the main-panel settings is described in the Main-Panel element. The information related to the object coordinate system acquisition is described in the Calibration element.
Main-Panel Element
Figure 19 shows an example of the description of the Main-Panel element, which has a Coordinate element and a Transform element. The Coordinate element specifies the coordinate system used to fix the main-panel. As described above, the general AR task support system in this study can use the screen coordinate system, the SLAM coordinate system, and object coordinate systems. In the system content, the screen coordinate system is denoted as Screen and the SLAM coordinate system as Slam; an object coordinate system is given a name and is specified by that name. The main-panel can be fixed to more than one coordinate system: the coordinate system to which the main-panel is fixed at system startup is described in the InitialCoordinateSystem element, and the coordinate systems to which the operator can switch during work are described in the AvailableCoordinateSystem element. The Transform element describes the spatial position of the main-panel in the current coordinate system.
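A hedged sketch of such a Main-Panel description follows. Whether InitialCoordinateSystem and AvailableCoordinateSystem are nested inside the Coordinate element, the comma-separated listing of coordinate names, and the Transform sub-elements are assumptions for illustration rather than the exact notation of Figure 19; "Machine" stands for an object coordinate system defined in a Calibration element.

  <Main-Panel>
    <Coordinate>
      <InitialCoordinateSystem>Slam</InitialCoordinateSystem>
      <AvailableCoordinateSystem>Screen, Slam, Machine</AvailableCoordinateSystem>
    </Coordinate>
    <Transform>
      <!-- spatial pose of the main-panel in the current coordinate system (assumed sub-elements) -->
      <Position>0.0 0.0 0.6</Position>
      <Rotation>0 0 0</Rotation>
      <Scale>1 1 1</Scale>
    </Transform>
  </Main-Panel>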
Calibration Element
Figure 20 shows an example of the Calibration element, which describes information about an object coordinate system. The general AR task support system in this research uses three different methods to obtain the object coordinate system. Regardless of which method is used, the Name element is always described in the Calibration element and gives the name of the object coordinate system. The following Type element specifies the method used to obtain the object coordinate system: TrackableMarker for tracking with a marker, UntrackableMarker for a static registration method with a marker, and Manual for manual alignment using a 3D model. The subsequent Anchor element specifies the marker data or the 3D model data. These data are uploaded to the internet, and the URL of the stored location is described here. When the system uses the content for the first time, the required information (i.e., the marker data or 3D model data) is downloaded via this URL. This approach preserves the extensibility of the software platform. When using markers, the system needs to know the physical size of the marker, which is described by the MarkerSize element; the unit of this value is meters. Finally, the Frame element specifies, by URL, the wireframe data used to give feedback on whether the object coordinate system is being used correctly.
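The sketch below illustrates a Calibration element for a marker that is tracked at runtime. The URLs are placeholders and the exact value notation is an assumption (compare Figure 20).

  <Calibration>
    <Name>Machine</Name>
    <Type>TrackableMarker</Type>
    <!-- marker image (or 3D model data for Manual alignment), downloaded on first use -->
    <Anchor>https://example.com/content/machine_marker.png</Anchor>
    <!-- physical marker size in meters -->
    <MarkerSize>0.08</MarkerSize>
    <!-- wireframe used as feedback on correct registration -->
    <Frame>https://example.com/content/machine_wireframe.obj</Frame>
  </Calibration>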

5.2.2. Section and Step

The information used in each section/step is described in the Section element and Step element. Figure 21 shows a description example of the Section element and Step element. In the Section element, first, the title of the section is described in the SectionName element. This section title is displayed in the main-panel. Then, Step elements are described for each of the steps that the section has.
In the Step element, first, the title of the step is described with the StepName element. This is also displayed in the main-panel. In the following Main element, the text, video or image, and audio used on the main-panel are described. If the main-panel does not use this information, do not describe the Main element. When handling text in the main-panel, describe it in the Text element. When using video, introduce the Video element and describe the URL of the video data. When using an image, introduce the Image element and describe the URL of the image data. When using the text-to-speech voice of the main-panel, create an Audio element and describe the URL of the voice data there.
When additional information is provided to the user beyond the main-panel, its content is also described in the Step element. The Sub-Panel element contains a Name element, Coordinate element, Transform element, Color element, and Sub-Panel Content element. The Coordinate element selects the coordinate system to which the sub-panel is fixed, and the Transform element is the same as the one used for the Main-Panel element. Finally, the Sub-Panel Content element specifies the content shown on the sub-panel; as in the Main element, Text, Image, Video, and Audio elements are introduced to specify this content.
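A Sub-Panel fragment added inside such a Step element might be sketched as follows, again assuming an XML-style syntax; the element spelled "Sub-Panel Content" in the text is written here without the space, and all names, values, and URLs are placeholders.

  <Sub-Panel>
    <Name>WiringDetail</Name>
    <!-- Coordinate system to which this sub-panel is fixed -->
    <Coordinate>PCCoordinate</Coordinate>
    <Transform><!-- same structure as for the Main-Panel --></Transform>
    <Color>Blue</Color>
    <Sub-PanelContent>
      <Image>https://example.com/content/port_closeup.png</Image>
      <Text>Connect the cable to the highlighted port.</Text>
    </Sub-PanelContent>
  </Sub-Panel>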
Next, we describe how the pointer rope is specified. Figure 22 shows an example of the PointerRope element. The Target element specifies the name of the 3D model or coordinate system that serves as the start point and the endpoint. If a 3D model is specified as a target, the position is given in the coordinate system of that 3D model. In this example, the 3D model named ObjectName 01 is used as the start point and the 3D model named ObjectName 02 as the endpoint.
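Assuming the same XML-style syntax, a PointerRope description consistent with this example might be sketched as below; how the start and end targets are distinguished follows Figure 22, so the two-Target layout here is only an assumption.

  <PointerRope>
    <!-- Start point: the 3D model named ObjectName 01 -->
    <Target>ObjectName 01</Target>
    <!-- Endpoint: the 3D model named ObjectName 02 -->
    <Target>ObjectName 02</Target>
  </PointerRope>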
Finally, we describe how a 3D model is specified. Figure 23 shows an example of the 3D model element. When a 3D model is used, a ThreeDModel element is described in the Step element. In addition to the Name element, the ThreeDModel element contains a ModelData element, Transform element, and Animation element. The ModelData element specifies the 3D model data by URL, and the Transform element is the same as for the Main-Panel and Sub-Panel elements. The Animation element is set to Loop when the animation is played back in a loop, and to Once when it is played back only once.
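A corresponding ThreeDModel sketch, with a placeholder model name and URL, could look like this under the same XML-style assumption; see Figure 23 for the actual example.

  <ThreeDModel>
    <Name>ValveCover</Name>
    <ModelData>https://example.com/content/valve_cover.fbx</ModelData>
    <Transform><!-- same structure as for the Main-Panel and Sub-Panel --></Transform>
    <!-- Loop: play repeatedly; Once: play a single time -->
    <Animation>Loop</Animation>
  </ThreeDModel>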

5.2.3. Branch Elements

The general AR task support system considered in this study can handle branching of sections and steps. The Branch element is used to describe content with branching sections or steps; an example description is shown in Figure 24. The Branch element is described at the same hierarchy level as the Global and Section elements. It must be named with a Name element, after which Section and Step elements describe the sections or steps of the branch. In the step before the branch occurs, a CallBranch element is introduced in the Main element. With this element, the UI handle used to transition to the next step disappears and is replaced by buttons on the main-panel for selecting the branch destination; the name of each branch is written on its button. The CallBranch element specifies the branch candidates as branch names separated by commas.
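The following sketch, under the same XML-style assumption and with purely illustrative branch names, shows how a step could call two branch candidates and how one branch could be described at the same hierarchy level as the Section elements; Figure 24 shows the actual example.

  <!-- Step just before the branch: CallBranch replaces the next-step handle with selection buttons -->
  <Step>
    <StepName>Check the filter condition</StepName>
    <Main>
      <Text>Is the filter damaged?</Text>
      <CallBranch>ReplaceFilter, CleanFilter</CallBranch>
    </Main>
  </Step>

  <!-- One of the branch candidates -->
  <Branch>
    <Name>ReplaceFilter</Name>
    <Section>
      <SectionName>Replace the filter</SectionName>
      <Step>
        <StepName>Remove the old filter</StepName>
      </Step>
    </Section>
  </Branch>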

5.3. Evaluation of General Contents Description Format

5.3.1. Evaluation Method

The format needs to have enough flexibility to describe a variety of assembly and maintenance tasks. The evaluation was conducted according to the following procedures.
1. Collect assembly, disassembly, and maintenance task manuals for different products. It is not realistic to describe every type of task content with the proposed format, so we selected as many tasks of as many different types as possible. Table 5 summarizes the instruction manuals collected for this evaluation.
2. Design and imagine the display scheme of the manual content on the software platform. We converted the content of each manual into text, image, video, and 3D model materials and designed how these virtual contents would be presented on the software platform, taking into account the organizational structure of the content as well as its narrative logic.
3. Use the proposed format to describe the manual content. Following our design proposal, we used the proposed format to describe the designed content.

5.3.2. Evaluation Results

Almost all of the task content in each step could be described with our proposed contents description format. In other words, the proposed format can describe the content needed to support most assembly and maintenance tasks.

6. User Studies

6.1. Purpose of User Study

This study aims to demonstrate that a general software platform can work for various real-world industrial tasks and non-predefined environments, and that a system generated with a high degree of freedom in visualization methods enables users to achieve better task performance and a better user experience than a conventional system.

6.2. Systems

Two systems were developed to validate the effects of the proposed general software platform. They are based on the same conceptual model and have the same functionality, but differ in the freedom of visualization methods.
System A incorporated all of our considerations on the freedom of visualization methods described in Section 3. System B was implemented to mimic the functionality, the degrees of freedom of the coordinate system, and the visualization methods of visual assets provided in Microsoft Dynamics 365 Guides, one of the current mainstream tools. In addition, based on our design, participants using System A could dynamically configure the positions and angles of all real and virtual objects in the scene according to their actual task and environment. In contrast, participants using System B had to configure the objects according to the predefined environment.
The differences between the two systems are summarized in Figure 25. Both systems were developed using Unity3D 2019.4.17f1c1 and MRTK Toolkit 2.6.1, and released on HoloLens 2 using Visual Studio 2019. Operators can interact with the system using both short- and long-distance interactions, such as through voice commands or through natural hand actions.

6.3. Tasks

We used two tasks with different characteristics. The engine maintenance task involves the construction of a comparatively large component; the work proceeds around the engine and from the top, and the engine is easily displaced several times during the task. The PC assembly task is representative of a task on a relatively fixed object: the work proceeds from above the PC, and the PC is not easily displaced during the task; however, it involves small components, so more detailed instructions are required. This user study uses these different task types to verify research question 1, namely whether a system that provides more flexible visualization methods enables users to achieve good task performance and a good user experience.

6.3.1. Task 1: Engine Maintenance Task

As illustrated in Figure 26, the steps of Task 1 were designed from a real-world gasoline engine maintenance task [50], while the instructions were obtained from an official engine maintenance manual. Owing to the large number of parts in this task, the user must make some preparations for parts management before beginning. The parts are divided into large parts and small parts (such as screws). A real parts management box was used for the small parts, and a parts desk was adopted for the large parts that do not fit inside the real parts box (see Figure 27). The system provides step-by-step directions for the user to handle parts at the corresponding places.
In this scenario, the system needs to know three different positions (see Figure 27): the engine, the parts management box, and the parts desk. These were classified into two categories: those unlikely to move during the task (parts management box and parts desk) and those more likely to move (engine). In System A, which has a high degree of coordinate freedom, the user manually moves the virtual parts box onto the real parts box to align the system's coordinates with the real world as the initial calibration; the same calibration is performed for the parts desk. This feature allows the user to configure the assistance to suit his or her work environment. In contrast, in System B, which has a lower-degree-of-freedom coordinate system, the locations of the engine and the other parts areas in the scene are determined by scanning a QR code marker. Because System B has only one independent coordinate system, the positions of the parts box and parts desk cannot be adjusted, and the user must manually move the real parts to match the wire frame of the virtual parts box displayed at a position pre-determined by the content developer.
In System A, which has a higher-degree-of-freedom coordinate system, markers were attached to the visible top of the engine, ensuring that the QR code can be scanned for real-time position updates simply by looking down (Figure 28a). Two QR markers were used at different stages of the task so that scanning could continue as the shape of the object changed. At the beginning of the task, the system indicates the exact location to be maintained based on the position of the engine marker. At the stages of installing parts, the system indicates the position of each part in the parts box and in the virtual wire frame.
In System A, marker tracking provides participants with an accurate indication of direction and position for the current step, even if the engine is moved intentionally or accidentally during the task (see Figure 29). Additionally, a virtual engine was designed as a reference object for some steps, and the 3D information for each step is displayed on both the real engine and the virtual engine. The position of the virtual engine tracks the position of the real engine in real time and is displayed at the same angle as the real object, so the three-dimensional object provides a more intuitive aid when the user has difficulty understanding the two-dimensional content. System B also offers the virtual engine, but its position and angle are fixed and cannot be changed.

6.3.2. Task 2: PC Assembly Task

Figure 30 illustrates the experimental scene setting of Task 2 (PC assembly task). The task steps were derived from a real PC installation task, with the CPU installation steps omitted in consideration of the total task duration. Participants place all the parts on the specified table in advance (Figure 31). Parts management is the same as in the engine maintenance task described above: a real parts box is used for the small parts and virtual wire frames for the large parts.
In System A, which has a higher degree of freedom in the coordinate system, the marker is attached to the PC, allowing the user to scan the QR marker for real-time location updates simply by looking down (Figure 32a). At the start of the task, the system indicates the exact location on the PC to be worked on based on the position of the marker. At the stages of installing parts, it also indicates the exact position of each part in the parts box and in the virtual wire frames.
In System A, sub-panels are adopted for content assistance, as shown in Figure 33a. Each step contains more information because of the complex wiring of the motherboard in the PC installation task: the user must find the specific location of the interface on the motherboard, the correct cable, and the location of the interface on the cable. We assume that if this information were displayed only on the main-panel, the user would miss many details; a sub-panel is therefore needed to assist. Because the engine is small and thus more prone to being moved during the task, and marker tracking can provide more accurate real-time tracking, the engine maintenance task was employed to evaluate the effectiveness of marker tracking. To evaluate the effect of sub-panels, the PC assembly task was employed, since this task requires participants to refer to multiple pieces of information at each step, and sub-panels can provide more detailed guidance.

6.4. Participants

In this study, 33 participants were recruited from universities: 23 males and 10 females, aged 23–37 (mean = 27.8 years and SD = 4 years). They were randomly divided into two groups. Our experiment followed a between-subjects design. Half of the participants utilized System A, and the other half utilized System B. Both groups were instructed using identical training material.

6.5. Hypothesis

The evaluation revolves around the second part of research question 1: whether a system that provides more flexible visualization methods enables users to achieve good task performance and a good user experience. We used two different types of tasks for the evaluation, because the high-degree-of-freedom visualization methods have different effects depending on the task characteristics. The engine maintenance task mainly reflects the performance of the high-degree-of-freedom coordinate system and coordinate registration modes, so in this task we focus on evaluating marker tracking. The PC assembly task mainly reflects the performance of the high-degree-of-freedom 2D content media and asset flexibility, so in this task we focus on evaluating the sub-panel.
The following hypotheses are proposed:
The Effects of Marker Tracking
  • (Task performance)
    H1: Participants using the system with marker tracking complete the task more quickly than those using the system without marker tracking.
    H2: Participants using the system with marker tracking complete the task with fewer errors than those using the system without marker tracking.
  • (User experience)
    H3: Participants using the system with marker tracking complete the task with less workload demand than those using the system without marker tracking.
    H4: Participants prefer the system with marker tracking more than the system without marker tracking.
The Effects of Sub-panels
  • (Task performance)
    H5: Participants using the system with sub-panels complete the task more quickly than those using the system without sub-panels.
    H6: Participants using the system with sub-panels complete the task with fewer errors than those using the system without sub-panels.
  • (User experience)
    H7: Participants using the system with sub-panels complete the task with less workload demand than those using the system without sub-panels.
    H8: Participants prefer the system with sub-panels to the system without sub-panels.

6.6. Measures

A series of analyses were conducted to test our hypotheses. The evaluation results were analyzed both qualitatively and quantitatively.
Regarding task performance, task completion time and the errors made during task completion were selected as the quantitative evaluation criteria.
The assembly time was precisely recorded by the system, while the occurrence of errors was visually confirmed by the experimenter. If any of the following three conditions occurred: (1) the result of an operation was incorrect, (2) the participant sought assistance, or (3) no progress was made for more than 1 min, the experimenter immediately stopped the experiment and informed the participant of the appropriate working procedure.
This intervention aimed to prevent errors in one step from affecting subsequent steps. Moreover, built-in scripts recorded the spatial location of the HMD worn by each participant while the task was in progress, which was used as complementary data. Concerning user experience, NASA-TLX [51] and the System Usability Scale (SUS) [52] were employed for the perceived workload assessment and the usability evaluation, respectively. Interviews were then conducted to ask participants about their system preferences; the interview results were used for the qualitative analysis of the user experience.

6.7. Experiment Procedures

The study was designed to be completed in approximately 60 min for both tasks. The experiment was divided into six stages.
1. Participants were welcomed upon arrival and asked to read and sign the consent form (the study was approved by the Institutional Review Board of the university). Afterward, participants were given an introduction to the background of the experiment.
2. The concept of AR was briefly introduced by having participants watch the MRTK [28] demo video.
3. After that, each participant adapted to the HoloLens and practiced interacting with the MRTK demo objects using short- and long-distance interaction methods.
4. Participants were asked to watch an introduction video of our system to understand how to use the system.
5. Participants were asked to conduct a laptop hard drive replacement task with our system as a training session to get familiar with system operation.
6. Participants performed the task with the assistance of the assigned system. They independently followed the tutorial and positioned the specified parts in the designated locations.
After each task, participants were asked to fill out the SUS questionnaire and the NASA-TLX workload questionnaire. Moreover, participants were encouraged to provide feedback on the system or their experience.

6.8. Results

6.8.1. Task Performance Evaluation

Completion Time
Hypothesis 1. Figure 34 illustrates the distribution of completion times in the engine maintenance task. The steps with marker tracking in Engine A were selected for comparison. Group Engine A was further divided into two groups based on how frequently marker tracking was used, with Engine A-1 being the group that used it more frequently. After data normality was confirmed with the Shapiro–Wilk test, a one-way ANOVA was conducted, revealing no significant effect among the conditions Engine A-1 (M = 52.88 min and SD = 7.70 min), Engine A-2 (M = 50.22 min and SD = 9.26 min), and Engine B (M = 51.75 min and SD = 9.50 min) (F = 0.186, p = 0.813).
Hypothesis 5. Figure 35 presents the average time spent on the PC assembly task. The steps with sub-panels in PC A were selected for comparison. A Shapiro–Wilk test demonstrated that the data were normally distributed. PC A (M = 26.63 min and SD = 7.17 min) took more time than PC B (M = 24.06 min and SD = 6.64 min), a 9.65% increase in the total time spent. However, a one-way ANOVA suggested no significant effect between PC A and PC B (F = 0.033, p = 0.858).
Errors during Task Implementation
Hypothesis 2. Figure 36 shows the box chart of errors during the engine maintenance task. A Shapiro–Wilk test indicated that the data were not normally distributed, so a Kruskal–Wallis test was performed, revealing no significant differences among Engine A-1 (Md = 1.5 and n = 8), Engine A-2 (Md = 3 and n = 9), and Engine B (Md = 1.5 and n = 16) (H = 1.687, p = 0.430).
Hypothesis 6. Figure 37 displays the box chart of errors in the PC assembly task. A Shapiro–Wilk test indicated that the data were not normally distributed. As observed in the figure, the spread of PC A (Md = 1 and n = 16) is smaller than that of PC B (Md = 1 and n = 17). A Mann–Whitney U test suggested no significant difference between PC A and PC B (U = 118, p = 0.491).

6.8.2. User Experience Evaluation

NASA-TLX Score
Hypothesis 3. The total and subscale scores of NASA-TLX and the SUS scores were calculated. The NASA-TLX subscales are Mental Demand (MD), Physical Demand (PD), Temporal Demand (TD), Performance (PE), Effort (EF), and Frustration (FR). For the engine maintenance task, Figure 38 presents the NASA-TLX overall mean values and standard errors. ANOVA revealed no significant differences among Engine A-1 (M = 35.42 and SD = 11.79), Engine A-2 (M = 51.45 and SD = 16.53), and Engine B (M = 46.31 and SD = 19.35). Figure 39 provides the scores for each NASA-TLX subscale. ANOVA demonstrated that whether the system was equipped with tracking had no significant impact on PD (F = 1.233, p = 0.306), TD (F = 2.217, p = 0.137), PE (F = 0.432, p = 0.653), EF (F = 0.766, p = 0.474), or FR (F = 1.027, p = 0.370). Regarding MD, there was a significant difference between Engine A-1 (M = 6.1 and SD = 3.27) and Engine B (M = 10.88 and SD = 4.49) (F = 7.029, p = 0.015). In the figures, significantly different pairs are marked with * (p ≤ 0.05) and ** (p ≤ 0.01).
Hypothesis 7. Figure 40 depicts the NASA-TLX overall mean values and standard errors for the PC assembly task. ANOVA revealed no significant difference between PC A (M = 40.75 and SD = 21.25) and PC B (M = 45.59 and SD = 20.01) (F = 0.454, p = 0.506). Figure 41 exhibits the scores for each NASA-TLX subscale. Mann–Whitney U tests revealed that the system equipped with sub-panels had no significant impact on MD (U = 120.50, p = 0.575), PD (U = 128.00, p = 0.773), PE (U = 119.50, p = 0.548), EF (U = 107.50, p = 0.303), or FR (U = 120.50, p = 0.575). ANOVA indicated a significant difference at the 0.05 level on TD between PC A (M = 5.31 and SD = 3.00) and PC B (M = 8.35 and SD = 4.36) (F = 5.377, p = 0.027).
SUS Score
The overall usability of the system was further evaluated using the SUS questionnaire. Figure 42 illustrates the SUS scores and standard errors for the engine maintenance task. A Kruskal–Wallis test revealed no significant difference among Engine A-1 (Md = 81.3 and n = 8), Engine A-2 (Md = 65.0 and n = 9), and Engine B (Md = 71.3 and n = 16) (H = 3.949, p = 0.139). Figure 43 exhibits the SUS scores and standard errors for the PC assembly task. An ANOVA suggested no significant difference between PC A (M = 72.81 and SD = 13.16) and PC B (M = 66.32 and SD = 18.07) (F = 1.375, p = 0.250).

6.8.3. User Opinion

Hypotheses 4 and 8. The qualitative results regarding user experience were investigated. In response to the question "Which system did you prefer, A or B?", 82% of participants preferred the system with marker tracking (Figure 44) and 81% preferred the system with sub-panels (Figure 45).

6.9. Discussion of User Studies

In both cases, users accomplished the specified tasks. The task performance results suggest no significant improvement, especially in completion time, despite a slight improvement with the high-degree-of-freedom visualization methods. As the NASA-TLX results imply, System A, with its high-degree-of-freedom visualization methods, showed a significant reduction in some factors of subjective workload compared to System B for both tasks. The user opinion results demonstrate that participants' overall experience with System A was positive, contributing to their satisfaction. Compared with System B, System A showed no significant difference in task performance, but performed better regarding the users' cognitive load and preference. The relevant analysis is described as follows.

6.9.1. Regarding Task Performance

Marker Tracking
System A has an object tracking feature that allows participants to perform steps requiring different object orientations by rotating the engine without moving themselves, which lessens the distance participants have to move and thus saves time. In contrast, System B requires participants to move around for the task in order to avoid repeating the manual object alignment. However, analysis of the videos of all participants showed that whether they chose to move themselves or move the engine had no large impact on the time spent on the task. Marker tracking has a greater advantage in reducing task time if the object is harder to hold stationary (such as a spherical object), or if the work must approach the object from multiple directions.
Because the engine is prone to shifts and angle changes during task execution, participants using System B incurred registration errors between the virtual instructions and the real object, whereas such errors rarely occurred with System A. Interviews with participants using System B revealed that, even when object misalignment occurred, they could still find the correct working position using the picture instructions as supporting information. As a result, the two systems demonstrated no significant difference in error rate. Marker tracking's advantage in reducing error rates would be greater if the object had more similar working positions or similar interfaces.
Sub-Panels
System A employs sub-panels to link virtual 2D content with the real object. User feedback and video analysis suggest that many sub-panels increase the amount of virtual material in the scene, requiring participants to observe it all within a small field of view; as a result, the time spent on the task increases. Sub-panels allow a high degree of interaction, facilitating task comprehension but increasing task duration.
System A adopts the sub-panel for computer wiring to help participants identify ports quickly. However, user interviews revealed that, regardless of the system version, these processes were error-proof because of the fool-proofing design [53] of the cable contact ports in each step.

6.9.2. Regarding Cognitive Load

Marker Tracking
The experimental results show that System A with marker tracking reduced participants' mental demand, for three reasons. First, System A provides precise location instructions, allowing participants to complete the task in a more relaxed manner.
Secondly, some participants preferred to consult the reference virtual engine when the step-by-step assistance was difficult to understand. System A with marker tracking provides co-rotation between the virtual engine and the real engine, leading to more interaction with the virtual content. This not only deepens the understanding of the engine structure but also enhances the user's enjoyment of the task. Finally, the HMD movement data in the engine maintenance task showed that participants using System A moved their heads a shorter distance than those using System B during the bolt installation step, which accounted for one-third of the total engine maintenance task. Figure 46 illustrates the HMD movement length during bolt installation; the steps for installing the bolts were selected for comparison. A Shapiro–Wilk test confirmed that the data were normally distributed. ANOVA demonstrated a significant difference between Engine A (M = 114.39 and SD = 25.84) and Engine B (M = 138.89 and SD = 27.34) at the 95% confidence level (F = 7.001, p = 0.013).
Sub-Panels
The experimental results show that System A with sub-panels reduced participants' temporal demand. User interviews suggested that System A provided detailed information for participants with no experience in the task; novices therefore had a more reasonable estimate of the task's difficulty and completion time, which increased confidence and reduced anxiety. Participants who used System B felt a lack of adequate assistance and found it difficult to gauge the progress of task completion.

6.9.3. Regarding User Opinion

According to the user interviews, participants found complex tasks easier to accomplish with the assistance of an AR task support system, regardless of whether it was System A or System B. The qualitative results on system preferences imply that the majority of users preferred System A. Subjective user opinions from the interviews are listed as follows.
Marker Tracking
  • Marker Tracking allows users to quickly find locations;
  • Marker Tracking is useful for multi-directional tasks;
  • Marker Tracking allows users to freely rotate objects and form a 3D shape of objects in their heads, making it easier to familiarize themselves with objects;
  • Without Marker Tracking, the task would be very stressful.
Sub-Panels
  • Sub-Panel can provide more detailed information;
  • Sub-Panel is friendly to novices;
  • Sub-Panel is closer, making it easier to view the contents.

6.9.4. Regarding System Adaptability

The support of the two systems for customized environments was investigated. According to the experimenters' observations, participants who used System A differed from one another in how they placed the engine and the parts management box, as expected. Some users mentioned that with System A they could determine the position and orientation of the engine in the scene, as well as the distance between the parts box and themselves, according to their own preferences. Participants using System B had to adopt the pre-set spatial positions of objects and were unable to change the position of individual objects in the scene. In other words, System A not only adapts well to industrial scenarios but also meets the individual needs of users during task implementation.

6.10. Limitations

This user study has the following limitations.
1. Types of tasks. The selected engine maintenance and PC assembly tasks were derived from real industrial tasks, but the desired improvement in task performance was not fully achieved in these tasks.
2. Environment. The engine and PC experiments were conducted in a school classroom, which is idealized compared to a real factory environment.
3. Marker tracking. Scanning the markers sometimes became difficult during the task because of the different heights of the participants. Moreover, it is difficult to attach markers that remain usable from all viewing angles of the objects in real scenes, owing to the irregular shapes of the objects.
4. The display style of virtual 3D symbols. According to user feedback, System A can provide precise location indications for components on objects. Nevertheless, this blocks the observation of the objects, since the virtual symbols are overlaid on them. Even if the user moves the object, this situation does not change because the virtual symbol keeps tracking the object. This issue should be considered in future use of virtual 3D symbols.

7. Discussion

In this section, we mainly discuss the proposed research questions.
  • About R1 (For software platform):
    Regarding system adaptability
    From the task preparation and implementation process, we find that in non-predefined environments, participants can freely configure the spatial positions of objects in the scene using our proposed system. In addition, the proposed system supports multiple independently and dynamically aligned objects, or multiple manually aligned objects, in a single step; a change in the spatial position of one independent object coordinate system does not affect the alignment of the other objects. This shows that our system adapts to different environments and tasks better than the conventional system.
    Regarding task performance and user experience
    The system has visualization methods with a high degree of freedom, which can provide users with more, and more flexible, assistance, improve the user experience, and reduce cognitive load. However, according to the quantitative evaluation, especially in terms of task time and error rate, our system did not improve significantly. The reasons and the experimental limitations are explained in Sections 6.9.1 and 6.10.
  • About R2 (For content description format):
    Regarding the describability of the general content description format
    We used the proposed content description format to describe the content of the different types of task manuals we collected. As a result, the proposed format can support the majority of assembly and maintenance tasks. For unknown or more complex tasks, we will continue to revise the format to improve its describability.

8. Conclusions

In this paper, a systematic strategy was proposed to improve the adaptability of an AR task support software platform to different types of tasks and environments, as well as the user experience and task performance. This strategy consists of two approaches. First, the degree of freedom of the visualization methods was improved by increasing the number of coordinate systems and coordinate registration modes supported by the system, providing more 2D content display media, and boosting the degree of freedom of user interaction with virtual assets. Second, the design process of the software platform was optimized by incorporating user-, environment-, and task-centered design strategies throughout the software development process to continuously improve software adaptability. This deepens user acceptance of the system, reduces learning costs, and improves the user experience.
To verify the performance of our systematic strategy, the corresponding software platform was first developed following our design strategy. Second, a content description format capable of describing multiple types of tasks was designed. Two user studies were then implemented to test the adaptability of the proposed software platform to different types of real industrial tasks, and whether the high-degree-of-freedom visualization methods enhance task performance and user experience.
The results demonstrated that the proposed scheme allows the operator to customize the object layout in the task scenario and thus adapt to multiple types of scenarios, and that the system can support task implementation for both fixed and moving objects. There was no great improvement in task performance compared with the conventional system; nonetheless, the proposed approach improved the user experience and reduced the users' cognitive workload.
The future work of this study is as follows. Firstly, the software platform's ability to parse content description files will be improved. Secondly, based on the task-centered principle, the software platform will be iteratively updated through different types of tasks to improve system adaptability. Finally, the adaptability of the content description file format to multiple AR task support software platforms will be reinforced, enabling content files to circulate widely as a common description format.

Author Contributions

Conceptualization, Y.S., S.U., Y.F. and H.K.; formal analysis, Y.S., S.U. and Y.F.; investigation, Y.S., S.U. and Y.F.; writing—original draft preparation, Y.S., S.U. and Y.F.; writing—review and editing, Y.F., T.S., M.K. and H.K.; supervision, Y.F. and H.K.; project administration, Y.S., S.U. and Y.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Institutional Review Board (or Ethics Committee) of the Nara Institute of Science and Technology (2019-I-29, approved on 28 February 2020).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Sorko, S.R.; Trattner, C.; Komar, J. Implementing AR/MR–Learning factories as protected learning space to rise the acceptance for Mixed and Augmented Reality devices in production. Procedia Manuf. 2020, 45, 367–372. [Google Scholar] [CrossRef]
  2. Webel, S.; Bockholt, U.; Engelke, T.; Gavish, N.; Olbrich, M.; Preusche, C. An augmented reality training platform for assembly and maintenance skills. Robot. Auton. Syst. 2013, 61, 398–403. [Google Scholar] [CrossRef]
  3. Nee, A.Y.; Ong, S.K. Virtual and augmented reality applications in manufacturing. IFAC Proc. Vol. 2013, 46, 15–26. [Google Scholar] [CrossRef]
  4. Azuma, R.T. A survey of augmented reality. Presence Teleoperators Virtual Environ. 1997, 6, 355–385. [Google Scholar] [CrossRef]
  5. Stocker-Wörgötter, E. Biochemical diversity and ecology of lichen-forming fungi: Lichen substances, chemosyndromic variation and origin of polyketide-type metabolites (biosynthetic pathways). In Recent Advances in Lichenology; Springer: Berlin, Germany, 2015; pp. 161–179. [Google Scholar]
  6. Aromaa, S.; Väätänen, A.; Aaltonen, I.; Heimonen, T. A model for gathering and sharing knowledge in maintenance work. In Proceedings of the European Conference on Cognitive Ergonomics 2015, Warsaw, Poland, 1–3 July 2015; pp. 1–8. [Google Scholar]
  7. Gattullo, M.; Scurati, G.W.; Evangelista, A.; Ferrise, F.; Fiorentino, M.; Uva, A.E. Informing the use of visual assets in industrial augmented reality. In Proceedings of the International Conference of the Italian Association of Design Methods and Tools for Industrial Engineering, Modena, Italy, 9–10 September 2019; pp. 106–117. [Google Scholar]
  8. Li, W.; Wang, J.; Jiao, S.; Wang, M.; Li, S. Research on the visual elements of augmented reality assembly processes. Virtual Real. Intell. Hardw. 2019, 1, 622–634. [Google Scholar] [CrossRef]
  9. Webel, S.; Bockholt, U.; Keil, J. Design criteria for AR-based training of maintenance and assembly tasks. In Proceedings of the International Conference on Virtual and Mixed Reality, Orlando, FL, USA, 9–14 July 2011; pp. 123–132. [Google Scholar]
  10. Agati, S.S.; Bauer, R.D.; Hounsell, M.d.S.; Paterno, A.S. Augmented reality for manual assembly in industry 4.0: Gathering guidelines. In Proceedings of the 22nd Symposium on Virtual and Augmented Reality (SVR), Porto de Galinhas, Brazil, 7–10 November 2020; pp. 179–188. [Google Scholar]
  11. Ciuffini, A.F.; Di Cecca, C.; Ferrise, F.; Mapelli, C.; Barella, S. Application of virtual/augmented reality in steelmaking plants layout planning and logistics. Metall. Ital. 2016, 7, 5–10. [Google Scholar]
  12. Princle, A.; Campbell, A.G.; Hutka, S.; Torrasso, A.; Couper, C.; Strunden, F.; Bajana, J.; Jastząb, K.; Croly, R.; Quigley, R.; et al. [Poster] Using an Industry-Ready AR HMD on a Real Maintenance Task: AR Benefits Performance on Certain Task Steps More Than Others. In Proceedings of the 2018 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), Munich, Germany, 16–20 October 2018; pp. 236–241. [Google Scholar]
  13. Ramirez, H.; Mendivil, E.G.; Flores, P.R.; Gonzalez, M.C. Authoring software for augmented reality applications for the use of maintenance and training process. Procedia Comput. Sci. 2013, 25, 189–193. [Google Scholar] [CrossRef]
  14. Knopfle, C.; Weidenhausen, J.; Chauvigné, L.; Stock, I. Template based authoring for AR based service scenarios. In Proceedings of the IEEE Proceedings, VR 2005, Virtual Reality, Bonn, Germany, 12–16 March 2005; pp. 237–240. [Google Scholar]
  15. PTC Products. 2022. Available online: https://www.ptc.com/en/products (accessed on 26 December 2022).
  16. CareAR Instruct. 2022. Available online: https://carear.com/carear-instruct/ (accessed on 23 December 2022).
  17. Roberto, R.A.; Lima, J.P.; Mota, R.C.; Teichrieb, V. Authoring tools for augmented reality: An analysis and classification of content design tools. In Proceedings of the International Conference of Design, User Experience, and Usability, Toronto, ON, Canada, 17–22 July 2016; pp. 237–248. [Google Scholar]
  18. Henderson, S.; Feiner, S. Exploring the benefits of augmented reality documentation for maintenance and repair. IEEE Trans. Vis. Comput. Graph. 2010, 17, 1355–1368. [Google Scholar] [CrossRef] [PubMed]
  19. Bellalouna, F. Industrial use cases for augmented reality application. In Proceedings of the 2020 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom), Mariehamn, Finland, 23–25 September 2020; pp. 000011–000018. [Google Scholar]
  20. Nee, A.Y.; Ong, S.; Chryssolouris, G.; Mourtzis, D. Augmented reality applications in design and manufacturing. CIRP Ann. 2012, 61, 657–679. [Google Scholar] [CrossRef]
  21. Reiners, D.; Stricker, D.; Klinker, G.; Müller, S. Augmented reality for construction tasks: Doorlock assembly. In Proceedings of the International Workshop on Augmented Reality: Placing Artificial Objects in Real Scenes, Bellevue, WA, USA, 10 November 1999; pp. 31–46. [Google Scholar]
  22. Salonen, T.; Sääski, J. Dynamic and visual assembly instruction for configurable products using augmented reality techniques. In Advanced Design and Manufacture to Gain a Competitive Edge; Springer: Berlin, Germany, 2008; pp. 23–32. [Google Scholar]
  23. Unity Engine. 2022. Available online: https://unity.com/ (accessed on 23 December 2022).
  24. Unreal Engine. 2022. Available online: https://www.unrealengine.com/ (accessed on 23 December 2022).
  25. Amazon Sumerian. 2022. Available online: https://aws.amazon.com/cn/sumerian/ (accessed on 23 December 2022).
  26. Kato, H.; Billinghurst, M. Marker tracking and hmd calibration for a video-based augmented reality conferencing system. In Proceedings of the 2nd IEEE and ACM International Workshop on Augmented Reality (IWAR’99), San Francisco, CA, USA, 20–21 October 1999; pp. 85–94. [Google Scholar]
  27. Vuforia. 2022. Available online: https://library.vuforia.com/ (accessed on 23 December 2022).
  28. Mixed Reality Toolkit. 2022. Available online: https://hololenscndev.github.io/MRTKDoc/README.html (accessed on 23 December 2022).
  29. Vuforia Expert Capture. 2022. Available online: https://www.ptc.com/en/products/vuforia/vuforia-expert-capture (accessed on 23 December 2022).
  30. Vuforia Chalk. 2022. Available online: https://www.ptc.com/en/products/vuforia/vuforia-chalk (accessed on 23 December 2022).
  31. Web Service. 2022. Available online: https://en.wikipedia.org/wiki/Web_service (accessed on 23 December 2022).
  32. Ganlin, Z.; Pingfa, F.; Jianfu, Z.; Dingwen, Y.; Zhijun, W. Information integration and instruction authoring of augmented assembly systems. Int. J. Intell. Syst. 2021, 36, 5028–5050. [Google Scholar] [CrossRef]
  33. Bundlar. 2022. Available online: https://bundlar.com (accessed on 23 December 2022).
  34. ScopeAR. 2022. Available online: https://www.scopear.com/solutions/worklink-platform/ (accessed on 23 December 2022).
  35. Dynamic 365 Guides. 2022. Available online: https://dynamics.microsoft.com/en-us/mixed-reality/guides/ (accessed on 21 December 2022).
  36. Zhu, J.; Ong, S.; Nee, A. A context-aware augmented reality system to assist the maintenance operators. Int. J. Interact. Des. Manuf. (IJIDeM) 2014, 8, 293–304. [Google Scholar] [CrossRef]
  37. Geng, J.; Song, X.; Pan, Y.; Tang, J.; Liu, Y.; Zhao, D.; Ma, Y. A systematic design method of adaptive augmented reality work instruction for complex industrial operations. Comput. Ind. 2020, 119, 103229. [Google Scholar] [CrossRef]
  38. Willis, J.; Wright, K.E. A general set of procedures for constructivist instructional design: The new R2D2 model. Educ. Technol. 2000, 40, 5–20. [Google Scholar]
  39. Baek, E.O.; Cagiltay, K.; Boling, E.; Frick, T. 53. User-Centered Design and Development. In Handbook of Research on Educational Communications and Technology; Routledge: Oxfordshire, UK, 2008. [Google Scholar]
  40. Microsoft. Hololens 2. 2022. Available online: https://www.microsoft.com/en-us/hololens/ (accessed on 26 December 2022).
  41. Microsoft. Mixed Reality Design. 2022. Available online: https://learn.microsoft.com/en-us/windows/mixed-reality/design/design (accessed on 26 December 2022).
  42. Agrawala, M.; Phan, D.; Heiser, J.; Haymaker, J.; Klingner, J.; Hanrahan, P.; Tversky, B. Designing effective step-by-step assembly instructions. ACM Trans. Graph. (TOG) 2003, 22, 828–837. [Google Scholar] [CrossRef]
  43. Zarraonandia, T.; Aedo, I.; Díaz, P.; Montero Montes, A. Augmented presentations: Supporting the communication in presentations by means of augmented reality. Int. J. Hum.-Comput. Interact. 2014, 30, 829–838. [Google Scholar] [CrossRef]
  44. Goodrum, D.A.; Dorsey, L.T.; Schwen, T.M. Defining and building an enriched learning and information environment. Educ. Technol. 1993, 33, 10–20. [Google Scholar]
  45. YouTube. 2022. Available online: https://www.youtube.com (accessed on 26 December 2022).
  46. Microsoft. Direct-Manipulation. 2022. Available online: https://learn.microsoft.com/en-us/windows/mixed-reality/design/direct-manipulation (accessed on 26 December 2022).
  47. Microsoft. Point and Commit with Hands. 2022. Available online: https://learn.microsoft.com/en-us/windows/mixed-reality/design/point-and-commit (accessed on 26 December 2022).
  48. Microsoft. Voice-Input. 2022. Available online: https://learn.microsoft.com/en-us/windows/mixed-reality/design/voice-input (accessed on 26 December 2022).
  49. Tsotsos, J.K. Analyzing vision at the complexity level. Behav. Brain Sci. 1990, 13, 423–445. [Google Scholar] [CrossRef]
  50. Industries, M.H. 4 Stroke Gasoline Engine. 2022. Available online: https://www.mhi.com/products/industry/gasoline_engine.html (accessed on 25 December 2022).
  51. Hart, S.G.; Staveland, L.E. Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In Advances in Psychology; Elsevier: Amsterdam, The Netherlands, 1988; Volume 52, pp. 139–183. [Google Scholar]
  52. Brooke, J. SUS-A quick and dirty usability scale. Usability Eval. Ind. 1996, 189, 4–7. [Google Scholar]
  53. Wikipedia. Fool-Proof Design. 2022. Available online: https://en.wikipedia.org/wiki/Idiot-proof (accessed on 25 December 2022).
Figure 1. Conceptual diagram of platform and content in the AR task support system.
Figure 2. Comparison of conventional AR support software platform with our proposed high-degree-of-freedom visualization methods solution.
Figure 3. Software Platform Development Process.
Figure 4. For an example of contents analysis of Dynamic 365 Guides.
Figure 5. For an example of functional level analysis of Dynamic 365 Guides.
Figure 6. Informational Components.
Figure 7. Sketch of Input Controls Components.
Figure 8. Sketch of navigational components.
Figure 9. Sketch of parts management components.
Figure 10. Example of main-panel interaction flow.
Figure 11. The main-panel after implementation.
Figure 12. The color change that occurs in the main interface when the user interacts and does not interact.
Figure 13. The color change that occurs when the user locks the position of virtual parts management box.
Figure 14. Fixed mode of main-panel.
Figure 15. Case No. 1: Coffee machine maintenance task.
Figure 16. Case No. 2: HDD replacement task.
Figure 17. Software Prototype Evaluation Process.
Figure 18. Structure of contents description format.
Figure 19. Example of Main-Panel element description.
Figure 20. Example of Calibration element description.
Figure 21. Description example of Section element and Step element. This step uses sub-panel as an Elaboration element.
Figure 22. Description example of PointerRope element.
Figure 23. Description example of 3D Model element.
Figure 24. Description example of Branch element.
Figure 25. The differences between the two systems.
Figure 26. Task 1: Engine carbon cleaning task.
Figure 27. Experimental scene setting of engine maintenance task.
Figure 28. (a) The engine has two markers attached to different parts to ensure that objects are tracked throughout the entire process. (b) There is no marker on the engine, and the marker on the table is used to determine the coordinates of all virtual objects in the scene except for the main-panel.
Figure 29. The difference between the two systems in the engine maintenance task. (a) System A can display the 3D model or sub-panels in the correct position/orientation by marker tracking. (b) In system B, if the object (the engine) is accidentally moved, the participants need to move the object back to its original position, relying on the wire-frame of the virtual engine bottom shape.
Figure 30. Task 2: PC assembly task.
Figure 31. Experimental scene setting of PC assembly task.
Figure 32. (a) The engine has two markers attached to different parts to ensure that objects are tracked throughout the entire process. (b) There is no marker on the PC, and the marker on the table is used to determine the coordinates of all virtual objects in the scene except for the main-panel.
Figure 33. The difference between the two systems in PC assembly task. (a) In System A, multiple sub-panels can be used in conjunction with the main-panel. (b) In System B, all text, images, and videos are displayed in the main-panel (3D model still available).
Figure 34. The mean of completion time of engine maintenance task.
Figure 35. The mean of completion time of PC assembly task.
Figure 36. Mean number of errors during the implementation in engine maintenance task.
Figure 37. Mean number of errors during the implementation in PC assembly task.
Figure 38. Weighted average score of NASA-TLX workload in engine maintenance task.
Figure 39. Subscale of NASA-TLX workload in engine maintenance task.
Figure 40. Weighted average score of NASA-TLX workload in PC assembly task.
Figure 41. Sub-scale of NASA-TLX workload in PC assembly task.
Figure 42. Mean SUS score for each participant in engine maintenance task.
Figure 43. Mean SUS score for each participant in PC assembly task.
Figure 44. User preference for the system with marker tracking.
Figure 45. User preference for the system with sub-panels.
Figure 46. The movement distance of HMDs in bolt installation.
Table 1. Visual Assets List.
Visual Assets: Text, Image, Video, 3D model, 3D Indicator.
Information Type: Task title, Section title & number, Step title & number, Step Instruction, Image Instruction, Video Instruction, 3D Instruction.
Table 2. Function Table (each function is classified as Constant or On-Demand).
Necessary Functions: Switch to the previous/next step; Switch main-panel fixation mode (Follow/Stay); Zoom in/out; Indicate the direction of the main-panel; Adjust size/position/rotation of panel; Adjust size/position/rotation of 3D model; Adjust the video progress; Open the anchor tool; Minimize and maximize the window; Marker scanner; Indicate the position of the object; Manual alignment.
Useful Functions: Display/Hide function list; Quickly switch the step; Play/Pause video; Go to homepage; Go to the previous/next page (remote control); Quickly change the position of the previous/next button; Date/time; Remote switch of main-panel stabilized mode; Adjust the progress of the video.
Table 3. The details of the first case study.
System Version: V0.1
Type of task: Maintenance task
Single/Multi Direction: Single direction
Number of locations: One
Number of Participants: 5
User pose: Stand
Content Composition: One section
Logic of Content: Linear
Assets: Text, Image
Test Functions: Switch to the previous/next step; Zoom in/out; Adjust size/position/rotation of panel; Switch main-panel fixation mode (Follow/Stay); Minimize and maximize the window; Go to the previous/next page (remote control)
Table 4. The details of the second case study.
System Version: V0.2
Type of task: Disassembly and assembly task
Single/Multi Direction: Single direction
Number of locations: One
Number of Participants: 6
User pose: Stand
Content Composition: Four sections
Logic of Content: Non-linear
Assets: Text, Image, Video, 3D Model, 3D Indicator
Test Functions: Switch to the previous/next step; Zoom in/out; Adjust size/position/rotation of panel; Switch main-panel fixation mode (Follow/Stay); Minimize and maximize the window; Go to the previous/next page (remote control)
New Functions: Play/Pause video; Adjust the progress of the video; Quickly switch the step; Marker scanner; Indicate the position of the object; Manual alignment; Adjust size/position/rotation of 3D model; Open the anchor tool
Table 5. The instruction manuals collected for the evaluation (number of products and of assembly, disassembly, and maintenance tasks per work type).
Manufacturing: 7 products, 6 assembly tasks, 4 disassembly tasks, 12 maintenance tasks
Construction: 9 products, 8 assembly tasks, 1 disassembly task, 0 maintenance tasks
Chemical industry: 2 products, 0 assembly tasks, 0 disassembly tasks, 4 maintenance tasks
Transportation: 1 product, 2 assembly tasks, 2 disassembly tasks, 2 maintenance tasks
Furniture: 3 products, 3 assembly tasks, 0 disassembly tasks, 0 maintenance tasks
Home Appliances: 3 products, 5 assembly tasks, 0 disassembly tasks, 4 maintenance tasks