Deep-Learning-Incorporated Augmented Reality Application for Engineering Lab Training

Abstract: Deep learning (DL) algorithms have achieved significantly high performance in object detection tasks. At the same time, augmented reality (AR) techniques are transforming the ways we work and connect with people. With the increasing popularity of online and hybrid learning, we propose a new framework for improving students' learning experiences with electrical engineering lab equipment by combining these technologies. The DL-powered automatic object detection component integrated into the AR application is designed to recognize equipment such as multimeters, oscilloscopes, wave generators, and power supplies. A deep neural network model, namely MobileNet-SSD v2, is implemented for equipment detection using TensorFlow's object detection API. When a piece of equipment is detected, the corresponding AR-based tutorial is displayed on the screen. The mean average precision (mAP) of the developed equipment detection model is 81.4%, while the average recall of the model is 85.3%. Furthermore, to demonstrate a practical application of the proposed framework, we develop a multimeter tutorial in which virtual models are superimposed on real multimeters. The tutorial also includes images and web links to help users learn more effectively. The Unity3D game engine is used as the primary development tool to integrate the DL and AR frameworks and create immersive scenarios. The proposed framework can serve as a useful foundation for AR- and machine-learning-based frameworks for industrial and educational training.


Introduction
It is important for electrical engineers to understand how to use electrical equipment correctly. However, learning how to use equipment can be a challenge for freshman electrical engineering students, as much lab equipment is complex, with several functionalities that are difficult to understand at the freshman level [1]. Following lab or user manuals and watching video tutorials are traditional approaches to learning how to use equipment. However, they do not guarantee that students will retain all the information. With recent technological advancements, new teaching strategies that create immersive and hands-on experiences are being researched to increase students' interest and knowledge [2].
In this paper, we describe the design and development of a smartphone app that uses deep learning (DL) and augmented reality (AR) to create a learning platform for teaching students how to use electrical lab equipment. These new technologies, integrated into tools and applications used for day-to-day tasks, have made life easier not only for students but also for people in different roles. They also benefit manufacturing, gaming, education, health, farming, and a variety of other fields that require process automation [3]. Artificial intelligence (AI) is used to complete complex tasks in the same way that humans do [4]. Extended reality (XR), a concept referring to virtual worlds and human-machine interactions, was developed to supplement real-world experiences with digital content.

Overview of Deep Learning and Augmented Reality
This work explores the idea of using equipment recognition and an AR-based tutorial to enhance student learning experiences with electrical equipment in their engineering laboratories. Our long-term goal is to develop interactive smartphone apps for lab equipment such as multimeters, oscilloscopes, wave generators, and power supplies. Object detection using DL methods fits our goal because the app can detect specific electrical equipment in the lab with high precision in real-time using state-of-the-art DL algorithms. AR technology enables us to create virtual scenarios and embed 3D models, animations, images, and videos into teaching methods. In the developed app, the interactive visualization was created using the Unity3D game engine [12]. The object detection process of the app employs a deep neural network architecture trained with the TensorFlow API [13].

Deep Learning
Innovative DL techniques for performing specific tasks have emerged rapidly in recent years due to significant advances in hardware development and data availability. For big data predictive analytics and multi-modal learning, DL algorithms are quite suitable, while traditional ML algorithms face several limitations. Studies in [14,15] point out that these DL methods are constructed with hierarchical layers and use complex neural networks to improve their performances iteratively. Machines equipped with DL models can perform specialized tasks such as driving a car, identifying weeds in a field, diagnosing diseases, evaluating machinery for errors, and even recognizing objects in real-time. They are used in a wide range of computer science domains including computer vision [15], natural language processing [16], and speech recognition [17]. In this work, we are using a deep neural network to detect objects. Object detection is primarily used in computer vision and has gained popularity in a variety of applications over the last decade, including autonomous vehicles, intelligent video, and surveillance systems [18].
For object detection, a type of deep neural network called a convolutional neural network (CNN) [19] has been widely used. A CNN is a well-known DL architecture that employs backpropagation in a feed-forward neural network with a collection of neurons arranged in hierarchical layers, and it exhibits typical neural network characteristics such as multiple interconnected hidden layers [20,21]. CNNs for object detection are trained on large labeled datasets, learning features directly from the data without explicit feature engineering. In recent years, object detection methods such as the region-based convolutional neural network (R-CNN) [22], you only look once (YOLO) [23], and the single shot detector (SSD) [24] have been proposed. The rapid development of neural networks has significantly improved the accuracy and real-time performance of object detection tasks [25]. In this paper, we compared R-CNN and MobileNet-SSD v2 [26] and observed that MobileNet-SSD v2 performs better for real-time applications in terms of speed when implemented on mobile devices. It is important to note that MobileNet-SSD v2 is a lightweight deep neural network architecture designed specifically for mobile devices with high recognition accuracy. Therefore, we employed MobileNet-SSD v2 in this work.
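To illustrate the convolution operation at the heart of a CNN, the following sketch (a minimal NumPy version with a hypothetical 3×3 vertical-edge kernel, not part of the actual model) slides a filter over an image to produce a feature map:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution (no padding, stride 1): the core CNN operation."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output value is the sum of an elementwise product
            # between the kernel and the image patch under it.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge filter responds strongly where intensity changes left-to-right.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)
feature_map = conv2d(image, kernel)  # shape (2, 2); peaks at the edge
```

Real CNN layers stack many such filters, learn their weights by backpropagation, and interleave them with nonlinearities and pooling, but the sliding-window product above is the basic building block.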

Augmented Reality
The relationship between the real and virtual worlds, as mentioned in Section 1, is what distinguishes the various XR technologies. In AR, users perceive virtual objects as extensions of the real world, whereas MR users combine and interact with both the real and digital worlds, and VR users immerse themselves entirely in a virtual world [27,28]. This paper focuses on the development of AR applications that allow us to create experiences by utilizing additional digital data about our surroundings. We can receive digital information in real-time through devices such as webcams, mobile phones, and tablets. In other words, AR allows us to overlay layers of visual information on the physical world, enabling humans to interact with virtual 3D objects as well as the physical objects around us [29]. This capability has revolutionized the ways humans learn and comprehend [30]. AR development necessitates three key components: a physical object that serves as a model for the virtual object's interpretation and production; an intelligent device with access to a camera that captures an image of the targeted object; and software that interprets the signal sent by the camera [31].
There are diverse types of AR suitable for different applications despite the fact that they all have similar capabilities [32,33]. Figure 1 depicts the two primary types of AR: marker-based AR and marker-less AR.

Marker-Based AR
Marker-based AR is triggered by pre-defined markers and allows the user to choose where to place the virtual object. Barcodes and QR codes are commonly used as images or photo symbols placed on flat surfaces. The program recognizes the marker when the mobile device focuses on the target image, and the AR system then projects the virtual information onto the marker displayed on the device. There are many levels of complexity in marker-based AR [34]. For example, some apps display virtual information only while the device is focused on the marker, while others save that virtual information and allow users to view it again when the device is focused on a different section. Marker-based AR technology leverages images from the actual world or QR codes to extract points, lines, corners, textures, and other properties [35]. These images are used to superimpose and create AR experiences by referencing tracked points in the physical world.

Marker-Less AR
Marker-less AR is more versatile than marker-based AR. It interacts with the real object without the need for pre-defined markers, leaving more freedom to the user. This allows the user, for example, to position a virtual object anywhere on a real object. Users can experiment with different styles and locations digitally without having to move anything in their immediate surroundings [36]. Marker-less AR collects data from device hardware such as a camera, a GPS, a digital compass, and an accelerometer for the AR program to function. Marker-less AR applications rely on computer vision algorithms to distinguish objects, and they can function in the real world without specific markers [37,38]. The four types of marker-less AR are discussed as follows:
(a) Location-based AR: In this type of AR, simultaneous localization and mapping (SLAM) technology is used to track the user's location as the map is generated and updated on the user's mobile device [39]. To display AR content in the physical environment, the user must detect a surface with a mobile device [40,41]. As an example, the world-famous AR-based game app, Pokemon Go, uses SLAM technology that allows its users to battle, navigate, and search for 3D interactive objects based on their geographical locations [42].
(b) Superimposition-based AR: Superimposition-based AR applications can provide an additional view along with the original view of the object. Object recognition is required to determine the type of object so that an object in the user's environment can be partially or completely replaced with a digital image [43,44]. Using HoloLens glasses, for example, surgeons can superimpose images previously gathered through scanners or X-rays on the patient's body during an operation, allowing them to anticipate potential problems.
(c) Projection-based AR: Projection-based AR (also known as projection mapping or spatial augmented reality) is a technique that does not require head-mounted or hand-held devices. It allows augmented information to be viewed immediately from a natural perspective. Using projection mapping, projection-based AR turns an uneven surface into a projection screen. This method also allows for the creation of optical illusions [45].
(d) Outlining-based AR: This type of AR employs image recognition to create contours or forms and highlight components of the real world using special cameras, designating specific items with lines to make situations easier for the human eye. Vuforia's Model Target is an example of outlining-based AR. Vuforia is a platform that enables developers to quickly incorporate AR technology into their applications. Model Targets allow apps to recognize and track real-world objects based on their shape [46].
In our project, we built a superimposition-based AR app. We built user interfaces on top of lab equipment, allowing step-by-step instructions to be incorporated into the application so that users can understand and learn how to use specific equipment. AR technology creates immersive experiences in a variety of ways. It does, however, have some limitations, such as the inability to recognize multiple objects at once. On the other hand, DL models show high performance in recognizing multiple objects at the same time. Integrating AR apps with DL models helps trigger specific AR scenarios based on the object being aimed at with the camera and allows an AR scenario to perform single-object tracking without degrading mobile device performance.

Design and Implementation of the AR App
This section describes the design and development of the AR app, which integrates two independent frameworks for object detection and augmented reality, as shown in Figure 2. Unity3D combines the outputs of these systems by running inference on the object detection model with OpenCV and using an AR target dataset with the Vuforia Engine. Furthermore, Unity3D enables the development of interactive user interfaces. Users first use their mobile device to run the object detection model to detect the lab equipment. The inference classifies and localizes lab equipment targeted with the mobile camera. When an object is detected, a user interface (UI) button appears, indicating that an AR-guided tutorial is available for that object. An AR scenario is then loaded, allowing students to aim their mobile camera at a specific target. Following that, a 3D object is superimposed on top of the physical object, activating UI panels with instructions on how to use the equipment. The app development process consists of integrating a number of independent systems with their frameworks. In the interactive tutorial development framework, Unity3D was used as the primary development software for generating specific UI instructions and creating immersive interactions between the mobile app and the user. The development framework was integrated with a MobileNet-SSD DL model and a marker-less superimposition-based AR that activates immersive modules containing 2D/3D objects. The detailed framework integration is discussed below.

Object Detection Framework
The MobileNet-SSD v2 [26] architecture was used to build a deep neural network model to detect electrical lab equipment. The architecture comprises MobileNet-v2 as the backbone network, an SSD detector, and a feature pyramid network (FPN). MobileNet, as the name implies, is intended for embedded applications on mobile devices, improving accuracy while effectively considering constrained resources. A loss function, calculated from the predicted and labeled values of the classes and offsets, evaluates how well the algorithm models the data. The confidence loss (L_confidence) is incurred when attempting to predict a class and is the softmax loss over the class confidences. The localization loss (L_localization) is defined as the mismatch between the ground truth and the predicted bounding boxes [24,47]. The total loss is the weighted sum of the two, averaged over the N matched default boxes:

L = (1/N)(L_confidence + α · L_localization),

where α is the weight coefficient. MobileNet is a low-latency and low-power model that can be tailored to meet the resource constraints of various use cases. For multi-scale object detection, MobileNet-v2 provides a number of feature maps with different dimensions from the backbone network to the SSD convolutional layers, which use small convolutional filters to predict scores and class offsets for a fixed set of standard bounding boxes. MobileNet-SSD v2 extracts features from images, which are then processed through SSD predictor layers that reduce the feature map size to recognize objects at various scales [26,48], as shown in Figure 3. The MobileNet-SSD v2 detector improves on the SSD detector by combining MobileNet-v2 and FPN while maintaining memory efficiency.
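As a rough sketch of how such a combined loss could be computed, the following simplified NumPy version assumes the default boxes have already been matched to ground truth (the actual SSD loss additionally performs hard negative mining, which is omitted here):

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Confidence loss: softmax cross-entropy summed over matched boxes."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].sum()

def smooth_l1(pred_offsets, true_offsets):
    """Localization loss: smooth L1 between predicted and ground-truth offsets."""
    diff = np.abs(pred_offsets - true_offsets)
    return np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5).sum()

def ssd_loss(logits, labels, pred_offsets, true_offsets, alpha=1.0):
    """Total loss L = (1/N)(L_confidence + alpha * L_localization),
    where N is the number of matched default boxes."""
    n_matched = len(labels)
    if n_matched == 0:
        return 0.0
    return (softmax_cross_entropy(logits, labels)
            + alpha * smooth_l1(pred_offsets, true_offsets)) / n_matched
```

With confident, correct class logits and perfectly regressed offsets, the loss approaches zero; misclassified boxes or poorly localized boxes increase the two terms respectively.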

TensorFlow Object Detection API
The TensorFlow (TF) API, developed by Google Brain, is a framework for creating a DL network [13]. It is a powerful tool that can be used to create a robust object detection framework with a set of standard functions, eliminating the need for users to write code from scratch. It also provides a list of pre-trained models, which are useful not only for inference but also for building models with new data. The Model Zoo is a collection of models that have been previously trained using the common objects in context (COCO) dataset [49]. A workflow for training a DL model using the TF API is shown in Figure 4 and can be described through the following steps:
(a) Image Dataset: The model was given an input of 643 images collected from various perspectives and in different lighting settings. Each image is 4032 × 3024 pixels in size. It is necessary to annotate these images before using them to train the model. A software tool, LabelImg [50], is used in the annotation process [51]; it allows users to draw a rectangle around a specific area of the image. During training, the annotation helps the model precisely locate the object in the image. The outlining generates and saves coordinate points in an XML file.
(b) TensorFlow Dataset: To make the computation of the DL framework efficient, TF records use a binary file format. Furthermore, TF records enable the dataset to be stored as a sequence of binary strings, which improves the model's performance while using less disk space. We converted the XML files generated by LabelImg into TF binary records using a Python script. The last step in configuring the TF dataset is to create a .pbtxt file containing all of the categorical label classes, which is used together with the TF record file.
(c) Configuration File: Multiple pre-trained models based on the COCO dataset are available in TF. These models can be used to set up DL models prior to training on a new dataset. Table 1 lists several popular architectures with pre-trained models. For instance, ssd_MobileNet_v1_coco is the SSD with a MobileNet v1 configuration, ssd_inception_v2_coco represents an SSD with an Inception v2 configuration, and faster_rcnn_resnet101_coco stands for Faster R-CNN with a ResNet-101 (v1) configuration. All of these configurations have been derived for the COCO dataset. From Table 1, it can be observed that ssd_MobileNet_v1_coco reaches the fastest inference speed of 30 ms but with the lowest mean average precision (mAP). In contrast, faster_rcnn_resnet101_coco has the slowest inference speed but the highest mAP of 32. We tested both MobileNet-SSD v2 and Faster R-CNN [52] and concluded that MobileNet-SSD v2 performs faster inference on mobile devices than the Faster R-CNN model in our study. Using a pre-trained model saves time and computing resources. A configuration file, in addition to the pre-trained model, is also required; it must match the architecture of the pre-trained model. It is recommended to fine-tune the model to maximize the prediction outcome. The process of fine-tuning is divided into two steps: restoring weights and updating weights. After completing these requirements, we ran the Python code provided with the TF API to start the training job. Following training, the API generates a training checkpoint file in a specific format named .ckpt. This is a binary file containing the values of all the weights, biases, and other variables.
(d) Inference: After training the model, the last step is to put it into production and feed the model with live data to calculate the predicted output. Before testing, we can evaluate the model's accuracy using mAP. In Section 4, the evaluation result is described in detail. We also need a lightweight version of the model to perform inference, so we chose the OpenCV library.
In addition, there is a frozen trained model, a ready-to-use inference model that can generate an output from live data input; the frozen model is stored in a Protobuf (.pb) file. The Protobuf model contains the graph definition and trained parameters in a binary format. The text graph representation of the frozen model, in the human-readable format required by the OpenCV library, is kept in a .pbtxt file. After creating the corresponding files, it is time to examine and test the trained model. We use the VideoCapture function from OpenCV to test the model: it loads the input video from the PC webcam, and the model then predicts the relevant labels and object locations, with an enclosing rectangle indicating each object's pixel location within the input image. Finally, with the Protobuf file and the configuration file, we can use the Unity3D game engine and OpenCV to create our application by triggering AR scenarios based on the detection of electrical lab equipment performed by the DL model during inference.
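The post-processing of such a detection pass can be sketched as follows. The helper below works on the standard SSD output layout produced by OpenCV's dnn module, where each detection row is [image_id, class_id, confidence, x1, y1, x2, y2] with box corners normalized to [0, 1]; the file names in the commented loop are placeholders:

```python
import numpy as np

def filter_detections(detections, conf_threshold, frame_w, frame_h):
    """Keep detections above the confidence threshold and scale the
    normalized box corners to pixel coordinates.

    `detections` has shape (1, 1, N, 7), each row being
    [image_id, class_id, confidence, x1, y1, x2, y2].
    """
    results = []
    for det in detections[0, 0]:
        confidence = float(det[2])
        if confidence >= conf_threshold:
            class_id = int(det[1])
            box = (int(det[3] * frame_w), int(det[4] * frame_h),
                   int(det[5] * frame_w), int(det[6] * frame_h))
            results.append((class_id, confidence, box))
    return results

# The surrounding OpenCV loop would look roughly like:
#   net = cv2.dnn.readNetFromTensorflow("frozen_graph.pb", "graph.pbtxt")
#   cap = cv2.VideoCapture(0)
#   ret, frame = cap.read()
#   net.setInput(cv2.dnn.blobFromImage(frame, size=(300, 300), swapRB=True))
#   detections = net.forward()
#   for cls, conf, (x1, y1, x2, y2) in filter_detections(
#           detections, 0.9, frame.shape[1], frame.shape[0]):
#       cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
```

This is a sketch rather than the app's exact code, but it mirrors the described behavior: detections below the confidence threshold are discarded, and the remaining boxes are drawn as green rectangles.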

Augmented Reality Framework
Vuforia is a framework that enables the creation, recognition, and tracking of virtual and physical objects in the real world using an electronic device. To test the prototype of our tutorial, which integrates both object recognition and interactive augmented reality, we developed a tutorial on how to use a multimeter in the lab. A scene (a live video) captured by the camera is saved to the mobile device. The Vuforia SDK creates a frame (a single image within a series of photos) of the captured scene and improves the quality of the image captured by the camera so that the AR tracker component can treat it correctly. The tracker analyzes the image and searches the database for matches, which may include one or more targets. Finally, the program renders virtual material such as photographs, videos, models, and animations on the device screen, creating a hybrid image of what we perceive as holograms. The process of generating AR targets is depicted in Figure 5 and can be described in the following steps: (a) Object Scanning: The ObjectScanner app [53] is the primary tool for generating an object data file, which is required for creating an object target in the target manager on the Vuforia webpage. The app is available on the Vuforia developer website, which users can access after creating a developer account. After installing the app, the scanning environment is configured. The Vuforia developer portal provides a printable target image that defines the target position and orientation relative to the local coordinate space, which is used to scan the object and collect data points; it also helps distinguish and remove undesirable areas. This printable target image is used in conjunction with an Android application, which is available for free download from the official Vuforia website. During scanning, the printable target image must be placed under the object to be scanned. Using the Vuforia scanning mobile application, the user can then start collecting data points from the object.
To achieve the best scanning quality, it is recommended to work in a noise-free environment with moderately bright and diffuse lighting, and to avoid objects with reflective surfaces. In this work, a multimeter met all of these requirements, and a successful scan was achieved. (b) Data File: Following the scanning, an object data file is created. The mobile app also shows how many scanning points the object has. The completed scanning area is indicated by a square grid that changes color from gray to green. The object data file contains all of the object's information. A test scenario determines whether the scanned object has sufficient data for augmentation; in this scenario, a green rectangular prism is drawn on one of the object's corners relative to the target image coordinate space. (c) Target Management System: Vuforia has a framework that allows developers to choose from various target types, such as image targets, stimuli targets, cylinder targets, object targets, and VuMarks. The system processes and manages the data for visual examination. A developer license is required to fully utilize the Vuforia manager web-based tool, which includes access to a target manager panel where a database can be uploaded and targets can be added to the management system. The 3D object option must be selected when choosing the target type, and the object data file must be uploaded. (d) Export Target Dataset: After the web tool processes the information, the database can be downloaded for the desired platform and converted into a package that can be used in the primary development process to create AR experiences in Unity.

Lab Training Application Framework
With the help of AR and DL, the equipment learning application focuses on teaching and improving students' learning experience of how to properly use electrical equipment. Unity3D provides libraries that allow these technologies to be combined on top of assets, animations, and 3D models to create training scenarios that engage students in learning through experience. The development procedure is shown in Figure 6 and can be described in the following steps: (a) Setup Environment: The setup starts with the creation of a new project using the Unity Hub. After creating and opening the project, it is essential to switch to the target build platform, because Unity allows us to create once and deploy anywhere. In other words, we can select a platform from the list of available platforms in Unity, such as Windows, WebGL, Android, iOS, or any gaming console. We chose Android as the deployment platform for this project. The platform can be changed in the Build Settings window, which can be accessed via the File menu. Additionally, the rendering, scripting, and project configuration must be adjusted. (b) Libraries and Assets: (1) OpenCV Library: OpenCV For Unity [54] is an asset that uses AI algorithms to analyze and interpret images on computers or mobile devices. This Unity Asset Store product allows users to test pre-trained AI models and run algorithms in executable applications on mobile devices. The asset employs a script that requires a binary file of a DL model with trained weights (the weights of the deep neural network are not modified at this stage) and a model network configuration file. This script is granted access to the device resources, specifically the camera, so that it can pass input to the model and start object detection inference, which generates bounding boxes and labels around detected objects. (2) Vuforia Engine: This library allows Unity to create AR experiences for mobile devices.
It is a collection of scripts and pre-made components for developing Vuforia apps in Unity. It includes API libraries written in C# that expose the Vuforia APIs. This library supports all trackable functions as well as high-level access to device hardware such as the device camera.
(3) Assets: These are graphical representations of any items that could be used in the project. They comprise user interface sprites, 3D models, images, materials, and sounds, each with its own design and functionality. Photoshop is used to create art sprites, such as images for the virtual manual, while Blender, a 3D modeling software, is used to create the 3D models.
(c) Scenario Creation: (1) Main menu: The application includes a menu scenario, as shown in Figure 7, that allows the user to select various modes based on their preferences. It includes a tutorial that teaches students how to use the application, and a training mode to help students learn more about lab equipment or electrical components. (2) Object detection: In this case, the DL model is used in conjunction with the OpenCV library in Unity. The application has access to the device's camera, from which it runs inference on the object detection model provided by the object detection framework. Furthermore, depending on the object being targeted, the application automatically generates bounding boxes around the desired object with its respective label and confidence. When the user points at the desired equipment, a bottom panel appears with the option to load the AR experience or continue looking for other lab equipment. The OpenCV library allows us to specify the desired confidence threshold during model inference. The model draws a green rectangle around the detected equipment. The detection confidence threshold is set to 90%, meaning that the confidence must be greater than or equal to 90% for a detection to be indicated with a rectangular bounding box. This percentage was chosen because the different types of lab equipment are visually quite distinct, and a score of 90% ensures that detected lab equipment has a high confidence level.
(3) Learning scenarios: A 3D visual aid is provided in this scenario to help the user understand the essential functions of the equipment selected during the detection scenario. Figure 8 shows how users can access an interactive 3D model representation of the equipment and view it from various perspectives. In other words, it is a virtual introductory manual. (4) AR training scenario: When the application detects an object that has previously been configured in Unity, a 3D model is superimposed on top of the physical object in the mobile app. It also includes a UI for the user to interact with, allowing them to understand and explore the physical object while the mobile application provides additional information in the form of holograms, as shown in Figure 9.
(d) Deployment: The final step of the framework is to build the project for the desired end platform, which can be Android or iOS. The scenarios in Unity must be selected and linked together. Unity then generates a platform-specific file that can be installed directly on mobile devices.

Experimental Results
In this section, the performance of the DL and AR frameworks is discussed.

Object Detection
The dataset used in this study is a collection of 643 annotated images divided into four classes: multimeter, oscilloscope, power supply, and wave generator. The collected samples were randomly split into training and test sets in a 70%/30% ratio. We employ commonly used evaluation metrics such as precision, recall, and mAP to evaluate the model's performance in the application.
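The random split could be reproduced with a short script along these lines (a sketch; the file names and seed are placeholders, not the actual study's script):

```python
import random

def split_dataset(filenames, train_fraction=0.7, seed=42):
    """Shuffle the annotated images and split them into train/test sets."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = filenames[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

images = [f"img_{i:04d}.jpg" for i in range(643)]  # 643 annotated images
train_set, test_set = split_dataset(images)        # 450 train, 193 test
```

With 643 images, a 70/30 split yields 450 training and 193 test images, matching the test-set size reported below.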

Evaluation Metrics
• Intersection over Union (IoU): Also known as the Jaccard index, it measures the similarity and diversity of sample sets. In an object detection task, it describes the overlap between the predicted bounding box and the ground truth bounding box. Equation (4) expresses IoU in terms of the areas of the prediction and ground truth bounding boxes:

IoU = Area of Overlap / Area of Union. (4)

It is important to define a threshold on IoU that determines whether a prediction counts as correct.
• mean Average Precision (mAP): It takes into consideration both precision and recall and corresponds to the area beneath the precision-recall curve, averaged over classes. The mAP can be computed by

mAP = (1/n) Σ_{k=1}^{n} AP_k,

where AP_k is the average precision of class k, and n is the number of classes.
Figure 10 shows our model during inference when the new dataset was fed to the model. It demonstrates the correct detection of four types of lab equipment in a single shot when the confidence threshold value (i.e., the threshold on the confidence score that determines whether a detection is an object of interest; predicted bounding boxes with confidence scores above the threshold are considered positive boxes, and vice versa) is greater than or equal to 90%. The experimental results shown in Table 2 support that our model has a high mAP. In practice, the DL model can recognize all of the pre-selected electrical lab equipment. Table 2 shows the average precision and average recall for a given IoU score and the mAP, where the IoU ranges between 0.50 and 0.95. Using 193 testing images, our proposed model achieved a mAP of 81.4% and an average recall of 85.3%. Some failure cases were due to low ambient lighting and a lack of training data with varying lighting conditions. The DL model deployed on a mobile device uses the device's CPU resources to infer and predict objects. We ran the DL model on two different mobile devices to evaluate their real-time prediction performance, measured in frames per second (FPS), i.e., the number of images that the mobile device screen displays every second.
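For axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates, the IoU of Equation (4) and the per-class averaging behind mAP can be sketched as:

```python
def iou(box_a, box_b):
    """Intersection over Union for two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def mean_average_precision(ap_per_class):
    """mAP: the mean of the per-class average precisions AP_k."""
    return sum(ap_per_class) / len(ap_per_class)
```

For example, two unit-area boxes offset so that they share a quarter of each box overlap with IoU = 1/7; a prediction is then counted as correct only if its IoU with the ground truth exceeds the chosen threshold (0.50-0.95 in Table 2).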
According to Table 3, the Samsung device achieves 8.5 FPS and the OnePlus device 5.5 FPS, indicating that dedicated hardware acceleration on the device would be needed to improve inference performance.
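FPS figures of this kind can be estimated with a simple timing loop of the following shape (a sketch; `process_frame` stands in for the model inference call on each camera frame):

```python
import time

def measure_fps(process_frame, frames):
    """Estimate frames per second by timing how long the processing loop takes."""
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")

# Example with a dummy per-frame workload standing in for model inference.
fps = measure_fps(lambda frame: sum(frame), [[1, 2, 3]] * 100)
```

Averaging over many frames, as above, smooths out per-frame timing jitter.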

Augmented Reality
Detecting a multimeter in real-time is a good way to test the accuracy and precision of AR-based object detection. Figure 11 shows two mobile devices under three different luminous intensities: 25 lux, 150 lux, and 350 lux. In addition, we varied the distance between the mobile camera and the multimeter in our evaluation to understand how well our scanning process collected data points from the multimeter. This evaluation enables us to determine the optimal room lighting configuration for good AR-based object detection. During the evaluation, we included a toggle button in the test AR scenario to indicate whether the application was still detecting the multimeter. Table 4 shows the results of the AR-based detection experiments. In our preliminary results, we discovered that the camera was not tracking the multimeter due to a lack of focus. We therefore included a script in our Unity project that allowed us to control the mobile camera's focus. The evaluation table includes a focus parameter that helps us decide whether to include this feature in the AR experience. We chose 50 cm and 100 cm for our evaluation because these are typical distances between the lab equipment and the students. The final column contains the result in True/False format, indicating whether the multimeter was detected. We conclude that Vuforia can detect objects even in low-light conditions; however, distance has an impact on the detection results. According to our evaluation, the focus parameter increases the likelihood of detecting the multimeter under different light intensities, but this also depends on the camera resolution.

Conclusions and Future Work
In this study, we developed an interactive multimeter tutorial using deep learning and augmented reality. We integrated a deep learning model, namely MobileNet-SSD v2, and an AR target database into a game engine to detect objects automatically. Unity3D was used to create the augmented tutorial, which builds on a mobile game infrastructure. The tutorial functions as a virtual manual for the equipment, providing an immersive experience by projecting holograms on objects recognized by the app via a mobile camera. In the future, we will create tutorials for additional lab equipment. One application will be the addition of a 3D interactive breadboard in the app to help students understand electrical circuits. Another potential enhancement of the proposed AR- and AI-based education tool would be to support remote learning, in which students can learn to use lab equipment through AR streaming on their mobile devices or personal computers.