Development of a Drawing Application for Communication Support during Endoscopic Surgery and its Evaluation Using Steering Law

Abstract: The purpose of this study is to construct a hands-free endoscopic surgical communication support system that uses AR technology to draw lines in space corresponding to head movements, and to evaluate whether drawing by head movement follows the steering law, one of the HCI models, with a view to its potential use during endoscopic surgery. In the experiment, the participants steered a cursor through a pathway using head movements; movement time (MT), the number of errors, and a subjective evaluation of the difficulty of the task were obtained. The results showed that the head-movement-based line drawing manipulation was significantly affected by the tracking direction and by the task difficulty, expressed as the Index of Difficulty (ID). There was high linearity between ID and MT, with a coefficient of determination R² of 0.9991. The Index of Performance was higher in the horizontal and vertical directions than in the diagonal directions. Although issues with the weight and biocompatibility of the AR glasses must be overcome to make the current prototype a viable tool for supporting communication in the operating room environment, the prototype has the potential to promote the development of a computer-supported collaborative work environment for endoscopic surgery.


Introduction
With the recent evolution of Augmented Reality (AR) glasses, Computer-Supported Collaborative Work (CSCW) environments that use AR to support communication between workers have seen expanded use in the fields of industry [1] and education [2]. Billinghurst et al. [3] argue that, compared to fully immersive VR environments, applying AR environments to CSCW interfaces has the following advantages: (1) users of the system can see each other's facial expressions, gestures, and body language while working, which provides a natural channel and enriches communication between users; (2) users can manipulate the virtual image using familiar real-world tools, which makes the interface more intuitive; (3) users can view the virtual object while simultaneously referring to the actual object; and (4) since there is no need to model the entire environment, the information and communication environment can be established quickly.
The motivation of our research is to introduce an AR-based CSCW environment to support communication between surgeons during endoscopic surgery in the operating room. Endoscopic surgery is a safe and minimally invasive surgical procedure that has been used successfully in many cases, and it has developed rapidly as an alternative to conventional open surgery [4]. Research has focused on its variety of tools and techniques [5]. In general, endoscopic surgery is performed by two surgeons who communicate with each other about the surgical site and the procedure while looking at a monitor that displays the images taken by the endoscope. However, the surgeons usually hold the endoscopic devices, which reach the body's interior tissues through access ports, and proceed with the operation while maintaining the surgical field; they cannot take their hands off the devices unless there is a severe problem. Keeping hold of the devices is vital for safe surgery because it maximizes continuous tactile perception, which can only be obtained indirectly through gloves and devices [6]. Additionally, time is lost if a hand is released, because the surgical field collapses. In such a situation, both parties communicate cautiously about the state of the surgical field seen on the monitor and proceed with the operation. When the detailed strategy of the operation needs to be communicated, it is impossible to point to a specific location in the image of the inside of the body shown on the monitor, because both hands are occupied by the devices. This problem hinders accurate and effective communication between surgeons and disrupts the surgical workflow, which increases the duration and cost of surgery [7][8][9] and raises the potential risk of medical accidents.
The development of a hands-free communication device between operators in such an environment has been attempted in the past [10][11][12][13][14][15][16][17][18]. Trejos et al. [14] developed a communication support device called WHaSP (Wireless hands-free surgical pointer), which is equipped with a 6-axis acceleration sensor and Bluetooth device in the headset. The device has remarkably improved usability. Using this device, they conducted communication tests in an operating room environment with more than 100 cases. The results showed that the developed device significantly reduced the chances of the supervisor taking his hands off the surgical tools and pointing at the screen and significantly reduced the cases of abandoning the sterile barrier, compared to the surgical operation without the device.
Ease of use is a key element in the functionality of a hands-free communication device. Gurevich et al. [12] stated that in collaborative tasks using AR systems, it is more important to reduce the cognitive load among operators by providing a simple annotation function, i.e., a function that shows a trend with less precise lines, than to support precise drawing of information by head movements. Therefore, as a design consideration, the usefulness of our system should be judged by how well annotation reduces cognitive load rather than by the ability to draw precise lines on the endoscope image.
However, according to Trejos et al. [14], it was questionable whether pointing operation by head movement could improve communication between operators. They found no significant difference in total operative time, the total number of instructions, and total time during which instructions were given with and without WHaSP, suggesting the need to consider improvement from multiple perspectives.
To address these communication issues, the study by Sakata et al. [19] suggested that the use of pointing as the primary operation should be improved. They conducted experiments on the selection and assembly of Lego bricks by remote collaborative work using two methods: a laser pointer visual assistance system (WACL) and a head-mounted display (HMD) with a line-drawing function. They concluded that the laser pointer instruction using WACL was sufficient to identify the work object and location. However, in situations where detailed explanations were required, the HMD with line-drawing input reduced the number of words and the time the user spent giving instructions compared to the WACL, which argues for the effectiveness of line-drawing as a means of communication.
This research was motivated by the desire to develop a prototype of a communication support system for endoscopic surgery with a line-drawing function operated by head movement. We adopted a Microsoft HoloLens, head-mounted Augmented Reality (AR) glasses that can be operated hands-free [2]. This device can detect the frontal direction based on the accelerometer and gyroscope sensors' information and projects a cursor on the surface of a real object in that direction. In our developed system, the cursor can be displayed on the surface of the screen where the endoscope image is projected in the operating room. The AR glasses wearer can then manipulate the cursor by moving the position of his or her head. The cursor plays the same role as the mouse cursor in computer operation, except that head movements manipulate the cursor. The system also allows a voice input to change line color and line thickness as well as start/stop commands for their convenience. In this way, the surgeon can give the other surgeon detailed positioning instructions for the affected area on the screen by moving the cursor with head movements.
In order to determine the effectiveness of this prototype in an endoscopic surgery environment, it is necessary to check whether performance in the proposed environment can be described by an existing HCI model of drawing operations. If the applicability of such a model can be validated, this study can contribute to the design of CSCW environments for endoscopic surgery. Jagacinski et al. [20] reported on a study that used Fitts' law to evaluate the performance of pointing operations using head movements. They evaluated the performance of two-dimensional discrete pointing operations by head movements using a sighting device attached to a helmet. As a result, the performance of pointing in the 45°, 135°, 225°, and 315° directions was lower than that in the horizontal and vertical directions. These results indicate that the direction of head movement affects the performance of pointing operations.
On the other hand, as far as we know, no studies have evaluated the performance of drawing operations using head movements, and their characteristics have not been clarified. Since the operation of interest is drawing rather than pointing, we used the steering law [21], which is a derivative of Fitts' law and is used to evaluate the operability of drawing. By investigating the applicability of the steering law to drawing operations by head movement, we can verify whether the performance of such operations is affected by the direction of the operation. Several input devices have been evaluated using the steering law, and their Indices of Performance (IP) are available in previous studies [21,22], so they can be compared with the performance of head movements obtained in this study. Research exists that applies the steering law to input devices such as mice and tablets, as well as to manipulation methods for VR navigation [21,23], and it has been shown that the steering law is effective as a performance evaluation index [20].
The objectives of this study are threefold: (1) to verify whether drawing motion using head movement can be modeled by the steering law; (2) to compare the performance of head movement input with that of other input devices; and (3) to quantify the IP for diagonal drawings generated by head motion compared to horizontal and vertical drawings.

ARHD: AR Head-Drawing System for Endoscopic Surgical Communication Support
Our project was designed to develop an AR application, ARHD, whose primary function was to draw lines in space by head motion. The ARHD was built as a Microsoft HoloLens application using three tools: Unity, a real-time 3D development platform; Visual Studio, a comprehensive development environment; and the Mixed Reality Toolkit, a toolkit provided by Microsoft that can be imported into Unity to create HoloLens applications. The relationship between these three elements can be seen in the schematic diagram shown in Figure 1. In constructing the ARHD, we created a C# script called "Line Manager," which was responsible for the function of drawing lines in space by head motion. Line Manager draws lines following a cursor that exists in the AR environment, and the function of displaying the cursor in the AR environment was created using scripts such as "Gaze Manager" and "Cursor" in the Mixed Reality Toolkit. Figure 2 shows an actual ARHD image of a surgeon conveying instructions to the other surgeon by head movement over an endoscopic image. The line-drawn information is shared among the surgeons wearing the AR glasses.

Methods
In this paper, we investigate the applicability of the steering law to head movement drawing operations, while evaluating performance in each direction of operation by performing a path-following task.

Participants
Nine participants between the ages of 21 and 25 were recruited. They were either undergraduate or graduate students at the university to which the authors belong. All participants had normal or corrected to normal vision.

Experimental Setup and Procedure
The experimental environment is shown in Figure 3. In a quiet laboratory space, participants wearing AR glasses stood 1.5 m away from the wall. This distance was determined based on the typical distance between the surgeon and the display in the operating room at the authors' institution. On the wall, a 60 cm by 60 cm display area was set to show test tracks. The center of the display area was 165 cm from the floor. The test track consisted of two squares with different colors and a white pathway connecting the two squares. A blue square represented the starting area, and a yellow square represented the goal area. Independent of the test track, a cursor controlled by the head motion appeared on the display area.
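As a rough illustration of the head movement this geometry implies, the following Python sketch converts a length on the wall into the angle it subtends at the 1.5 m viewing distance. It assumes, beyond what the setup states, that the object is centred on the participant's line of sight on a flat wall and that the head rotates about a fixed point; the function name `subtended_angle_deg` is hypothetical.

```python
import math

VIEW_DISTANCE_CM = 150.0  # participant-to-wall distance stated in the setup (1.5 m)


def subtended_angle_deg(size_cm: float, distance_cm: float = VIEW_DISTANCE_CM) -> float:
    """Angle (degrees) subtended by a length on the wall, seen from distance_cm.

    Assumption (not from the paper): the length is centred on the line of
    sight and lies on a flat wall perpendicular to it.
    """
    return 2.0 * math.degrees(math.atan((size_cm / 2.0) / distance_cm))


if __name__ == "__main__":
    # 20 cm and 50 cm are the track lengths, 2 cm and 4 cm the track widths,
    # and 60 cm the side of the display area used in the experiment.
    for label, size in (("track 20 cm", 20.0), ("track 50 cm", 50.0),
                        ("width 2 cm", 2.0), ("width 4 cm", 4.0),
                        ("display 60 cm", 60.0)):
        print(f"{label}: {subtended_angle_deg(size):.2f} deg")
```

Under these assumptions, the longest track requires a head sweep of a little under 19 degrees, while the narrowest pathway leaves a margin of less than one degree, which gives a feel for the precision the task demands.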
The participants manipulated the cursor by using head movements through the white pathway from the start area to the goal area. The measurement was started when the cursor left the start area and advanced to the white pathway. The time to complete steering through the pathway without being out of bounds (MT), the number of times the cursor deviated from the pathway (the number of errors), and the cursor's movement speed were sampled at 60 Hz. The measurement was terminated when the cursor reached the goal area or went out of the pathway before reaching the goal area. When the cursor reached the goal area, the displayed test track disappeared and the next randomly selected test track was displayed for the next trial. If the cursor went out of the track, an error was recorded, and the measurement was reset. The color of the cursor was changed according to the status for the purpose of giving visual feedback to participants.
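The trial logic described above can be sketched as follows. This is a minimal illustration, not the ARHD implementation: it assumes cursor positions are given in centimetres relative to the point where the pathway begins, that the first sample is taken at the moment the cursor enters the pathway, and that only lateral deviation beyond half the track width counts as an error. The function name `classify_trial` and the result labels are hypothetical.

```python
import math
from typing import Iterable, Tuple

SAMPLE_RATE_HZ = 60.0  # sampling rate stated in the procedure


def classify_trial(samples: Iterable[Tuple[float, float]],
                   length_cm: float, width_cm: float,
                   theta_deg: float) -> Tuple[str, float]:
    """Classify one steering trial along a straight track.

    Returns ("goal", MT in s), ("error", elapsed s) or ("incomplete", elapsed s).
    Assumed coordinate convention: origin at the start of the pathway,
    theta measured counter-clockwise from the rightward horizontal.
    """
    theta = math.radians(theta_deg)
    ux, uy = math.cos(theta), math.sin(theta)  # unit vector along the track
    count = 0
    for i, (x, y) in enumerate(samples):
        count = i + 1
        along = x * ux + y * uy        # progress along the track
        across = -x * uy + y * ux      # signed offset from the centre line
        elapsed = i / SAMPLE_RATE_HZ
        if abs(across) > width_cm / 2.0:
            return ("error", elapsed)  # cursor left the pathway
        if along >= length_cm:
            return ("goal", elapsed)   # cursor reached the goal area
    return ("incomplete", count / SAMPLE_RATE_HZ)


if __name__ == "__main__":
    # Hypothetical cursor traces: 1 cm per sample along a 20 cm track.
    straight = [(float(i), 0.0) for i in range(21)]
    drifting = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.5)]
    print(classify_trial(straight, 20.0, 2.0, 0.0))
    print(classify_trial(drifting, 20.0, 2.0, 0.0))
```

Projecting each sample onto the track direction keeps the same check valid for all 24 directions, since only `theta_deg` changes between tracks.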
After completing the informed consent process, the participants were asked to practice wearing and operating the AR glasses to get used to them. The experimenter explained the range of motion of the AR glasses' headband and asked the participants to put on the AR glasses by themselves so that their nose and head would not hurt and they could see the AR screen in front of them. Next, as a preliminary trial, we had the participants practice steering through nine different test tracks with the cursor to get used to the task operation. Once the practice was completed, we had the participants run the main trial. When the participants completed 48 tasks, half of the 96 test tracks, they were asked to remove their AR glasses and take a break.
After the participants completed all 96 tasks, they took off their AR glasses and filled out the questionnaire. The flow from the start of the first practice trial to the end of filling out the questionnaire was taken as one set, and three sets were conducted in this experiment.
The questionnaire regarding the difficulty of the tracking task was collected for each of the 24 directions of operation. Each item was rated on a seven-point scale. The larger the number, the more difficult the task was perceived to be.

Data Collection
In this study, we aim to verify whether the steering law can be applied to any tracking direction by head movement. The steering law models the task of steering a cursor along a path of a given width so that it does not leave the path (steering task). For a given path C, let ds be the length of a small section and W(s) the width of the pathway, where s denotes the curvilinear abscissa along the pathway. That is, W(s) is the spatial margin at position s in the pathway. When the cursor is moved along the pathway, MT and the Index of Difficulty (ID) can be expressed by Equations (1) and (2):

$$MT = a + b \times ID \tag{1}$$

$$ID = \int_{C} \frac{ds}{W(s)} \tag{2}$$

where a and b are constants that depend on the accuracy of the track which the surgeon intends to specify, the device used in the task, and the individual who performs the task. Equation (1) shows that MT is a linear function of ID, and a regression line can be obtained from the plot of MT against ID when a steering task is performed. The reciprocal of the slope of this regression line is known as the Index of Performance (IP) for the device that performed the steering task [21,23] and is shown in Equation (3):

$$IP = \frac{1}{b} \tag{3}$$
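To illustrate how IP is obtained from measured data, the sketch below fits the linear form MT = a + b × ID by ordinary least squares and reports the coefficient of determination together with IP = 1/b. The function name `fit_steering_law` is hypothetical, and the MT values are made-up numbers that lie exactly on a line with a = 0.5 s and b = 0.2 s so that the result can be checked by hand; they are not data from this study.

```python
from typing import Sequence, Tuple


def fit_steering_law(ids: Sequence[float],
                     mts: Sequence[float]) -> Tuple[float, float, float, float]:
    """Least-squares fit of MT = a + b * ID.

    Returns (a, b, r_squared, ip), where ip = 1 / b is the Index of Performance.
    """
    n = len(ids)
    mean_id = sum(ids) / n
    mean_mt = sum(mts) / n
    sxx = sum((x - mean_id) ** 2 for x in ids)
    sxy = sum((x - mean_id) * (y - mean_mt) for x, y in zip(ids, mts))
    b = sxy / sxx                      # slope of the regression line
    a = mean_mt - b * mean_id          # intercept
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(ids, mts))
    ss_tot = sum((y - mean_mt) ** 2 for y in mts)
    r_squared = 1.0 - ss_res / ss_tot  # coefficient of determination
    return a, b, r_squared, 1.0 / b


if __name__ == "__main__":
    # Hypothetical, noise-free example (not measured data).
    example_ids = [5.0, 10.0, 12.5, 25.0]
    example_mts = [1.5, 2.5, 3.0, 5.5]   # seconds
    a, b, r2, ip = fit_steering_law(example_ids, example_mts)
    print(f"a = {a:.3f} s, b = {b:.3f} s, R^2 = {r2:.4f}, IP = {ip:.2f} 1/s")
```

In an analysis like the one reported here, such a fit would be repeated once per tracking direction, giving one R² value and one IP value per direction.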
In order to examine whether the path-following performance by head movement can be modeled by the steering law, two independent variables were included: ID and tracking direction (θ). IDs were determined by the track's total length (A) and width (W). A had two levels (20 cm, 50 cm), and W had two levels (2 cm, 4 cm). Since the track used in the experiment was a straight line of constant width, Equation (2) can be simplified as Equation (4):

$$ID = \frac{A}{W} \tag{4}$$

θ had 24 levels from 0° to 345° in 15° increments. Thus, data were acquired for 2592 trials in total (nine participants × three sets × four IDs × 24 directions). Each participant performed a total of 288 trials (three sets of 96) in randomized order.
Dependent variables were the elapsed time for the cursor to move from the start to the goal (MT [s]) and the number of times the cursor went out of bounds during the trials (number of errors). MT data were log-transformed to approximate normality.
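The factorial design can be enumerated in a few lines. The sketch below computes the ID of each straight track as its length divided by its width, and counts the tracks per set and the overall number of trials from the levels given above; the variable names are illustrative.

```python
from itertools import product

LENGTHS_CM = (20.0, 50.0)                   # track lengths A
WIDTHS_CM = (2.0, 4.0)                      # track widths W
DIRECTIONS_DEG = tuple(range(0, 360, 15))   # 0, 15, ..., 345 degrees
N_PARTICIPANTS = 9
N_SETS = 3

# For a straight track of constant width, ID is length divided by width.
id_levels = sorted({a / w for a, w in product(LENGTHS_CM, WIDTHS_CM)})

tracks_per_set = len(LENGTHS_CM) * len(WIDTHS_CM) * len(DIRECTIONS_DEG)
total_trials = tracks_per_set * N_SETS * N_PARTICIPANTS

print(id_levels)        # [5.0, 10.0, 12.5, 25.0]
print(tracks_per_set)   # 96
print(total_trials)     # 2592
```

The four length and width combinations all give distinct ID values, which is why the design has four ID levels rather than fewer.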
IPs, the number of errors, and the scores obtained from the questionnaire were evaluated for each tracking direction. Error rates for each tracking direction were calculated by the number of errors divided by the number of all trials, including invalid trials. The linear regression model was applied to represent the relationship between ID and MT for each tracking direction to evaluate whether the model could be applied regardless of the tracking direction. A repeated-measures ANOVA was performed for MT and error rates. All statistical analyses were performed using the software package R-4.0.3.

Movement Time and the Applicability of Steering Law by the Tracking Direction
The Shapiro-Wilk normality test was conducted on the log-transformed MT data, and normality was not rejected (W = 0.96, p = 0.1481). Hence, a two-way repeated-measures analysis of variance was performed, with the two independent variables θ and ID as the main effects. The results showed that the head-movement-based line drawing manipulation was significantly affected by the tracking direction θ (F(23, 184) = 14.87, p < 0.001, partial η² = 0.158) and by the task difficulty ID (F(3, 24) = 409.2, p < 0.001, partial η² = 0.704). The interaction between θ and ID was not significant (F(69, 2280) = 0.75, p = 0.938). Figure 4 shows the relationship between ID and MT at θ = 0°. There was high linearity between the two, with a coefficient of determination R² of 0.9991. Figure 5 summarizes the MT for all 24 tracking directions evaluated in the experiment. Figure 6 summarizes the coefficient of determination R² of the regression line for each tracking direction. Figure 6 shows that R² > 0.97 in all directions, which indicates high linearity and compatibility with the steering model, and therefore that the steering law is applicable to drawing operations by head movement regardless of tracking direction. Figure 7 summarizes the IPs for the 24 tracking directions. According to Figure 7, the IP is higher in the horizontal and vertical directions than in the diagonal directions. The minimum value of IP (2.31 s⁻¹ at θ = 135°) was 46.7% of the maximum value (4.94 s⁻¹ at θ = 0°).
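To make the practical meaning of these IP values concrete: under the linear model MT = a + b × ID with IP = 1/b, raising the ID by ΔID adds ΔID / IP to the predicted MT, independent of the intercept a. The following minimal sketch applies this to the two reported IP values for the largest ID step available in this design (from 5 to 25, i.e., ΔID = 20). The function name is hypothetical, and the outputs are model predictions rather than measured times.

```python
def added_time_s(delta_id: float, ip_per_s: float) -> float:
    """Predicted increase in MT (s) for an increase delta_id in ID, given IP in 1/s."""
    return delta_id / ip_per_s


IP_MAX = 4.94   # reported IP at theta = 0 degrees
IP_MIN = 2.31   # reported IP at theta = 135 degrees
DELTA_ID = 20.0  # from ID = 5 to ID = 25 (assumed illustration step)

print(f"theta = 0 deg:   +{added_time_s(DELTA_ID, IP_MAX):.2f} s")
print(f"theta = 135 deg: +{added_time_s(DELTA_ID, IP_MIN):.2f} s")
```

Under these assumptions, the same increase in difficulty costs roughly twice as much additional time in the 135° direction as in the 0° direction.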

Error Rates and Subjective Evaluation
The Kruskal-Wallis rank sum test was conducted for error rates, and it revealed that ID had a significant effect on the error rate (χ² = 77.679, p < 0.001), but the tracking direction did not (χ² = 9.3644, p > 0.5). The scores obtained from the questionnaire were also evaluated using the Kruskal-Wallis rank sum test, and the tracking direction had a significant effect on the score (χ² = 267.61, p < 0.001). Figure 8 shows the relationship between tracking direction and the average scores obtained from the questionnaire. It is clear from this figure that the participants perceived the oblique directions of 45 degrees, 135 degrees, 225 degrees, and 315 degrees to be more difficult to operate. In the 135-degree condition, which participants perceived as the most difficult, the mean score was 4.96 points, which was 4.2 times the score (1.18 points) for the 270-degree condition.
Figure 8. Relationship between tracking direction and the average scores obtained from the questionnaire. The higher the score, the more difficult the participants found the operation.

Discussion
The experimental results showed that both main effects, the direction of operation and ID, significantly affected MT. The steering tasks in all 24 directions evaluated under the experimental conditions could be modeled by the steering law with a high coefficient of determination. Jagacinski et al. [20] evaluated the effect of direction on the pointing task by head motion. They reported that the relationship between ID and MT was represented by a Fitts model with a coefficient of determination of 0.86 to 0.98. Our results showed that the steering model applies with a similarly high coefficient of determination. Studies that evaluated manual steering tasks along straight and circular paths also reported a high coefficient of determination for the relationship between ID and MT (R² = 0.986 to 0.999) [21]. As for the magnitude of MT, Jagacinski et al. [20] showed that pointing tasks in the diagonal directions tended to take slightly longer than those in the horizontal and vertical directions. We read from the data in their paper that the 315° direction had the highest MT, about 1.15 times that of the 90° direction, which had the lowest MT. This ratio was higher in our study, where a difference of about 1.94 times was found between horizontal-vertical and diagonal steering tasks. Pointing and steering operations thus showed a consistent trend: diagonal movements tended to take more time. The difference in the ratios can be attributed to the intrinsic difference between the pointing task, in which the cursor may take any path to the target, and the steering task, in which the cursor must continuously follow the path without straying from it.
To evaluate the usefulness of ARHD as an endoscopic surgical communication support system, we compare the results obtained in the present study with the performance of cursor manipulation using existing input devices, as evaluated with the steering law in previous studies [21,22]. Accot et al. [21] evaluated the performance of a steering task using five input devices (mouse, tablet, trackball, touchpad, and TrackPoint) with a trajectory in which the cursor is moved straight to the right horizontally (corresponding to θ = 0° in our experiment). Kobayashi et al. [22] evaluated the performance of a steering task with a laser pointer in the same way as [21]. In the present study, the IP of the drawing operation by head movement in the direction of θ = 0° was 4.941 s⁻¹, which is lower than the results for a mouse (14.3), tablet (14.4), trackball (5.30), touchpad (6.70), TrackPoint (8.70), and laser pointer (9.80). All of the interfaces used in this comparison are hand-operated, indicating that head manipulation yields lower steering performance than hand manipulation. This result is consistent with the trend reported by Douglas and Mithal [23], who found that IP was lower in head manipulation than in hand manipulation, and that IP was lower in large-limb than in small-limb pointing manipulation. Beyond hand-operated interfaces, Monteiro et al. [24] used a VR navigation interface called VirtuSphere to apply a steering model to the operation of moving in a VR environment. They reported that the IP was 1.585 s⁻¹ for a linear path, which is lower than that of a manual gamepad operation (4.546). In their system, inertia affected starting, stopping, and changing direction during the tracking task, making the system difficult to operate. The only device used in the present study was the AR glasses, which may have had relatively little effect on the operation compared to [24].
From the results of the IP and subjective evaluation questionnaires, it was found that diagonal manipulation was more difficult than horizontal and vertical manipulation. This result is consistent with the results reported by Jagacinski et al. [20], who considered two factors: perceptual and kinematic. As a perceptual factor, they focused on visual acuity and speculated that such results would be obtained if visual acuity was superior along the horizontal and vertical axes. As for the kinematic factor, they speculated that the result was due to the complexity of the movements required to move in the oblique directions. The neck joint movements can be divided into three categories: cervical flexion/extension, cervical rotation, and cervical side-bending. Head movement in the horizontal and vertical directions can be achieved by any one of these three movements. However, movement in the oblique direction may involve not only a combination of the three but also movement of the lower body, i.e., the waist and foot joints, which may have increased the overall difficulty of the operation.
Although the operability of ARHD by head movement was not as good as that of hand operation as an endoscopic surgical communication support system, the fact that information can be presented directly by the head is still helpful for practical use because AR can be projected onto the real environment, unlike the VR environment. Additionally, since the steering model can be applied, it became clear that improvements using this model will be effective in designing the space in the operating room for using the system in the future, such as the display size of endoscope images and the distance between the surgeon and the display. Once clear performance differences between different angles become apparent, it would be helpful to improve the system so that the operating room can display several virtual monitors adjusted to appropriate angles instead of using a fixed endoscopic surgical image monitor. Al Janabi et al. [25] have developed a system that uses AR to display the monitor as a hologram instead of a conventional monitor, allowing the surgeon to view the endoscopic image in real-time and at any position. In this way, the introduction of a technology to project the endoscope image itself as a hologram at an arbitrary position in the operating room space can be a factor that increases the usefulness of this system.
There are various endoscopic surgery techniques, and the technique targeted in this study assumes laparoscopic gastrectomy using five trocars (two for each surgeon and one for the camera) in the field of digestive surgery. However, according to Kim et al. [26], thanks to recent advanced and improved technologies, there have been reports of cases in which laparoscopic gastrectomy with a small number of ports has been safely performed. In such a case, not all of the hands of both surgeons are occupied by the device, and some communication is possible with hands free. However, a recent report [27] also states that most surgeons still use at least five trocars for laparoscopic gastrectomy in the field of digestive surgery. Thus, the proposed system is still valid, although limited.
Our study's immediate goal was not to introduce the communication support system for practical use in its current form. Instead, the immediate goal was to evaluate the performance of this prototype as a benchmark for considering whether it could serve as a base idea for future development. Therefore, the prototype system does have some inherent limitations for practical use. Above all, the use of head-mounted AR glasses does not fit a clinical environment because of sterilization issues, weight, the requirement for hand gestures, and the difficulty of using the holographic display in strong light conditions. With recent technological advances, we believe that head-mounted AR glasses will continue to become lighter and more packaged, and that the above problems of weight, sterilization, and the need for gestures will not necessarily be present in the future. As for the light environment, a recent report by Kubben and Sinlae [28] tested the Microsoft HoloLens, the same device used in our study, in an operating room under two lighting conditions (general background theatre lighting only, and general background theatre lighting plus operating lights) and concluded that the visibility of the holograms was good if the device was configured to use high brightness for display. However, since their study was not conducted during actual endoscopic surgery, they did not describe the mutual interference of light and other factors, and further evaluation will be necessary.

Conclusions
A prototype for supporting communication in an endoscopic surgery environment was constructed. This study evaluated the applicability of the steering law, one of the HCI models, to drawing motion by head movement, for potential use during endoscopic surgery. The experimental results showed that the steering tasks could be modeled by the steering law with a high coefficient of determination. Many challenges must be overcome before this prototype can be used directly in endoscopic surgery. Therefore, we would like to promote further development to explore the possibility of using this prototype in settings such as preoperative surgical planning.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the contract written in the Institutional Review Board of Organization for Research and Development of Innovative Science and Technology, Kansai University.

Conflicts of Interest:
The authors declare no conflict of interest.