Evaluation of HMDs by QFD for Augmented Reality Applications in the Maxillofacial Surgery Domain

Today, surgical operations are less invasive than they were a few decades ago and, in medicine, there is a growing trend towards precision surgery. Among many technological advancements, augmented reality (AR) can be a powerful tool for improving surgical practice through its ability to superimpose onto the surgical field both the 3D geometrical information of the pre-planned operation and the medical and instrumental information gathered from operating room equipment. AR is fundamental to reach new standards in maxillofacial surgery. Surgeons will no longer need to shift their focus away from the patient to look at monitors. Osteotomies will not require physical tools to be fixed on patient bones as guides to make resections. Handling grafts and 3D models directly in the operating room will permit a fine tuning of the procedure before harvesting the implant. This article aims to study the application of AR head-mounted displays (HMDs) in three operative scenarios (oncological and reconstructive surgery, orthognathic surgery, and maxillofacial trauma surgery) by means of quantitative logic, using the Quality Function Deployment (QFD) tool to determine their requirements. The article provides an evaluation of the degree of readiness of the HMDs currently on the market and highlights the features they lack.


Introduction
The concept of applying AR as a solution to support surgery was born in the last decades of the 20th century [1]. Today's technologies offer plenty of new opportunities to make AR an everyday tool for surgery, and many disciplines are experimenting with it as a pre- and intra-operative tool or as a training device.
There is a rich body of literature about AR applications in different surgical specialities: AR-based surgical navigation offers new possibilities in complex orthopaedic osteotomies for a safer and more accurate execution [2]. Training in procedures such as thoracoscopy and arthroplasty (among many others) is improved by the introduction of AR instruments [3,4]. In liver surgery, an HMD for AR has been evaluated in the context of preoperative planning and training [5]. In maxillofacial surgery, displaying preoperative Computed Tomography (CT) images through an HMD improves the identification of critical anatomical structures [6,7]. A projector-based AR system has been developed for laparoscopic nephrectomy to assist the identification of tumour boundaries [8]. A similar projected AR system is used for the external visualization of virtual models of the organs and related surgical information [9]. In neurosurgery, an AR system that projects information directly onto the patient's head has been tested for performing tailored craniotomies [10]. AR instruments may support the knowledge transfer between specialists (e.g., clinician or specialist) and those seeking understanding and insight (e.g., patient and relatives, learners, etc.) [11].
The previous brief overview shows how the lack of a common investigation methodology leads to a fragmentation of the information. This scientific analysis aims to converge on a standard method and to define guidelines for a handbook of AR applications in surgical practice.
The saying "beauty is only skin deep" could not be further from reality when considering how the underlying structure of the maxillofacial region provides the foundation of our appearance. The importance of the maxillofacial region goes far beyond appearance: it provides protection to the cranium and supports critical functions including sight, smell, taste, talking and breathing. For this reason, any injury or disorder in the maxillofacial region must be treated thoroughly as it can, sometimes, be life-changing or even life-threatening. Therefore, maxillofacial surgery procedures are crucial, and performing surgery on such a critical region requires particular care, as difficulties or mistakes may take a heavy toll on patients.
Maxillofacial surgery has been chosen as the cornerstone of this study due to its importance and, consequently, the extensive literature available. Three practices were identified as those that benefit the most from AR: oncologic and reconstructive surgery, orthognathic surgery, and maxillofacial trauma surgery. This study investigates the distinctive traits of each scenario (e.g., common practice, operations timing, useful features, etc.), extrapolates the requirements, and seeks the most fitting AR device specifications.
After this brief introduction of the study subject, the next section will present the method, focused on common maxillofacial surgical procedures and AR devices. The results will be reported in a dedicated section and, after a brief discussion, conclusions will be drawn about the most appropriate solutions to apply AR to maxillofacial surgery.

Materials and Methods
This is a two-step analysis: first desk research has been performed to investigate maxillofacial practices and AR devices, then the QFD method has been applied to evaluate the relationships between qualitative requirements of the practices and quantitative properties of the devices. Desk research involves a complete review of the literature, including articles and datasheets, and is indispensable to a deep analysis of the functioning and potentialities of the research subject [12].

Desk Research on Maxillofacial Practices
Surgery continues to evolve, with science making great strides on an almost daily basis. Over the years, learning from trial and error, surgical knowledge and techniques have developed to the point of branching out into many surgical specialities, each focused on a specific area: usually an anatomical district of the body or, occasionally, a particular technique or type of patient. The term maxilla, borrowed from New Latin, indicates the jaw and jawbones; therefore, maxillofacial surgery is a field of medicine specializing in the diagnosis and treatment of diseases, injuries, and defects involving the functional and aesthetic aspects of the bone and soft tissues of the region which comprises the oral cavity and the jaws, the face, the head, and the neck. The extent of the maxillofacial region led to the creation of different subspecialities.
This work evaluates the use of AR in three of those subspecialities: oncologic and reconstructive surgery, orthognathic surgery, and maxillofacial trauma surgery. These practices were chosen because their common practice requires a surgical intervention with a certain degree of complexity.

Oncological and Reconstructive Surgery
The maxillofacial region may host different kinds of tumours, cancerous or benign, which often require a surgical reconstructive intervention on damaged structures. Bone structures may be damaged directly by a tumour or indirectly by the treatments of the disease: aggressive tumours, such as squamous cell carcinoma, are treated with radiotherapy, which may cause a necrosis of the irradiated bone structures, called osteoradionecrosis. This is one of the most feared complications after head and neck radiotherapy; in the worst cases, bone reconstruction is necessary to restore necrotized tissue. Among head and neck sites, the mandible is the most commonly affected bone [13]. Besides bone structures, cancers may grow in soft tissue such as the odontogenic tissue or the lip, oral cavity, nasopharynx, and pharynx tissue. Odontogenic tumours are a heterogeneous group of lesions derived from elements of tooth-forming tissues with a very low frequency of cancerous forms. Although there is a significant variation in the invasiveness of the growth pattern among these tumours, they generally grow slowly, causing the distortion of the bone. However, large tumours may exhibit a much more aggressive behaviour that includes extensive destruction of the bone and proliferation into the soft tissues. The common odontogenic tumours are usually treated conservatively by enucleation or curettage, whereas aggressive or recurrent tumours require radical resection and, therefore, a reconstructive intervention.
Common surgical reconstruction practice harvests a graft from the fibula to repair the mandible. The first reported fibula free flap mandibular reconstruction was made by Hidalgo [14]. Compared to previous bone flap harvesting sites, the fibula can be harvested as a pure osseous flap or as a hybrid graft with muscle and skin, thereby permitting great flexibility for the reconstruction of virtually any mandibular and soft tissue defect [15]. Flap harvest is relatively straightforward, allowing the ablative and the reconstructive surgeons to work simultaneously.
Restoring the functional and aesthetic characteristics of the jaw requires accurate preoperative planning to define the position and orientation of the osteotomies and the length of the fibula segments. Preoperative planning information is used to make tools named cutting guides. Two guides, one to remove the part of the mandible affected by the tumour and one to harvest the fibula flap, are used in the intraoperative phase. Those cutting guides support and orient the surgeon's tools during resections. Correct positioning is fundamental in each intraoperative step to obtain a good outcome; however, for the surgery to be minimally invasive, surgical accesses must be narrow, which limits visibility. Giving the surgeon a way to see the placement of virtual guides, anchored to the patient's body, would speed up the fixing and improve the positioning of the physical guides. Visual assistance could also help the surgeon during the reconstruction phase to place the fibula segments inside the mandible osteotomy site and check the compliance of the reconstruction with the preoperative planning model. Moreover, the hybrid nature of fibula flaps would require a way to check the proper vascularisation of the soft tissues. A live view of the blood flow in the operating room could considerably improve the post-operative procedures.
To further improve the quality and safety of surgery, the Digital Imaging and COmmunications in Medicine (DICOM) data sets and the virtual planning may serve as a useful source of information to consult during the operation.

Orthognathic Surgery
Orthognathic surgery is the subspecialty of plastic and maxillofacial surgery that corrects congenital and acquired deformities of the head, skull, face, neck, jaw, and associated structures. This subspecialty was pioneered by Dr Paul Tessier, who studied the techniques to correct facial anomalies through a transcranial approach in the second half of the 20th century. This specialty has grown to become the subspecialty of plastic surgery that restores facial form by addressing both skeletal and soft tissue abnormalities. Orthognathic surgeons have expertise in bones, muscles, nerves, fat, and skin.
It is said that "beauty will save the world" or, at least, may help in physical and psycho-social health; correcting dentofacial deformities solves skeletal malocclusion and improves the aesthetics of the face, reaching a harmonious facial balance [16]. Studies are now focusing on the ability to forecast soft tissue shape after bone repositioning during preoperative planning. The zygomatic region is one of the most involved areas, and a deeper knowledge of this region could lead to the development of predictive surgical tools [17,18]. Surgical intervention in early childhood obviates many problems: the greater plasticity of the tissue allows significant remodelling and subsequent growth that is not observed in adult tissues [19]. Orthognathic surgery is not just an aesthetic matter; infants and children with syndromes may have more difficult airways and complicated surgical repair with more bleeding. Associated anomalies include facial and airway features that make mask ventilation and intubation difficult [20]. Orthognathic surgery requires a high level of accuracy because the treatment is applied to a sensitive area. Specific complications include venous air embolism, dural tears, extradural or subdural hematoma, blindness, and permanent neurological deficits [21].
To restore the facial contour, as well as the functionality of the craniofacial skeleton, an autogenous bone graft (iliac, costal, tibial, and calvarial grafts) is used, while alloplastic bone substitutes can be used to alter facial contour. Today, the combination of traditional procedures with new tools that make harvesting easier and faster [22], and the use of titanium miniplates to stabilise bone grafts in craniofacial and orbital osteotomies [23], make the procedures safer and less taxing for the patient. Orthognathic surgeons are also responsible for the repositioning of the maxilla, mandible, or chin. The Bilateral Sagittal Split Osteotomy (BSSO) is one of the most common types of jaw surgery [24]. A BSSO first-mandible methodology can be summarised as follows: (1) the surgeon performs the first osteotomy, (2) the mandible is set in the correct position using the intermediate splint, (3) the surgeon performs the second osteotomy, (4) the maxilla is placed using the final splint, and (5) upper and lower jaws are screwed to the skull bone with customized titanium plates. A visual assistance system could be useful to compare the positioning of maxilla and mandible to the preoperative planning, acting as a kind of feedback for the surgeon to verify the deviation between the planned model and the surgical outcome.
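The deviation between the planned model and the surgical outcome mentioned above can be quantified in several ways. The following is a minimal sketch, assuming that corresponding landmarks are available for both the plan and the achieved result in a common coordinate frame expressed in millimetres. The landmark names and coordinates are hypothetical, and the root-mean-square distance is only one possible metric, not one prescribed by this study.

```python
import math

def landmark_deviation(planned, achieved):
    """Return per-landmark Euclidean distances and their RMS value.

    Both arguments map a landmark name to an (x, y, z) tuple expressed
    in the same coordinate frame and unit (assumed: millimetres).
    """
    distances = {}
    for name, p in planned.items():
        q = achieved[name]  # the same landmark must exist in both sets
        distances[name] = math.dist(p, q)
    rms = math.sqrt(sum(d * d for d in distances.values()) / len(distances))
    return distances, rms

# Hypothetical landmarks, for illustration only.
planned = {"pogonion": (0.0, 0.0, 0.0), "menton": (0.0, -10.0, 0.0)}
achieved = {"pogonion": (3.0, 4.0, 0.0), "menton": (0.0, -10.0, 0.0)}

distances, rms = landmark_deviation(planned, achieved)
```

In an AR setting, such a value could be shown to the surgeon as numerical feedback next to the superimposed planning model, provided that the registration between the two coordinate frames is reliable.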

Maxillofacial Trauma Surgery
Maxillofacial injuries are a frequent cause of emergency department visits. Traumas may present skin lacerations, burns, obstruction to the nasal cavity or sinuses, damage to the orbits, fracture to the jawbone, and missing or broken teeth. Maxillofacial trauma may also involve serious or life-threatening symptoms, such as brain injury, airway obstruction, excessive bleeding, or shock. The location of injuries may be described by the ratio 6:2:1 between the mandibular, zygomatic and maxillary areas [25]. The aetiology of the traumas usually comprises accidents (e.g., traffic, work, etc.), falls, sports activity, or violence, with the ranking varying according to the region of the world considered. There is a rising number of maxillofacial traumas due to the rising popularity of stand-up electric scooters. Studies from Korea [26] and the US [27] highlighted how, in e-scooter accidents, patients present with a wide range of craniofacial trauma, from abrasions, lacerations, and concussions to intracranial haemorrhage and Le Fort II and III fractures.
In maxillofacial trauma, the patient's condition can deteriorate quickly [28], or the patient may present extended heavy injuries with tissue loss (a condition that requires a reconstruction made with bone graft). Excluding the worst cases, the most common facial traumas can be solved by reducing the fracture, placing the jaws into post-operative maxillomandibular fixation, if necessary, and fixing the bones with a titanium mesh [29]. Many studies dedicated to maxillofacial trauma treatment show how different situations may be solved by counterintuitive solutions: it has been found that delaying the treatment of high-velocity maxillofacial injuries allows for a critical revascularization period, resulting in improved healing and decreased post-operative morbidity and complications [30]; surgical intervention for facial trauma in elderly patients is less frequently indicated because of physiologic and psychologic changes brought on by the ageing process [31].
Orbital injuries are common in maxillofacial traumas and a common cause of blindness. Imaging is imperative for diagnosing orbital fractures, as clinical examination cannot thoroughly assess their presence or severity [32]. The patient's CT scan images are used to extract the 3D model of the skull; then, virtual preoperative planning is performed by a team of engineers and physicians. Once the natural head position has been defined, the normal contralateral orbit is mirrored onto the fractured orbit to define the repositioning of the fractured bones and the placement of the fixing plates. The planning is then transferred to the surgical navigation system, which guides the surgeon in real time through each surgical step. The surgical navigation system has two critical defects: (1) the surgeon must shift attention from the operating site to the navigation system monitor to consult the planning and, while keeping the planning information in mind, must mentally overlap it onto the real patient while operating, which demands a high level of concentration throughout the surgical procedure; (2) the surgical pointer, essential for the position tracking of the surgical navigation system, shares the access of the surgical instruments and, for this reason, the surgeon must remove it to operate. This touch-and-remove practice introduces deviations between the planning and the surgery.
The surgeon's work could be improved and simplified using a system that shows the planning directly on the patient's face.
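The mirroring step of the planning described above is, geometrically, a reflection across the midsagittal plane. The following is a minimal sketch, assuming that the plane is given by one point and a normal vector and that the healthy orbit is available as a list of 3D points. The coordinates are hypothetical, and planning software may implement this step differently.

```python
def mirror_point(point, plane_point, plane_normal):
    """Reflect a 3D point across a plane given by a point and a normal."""
    # Normalise the normal so the reflection formula holds for any length.
    length = sum(c * c for c in plane_normal) ** 0.5
    n = tuple(c / length for c in plane_normal)
    # Signed distance from the point to the plane.
    d = sum((p - p0) * c for p, p0, c in zip(point, plane_point, n))
    # Move the point twice that distance back through the plane.
    return tuple(p - 2.0 * d * c for p, c in zip(point, n))

# Hypothetical data: midsagittal plane x = 0, healthy orbit on the +x side.
healthy_orbit = [(30.0, 10.0, 5.0), (35.0, 12.0, 4.0)]
mirrored = [
    mirror_point(p, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)) for p in healthy_orbit
]
```

The mirrored points serve as the target shape for the fractured side; in practice the midsagittal plane has to be estimated from the skull model once the natural head position is defined.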

Desk Research on Augmented Reality Technologies
Mixed reality (MR) was first defined by Paul Milgram and Fumio Kishino as any point somewhere along the "virtuality continuum" which connects real environments to virtual ones (Figure 1). Among the many MR technologies that have emerged over the years, probably the best known is AR, which refers to all cases in which the display of an otherwise real environment is augmented by means of virtual (computer graphic) objects [33]. In the early 1990s, the concept of AR was invented by Boeing: a technology used to "augment" the visual field of the user with information necessary in the execution of the current task [34]. The first immersive system, "Virtual Fixtures", was created by the USAF. In 1998, the first mainstream application came from Sportvision, which broadcast the first live NFL game using a graphic system to visualise a virtual 1st & Ten line. More recently, the unveiling of the Google Glass device in 2014 made immersive experiences with Google apps possible; in 2016, Microsoft HoloLens went on sale as a more advanced AR device, but at too high a price to become an everyday accessory; in the same year, the app developer Niantic launched Pokémon Go, a georeferenced smartphone AR game that hit 1 billion downloads in March 2019.
Devices are necessary to use AR, and HMDs are a particular subset of MR-related technologies that involve the merging of real and virtual worlds. Inserting virtual objects into real scenes requires a process of registration, that is, the correct alignment of the virtual world with the real one. The objects in the real and virtual worlds must be properly aligned with respect to each other, or the coherence of the two worlds will be compromised. Without accurate registration, AR cannot be accepted in many applications [35]. Registration methods can be divided into two types: sensor-based registration and computer-vision-based registration [36-38]. Following a form factor categorisation, AR devices can be broadly divided into four types:

1. Head-up displays (HUD) are mostly used on vehicles (e.g., airplanes and cars) to give additional info without taking the driver's eyes off the road. This kind of device has a fixed transparent screen, in the line of sight of the pilot/driver, where information is projected.
2. Holographic displays use light diffraction to generate 3D objects in the real space, or to give a depth to displayed images without using 3D glasses or other tools.
3. Smart glasses or wearable devices may have many aspects, ranging from glasses-type devices to more futuristic visors. Visors are also known as head-mounted displays (HMD), and they include two families: optical see-through and video see-through devices. This classification derives from the technological solution adopted to show the real-world image to the user: an optical device views reality directly through transparent displays while a video device shows the reality through camera images.
4. Handheld devices are tools such as smartphones or tablets where AR is realised as video see-through by capturing the reality with the camera of the devices.
This article is focused on HMDs because, using wearable devices, surgeons may retrieve information without looking away, while having their hands free to operate. This article aims to evaluate the readiness of commercial HMDs to be used in a medical application, so knowing the market offer was necessary. Currently, it is not possible to draw a clear line between commercial product families because many features are common to all devices; for this reason, the devices evaluated have been grouped by their commercial purpose into industrial, professional, and entertainment devices. A functional classification was useful to gain a deeper understanding of device capabilities, avoiding subsequent trivial recommendations. The general traits of each group are outlined in the next paragraphs.

Comparison of Market HMDs
Market research has been carried out to find the current availability of HMDs for AR. Then a full comparison has been conducted evaluating the parameters used in the devices' datasheets (Table S1). The market offer is not broad, but the proposition is varied. A first and interesting result of the research is the absence of a standardisation of hardware characteristics, so many different specifications are used in datasheets. Meanwhile, most, if not all, of the devices use a standard platform such as Unity™ to develop applications and interfaces. Given the diametrically opposed situation of hardware and software, the article will be focused on HMD hardware specifications.
The analysis has been done with specifications from datasheets, but macro-specifications will be used in the article for better legibility. The grouping of the specifications is illustrated in Table 1, while an explanation of each macro-specification is listed below:
1. Computing power: the horsepower of a device in terms of hardware and software capabilities.
2. Display: device core specification in AR applications.
4. Sensors: device features necessary to get information from the surroundings.
5. File formats: multimedia documents manageable by the device and so the kind of information the user may consult.
6. Connectivity: ports and technologies to connect with other devices.
7. User inputs: ways for the users to control the device.
8. Power: how the device is powered.
9. General features: miscellaneous information about the device.
The market research found seven HMD devices following four research criteria: (1) the device must be an HMD; (2) the device must be an AR HMD or, at least, a hybrid AR/VR; (3) the device must be currently available on the market or, at least, in a preselling phase; (4) the device must have an adequate datasheet to refer to. The research results show a striking absence of medical-oriented HMDs. Many projects are under way but far from being commercialised (Oculenz ARwear™, Beyeonics One™, Augmedics Xvision™) and other products have already been discontinued (Sony HMS-3000 MT™, Viking Systems HMD™). Meanwhile, economists forecast a total of 26 million HMD units sold in 2023 [39] and a growth with a total CAGR (Compound Annual Growth Rate) of 43.8% in the next eight years, with the healthcare sector leading the trend [40]; those predictions justify the expectation of a growing number of medical-oriented devices in the next years.
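To give a sense of scale to the growth forecast, the compound effect of a constant annual rate can be computed directly. The following is a minimal sketch, assuming that the 43.8% rate applies uniformly to each of the eight years; the function name is illustrative, and the calculation says nothing about which quantity (units or revenue) the cited forecast refers to.

```python
def compound_growth(cagr, years):
    """Return the multiplication factor implied by a constant annual rate."""
    return (1.0 + cagr) ** years

# A CAGR of 43.8% sustained for eight years.
factor = compound_growth(0.438, 8)
```

Under this assumption the forecast quantity would grow roughly 18-fold over the period.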
The devices that meet the research criteria have been grouped according to their general purpose (Table 2), and their specifications have been organised in an ample comparison table available as supplementary material (Table S1). The price of devices varies considerably, showing noticeable steps if products with different purposes are compared. An analysis of the devices grouped by purpose has been done to investigate the existence of common traits occurring within each group, while it has been decided to ignore the price parameter from this point onwards.
The first group evaluated has been labelled as industrial tools and contains two devices (Table 3). One of the key points of this group is the use of a commercial operating system (OS), which means that common programming languages can be used. Depth cameras are not a common feature, while these devices show a robust autonomy and provide some kinds of certification (e.g., IP67 protection). Good connectivity and audio/video features are a logical consequence of a design oriented towards remote maintenance services. User input methods are limited to physical buttons and voice commands; features such as eye or hand tracking are not provided, but Inertial Measurement Units (IMUs) are common. Another distinguishing trait is the ample choice of wearability solutions. The devices of this group are light and compact, and one has a monocular design.
The second family groups devices marketed as professional/enterprise solutions: it is a collection of all-rounder devices with varied characteristics (Table 4). This group achieves the highest (NVIDIA Parker SOC™ paired with an NVIDIA Pascal GPU™) and the lowest (no computing units on-board) computing power among the evaluated HMDs. There are different OS approaches, and it is common to find proprietary systems. These devices have excellent displays, ranging from HD resolution with a 50 degree horizontal field of view to 4K resolution with a 115 degree horizontal field of view (for reference, the common horizontal field of view of the human eyes is around 120 degrees). This group gives full possibilities of hands-free use by offering voice, eye, and gesture controls as inputs. The maximum autonomy for the battery-powered devices is 3.5 h. Some devices still lack a depth camera, but another is equipped with a LiDAR paired with an RGB depth camera. Enterprise devices are binocular solutions with different weights and dimensions, ranging from a 300 g goggle-shaped device to a 1 kg visor.
The third group comprises devices meant for entertainment purposes. This group is somewhat unusual because it is populated by similar models produced by Epson™ (Table 5) that are distinguished only by slight differences.
The characteristic of entertainment devices is a rational use of hardware to obtain a satisfactory trade-off between performance and battery duration. Both devices have a light glasses structure equipped with the worst displays among all the devices evaluated. To increase autonomy, depth cameras and any kind of eye or hand tracking are not present.

Quality Function Deployment
Today the average consumer has a multitude of options available to satisfy their needs; many similar products and services populate the market. Each consumer makes choices based upon a mix of a general perception of quality and value and a personal perception that grows with experience; however, perceptions are a difficult matter to manage because they are unmeasurable and hard to compare.
QFD has been developed as a customer-oriented design method, focused on satisfying requirements specified by the customers [41]. QFD, aside from its original scope, is a structured method useful to quantify the strength of relationships between orthogonal dimensions such as application requirements (qualitative) and device specifications (quantitative).
The following table (Table 6) illustrates a QFD framework chart where the maxillofacial surgery requirements, which emerged from the desk research on practices, are related to the macro-specifications obtained from the devices' datasheets. Intersections represent the contribution of a specification in satisfying a requirement. Aggregating the contributions from each specification will reveal the importance of device features. A requirement may be present in different scenarios but with a different degree of importance; thus, each requirement must be weighted. Requirement weights range from one to five. The device specifications contribute in various degrees to the satisfaction of a requirement, so it is necessary to score each contribution. The contribution score may be zero (specification unsuitable/not related), three (weak relation), six (moderate utility), or nine (very useful). Three QFDs have been drawn up, one for each maxillofacial procedure previously explained, and structured as the framework. The QFD results will be represented by means of a radar chart. The values have been obtained as the sum of each specification's weighted contribution scores. Essentially, each score assigned to a specification (collected in the columns of the Supplementary Material Tables S2-S4) has been multiplied by a factor representing the weight of each scenario's requirement, and the products have been added together. The resulting number will be used as the value for that specification in the radar chart. A range of 0 to 450 will be used because: (1) a specification may be completely unrelated to a requirement (e.g., the information on the manageable type of files is not useful to achieve the visual cutting guides feature), so its contribution is null; (2) the highest value scored will be kept as the upper limit for each radar chart in order to give a visual comparison between the three scenarios.
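The weighted sum described above can be sketched in a few lines. This is a minimal illustration, assuming that weights and scores are stored as dictionaries; the requirement names, weights, and scores below are hypothetical and are not the values of Tables S2-S4.

```python
# Hypothetical requirement weights (1 to 5) for one surgical scenario.
weights = {"visual cutting guides": 5, "planning data consultation": 3}

# Hypothetical contribution scores (0, 3, 6 or 9) of each macro-specification
# towards each requirement.
scores = {
    "Display": {"visual cutting guides": 9, "planning data consultation": 6},
    "Sensors": {"visual cutting guides": 9, "planning data consultation": 0},
    "File formats": {"visual cutting guides": 0, "planning data consultation": 9},
}

ALLOWED_SCORES = {0, 3, 6, 9}

def qfd_importance(weights, scores):
    """Sum weight * score over all requirements for each specification."""
    result = {}
    for spec, by_requirement in scores.items():
        total = 0
        for requirement, score in by_requirement.items():
            if score not in ALLOWED_SCORES:
                raise ValueError(f"invalid score {score} for {spec}")
            total += weights[requirement] * score
        result[spec] = total
    return result

importance = qfd_importance(weights, scores)
```

Each entry of `importance` corresponds to one axis of the radar chart for that scenario; with these illustrative numbers, Display scores 5 × 9 + 3 × 6 = 63.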
The QFD method alone is not sufficient to evaluate the readiness of current HMD for medical applications. A virtual device, named best-case, has been created to represent the current best, and consequently the readiest, device with respect to the application requirements. The best-case device has been composed by taking, for each specification, the best feature available among the evaluated HMD; it is important to underline that the best case is not an idealistic device, but a feasible one made by picking features from devices currently on the market.
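As a rough illustration of how such a best-case device could be assembled, the sketch below takes the maximum of each specification across a set of devices. The device names and values are hypothetical, and the sketch assumes that a higher value is always better; a specification such as mass would need to be inverted before use.

```python
# Hypothetical devices with already-comparable specification values
# (assumption: higher is better for every specification listed here).
devices = {
    "device_a": {"cpu": 3, "camera": 5, "battery_hours": 2.0},
    "device_b": {"cpu": 5, "camera": 2, "battery_hours": 3.5},
    "device_c": {"cpu": 4, "camera": 4, "battery_hours": 8.0},
}


def build_best_case(devices):
    """Pick, for each specification, the best value found among the devices."""
    specs = next(iter(devices.values())).keys()
    return {spec: max(row[spec] for row in devices.values()) for spec in specs}


best_case = build_best_case(devices)
print(best_case)
```

Because every value comes from a device in the input set, the result stays feasible in the sense used above: no specification exceeds what is currently available.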
The devices will be evaluated by comparing them with the best-case device, which requires comparable specifications. While it is easy to compare measurable information such as, for example, the battery autonomy, many other specifications are too heterogeneous to be directly compared. For this reason, a scoring system will be implemented to rank the devices.
The score calculation phases will be:
1. Points assignment: the non-measurable specifications, such as the CPU or the camera, will be arranged from worst to best and receive points from one to five. In the case of a feature where the interest is just to have it or not (e.g., the depth camera or the hand tracking) a simple plus/minus one point will be used.
2. Adjustment: the assigned points will be tuned using, in each scenario, requirements weights.
3. Standardisation: for the weighted points an upper limit of five will be forced. The points will be proportionally re-calculated. This operation is necessary to avoid distorted information when comparing the three scenarios.
4. Aggregation: the standardised scores will be aggregated to summarise devices performance in each scenario.
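The four phases can be sketched in Python as follows. The text does not give the exact formulas, so this sketch makes several assumptions: points are supplied directly (phase 1 is taken as already done), the adjustment multiplies each point by a single hypothetical weight per specification, the standardisation rescales each specification so that its highest weighted value becomes five, and the aggregation is a plain sum. Device names, points, and weights are invented for illustration.

```python
UPPER_LIMIT = 5.0

# Phase 1 (assumed already done): points per device and specification.
# Ranked specifications get 1..5; have-it-or-not features get +1 / -1.
points = {
    "device_a": {"cpu": 2, "camera": 4, "depth_camera": 1},
    "device_b": {"cpu": 5, "camera": 3, "depth_camera": -1},
}

# Hypothetical per-specification weights derived from the scenario requirements.
spec_weights = {"cpu": 4, "camera": 3, "depth_camera": 2}


def adjust(points, spec_weights):
    """Phase 2: tune the points with the scenario weights."""
    return {
        device: {spec: value * spec_weights[spec] for spec, value in row.items()}
        for device, row in points.items()
    }


def standardise(weighted):
    """Phase 3: rescale proportionally so the best value per specification is 5."""
    specs = next(iter(weighted.values())).keys()
    peaks = {spec: max(row[spec] for row in weighted.values()) for spec in specs}
    return {
        device: {
            spec: (UPPER_LIMIT * value / peaks[spec]) if peaks[spec] > 0 else 0.0
            for spec, value in row.items()
        }
        for device, row in weighted.items()
    }


def aggregate(standardised):
    """Phase 4: summarise each device with a single score (assumed: a sum)."""
    return {device: sum(row.values()) for device, row in standardised.items()}


overall = aggregate(standardise(adjust(points, spec_weights)))
print(overall)
```

Rescaling per specification keeps every standardised value at or below five regardless of the scenario weights, which is what makes the three scenarios visually comparable.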

Results
Following the data gathering, an expert group was assembled to analyse requirements and specifications. The expert group was composed of twenty people, ranging from 35 to 60 years of age (median age 42 years), with medical (maxillofacial surgeons) or technological (computer science and biomedical engineers) skills. The multidisciplinary nature of the expert group was meant to voice both medical and engineering aspects. Thus, the qualitative requirements, and their weights for each scenario, were obtained from the discussion held between the surgeons involved as experts. The engineers, considering the voice of customer, evaluated the technical specifications of the devices and assigned the scores for the QFD analysis.
The medical-engineering group has proved to be essential to fully explore the matter of the project. The results are presented in the following sections. Each scenario will be evaluated considering both factors (requirements and specifications) of the QFD analysis: firstly, a point-by-point comment on the requirements will be presented; then the radar chart will match requirements with specifications to show their relative importance; and, lastly, a line chart will be used to show the suitability of the evaluated devices by comparing standardised scores.

Scenario n. 1-Oncological and Reconstructive Surgery
This application proved to be the most challenging because it involves two different operation sites. This means having two medical teams working at the same time, with an increase in the medical data needed to meet the requirements of the increased number of 3D models. Moreover, in the oncological scenario, there is the possibility to visualise soft tissue in addition to bones. The complete QFD table has been supplied in the Supplementary Material (Table S2).
The explanation of qualitative requirements is as follows:
• Simple and smart user interface: smooth navigation through menus and an intuitive layout for tools and information.
• Custom user interface: custom interface layout for user comfort.
• Lightweight and comfortable: the device must not hinder or weary the surgeon.
• Ample and unobstructed field of view: the device must not hinder the surgeon's field of view. The AR is an addition to, and not a substitution for, human view.
• Real-world colour fidelity: the device must not alter the aspect of real objects.
• Device autonomy must be adequate for the required task: in battery-powered devices, the autonomy must be adequate to last the duration of the operation.
• Cordless device: many people and devices are present in the operating room; therefore, easy-to-manoeuvre solutions are preferable.
• Optical zoom: some structures and vessels are too small to be seen by the human eye, while a digital zoom may alter the images.
• Add-on accessories: a modular device allows the device to be customized.
The specifications' relative importance is shown in Figure 2. The radar chart underlines the importance of computing power due to the number and complexity of the 3D models handled. The resolution is also important to have a proper fusion between the real scene and the virtual models. Good cameras are useful for the correct tracking of anatomical parts, as are the hand tracking features. Physical ports are needed to add external sensors (such as an optical zoom).
Figure 3 shows which device, among those evaluated, can better fulfil the oncological scenario requirements. The best-case device is represented as a white background element to improve the contrast with the market devices' coloured lines; this makes it easier to identify the best-performing devices. The chart shows how the cabled XR-3™ is penalised in a scenario where mobility is highly regarded.

Scenario n. 2-Orthognathic Surgery
This application usually only involves one medical team. There are fewer sources of noise and visual disturbances; therefore, voice robustness and hand tracking robustness have a lower importance compared to the oncological scenario. Orthognathic surgery uses fewer standard tools than the previous scenario and so requires fewer 3D models. The complete QFD table has been supplied in the Supplementary Material (Table S3).
The explanation of qualitative requirements is as follows:
• Simple and smart user interface: smooth navigation through menus and an intuitive layout for tools and information.
• Custom user interface: custom interface layout for user comfort.
• Lightweight and comfortable: the device must not hinder or weary the surgeon.
• Ample and unobstructed field of view: the device must not hinder the surgeon's field of view. The AR is an addition to, and not a substitution for, normal view.
• Real-world colour fidelity: the device must not alter the aspect of real objects.
• User feedback: using visual (colour maps, vector maps, etc.) or audio message for real-time surgical assistance.
• Robust voice control: the number of people in the operating room should not condition the voice command's functionality.
• Robust to occlusions: the device must keep track of operation site, surgical tools, and surgeon's hand independently from any visual hindrance.
• Robust to contrast: the device must not be influenced by any light sources (e.g., surgical lights, headlights) present in the operating room.
• Robust face tracking: the device must never lose the operation site tracking. The face is a sensitive site in case of head repositioning due to the number of joints in the head/neck area.
• Display and navigate face DICOM: the face is important as the site of the implant; therefore, all available medical data are necessary to be displayed at the surgeon's request.
• Display and handle 3D models of bones: it is fundamental that bone resection models are displayed.
• Display and handle standard tools 3D models (distractors, locators): a library of standard tools used during operations must be managed.
• Display and handle non-standard tools 3D models (guides, splints, plates): tailored surgical tools must be easily managed.
• Display osteotomy cut lines: virtual guides to assist the surgeon to identify the position and orientation of the resection planes.
• Choice for every 3D model's display style: solid, wireframe or hidden. Many objects are visualised in mixed reality; choosing the best display style for each one is, therefore, necessary.
• Share live images on other devices: viewing the operation site from the user's point-of-view assists teamwork and is useful as a means of training.
• Record images for training and evaluation: recorded images from the surgeon's point-of-view are useful for training and postoperative checks.
• Device autonomy must be adequate for the required task: in battery-powered devices, the autonomy must be adequate to last the duration of the operation.
• Cordless device: many people and devices are present in the operating room; therefore, easy-to-manoeuvre solutions are preferable.
• Optical zoom: some structures and vessels are too small to be seen by the human eye, while a digital zoom may alter the images.
• Add-on accessories: a modular device allows the device to be customized.
The specifications' relative importance is shown in Figure 4. Robustness requirements are marginal compared to the previous scenario, while computing power, resolution, camera, and hand tracking features are still important but with a lower magnitude. In general terms, the area delimited by the radar is smaller than in the previous scenario.
The following chart (Figure 5) shows which devices better fulfil the requirements of the orthognathic scenario. The best-case device is represented as a white background element to improve the contrast with the market devices' coloured lines; this makes it easier to identify the best-performing devices. The worst performance, as in the previous scenario, is achieved in the sensors field.

Scenario n. 3-Maxillofacial Trauma Surgery
This application proves to be the least challenging among the evaluated scenarios. In trauma treatment, the preoperative planning, where the puzzle of skull fragments is recomposed, is as important as correctly shaping the fixing plates. The complete QFD table has been supplied in the Supplementary Material (Table S4).
The explanation of qualitative requirements is as follows:
• Simple and smart user interface: smooth navigation through menus and an intuitive layout for tools and information.
• Custom user interface: custom interface layout for user comfort.
• Lightweight and comfortable: the device must not hinder or weary the surgeon.
• Ample and unobstructed field of view: the device must not hinder the surgeon's field of view. The AR is an addition to, and not a substitution for, human view.
• Real-world colour fidelity: the device must not alter the aspect of real objects.
• Device autonomy must be adequate for the required task: in battery-powered devices, the autonomy must be adequate to last the duration of the operation.
• Cordless device: many people and devices are present in the operating room; therefore, easy-to-manoeuvre solutions are preferable.
• Optical zoom: some structures and vessels are too small to be seen by the human eye, while a digital zoom may alter the images.
• Add-on accessories: a modular device allows the device to be customized.
The specifications' relative importance is shown in Figure 6. Computing power, resolution, and cameras, as in the previous case, are still relatively the most important specifications; all the other features are significantly less important.
Figure 7 shows which device, among those evaluated, can better fulfil the maxillofacial trauma requirements. The best-case device is represented by a white background element to improve the contrast with the market devices' coloured lines; this makes it easier to identify the best-performing devices. The reduced gap between the devices' scores, with respect to the previous scenarios, is a consequence of the easier requirement list.

Discussion
The relative importance of device specifications is compared in Figure 8. The radar comparison provides two important pieces of information: (1) oncological and reconstructive surgery is the most demanding scenario; (2) each specification's weight is, in broad terms, similar in the three scenarios.
The resulting scores are shown in the QFD tables (Tables S2-S4). They confirm, as previously stated, that the oncological scenario is the most challenging. The complexity of oncological and reconstructive surgery is suggested by a requirement list that is longer than in the other scenarios.
The key factor is the necessity to manage the face and leg operating sites at the same time, which requires the presence of two medical teams in the operating room. The number of people needed for the surgery increases the weight of many requirements. Two teams, although well-coordinated, produce more noise, hindering the voice recognition features; the hand tracking feature must deal with more visual disturbances. In the same way, characteristics that make it easier to move while using the device, such as being battery powered or being cordless, gain importance. Two operational sites mean that the number of 3D models required during the surgery is doubled, and the computing power demand grows with the number of 3D models handled. The hybrid nature of the grafts, composed of bones and soft tissues to improve vascularisation and reconstruction, requires more complex modelling than in the other scenarios, therefore increasing the computing power required. The orthognathic and trauma scenarios may be generally described as a scaled version of oncological surgery.
The computing power group, the resolution, and the camera are the three most important specifications. This can be explained considering that, put simply and regardless of the considered scenario, the job is to superimpose 3D models of grafts on a real body.
The joint work group of physicians and engineers has identified a further requirement: the possibility to track the patients using non-visual landmarks. Maxillofacial surgeries are not "clean procedures": blood may soil the landmarks necessary for tracking, hindering the correct registration of the virtual elements onto the real scene.
The evaluation charts (Figures 3, 5 and 7) show, using the standardised scores, the devices' performance for each specification group. Table 7 aggregates the macro-specification scores into one score, which is useful to describe the devices' overall performance for each scenario. Some groups of specifications (audio, file formats, connectivity, and user inputs) are less demanding and so are easily satisfied by most of the devices. The computing power and the display groups are very important specifications, and the devices have varied performance levels. The lower score of the XR-3™ device is down to the computing power, because it lacks on-board hardware (all processing is entrusted to the desktop PC connected by cables); this characteristic is a flaw for the scenarios that are the object of this article. The power group has some score fluctuations caused by the battery duration (with the exclusion of the XR-3™, which is not powered by a battery). The general features group is a miscellaneous one that comprises many specifications; modularity is achieved only by the M4000™, which can be equipped with external batteries to extend the autonomy; the cordless specification is missed only by the XR-3™ device; mass, external dimensions, and wearability are mainly influenced by the form factor and the purpose of the device. The analysis highlights the sensors group as a weakness: depth cameras and ambient light sensors are not always available, while the cameras have an ample range of performance in terms of resolution. Some devices have a resolution that is too low to guarantee a reliable recognition of the scene, which is a critical problem for a correct tracking of the operating site.
None of the evaluated HMD has the features needed to be the current best. Each device may reach the best-case performance in some specifications but is lacking in others. The lack of modularity is certainly a flaw for every device (the M4000™, despite the possibility to equip additional batteries, cannot receive new features) because it prevents the users from adding any feature to the devices and customizing a generic HMD into a medical one.
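A comparison of this kind can be sketched as a per-group gap between a device and the best-case device. The group names and standardised scores below are hypothetical and are not taken from Table 7 or from the evaluation charts.

```python
# Hypothetical standardised scores (upper limit 5) per specification group.
best_case_scores = {"computing_power": 5.0, "display": 5.0, "sensors": 5.0}
device_scores = {"computing_power": 5.0, "display": 3.5, "sensors": 1.0}


def shortfalls(device_scores, best_case_scores):
    """Return the groups where the device falls short of the best case, with the gap."""
    gaps = {}
    for group, best in best_case_scores.items():
        gap = best - device_scores.get(group, 0.0)
        if gap > 0:
            gaps[group] = gap
    return gaps


gaps = shortfalls(device_scores, best_case_scores)
print(gaps)
```

An empty result would mean that the device matches the best case in every group, which, according to the analysis above, no evaluated device does.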

Conclusions
This article aimed to study the application of HMD in three operative scenarios of the maxillofacial domain. The applied method of analysis provides information both on the medical aspects, as it has shown appreciable differences among the three maxillofacial scenarios, and on the readiness of current HMD, together with the principal flaws that prevent them from being adequate medical tools.
Oncological and reconstructive surgery has proven to be the most demanding scenario from the evaluation of surgeon requirements. This practice involves two operation sites (maxillofacial region and fibula) managed by two different medical teams, and so has a larger number of people in the operating room than the other scenarios. Moreover, working on the two different sites requires the amount of data delivered to the surgeon by the device (landmarks, medical data, 3D models, etc.) to be doubled when compared, for example, to the orthognathic scenario. The dual-site aspect of oncological procedures makes it worth considering the feasibility of equipping both teams with a device. That would improve the exchange of information between the surgeons, and so the connectivity of the device would become a key feature.
The overall score (Table 7), as planned, is suitable to quickly depict the readiness of the devices, confirming the result in an easier-to-read form. The table illustrates how no device can reach the best-case yet, but three of them have promising scores: (1) the M4000™, an industrial monocular device with a rudimentary form of modularity; (2) the HoloLens 2™, a renowned HMD used in many applications with a lot of available documentation; (3) the Magic Leap One™, a powerful HMD that has recently led its manufacturer to pivot the business towards professional customers in order to avoid bankruptcy.
Even if the HoloLens 2™ is a more up-to-date device, its scores are lower than those of the other two high-scoring devices (for instance, the Magic Leap One™), mainly because of the available computing power. The hardware features, grouped under the name of computing power, were considered essential to give the surgeon a trustworthy and robust instrument. In fact, during an operation, the HMD must always be responsive and avoid stuttering, lagging, FPS drops or any other phenomena that may confuse or hinder the surgeon. The HoloLens 2™ has the possibility to use the Azure Remote Rendering™ real-time service to improve its performance, but this is heavily dependent on the internet connection. Thus, in the analysis, only the stand-alone capabilities of each device were considered and evaluated accordingly.
The analysis recognized three sensors that are fundamental for medical procedures but not yet implemented in any device:
1. Optical zoom, to look at small structures during the vascular anastomosis in oncological surgery. Overlaying a digital magnification on the scene is very difficult in optical see-through devices; although it is easier in video see-through devices, digital magnification has resolution issues.
2. Blood flux live view, to verify during the surgery the proper vascularization of the implanted graft.
3. Non-optical landmark reader, to overcome the need for an appropriate and clean scene; this is essential in order not to lose sight of the landmarks and, consequently, the registration between reality and virtual elements.
In conclusion, the use of the QFD method to study how well the devices fit surgery applications has been successful. The analyses originating from the QFD method led to the discovery of unexpected relations.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/app112211053/s1, Table S1: Comparison table of the evaluated device specifications; Table S2: QFD scores table for the oncological and reconstructive surgery scenario; Table S3: QFD scores table for the orthognathic surgery scenario; Table S4: QFD scores table for

Conflicts of Interest:
The authors declare no conflict of interest.