Article

Enhancing Pilot ‘Mission’ Projection Through a Virtual Reality Flight Simulator: A Quasi-Transfer of Training Study

1 School of Science, University of New South Wales, Canberra, ACT 2612, Australia
2 Capabilities Systems Centre, University of New South Wales, Canberra, ACT 2612, Australia
* Author to whom correspondence should be addressed.
Submission received: 19 February 2026 / Revised: 18 March 2026 / Accepted: 23 March 2026 / Published: 26 March 2026

Abstract

The purported benefits of Virtual Reality (VR) for pilot flight simulator training, such as increased immersion and presence, could be of particular value in training those flight skills that rely on visuospatial awareness. The implementation of this technology for the training of pilots requires careful consideration of its ability to transfer required skills and of any comparative advantages over conventional flight simulators. In order to examine this question, a quasi-transfer-of-training study was conducted using a separate-sample pretest–posttest design. The ability of a low-cost VR simulator to transfer flying skills and mission projection skills during a common flight manoeuvre was evaluated using internally valid measures. Results were consistent with improved post-intervention flying performance (g = 0.875) and ‘mission projection’ performance (g = 0.661), with no statistically significant difference between the two estimated effect sizes; the combined effect size was also large (g = 0.768). The findings indicate that the VR simulator was associated with better performance in the quasi-transfer of basic flying skills, of those skills that require an understanding of spatial relationships based on visual information, and in the broader training of technique. These findings must, however, be considered in the context of the noted limitations of the technology and the research design.

1. Introduction

The use of flight simulators to prepare pilots for training airborne, or even to replace the aircraft as a training device, is well established in the literature [1]. Within the airlines, simulators have largely replaced initial type-rating training in the aircraft [2]. However, for a simulator to achieve certification to that level of utility, it must provide the level of fidelity deemed necessary by regulators, which requires substantial capital investment [3]. For most of the aviation industry, and particularly for General Aviation (GA), the use of simulators remains a means to prepare the trainee for the learning activity to be conducted airborne [4,5]. Flight simulators used for this purpose, regardless of the technologies they employ, must therefore enable the transfer of the ‘right’ kinds of skills to the trainee. Instruction in a flight simulator for the purpose of achieving the ability to maintain the aircraft within predefined and legislated [6] tolerances is not without value. The training of hand-flying skills in a flight simulator may, in part, function through the cognitive priming of the trainee, as opposed to directly developing specific fine-motor skills [7]. This view broadly aligns with instructional practice [8], even if it is not yet established empirically. Yet training only this cognitive aspect of flight as a preparatory step before airborne training would still leave elements of technique to be learned in the aircraft.
The technique can be viewed in the narrower conception as simply the means by which the procedure is achieved [9], or it can be viewed as part of the broader ‘art’ of flying. In this latter conception, the technique incorporates matters of situational awareness, decision-making, and mission projection, all of which remain dependent upon the underlying skills of flight. The pilot maintains an awareness of the present state of the ownship, projects this state into the near future time-space, judges the projection, and adjusts their control actions to achieve the mission [10]. As such, the pilot is an active agent in the learning procedure. Perfect procedure is then not the goal, but rather an adequate procedure that achieves mission success [11,12]. Of course, the technique is predicated on the procedure and particularly on perceptual–motor activities. The skill and long-acknowledged need for ‘ingenuity’ on the part of the flight instructor [5] could enable better preparatory training of the technique in the flight simulator. Nonetheless, the incorporation of technologies that allow for better immersion and understanding of one’s own state or ‘agency’ would be of great benefit—Virtual Reality (VR) is one such technology.
The development of ubiquitous Virtual Reality in recent years, and particularly the emergence of consumer-grade VR-HMDs (head-mounted displays), has resulted in an increased interest in the technology for education. Interest in this use of VR for education has, of course, included applications within [13,14] and outside [15] aviation. In considering VR for aviation education, and more specifically pilot training, its benefits can be viewed relative to the conventional flight simulator that would otherwise be used. Compared to the conventional flight simulators or procedure trainers that are common in General Aviation, VR is considered more immersive, interactive, and engaging [15]. Each of these traits is of value in the training context, directly and indirectly, by enhancing either cognitive [16] or motivational fidelity [17,18].
The interactivity and immersion of VR may also work synergistically to enable greater visuospatial awareness [19,20]. Improving these aspects of flight simulation may prove especially beneficial for the training of those flight tasks and phases of flight that require the pilot to coordinate the ownship with regard to external visual references (e.g., visual circuits). These tasks require both accurate flight control inputs (i.e., procedure) and anticipation and projection of the relative motion of the aircraft (i.e., technique) [21]. Conventional simulators are known to be less effective for the training of this kind of ‘precision’ flying [22] as compared to their effectiveness for simpler flying tasks [10]. While the exact cause of this differential effectiveness between precise and general tasks is unknown, it may well be related to the tunnel vision caused by truncations of the visual field, resulting from the necessarily limited dimensions of a two-dimensional monitor [23]. The simpler perceptual–motor coordination skills, which are critical for aircraft control, must also be trained or cognitively primed if the VR simulator is to replace the conventional one.
The use of flight simulators to train perceptual–motor skills, particularly the fine motor movements required for aircraft control, is not fully resolved in the existing literature. Training in flight simulators can result in improved airborne performance, even when the inputs used in the simulator exhibit little resemblance to those in the aircraft. For instance, even using a keyboard to control the flight simulator can have benefits [7]. The mechanism by which this improvement occurs may be related to cognitive priming, where the trainee is mentally prepared for the airborne experience [24]. This suggests that the trainee has developed related skills, though not necessarily the ‘muscle memory’ of precise flight control. Studies into developing these skills can be characterised into two types: true-transfer-of-training or quasi-transfer-of-training research. A true-transfer-of-training design, in the context of flight training, involves training in a simulator and the subsequent measurement of transferred skills in the aircraft. The alternative quasi-transfer-of-training design begins with training in a given simulator but assesses transferred skills on a second, often higher-fidelity, simulator. In this design, the second simulator is treated as an adequate analogue for the real aircraft for the purpose of assessing transfer.
In true-transfer studies, perceptual–motor skill transfer is primary. In practice, however, the more immediate concern may be whether the use of a simulator, regardless of design, reduces actual time in the aircraft (i.e., improves training efficiency) [5]. In a study of this kind, and in real-world flight training that uses flight simulators in this way, Thorndike’s law of readiness [25] would apply. In quasi-transfer studies, where a second simulator is used as an analogue for the aircraft, near-identical controls can be implemented. As a result, the trainee does not need to acquire an entirely new set of perceptual–motor skills, since the controls are substantially the same between simulators. In this context, both perceptual–motor skills and cognitive priming may be supported simultaneously. Therefore, for a quasi-transfer study, and indeed for a sufficiently high-fidelity simulator (e.g., those used by most airlines), the laws of exercise, effect, and recency [25] would likely apply. The skill of aircraft control can also be measured and thus quantified. Such an aggregate or primary control metric would be classified as a Type 1 metric per [26], involving the error [27] or accuracy [28] of, for example, the flight path for a given prescribed or regulated manoeuvre.
Training the technique of flight in the simulator relies on the same educational techniques as would otherwise be employed airborne. The skills that form the technique would, in the Australian context, be covered by multiple standards of the non-technical skills (NTS) contained in Schedule 2 of the Manual of Standards Instrument for Part 61 of the Civil Aviation Safety Regulations (CASR) [6]. The technique can be directly taught and facilitated by the flight instructor but should also begin to develop as the mechanics of control are developed [29]. The simulator can be given preference in some circumstances (e.g., instrument flying) due to its lower cost and ease of repetition. Conventional simulators in General Aviation are generally limited by two-dimensional displays and lack an immersive and situated experience [30,31]. As such, those parts of the technique that rely on maintaining an effective external lookout (NTS1.1) and using this to assess the situation and make proper decisions (NTS1.3)—that is, mission projection—are likely inhibited by the tunnel vision view of the display. A simulator that allowed for a visual understanding of three-dimensional spatial relationships (i.e., visuospatial relationships) would, in principle, enable the training of technique as readily as the training of procedure.
In order to measure the technique objectively, a metric other than error in the control of the aircraft is needed [10]. Measures of procedure can show ‘poor’ performance in circumstances where a flight instructor would judge the performance to be adequate. The goal is not perfection, but the competent completion of the mission [11,12]. To quantify this, some meaningful measure of the mission outcome should be calculated.
The extent to which VR supports the transfer of both perceptual–motor coordination and mission projection skills, within the flight simulator paradigm, remains open to debate. The current literature on the use of VR for flight simulation and the use of other extended reality-spectrum technologies for this purpose inadequately addresses these areas of pilot training [14]. Where such areas have been examined, the transfer observed is both varied and accompanied by substantial uncertainty. Present research, for various reasons, is focused on the use of the technology in the earlier stages of pilot training (i.e., ab initio), which is arguably the precise time when the development of these skills ideally begins. The assumed benefits of VR, in comparison to conventional simulators, such as increased immersion, may be of benefit to the transfer of both technique and procedure flight skills.
Considering the importance, the question guiding the present research is as follows: ‘Does the use of a Virtual Reality flight simulator develop participants’ perceptual–motor coordination skills, mission projection skills, or both, as evaluated by two measures (one measuring a proxy of procedure, and one a proxy of technique) of their performance during a common flight task?’ In order to answer this question, empirical evidence is provided on the efficacy of the VR simulator within a quasi-transfer framework.

2. Materials and Methods

The present study utilises a quasi-experimental separate-sample pretest–posttest design. This experimental design, popularised by Campbell and Stanley [32], assesses the performance or status of one group prior to receipt of the trial intervention and assesses the same metric for a second, separate group following the intervention. In such a design, all participants are subject to the intervention, with the estimated intervention effect being represented by the difference between the pretest and posttest scores of the two groups. Separate-sample pretest–posttest designs eliminate the need for a control group, allowing use in circumstances where the creation of a control group is impractical, impermissible, or both [28]. The design can also be used in situations where researchers would otherwise use pre-experimental designs, such as the one-shot case study, but with potentially fewer risks to internal and external validity. The underlying theoretical framework of the research supposes that experience of the simulation, and therefore practice of the skills, would increase performance (i.e., expertise) in the simulation [33].
The researchers were provided with anonymised activity and flight [simulator] data that was the result of, and originally for the sole purpose of, an undergraduate education activity for a course at an Australian university. The creation of a disparity in educational outcomes would not have been permitted in this context. The structure of the data was not a result of experimental choice, but rather a practical limitation of available hardware, software, and time. Thus, the order in which participants arrived determined whether they were tested before or after the intervention, resulting in a quasi-random allocation based upon order of arrival. This resulted in a group making use of a PCATD (personal-computer aviation training device) prior to use of the VR-HMD-based simulator (i.e., the pretest group) and a group making use of the VR-HMD prior to use of a PCATD (i.e., the posttest group). Participant allocation, therefore, followed a nonprobability sampling process that most closely resembles convenience sampling [34].
Data were available for 44 participants, of whom three were excluded due to incomplete data, leaving a sample of n = 41: 17 participants were pretested for their flying skills on a PCATD, and 24 were posttested on the PCATD. The data provided to the researchers did not contain any demographic information (e.g., age, sex, etc.), nor were such data subsequently sought, in accordance with the ethics approval under which the research was conducted. During the original educational activity, all participants had been briefed to discontinue use of the simulators if they experienced discomfort. The nature of this activity and the resulting data precluded true randomisation of group allocation. Because group allocation depended upon arrival order rather than controlled randomisation, systematic differences may exist between the groups. The absence of any demographic or other participant-level data further limits assessment of the comparability of the groups. A particular risk of this sampling method is that more organised and motivated students may have arrived earlier and thus received the VR-HMD first (i.e., posttest group allocation).
The hardware used for the simulator was a computer, a computer monitor, a VR-HMD, and a set of flight-specific peripherals. The computer used for both the assessment PCATD and the VR simulator was equipped with an Intel i7 (2.5 GHz) CPU, an Nvidia RTX 3060 (16 GB) GPU, and 16 GB (DDR4) of RAM. The computer monitor used for the PCATD was a 27-inch flatscreen, with a resolution of 1920 × 1080 pixels and a refresh rate of 60 Hz. The VR-HMD used for the VR simulator, which ultimately served as the visual display medium, was a Meta (Menlo Park, CA, USA) Quest 2 (formerly known as the Oculus Quest 2). The Meta Quest 2 contains two display panels, one per eye, with a resolution of 1832 × 1920 pixels at a maximum refresh rate of 90 Hz. The headset generally conforms to the definition of ‘virtual reality’ as given by Phillips [35]. The flight-specific peripheral used for both simulators was a Thrustmaster (Hillsboro, OR, USA) T.16000 FCS HOTAS [36] for roll, pitch, and engine control, and associated T.Flight rudder pedals [37] for yaw control.
The software used for both simulators was the Windows 11 OS (operating system) for the computer, and the Laminar Research (Columbia, SC, USA) X-Plane 11 flight simulator (Version 11.20) [38], the VR-HMD onboard OS (Version: v61), and the associated software packages required to establish a link between the VR-HMD and the computer. All graphics settings within the X-Plane software were set to medium. Post hoc testing indicates that this computer and software combination would achieve a simulator frame rate of at least 60 Hz. However, the particular environmental conditions during the original lab activity, the link to the VR-HMD, and other unknown factors may have reduced the actual refresh rate. The researchers are unable to unambiguously confirm this based on the provided data. The simulated aircraft for both simulators was a Cessna 172 that was equipped with a Garmin G1000-style flight deck and traditional standby instruments, typically used in General Aviation training.
The flight task to be flown in all situations, and therefore in both simulators, was a left-turning downwind airport landing approach circuit. The aircraft was preestablished at a [very] early left downwind position (−35.26662, 149.17899) for Canberra International Airport (ICAO: YSCB) Runway 35, at circuit altitude (2900 ft AMSL, ~1000 ft AGL), and at cruise speed (105 KIAS). The goal of the activity was to track parallel to the runway (RWY 17), by visual reference only, to a position abeam the threshold of that runway (the ‘mission’). Figure 1 shows the ‘ideal’ flight path (white) overlaid on a map, with several example flight paths (black) also shown. The end of the ‘ideal’ flight path (−35.31385717, 149.1784932) shown in Figure 1 is correct, as Runway 35 at YSCB has a displaced threshold [39]. The downwind leg can be examined for procedural correctness (the ‘procedure’) with reference to the tolerances of straight and level (S&L), at the private level, as specified in the Manual of Standards (MOS) Instrument for Part 61 of the Civil Aviation Safety Regulations [6] and shown in Table 1. In order that the flight would primarily be conducted with regard to visual reference to the ground and horizon, the [simulated] Attitude Heading Reference System (AHRS) was disabled (i.e., simulated to have failed). A button on the HOTAS was mapped to the ‘glance left’ function so that the relative position of the aircraft in relation to the runway could be determined during use of the PCATD. The prevailing wind in the simulation was set to nil.
All participants flew this task on both simulators, with the pretest group flying on the PCATD before the VR simulator and the posttest group flying on the VR simulator before the PCATD. Assessment of the independent-group metrics on the PCATD then enables the assessment of the quasi-transfer of flying skills associated with the VR intervention.
For the purpose of evaluating participant procedural correctness, flight simulator data for altitude, heading, and airspeed, sampled at a typical rate of 10 Hz, were measured against the MOS tolerances in Table 1 for the whole downwind leg. The time flown in conformance with all of these tolerances was calculated as a proportion of the total flight time to provide the percentage of time in tolerance (PTiT) metric. The PTiT metric is taken as the measure of procedure, representing the portion of the manoeuvre during which the participant was accurately performing the control actions required to maintain the flight standards.
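As an illustration, the PTiT calculation from regularly sampled simulator logs can be sketched as below. The tolerance and target values here are illustrative placeholders, not the actual Table 1 MOS figures:

```python
import numpy as np

def ptit(altitude, heading, airspeed, target_alt, target_hdg, target_ias,
         alt_tol=100.0, hdg_tol=10.0, ias_tol=10.0):
    """Percentage of time in tolerance (PTiT) for equally spaced samples.

    Tolerance defaults are illustrative only, not the MOS Table 1 values.
    """
    altitude = np.asarray(altitude, dtype=float)
    heading = np.asarray(heading, dtype=float)
    airspeed = np.asarray(airspeed, dtype=float)
    # Heading error must wrap around 360 degrees (e.g., 355 vs. 005 is 10).
    hdg_err = np.abs((heading - target_hdg + 180.0) % 360.0 - 180.0)
    in_tol = (
        (np.abs(altitude - target_alt) <= alt_tol)
        & (hdg_err <= hdg_tol)
        & (np.abs(airspeed - target_ias) <= ias_tol)
    )
    # With a constant sampling rate (here 10 Hz), the proportion of samples
    # in tolerance equals the proportion of time in tolerance.
    return 100.0 * in_tol.mean()
```

With a constant sampling rate, counting conforming samples is equivalent to integrating time in tolerance, which is why no explicit timestamps are needed.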
The assessment of the ability of a participant to achieve the mission required the extraction of latitude, longitude, and altitude data for the final point of the flight path. The distance from the end of the ‘ideal’ flight path to the end of each participant’s flight path was calculated using the haversine formula [40]. This distance was then combined with the change in altitude (Δaltitude) using basic trigonometry to provide the absolute displacement (AD) metric in feet. In the interest of preserving the directional alignment of the measures, the absolute displacement metric was reverse-scored; that is, an increase in the AD (i.e., a reduction towards zero displacement from the target) is indicative of better performance. The AD metric is taken as the measure of technique, representing the accuracy of the mission projection across the whole of the manoeuvre. The mission of the downwind leg of the circuit, as with the circuit more broadly, is to maintain and position the aircraft such that a landing can be safely effected [29,41]. Thus, correct spatial positioning (e.g., abeam the threshold) reflects the integrated outcome of maintaining an awareness of the present state of the ownship, projecting this state into the near future time-space, judging the projection, and adjusting control action as required.
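A minimal sketch of this calculation, combining the haversine ground distance with the altitude change and applying the reverse-scoring described above (the Earth-radius constant and function names are illustrative, not taken from the source):

```python
import math

EARTH_RADIUS_FT = 20_902_231.0  # approximate mean Earth radius in feet

def haversine_ft(lat1, lon1, lat2, lon2):
    """Great-circle distance in feet between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_FT * math.asin(math.sqrt(a))

def absolute_displacement(end_lat, end_lon, end_alt,
                          ideal_lat, ideal_lon, ideal_alt):
    """Reverse-scored AD: 3-D distance (ft) from the end of the ideal
    path, negated so that values closer to zero indicate better
    performance."""
    ground = haversine_ft(end_lat, end_lon, ideal_lat, ideal_lon)
    d_alt = end_alt - ideal_alt
    # Combine horizontal and vertical displacement via the hypotenuse.
    return -math.hypot(ground, d_alt)
```

A participant finishing exactly at the ideal end point at circuit altitude would score 0 ft; any horizontal or vertical offset makes the score more negative.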
The null hypothesis (H0) for this research was that there would be no significant difference between the participants’ flight performance, based upon the PTiT and AD, between the pretest and posttest. That is, there would be no evidence of transfer associated with VR simulator exposure. Conversely, the first alternative hypothesis (H1) was that at least one of the metrics (i.e., PTiT or AD) would be greater posttest, indicating improved flight performance. The null hypothesis is, therefore,
H0: 
ΔμPTiT ≤ 0 and ΔμAD ≤ 0
where ΔμPTiT and ΔμAD represent the change in the mean between pretest and posttest of the PTiT and the AD, respectively. The first alternative hypothesis being
H1: 
ΔμPTiT > 0 or ΔμAD > 0
where the expectation is that the posttest mean of at least one metric will exceed the pretest mean.
If the null hypothesis (H0) is rejected, and there is evidence of improvement, the second hypothesis (H2) will be tested to evaluate the relative improvement across the metrics. That is,
H2: 
ΔμPTiT = ΔμAD
where, owing to the potential superiority of VR simulators over conventional simulators for visuospatial awareness, performance is expected to improve similarly for both mission and procedure.
In advance of statistical testing, which was always likely to involve some form of independent-samples t-test, the data were checked to validate the underlying assumptions of normality and homoscedasticity. The equality of variances was checked by Levene’s test, and the normality of the distributions was checked by the Shapiro–Wilk test. The PTiT data were found to be homoscedastic and normally distributed, but the AD data had both distributional and variance issues. To maintain statistical power and enable direct comparison between the metrics, the AD data were transformed using the Yeo–Johnson transformation, allowing both metrics to be analysed using parametric tests rather than requiring AD to be tested non-parametrically. The alternatives would have been either to analyse PTiT with a non-parametric test as well, maintaining comparability at the cost of statistical power, or to analyse the two metrics using different inferential approaches, reducing the direct comparability of the results. Unlike some other data transformations, such as the Box–Cox transformation, the Yeo–Johnson transformation can handle negative values [42,43], which was necessary as all AD data were negative. The lambda (λ) value was determined from the AD pretest data only and then applied via the transformation to all AD data. Normality and homoscedasticity were then rechecked by the same tests, and the data were found to be suitable for parametric testing.
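This procedure (λ fitted on the pretest data only, then applied to both groups, with assumptions rechecked) can be sketched with SciPy as follows. The arrays here are synthetic stand-ins with roughly similar scale to the reported AD values, not the study data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic stand-ins for the (all-negative) AD data; NOT the study data.
ad_pre = -np.abs(rng.normal(2700, 760, 17))
ad_post = -np.abs(rng.normal(2260, 540, 24))

# Fit lambda on the pretest data only...
_, lmbda = stats.yeojohnson(ad_pre)
# ...then apply that same lambda to both groups.
ad_pre_t = stats.yeojohnson(ad_pre, lmbda=lmbda)
ad_post_t = stats.yeojohnson(ad_post, lmbda=lmbda)

# Recheck normality (Shapiro-Wilk) and homoscedasticity (Levene).
w_pre, p_pre = stats.shapiro(ad_pre_t)
w_post, p_post = stats.shapiro(ad_post_t)
f_stat, p_lev = stats.levene(ad_pre_t, ad_post_t)
```

Fitting λ on the pretest data alone, as described above, avoids letting posttest observations influence the transformation against which both groups are subsequently compared.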
Following data extraction, calculations, and assumption checking, an independent-samples t-test was used to check the difference in flight performance between the pretest and the posttest for both metrics. The Hedges’ g effect size [44] was calculated for each measure to enable understanding of the mean difference between the pretest and posttest, and, therefore, the estimated effect associated with prior VR exposure. Hedges’ g was more suitable than Cohen’s d due to the incorporation of a correction factor for small sample sizes and better handling of unequal group sizes [45]. The mean differences between performance measures for the pretest and posttest groups were divided by the pooled standard deviation, as opposed to the maximum likelihood estimator [44]. Interpretation of the effect sizes is based on Cohen’s [46] guidelines. Specifically, an effect size greater than 0.65 is interpreted as large, with 0.35–0.65 interpreted as moderate and 0.2–0.35 as small.
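A minimal implementation of Hedges’ g as described, using the pooled standard deviation and the common small-sample correction factor (a sketch, not the authors’ exact code):

```python
import math

def hedges_g(pretest, posttest):
    """Hedges' g for two independent samples: the standardised mean
    difference (posttest minus pretest) over the pooled SD, multiplied
    by the small-sample correction factor J."""
    nx, ny = len(pretest), len(posttest)
    mx = sum(pretest) / nx
    my = sum(posttest) / ny
    # Unbiased sample variances.
    vx = sum((v - mx) ** 2 for v in pretest) / (nx - 1)
    vy = sum((v - my) ** 2 for v in posttest) / (ny - 1)
    # Pooled standard deviation (degrees-of-freedom weighting).
    sp = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    d = (my - mx) / sp
    # Common approximation of the small-sample correction factor.
    j = 1.0 - 3.0 / (4.0 * (nx + ny) - 9.0)
    return j * d
```

The correction factor J shrinks the estimate slightly, counteracting the upward bias of Cohen’s d in small samples such as the n = 17 and n = 24 groups here.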
To enable statistical comparison of the outcomes of the two metrics and thus evaluation of the second hypothesis (H2), the correlation of the endpoints (PTiT and transformed AD) was calculated. This also allows for the calculation of the combined effect size. The sum of the cross-products of the deviations from the means, for each measure, approximates the covariance. The covariance was then normalised by dividing by the total degrees of freedom across the pretest and posttest groups. Dividing by the pooled standard deviations resulted in an estimate of the linear correlation (r) between the intervention effects on the two measures. A z-test of the effect sizes incorporating this correlation, following the difference method of Cohen [47] as cited by Lakens [46], was used to assess whether the changes in the two measures were equal. The overall effect size was then calculated by the method recommended by Borenstein et al. [48].
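The comparison and combination of the two correlated effect sizes can be sketched as follows. The sampling-variance expression for g is a common approximation and is an assumption here, as the source does not reproduce its exact formulas; with the reported inputs, this sketch yields values close to those reported (z ≈ 0.62, combined g = 0.768):

```python
import math

def var_g(g, n1, n2):
    """Common approximation of the sampling variance of Hedges' g
    for two independent groups (an assumption, not the source's formula)."""
    return (n1 + n2) / (n1 * n2) + g ** 2 / (2 * (n1 + n2))

def compare_and_combine(g1, g2, r, n1, n2):
    """z-test for the difference between two correlated effect sizes,
    and their combined (mean) effect, in the spirit of Borenstein et al.
    for multiple outcomes measured on the same groups."""
    v1, v2 = var_g(g1, n1, n2), var_g(g2, n1, n2)
    cov = r * math.sqrt(v1 * v2)
    # Difference between the dependent effects.
    z = (g1 - g2) / math.sqrt(v1 + v2 - 2 * cov)
    # Two-tailed p-value from the standard normal distribution.
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    # Combined effect: mean of the two, with covariance-aware variance.
    g_comb = (g1 + g2) / 2
    v_comb = (v1 + v2 + 2 * cov) / 4
    return z, p, g_comb, math.sqrt(v_comb)

# Reported inputs: g = 0.875 (PTiT), g = 0.661 (AD), r = 0.453, n = 17/24.
z, p, g_comb, se_comb = compare_and_combine(0.875, 0.661, 0.453, 17, 24)
```

Note that the positive correlation between the outcomes reduces the variance of their difference (making the equality test more sensitive) while increasing the variance of their combined estimate.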
In combination, these measures assess the statistical significance of the observed group differences associated with VR exposure, the relationship between the measures, and the practical significance of the observed differences. Owing to the characteristics of the ‘experiment’ that created the data, a priori calculation of power was not possible. Post hoc power calculations were not computed, as they provide no additional information that is not already provided by confidence intervals and can risk misinterpretation of the results [49].

3. Results

The PTiT metric for the pretest group (M = 33.3%, SD = 19.1%, Md = 35.1%) and the posttest group (M = 51.7%, SD = 21.7%, Md = 48.9%), and the AD metric for the pretest group (M = −2707.5 ft, SD = 768.2 ft, Md = −2319.2 ft) and the posttest group (M = −2262.4 ft, SD = 540.7 ft, Md = −2146.2 ft), required assumption checking prior to statistical testing. Levene’s test for equality of variance showed homoscedasticity between the pretest and posttest data for the PTiT metric (F(1, 39) = 0.322, p = 0.564), but the AD metric was heteroscedastic (F(1, 39) = 5.496, p = 0.025). The Shapiro–Wilk test showed that the PTiT data were normally distributed for both the pretest group (W = 0.957, p = 0.568) and the posttest group (W = 0.978, p = 0.852). For the AD metric, the Shapiro–Wilk test showed that the pretest data were non-normally distributed (W = 0.864, p = 0.018) and the posttest data were normally distributed (W = 0.926, p = 0.08). The Yeo–Johnson transformation was applied to the AD pretest data, and the optimal lambda value was determined (λ = 3.521). The AD posttest data were also transformed using this lambda value. Retesting showed that the AD pretest data were now normally distributed (W = 0.914, p = 0.115), without producing distributional issues in the AD posttest data (W = 0.936, p = 0.133). The AD data were now also homoscedastic (F(1, 39) = 0.005, p = 0.942). Figure 2 shows the means of the untransformed PTiT data, the untransformed AD data, and the transformed AD data, for the pretest and posttest groups.
The results of the independent samples t-test, shown in Table 2, comparing the separate pretest and posttest groups’ PTiT, showed a significant (t(39) = 2.813, p = 0.008) mean difference of 18.43 (95% CI [5.18, 31.67]) percentage points. This indicates that PTiT was significantly higher for the posttest group with prior VR exposure, with participants in that group remaining within tolerance for, on average, 18.43 percentage points more of the task time. The results of the independent samples t-test between the groups for absolute displacement (AD) showed a significant (t(39) = 2.126, p = 0.040) mean difference of 0.789 (95% CI [0.038, 1.540]). Though less directly interpretable due to the data transformation, this result indicates that the posttest group finished significantly closer to the ‘mission’ target, consistent with improved flight skills. Both t-test results indicate a significant difference between the participants’ flight performance, based upon the PTiT and AD, between the pretest and posttest. These results, showing that at least one performance metric improved between the pretest and posttest groups, support the first alternative hypothesis (H1).
The calculated Hedges’ g effect sizes between the pretest and posttest, as shown in Table 2, were consistent with positive transfer for both the PTiT metric (g = 0.875, 95% CI [0.219, 1.531]) and the transformed AD metric (g = 0.661, 95% CI [0.024, 1.298]). As per the convention set forth by Cohen [47], this would correspond to a large effect size for the PTiT and a moderate effect size for the AD. When considering these effect sizes, it is important to note that they are dependent, and they have a moderate positive correlation (r = 0.453). Despite the usual interpretation of the effect sizes suggesting a difference between improvements in the two metrics, the z-score (z = 0.620) indicates that, at the 95% confidence level for a two-tailed test, they are not significantly different (p = 0.535). The first alternative hypothesis (H1) is supported. The results are also consistent with the second hypothesis (H2), insofar as the performance improved similarly for both metrics.

4. Discussion

The use of VR simulators in pilot training requires proper consideration of their efficacy in developing flight-critical skills, particularly in circumstances where such simulators will displace conventional simulators. The percentage of time in tolerance (PTiT), here representing the portion of the manoeuvre during which the participant was accurately performing the procedure of flight, and the absolute displacement (AD), representing the accuracy of their mission projection (i.e., the technique), were both significantly higher in the posttest group following VR exposure.
The PTiT data, which can be readily interpreted in raw form, show significant improvement between pretest and posttest: an increase from a mean of 33.3% for the pretest group to a mean of 51.7% for the posttest group. The standard deviation, which represents the variability of the PTiT data within the pretest and posttest groups, shows a small increase. The substantial improvement in group performance suggests that the small change in SD is of limited practical consequence.
The absolute displacement (AD) data are more readily interpreted in their untransformed state: the posttest group’s mean AD was −2362.0 ft, whereas the pretest group’s mean AD was −2707.5 ft. Whereas the posttest group showed slightly greater variability on the PTiT metric, it showed reduced variability on the AD metric, as indicated by the SD. The underlying data are reflected in the large positive effect size for the PTiT (g = 0.875, 95% CI [0.219, 1.531]), the moderate positive effect size for the AD (g = 0.661, 95% CI [0.024, 1.298]), and the moderate positive combined effect size (g = 0.768, 95% CI [0.217, 1.318]). The confidence intervals for these effect sizes are quite wide, particularly for the AD metric, indicating uncertainty in the estimates: the estimated effect size for the AD metric spans from quite small to comparatively large.
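The transformation applied to the AD data was the Yeo–Johnson power transformation [42], which, unlike the Box–Cox transformation, accepts the negative displacement values reported here. As a minimal sketch of the transform for a single value (the λ fitted in the study is not restated, so the function takes it as a parameter):

```python
import math

def yeo_johnson(x: float, lam: float) -> float:
    """Yeo-Johnson power transform of a single value x with parameter lam.

    Handles negative inputs (unlike Box-Cox), which matters for AD values
    such as -2362.0 ft.
    """
    if x >= 0:
        return math.log1p(x) if lam == 0 else ((x + 1) ** lam - 1) / lam
    return -math.log1p(-x) if lam == 2 else -(((-x + 1) ** (2 - lam) - 1) / (2 - lam))

# Sanity check: lam = 1 leaves the data unchanged.
print(yeo_johnson(-2362.0, 1.0))  # -2362.0
```

In practice, λ is chosen by maximum likelihood over the full sample (e.g., with `scipy.stats.yeojohnson`), and the transformed values are then used for the t-test and effect size calculations.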
The significant improvement in the PTiT, which is taken to represent an improvement in the procedure, suggests that a VR simulator may adequately substitute for a PCATD within the limited scope of what was examined. A large positive effect size (g = 0.875, 95% CI [0.219, 1.531]) is consistent with transfer, or other suitable cognitive priming, having occurred before the participants entered the aircraft analogue. This improvement occurred despite the same flight control peripherals being used in both the PCATD and the VR simulator, so the exact mechanism by which the improvement occurred is unclear, though the laws of exercise, effect, and recency [25] would all most likely be applicable. That the flight was conducted in VMC (visual meteorological conditions) may also have given the VR simulator an advantage by means of its immersive exterior visual scene. These factors in combination may have led to the improvement. A useful follow-on research direction would be to repeat the experiment in instrument flight conditions. No participant achieved a perfect score (i.e., 100%), but perfection is not the goal of flight training [50]; competent completion of the mission is the goal, and that is only possible with a suitable technique.
The significant improvement in the AD, which is taken as denoting an improvement in the participants’ technique, represented through improved mission projection, supports the use of VR to prepare for training these skills in an airborne environment. A moderate positive effect size (g = 0.661, 95% CI [0.024, 1.298]) indicates that prior VR exposure was associated with improvements in the underlying non-technical skills through some combination of the technology’s inherent characteristics. The posttest group (i.e., those participants who had received the intervention but were now being tested on the PCATD) performed better on this metric despite no longer having the advantages of the VR simulator (e.g., increased immersion). That is, without the immersion of the VR simulator, they were still able to maintain a better external lookout (NTS1.1) and use this to properly assess the situation and make appropriate decisions (NTS1.3). The claimed immersive and situated experience could have improved their general consideration of relative spatial relationships. It is possible that, as with the PTiT, the laws of exercise, effect, and recency [25] are applicable here. Importantly, the improvement in the proxy measure of technique was not significantly different from that in the proxy measure of procedure. In contrast to the situation with conventional simulators, there would therefore appear to be evidence to support the training of the more visuospatial skills and manoeuvres in VR.
In consideration of the possibility that the two measures are not perfect representations of their respective skills, as noted previously, it is also important to consider the more general effect. The combined effect size (g = 0.768, 95% CI [0.217, 1.318]), indicating a moderate positive transfer, suggests that, whatever its mechanism, VR simulator exposure is associated with improved subsequent performance. This combined effect size, as well as the two component effect sizes, is broadly comparable to the existing xR flight simulator literature [14]. The only prior VR simulator transfer research that shares both the design and the measure (i.e., the PTiT specifically) cannot readily be compared statistically. In that research [28], the underlying data were non-normal, and the Hedges’ g effect size was calculated from the results of a Mann–Whitney U test using the method of Tak and Ercan [51]. Nonetheless, the large positive effect size of that research (g = 0.946, 95% CI [0.56, 1.37]) is comparable to the effect size in the present research. Finally, the AD effect size, which is likely associated with improved understanding of visuospatial relationships, can be compared to previous VR simulator training for formation flying [52]. That research showed a far larger positive effect size (g = 1.819, 95% CI [0.08, 3.56]), but with a far wider confidence interval, reflecting greater uncertainty in the estimate. The effect size from the present research is broadly consistent with that of the previous research, as it falls within the confidence interval of that earlier effect size.
The present research differs in many ways from previous research on the use of VR simulators, and indeed conventional simulators, and this should be considered when interpreting the results. The design of this research and the method of training within a quasi-transfer framework are the two most distinguishing differences. The somewhat unusual structure of the data and their origin effectively necessitated the use of a quasi-experimental, separate-sample pretest–posttest design. This design is common in other areas of education research (e.g., ongoing medical education [53]) but is rarely used in pilot education. While the data could have been interpreted through several kinds of pre-experimental design, the actual process of their generation and the sources of internal and external invalidity made the design used appropriate. Further, again owing to the origins of the data, the inherent sources of invalidity for this design [32] (history, maturation, instrumentation, and mortality) are likely to have been minimised. For example, maturation poses little risk of invalidity given the short duration of the intervention, the testing, and the two combined: it is unlikely that sufficient time had passed for meaningful changes in the participants’ age, hunger, tiredness, or the like. That the time between all pretests and posttests was less than two hours reduces the likelihood that the remaining sources of invalidity could materially affect the outcome.
Within quasi-transfer research, and indeed within true-transfer research, it is most common to train the participant to a set level of proficiency in the experimental simulator, then train them to the same level of proficiency in the aircraft [or aircraft analogue], and then quantify the instruction saved in the aircraft. The instruction saved is generally calculated as the training effectiveness ratio [54]. The activity that generated these data, however, trained participants not to a level of proficiency but for a set length of time. Furthermore, in examining the results of the present research, the outcomes must be considered within the larger context of the transfer literature and with due regard to the means of quantification and interpretation. The use of a quasi-transfer study means that, particularly as regards the perceptual–motor skills, the effect sizes are highly likely to overestimate the transfer [51]. The combined effect size (g = 0.768), as well as the two internal metrics from which it was derived (i.e., PTiT and AD), is liable to overestimate the true transfer achievable between a VR simulator and the aircraft. Additionally, these internal metrics are not commonly used and will require further application to establish their validity. Finally, the posttest group scores are ultimately based on a portion of flying at the end of the test, by which point those participants have had approaching twice the exposure of the pretest group. That is, they were continuing to learn during the posttest, as such learning is inherent [10,55]. Despite these limitations, the research nonetheless shows promise for VR simulators in this kind of pilot training.
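The training effectiveness ratio mentioned above [54] quantifies the aircraft instruction time saved per unit of prior simulator time. A minimal sketch under that definition follows; the numbers are hypothetical, not drawn from this study:

```python
def training_effectiveness_ratio(control_aircraft_time: float,
                                 experimental_aircraft_time: float,
                                 simulator_time: float) -> float:
    """Roscoe's ratio: aircraft instruction time saved (control group time
    minus experimental group time) per unit of simulator time invested."""
    return (control_aircraft_time - experimental_aircraft_time) / simulator_time

# Hypothetical example: a control group needs 10 h of aircraft instruction
# to reach proficiency; a simulator-trained group needs 7 h after 6 h of
# simulator time, giving a ratio of 0.5.
print(training_effectiveness_ratio(10.0, 7.0, 6.0))  # 0.5
```

Because the activity examined here trained for a fixed duration rather than to proficiency, this ratio could not be computed, which is precisely why the effect size approach was adopted instead.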

5. Conclusions

The research presented here supports the proposition that a VR simulator, here constituted by a VR-HMD deployed as the visual system, can be used for training associated with both the flight procedure and the technique of trainees. Between the pretest and the posttest, the procedure performance, the technique performance, and the combined performance were all significantly higher for the group with prior VR exposure. The answer to the question, ‘Does the use of a Virtual Reality flight simulator develop participants’ perceptual–motor coordination skills, mission projection skills, or both, as evaluated by two measures (one measuring a proxy of procedure, and one a proxy of technique) of their performance during a common flight task?’ is that VR exposure was associated with improved perceptual–motor coordination and mission projection performance, with no statistically significant difference in improvement between the two metrics. The improvement in performance was achieved using [relatively] inexpensive hardware and a short intervention exposure time, without either an instructor or formal flight training. When the noted limitations of the technology and of the research design are given due consideration, and the assignment and task are carefully evaluated, this research tentatively supports the use of a VR simulator within the examined context. Additional research is recommended on the true transfer achievable, the advantages attainable with more modern VR-HMDs, and the use of VR simulators in other flight training domains, such as those currently supported by PCATDs.

Author Contributions

Conceptualization, A.S. and G.W.; methodology, A.S.; validation, A.S., G.W. and K.J.; formal analysis, A.S. and G.W.; investigation, A.S. and G.W.; resources, G.W.; data curation, A.S.; writing—original draft preparation, A.S.; writing—review and editing, A.S., G.W. and K.J.; visualisation, A.S. and G.W.; supervision, G.W. and K.J.; project administration, G.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with approval from the Human Research Advisory Panel of the University of New South Wales (Approval HC220421 on 31 August 2022).

Informed Consent Statement

Consent was waived by the Human Research Advisory Panel of the University of New South Wales as the data and the task upon which this research is based were part of, and originally solely for the purpose of, an undergraduate laboratory course. All data were deidentified prior to provision to the researchers.

Data Availability Statement

The datasets presented in this article are not readily available because of restrictions on their distribution to parties external to the original research, as imposed by the ethics approval. Requests to access the datasets should be directed in the first instance to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Somerville, A.; Lynar, T.; Wild, G. The nature and costs of civil aviation flight training safety occurrences. Transp. Eng. 2023, 12, 100–182. [Google Scholar] [CrossRef]
  2. Vidakovic, J.; Lazarevic, M.; Kvrgic, V.; Vasovic Maksimovic, I.; Rakic, A. Flight Simulation Training Devices: Application, Classification, and Research. Int. J. Aeronaut. Space Sci. 2021, 22, 874–885. [Google Scholar] [CrossRef]
  3. Lee, A.T. Flight Simulation: Virtual Environments in Aviation; Ashgate: Aldershot, UK, 2005. [Google Scholar]
  4. Allerton, D.J. The impact of flight simulation in aerospace. Aeronaut. J. 2010, 114, 747–756. [Google Scholar] [CrossRef]
  5. Povenmire, H.K.; Roscoe, S.N. Incremental transfer effectiveness of a ground-based general aviation trainer. Hum. Factors 1973, 15, 534–542. [Google Scholar] [CrossRef]
  6. Federal Register of Legislation (Australian Government). Civil Aviation Safety Regulations 1998; Federal Register of Legislation (Australian Government): Canberra, Australia, 1998.
  7. Dennis, K.A.; Harris, D. Computer-based simulation as an adjunct to ab initio flight training. Int. J. Aviat. Psychol. 1998, 8, 261–276. [Google Scholar] [CrossRef]
  8. Federal Aviation Administration. Aviation Instructor’s Handbook (FAA-H-8083-9B); U.S. Department of Transport: New York, NY, USA, 2020.
  9. Reiser, R.; Brecke, F.; Gerlach, V. On the Difference Between Procedure and Technique in Pilot Instruction; Instructional Resources Laboratory: Tempe, AZ, USA, 1972. [Google Scholar]
  10. Eddowes, E.E.; Waag, W.L. The Use of Simulators for Training In-Flight and Emergency Procedures; AGARD: Neuilly-sur-Seine, France, 1980. [Google Scholar]
  11. Campbell, R.D. Flight Instructor’s Manual, 5th ed.; Campbell Partnership: Guildford, UK, 1994. [Google Scholar]
  12. Air Services Branch. Elementary Flying Training Manual and Instructor’s Handbook, 1st ed.; Department of Transport: Ottawa, ON, Canada, 1944.
  13. Cross, J.I.; Boag-Hodgson, C.; Ryley, T.; Mavin, T.; Potter, L.E. Using Extended Reality in Flight Simulators: A Literature Review. IEEE Trans. Vis. Comput. Graph. 2022, 29, 3961–3975. [Google Scholar] [CrossRef] [PubMed]
  14. Somerville, A. Applications of Extended-Reality in Pilot Flight Simulator Training: A Systematic Review with Meta Analysis. Vis. Comput. Ind. Biomed. Art 2025, 8, 25. [Google Scholar] [CrossRef]
  15. Kavanagh, S.; Luxton-Reilly, A.; Wuensche, B.; Plimmer, B. A systematic review of virtual reality in education. Themes Sci. Technol. Educ. 2017, 10, 85–119. [Google Scholar]
  16. Marougkas, A.; Troussas, C.; Krouska, A.; Sgouropoulou, C. How personalized and effective is immersive virtual reality in education? A systematic literature review for the last decade. Multimed. Tools Appl. 2024, 83, 18185–18233. [Google Scholar] [CrossRef]
  17. Huddleston, H.; Rolfe, J. Behavioural factors influencing the use of flight simulators for training. Appl. Ergon. 1971, 2, 141–148. [Google Scholar] [CrossRef]
  18. Hopkins, C.O. How much should you pay for that box? Hum. Factors 1975, 17, 533–541. [Google Scholar] [CrossRef]
  19. Azarby, S.; Rice, A. Understanding the Effects of Virtual Reality System Usage on Spatial Perception: The Potential Impacts of Immersive Virtual Reality on Spatial Design Decisions. Sustainability 2022, 14, 10326. [Google Scholar] [CrossRef]
  20. Uz-Bilgin, C.; Thompson, M. Processing presence: How users develop spatial presence through an immersive virtual reality game. Virtual Real. 2022, 26, 649–658. [Google Scholar] [CrossRef]
  21. Langewiesche, W. Stick and Rudder: An Explanation of the Art of Flying; McGraw-Hill: New York, NY, USA, 1944. [Google Scholar]
  22. Woodruff, R.R.; Smith, J.F.; Fuller, J.R.; Weyer, D.C. Full Mission Simulation in Undergraduate Pilot Training: An Exploratory Study; AFHRL-TR-76-84; Air Force Human Resources Laboratory, Flying Training Division, Williams Air Force Base: Mesa, AZ, USA, 1976. [Google Scholar]
  23. Bradley, D.R.; Abelson, S.B. Desktop flight simulators: Simulation fidelity and pilot performance. Behav. Res. Methods Instrum. Comput. 1995, 27, 152–159. [Google Scholar] [CrossRef][Green Version]
  24. Li, Q.; Li, B.; Wang, N.; Li, W.; Lyu, Z.; Zhu, Y.; Liu, W. Human-Machine Interaction Efficiency Factors in Flight Simulator Training Towards Chinese Pilots. In Proceedings of the Advances in Simulation and Digital Human Modeling, Cham, Switzerland, 25–29 July 2021; pp. 26–32. [Google Scholar]
  25. Thorndike, E.L. The Fundamentals of Learning; Teachers College Bureau of Publications: New York, NY, USA, 1932. [Google Scholar]
  26. Lysaght, R.J.; Hill, S.G.; Dick, A.; Plamondon, B.D.; Linton, P.M.; Wierwille, W.W.; Zaklad, A.L.; Bittner, A.C., Jr. Operator Workload: Comprehensive-Review and Evaluation of Operator Workload Methodologies; U.S. Army Research Institute for the Behavioral and Social Sciences: Alexandria, VA, USA, 1989. [Google Scholar]
  27. Smith, J.K.; Caldwell, J.A. Methodology for Evaluating the Simulator Flight Performance of Pilots; Air Force Research Laboratory, Brooks City Base: San Antonio, TX, USA, 2004. [Google Scholar]
  28. Somerville, A.; Lynar, T.; Joiner, K.; Wild, G. Virtual Reality Flight Simulation: A Quasi-Transfer of Training Study. Preprints 2026. [Google Scholar] [CrossRef]
  29. Civil Aviation Safety Authority. Flight Instructor Manual—Aeroplane; Department of Infrastructure, Transport, Regional Development, Communications and the Arts: Canberra, ACT, Australia, 2006.
  30. Hanks, W.F. Foreword. In Situated Learning: Legitimate Peripheral Participation; Learning in Doing: Social, Cognitive and Computational Perspectives; Cambridge University Press: Cambridge, UK, 2012; pp. 13–24. [Google Scholar]
  31. Cardenas, I.S.; Letdara, C.N.; Selle, B.; Kim, J.H. ImmersiFLY: Next generation of immersive pilot training. In 2017 International Conference on Computational Science and Computational Intelligence (CSCI); IEEE Computers: Las Vegas, NV, USA, 2017; pp. 1203–1206. [Google Scholar]
  32. Campbell, D.T.; Stanley, J.C. Experimental and Quasi-Experimental Designs for Research; Ravenio Books: Austin, TX, USA, 2015. [Google Scholar]
  33. Ericsson, K.A.; Krampe, R.T.; Tesch-Römer, C. The role of deliberate practice in the acquisition of expert performance. Psychol. Rev. 1993, 100, 363. [Google Scholar] [CrossRef]
  34. Leedy, P.D.; Ormrod, J.E. Practical Research: Planning and Design; Pearson Education: New York, NY, USA, 2015. [Google Scholar]
  35. Phillips, J. Development of a low-cost virtual reality workstation for training and education. In Research Reports: 1995 NASA/ASEE Summer Faculty Fellowship Program; Marshall Space Flight Center: Huntsville, AL, USA, 1996. [Google Scholar]
  36. Guillemot Corporation S.A. THRUSTMASTER T.16000M FCS Hotas. Available online: https://www.thrustmaster.com/products/t-16000m-fcs-hotas/ (accessed on 10 February 2022).
  37. Guillemot Corporation S.A. THRUSTMASTER T.Flight Rudder Pedals. Available online: https://www.thrustmaster.com/products/t-flight-rudder-pedals/ (accessed on 10 February 2022).
  38. X-Plane 11, (Version 11.20) [Computer software]; Laminar Research: Columbia, SC, USA, 2023.
  39. Airservices Australia. ERSA FAC YSCB; Airservices Australia: Canberra, ACT, Australia, 2024. [Google Scholar]
  40. Lawhead, J. Python and Geospatial Algorithms; Packt Publishing: Birmingham, UK, 2023. [Google Scholar]
  41. FAA. Advisory Circular 90-66C: Non-Towered Airport Flight Operations; Federal Aviation Administration: Washington, DC, USA, 2023.
  42. Yeo, I.K.; Johnson, R.A. A new family of power transformations to improve normality or symmetry. Biometrika 2000, 87, 954–959. [Google Scholar] [CrossRef]
  43. Weisberg, S. Yeo-Johnson Power Transformations; Department of Applied Statistics, University of Minnesota: Minneapolis, MN, USA, 2001. [Google Scholar]
  44. Hedges, L.V. Statistical Methods for Meta-Analysis; Academic Press: Orlando, FL, USA, 1985. [Google Scholar]
  45. Hoyt, W.T.; Del Re, A. Effect size calculation in meta-analyses of psychotherapy outcome research. Psychother. Res. 2018, 28, 379–388. [Google Scholar] [CrossRef] [PubMed]
  46. Lakens, D. Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for t-tests and ANOVAs. Front. Psychol. 2013, 4, 863. [Google Scholar] [CrossRef]
  47. Cohen, J. Statistical Power Analysis for the Behavioral Sciences; Routledge: London, UK, 1988. [Google Scholar]
  48. Borenstein, M.; Hedges, L.V.; Higgins, J.P.; Rothstein, H.R. Introduction to Meta-Analysis; John Wiley & Sons: Chichester, UK, 2021. [Google Scholar]
  49. Hoenig, J.M.; Heisey, D.M. The abuse of power: The pervasive fallacy of power calculations for data analysis. Am. Stat. 2001, 55, 19–24. [Google Scholar] [CrossRef]
  50. Schneider, W. Training High-Performance Skills: Fallacies and Guidelines. Hum. Factors 1985, 27, 285–300. [Google Scholar] [CrossRef]
  51. Tak, A.Y.; Ercan, I. Comprehensive Evaluation of Reference Values of Parametric and Non-Parametric Effect Size Methods for Two Independent Groups. Int. J. Stat. Med. Res. 2022, 11, 88–96. [Google Scholar] [CrossRef]
  52. Redei, A. Applications and Evaluation of a Motion Flight Simulator. Ph.D. Thesis, University of Nevada, Reno, Reno, NV, USA, 2019. [Google Scholar]
  53. Markert, R.J.; O’Neill, S.C.; Bhatia, S.C. Using a quasi-experimental research design to assess knowledge in continuing medical education programs. J. Contin. Educ. Health Prof. 2003, 23, 157–161. [Google Scholar] [CrossRef] [PubMed]
  54. Roscoe, S.N.; Williams, A.C. Aviation Psychology, 1st ed.; Iowa State University Press: Ames, IA, USA, 1980. [Google Scholar]
  55. Kozak, J.; Hancock, P.A.; Arthur, E.; Chrysler, S.T. Transfer of training from virtual reality. Ergonomics 1993, 36, 777–784. [Google Scholar] [CrossRef]
Figure 1. Map with overlay of ideal flight path (white) and several example flight paths (black).
Figure 2. (a) Mean PTiT data for the pretest and posttest groups, with 95% confidence intervals; (b) Mean untransformed AD data for the pretest and posttest groups, with 95% confidence intervals; (c) Mean transformed AD data for the pretest and posttest groups, with 95% confidence intervals.
Table 1. Straight and level tolerances (rating standard—aeroplane category—A3.2) [6]).
| Flight Path or Manoeuvre | Flight Tolerance |
|---|---|
| Nominated heading | ±10° |
| Straight and level altitude | ±150 ft |
| IAS | ±10 kts |
Table 2. Test statistics.
| Metric | t | df | p | Mean Difference | SE Difference | Hedges’ g | SE Hedges’ g |
|---|---|---|---|---|---|---|---|
| PTiT | 2.813 | 39 | 0.008 | 18.426 | 6.549 | 0.875 | 0.335 |
| AD | 2.126 | 39 | 0.040 | 0.789 | 0.371 | 0.661 | 0.325 |