A Review of Current Research in Subscale Flight Testing and Analysis of Its Main Practical Challenges

Testing of untethered subscale models, often referred to as subscale flight testing, has traditionally had a relatively minor, yet relevant use in aeronautical research and development. As recent advances in electronics, rapid prototyping and unmanned-vehicle technologies expand its capabilities and lower its cost, this experimental method is seeing growing interest across academia and the industry. However, subscale models cannot meet all similarity conditions required for simulating full-scale flight. This leads to a variety of approaches to scaling and to other alternative applications. Through a literature review and analysis of different scaling strategies, this study presents an overall picture of how subscale flight testing has been used in recent years and synthesises its main issues and practical limitations. Results show that, while the estimation of full-scale characteristics is still an interesting application within certain flight conditions, subscale models are progressively taking a broader role as low-cost technology-testing platforms with relaxed similarity constraints. Different approaches to tackle the identified practical challenges, implemented both by the authors and by other organisations, are discussed and evaluated through flight experiments.


Introduction
Downscaled physical models, also referred to as subscale models, have played an essential role in aerospace. Computational methods are slowly pushing experimental techniques towards a secondary role as verification or calibration tools, but physical models are still undoubtedly effective at revealing unexpected issues and providing confidence in estimations and assumptions. Wind-tunnel testing is probably one of the most established experimental methods in this field, but it sometimes fails to satisfy the requirements of modern aeronautical research and development: modern wind tunnel facilities are a scarce and costly resource, and are therefore not efficient for quick or iterative explorations of the design space during the initial stages of aircraft development. Further, the testing volume and capabilities are sometimes too limited to evaluate the integration of novel or immature technologies at a vehicle level. On the other hand, flight testing of full-scale, manned vehicles is, at this stage, often prohibitive in terms of both cost and risk. Testing of untethered subscale models, often referred to as subscale flight testing (SFT), could offer an affordable and low-risk alternative for gathering both qualitative and quantitative information.
The cost and capabilities of SFT have changed significantly thanks to factors such as the miniaturisation of mechatronic and communication systems, advances in rapid-prototyping and manufacturing techniques, and the availability of both software and hardware from the booming drone market. The development of sophisticated test objects at a wide range of scales can now be carried out even by organisations with limited resources. As a result, the practical threshold for engaging in scientific flight experiments with untethered models is becoming lower than ever before. It is, therefore, necessary to re-evaluate the role of SFT and its potential within the contemporary aircraft research and development process.
Generally, free-flying subscale models cannot meet all similarity conditions required for simulating the exact, full-scale aircraft behaviour. In contrast to modern wind-tunnel facilities, neither the fluid properties nor its conditions can be adjusted at will when flying in the open atmosphere. The need for prioritising certain similarity aspects over others leads to a variety of scaling approaches and different areas of application. Some of these are sometimes misunderstood or oversimplified into a Reynolds number issue.
NASA has historically been one of the most frequently mentioned actors in the field of subscale flight testing. A 1979 Technical Paper by Wolowicz et al. [1] summarises the methods and the scaling principles that NASA's subscale experiments were based on. In 2009, Chambers [2] presented an extensive historical review of NASA's research activities involving subscale models from the 1940s to 2008. This publication describes how free-flying models, both remotely controlled and uncontrolled, were traditionally used in low-speed tests that are considered to be high risk, such as studying the dynamics of high angle-of-attack (AOA), stall, departure, post-stall and spin regimes. While of great historical interest, Chambers' review does not provide a complete overview of the possibilities and limitations of modern SFT: it focuses mainly on one type of similarity (dynamic) and it blends open-air flight testing with tests inside wind-tunnel facilities. To the best of the authors' knowledge, no other comprehensive review of the SFT method has been produced in recent years.
This paper aims to present an overall, up-to-date picture of SFT focusing on its main practical applications in contemporary aircraft research and development. An extensive literature review is carried out in order to reveal how SFT has been used in recent years as well as to synthesise its remaining challenges and practical limitations. Different approaches to tackle these challenges, implemented both by the authors and by other organisations, are discussed and evaluated through flight experiments.

Analysis
Some of the terms and principles used in this analysis may have different interpretations and uses in other contexts. For the sake of clarity, some definitions will be included here.

Definitions
Subscale flight testing (SFT) is defined here as an experimental method in which a downscaled, unmanned aerial vehicle is free-flown in the open atmosphere to obtain qualitative or quantitative information about a larger vehicle, a more complex system or a technology of interest.
This definition may differ slightly from other more open interpretations as it implies the following:
1. The test object is flown unconstrained in the open atmosphere, which excludes flight inside wind-tunnel facilities.
2. The test object does not have any crew on-board, independently of its control method.
3. The test object represents a significantly larger, more complex system or technology and is therefore far from a final product, which excludes conventional flight testing of unmanned aerial vehicles (UAVs) but does not exclude technology demonstrators.
Downscaled UAVs are sometimes referred to as scaled models or even model aircraft. Note that, unless stated otherwise, the term model is used throughout this paper to designate a physical representation of an object, generally of a different size but connected by some type of similarity.
In this context, the concept of similarity or similitude represents an equivalence of properties or behaviour between two systems that are described by the same physics. Hence, it is a condition that depends on the nature of the physical phenomenon of interest and that can be defined via dimensional analysis [3,4]. In a purely mechanical interpretation at a general level, there are three types of physical similarity of relevance in this analysis:
• geometric similarity, which implies equivalence in shape and proportions;
• kinematic similarity, which implies equivalence in motion;
• dynamic similarity, which implies equivalence in motion and forces.
The term scale factor refers to geometric similarity and it is typically used to represent a proportional ratio of linear dimensions of the model with respect to the original object:

$$\text{scale factor} = \frac{l_{\text{model}}}{l_{\text{full-scale}}}$$

Both geometric and kinematic similarity are typically used in aircraft development for communication and knowledge exchange, such as in technical drawings, flow visualisation tools or computer-aided design (CAD) environments. The quantitative analysis of the behaviour and performance of an aircraft generally involves the synthesis of multiple forces and motions, thus requiring dynamic similarity. This is the case for aerodynamics, flight mechanics, structures and other disciplines.
The non-dimensional parameters derived from dimensional analysis and that govern a particular scaling problem are commonly referred to as similarity parameters, scaling parameters or, more generally, scaling laws. The derivation of similarity parameters with useful applications in aircraft development is discussed extensively in references [1,5].
Focusing on the mechanical analysis of aircraft behaviour and performance in a typical atmospheric flight, the forces and moments are generally a function of the aircraft and fluid properties, the characteristics of the motion, and gravitational effects: The nomenclature used here is described in the list of symbols at the end of the paper. All seventeen of these physical quantities involve three fundamental units (mass, length and time) and lead to fourteen different non-dimensional parameters defining similarity for flight dynamics for most of the practical cases. Following the dimensional analysis conducted in [1] and assuming fluid compressibility, these parameters take the following form: where $C_F$ and $C_M$ are now also non-dimensional aerodynamic coefficients in the usual form of $F / (\tfrac{1}{2}\rho V^2 l^2)$ and $M / (\tfrac{1}{2}\rho V^2 l^3)$, respectively. These similarity parameters are described in Table 1.
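As a brief illustration of the non-dimensionalisation described above, the following Python sketch counts the similarity parameters with the Buckingham pi theorem and evaluates the force and moment coefficients in their usual form. The function names and all numerical values are hypothetical and chosen only for illustration; the count assumes that the three fundamental units are independent.

```python
def count_similarity_parameters(n_quantities: int, n_fundamental_units: int) -> int:
    """Number of independent non-dimensional groups (Buckingham pi theorem).

    Assumes the fundamental units are independent, i.e. the dimensional
    matrix has full rank.
    """
    return n_quantities - n_fundamental_units


def force_coefficient(force: float, rho: float, speed: float, length: float) -> float:
    """C_F = F / (0.5 * rho * V^2 * l^2)."""
    return force / (0.5 * rho * speed**2 * length**2)


def moment_coefficient(moment: float, rho: float, speed: float, length: float) -> float:
    """C_M = M / (0.5 * rho * V^2 * l^3)."""
    return moment / (0.5 * rho * speed**2 * length**3)


# Seventeen physical quantities and three fundamental units (mass, length, time).
n_params = count_similarity_parameters(17, 3)

# Hypothetical full-scale and subscale conditions at the same air density.
RHO = 1.225  # kg/m^3, assumed sea-level density
cf_full = force_coefficient(force=18375.0, rho=RHO, speed=100.0, length=10.0)
cf_model = force_coefficient(force=16.5375, rho=RHO, speed=30.0, length=1.0)

print(n_params)
print(cf_full, cf_model)
```

In this example the two forces differ by three orders of magnitude, yet both cases share the same force coefficient, which is the sense in which the coefficients allow a comparison across scales.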
As a consequence of the diversity of requirements, physical limitations and technological limitations (for example, the gravitational field cannot be adjusted), it is generally impossible to design a subscale experiment in which all these similarity requirements are met simultaneously [1,2,5]. Instead, subscale experiments are designed to achieve a balance between fulfilling a subset of relevant parameters and relaxing others, depending on the phenomenon of interest. A particular approach to this similarity problem and the considerations around a specific scaled experiment can be referred to as a scaling method.

Common Scaling Methods
An extensive literature review was carried out with a focus on subscale experiments performed in the context of aircraft research and development. It was observed that most experiments with subscale models could generally be categorised into four different scaling methods, according to their main focus and the degree to which they fulfil similarity conditions. These are listed in Table 2.

Table 2. The most common scaling methods used in SFT experiments can be grouped into four different types.

| Scaling method | Description | Relevant similarity parameters |
|---|---|---|
| Aerodynamic scaling | Similarity of the flow field, disregarding similarity of the aircraft self-motion | |
| Dynamic scaling | Similarity of the rigid aircraft motion as well as the aerodynamic loads that cause it | |
| Aeroelastic scaling | Builds on dynamic scaling and includes similarity for vehicle deformations | All (a) to (n) |
| Demonstrative scaling | Scaled demonstration of a particular technology, system, or capability; partially or fully disregarding the vehicle's similarity conditions | Variable |

This general classification is indicative and by no means unequivocal. The boundaries between these methods are not always clearly defined, since many of the phenomena involved are closely interrelated. Some cases might fall partially in between or even outside these categories, although most subscale experiments can be related to at least one category.

Aerodynamic Scaling
The ultimate goal of aerodynamic scaling is to estimate the properties of the equivalent, real-world flow field around the full-scale vehicle. Note that although it is commonly assumed that this implies an accurate geometric similarity between the two articles, this is not necessarily required as long as the flow field produced is equivalent. Further, it is possible to disregard the vehicle motion and to transform the original problem into a flow-similarity problem. For instance, to study the flow around a wing at a specific attitude it is not necessary to take into account the mass properties of the aircraft. Besides static tests, dynamic tests are also possible by, for instance, inducing flow curvature and with rotating or oscillating models. Note that the similarity parameters from Equation (3) indicated in Table 2 as relevant for this scaling method can be further reduced to (a)(b)(c)(h) for typical, static-only experiments.
Two major side effects come from disregarding other similarity parameters. First, models are assumed to be rigid or in a static deformation state. Second, the only way of conducting these experiments is by externally forcing the model to hold the desired attitude, since freedom of movement would produce dissimilar dynamic responses. This can be achieved by mounting the model on static or dynamic test rigs inside a closed test section, as in wind tunnels [6]; or in the open atmosphere, as in car-top testing [7].
This scaling approach is predominantly used in wind-tunnel testing and is extensively covered in related literature. Its use in SFT (as defined in Section 2.1) is, however, uncommon: only a limited number of full-scale flow problems can be explored in free atmospheric flight with subscale models and it can be difficult to achieve the desired attitude or motion without adding dynamic similarity. Still, an SFT platform could be classified here as aerodynamically scaled if it is specially designed to reproduce the full-scale flow conditions or, at least, reduce the effects of Reynolds number and compressibility deviations.

Dynamic Scaling
Instead of recreating the flow field, the focus of dynamic scaling is on similar vehicle motion. In the study of rigid-body dynamics, it is generally enough to ensure that the resultant aerodynamic forces and moments acting on the body are similar at every state. Although it is sometimes assumed that such a condition requires precise geometrical similarity, this is not true. Geometric or flow-field modifications would not affect the rigid-body dynamics as long as all the aerodynamic forces and moments, as well as the mass properties of the body, are similar. Further, a rigid vehicle can disregard those parameters that account for an elastic airframe, i.e., (f) and (g). In this case, any aeroelastic effects are either neglected or accounted for by other means.
Observe that the parameters (d) and (e) prescribe that the mass distribution of the full-scale and subscale vehicles must be similar in order to obtain equivalent inertial force and equivalent rigid-body motion. Hence, the moments of inertia of the subscale vehicle must be proportional to those of the full-scale counterpart; a condition that can substantially influence the structural design of the subscale airframe and the experiment limitations.
The experimental study of aircraft dynamics is costly and difficult to perform inside wind tunnels, but it is a valuable complement to simulation and can be used for validation. Simulation models are very precise once the aerodynamic and mass characteristics are identified accurately; however, any unmodelled dynamic effects in particular flight regimes can cause significant deviations. Free-flying scaled models are an interesting alternative to dynamic wind-tunnel tests, and dynamic scaling has therefore typically enjoyed great popularity in SFT experiments [2].

Aeroelastic Scaling
The quest for performance optimisation in modern aircraft has led to the growing use of light, more efficient, flexible structures in which both structural dynamics and flight dynamics are tightly coupled to flight control. The experimental validation of aeroelastic properties and advanced control laws is a risky and generally expensive process, due to the potentially catastrophic damage that some of these phenomena could cause to aircraft integrity and test facilities. Hence, free-flying subscale models represent an interesting alternative for early evaluation and verification.
A subscale experiment that aims to investigate dynamic aeroelasticity will necessarily involve elastic, inertial and aerodynamic forces. The similarity requirements for structural flexibility, (f) and (g), add to those already involved in both dynamic scaling and aerodynamic scaling. This means that, for a general dynamic problem with a flexible aircraft, similar aerodynamic forces and moments can only be obtained by satisfying all the similarity requirements initially included in Equation (3). As stated above, designing such an ideal subscale experiment is practically impossible and, therefore, similarity is only partially achieved in the few cases where this scaling method is used in SFT.

Demonstrative Scaling
Demonstrative scaling, or demo-scaling, is a term proposed here to encompass an increasingly common use of scaled models that does not follow the traditional interpretation of scaling laws. Demonstrative scaling can be defined as a scaling method in which the test article features a scaled form of a technology or capability that is yet to be proven in a relevant environment at a greater scale, while the test vehicle itself does not necessarily share a physical similarity with a full-scale vehicle.
The similarity parameters, if applicable, depend on the nature of the feature or technology of interest. In most cases, an exact mathematical formulation of similarity is neither relevant nor necessary for a successful acquisition of information. This definition covers a wide variety of cases and applications, from basic functionality demonstration in a near-laboratory environment to sophisticated validation tests in the expected operational environment. Further, the mentioned attributes are shared by most of the research vehicles commonly known as demonstrators, but they do not correspond with near-production prototypes in which the features of interest are nearly in service.
The literature review shows that this wide group is becoming predominantly popular in SFT experiments: the increasing access to remotely piloted vehicles seems to promote experimental investigations of relatively immature technologies such as advanced control, transition between vertical and horizontal flight, or radical propulsion solutions, but without the risk and cost that a traditional approach involves.

Recent SFT Projects
Extensive literature research was carried out in order to form a clear picture of the actual utilisation of SFT in recent years. The focus was on identifying projects with activity during the last decade, even though the initial sampling covered a much larger time span. The selection criteria for a detailed study were defined as follows:
• Project or platform has produced at least one publication in English in a scientific journal or conference.
• Project or platform has shown signs of activity (publications or related research activities) during the last decade (2010-2020).
• Project or platform utilisation fits the definition of SFT given in Section 2.1.
The methodology for review and classification of recent literature was based on the following steps:
1. Keyword- and keystring-based search using established search engines for scientific publications (SCOPUS, Google Scholar) with an extended timespan (1990-2020).
2. Filtering and removal of duplicates, resubmissions, drafts, and publications outside the field of interest.
3. First filtering of valid SFT projects based on information from title, abstract, main features and conclusions.
4. Tracing of citations in the already selected publications.
5. Expert consultation for additional references.
6. Second filtering of valid SFT projects according to the selection criteria detailed above.
7. Grouping of publications based on the project they relate to.
8. Analysis of each project's aim, methods and platforms.
9. Elaboration of a final list of SFT platforms according to their utilisation.

Table 3 presents all selected SFT platforms along with the available information about the subscale model and the scaling method used, according to the scaling method classification defined in Section 2.2.

Note that the list in Table 3 only reflects those SFT platforms that, to the best of the authors' knowledge, have produced scientific publications. In fact, the actual number of relevant SFT projects or platforms is difficult to quantify due to the variety and unreliability of the communication channels used to disclose project information, but it is expected to be significantly higher. Clear examples of this are the numerous demonstrative SFT experiments on novel vertical lift vehicles being carried out by multiple small companies, and SFT of unconventional configurations by other organisations that so far have only been communicated via press releases or the media. Furthermore, despite the explicit definition of SFT given in Section 2.1, the border between UAV development and SFT is sometimes ambiguous. The inclusion or exclusion of certain projects from this list may therefore be subject to discussion.
The main purposes and research topics that these selected SFT platforms have been used for, according to the consulted publications and the additional information obtained, are illustrated in Figure 1.

Figure 1. Main purposes and research topics that the SFT platforms listed in Table 3 have been used for, along with their respective scaling methods. Note that each platform may have been used for multiple purposes.

Most Common Issues Associated with SFT
Both the literature review and the authors' own experience from multiple SFT projects (see [7,91-95,127]) served to identify some of the most significant issues that affect typical SFT activities. These can be broadly summarised as follows:
• Scaling issues:
  - Not possible to attain sufficiently high Reynolds and Mach numbers to ensure aerodynamic similarity.
  - Flow distortion due to the need for model actuation, instrumentation, propulsion, and manufacturing constraints.
• Data analytics issues:
  - Measurements usually disturbed by turbulence due to ground proximity and operational constraints.
  - Lack of an appropriate specialised framework for data conditioning and visualisation.

Tackling SFT Issues
Different approaches and solutions to the SFT issues identified in Section 2.4 are discussed here. The authors' proposals will be presented along with other relevant approaches found in the literature.

Approaches to Scaling Issues
Except for demonstrative platforms with no strict similarity requirements, the fundamental limitations mentioned in Section 2.4 affect all scaling methods. Four aircraft concepts, with subscale models manufactured and flown at Linköping University, will be used here to illustrate and discuss some of these limitations. The design characteristics of these full-scale concepts are presented in Table 4.

The first scaling issue refers to the difficulty of achieving sufficiently high Reynolds and Mach numbers to ensure similar flow behaviour at both full-scale and subscale. Both quantities are strongly influenced by the flow properties. Adjusting the air conditions at will, as done in pressurised or cryogenic wind tunnels, is not an option when free-flying in the open atmosphere. In most cases, the impact of these two parameters can be studied independently, considering that Reynolds number effects are typically negligible outside the boundary-layer region near the body surface [5,128-130]. Experimental studies such as [131,132] support this assumption by showing a clear distinction between the effect of Mach number and effects caused by Reynolds number variations. This assumption is valid for subsonic and supersonic flow for Mach numbers less than approximately 5, a point at which the shock waves start interacting extensively with the boundary layer, and hence Mach number effects couple to viscous effects [1].
Focusing first on the Reynolds number deviations, this problem is closely coupled to the scale factor of the model since Reynolds number is directly proportional to a characteristic linear dimension, l in Equation (3)(a). Although it could be compensated for by a higher flight speed and air density, these are usually limited by practical testing constraints. Independently of the type of aircraft, a deviation in Reynolds number is therefore unavoidable at small scales, as demonstrated by the different tests shown in Figure 2.
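The dependence on scale factor can be made concrete with a short Python sketch. It uses the standard definition of Reynolds number (density times speed times length, divided by dynamic viscosity); the aircraft dimensions, speeds and sea-level air properties below are hypothetical values chosen only for illustration and do not correspond to any concept in Table 4.

```python
def reynolds_number(rho: float, speed: float, length: float, mu: float) -> float:
    """Re = rho * V * l / mu."""
    return rho * speed * length / mu


def reynolds_ratio(scale_factor: float, speed_ratio: float,
                   density_ratio: float = 1.0, viscosity_ratio: float = 1.0) -> float:
    """Ratio Re_model / Re_full, with all ratios defined as model over full-scale."""
    return scale_factor * speed_ratio * density_ratio / viscosity_ratio


# Assumed sea-level air properties for both vehicles.
RHO = 1.225      # kg/m^3
MU = 1.789e-5    # Pa*s

# Hypothetical case: 2.0 m reference chord at 70 m/s, 20%-scale model at 30 m/s.
scale_factor = 0.2
re_full = reynolds_number(RHO, 70.0, 2.0, MU)
re_model = reynolds_number(RHO, 30.0, 2.0 * scale_factor, MU)
ratio = reynolds_ratio(scale_factor, 30.0 / 70.0)

print(f"Re full-scale: {re_full:.3g}, Re model: {re_model:.3g}, ratio: {ratio:.3f}")
```

With equal air properties the ratio reduces to the scale factor multiplied by the speed ratio, so in this example the model would have to fly five times faster than the full-scale aircraft to recover the full-scale Reynolds number.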
While there is no simple solution to this issue, the most obvious mitigation strategy is minimising these deviations by aiming for larger scale factors and focusing on the modelling of relatively small and slow aircraft such as general aviation aircraft. This is exemplified by the Light-Sport Aircraft concept (c) in Figure 2, where the 34%-scale, jet-powered model also flies at speeds comparable to those of the full-scale counterpart.
A second alternative, appropriate for a wider range of applications, is to geometrically modify the subscale model to mitigate low-Reynolds-number effects on the flow and the flying characteristics. This approach requires a detailed estimation or measurement of both the full-scale and subscale characteristics in order to identify the sensitivities and design appropriate modifications. For instance, a common technique to simulate flow conditions expected at a higher Reynolds number is to artificially fix the laminar-turbulent transition and the separation at a predefined location by means of grit, strips and other flow-tripping devices [1,128]. Examples of more substantial geometrical modifications applied to SFT platforms can be found in the literature: Heine et al. [133] and Vahora et al. [134] show re-designs of the main wing aerofoil to match the full-scale performance, a technique also implemented on the e-Genius-Mod (Bergmann et al. [22]) and the GA-USTAR (Ananda et al. [36]) SFT platforms. This approach is also discussed by Raju-Kulkarni et al. in [135], where it is also proposed to use specifically designed, aerodynamically tailored, subscale models for individual experiments limited to certain flight characteristics of interest.
A third approach to this issue is to rely on knowledge of the flow characteristics and take advantage of the Reynolds number insensitivity of certain flow phenomena, i.e., to accept any deviations while being aware of their general effects. Because of their effects on the boundary layer, Reynolds number variations have a strong impact on the flow topology and forces produced across bodies for which there is no geometrically fixed separation, such as smooth and curved surfaces [128,130]. In flow around sharp corners, however, the separation is usually fixed at the edge and the flow often presents little or no dependence on Reynolds number. This kind of sharp feature is commonly found in modern combat aircraft with a large flight envelope and stealth capabilities. For instance, many contemporary fighters feature chined forebodies and highly swept lifting surfaces for controlled radar reflectivity, high-speed performance and high-AOA excursions. This approach is used by the authors in the case of the GFF concept (a), where the aerodynamics at high AOA are dominated by separated flow and vortex structures. Although the boundary layer region prior to separation is strongly dependent on Reynolds number, both the vortex sheet and the inviscid outer flow region remain relatively insensitive to Reynolds number variations [130]. Furthermore, the literature generally agrees that the separation location, trajectory and breakdown of the primary vortices are mainly determined by the geometry and the angles of attack and sideslip in sharp and highly swept surfaces. Consequently, configurations with these characteristics are especially well suited for experiments with subscale models in which Reynolds number similarity cannot be maintained.
With respect to the second part of the aerodynamic similarity issue, deviations in Mach number are more difficult to avoid and mitigate. The literature agrees that the Mach number has a significant influence on the flow characteristics as soon as compressibility effects begin to be significant, and this may happen even far below the transonic regime.
Wolowicz et al. [1] state that the differences in true and incompressible-flow dynamic pressure and temperature may be significant for Mach numbers in excess of 0.20, while other sources often extend this limit to Mach 0.3. It should be noted that even in a flow with a low free-stream Mach number, compressibility effects may be significant in local flow around certain geometries. For instance, the studies on the F/A-18 fighter configuration reviewed by Erickson et al. [136] revealed that the flow around some components such as leading-edge extensions presents compressibility effects at Mach numbers as low as 0.15-0.25.
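One common way to gauge the size of this effect, not necessarily the measure used in [1], is to compare the isentropic impact pressure (total minus static pressure) with the incompressible dynamic pressure. The Python sketch below evaluates this ratio for a perfect gas in subsonic flow; the function name and the choice of a specific-heat ratio of 1.4 for air are assumptions made for illustration.

```python
def dynamic_pressure_ratio(mach: float, gamma: float = 1.4) -> float:
    """Ratio of isentropic impact pressure (p0 - p) to 0.5 * rho * V^2.

    Valid for subsonic flow of a perfect gas; tends to 1 as Mach tends to 0.
    """
    if mach <= 0.0:
        raise ValueError("Mach number must be positive")
    # Isentropic total-to-static pressure ratio.
    total_over_static = (1.0 + 0.5 * (gamma - 1.0) * mach**2) ** (gamma / (gamma - 1.0))
    # 0.5 * rho * V^2 expressed as 0.5 * gamma * p * M^2, divided through by p.
    return (total_over_static - 1.0) / (0.5 * gamma * mach**2)


for mach in (0.1, 0.2, 0.3):
    excess = (dynamic_pressure_ratio(mach) - 1.0) * 100.0
    print(f"M = {mach:.1f}: compressible excess of about {excess:.2f} %")
```

Under these assumptions the deviation is roughly 1% at Mach 0.2 and slightly above 2% at Mach 0.3, which helps to explain why different sources place the practical limit of incompressible-flow assumptions at different Mach numbers.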
In contrast to the Reynolds number, the Mach number is not directly related to the scale factor of the model but a consequence of the structural design of the model, its propulsion system, and the maximum flight speed at which the model can be safely controlled. There are no effective techniques to artificially compensate for Mach number dissimilarity, which makes it an important limitation for all studies in which compressible flow is relevant. Figure 3 illustrates this issue by showing the different regimes at which the concepts of Table 4 were tested. Unless the subscale models are specifically prepared and allowed to fly at very high speeds (as in the case of some drop-models used by NASA [2,137]), SFT experiments are necessarily constrained to the study of the low subsonic regime. In addition to the issues caused by dissimilarity, local flow disturbances can also be introduced by instrumentation, mechanisms, propulsion or any other systems needed for a functional free-flying model. In most cases, these are considerably different to those of the full-scale concept, and can cause unintended effects in the flow topology and flight performance. Thanks to the miniaturisation of mechatronics and radio-control (RC) systems, a relatively unobtrusive system integration is often possible even at small scales (see for instance [45,127,138,139]), but some modifications may remain unavoidable. In those cases, a complementary computational fluid dynamics (CFD) or wind-tunnel study may help isolate the effects.
Even with unavoidable differences in flow characteristics with respect to the full-scale aircraft, SFT can always be used to validate the aerodynamic characteristics of a corresponding virtual subscale simulation model. The aerodynamic scaling effects between full-scale and subscale simulation models, denominated 'virtual scaling error' in [135], can then be studied with other methods such as CFD, as commonly done in contemporary wind tunnel experiments [140].

Management of Dynamic Similarity, Mass, and Response Time
The Froude number, parameter (m) in Equation (3), becomes significant whenever dynamic similarity is considered. This parameter, however, dictates different requirements for subscale model velocity than those coming from the Reynolds and Mach numbers. Even assuming a certain degree of Reynolds number dissimilarity, satisfying both Mach and Froude numbers simultaneously would require an improbable change in the gravitational field. This issue generates two possible approaches to dynamic scaling: one that pursues complete Froude number similarity at the expense of dissimilar fluid-compressibility effects, and another that pursues complete Mach number similarity at the expense of dissimilar proportions between inertial and gravitational effects. The first is commonly known as Froude scaling, while the latter is sometimes referred to as Mach scaling [1]. Froude number similarity can be met by tailoring the mass properties of the subscale model, which is significantly easier to achieve than Mach number similarity, as discussed above. In fact, all the SFT platforms listed in Table 3 that use dynamic scaling are based on the Froude scaling variant. This is also the approach followed by the authors in all dynamically scaled platforms at Linköping University.
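The conflict between the two variants can be illustrated with a minimal Python sketch. It assumes the same gravitational acceleration for both vehicles and the usual form of the Froude number based on speed, gravity and a characteristic length; the function names and the numerical values are hypothetical.

```python
import math


def froude_scaled_speed(v_full: float, scale_factor: float) -> float:
    """Model speed for equal Froude number under the same gravity: V * sqrt(scale)."""
    return v_full * math.sqrt(scale_factor)


def froude_scaled_time(t_full: float, scale_factor: float) -> float:
    """Model time scale under Froude scaling (length / speed): t * sqrt(scale)."""
    return t_full * math.sqrt(scale_factor)


def mach_scaled_speed(v_full: float, a_model: float, a_full: float) -> float:
    """Model speed for equal Mach number, given both speeds of sound."""
    return v_full * a_model / a_full


# Hypothetical case: 25%-scale model of an aircraft flying at 100 m/s,
# both vehicles in air with the same speed of sound.
scale_factor = 0.25
v_froude = froude_scaled_speed(100.0, scale_factor)
v_mach = mach_scaled_speed(100.0, a_model=340.0, a_full=340.0)
t_model = froude_scaled_time(2.0, scale_factor)

print(v_froude, v_mach, t_model)
```

In this example Froude scaling asks for half the full-scale speed while Mach scaling asks for the full speed, so both conditions cannot be met in the same flight. The time-scale function also shows that a Froude-scaled model responds faster than the full-scale aircraft, here by a factor of two.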
Further, the similarity of mass ratio and mass moments of inertia, terms (d) and (e) in Equation (3), prescribes model weights that can differ significantly from those found in similar non-dynamically scaled models. If the aircraft is assumed to be a rigid body, the moments and products of inertia can easily be matched by distributing individual masses along the airframe, although this technique is only applicable if the airframe is initially lighter than the target weight. In fact, achieving the right inertial characteristics becomes rather challenging if the prescribed weight is lower than that resulting from a typical model manufacturing process: special materials or manufacturing techniques may be necessary, and the inertia requirements may have a significant impact on the model design. This was the case for the subscale model of a human-powered aircraft (case (d) in Table 4), where the weight budget for a 6-metre wing was under 200 g.
In some cases, the requirements for similarity in mass and inertial characteristics will make it impractical or infeasible to conduct a subscale experiment at certain scale factors. The different examples introduced in Table 4, plus two additional cases, are used in Figure 4 to illustrate this problem. Here, the background colours suggest a relative level of difficulty (and eventual cost) of performing SFT with instrumented, dynamically scaled models according to their take-off mass: low (green), medium (yellow) and high (red). For instance, the legal requirements for civil operation of this kind of vehicle in many European countries change significantly when the take-off mass exceeds 25 kg, and even more dramatically when it exceeds 150 kg. On the lower side, it becomes generally difficult to build and operate a functional model of less than 1 kg of take-off mass including the necessary instrumentation. Figure 4 shows two lines for each case: the solid lines represent similarity to the full-scale vehicle at sea level, while the dashed lines represent similarity to the full-scale vehicle when it flies at its design altitude. The GFF, indicated as (a), is a good example of the trade-off needed: the subscale model is operated below the required weight for dynamic similarity in order to avoid the cost and the legal requirements of a heavier vehicle category.
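The dependence of the prescribed take-off mass on scale factor and reference altitude can be sketched numerically. The example below is a minimal illustration, not the procedure used to produce Figure 4: it assumes that the mass-ratio term keeps m/(ρ l^3) constant, that the moment-of-inertia term keeps I/(ρ l^5) constant, and that air density follows the International Standard Atmosphere in the troposphere. The function names and the 10,000 kg, 13%-scale example aircraft are hypothetical.

```python
import math

# ISA constants (troposphere only, 0 to 11 000 m)
RHO0 = 1.225        # sea-level density, kg/m^3
T0 = 288.15         # sea-level temperature, K
LAPSE = 0.0065      # temperature lapse rate, K/m
G0 = 9.80665        # gravitational acceleration, m/s^2
R_AIR = 287.05287   # specific gas constant of air, J/(kg K)


def isa_density(altitude_m):
    """Air density from the ISA troposphere model (valid for 0 to 11 000 m)."""
    if not 0.0 <= altitude_m <= 11000.0:
        raise ValueError("altitude outside the ISA troposphere model")
    theta = 1.0 - LAPSE * altitude_m / T0
    return RHO0 * theta ** (G0 / (R_AIR * LAPSE) - 1.0)


def _density_ratio(full_scale_altitude_m, model_altitude_m):
    """Density at the model's altitude divided by density at the full-scale altitude."""
    return isa_density(model_altitude_m) / isa_density(full_scale_altitude_m)


def froude_scaled_mass(full_scale_mass_kg, scale,
                       full_scale_altitude_m=0.0, model_altitude_m=0.0):
    """Model mass that keeps m / (rho * l^3) equal to the full-scale value."""
    ratio = _density_ratio(full_scale_altitude_m, model_altitude_m)
    return full_scale_mass_kg * ratio * scale ** 3


def froude_scaled_inertia(full_scale_inertia_kgm2, scale,
                          full_scale_altitude_m=0.0, model_altitude_m=0.0):
    """Model moment of inertia that keeps I / (rho * l^5) equal to the full-scale value."""
    ratio = _density_ratio(full_scale_altitude_m, model_altitude_m)
    return full_scale_inertia_kgm2 * ratio * scale ** 5


# Hypothetical aircraft: 10 000 kg at full scale, model built at 13% scale
# and flown near sea level.
for altitude_m in (0.0, 11000.0):
    mass_kg = froude_scaled_mass(10000.0, 0.13, full_scale_altitude_m=altitude_m)
    print(f"full-scale reference altitude {altitude_m:7.0f} m -> model mass {mass_kg:6.1f} kg")
```

Because the model flies in denser air than a full-scale aircraft at altitude, matching the design-altitude condition prescribes a heavier model than matching the sea-level condition, consistent with the two lines drawn for each case in Figure 4.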
Another potential problem relating to dynamically scaled experiments is the decrease of actuation and response times with the scale factor, at a rate of $(l_m/l_{fs})^{1/2}$. The model may not only require higher sampling rates from the instrumentation and data acquisition systems, but also demand a faster control system: the speed at which the control surfaces are deflected should match the time requirements. While the latency of the radio-control system is usually not a problem with modern equipment, a bottleneck can be found at the speed of proportional servo-actuators. Commercial off-the-shelf (COTS) digital servo-actuators were sufficient for the subscale models presented in Table 4. However, smaller scales may require special high-speed servo-actuators or even alternative solutions such as simpler, non-proportional (on-off) high-speed actuators with separate pilots for each axis, as used by NASA in various small-scale models in the past [1,2,137].
Figure 4. Take-off mass and scale factor according to dynamic similarity for SFT at sea level. Background colours indicate relative complexity, considering the challenges and cost of manufacturing and operation. Solid lines represent similarity to the full-scale vehicle at sea level while dashed lines represent similarity to the full-scale vehicle at its design altitude. The GFF subscale model in case (a) is lighter than prescribed due to cost and regulations. Besides the aircraft described in Table 4, two complementary examples have been added: * (e) corresponds to NASA's AirSTAR GTM-T2 [121], and ** (f) corresponds to NASA-Boeing's X-48B [108], both scaled with respect to their design altitude.
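The effect of the time-scaling rate on instrumentation and actuators can be illustrated with a few lines of arithmetic. The sketch below assumes Froude scaling, under which durations scale with the square root of the scale factor and therefore all rates (sampling frequencies, angular rates, control-surface deflection rates) scale with its inverse. The function names and the example figures are hypothetical and are not taken from the platforms in Table 4.

```python
import math


def model_duration(full_scale_duration_s, scale):
    """Duration on the model that corresponds to a full-scale duration."""
    if scale <= 0.0:
        raise ValueError("scale factor must be positive")
    return full_scale_duration_s * math.sqrt(scale)


def model_rate(full_scale_rate, scale):
    """Rate on the model (Hz, deg/s, ...) that corresponds to a full-scale rate."""
    if scale <= 0.0:
        raise ValueError("scale factor must be positive")
    return full_scale_rate / math.sqrt(scale)


# Hypothetical example: a 1:10 model of an aircraft whose control surfaces
# move at 60 deg/s and whose flight-test data are sampled at 25 Hz.
scale = 0.1
print(f"equivalent surface rate:  {model_rate(60.0, scale):.0f} deg/s")
print(f"equivalent sampling rate: {model_rate(25.0, scale):.0f} Hz")
print(f"a 2 s manoeuvre lasts:    {model_duration(2.0, scale):.2f} s")
```

At this scale every rate increases by a factor of about 3.2, which gives a sense of why servo-actuator speed, rather than radio latency, tends to become the limiting element as models get smaller.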

Aeroelastic Considerations
For a general dynamic problem with a flexible aircraft, obtaining similar aerodynamic forces and moments would require satisfying all the similarity parameters in Equation (3). This general formulation includes the elastic similarity between full-scale and subscale vehicles by means of the parameters (f) and (g), which account for the aeroelastic bending and aeroelastic torsion, respectively. While in dynamically scaled rigid aircraft, similarity in mass moments and products of inertia could be met by adding the necessary masses on the airframe, aeroelastic scaling requires that the actual distribution of these masses is similar to that of the full-scale vehicle [1]. This detail adds a significant complexity to the design and manufacturing of the model, considering that the loads do not scale at the same rate as the stiffness and that the structural construction of models is normally different [137].
Two different approaches to this issue are found in the literature. The first approach consists of focusing on specific aeroelastic problems and proposing a specific similarity formulation for each case. By assuming a partial similarity, tailored to the specific problem of interest, it is possible to find a balance between a feasible platform and a flight-test solution. For instance, Ouellette et al. propose in [66] a more practical set of aeroelastic scaling laws applied to the study of couplings between the short-period mode and the wing structural dynamics, such as body freedom flutter (BFF) [141]. Among other simplifications, these authors argue that the sensitivity of the short-period mode to the Froude number is generally low and therefore the flight velocity can be lower than that prescribed by typical Froude scaling. The reduced set of similarity requirements allowed the development of a feasible subscale experiment whose results were reported in [67]. Another example of this approach for the study of BFF is the Lockheed Martin X-56A or Multi-Utility Technology Testbed (MUTT), a 15% scaled version of the Sensorcraft configuration with interchangeable wings [77,78,142].
The second alternative is to further disregard other similarity factors and focus exclusively on investigating the aeroelastic problem of interest. In practice, this approach blends with the demonstrative scaling method as defined in Section 2.2. An early example of this approach is the Drones for Aerodynamic and Structural Testing (DAST) programme carried out by NASA from 1977 to 1983 [143][144][145]. During this programme, two BQM-34 Firebee II target drones were modified with supercritical aerofoils and new wing geometry, the Aeroelastic Research Wing (ARW). These vehicles were mainly used to evaluate active control systems and flutter suppression techniques, as well as for stability and structural investigations. A more recent take on active flutter control and aeroelastic tailoring is the European Flutter-free Flight Envelope Extension for Economical Performance Improvement (FLEXOP) research project [13][14][15][16][17][18] and its continuation, FLiPASED.

The Demonstrative Scaling Approach
Demonstrative SFT platforms, as defined previously, are not necessarily bound by the typical similarity parameters in Equation (3). The scaling principle is applied to the technology or capability, and not necessarily to the entire vehicle. Therefore, it is not possible to generalise how scaling issues affect these experiments. They are, however, exposed to the same issues regarding flight testing and data analytics.

Approaches to Flight-Testing Issues
There are multiple approaches to flight testing subscale aircraft. Critical aspects for flight test design, such as risk evaluation and infrastructure requirements, vary widely between different test objects. Additionally, the existing regulations for unmanned aircraft systems (UAS) in most countries define a significant step in requirements between flight within visual line-of-sight (VLOS) and beyond visual line-of-sight (BVLOS) [146]. Low-cost equipment for extended-VLOS (EVLOS) and BVLOS operations is readily available, but a civil operator is usually asked to either certify the system according to nearly full-scale standards or operate inside costly segregated airspaces [147][148][149]. In practice, certification is usually not feasible and only a few civil organisations can make regular use of segregated airspaces to fly BVLOS operations. Instead, most organisations perform SFT in non-segregated airspace following EVLOS and VLOS rules as far as practical. While an exact classification cannot be formulated, most of the SFT experiments found in the literature can be roughly divided into four categories according to the type of operation and standards; see Table 5.
There is extensive literature covering flight-testing methods for both conventional [150,151] and relatively large unmanned vehicles [152][153][154], which could be related to SFT experiments of level 4 and some of level 3 in Table 5. Despite being more common, little has been published about specific methods for VLOS testing of relatively smaller RPA, i.e., experiments corresponding to levels 1 and 2, and even some at level 3. A good example of the testing conditions within VLOS is the FLEXOP project, where the limited airspace and the relatively large size of the demonstrator had an important impact on the design of the experiment and the vehicle itself [13,14]. The procedure description given by Bunge et al. [40] illustrates a typical test of much smaller aircraft at level 1. Most subscale platforms at Linköping University are tested at levels 1 and 2. Among these, the GFF demonstrator (case (a) in Table 4) represents the most challenging case for SFT within VLOS at level 2. This case will therefore be used here as an example to present the authors' solutions. Figure 5 represents typical conditions found when flight testing this type of platform within VLOS, although the exact available range and height might vary slightly depending on the local regulations and safety margins.
Figure 5. Typical conditions when flight testing this type of platform within VLOS (cf. Table 5): a challenging environment with severe exposure to ground turbulence.

Optimising SFT at Levels 1 and 2
In a previous publication [93], the authors proposed and tested three techniques to cope with some of the issues that hinder high-quality data acquisition during VLOS testing (levels 1 and 2 in Table 5), namely short testing time, imprecise excitation manoeuvres, constant manoeuvring and a lack of steady conditions:
• Automation of test manoeuvres: custom-made software that is able to precisely command any kind of pre-programmed excitation manoeuvre without the need for a closed-loop flight controller or an on-line ground station, hence avoiding eventual redundancy and certification requirements.
• Optimised manoeuvres for flight mechanical characteristics: reducing exposure time by exciting different axes and controls simultaneously using multisine signals [94].
• Optimised manoeuvres for performance evaluation: executing certain dynamic manoeuvres that allow for the exploration of a wide area of the polar in a short time.
Figure 6 is an example of the lift characteristics of the GFF demonstrator identified from a single flight using some of the manoeuvres proposed in [93].
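The idea behind simultaneous multisine excitation can be sketched in a few lines. The example below is a generic illustration and not the implementation described in [93] or [94]: each control input is assigned its own interleaved set of harmonics of the record length, so that the inputs are mutually orthogonal over the manoeuvre and their effects can be separated afterwards. Random phases are used here for simplicity, whereas practical designs usually optimise the phases to limit peak amplitude. The function name and all parameter values are hypothetical.

```python
import numpy as np


def orthogonal_multisines(n_inputs, duration_s, sample_rate_hz,
                          harmonics_per_input, first_harmonic=1, seed=0):
    """Return (t, signals) with one unit-peak multisine per row of `signals`.

    Input j uses harmonics first_harmonic + j + n_inputs * i of 1/duration_s,
    so no two inputs share a frequency.
    """
    n_samples = int(round(duration_s * sample_rate_hz))
    highest = first_harmonic + n_inputs * harmonics_per_input - 1
    if highest >= n_samples / 2:
        raise ValueError("highest harmonic exceeds the Nyquist limit")

    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / sample_rate_hz
    signals = np.zeros((n_inputs, n_samples))
    for j in range(n_inputs):
        harmonics = first_harmonic + j + n_inputs * np.arange(harmonics_per_input)
        phases = rng.uniform(0.0, 2.0 * np.pi, size=harmonics.size)
        for k, phase in zip(harmonics, phases):
            signals[j] += np.sin(2.0 * np.pi * k * t / duration_s + phase)
        signals[j] /= np.max(np.abs(signals[j]))  # normalise to unit peak
    return t, signals


# Hypothetical manoeuvre: three control inputs excited together for 10 s at 100 Hz.
t, u = orthogonal_multisines(n_inputs=3, duration_s=10.0,
                             sample_rate_hz=100.0, harmonics_per_input=5)
cross = u @ u.T  # off-diagonal terms vanish for orthogonal inputs
print(np.round(cross / np.max(np.diag(cross)), 6))
```

Each row of `u` would then be scaled to an allowable deflection amplitude and added to the pilot's or trim command for the corresponding control surface.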
For most SFT campaigns with relatively small vehicles and limited to VLOS, imitating the flight test organisation and execution procedures normally used in conventional flight testing of large RPA or manned vehicles (level 4 and above) is far from optimal. Failing to recognise the specific needs and the inherent advantages of SFT within VLOS, especially in the case of relatively small vehicles at levels 1 and 2, defeats the purpose of scaled testing by eventually increasing costs, time, and administrative burden. Figure 7 tries to illustrate this idea by comparing the relative importance typically given to different factors on both manned/level 4 SFT and levels 1 and 2 SFT. A good example of the need for a reasonable approach is the level to which the flight is planned and scripted in test cards. Figure 8 illustrates the approach followed during flight testing of the GFF demonstrator at level 2. Each circuit is divided into a test manoeuvre and a pattern flight. The test manoeuvre takes place inside a defined test window, while the pattern flight makes use of the available manoeuvring area inside the safety limits. Instead of defining each consecutive movement of the aircraft during the circuit, the test cards focus on specifying the conditions at which the aircraft should enter the test window and the precise movements that should be done inside it. In most cases, the test manoeuvre can also be executed automatically using the application described in [93]. The pattern flight is then executed in a relatively free manner, but always within previously agreed parameters and speeds. This approach gives more flexibility to the remote pilot, but also relies heavily on his or her judgment during the flight. The results obtained after several campaigns with the GFF demonstrator suggest that this procedure is beneficial even at level 2. 
Considering the reduced airspace, the influence of external factors and the speed at which the events usually unfold, the remote pilot is often in the best position to judge the optimal manoeuvring needed to attain the desired entry conditions before the next manoeuvre. A real-life example of the application of this method in level 2 SFT of the GFF demonstrator is shown in Figure 9, which corresponds to the same flight that generated the data shown in Figure 6.

Specific Data Acquisition Solutions
The instruments and data acquisition systems typically used in manned flight testing are, in most cases, not appropriate for subscale aircraft. The main reasons are often weight, size and relative cost, but can also include different needs in terms of resolution and sampling frequency. For example, while an altitude error of 5 m and a sampling frequency of 25 Hz may be acceptable for flight testing the performance of a full-scale aircraft, it can be insufficient for a subscale aircraft with faster dynamics and operating between 0 and 120 m above the ground.
Researchers have approached this issue with a variety of solutions, including both standalone data acquisition systems and integrated flight control systems (autopilots) with data logging capabilities. Dantsker et al. [45] present an extensive review of different solutions used for research purposes by different organisations. Until recently, COTS solutions for data acquisition were limited to a few hobbyist-type systems such as the Eagle Tree Systems Flight Data Recorder [155] or to higher-end, industrial UAV systems such as the RCATS UAV [156]. COTS autopilot systems for UAV, such as the Cloud Cap Piccolo series [157], have also been used for data logging in several projects. The gap between these two levels was filled with a variety of custom-made data acquisition solutions, most of which were based on the integration of custom and COTS sensors of different types but resulted in similar architectures [44,56,69,95,138,139,158,159]. The need for an intermediate-level system, specifically adapted to SFT needs, has motivated the recent appearance of a few specialised COTS solutions such as the Al Volo FDAQ [45].
Over the last few years, in line with the expansion of the consumer and semi-professional UAV market, many open-source autopilot software projects have reached the community. In some cases, these projects also commercialise accompanying hardware with different capabilities and at different levels of integration. Paparazzi [160] and PX4/Pixhawk [161] are two well-known projects that are commonly used in research applications [12,20,40,127,162,163]. Open UAV-software development platforms such as DRONEKIT [164] and the Dronecode Foundation [165] also provide tools that facilitate a quick integration of data acquisition components and functions. The current availability of capable low-cost sensors and miniaturised processing boards, in combination with these software tools, enables the development of custom data acquisition solutions at very low cost, such as the system presented by Koeberle et al. [159]. The data acquisition system currently used in SFT experiments at Linköping University is also a good example of a low-cost, custom implementation based on a commercially available, open-source autopilot from the UAV segment. This data acquisition system integrates both COTS and custom-made instruments, while the flight controls are operated separately using a professional COTS RC system. The data system is built around a Pixhawk flight controller [161] running a modified version of the Ardupilot open-source software [166] for dedicated and enhanced logging capabilities. This controller is complemented by both COTS and custom-made transducers and communication modules. Figure 10 shows a general layout of the system components integrated into the GFF demonstrator. While telemetry is only used to monitor critical flight parameters during flight, both raw and fused sensor data are logged on board at sampling rates up to 100 Hz.
Some generic data acquisition system components (imaging, positioning, communication devices) are usually best sourced from the general electronics and UAV markets. However, more specialised flight-test instruments may still require customisation or vehicle-specific development. Airdata probes and flow-angle transducers in small subscale aircraft are a good example of this issue. The smallest five-hole probe solution commercialised by Aeroprobe Corporation [167,168] is widely used for low-AOA tests provided that there is enough volume to accommodate the hardware; however, few vane-based solutions exist at such small scales. Figure 11 shows two custom-made airdata systems developed for two different SFT applications: a high-speed, full-data model used on the GFF demonstrator, and a simpler, lightweight model used on a smaller test-bed aircraft.
Figure 11. Two airdata probes with contactless flow-angle transducers custom-made for two different applications: a rugged, full pitot-alpha-beta system for high-speed flight using the GFF demonstrator (top), and a low-weight, pitot-alpha system with 3D-printed parts developed for a low-speed test-bed aircraft (bottom).

Approaches to Data Analytics Issues
Even with efficient use of manoeuvring time and airspace, measurements from SFT at very low altitude may still suffer from significant contamination from air turbulence. In contrast to other more defined or coloured measurement noise sources, such as propulsion-system vibrations or actuators, turbulence can mix with the dynamic frequencies of interest and it can be challenging to filter out its effects. Besides flying in favourable weather conditions, the only possible mitigation lies in the data analysis method. In [169], Morelli et al. proposed a robust method for the identification of flight mechanical characteristics in turbulent environments and demonstrated it using NASA's GTM T-2 subscale aircraft. This method is based on system identification in the frequency domain and is intended for the evaluation of flight-mechanical characteristics. Other studies such as [170] followed a similar approach to develop a suitable identification method and confirmed that the influence of wind gusts and turbulence is more significant than that of sensor noise. In [94], the GFF demonstrator was used to develop and verify an enhanced identification method to cope with high levels of turbulence in the flight-test measurements.
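To give a concrete idea of what identification in the frequency domain involves, the sketch below applies a generic equation-error formulation to a toy first-order model, x' = a x + b u. It is not the method of [169], [170] or [94], and it does not model turbulence: it only shows how the measured signals are transformed at a chosen set of frequencies and how the model parameters are then obtained by least squares on the transformed data. The model, its parameter values, the excitation and all function names are assumptions made for illustration.

```python
import numpy as np


def simulate(a, b, u_func, duration_s, dt):
    """Integrate x' = a*x + b*u(t) from x(0) = 0 with the classical RK4 scheme."""
    n_steps = int(round(duration_s / dt))
    t = np.arange(n_steps + 1) * dt
    x = np.zeros(n_steps + 1)

    def xdot(ti, xi):
        return a * xi + b * u_func(ti)

    for i in range(n_steps):
        k1 = xdot(t[i], x[i])
        k2 = xdot(t[i] + dt / 2, x[i] + dt * k1 / 2)
        k3 = xdot(t[i] + dt / 2, x[i] + dt * k2 / 2)
        k4 = xdot(t[i] + dt, x[i] + dt * k3)
        x[i + 1] = x[i] + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return t, x


def finite_fourier(signal, t, omegas):
    """Rectangle-rule finite Fourier transform over [0, T) at the given frequencies."""
    dt = t[1] - t[0]
    return np.array([dt * np.sum(signal[:-1] * np.exp(-1j * w * t[:-1]))
                     for w in omegas])


def estimate_first_order(t, x, u, omegas):
    """Least-squares estimate of (a, b) from frequency-domain data."""
    X = finite_fourier(x, t, omegas)
    U = finite_fourier(u, t, omegas)
    # Transform of x' over a finite record: j*w*X plus the end-point terms.
    lhs = 1j * omegas * X + x[-1] * np.exp(-1j * omegas * t[-1]) - x[0]
    A = np.column_stack([X, U])
    # Stack real and imaginary parts so that the parameters stay real.
    A_real = np.vstack([A.real, A.imag])
    y_real = np.concatenate([lhs.real, lhs.imag])
    theta, *_ = np.linalg.lstsq(A_real, y_real, rcond=None)
    return theta


# Toy experiment: 20 s record, multisine input on ten harmonics (0.1 to 1 Hz).
T, dt = 20.0, 0.005
omegas = 2.0 * np.pi * np.arange(2, 21, 2) / T


def u_func(t):
    return sum(np.sin(w * t) for w in omegas)


t, x = simulate(a=-2.0, b=3.0, u_func=u_func, duration_s=T, dt=dt)
a_hat, b_hat = estimate_first_order(t, x, u_func(t), omegas)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

Restricting the regression to selected frequencies is what makes the formulation attractive for noisy data, but turbulence that overlaps the frequencies of interest still requires the dedicated treatments proposed in the cited works.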
Filtering signals, conditioning data, checking for consistency and estimating results are usual tasks performed in all sorts of flight testing, not only SFT. Organisations with flight-testing capabilities have usually developed proprietary tools to cover these tasks both online (in real time) and offline, but in most cases these are focused on full-scale vehicles and manned flight-testing operations. Software tools covering certain parts of the offline data analysis process have been released to the community for research (see NASA's SIDPAC for system identification [171]) and educational (see the collection of programmes offered in [172]) purposes. Further, NASA's open-source Open Mission Control Technologies (OpenMCT) visualisation framework is being increasingly used in SFT projects to manage, display and broadcast real-time telemetry data. Both commercial and open-source UAV ground-control software, such as Mission Planner [173], are also beginning to offer some degree of data analysis capability. While most of these tools are perfectly applicable to SFT experiments, there is currently no freely available, integrated flight-test data analysis solution that is specifically tailored to SFT needs.
In addition to a usually large mix of sensors of different types, SFT experiments sometimes involve quick changes of configuration or relatively frequent changes in the data acquisition system. The Aircraft Log Analysis (ALAN) software is a collection of tools for the MathWorks, Inc. (Natick, MA, USA) MATLAB environment (versions 9.2 and higher) that has been developed at Linköping University with these needs in mind. ALAN, currently in its second version and soon to be released as open code, covers a large portion of the usual post-flight analysis tasks. Figure 12 describes the workflow and the kinds of tasks that can be performed in this software. Particular importance has been attached to the data visualisation capabilities of the software. On the one hand, a comprehensive presentation of the flight data not only enables a better comprehension of the events but also allows a much quicker identification of segments of interest. In addition to typical plotting functions, this was achieved by presenting data in different forms of animation, including a virtual 3D reproduction of the vehicle during flight (Figure 13). On the other hand, an intuitive and visual display of generated results facilitates quicker evaluation and verification. Figure 14 presents an example where a set of manoeuvres is manually selected for parameter estimation by directly clicking over the plot area. The results obtained are then presented graphically, allowing a quick comparison between the different time-domain methods used.

The Role of SFT in Aircraft Development and Opportunities for Future Research
Aerodynamic, dynamic or aeroelastic scaling can be effectively used to complement and validate digital design tools and simulation models. However, as computing resources increase and estimation methods based on CFD become more sophisticated, interest in using SFT as a tool for estimating or verifying flight characteristics seems to be waning. Research activity during the last decade indicates that the main interest in SFT is shifting from a flight-characteristics estimator to broader use as a low-cost technology-testing platform. The use of demonstrative scaling (partially or entirely disregarding the physical similarity of the vehicle in order to evaluate a representative version of its technology) is becoming increasingly predominant. This approach seems especially appropriate for the early development stages, when the vehicle configuration is not yet well defined and accurate scaling is thus irrelevant.
In contrast to CFD simulations and wind-tunnel testing, the SFT method is not particularly well suited for the precise estimation tasks and minor design refinements typically carried out during the detailed design stages of aircraft development. SFT can instead play a much more valuable role during a pre-conceptual or conceptual-design phase. Figure 15 illustrates this idea by showing a possible application of SFT during the initial stages of a typical aircraft development process. In this figure, the areas highlighted indicate where SFT experiments can provide unique information or complement other design tools.
Nevertheless, some of the SFT method's previously discussed issues remain partially unresolved. Other aspects or potential applications of SFT are currently insufficiently explored, or lack a contemporary analysis. According to the knowledge gaps identified in this study, the following topics could be of great interest for future research:
• Implications of partial similarity and scaling inaccuracies on the measurability, fidelity and extrapolability of flight characteristics: Beyond basic aerodynamic considerations such as Reynolds number or compressibility deviations, the effects of not fulfilling other similarity parameters are still a controversial topic, especially if the purpose of the SFT experiment is to estimate the flight or handling characteristics of a full-scale vehicle. While this topic has been widely discussed in the wind-tunnel literature, little open information is available for free-flight models. Recent publications [8,135,174] show ongoing efforts to identify and quantify these effects using different approaches.
• Benefits of early subscale experimentation in the maturation of new technology using a demonstrative scaling approach: While the growing interest in using demonstrative subscale platforms to increase the technology readiness level (TRL) of new technologies may indicate that the method has a positive effect on the development process, no scientific studies have tried to interpret or quantify these benefits in comparison to other development strategies.
• Suitability of SFT for the evaluation of handling qualities with a human pilot in the loop: The usefulness of SFT for experimenting with automatic flight control laws is, at this point, indisputable. However, its suitability for obtaining human-pilot ratings of handling qualities is unclear. Earlier experiences from NASA [2] suggest that SFT may not be appropriate for this purpose, while Mandal et al. [70] suggest wide variations in pilot behaviour. Specific studies taking into account modern control and information augmentation systems would be desirable.
• Specific flight-testing methods, measurement and analysis techniques for efficient subscale experiments: The testing environment, procedures and even the measurement solutions often seem to play an important role in both the capabilities and the results of SFT. While this is a wide area ranging from unmanned aircraft operations to manoeuvre design and data acquisition techniques, its understanding is key to enabling efficient and useful SFT experiments.

Conclusions
There seems to be growing interest in using subscale flight testing as a research and development tool. Compared to traditional subscale flight testing platforms, recent advances in mechatronics, rapid prototyping and unmanned-vehicle technologies now seem to favour the use of smaller vehicles and low-cost systems.
The practical use of subscale flight testing for the replication of full-scale flight behaviour is clearly limited by unavoidable physical constraints derived from the similarity principles. Only a limited number of flow conditions can be explored with free-flight subscale models at low altitude without accounting for undesired scale effects. This, added to the convenience of computational methods, may explain why subscale flight testing is currently unpopular for studying purely aerodynamic problems. However, the use of subscale flight testing to study both rigid- and flexible-body flight dynamics is more common and constitutes an alternative to certain kinds of costly or unfeasible wind-tunnel tests. In these cases, aerodynamic (flow) similarity is typically relaxed or reduced to the replication of the most characteristic parameters.
The review of current research also shows that the main use of subscale flight testing is progressively shifting from a flight-characteristics estimator to a broader use as a low-cost technology-testing platform. The use of subscale models to evaluate a scaled version of a new technology or feature, partially or entirely disregarding vehicle similarity, is becoming increasingly common. This practice is defined here as a demonstrative scaling approach.
Regarding flight testing methods, the subscale flight testing projects examined present large differences in equipment, facilities, procedures and, hence, costs. In most cases, the requirement to operate within visual line-of-sight appears to be the main factor constraining the potential and efficiency of the experiments. Specific flight-testing solutions for visual line-of-sight are evaluated and suggested.
The general lack of specialised testing instruments, avionics and analysis tools for subscale platforms is being progressively filled by industrial systems for unmanned aerial vehicles on one side, and by purpose-built systems mixing commercial off-the-shelf and custom-made solutions on the other. The recent proliferation of mass-produced and open-source components for the consumer 'drone' segment has also facilitated the development of economical solutions. A low-cost data acquisition system with commercial off-the-shelf components and custom-made instruments is presented here as an example.
Considering the current capabilities as well as the volume and diversity of the findings obtained with subscale flight testing in recent years, it seems clear that this method can still play a valuable role during the early stages of aircraft design and at low technology readiness levels. Further, its use can be expected to grow as the unmanned-systems market and other enabling technologies offer more capabilities at lower costs.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations
The following nomenclature is used in this manuscript: