A Review of Validation Methods for the Intracranial Response of FEHM to Blunt Impacts

Abstract: The following is a review of the processes currently employed when validating the intracranial response of Finite Element Head Models (FEHM) against blunt impacts. The authors aim to collate existing validation tools, their applications and findings on their effectiveness to aid researchers in the validation of future FEHM and potential efforts in improving procedures. In this vein, publications providing experimental data on the intracranial pressure, relative brain displacement and brain strain responses to impacts in human subjects are surveyed and key data are summarised. This includes cases that have previously been used in FEHM validation and alternatives with similar potential uses. The processes employed to replicate impact conditions and the resulting head motion are reviewed, as are the analytical techniques used to judge the validity of the models. Finally, publications exploring the validation process and factors affecting it are critically discussed. Reviewing FEHM validation in this way highlights the lack of a single best practice, or an obvious solution to create one using the tools currently available. There is clear scope to improve the validation process of FEHM, and the data available to achieve this. By collecting information from existing publications, it is hoped this review can help guide such developments and provide a point of reference for researchers looking to validate or investigate FEHM in the future, enabling them to make informed choices about the simulation of impacts, how they are generated numerically and the factors considered during output assessment, whilst being aware of potential limitations in the process.


Introduction
Mild Traumatic Brain Injury (mTBI) is widely known to have a significant cost to society [1][2][3] but, compared to injuries in other parts of the body, is not well understood. As such, head impacts and injuries are continuously researched, particularly in relation to public and automotive safety and sporting injuries. Over the past two decades or so, Finite Element Head Models (FEHM) have become a key tool in investigating the intracranial effects of blunt force impacts and studying potential injury risk factors. Numerous FEHM have been developed, representing the brain and surrounding tissues in a number of ways, the most prominent of which have been comprehensively reviewed [4][5][6][7]. Background theory around modelling the human head is also well documented [8,9]; however, challenges remain, particularly relating to the representation of the brain tissue and the interface between the brain and the skull.
Brain tissue has complex mechanical properties and presents significant challenges during testing. As such, identification of its mechanical behaviour and subsequent constitutive modelling is an extensively discussed topic [10][11][12][13][14]. In relation to FEHM, the result of this is that a range of viscoelastic and hyperelastic material models have been implemented to govern the behaviour of the brain. Within a single model type, characteristic material properties can vary by an order of magnitude and it has been shown that both the type of material model [12,15] and the material properties within [16,17] can have significant effects on the impact responses of FEHM.
The treatment of the interface between the brain and the skull also has a significant influence on the impact response experienced by the brain. The complex system of membranes and cerebrospinal fluid (CSF) forming this interface is necessarily simplified in FEHM. While research has been conducted into including genuine fluid representations [18][19][20], the CSF has generally been modelled as a combination of contact algorithms and "fluid-like" solids [5][6][7]. As with the brain tissue, various factors in the CSF/brain-skull interface representation have been shown to affect the impact response of the brain in FEHM [21][22][23][24][25].
As a result of these uncertainties, there is a lack of a "gold standard" or uniformly accepted way of representing the major components of FEHM. Furthermore, design choices such as the level of tissue differentiation [21], the overall size [21,26] and geometry [27][28][29] of the model and the simulation environment and element formations used [30] affect the FEHM performance characteristics. Validation of the complete assembly is therefore essential, especially when considering the intracranial response of models, which is the key point of interest when investigating mTBI and concussion type injuries using FEHM. However, despite following similar validation processes, studies comparing the responses of FEHM have found the reactions of different models to similar impacts to be incomparable [17,31,32,33].
This review aims to assist researchers in making informed choices during the development and validation of future FEHM by establishing the range of possible tools available to them. Furthermore, by including research efforts and discussions around the methods previously applied during FEHM validation and approaches to assessing model performances, this review hopes to encourage more consistency in validation protocols, moving towards best practices within the materials available. While other factors in FEHM may cause differences in behaviour, collating the information currently available around FEHM validation may improve understanding of the contribution of validation procedures to this. Previous reviews [4][5][6][7] and instructional texts [8,9] have largely focused on providing information and opinions on the properties within FEHM. This review hopes to complement such works by providing similar material on the validation stages, completing the model development process.
The process of validating FEHM consists of replicating, in the numerical environment, impacts created in a laboratory environment on human head samples, either as live subjects or cadaver specimens. Responses recorded in the empirical method are then compared to equivalents in the simulation environment. The intracranial response of FEHM has been validated against intracranial pressure (ICP), relative brain displacement and brain strain behaviours, with the vast majority of existing FEHM using a small group of empirical studies: Nahum et al. [34] and Hardy et al. [35,36]. However, the exact cases taken from each, their method of application to simulations and the response evaluation vary significantly. These and other experimental sources previously applied during FEHM validation are summarised, with particular focus on the data that allow impacts to be replicated within simulation environments and on the impact response data provided. Potential alternative sources offering similar validation potential through altered methodology are also introduced. Later sections consider the implementation of these empirical data. Section 3.1 identifies the specific cases against which existing FEHM are validated, while Sections 3.2 and 3.3 discuss the methods used to generate impact scenarios during simulations and how outputs are compared with and assessed against the empirical data. Finally, works considering the validation process and its implementation are highlighted in Section 4.

Experimental Procedures Used in FEHM Validation
The first component of any validation is identifying experimental data to compare numerical performance to. When validating the intracranial response of FEHM, these data are drawn from experimentation performed on either post-mortem human subjects (PMHS), also known as cadavers, or live human subjects. Each has its advantages and limitations. Cadavers can be directly instrumented and subjected to impacts of any severity, including those that would result in injury. However, they are post-mortem. All the experiments discussed in this work [34][35][36][37][38][39][40] used unembalmed cadavers, but these still underwent preparation processes and were repressurised with a CSF substitute, which could alter conditions. Furthermore, the properties of brain tissue may change post-mortem [41]. Experimentation on in vivo subjects, on the other hand, is ethically limited to scenarios that do not put volunteers at risk of injury, and monitoring the impact response has to be done via external methods, limiting the data that can currently be collected.
The following sections summarise experimental procedures that have previously been applied in the validation of FEHM, using both cadaver and in vivo subjects. Within each subject type, experimental procedures are grouped according to the impact response metric they predominantly investigate. The experimental procedures used are briefly outlined and discussed, focusing on the data available to both generate impacts in the simulation environment and to assess and compare impact responses. Additional experimental investigations that provide potential alternatives to the impact data are also introduced and compared to the existing options.

Intracranial Pressure Response
The Intracranial Pressure (ICP) response is the most frequently considered FEHM validation metric. Practically every FEHM in the public domain has replicated impact conditions from Nahum et al. [34] and many supplement this with a scenario from Trosseille et al. [38]. Both Nahum et al. [34] and Trosseille et al. [38] place transducers at various locations in the sub-dural region of re-pressurised, un-embalmed cadaver samples. The cadavers are then secured and impacted by a piston travelling at a constant velocity.
Nahum et al. [34] conducted 15 impacts in total, across two investigations. In the first investigation they impacted eight cadaver samples once, changing the mass, surface material and impact velocity of the piston to study the effect of impact characteristics on the ICP response. These are listed by Nahum et al. [34] as experiments 36 to 38, 41 to 44, and 54. In the second investigation they impacted a single cadaver seven times with the same piston at five different velocities to consider the relationship between head acceleration and ICP peaks. These are experiments 46 to 52. All the cadavers had transducers fitted in the frontal, parietal and occipital sub-dural regions, and in the posterior fossa. In two cases (tests 36 and 37) transducers were located bilaterally in the occipital region to monitor the symmetry of the pressure response. The second occipital transducer in experiment 36 was also placed in the epidural layer (all other transducers were sub-dural). In further experiments, transducers were placed adjacent to each other in the posterior fossa (tests 46 to 52) and occipital region (36) to monitor experimental procedure. The pressure in the carotid siphon was also monitored for experiments 41 to 54. The location of each transducer is described in relation to skull sutures with the exception of the frontal transducer, which is placed adjacent to the impact site. Exact distances from a datum are, however, not detailed. The cadaver head was secured in a forward tilted position to allow the horizontally travelling piston to impact the front of the head, in line with the mid-sagittal plane and 45° above the Frankfurt plane, targeting the crown of the forehead. For each impact, Nahum et al. [34] measured and reported the mass and velocity of the piston prior to impact and the peak values for impact force and linear acceleration, alongside severity measures for each impact (e.g., GSI [42], HIC [43] and injury damage codes).
The peak pressures recorded by each transducer are also given for the first investigation, as summarised in Table 1. Case 37 is reported in more detail, where each response metric is plotted throughout the duration of the impact, as illustrated in Figure 1. Consequently, this is the most commonly replicated scenario in FEHM validation.
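The severity measures quoted alongside each impact can be reproduced from a resultant linear acceleration trace. The sketch below is illustrative only. It assumes the commonly quoted definitions of GSI (the time integral of a^2.5 over the pulse) and HIC (the maximum, over all time windows, of the window length multiplied by the mean acceleration raised to the power 2.5), with acceleration in g and time in seconds. The function names, the 15 ms window limit and the synthetic constant pulse are assumptions for illustration and are not values taken from Nahum et al. [34].

```python
import numpy as np

def gadd_severity_index(t, a_g):
    """GSI: trapezoidal integral of |a(t)|**2.5 (a in g, t in s)."""
    y = np.abs(a_g) ** 2.5
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def head_injury_criterion(t, a_g, max_window=0.015):
    """HIC: brute-force search over all windows [t1, t2] up to max_window long."""
    # Cumulative trapezoidal integral, so any window integral is a difference.
    steps = 0.5 * (a_g[1:] + a_g[:-1]) * np.diff(t)
    cum = np.concatenate(([0.0], np.cumsum(steps)))
    best = 0.0
    for i in range(len(t) - 1):
        for j in range(i + 1, len(t)):
            dt = t[j] - t[i]
            if dt > max_window:
                break  # later j only lengthen the window further
            mean_a = (cum[j] - cum[i]) / dt
            if mean_a > 0.0:
                best = max(best, dt * mean_a ** 2.5)
    return float(best)

# Synthetic check case (hypothetical): constant 50 g held for 10 ms.
t = np.linspace(0.0, 0.010, 101)
a = np.full_like(t, 50.0)
hic = head_injury_criterion(t, a)
gsi = gadd_severity_index(t, a)
```

For a constant pulse both measures reduce to the duration multiplied by a^2.5, which gives a simple sanity check before applying the functions to digitised traces such as those in Figure 1.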
Prior to the study discussed above, Nahum and Smith [37] conducted a series of experiments following a similar protocol, with a static cadaver head impacted in the frontal region. Ten cadavers were impacted once each by a piston of mass between 5.31 and 5.39 kg at velocities between 3.96 and 9.70 m/s with a variety of padding materials on the impactor. During the tests, the impact force, accelerations of the impactor and cadaver arterial and CSF pressures were monitored and the impact force and kinematic response of each impact reported. Pressure responses, however, were not, as readings were distorted by cannulas in the assembly. After impact, the brain tissue of each cadaver was inspected for contusion and any observed damage was described in detail. Five of the 10 cases resulted in no physical injury. The remaining five presented evidence of haemorrhage and/or vascular rupture in various locations.
The five no-injury cases and one injury example (presenting coup contusion) from Nahum and Smith [37] were used during the validation of the GHBMC (Global Human Body Models Consortium) FEHM [44]. The six impacts were simulated and the peak pressure response magnitudes compared to the injury outcome from the original experiment. A significantly higher peak pressure of 227 kPa was recorded for the case presenting injury than for the non-injuring group, where it was 136 ± 55 kPa. This was used to indicate that a contusion threshold could be developed through future work.
Trosseille et al. [38] studied the intracranial response of three cadavers across six impacts with the intention of creating a tool to validate FEHM, particularly for use in automotive crash research. These authors also conducted an investigation into factors influencing the ICP response. Two of the cadavers (labeled as MS408 and MS428 in the original publication) had pressure transducers fitted in the arachnoid space of the frontal, parietal and occipital regions. Further transducers were inserted into the third and lateral ventricles in test MS428. Accelerometers were fitted to the skull and set into the brain tissue in four locations to compare the relative motion of the bodies. Limited preparation time was available for cadaver MS429 so instrumentation was reduced to a single intracerebral accelerometer and frontal and occipital pressure transducers. The exact location of each piece of instrumentation is reported by Trosseille et al. [38], confirmed via an X-ray image of each cadaver taken after transducer installation. Impacts were generated by striking a stationary cadaver with a 23.4 kg impactor at velocities between 5 and 7 m/s. Trosseille et al. [38] summarised the conditions of each impact by location, velocity and the profile of the impactor. Impact responses are reported and described through the peak linear accelerations of the skull in the x (posterior-anterior) and z (inferior-superior) directions and the peak rotational acceleration and velocity about the y (medial-lateral) axis. The intracerebral acceleration peaks are given in a single direction at each transducer as well as the peak ICP at each monitored location. Table 2 lists the reported impact conditions, whole head (skull) accelerations and pressures for each test. Further to this, the full time-history recordings of each transducer are included as appendices, as exemplified for MS428-2 in Figure 2.
Despite there being sufficient information available to replicate all the scenarios in Table 2, test MS428-2 is the sole impact to have been applied to the validation of prominent FEHM. Of the scenarios created by Trosseille et al. [38], this case presents the most severe kinematics and is performed on the most completely instrumented cadaver sample. Furthermore, the impact location, direction and velocity of the remaining cases are relatively similar to those of test MS428-2, as are the approximate relationships between acceleration measurements.

Relative Brain Displacement Response
FEHM have frequently been validated against the relative motion between the brain and skull recorded in the cadaver experiments of Hardy et al. [35,36]. These authors used biplanar X-ray technology to track target points in the brain and skull during a total of 45 impacts at velocities between 2.5 and 3.9 m/s. The kinematic response of each impact was recorded through accelerometers fitted to each cadaver's skull. The target points in the brain were Neutral Density Targets (NDTs): tin granules encased in polystyrene, providing an X-ray opaque node of similar density to the surrounding tissue that could be traced during impacts without damaging the tissue.
In their earlier study [35], three cadavers (C731, C755 and C383) were impacted in the frontal or occipital region. Cadavers C731 and C755 had usable data collected from six impacts between them, mainly to the occipital region, with the exception of C755-T5 which was a frontal impact. These impacts were created by accelerating a static cadaver by striking it with a piston. The six cases had comparable kinematic responses with average peak resultant accelerations of 22g ± 5g and 1718 ± 583 rad/s² (mean ± SD). C383 was subjected to four deceleration impacts, three of which were to the frontal region and one (test C383-T4) to the occipital area. More variation was seen across these cases, where the sample was decelerated against a fixed block, with peak kinematic responses of 69g ± 24g and 4539 ± 3439 rad/s². The difference between acceleration and deceleration responses was attributed to the impact method [35]. The only occipital deceleration impact (test C383-T4) displayed significantly higher peak resultant accelerations (108g and 10,482 rad/s²), contributing to the spread in deceleration kinematics.
In this study by Hardy et al. [35], two vertical columns of NDTs were inserted into the anterior and posterior regions of a single hemisphere of the brain of each cadaver sample with 7-12 mm between each NDT in each column. In this arrangement, the key NDTs refer to the inferior-most and superior-most target of each column.
The 2007 study [36] focused only on deceleration impacts, but generated a greater range of kinematic responses. During deceleration impacts, cadaver specimens were suspended from a carriage on rails. The cadaver was mounted at the neck such that it could rotate about its mounting but translation was restricted to along the rails, creating an oblique impact. The initial orientation of the cadaver was adjusted to create impacts in different directions. A piston then brought the complete apparatus to the desired speed before impact. As well as introducing lateral motion by impacting the temporal and parietal regions of two cadavers (C380 and C393), approximately half the impacts across the study were offset from the anatomical axes, altering the linear and rotational motion of the cadaver. Of the 25 impacts conducted on cadavers suitably instrumented for validation, 15 had a helmet fitted to the sample. As well as enabling investigations into the effects of protection, this further altered the impulses experienced by the cadaver heads. The properties of each impact across both studies are summarised in Table 3. Impacts conducted on two cadavers were not deemed suitable for FEHM validation so are not included in the 25 cases considered. Cadaver C427 had instrumentation failure preventing meaningful data collection in terms of simulating replications, and cadaver C408 was instrumented to study the brain motion at the cortex periphery, near the brain-skull interface during occipital impacts. Uncertainty around the behaviour in this area compared to deeper regions of the brain, as discussed by Hardy et al. [36], render this example less well suited to FEHM validation compared to other similar impact cases provided within the same work.
The NDT configuration was different in this investigation [36] from the earlier work [35]. Instead of columns, NDTs were arranged in clusters of seven, creating an array of triads about a central NDT with an approximate radius of 10 mm. The placement of the clusters was dependent on the direction of impact. Four cadavers (C288, C241, C015 and C064) had clusters in the anterior and posterior regions of the right brain hemisphere and consequently experienced occipital and frontal impacts. Two more (cadavers C380 and C393) had clusters in the parietal-frontal region of each hemisphere and were subjected to lateral impacts. In this arrangement the central NDT of each cluster was the key point. Pressure transducers were also inserted within the brain tissue of cadavers used in the later study [36], approximately at the expected coup and contrecoup locations. Impact characteristics are presented in similar ways across both of these studies [35,36]. Tables summarise the impact location and speed (in Hardy et al. [35] only) and peak values for linear accelerations, rotational accelerations and rotational velocity. The cadaver motion is also described via plots of these kinematic parameters at the samples' centre of gravity, as exemplified for test C380-T2 [36]. The motion of the NDTs is recorded in two ways. The first, shown in Figure 3a, traces the co-ordinates of each NDT in a given plane, showing their path of motion from the impact. The second, shown in Figure 3b, maps the displacement of key NDTs relative to that of the skull, along directions parallel to the anatomical axes. For cases in the later study [36], the pressure recorded throughout the impact is plotted, as are principal and shear strains calculated in each NDT cluster. Not all data were successfully captured for every impact scenario. Table 3 indicates which data are available and reported for each cadaver impact [35,36].
The strain calculations provided by Hardy et al. [36] were made possible by the cluster arrangement used empirically. Strains were calculated from the relative displacements of the NDTs within each cluster. To achieve this, nodes were created in the LS-Dyna explicit simulation environment representing each NDT in its initial location. Each node was connected to its immediate neighbours to form a group of triangular elements representing the NDT cluster. The displacement histories of each NDT were applied to the relevant nodes, replicating their relative motion and allowing the strain across the triangular elements to be calculated. However, a later re-examination of the data by Zhou et al. [45] found many cases had missing or low quality data affecting strain calculations. Once these were sifted out, data from 15 clusters across 14 impacts and five cadavers were deemed suitable for use. Furthermore, even within this group the strain measures required recalculation. NDTs were once again modelled as nodes in LS-Dyna, connected to their neighbours to form triangular elements and subjected to the relevant displacements. Zhou et al. [45] then considered the Green-Lagrangian strain in each element and registered the mean across the cluster as the indicative strain measure. These adjustments resulted in average principal strains increasing by up to 201% and shear strains decreasing by up to 247%, on average, across the 15 sifted clusters. Strain-rates were also affected by 40-49%.
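The strain calculation described above can be sketched for a single planar triad. This is a minimal sketch and not the LS-Dyna procedure used by Hardy et al. [36] or Zhou et al. [45]: it assumes a constant-strain triangle whose three nodes are NDT positions in one imaging plane, and computes the deformation gradient, the Green-Lagrangian strain tensor, its principal values and the maximum (tensorial) shear strain. The function names and coordinates are hypothetical.

```python
import numpy as np

def green_lagrange_triangle(ref_xy, cur_xy):
    """Green-Lagrangian strain of a constant-strain triangle.

    ref_xy, cur_xy: (3, 2) nodal coordinates before and after deformation.
    """
    ref_xy = np.asarray(ref_xy, dtype=float)
    cur_xy = np.asarray(cur_xy, dtype=float)
    # Edge vectors from node 0, stored as columns (reference dX, current dx).
    dX = (ref_xy[1:] - ref_xy[0]).T
    dx = (cur_xy[1:] - cur_xy[0]).T
    F = dx @ np.linalg.inv(dX)           # deformation gradient, dx = F dX
    return 0.5 * (F.T @ F - np.eye(2))   # E = (F^T F - I) / 2

def principal_and_shear(E):
    """Return (max principal, min principal, max tensorial shear) strain."""
    e_min, e_max = np.linalg.eigvalsh(E)  # eigenvalues in ascending order
    return e_max, e_min, 0.5 * (e_max - e_min)

# Hypothetical triad (mm): node 1 displaced 1 mm along x, a 10% stretch.
ref = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
disp = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]])
E = green_lagrange_triangle(ref, ref + disp)
e1, e2, gamma_max = principal_and_shear(E)
```

Applying the same function at every time step of the NDT displacement histories, and averaging the element strains over all triangles in a cluster, would give a cluster-level measure in the spirit of the mean reported by Zhou et al. [45]. Because the Green-Lagrangian measure is unaffected by rigid rotation, the skull's rotation during an impact does not by itself register as strain.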
The predicted strains further increased when, instead of groups of planar triads, NDT clusters were considered as groups of tetrahedral elements in another re-examination by Zhou et al. [46]. NDTs were excited via their displacements as before to generate the strain response from experimental scenarios. The strains calculated by this method were compared to the responses of five FEHM variations from Atsumi et al. [25], Mao et al. [44] and Kleiven [47]. Nodes within the volume occupied by NDT clusters were identified. The FEHM were subjected to impacts and their displacement fields recorded. The simulated displacement was then applied to the nodes representing the NDT clusters, connected both by triangular and tetrahedral elements. The strains generated in these isolated clusters were compared to the strains recorded in the complete FEHM. The tetrahedral formation was found to show significantly better agreement with the FEHM simulation responses across all five model variations (peak strain difference < 0.01) than the triad outputs, where the average peak difference was 0.03. It is therefore proposed that the Green-Lagrangian strain responses calculated from tetrahedral cluster formations should be used as the empirical strain reference data during FEHM validation [46]. An example comparison of the strain calculated by Hardy et al. [36] and Zhou et al. [45,46] for impact test C380-T2 is shown in Figure 3b. With the exception of test C288-T3, only one NDT cluster per experiment had data suitable for recalculation, so the recalculated data depict the strain response in only one region of the brain.

Across the 36 tests summarised in Table 3, only seven (tests C288-T3, C380-T1, -T2, -T3, -T4, -T6 and C393-T4) have complete data sets, with respect to brain motion and strain. The short duration of the data available for test C380-T5 limits its potential to be used in validation of FEHM. Of these seven, only test C288-T3 represents an impact primarily in the anterior-posterior direction due to its occipital location. The remaining six are lateral impacts to the temporal and parietal regions. There is, however, a spread of aligned and offset cases, and cases with and without a helmet present. A further ten cases (tests C755-T2, -T3, -T5, C383-T1, -T3, -T4 from Hardy et al. [35], and tests C288-T1, -T2, -T4 and C064-T2 from Hardy et al. [36]) provide enough data to replicate the impacts numerically and analyse the brain motion. For test C288-T4, where NDT motion data are available but kinematics are not, the impact speed is published (3.0 m/s), as are details of the deceleration block the specimen collides with [36]. This additional detail is not available in Hardy et al. [35], preventing tests without kinematic data from being replicated (i.e., tests C731-T1, -T2, C755-T4 and C383-T2), regardless of the availability of NDT motion. The ten cases include five deceleration scenarios to the occipital region, two decelerations to the frontal region and three acceleration cases, two occipital and one frontal. The cases reported in Hardy et al. [36] represent unprotected/protected impacts and aligned/offset impact locations in approximately even proportions within the selection. Another six tests have adjusted strain data reported by Zhou et al. [45] (tests C241-T5, -T6, C064-T1, C393-T1, -T2 and -T3), despite NDT motion data not being published by Hardy et al. [36] for these cases. While only a small number of tests provide information allowing complete brain displacement and strain comparison, combinations of tests can be applied to FEHM and considered independently to build up a more comprehensive validation tool.

Table 3. Summary of impact properties and reported cadaver response features across experiments conducted by Hardy et al. [35,36], excluding those performed on C408 and C472 [36], which were considered unsuitably instrumented for replication in a validation environment. Data are only considered if full time-histories are published. Peak response magnitudes are more completely reported in [35,36]. Brain strain measures marked as reported correspond only to those which were corrected in later work by Zhou et al. [45].

Biplanar X-ray and radio-opaque targets within the brain were also used by Guettler et al. [39]. The previously discussed methods, which generate head motion through direct impacts, introduce a high degree of variability in the kinematic response of the head. Guettler et al. [39] instead generated an impulse by fixing the cadaver specimen into a high powered articulation device. This generated an angular velocity impulse following a half sine shape, with a peak velocity of 41.2 rad/s over 42.1 ms, providing a peak rotational acceleration of approximately 4000 rad/s². Four repeats of this test were quantitatively compared to their average behaviour using two rating systems and showed excellent correlation, with an average normalised root mean squared deviation (NRMSD) of 4.0% and an average CORA (Gehre et al. [48]) score of 0.955. The angular speed over a period of 100 ms for each repetition and the idealised pulse are provided as potential input data for the validation of FEHM.
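The repeatability figures quoted above compare each repeat with the average response. A minimal sketch of one such comparison is given below. It assumes that NRMSD is the root mean squared deviation normalised by the range of the reference signal; Guettler et al. [39] may use a different normalisation, and the CORA rating is not reproduced here. The traces are synthetic placeholders, not the published pulses.

```python
import numpy as np

def nrmsd(signal, reference):
    """RMS deviation from the reference, divided by the reference range.

    The range normalisation is an assumed convention.
    """
    signal = np.asarray(signal, dtype=float)
    reference = np.asarray(reference, dtype=float)
    rmsd = np.sqrt(np.mean((signal - reference) ** 2))
    return float(rmsd / (reference.max() - reference.min()))

def synthetic_pulse(t, peak, duration=0.04):
    """Hypothetical half-sine angular velocity pulse, zero after `duration`."""
    return np.where(t <= duration, peak * np.sin(np.pi * t / duration), 0.0)

t = np.linspace(0.0, 0.1, 201)                 # 100 ms record
peaks = (40.0, 41.0, 42.0, 41.0)               # four hypothetical repeats, rad/s
repeats = np.array([synthetic_pulse(t, p) for p in peaks])
mean_trace = repeats.mean(axis=0)              # average behaviour
scores = [nrmsd(r, mean_trace) for r in repeats]
average_score = float(np.mean(scores))
```

The same function could be applied to a simulated NDT displacement trace against its experimental counterpart, provided both are resampled onto a common time base first.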
The motion of the brain was tracked via tin targets placed throughout the specimen tissue, without the coatings used by Hardy et al. [35,36], as the targets were small enough not to cause damage to the surrounding tissue. NDTs were inserted in each hemisphere of the brain, spaced according to fractions of the brain size with the intention of creating a scalable approach to improve the consistency of validation across FEHM of various sizes. Six NDTs were implanted in the left hemisphere in a radial pattern, arching out from the centre of the Frankfurt plane, while the right hemisphere had nine targets inserted according to a grid system. Figure 4 illustrates these spacings and the proportional measurements of the target locations. As with the previously discussed works [35,36], the impact response is represented via motion traces of NDTs and their relative displacements parallel to the anterior-posterior and inferior-superior axes. The controlled nature of the impact meant that the behaviour of a given target could be compared across impact repeats. The NDTs had an average CORA score of 0.846 and an NRMSD of 9.0% between repetitions, indicating very good correlation between the impact conditions and responses. The frontal and occipital ICP response was also monitored by transducers fixed in the skull and the recordings for all four impulse repeats at each location are provided. At the time of writing, the authors are unaware of this experimental procedure being applied in any FEHM validation studies.

Figure 4. NDT arrangements used by Guettler et al. [39]. The radial arrangement (blue) has NDTs implanted at 20 mm depth intervals measured from the outer surface of the skull (d), radiating 20° anterior and posterior from the mid coronal plane, and is implemented in the left hemisphere of the brain. The grid arrangement (green) places columns of NDTs spaced 1/3 of the distance from the mid coronal plane to the anterior edge of the skull (L1) and half way between the mid coronal plane and the posterior limit (L2). The space between the rows of each column is 1/6 of the height from the Frankfurt plane to the outer skull surface at the mid coronal plane (H).
Sonomicrometry was used as an alternative method to biplanar X-ray by Alshareef et al. [40,49] and applied during the validation of an updated version of the GHBMC FEHM [44,50]. Sonomicrometry monitors the time ultrasound pulses take to travel between piezo-electric crystals. Knowing the speed of sound in a material allows distances to be calculated and setting up arrays of crystals allows their location to be determined. Alshareef et al. [40] inserted 24 receiver crystals into the brain tissue, with a further eight fixed to the skull as transmitters. The location of each receiver was then monitored through a series of controlled rotational impulses. A pneumatic actuator was used to generate angular velocity impulses with a haversine shape profile. The peak velocity magnitude was set to either 20 and 40 rad/s and the pulse duration to 30 and 60 ms, respectively. The four possible magnitude/duration combinations were applied and the 40 rad/s, 30 ms test was repeated a second time in each direction. While impulses were applied about each anatomical axis, the impulse rotational velocity and acceleration profiles, and subsequent results are only provided for the coronal scenario. The repeated tests of the highest severity impulse were shown to have excellent similarity through both CORA analysis with an average score of 0.93 (see Section 3.3) and an average difference in root mean square velocity magnitudes of 1.2 ± 0.68 rad/s. The displacement response of the brain was again presented similarly to seen in earlier works [35,36] with in situ motion traces for each receiver crystal from each impulse scenario. The relative displacement of three receivers are also presented across all the impact scenarios as displacement magnitudes against time. Alshareef et al. [49] employ similar techniques, using 30 sonomicrometry crystals in the brain and six control crystals on the skull of six cadaver samples. 
These were subjected to controlled rotational impulses with similar properties to those used in the 2018 study. However, analysis focused on the relationship between the impulse kinematics and the physical response of the brain and, as such, the data provided are of limited potential when considering FEHM validation. Overall, the sonomicrometry technique allows the relative displacements to be presented along all three anatomical axes, whereas biplanar X-ray techniques are limited to monitoring two directions. This work has been applied to investigations into the advancement and application of the GHBMC head model [50,51].
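The haversine angular velocity pulses described above can be reproduced numerically when preparing boundary conditions for a simulation. The following is a minimal sketch, assuming the common haversine form ω(t) = ω_peak·sin²(πt/T) over the pulse duration T; the function names and sampling density are illustrative choices and are not taken from Alshareef et al. [40].

```python
import numpy as np

def haversine_pulse(peak_velocity, duration, n_samples=301):
    """Angular velocity history (rad/s) of an assumed haversine pulse.

    peak_velocity : peak angular velocity in rad/s
    duration      : pulse duration in seconds
    """
    t = np.linspace(0.0, duration, n_samples)
    omega = peak_velocity * np.sin(np.pi * t / duration) ** 2
    return t, omega

def peak_angular_acceleration(peak_velocity, duration):
    """Analytical peak of d(omega)/dt for the pulse above: pi * peak / duration."""
    return np.pi * peak_velocity / duration

# The four magnitude/duration combinations reported in the text.
combinations = [(20.0, 0.030), (20.0, 0.060), (40.0, 0.030), (40.0, 0.060)]
for peak, dur in combinations:
    t, omega = haversine_pulse(peak, dur)
    alpha = np.gradient(omega, t)  # numerical angular acceleration (rad/s^2)
    print(f"{peak:4.0f} rad/s, {dur * 1e3:3.0f} ms: "
          f"peak acceleration ~ {np.abs(alpha).max():7.1f} rad/s^2 "
          f"(analytical {peak_angular_acceleration(peak, dur):7.1f})")
```

Under this assumed pulse shape, halving the duration doubles the peak angular acceleration for the same peak velocity, which is why the 40 rad/s, 30 ms case is the most severe of the four.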

In Vivo Experimentation
The in vivo response of the brain to impulsive loads can be monitored using tagged Magnetic Resonance Imaging (MRI) techniques. Bayly et al. [52], Sabet et al. [53] and Feng et al. [54] conducted a series of experiments on live volunteer subjects, monitoring the response of the brain during mild frontal [54] and occipital [52] impacts and rotational impulses [53] in MRI scanners. Knutsen et al. [55] conducted further experimentation into rotational impulses, looking to optimise data acquisition. Since these works were published, the potential of tagged MRI techniques has been explored in a number of ways. This section reviews the experimental methods and data that have already been used in relation to FEHM validation and research. Other works with similar potential are also mentioned.
Tagged MRI uses radio-frequency pulses to superimpose a grid of "tag lines" onto an MRI scan, which then move with the brain tissue. The locations of intersections between tag lines can be mapped into a regular mesh by comparing the location of each intersection to its neighbours. This mesh deforms as the tag lines move with the brain tissue during impacts. Comparisons of mesh configurations at each time-step to the original mesh allow a deformation gradient tensor to be calculated for a complete slice through the brain. This can be used to calculate the deformation and displacement response of a complete brain slice.
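To illustrate how a deformation gradient follows from the tag-line mesh, the sketch below treats one triangular cell of the mesh in 2D. It assumes a constant deformation gradient within the cell and reports the Green-Lagrange strain; both the element type and the strain measure are assumptions made for illustration and may differ from the processing used in the cited studies.

```python
import numpy as np

def deformation_gradient(ref_nodes, cur_nodes):
    """Constant 2D deformation gradient F of a triangle.

    ref_nodes, cur_nodes : (3, 2) arrays of node coordinates in the
    reference (undeformed) and current (deformed) configurations.
    """
    ref = np.asarray(ref_nodes, dtype=float)
    cur = np.asarray(cur_nodes, dtype=float)
    dX = (ref[1:] - ref[0]).T  # columns are reference edge vectors
    dx = (cur[1:] - cur[0]).T  # columns are deformed edge vectors
    return dx @ np.linalg.inv(dX)

def green_lagrange_strain(F):
    """Green-Lagrange strain tensor E = 0.5 * (F^T F - I)."""
    return 0.5 * (F.T @ F - np.eye(F.shape[0]))

# Hypothetical cell: 5% stretch along x combined with a small shear.
reference = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
deformed = [[0.0, 0.0], [1.05, 0.0], [0.02, 1.0]]

F = deformation_gradient(reference, deformed)
E = green_lagrange_strain(F)
principal = np.linalg.eigvalsh(E)  # principal strains, ascending order
print("F =\n", F)
print("E =\n", E)
print("principal strains:", principal)
```

Repeating this calculation for every cell and every time-step gives the strain field across the slice; a rigid translation of the cell leaves F equal to the identity and E equal to zero.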
Sabet et al. [53] and Feng et al. [54] each subjected three live male volunteers, aged 22 to 45 years, to mild impacts while acquiring MRIs of their brains. Sabet et al. [53] considered rotational impacts by having the subjects lie supine, with their head in a cradle within an MRI scanner. The subjects released a weight causing the cradle to rotate about their inferior-superior axis. A stopper was then impacted, causing a rotational deceleration impulse, approximately 200 ms after the initial release. Feng et al. [54] considered predominantly linear motion by having subjects lie prone with a cradle lifting their forehead and chin. Subjects released their head supports causing a drop of approximately 2 cm before the forehead impacted a rubber stop 40-50 ms after release, creating a frontal impact. In both studies, each impact was repeated to maximise the temporal resolution of the acquired data. A sample of complete angular acceleration repeats recorded during one set of rotational impacts is provided by Sabet et al. [53] and shown in Figure 5. The repetitions are also shown, normalised to have time reset at each trigger release. Feng et al. [54] provide estimated rigid body displacements linearly in the anterior-posterior (x) and inferior-superior (y; others have defined this direction as z [35,36,38]) directions and rotational displacement about the medial-lateral axis for each subject of the head-drop experiment. These are illustrated in Figure 6, along with the average across the subjects. In both studies, the subject initiating the impact motion also triggered the MRI capture, which recorded an image approximately every 6 ms.
Sabet et al. [53] considered the response of the brain to the rotational impulse across four axial planes. The "zero plane" passed through the genu and splenium of the corpus callosum and the remaining three were 2 and 4 cm superior and 1 cm inferior of the reference. The radial-circumferential strain in each plane was recorded for 60 ms after the impact in 6 ms time-steps, and reported using contour plots mapping the strain levels across each slice. The response of subject S1 was depicted at each plane and compared to the response of either S2 or S3. An example of this for the +2 cm plane is shown in Figure 7a, featuring two subjects (labelled S1 and S2), along with the average between the two volunteers. The fraction of area experiencing strains above the 0.02, 0.04 and 0.06 thresholds is shown in Figure 7b for the +2 cm plane for subjects S1 and S2, along with the average between subjects.

(c) Rotational accelerations for 0 < t < 300 ms.

Figure 6. Rigid body (skull) displacements in the anterior-posterior (x), inferior-superior direction (y) and about the medial-lateral axis (z-rot) recorded by Feng et al. [54] during mild frontal impacts to three subjects (S1, S2, S3). The authors have calculated and also present the average motion (Av) across the three subjects, resulting in mean peak displacements of 8.0 mm (x), 1.0 mm (y) and 0.1 rad (z-rot).

Figure 7.
Examples of brain deformation measures adapted from Sabet et al. [53]. (a) Colour plots indicating the brain deformation levels from the point of impact (216 ms from trigger release) as radial-circumferential strain for subjects S1 and S2, adapted from [53]. Again the authors have presented additional data depicting the average response between the subjects, calculated by digitising the published contour plots into strain levels for each subject, averaging them at each pixel and re-plotting into a new contour plot on the same scale as the original image. Only pixels where both subjects have brain tissue (contour plot is coloured) are included in the average. The slice shown is in the axial plane, 2 cm above the plane between genu and splenium of the corpus callosum. (b) Area fraction of the brain experiencing radial-circumferential strains above thresholds of 0.02, 0.04 and 0.06 in the axial plane, 2 cm above the plane between genu and splenium of the corpus callosum. The levels experienced by S1 and S2 [53] are presented, again with additional data showing the mean between the cases, indicated as "Av", calculated by the authors for the purposes of this review.
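The pixel-wise averaging and area fraction measures described in this caption can be sketched as follows. This is an illustrative outline only: the strain maps are small hypothetical arrays, pixels outside the brain are assumed to be marked as NaN, and strain magnitude (absolute value) is assumed when comparing against a threshold.

```python
import numpy as np

def pixelwise_mean(map_a, map_b):
    """Average two strain maps only where both contain brain tissue (non-NaN)."""
    both = ~np.isnan(map_a) & ~np.isnan(map_b)
    averaged = np.full(map_a.shape, np.nan)
    averaged[both] = 0.5 * (map_a[both] + map_b[both])
    return averaged

def area_fraction_above(strain_map, threshold):
    """Fraction of brain pixels whose strain magnitude exceeds the threshold."""
    valid = ~np.isnan(strain_map)
    n_valid = np.count_nonzero(valid)
    if n_valid == 0:
        return 0.0
    return np.count_nonzero(np.abs(strain_map[valid]) > threshold) / n_valid

# Hypothetical 2 x 3 strain maps for two subjects (NaN = no brain tissue).
s1 = np.array([[np.nan, 0.01, 0.05],
               [0.03, 0.07, np.nan]])
s2 = np.array([[0.02, 0.02, 0.03],
               [0.05, np.nan, np.nan]])

average = pixelwise_mean(s1, s2)
for threshold in (0.02, 0.04, 0.06):
    print(threshold, area_fraction_above(s1, threshold),
          area_fraction_above(average, threshold))
```

Because only overlapping pixels enter the average, the averaged map can cover a smaller area than either subject, which affects the denominator of the area fraction.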
Feng et al. [54] focused on a single slice in the sagittal plane of the left hemisphere of each subject during the frontal impact experiments. In order to separate the motion of the brain from the skull, these authors manually identified 10 landmarks at tag line intersections within the skull. The skull was considered a rigid body and the displacement vectors of the new landmarks were used to realign each image to a constant anatomical frame. The relative displacements of the tag line intersections within the brain tissue were monitored, giving the relative motion of the brain compared to the skull across the complete brain slice. Deformation of the mesh between intersects was also analysed to consider the principal strains across the slice. Both vector fields (largely depicting direction) and contour plots (indicating magnitudes) of the relative brain displacement are presented for each of the three volunteer subjects (S1, S2, S3) for 22.4 ms after the point of impact (61.6 ms from trigger release), of which examples are seen in Figure 8a,b. The principal strain is similarly presented, although with an extra level of detail. Each element of the tag line mesh has a circle drawn within its grid square. When the deformation tensor is applied to the mesh, the circles are similarly stretched into ellipses, as shown in Figure 8c. The direction and degree of the elongation indicate the direction of the principal strain and its magnitude, along with the colour gradients seen in other works. The relative displacement and principal strain magnitudes of four discrete points within the brain at similar locations to the NDTs tracked during frontal and occipital impacts conducted by Hardy et al. [35,36] are plotted for the full 168 ms of the impact test. These can be seen in Figures 9 and 10.

Figure 9. Relative brain displacement magnitude responses to mild frontal impacts conducted by Feng et al. [54] at four discrete points. The chosen locations relate to where NDTs were inserted in specimens during cadaver experimentation [35,36]. Displacements recorded for each of the three subjects (S1, S2, S3) are presented at each location in the original publication and replicated here [54]. As with previous examples, the authors have calculated and present the mean response (Average) across all three subjects.
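The separation of brain motion from skull motion described above relies on fitting a rigid-body transform to the skull landmarks and removing it from the brain points. A minimal 2D sketch is given below, assuming a least-squares (Kabsch) fit; this is a standard technique chosen for illustration and is not necessarily the exact procedure used by Feng et al. [54]. All coordinates are hypothetical, with the imposed skull motion loosely echoing the mean peak values in Figure 6.

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that dst ~ src @ R.T + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)           # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    D = np.diag([1.0] * (src.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

def relative_displacement(brain_ref, brain_cur, R, t):
    """Brain displacement in the skull frame, after removing skull rigid motion."""
    realigned = (np.asarray(brain_cur, dtype=float) - t) @ R  # inverse transform
    return realigned - np.asarray(brain_ref, dtype=float)

# Hypothetical skull landmarks (mm) and an imposed rigid motion of the skull.
theta = 0.1  # rad
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta), np.cos(theta)]])
t_true = np.array([8.0, 1.0])  # mm
skull_ref = np.array([[0.0, 0.0], [80.0, 0.0], [80.0, 60.0],
                      [0.0, 60.0], [40.0, 90.0]])
skull_cur = skull_ref @ R_true.T + t_true

# One brain point that lags the skull by 1 mm along x in the skull frame.
brain_ref = np.array([[40.0, 30.0]])
brain_cur = (brain_ref + np.array([-1.0, 0.0])) @ R_true.T + t_true

R, t = fit_rigid_transform(skull_ref, skull_cur)
print(relative_displacement(brain_ref, brain_cur, R, t))
```

In practice, landmark identification error would make the fit inexact, so the residual at the skull landmarks is a useful check on the quality of the realignment.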
Both Sabet et al. [53] and Feng et al. [54] build on the tagged MRI work of Bayly et al. [52]. In their study, Bayly et al. [52] subjected three volunteers to mild occipital impacts. The volunteers lay supine with their heads raised 2 cm (the same height later used by Feng et al. [54] to generate frontal impacts) and neck in flexion. Releasing a trigger caused their head to drop via neck extension before hitting a stop, with a peak acceleration of 2-3 g over 40 ms. A single slice in each of the sagittal and axial planes is considered. Contour plots depict the strain magnitudes in the anterior-posterior and inferior-superior directions and the principal strain magnitude after impact. Four time-steps 6 ms apart are provided and compared to the state of the brain prior to release. Area fraction measures above thresholds between 0.2 and 0.5 are also provided against time.
Knutsen et al. [55] optimised MRI acquisition settings during rotational accelerations, similar to those conducted by Sabet et al. [53]. A gelatin phantom and an updated trigger system were used to reduce the number of rotations each volunteer had to undergo in order to determine the strain response across a given brain slice to as few as eight, from the 72-144 reported in previous studies. The mean of the peak acceleration magnitudes was 260 rad/s², which is slightly higher than those recorded by Sabet et al. [53], generated by a 32° rotation. Both maximum shear strain and radial-circumferential shear strain contour plots are available in the supplementary material of the paper across two axial slices for each of the three subjects.

Figure 10. Principal strain magnitudes against time (ms) recorded during mild frontal impacts conducted by Feng et al. [54] at the same four discrete points as seen in Figure 9. The strains recorded from each subject (S1, S2, S3), extracted from the original publication [54], are presented with a newly calculated group mean (Average).
Across existing FEHM, the Worcester Head Injury Model (WHIM) [56] has been validated against brain displacement and deformation from cadaver-based experimental data (see Table 4) and the in vivo experimentation conducted by Sabet et al. [53], while Saboori and Sadegh [57] and Saboori and Walker [58] validated a new FEHM against the observations of Feng et al. [54]. A partial FEHM developed during a preliminary study before developing infant FEHM and validation tools [59] was validated against anterior-posterior brain displacements measured in the coronal plane by Knutsen et al. [55].
Tagged MRI techniques have continued to develop. For example, Chan et al. [60] studied the tagged MRI responses of a significantly larger cohort of volunteers (n = 33) than previously seen for rotational accelerations, reporting a mean peak acceleration of 195 rad/s². While these experiments do not appear to have been applied to FEHM validation yet, they present potential in vivo impact response data sources. The impulse conditions are described through plots of the mean angular position, rotational velocity and rotational acceleration across the full cohort against time. The strain behaviour in the brain is subsequently depicted via colour plots of the shear strain magnitude across 13 axial slices spread between the inferior and superior brain limits, as seen in Figure 11a, for 145 ms after the point of impact, through nine time-steps. Additionally, the maximum recorded shear, first (tensile) and second (compressive) principal strains are plotted for each brain slice. The area fraction of the brain experiencing each strain type above a 3% threshold is further analysed. At each time-step, the mean percentage of brain tissue exceeding 3% strain across the full cohort is provided for the whole brain and in each segmented part, namely the cortical grey matter, white matter and deep grey matter. The distribution of affected areas is also plotted through spider diagrams, providing an idealised measure of the expected strain patterns. Figure 11 provides an example of this alongside the "raw" shear strain colour plots.
Figure 11. Example of brain strain responses reported by Chan et al. [60], 54 ms after volunteer trigger release. (a) Maximum recorded shear strains through 13 axial brain slices moving from inferior to superior regions. The superior-most slice does not register any strains above 1% so appears blank. (b) Distribution of the area fraction of the brain experiencing strains above 3% for each of the tensile principal (E1), compressive principal (E2) and shear (γmax) strains. The shaded area indicates the mean proportion across the cohort of volunteers, with the confidence interval indicated via the bordering line. The spider plot is orientated with the frontal region of the brain (F) at the top, occipital (O) at the bottom and left (L) and right (R) as indicated. The segments between represent the temporal and parietal regions, respectively.
Increasing the range of strain data available, Gomez et al. [61] recently proposed combining tagged MRI techniques with simulations constrained to the MRI data to expand previously 2D strain response analysis to 3D, an approach known as HARP-FE. Both linear and rotational acceleration scenarios were considered, providing potential data for FEHM validation across a greater spatial range, especially when considering small deformations. This approach has been expanded by Knutsen et al. [62] on 20 MRI data-sets captured during controlled rotational impulses and mild occipital impacts, using similar apparatus to previous tagged MRI methods. The 3D displacement data from HARP-FE analysis allow the maximum principal strain (MPS) to be calculated across the brain tissue, which is presented as colour plots in the central slice of each anatomical plane for both impulse regimes for 10 time-steps of 18 ms. Furthermore, in regions of white matter where there are high levels of anisotropy, the fibre strain is calculated, a prevalent metric when considering more severe traumatic brain injuries. These strain responses are also analysed in terms of the fraction of the complete brain/white matter volume experiencing MPS/fibre strains over thresholds of 0.1, 0.2 and 0.3 at each time-step. While less applicable to FEHM validation, the relationship between these strain responses and the head kinematics is also discussed.
Further tagged MRI experimentation has been conducted alongside the development of new FEHM by Chen et al. [63]. These authors recreated an occipital impact following an equivalent method to the one proposed by Bayly et al. [52], with 12 impact repetitions per image series. Relative brain displacements are presented for both an axial and a sagittal plane (although exact locations are not provided) via pairs of colour plots indicating the displacement magnitude within a −3 to +3 mm range, along each axis in the displayed plane (e.g., for the sagittal plane, one series of colour plots indicates the inferior-superior brain displacement and the second the anterior-posterior). The displacement maps are then used to further validate an FEHM, previously developed and validated against the results of Nahum et al. [34] for intracranial pressure response [64]. At the time, this was believed to be the first attempt to validate an FEHM against in vivo impact response data. The numerical response was further analysed to obtain the shear and principal strains in each plane. The development and validation of subject-specific head models against the tagged MRI response of the same subject has been conducted by Ganpule et al. [65] and Lu et al. [66]. The former [65] used data directly from Knutsen et al. [55], while the latter [66] subjected a new volunteer to both rotational accelerations in the axial plane (similar to Knutsen et al. [55]) and a neck flexion-extension acceleration (similar to Bayly et al. [52]). The rotational acceleration histories for both impact types are provided and applied to the FEHM as boundary conditions during validation. The radial-circumferential strain response in the rotational plane is considered for each impact, across 145 ms (8 time-steps) and through five brain slices separated by 10 mm. The maximum shear strain is also provided for three slices across the five time-steps describing the main impulse. 
Peak radial-circumferential and maximum shear strains are also reported for the cerebral grey and white matters, the thalamus, caudate and putamen. In both these cases [65,66], however, the head models have been developed using methods closer to the Material Point Method [67] than to more conventional FEHM.
Employing MRI to monitor the dynamic response of the head to impacts is an evolving practice. As well as alternative theories to process tagged-MRI data (e.g., Xing et al. [68]), different imaging techniques are available. Badachhape et al. [69,70] employed magnetic resonance elastography techniques, previously used to determine tissue properties [71], to consider the harmonic response of the brain to repeated oscillations. In the first study [69], a driven actuator oscillated the heads of six volunteers for 100 ms at 50 Hz within an MRI scanner. Accelerometers fitted to a mouthpiece measured the rigid body kinematics of the skull, allowing its motion to be separated from that of the brain. Badachhape et al. [69] use colour plots for a single time-step to illustrate the total displacement, "wave displacement" (relative brain motion) and "curl" (brain displacement from the shear wave only). The RMS averages of these measurements and the resulting strain are compared in the inner, middle and outer regions of the brain. The following study [70] used the same methods but expanded the analysis to study the relationships between the scalp, skull and brain motions.

Empirical Data Applied to FEHM Validation
Among existing FEHM, the most commonly used validation criteria are the intracranial pressure (ICP) response against data from Nahum et al. [34] and Trosseille et al. [38], and the relative displacement of the brain within the skull, compared to that observed and reported by Hardy et al. [35,36]. The individual impact cases taken from these established sources and applied to prominent FEHM are summarised in Table 4, with pointers to any further works employed to judge the intracranial performance of the FEHM. The FEHM listed in Table 4 have either been validated against multiple sources, been included in other FEHM review works, including [5][6][7], or used in research efforts beyond the initial validation/research group.

Table 4 (fragment): -2; C383-T1, T3, T4; C755-T2, T3, T5; C288-T1, T2; C380-T2, T3

As the technology to create FEHM has become more established and accessible, researchers have begun to create in-house models designed/customised for specific investigations and applications. Often models are developed to investigate the influence of particular impact conditions, such as location [93,94], or the presence/properties of protection [94][95][96][97][98]. Many more FEHM have been developed to investigate the effect of properties within the models themselves. Often the intracranial response of these models is validated against a single, qualitative comparison to test 37 from Nahum et al. [34]. Further new FEHM have been developed alongside experimental procedures, particularly as tagged Magnetic Resonance Imaging (MRI) technology allows in vivo tracking of the brain during head motions. These are discussed alongside the experimental procedure development in Section 2.2.

Replication of Empirical Scenarios in Numerical Environment
In order to validate FEHM, empirical conditions must be replicated in the numerical environment, namely impacts need to be generated in simulations. A number of methods have been applied to existing FEHM, depending on the experimental data available and the properties of the FEHM in question.
Two methods of recreating the impact conditions from Nahum et al. [34] are predominantly used. One directly applies impact properties to the FEHM being validated, either via the force history [74,82,88] or the head acceleration response [25,44] provided by Nahum et al. [34] for experiment 37. Alternatively, a simple model of the cylindrical impactor is created and impacts the FEHM at the relevant velocity (9.94 m/s for test 37) [47,72,78,80,81,86]. However, the impactor is padded and the corresponding material details are not provided. Researchers have modelled the padding as a linear elastic material [73,80,86], rubber [81] or foam [47,78], and adjusted the properties until the impact force produced matches the force history provided by Nahum et al. [34]. The resulting padding properties are published for WSUBIM (elastic modulus E = 49 MPa) [73], SUFEHM (Strasbourg University Finite Element Head Model) (E = 13.6 MPa, Poisson's ratio ν = 0.16) [80] and YEAHM (YEt Another Head Model) (E = 6 MPa, ν = 0.16) [86]. Despite the differences in elastic moduli, simulations of experiment 37 on all three FEHM report matching the peak contact force measured empirically. However, WSUBIM uses 7.9 kN as the reference value [72] while SUFEHM and YEAHM state/plot 6.9 kN. Figure 1a, adapted from Figure 1 of Nahum et al. [34], demonstrates a peak value of 6.9 kN, in agreement with Sahoo et al. [80] and Fernandes et al. [86]. However, the 7.9 kN value is listed in a table in Nahum et al. [34] (Series I in Table III), introducing some confusion about the matter. Zhang et al. [72] further used the frontal pressure response to validate the material choice, rather than the model response, and matched the 141 kPa peak empirical value to within 10% (154 kPa, +9%). This pressure value is agreed on between the plots and tables in Nahum et al. [34].
The Wayne State University Brain Injury Model (WSUBIM) [72,73] and GHBMC [44] had their validation extended beyond experiment 37 to all six cases in the first series of experiments from Nahum et al. [34]. The impactor properties were adjusted to change the conditions experienced by WSUBIM. GHBMC scaled the acceleration response provided for test 37 so the peak value matched that given for each experiment and applied that directly to the FEHM skull, defined as rigid.
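The scaling of a reference acceleration history to a different peak value, as described for GHBMC, amounts to multiplying the whole curve by a single factor. The sketch below illustrates this under stated assumptions: the half-sine reference pulse, its 200 g peak, its 8 ms duration and the 150 g target are hypothetical placeholders and are not values from Nahum et al. [34] or Mao et al. [44].

```python
import numpy as np

def scale_to_peak(history, target_peak):
    """Scale a time history so that its peak magnitude equals target_peak.

    The shape and timing of the pulse are unchanged; only the amplitude is scaled.
    """
    history = np.asarray(history, dtype=float)
    current_peak = np.max(np.abs(history))
    if current_peak == 0.0:
        raise ValueError("Cannot scale a history that is zero everywhere.")
    return history * (target_peak / current_peak)

# Hypothetical reference pulse: half-sine, 200 g peak, 8 ms duration.
time = np.linspace(0.0, 0.008, 81)
reference = 200.0 * np.sin(np.pi * time / 0.008)

# Hypothetical target peak for another experiment in the same series.
scaled = scale_to_peak(reference, target_peak=150.0)
print(np.max(np.abs(scaled)))
```

A limitation of this approach is that the pulse duration and shape are inherited from the reference case, so any differences in timing between experiments are not represented.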
Subsequent experimental works conducted with FEHM validation in mind, including Trosseille et al. [38] and Hardy et al. [35,36], monitored the motion of the skull and provide complete kinematics at the centre of gravity (CG) of the specimen's head. Detailed motion data are similarly provided for the controlled cadaver and in vivo impacts. The kinematics can be applied directly to the FEHM as boundary conditions on an equivalent point in the skull, provided this is defined as rigid. This allows the motion resulting from the impact to be recreated without the need to consider the impactor properties or contact behaviour. Furthermore, as the complete motion of the head is replicated, the constraining properties of the neck do not need to be considered, an aspect thought to be influential for longer duration impacts [79,99]. This has therefore been the primary method of recreating empirical conditions from Trosseille et al. [38], Hardy et al. [35,36] and in vivo experimental conditions in the numerical environment. Of the models listed in Table 4, only Tse et al. [78] used an alternative method during the validation of the Nanjing FEHM.
Tse et al. [78] continued to model external impactors throughout the validation of their FEHM. The steering wheel used during test MS428-2 was modelled as a rigid cylinder of the same mass as the impactor used by Trosseille et al. [38] (23.4 kg) and impacted the nasal bone of the model at 7 m/s. To replicate experiments from Hardy et al. [35,36], a block with a 45° angled impacting face was modelled with acrylic material properties and fully constrained. The model was then assigned the relevant initial velocity and collided with the block, initiating the deceleration impulse, as seen in the empirical method. These impacts have longer durations than those reported by Nahum et al. [34], meaning the neck may influence the response. The model includes a neck structure (vertebrae C1-C7) within the FEHM. The base of this was constrained against all degrees of freedom during simulations of test MS428-2 [38], mimicking the cadaver being supported at the torso. Linear motion was, however, permitted in the impact direction during replications of Hardy et al. [35,36]. This was to replicate the cadaver being able to slide along the impact apparatus during the original experiments.

Assessment of FEHM Performance and Validation
Once the empirical scenario has been replicated in the numerical environment, the performance of the FEHM needs to be assessed against observations from testing. Impact responses are presented in a number of ways, including discrete point histories, absolute peaks, area fraction measures and spatial plots. In the majority of cases, particularly although not exclusively for the earlier FEHM, this has been through a qualitative comparison of the simulated and empirical outputs [21,44,47,72,74-77,80,82,86]. Others have supplemented this with peak value comparisons [44,77,80].
More recently, statistical measures have also been applied to quantitatively analyse the similarity of numerical outputs to empirical data. Tse et al. [78] used Pearson correlation to objectively compare how well two variations of their FEHM performed during validation simulations. There is, however, no benchmark value or rating system to judge the quality of either model against empirical data, and Pearson correlation is limited to considering the linear correlation between points of data sets. The Normalized Integral Square Error (NISE) method separately compares the differences in phase, amplitude and shape of two time series, which are then summed to give an overall measure [100]. Based on generally accepted coefficients of variation (i.e., the ratio of standard deviation to mean), acceptable NISE limits are proposed for each of phase, amplitude and shape in terms of the repeatability and reproducibility. While the limits of NISE were proposed for determining the quality of experimental procedures and results, the line comparison equations have been applied in judging FEHM validation. Kimpara et al. [101] and Ji et al. [56] calculated a correlation score between 0 and 100 by subtracting the NISE from 1 and multiplying by 100 during the validation of the THUMS (Total HUman Model for Safety) and WHIM FEHM. These were then related to bio-fidelity ratings aligned with the classifications detailed in ISO/TR-9790 [102], as shown in Table 5. The CORA method (Gehre et al. [48]) combines corridor and cross-correlation rating methods to rate the similarity of curves between 0 and 1. As with NISE, the cross-correlation aspect of CORA considers the shape, amplitude and phase of responses. The parameters used within each analysis stage can be adjusted to fit the scenario being considered and its limits. 
Research into the validation of full body FE models found CORA to be the most comprehensive of the analysis methods available, although it was recommended to keep the evaluated aspects separate rather than combining them into a single rating for the model [103]. CORA has since been included in the investigative technical report ISO/TR16250:2013 [104], providing recommendations on the analysis of time-history signals in the testing of vehicle safety and computer model validation. In a study focused on FEHM, Giordano and Kleiven [105] found CORA more reliably evaluated validation signals than NISE (see Section 4.2). During the development of FEHM, CORA has been used by Atsumi et al. [25], Miller et al. [83], Miyazaki et al. [85] and Trotta et al. [90] for the validation of the THUMS (a version superseding the model previously analysed using NISE [101]), ABM (Atlas-based Brain Model), Tokyo and UCDBTM V2.0 (University College Dublin Brain Trauma Model) models, respectively. Several parts of the GHBMC full body model have been validated using CORA and the tools recommended in ISO/TR16250:2013 (e.g., [106][107][108][109]). However, Mao et al. [44] is the most recently published record of the head intracranial response validation found and CORA is not employed.

Table 5. Bio-fidelity classifications adapted from [102]. The classifications have an equivalent distribution but the range is increased by a factor of 10 (from a 0 to 10 range, to 0 to 100). CS is the Correlation Score.
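The difference between Pearson correlation and a NISE-based correlation score can be shown with a short numerical sketch. The correlation score follows the conversion described above, CS = (1 − NISE) × 100. The total NISE expression used here, 1 − 2·Rxy(0)/(Rxx(0) + Ryy(0)), is an assumed form of the summed phase, amplitude and shape terms and should be checked against [100] before use; the signals are synthetic.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equally sampled signals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xm, ym = x - x.mean(), y - y.mean()
    return float(np.sum(xm * ym) / np.sqrt(np.sum(xm ** 2) * np.sum(ym ** 2)))

def total_nise(x, y):
    """Assumed total NISE: 1 - 2*Rxy(0) / (Rxx(0) + Ryy(0)), zero-lag correlations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(1.0 - 2.0 * np.sum(x * y) / (np.sum(x * x) + np.sum(y * y)))

def correlation_score(nise):
    """Correlation score on a 0 to 100 scale, CS = (1 - NISE) * 100."""
    return (1.0 - nise) * 100.0

# Synthetic example: the "model" has the right shape and phase but 80% amplitude.
t = np.linspace(0.0, 1.0, 201)
experiment = np.sin(2.0 * np.pi * t)
model = 0.8 * experiment

r = pearson_r(experiment, model)
cs = correlation_score(total_nise(experiment, model))
print(f"Pearson r = {r:.3f}, correlation score = {cs:.2f}")
```

In this example the Pearson coefficient is 1 even though the model under-predicts the amplitude by 20%, whereas the NISE-based score is reduced, illustrating why linear correlation alone is a limited validation measure.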

Retrospective Validation and Comparison Investigations
The range of representations in terms of geometry, material models and included features across different FEHM is well documented and discussed in the literature. As such, FEHM, particularly the more prominent ones, have been subjected to further investigations into their behaviour and comparisons to other models.
Miller et al. [31] looked to objectively and quantitatively compare the performance of FEHM against relative brain displacement data. Five tests (C755-T2, C383-T1, T3, T4 from Hardy et al. [35] and C291-T1 from Hardy et al. [36]) were selected to represent a range of impact locations and kinematic properties as well as to represent cases typically used during FEHM validation. Existing results from simulating these cases were compared for the ABM [83], SIMon (Simulated Injury Monitor) [76], GHBMC [44], THUMS [25], KTH (Kungliga Tekniska Högskolan) [47,110] and WHIM [56,81] FEHM (KTH and WHIM had partial data sets available). The relative displacements at each NDT in both directions considered were analysed using CORA. These ratings were averaged to provide a model rating for each impact scenario. These were averaged again to provide an overall rating for the available scenarios. The ranking of the six FEHM was also considered and it was found that the FEHM ranked differently according to CORA rating for each test. The KTH FEHM [47] was stated to have the highest overall rating (0.413 ± 0.059), although the results for WHIM [56,81] are higher (0.415 ± 0.035). The WHIM also had the most consistent rating (lowest standard deviation). However, these two FEHM only had data available for three and two of the impact scenarios considered, respectively, which is a limitation of the study. Of those with data for all the scenarios considered, the ABM [83] had the best overall rating (0.376 ± 0.053).
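The two-stage averaging described above (ratings per NDT and direction averaged into a scenario rating, then scenario ratings averaged into an overall rating) can be sketched as follows. The ratings are hypothetical placeholders rather than values from Miller et al. [31], and the use of the sample standard deviation across scenarios as the reported spread is an assumption.

```python
import numpy as np

def model_rating(ratings_by_test):
    """Average CORA-style ratings per test, then across the available tests.

    ratings_by_test : dict mapping a test identifier to a list of ratings,
                      one per NDT and direction, each between 0 and 1.
    Returns (per-test means, overall mean, spread across tests).
    """
    per_test = {test: float(np.mean(r)) for test, r in ratings_by_test.items()}
    values = list(per_test.values())
    overall = float(np.mean(values))
    # Sample standard deviation across tests (assumed definition of the spread).
    spread = float(np.std(values, ddof=1)) if len(values) > 1 else 0.0
    return per_test, overall, spread

# Hypothetical ratings for a model with data for only two scenarios.
ratings = {
    "C755-T2": [0.40, 0.50, 0.30, 0.40],
    "C383-T1": [0.30, 0.40],
}

per_test, overall, spread = model_rating(ratings)
print(per_test, round(overall, 3), round(spread, 3))
```

Because each scenario carries equal weight regardless of how many NDT traces it contains, a model with data for only two or three scenarios can obtain an overall rating that is not directly comparable with one averaged over all five.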
Talebanpour and Smith [33] compared the response of the GHBMC [44], THUMS [25] and SIMon [76] FEHM to the brain strain response recorded during tagged MRI experiments from Bayly et al. [52]. The impacts were generated by simulating the FEHM dropping onto a rubber pad, replicating the conditions experienced by in vivo subjects. Despite being subject to identical conditions, the three FEHM experienced different head kinematics, particularly rotational accelerations, after the point of impact. As well as comparing the distinct models, THUMS then had the brain material models from GHBMC and SIMon applied to it, and the responses were again compared. Compared to the tagged MRI response, all three FEHM underestimated the strain response, but by differing magnitudes, although there were consistencies in the strain distribution pattern across the monitored brain slice. Furthermore, there were discrepancies when just the brain material changed. Talebanpour and Smith [33] surmise that the brain material model, the skull-brain interface and the presence/properties of the neck all influence these discrepancies.
Drake et al. [28] investigated the influence that the properties of NDTs have on the simulated brain response, compared to impacts from Hardy et al. [36]. A mass replicating the NDTs used in the empirical method was modelled within the FEHM brain volume. Compared to the original FEHM, relative brain displacement magnitudes changed by less than 2.5%. The effects remained less than 10% when the mass of the modelled target was increased tenfold. The authors also changed the location of NDTs, offsetting the targets by 13.5 to 16.6 mm in directions parallel to each anatomical axis. The average point-to-point displacement magnitudes across five impact simulations changed by up to 64% (for NDT locations offset in the inferior direction by 13.5 mm). These findings suggest that the presence of instrumentation (i.e., NDTs) in the experimental method does not affect the suitability of the data collected for FEHM validation. However, the relative brain displacement response is sensitive to the location at which it is monitored. Drake et al. [28] comment that empirical locations are given from an approximate cadaver CG, which could be erroneous or offset from that of the FEHM. The varying geometries of different FEHM mean that the CG and NDT locations will have different levels of influence on the validation of each model, indicating the potential of adaptable methods similar to those employed by Guettler et al. [39].

Investigation into Validation Protocol
Giordano and Kleiven [105] critically evaluated the validation data available from Nahum et al. [34], Trosseille et al. [38], Hardy et al. [35,36] and Hardy's PhD Thesis with the aim of creating a standardised validation protocol for FEHM. Experimental data for intracranial pressure response, relative brain displacement and brain deformation were judged according to selection criteria. Impact scenarios were selected if they were deemed to have sound experimental methodology (termed "accuracy"), complete kinematic data available (termed "reproducibility") and to represent unique impact conditions (termed "redundancy"). Cases were judged independently for each validation metric. No cases from Nahum et al. [34] passed the reproducibility criterion, as rotational kinematics, thought to be of high importance when considering TBI, were not available. The cadavers used by Trosseille et al. [38] were seen to have air within the intracranial space, thereby failing the accuracy criterion. Table 6 summarises the experiments suggested for use by Giordano and Kleiven [105], excluding those within Hardy's PhD Thesis. While the selection process is intended to be unbiased, not every impact case within the assessed studies [34–36,38] appears to have been considered and subjected to the selection criteria. For example, although C380-T4, -T5 and -T6 were deemed suitable for the complete set of validation metrics (ICP, relative brain displacement and brain deformation), C380-T1 and -T2 are not mentioned anywhere in the publication.
The influence each scenario should hold when considering the overall performance of the model was then weighted (0 < v_i < 1.0) according to experimental limitations and data errors, as detailed by Giordano and Kleiven [105]. The weighting that should be applied to each target (NDT or pressure transducer) within each test is similarly calculated and provided (0 < w_ij < 1.0). This is particularly pertinent for the brain deformation cases, where the tests are all given equal weighting overall but the targets within each experiment have a range of weightings. For example, test C241-T5 negates (weighting = 0.0) seven of the 14 target points entirely, has five targets with a weighting of 0.1 and one each with 0.3 and 0.7. Table 6 also provides the overall test weighting for each validation criterion.
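To make the role of the two weighting levels concrete, the following Python sketch combines per-target ratings into a single score using a normalised weighted mean. The exact combination formula used by Giordano and Kleiven [105] is not reproduced in this review, so the normalisation, the function name, the input layout and the example numbers are all assumptions.

```python
def weighted_rating(tests):
    """Combine per-target ratings using target and test weights.

    tests is a list of dicts, each with a test weight "v" and a list
    "targets" of (target weight, rating) pairs. The normalised weighted
    mean used here is an assumption, not the published formula.
    """
    numerator = 0.0
    denominator = 0.0
    for test in tests:
        weight_sum = sum(w for w, _ in test["targets"])
        if weight_sum == 0 or test["v"] == 0:
            continue  # a fully negated test contributes nothing
        test_score = sum(w * r for w, r in test["targets"]) / weight_sum
        numerator += test["v"] * test_score
        denominator += test["v"]
    if denominator == 0:
        raise ValueError("no test carries a non-zero weight")
    return numerator / denominator


# Hypothetical weights and ratings for two tests.
tests = [
    {"v": 1.0, "targets": [(0.0, 0.9), (0.5, 0.4), (0.5, 0.6)]},
    {"v": 0.5, "targets": [(1.0, 0.8)]},
]
score = weighted_rating(tests)
print(round(score, 3))
```

In this sketch a target with zero weighting has no effect on the score regardless of its rating, which mirrors the way negated targets are described above.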
Nine of the suggested cases (tests C241-T5, C241-T6, C288-T3, C064-T4, C380-T4, C380-T5, C380-T6, C393-T3 and C393-T4) were then applied to the KTH [47], KTH-voxel [75], THUMS [25] and GHBMC [44] FEHM by directly imposing impact kinematics onto a node at the centre of each of the models. The likeness of the simulated response of each FEHM to empirical data was measured using the NISE [100] and CORA [48] methods. The ratings of each model across the tests, with their weightings incorporated, were combined to create an overall bio-fidelity rating B for each model, with 0 < B < 10, using the same scale as seen in Table 5. CORA was deemed to better analyse the FEHM responses, as NISE tended to underestimate amplitude errors and therefore overestimated the bio-fidelity of the FEHM. CORA was therefore the tool used for further performance and bio-fidelity analysis. The control parameters within CORA (ver. 3.6.1) [48] were optimised for the data sets considered for each validation criterion, giving the values listed in Table 7. As such, errors in the overall nature of the response are penalised more severely than slight shifts in phase or magnitude. With the optimised analysis settings, across the full set of experiments meeting the selection criteria, the four FEHM tested were found to have bio-fidelity ratings between 5.80 and 6.43, deeming them "fair" representations of the human head according to ISO/TR-9790 [102] (see Table 5).
Table 6. Experimental cases passing the selection criteria proposed by Giordano and Kleiven [105] during critical evaluation of the empirical data from Nahum et al. [34], Hardy et al. [35], Trosseille et al. [38] and Hardy et al. [36] for validation against ICP, relative brain displacement and brain deformation. Each impact is summarised by its location, the peak resultant linear acceleration and the peak rotational acceleration about a single axis. Where impact cases have been selected for multiple validation criteria, repeats have been indicated through italics.
At the time of writing, no FEHM has been validated according to the above protocol. Caution and the need for further research into the sifting and weighting protocols have been raised [91,111]. Guettler et al. [39] intended to ensure their cadaver experimentation procedures would pass all criteria laid out by Giordano and Kleiven [105], to remove the need for case selection and to further standardise the validation process. Furthermore, despite Giordano and Kleiven [105] proposing unbiased selection criteria, some cases published within the assessed works [35,36] are not mentioned and therefore, it is assumed, were not subjected to the selection criteria. This includes cases C380-T2 and -T3 [36], which have previously been used in FEHM validation and present full sets of relative brain displacement and strain response data as well as partial pressure response data. The optimised CORA parameters proposed by Giordano and Kleiven [105] have, however, been adopted during FEHM validation and investigation analyses [20,90,112].
Table 7. Optimised parameters for CORA (ver. 3.6.1) [48] analysis of FEHM performance against cadaver data from Hardy et al. [35,36], as determined by Giordano and Kleiven [105].

Conclusions
Current FEHM produce different responses within the brain to similar impacts, despite being considered validated, and, as yet, there is no widely accepted set of validation criteria. This paper reviews and discusses experimental data and methods that could be applied during the validation of the intracranial response of FEHM, and how they have been implemented in previous validation efforts.
Surveying the experimental data available showed there was no single source providing sufficient data to completely validate the intracranial response of FEHM. However, there are multiple options available to validate the intracranial pressure [34,36,38,39], relative brain displacement [35,36,39,40,54] and brain strain responses [46,52–55,60]. Across each of these there are combinations of cadaver and in vivo experimentation, blunt impacts and controlled impulses, providing a range of kinematic profiles and, subsequently, brain response characteristics. The range of methods used to record impact responses also adds depth to the measures FEHM can be validated against. For example, analysis of tagged MRI data allows the area fractions of the brain experiencing strains above a given threshold to be introduced.
There is unarguably scope and value in an experimental procedure being developed to specifically provide a comprehensive and more efficient FEHM validation tool. Adapting data from studies with a research focus beyond this application has limitations. Yang and Mao [111] highlighted these to include a lack of external and internal dimensions of the empirical subjects, limited cohort sizes and a lack of in vivo injury data and understanding. Whilst Chan et al. [60] have gathered response data from larger volunteer groups and Guettler et al. [39] have attempted to introduce procedures that allow for dimension changes, these have not yet been widely implemented and only provide a single, rotational impulse each.
Combinations of the experimental data currently available have the ability to reasonably assess the behaviour of FEHM. Researchers should be aware of the limitations and potential errors of each data source, for example the sensitivity of the relative brain displacement response to the monitored location [28]. This is how validation has been approached to date. However, as summarised in Table 4, the selection of cases between research groups has been inconsistent, and Miller et al. [31] included differences in the validation tests conducted on FEHM among the limitations of comparing their validity and accuracy. Giordano and Kleiven [105] addressed these issues by proposing a standardised impact case selection protocol. This has been acknowledged as promising in principle. Further research is, however, required to reach a consensus on the selection criteria and weighting factors suggested [91,111], or on whether such selection should occur at all [111]. Should a protocol become approved and adopted, it would need to be applied to a wider selection of empirical data than the limited range previously considered [34–36,38]. Furthermore, there should be evidence that all test examples within a given research effort were assessed. Within the current environment, researchers should be encouraged to validate FEHM against sets of impacts from a range of studies using different data recording methods and response metrics, ideally including cadaver and in vivo references and covering a range of kinematic profiles likely to be relevant to the future applications of the FEHM. This would provide a balanced assessment of its performance and mitigate the limitations and biases of particular empirical methods and data.
As well as the experimental data available, this paper reviewed how validation is conducted in the simulation environment, specifically how impacts are recreated and how outputs are compared to empirical recordings. Impact replication has largely, and increasingly over time, been achieved by applying kinematics recorded during experimentation directly to the skull of the FEHM. While the skull needs to be modelled as rigid to use this method, its response can be validated in separate simulations, as has been done previously (e.g., by Mao et al. [44]). Directly applying kinematics ensures the FEHM experiences exactly the same motion as the empirical sample, providing a consistent baseline during validation and removing boundary conditions from consideration, such as the inclusion of a neck, which was seen to influence kinematic behaviour when other impact initiation methods were used [33]. If the impactor is to be modelled, experimental procedures need to provide detailed information on impactor properties, including materials, dimensions and boundary conditions, to avoid the range of assumed representations seen in previous studies and to minimise the error introduced through the impactor/FEHM interaction.
Assessing the validation performance of FEHM has been largely qualitative. It is clear that quantitative assessment is essential to allow objective conclusions to be drawn, especially when considering the effects of property changes within an FEHM type or the biofidelity of a given model, and therefore its suitability for use in injury research. A handful of measures have been considered for time-series based data. CORA was found to be the most suitable method for comparing output histories, allowing the magnitude, phase and shape characteristics of time-histories to be reliably compared [105], and has been recommended in relevant technical documentation [104], so should become widely adopted. Not all data are suited to this type of analysis, for example the colour-plots of strain levels across brain slices widely presented from tagged MRI experimentation. Currently, these image-based outputs rely on qualitative assessment. However, this can be supplemented with quantitative analysis of extracted time-based data, including strain-histories at discrete points (e.g., Figure 10) and the area fraction of the brain experiencing strains above given thresholds against time (e.g., Figure 7).
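As an illustration of how an image-type strain output can be reduced to a time-based quantity, the Python sketch below computes the area fraction of a brain slice exceeding a strain threshold at each time frame. It is a minimal example rather than a method taken from any of the reviewed studies: the function name, the element-wise data layout, the strict "greater than" comparison and the numbers used are assumptions.

```python
def area_fraction_above(strain_frames, element_areas, threshold):
    """Return, per time frame, the fraction of slice area above a threshold.

    strain_frames is a list of frames, each holding one strain value per
    element; element_areas holds the matching element areas. Both the
    layout and the strict comparison are assumptions for illustration.
    """
    total_area = sum(element_areas)
    if total_area <= 0:
        raise ValueError("total element area must be positive")
    history = []
    for frame in strain_frames:
        if len(frame) != len(element_areas):
            raise ValueError("each frame needs one strain per element")
        exceeded = sum(a for s, a in zip(frame, element_areas) if s > threshold)
        history.append(exceeded / total_area)
    return history


# Hypothetical slice of three elements over three time frames.
areas = [1.0, 1.0, 2.0]
frames = [
    [0.01, 0.02, 0.03],
    [0.06, 0.02, 0.07],
    [0.04, 0.08, 0.01],
]
fractions = area_fraction_above(frames, areas, threshold=0.05)
print(fractions)
```

The resulting history is an ordinary time series, so it could in principle be compared between simulation and experiment with the same tools applied to displacement or pressure histories.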
FEHM are incredibly valuable tools in numerous research areas. Their development and validation are, however, far from simple. This review has found there is scope to improve the validation procedures in a number of ways. Models can, nevertheless, be reasonably investigated and judged before being applied to research. Particularly in terms of how simulations are generated and outputs are assessed, simple selections can move validation procedures towards best practices within the realms of what is currently available and established. A good understanding of the intended applications of the FEHM being validated and the characteristics of each experimental method should help researchers choose impact scenarios for validation. While there is no clear best set of cases, variety is likely to provide the most reliable outcomes.

Conflicts of Interest:
The authors declare no conflict of interest.