Intra-System Reliability Assessment of 2-Dimensional Shear Wave Elastography

Abstract: The availability of 2-Dimensional Shear Wave Elastography (2D-SWE) technology on modern medical ultrasound systems is becoming increasingly common. The technology is now being used to investigate a range of soft tissues and related pathological conditions. This work investigated the reliability of a single commercial 2D-SWE system using a tissue-mimicking elastography phantom to understand the major causes of intra-system variability. Sources of shear wave velocity (SWV) measurement variability related to imaging depth, target stiffness, sampling technique and the operator were evaluated. This study demonstrated good reliability of 2D-SWE SWV measurements when imaging soft (10 kPa and 17 kPa) elastography phantom material. Higher SWV measurement variability was evident with increasing depth and stiffness of the phantom targets. The influence of the operator was minimal, and variations in sampling technique had little impact on the SWV for most targets. The reliability achieved in this study of the soft targets is encouraging and may assist the development of reliable imaging protocols for clinical practice when acquiring 2D-SWE measurements.


Introduction
Ultrasound-based elastography techniques can be applied to investigate a range of clinical scenarios, with several different technologies currently available on various commercial ultrasound systems. Shear Wave Elastography (SWE) involves the generation of shear waves propagating perpendicularly from the main axis of the ultrasound beam, the speed of which is linked to the tissue's elastic modulus [1]. One technique, 2-Dimensional Shear Wave Elastography (2D-SWE), generates these shear waves by applying multiple transducer-generated acoustic force pulses to create a two-dimensional parametric colour map of tissue stiffness. The operator is then able to "sample" areas within this colour map using proprietary software to obtain quantitative measurements of tissue elasticity by measuring the shear wave velocity (SWV). An advantage of 2D-SWE is that the elasticity profile of a larger section of tissue can be evaluated with a single acquisition [2]. By measuring the velocity of these lateral shear waves and comparing measurements from normal and pathological tissue, researchers can detect disease or monitor its progression [3][4][5][6][7][8].
Like other ultrasound-based quantitative techniques such as Doppler flow analysis, 2D-SWE is susceptible to measurement variability that is unrelated to intrinsic physiological conditions. SWV measurements can be influenced by variations in imaging conditions such as sampling depth [9], the presence of acoustic artifacts [10] or operator-related errors in technique [11]. In addition, the use of different elastography platforms has been shown to introduce considerable variability in measurements [12][13][14][15]. A reliable measurement is one in which the variability between repeated measurements remains within a clinically acceptable limit [16]. Such a limit provides confidence that measurements can differentiate between normal and pathological conditions. Although several studies have investigated the reliability of ultrasound-derived elastography measurements [9][12][13][14][15][17][18][19], there is limited published data for 2D-SWE systems and a lack of universally agreed-upon reliability criteria for this type of system [20].
Modern 2D-SWE systems now incorporate software tools that help operators visually assess the stability of the acquired SWV and guide the placement of the sampling region of interest (ROI) box [21]. Alongside a 2D colour-coded map of tissue stiffness, a novel approach provided by some manufacturers is the addition of a "propagation contour map" (Figure 1), which uses a series of contour lines to depict shear wave arrival times at different anatomical points in the tissue being assessed [22,23]. Regions in the image displaying parallel contour lines indicate that shear waves are propagating smoothly with minimum variability [21]. These two images provide complementary information designed to give a visual indicator of the best acquisitions to select for quantitative analysis and to optimise the placement of the ROI sampling box in an area free of significant artefacts. Operators utilising 2D-SWE technology need to have confidence that any derived measurements are a true representation of the tissue being sampled, and need a good understanding of the factors that can introduce variability so that they can take steps to minimise these at the time of imaging.
This study aimed to investigate the reliability of a modern 2D-SWE system using a tissue-mimicking elastography phantom to identify the major causes of intra-system variability. The evaluation includes an analysis of the sources of SWV measurement variability, as they relate to imaging depth, target stiffness, sampling technique and the operator.
Figure 1. Each image display is split into the parametric colour map on the left (higher SWV coded as red/orange and slower SWV as blue/green) and the reliability propagation map on the right (parallel lines indicating increased reliability).

Materials and Methods
This study utilised a model 049A human tissue-mimicking quality assurance elastography phantom produced by Computerized Imaging Reference Systems, Inc., Norfolk, VA, USA (CIRS), made from Zerdine®, a poly-acrylamide polymer [24] (Figure 2). This phantom has acoustic properties comparable to human tissue, with a reported acoustic velocity of 1540 m/s ± 10 m/s and an attenuation coefficient of 0.5 dB·cm⁻¹·MHz⁻¹ [25]. Zerdine® phantoms from CIRS have been validated and have demonstrated adequate homogeneity and low viscosity levels [9,18,26]. Other investigations using tissue-mimicking phantoms have shown that ultrasound frequencies used in the MHz range can be translated in vivo [27,28].
The phantom contains a series of eight cylindrical tubes which taper from 16.7 mm to 1.6 mm at set intervals from one end of the phantom to the other. Each tube provides an imaging target with one of four known elasticity values: Type I (10 kPa), Type II (17 kPa), Type III (47 kPa) and Type IV (85 kPa). These targets span a range of elasticity values relevant to different clinical areas of interest. Current machine-dependent reference ranges for the liver report values <7 kPa for normal tissue and >17 kPa for advanced fibrosis [2]. Elastography studies of the breast have demonstrated values between 30 and 50 kPa for normal parenchymal tissue and much higher values in malignant breast lesions [29]. The four targets are located at 3 cm depth and repeated at 6 cm depth. The background medium surrounding the targets can be sampled and has a known elasticity of 25 kPa. The largest diameter (16.7 mm) was chosen to limit the potential for shear wave refraction at the target boundary [30] and to accommodate the sampling ROI.
SWV measurements were performed using a Canon Aplio i-series 600 machine (Canon Medical, Otawara-shi, Japan) with a curved 8C1 probe. The system can convert SWV to Young's modulus E (kPa) using the equation $E = 3\rho\,(\mathrm{SWV})^{2}$. This equation assumes that the tissue density is equal to that of water ($\rho = 1\ \mathrm{g/cm^{3}}$), and that the relationship between Young's modulus and shear modulus (G) is $E \approx 3G$, which holds for incompressible materials having a Poisson's ratio of 0.5. Soft tissue is often regarded as incompressible [30]. In this study, we chose not to convert the machine-measured values of SWV to kPa, in order to focus on the sources of intra-system variability as they applied to SWV. Elastography guidelines recommend reporting values in SWV, as this is the quantity measured by the machine and the conversion to kPa relies on a range of assumptions, which can introduce additional errors [21,31].
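The conversion described above can be illustrated with a minimal Python sketch. It assumes the stated water-equivalent density and the incompressible-material relation E ≈ 3G; the function names are hypothetical and are not part of the scanner software.

```python
import math

# Assumed tissue density: 1 g/cm^3 expressed in SI units (kg/m^3), as stated in the text.
WATER_DENSITY_KG_M3 = 1000.0


def swv_to_youngs_modulus_kpa(swv_m_s, density_kg_m3=WATER_DENSITY_KG_M3):
    """Apply E = 3 * rho * SWV^2 and return Young's modulus in kPa."""
    e_pa = 3.0 * density_kg_m3 * swv_m_s ** 2  # result in pascals
    return e_pa / 1000.0


def youngs_modulus_kpa_to_swv(e_kpa, density_kg_m3=WATER_DENSITY_KG_M3):
    """Invert the relation: SWV = sqrt(E / (3 * rho)), returned in m/s."""
    return math.sqrt(e_kpa * 1000.0 / (3.0 * density_kg_m3))


if __name__ == "__main__":
    # Nominal phantom elasticities from the text (targets plus background).
    for e_kpa in (10, 17, 25, 47, 85):
        print(f"{e_kpa} kPa -> {youngs_modulus_kpa_to_swv(e_kpa):.2f} m/s")
```

Because SWV enters the relation squared, a given relative error in SWV roughly doubles when expressed in kPa, which is one reason for reporting SWV directly.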
Elastography samples were acquired using the system's single-shot method, which activates a 2.5 MHz acoustic radiation force pulse over a single frame. SWV measurements were independently performed by two experienced sonographers (CE and EC), each with over 15 years of ultrasound experience, and one less experienced practitioner (AJ) to evaluate the potential for a training effect.
To ensure that the sample means from each operator were approximately normally distributed, and so allow the use of parametric tests, 30 consecutive SWV measurements were obtained from each phantom target and from the background [32]. SWV measurements were repeated at the two available depths, 3 cm and 6 cm. To remain consistent with published guidelines, a single circular ROI was utilised for sampling [33]. The mean SWV (m/s) was recorded using a ROI sampling size of 10 mm and repeated using a ROI size of 5 mm. Each operator utilised the on-system SWE measurement software and placed the ROI within each target. Using the wavefront visual propagation map, each operator then placed the sampling ROI in the area of highest reliability.
The intervals between the contours on the visual propagation map are wider in regions where the shear waves travel faster and narrower where they travel more slowly (Figure 1). Areas of parallel contour lines represent areas in which shear waves propagate smoothly and indicate that the reliability of data acquisition in these areas is highest [8].
Statistical analysis was performed using IBM SPSS Statistics for Windows Version 26.0 (IBM Corp. Armonk, NY, 2019). The independent samples t-test was used to compare the mean SWV from the same target at the two different depths (3 cm and 6 cm) and with the two sampling methods (ROI size; 5 mm and 10 mm) [34]. Homogeneity of variances was tested using Levene's variance test [35]. Inter-operator agreement was calculated using the Intraclass Correlation Coefficient (ICC) [35]. The ICC utilised a two-way random effects model with absolute agreement between the operators [36]. The ICC values were classified according to the common criteria [37]: excellent (ICC > 0.75), fair to good (ICC 0.40-0.75) and poor (ICC < 0.40). The repeatability coefficient (RC) and coefficient of variation (CV) were used to assess test-retest reliability [16]. The RC and CV were calculated independently for each operator to ensure the conditions between repeated measurements were the same [16]. The RC is the value below which the absolute differences between two measurements would lie with 0.95 probability [38] and can be calculated by the formula $\mathrm{RC} = 1.96 \times \sqrt{2} \times s_w \approx 2.77\,s_w$, where $s_w$ is the within-subject standard deviation of the repeated measurements. The CV is calculated as the ratio of the standard deviation (SD) to the mean, expressed as a percentage, with a smaller percentage indicating less variability. The significance level was set at p ≤ 0.05.
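As an illustration of the two repeatability metrics, the following Python sketch computes the CV and RC for one operator's repeated measurements of a single target. It assumes the standard form RC = 1.96 × √2 × s_w, with s_w taken as the sample SD of the repeated measurements; the function names and the sample values are hypothetical and are not data from this study.

```python
import math
import statistics


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation divided by the mean, times 100."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def repeatability_coefficient(values):
    """RC = 1.96 * sqrt(2) * s_w, with s_w the SD of the repeated measurements.

    Assumption: the standard 95% repeatability limit for the absolute
    difference between two measurements taken under identical conditions.
    """
    s_w = statistics.stdev(values)
    return 1.96 * math.sqrt(2.0) * s_w


if __name__ == "__main__":
    # Hypothetical repeated SWV readings (m/s) from one operator on one target.
    swv_readings = [1.80, 1.82, 1.79, 1.81, 1.83]
    print(f"CV = {coefficient_of_variation(swv_readings):.2f} %")
    print(f"RC = {repeatability_coefficient(swv_readings):.3f} m/s")
```

The RC is expressed in the units of the measurement (m/s), whereas the CV is dimensionless, so the CV is the more convenient of the two for comparing targets with very different mean SWV.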

Imaging Depth
The single-shot method on the machine provided a reliable colour filling of the 2D colour map from all the targets at both depths. Figure 3 displays the influence of depth on repeated measurements of SWV (m/s), collected from all 3 operators. The mean SWV recorded from the 10 kPa target at 3 cm and 6 cm was not significantly different (Welch's unequal variances t-test, t(238) = 0.742, p = 0.459). Levene's test revealed that the variances at 3 cm and 6 cm were unequal (F = 9.259, p < 0.01). The mean SWV of the 17 kPa target at 3 cm and 6 cm was also not significantly different (Welch's unequal variances t-test, t(238) = 0.808, p = 0.438), and the variances at 3 cm and 6 cm were unequal (F = 9.162, p < 0.01). All other targets demonstrated a significant difference in mean SWV at the two imaging depths. Beyond 25 kPa, the measured SWV was underestimated with increasing target stiffness. The percentage difference in mean SWV between 3 cm and 6 cm was 6.1%, 15.2% and 18.7% for the 25 kPa, 47 kPa and 85 kPa targets, respectively.
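The Welch comparison reported above can be sketched in Python using only the standard library. The sketch returns the t statistic and the Welch-Satterthwaite degrees of freedom; the p-value is omitted because it requires the t distribution, which the standard library does not provide. The function name and the sample values are hypothetical.

```python
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / (se2_a + se2_b) ** 0.5
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof


if __name__ == "__main__":
    # Hypothetical SWV readings (m/s) of one target at two depths.
    shallow = [1.80, 1.82, 1.79, 1.81, 1.83]
    deep = [1.78, 1.85, 1.74, 1.83, 1.80]
    t_stat, dof = welch_t(shallow, deep)
    print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Unlike the pooled-variance t-test, this form does not assume equal variances in the two groups, which is why it suits the cases where Levene's test indicates unequal variances.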

Inter-Operator Reliability
Overall inter-observer reliability was calculated after the SWV measurements for both depths for all targets were pooled. Inter-observer reliability was 0.976 (95% CI 0.971-0.986) and 0.918 (95% CI 0.878-0.945) at 3 cm and 6 cm, respectively. Inter-observer reliability stratified according to operator experience showed comparable ICC at 3 cm depth but poorer ICC at 6 cm, although still within the excellent range (Table 1).
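For readers unfamiliar with the ICC, the sketch below computes a two-way random effects, absolute agreement coefficient in Python. It assumes the single-measures form, often written ICC(2,1); whether the study reported single or average measures is not stated, so this choice, the function name and the example values are assumptions.

```python
def icc_two_way_random_absolute(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measures.

    ratings: list of rows, one per measured item (e.g. an acquisition),
    each row holding one value per operator.
    """
    n = len(ratings)       # number of measured items
    k = len(ratings[0])    # number of operators
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical SWV values (m/s): rows are items, columns are three operators.
    example = [
        [1.80, 1.82, 1.81],
        [2.35, 2.40, 2.38],
        [2.90, 2.86, 2.88],
        [3.70, 3.95, 3.80],
    ]
    print(f"ICC(2,1) = {icc_two_way_random_absolute(example):.3f}")
```

The absolute agreement form penalises systematic offsets between operators through the between-operator mean square, which is why it is usually the stricter choice compared with a consistency-type ICC.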

Measurement Repeatability
Repeatability ranges using calculations of RC and CV from all operators are summarised in Table 2. Both depth and target type influenced the ability of the system to provide consistent repeated measures. The smallest variation in repeated measurements resulted from the 10 kPa and 17 kPa targets. For both of these targets at 3 cm and 6 cm, the RC ranged from 0.06 to 0.24 m/s and the CV from 1.3% to 5.7%. The greatest variability in repeated measurements resulted from the 47 kPa and 85 kPa targets. The largest variability in repeat measurements resulted from the 85 kPa target at 3 cm depth.

ROI Sampling Size
The mean SWVs from the two sampling methods (ROI 10 mm vs. 5 mm) were compared, and there was no significant difference between these sampling methods except for the 85 kPa target, which demonstrated a significant difference (p < 0.0001). Levene's test revealed variances between 3 cm and 6 cm were equal (F = 0.326, p = 0.569). Results from the two sampling methods are summarised in Table 3.
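Levene's statistic, used above to check homogeneity of variances, can be sketched in Python as follows. This is the mean-centred form; whether the statistical package used in the study applied this exact variant is an assumption, and the function name and sample values are hypothetical. Only the test statistic is returned, since the p-value requires the F distribution.

```python
import statistics


def levene_statistic(*groups):
    """Mean-centred Levene statistic W for two or more groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every value from its own group mean.
    deviations = [[abs(x - statistics.mean(g)) for x in g] for g in groups]
    dev_means = [statistics.mean(d) for d in deviations]
    dev_grand = sum(sum(d) for d in deviations) / n_total

    between = sum(len(d) * (m - dev_grand) ** 2 for d, m in zip(deviations, dev_means))
    within = sum((v - m) ** 2 for d, m in zip(deviations, dev_means) for v in d)
    return (n_total - k) / (k - 1) * between / within


if __name__ == "__main__":
    # Hypothetical SWV readings (m/s) sampled with two ROI sizes.
    roi_5mm = [5.10, 5.40, 4.90, 5.60, 5.20]
    roi_10mm = [5.00, 5.10, 4.95, 5.05, 5.02]
    print(f"W = {levene_statistic(roi_5mm, roi_10mm):.3f}")
```

A large W relative to the F distribution with (k - 1, N - k) degrees of freedom suggests unequal variances, in which case the Welch form of the t-test is the safer comparison of means.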

Discussion
Inter-system variability is a known issue in ultrasound elastography research and may preclude the meaningful comparison of measurements performed on different systems, even when the same phantom [9,18] or patient [19,39] is imaged. As a result, many investigators would agree that imaging protocols must first be standardised on a single system before they are applied clinically [8,10,21,33,40]. This work provides an important analysis of the sources of intra-system variability on a modern 2D-SWE commercial system. To the authors' knowledge, this is the first work to investigate these factors on the popular Canon Aplio i-series (Canon Medical, Otawara-shi, Japan). It provides clinically relevant SWV ranges in which the system performs reliably and highlights factors that can introduce variability.
The influence of sample depth on elastography measurements has previously been reported [12,15,17,18], suggesting a trend towards lower SWV values with increasing acquisition depth [15,17]. As a result, recommendations have been made to include the depth of the target organ or lesion in any standardized protocol for SWE [2]. The results in this study concur with previous data and support this recommendation, with many of the phantom targets displaying lower mean SWV at 6 cm compared to 3 cm. However, it is important to recognise that there is a greater underestimation of SWV with increasing target stiffness. Further, our study demonstrated good agreement for the two softer targets (10 kPa and 17 kPa), with no significant difference between the mean SWV derived from the two depths. Imaging depth introduces a systematic bias, with SWV increasingly underestimated at greater depth, and the literature provides several explanations for this discrepancy. SWV is viscosity and frequency dependent [41], and measurement errors may be related to tissue viscoelasticity. In our study, which used a phantom with low viscosity, the depth-dependent bias is unlikely to be due to viscosity dispersion. SWV is calculated via a time-of-flight (TOF) estimation using several tracking beams to measure the arrival time of the shear wave at locations away from the central axis of excitation [42]. Geometric errors resulting from focusing the acoustic force pulse at different depths and a bias caused by the lateral range of the tracking beams have been described in more technical papers [43]. Transducer shape may also contribute to this bias, with a larger bias detected using curvilinear probes, like the one used in this study, compared to linear transducers [43]. Random errors that reduce measurement precision, caused by the attenuation of the shear wave tracking pulses, have also been implicated in these variations [9].
Increasing depth results in increased signal jitter and noise, causing an underestimation of the measured SWV due to averaging [44]. These types of random errors have also been reported to be more prevalent when imaging stiffer targets, with experiments showing that TOF uncertainty increases with increasing stiffness, resulting in less precise repeated measurements [42]. Our experiment confirms that this still applies to a commercial scanner, demonstrating greater variability of repeated measurements in stiffer than in softer targets. Therefore, we recommend that guidelines involving 2D-SWE consider the relative stiffness of the structure being assessed, and not simply its depth. Our results indicate that higher reliability of SWV measurements can be achieved when imaging softer structures in the range of 1.0-2.0 m/s. This range would be most relevant to clinical investigations involving the liver [2] and placenta [8].
As observed in similar studies, inter-observer agreement was high between all operators in the study [11,12,45,46], consistent with reports of high inter-observer agreement for 2D-SWE in both phantom [12] and clinical studies [11]. The experience of the operator is likely to play a greater role in the clinical setting. In practical terms, the elastography phantom provides a convenient platform to scan, and it proved relatively easy to identify each target and acquire repeated measurements, even for someone new to the technique. In the clinical environment, however, achieving standardised imaging planes is likely to be more difficult because of the variability of human anatomy.
Measurement repeatability is an important consideration for any quantitative measurement system [16]. Repeated measurement calculations in this study were comparable to reports in other studies, albeit on different systems [12,13,39]. This study achieved very low RC and CV values when imaging the 10 kPa and 17 kPa targets, indicating excellent repeatability. However, increasing target stiffness is a significant source of measurement variability, as discussed above. The wider spread of repeated values obtained from the stiffer targets would likely make it difficult to interpret results and to detect differences between normal and abnormal tissues when imaging tissues that produce high SWV values.
An analysis of the mean SWV between the sample ROI sizes of 5 mm and 10 mm demonstrated overall good reliability, with only the 85 kPa target demonstrating a significant difference between means. Greater variability within a larger ROI, and the subsequent averaging, may explain the lower values for the 10 mm ROI compared to the 5 mm ROI. Published guidelines relating to liver elastography promote the use of a 10 mm or larger ROI [2,21]; however, this may not be practical for smaller structures or for organs in which there is larger variability in the B-mode characteristics. One example could be the placenta, which is a large organ but often develops in various configurations, such as variable thickness, and may display areas of grey-scale heterogeneity such as placental lakes [47][48][49]. Therefore, to evaluate such an organ with ultrasound elastography, a larger number of smaller ROIs may be required to provide a better assessment of its overall elastic properties.
There are several limitations to this study. It is important to recognise that this study aimed to investigate which factors contribute most to intra-system variability. This is most easily controlled via measurements obtained from a homogeneous elastography phantom, and the findings therefore may not be fully transferable to the clinical setting. The study phantom material has a low viscosity [50]. Human tissue typically has greater viscosity, which can result in variations in the attenuation and speed of the generated shear wave [6]. Measurements were only acquired at two depths: 3 cm and 6 cm. Increasing depth is a cause of SWV discrepancy when imaging the same target due to shear wave attenuation [13,15]. Thus, it is likely that the intermediate and harder targets may demonstrate less variability at depths between 3 cm and 6 cm; however, this work did not address this. In this study, the greatest intra-system variability occurred because of phantom target depth and stiffness; these factors are likely to be exacerbated in human tissue.

Conclusions
This study demonstrated good reliability of 2D-SWE SWV measurements when imaging soft (10 kPa and 17 kPa) elastography phantom material. Higher SWV measurement variability was evident with increasing depth and stiffness of the phantom targets. The influence of the operator was minimal, and variations in sampling technique had little impact on the SWV for most targets. The reliability achieved in this study of the soft targets is encouraging and may assist the development of reliable imaging protocols for clinical practice when acquiring 2D-SWE measurements.
Author Contributions: C.E.: conceptualization, investigation, methodology, visualization, data curation, software, original draft preparation; E.C.: investigation, methodology, visualization, data curation, review; S.K.: supervision, review and editing; V.C.: resources, project administration, funding acquisition; D.F.: resources, project administration, funding acquisition, supervision, review and editing. All authors have read and agreed to the published version of the manuscript.

Acknowledgments:
The authors acknowledge the support during data collection by Ayaka Johnson.

Conflicts of Interest:
The authors declare no conflict of interest.