High-frame-rate 3-D vector flow imaging in the frequency domain

: Ultrasound vector Doppler techniques for three-dimensional (3-D) blood velocity measurements are currently limited by low temporal resolution and high computational cost. In this paper, an e ﬃ cient 3-D high-frame-rate vector Doppler method, which estimates the displacements in the frequency domain, is proposed. The novel method extends to 3-D an approach so far proposed for two-dimensional (2-D) velocity measurements by approximating the (x, y, z) displacement of a small volume through the displacements estimated for the 2-D regions parallel to the y and x directions, respectively. The new method was tested by simulation and experiments for a 3.7 MHz, 256-element, 2-D piezoelectric sparse spiral array. Simulations were also performed for an equivalent 7 MHz Capacitive Micromachined Ultrasonic Transducer spiral array. The results indicate performance (bias ± standard deviation: 6.5 ± 8.0) comparable to the performance obtained by using a linear array for 2-D velocity measurements. These results are particularly encouraging when considering that sparse arrays were used, which involve a lower signal-to-noise ratio and worse beam characteristics with respect to full 2-D arrays.


Introduction
Estimation of blood flow velocity through Doppler ultrasound (US) [1] has been for a long time limited to the axial component of the velocity vector. The lack of knowledge of the so-called Doppler angle between the US propagation direction and blood flow direction has in fact prevented the accurate estimation of the velocity magnitude. Even if several methods were proposed to overcome this limitation by estimating the two components of the velocity vector lying within the scan (x-z) plane (2-D Doppler), such methods only estimate the velocity in a single sample volume by exploiting triangulation [2][3][4], Doppler bandwidth [5], and transverse oscillations approaches [6,7]. Nonetheless, the introduction of high-frame-rate (HFR) imaging methods, based on plane wave transmission [8][9][10], has paved the way for the extension of vector flow imaging (VFI) to multiple points (e.g., covering a two-dimensional region of interest (ROI)). Several approaches have been so far proposed based on either 2-D speckle tracking [11,12], multiangle Doppler analysis [13][14][15], transverse oscillations [16,17], or color Doppler and continuity equations [18].

High-Volume-Rate Imaging
The imaging method used in this work is based on the transmission (TX) of unsteered plane waves from a 2-D probe and on parallel beamforming in reception (RX). Since no compounding is applied, full volumes can be reconstructed at the maximum rate (i.e., equal to the pulse repetition frequency (PRF)). Then, radiofrequency (RF) volumetric data must be clutter-filtered to remove contributions due to stable or slowly moving tissues. Specifically, hereinafter, since only vessels with rigid walls are considered, clutter removal is obtained by a high-pass filter with a cutoff frequency of 5 Hz. Furthermore, RF echoes are sampled at frequency fs equal to 39 MHz; hence, the distance between pixels in the z direction is δ z = c/(2f s ) = 20 µm. The lateral distance between pixels is the distance between beamformed lines (300 µm) both along the x direction (δ x ) and y direction (δ y ) (see Figure 1).

3-D Displacement Estimation
In order to estimate the velocity components for a specific point of interest P(xp, yp, zp) (i.e., the black dot in Figure 2), the volume of RF data is divided into parallelepipeds, hereinafter referred to as "kernels", containing MB × NB × PB pixels surrounding P. For each kernel, to estimate the average displacement between consecutive TX events, the 2-D VFI algorithm-proposed in [12] and recalled in Appendix A-is independently applied to the two perpendicular planes (yellow and purple in Figure 2) that cross the kernel's vertical (red in Figure 2) axis. Thus, the average (x, z) and (y, z) displacement components are obtained. Note that the displacement component along the z-axis is estimated twice, allowing a more robust assessment by averaging the two estimates. It is worth highlighting that kernels can be arbitrarily located (i.e., they can be either partially overlapped or isolated), only influencing the spatial resolution of vector velocity estimates. In this study, kernels were located on a regular grid and overlapped 80% and 90% in the axial and both lateral directions, respectively, thus giving the same spatial resolution of 0.3 mm along the three directions.

3-D Displacement Estimation
In order to estimate the velocity components for a specific point of interest P(x p , y p , z p ) (i.e., the black dot in Figure 2), the volume of RF data is divided into parallelepipeds, hereinafter referred to as "kernels", containing M B × N B × P B pixels surrounding P. For each kernel, to estimate the average displacement between consecutive TX events, the 2-D VFI algorithm-proposed in [12] and recalled in Appendix A-is independently applied to the two perpendicular planes (yellow and purple in Figure 2) that cross the kernel's vertical (red in Figure 2) axis. Thus, the average (x, z) and (y, z) displacement components are obtained. Note that the displacement component along the z-axis is estimated twice, allowing a more robust assessment by averaging the two estimates. It is worth highlighting that kernels can be arbitrarily located (i.e., they can be either partially overlapped or isolated), only influencing the spatial resolution of vector velocity estimates. In this study, kernels were located on a regular grid and overlapped 80% and 90% in the axial and both lateral directions, respectively, thus giving the same spatial resolution of 0.3 mm along the three directions.

3-D Displacement Estimation
In order to estimate the velocity components for a specific point of interest P(xp, yp, zp) (i.e., the black dot in Figure 2), the volume of RF data is divided into parallelepipeds, hereinafter referred to as "kernels", containing MB × NB × PB pixels surrounding P. For each kernel, to estimate the average displacement between consecutive TX events, the 2-D VFI algorithm-proposed in [12] and recalled in Appendix A-is independently applied to the two perpendicular planes (yellow and purple in Figure 2) that cross the kernel's vertical (red in Figure 2) axis. Thus, the average (x, z) and (y, z) displacement components are obtained. Note that the displacement component along the z-axis is estimated twice, allowing a more robust assessment by averaging the two estimates. It is worth highlighting that kernels can be arbitrarily located (i.e., they can be either partially overlapped or isolated), only influencing the spatial resolution of vector velocity estimates. In this study, kernels were located on a regular grid and overlapped 80% and 90% in the axial and both lateral directions, respectively, thus giving the same spatial resolution of 0.3 mm along the three directions. Given that the developed algorithm is highly parallelizable, in order to compute the 3-D displacement for the whole volume and to keep the computational burden as low as possible, the processing is split into three steps. First, the 2-D VFI algorithm is applied to the planes perpendicular to the y direction to compute both the (x, z) displacement components ( Figure 3A); then, the (y, z) components are similarly obtained by processing the data from planes perpendicular to the x direction ( Figure 3B); finally, the velocity components corresponding to the same kernels are grouped ( Figure 3C). Here, this method was implemented using the basic Matlab (The MathWorks, Natick, MA, USA) functions and by exploiting only the central processing unit (CPU) of the host computer. Given that the developed algorithm is highly parallelizable, in order to compute the 3-D displacement for the whole volume and to keep the computational burden as low as possible, the processing is split into three steps. First, the 2-D VFI algorithm is applied to the planes perpendicular to the y direction to compute both the (x, z) displacement components ( Figure 3A); then, the (y, z) components are similarly obtained by processing the data from planes perpendicular to the x direction ( Figure 3B); finally, the velocity components corresponding to the same kernels are grouped ( Figure 3C). Here, this method was implemented using the basic Matlab (The MathWorks, Natick, MA, USA) functions and by exploiting only the central processing unit (CPU) of the host computer.

Transmission and Reception Settings
The proposed method was validated through simulation and experiments. In both cases, the tests were based on a 256-element gridded spiral array probe built by Vermon (Vermon S.A.,Tours, France) according to our design [34,35], at 3.7 MHz central frequency. The position of the array elements matched that of a spiral pattern of 256 seeds, the density of which was modulated according to a 50% Tukey (tapered cosine) window. Moreover, to test the method at a different frequency (7 MHz), additional simulations were carried out considering the layout of a prototype Capacitive Micromachined Ultrasonic Transducer (CMUT) probe, based on a 256-element spiral array with spatial density tapered by a Blackman window. For both probes, the TX signal was always a 5-cycle sine burst at the central frequency of the probe and weighted by a Hamming window. The PRF was set to 500 Hz. In RX, a dynamic Hamming apodization window with constant f-number = 2 was implemented.

Simulations
Simulations were carried out in Matlab with Field II [36,37]. As sketched in Figure 1, a numeric phantom was developed to simulate a parabolic flow inside a straight, cylindrical vessel with 4 mm radius (R). Its axis was maintained at z0 = 16 mm, while the beam-to-flow angle resulted from different combinations of and angles, as summarized in Table 1. The density of the randomly distributed scatterers was set at 27 scatterers per resolution cell. For each setup configuration, simulations were run to simulate a 1 s long acquisition (i.e., 500 consecutive RF data volumes were produced).

Transmission and Reception Settings
The proposed method was validated through simulation and experiments. In both cases, the tests were based on a 256-element gridded spiral array probe built by Vermon (Vermon S.A.,Tours, France) according to our design [34,35], at 3.7 MHz central frequency. The position of the array elements matched that of a spiral pattern of 256 seeds, the density of which was modulated according to a 50% Tukey (tapered cosine) window. Moreover, to test the method at a different frequency (7 MHz), additional simulations were carried out considering the layout of a prototype Capacitive Micromachined Ultrasonic Transducer (CMUT) probe, based on a 256-element spiral array with spatial density tapered by a Blackman window. For both probes, the TX signal was always a 5-cycle sine burst at the central frequency of the probe and weighted by a Hamming window. The PRF was set to 500 Hz. In RX, a dynamic Hamming apodization window with constant f-number = 2 was implemented.

Simulations
Simulations were carried out in Matlab with Field II [36,37]. As sketched in Figure 1, a numeric phantom was developed to simulate a parabolic flow inside a straight, cylindrical vessel with 4 mm radius (R). Its axis was maintained at z 0 = 16 mm, while the beam-to-flow angle resulted from different combinations of ϑ D and ϑ A angles, as summarized in Table 1. The density of the randomly distributed scatterers was set at 27 scatterers per resolution cell. For each setup configuration, simulations were run to simulate a 1 s long acquisition (i.e., 500 consecutive RF data volumes were produced). Table 1. Imaging, flow, and processing parameters.

Experiments
The experimental tests were based on the ULA-OP 256 [38] (Figure 4), which is a research scanner with 256 independent channels, characterized by full programmability, high flexibility, and access to data at any point of the RX chain. In addition, it embeds high-speed programmable devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), for real-time processing and the implementation of custom, computationally intensive algorithms. For this work, the ULA-OP 256 was used to transmit plane waves and acquire raw RF channel data, which were then postprocessed in Matlab.
In the experiments, a homemade phantom was used, in which a blood-mimicking fluid flew in a cylindrical Rilsan ® pipe immersed in a water tank, as shown in Figure 4. The fluid was obtained by stirring 3 g of 10 µm polyamide spherical particles (Orgasol ® , Arkema Inc., Philadelphia, PA, USA) in 2 L of demineralized water until a homogeneous suspension was achieved (density of 2.6 × 10 10 scatterers/mm 3 ). The flow was forced by a peristaltic pump connected to a pulsation dampener, capable of further reducing possible turbulences. Considering the entrance length of the tube (70 mm), parabolic flow was guaranteed by setting the peak velocity approximately at 7.5 cm/s. The distance and the angle of the Vermon probe with respect to the pipe were set through a custom holder connected to a positioning system. The experimental tests were based on the ULA-OP 256 [38] (Figure 4), which is a research scanner with 256 independent channels, characterized by full programmability, high flexibility, and access to data at any point of the RX chain. In addition, it embeds high-speed programmable devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), for real-time processing and the implementation of custom, computationally intensive algorithms. For this work, the ULA-OP 256 was used to transmit plane waves and acquire raw RF channel data, which were then postprocessed in Matlab. In the experiments, a homemade phantom was used, in which a blood-mimicking fluid flew in a cylindrical Rilsan ® pipe immersed in a water tank, as shown in Figure 4. The fluid was obtained by stirring 3 g of 10 µm polyamide spherical particles (Orgasol ® , Arkema Inc., Philadelphia, PA) in 2 L of demineralized water until a homogeneous suspension was achieved (density of 2.6 × 10 10 scatterers/mm 3 ). The flow was forced by a peristaltic pump connected to a pulsation dampener, capable of further reducing possible turbulences. Considering the entrance length of the tube (70 mm), parabolic flow was guaranteed by setting the peak velocity approximately at 7.5 cm/s. The distance and the angle of the Vermon probe with respect to the pipe were set through a custom holder connected to a positioning system.

Performance Metrics
The accuracy and precision of the 3-D VFI method were assessed in terms of mean relative bias and standard deviation of velocity estimates, as proposed in [29]. For each point of interest P(xp, yp,

Performance Metrics
The accuracy and precision of the 3-D VFI method were assessed in terms of mean relative bias and standard deviation of velocity estimates, as proposed in [29]. For each point of interest P(x p , y p , z p ), located at distances ≤ (0.9 × R) from the flow axis, the mean relative bias of the axial velocity component, v z , is defined as where v z x p , y p , z p andv z x p , y p , z p are the temporal means of the estimated and reference velocity components, respectively, while N p is the number of points of interest.v max is the reference peak velocity (7.5 cm/s). On the other hand, the mean relative standard deviation was calculated as is the standard deviation evaluated for each velocity component averaged over the total number of frames N F . Similarly, mean relative biases and standard deviations for the lateral velocity components, v x and v y , and for the velocity module, v Module , were obtained through equivalent equations. Figure 5 shows the three components and the module of the velocity profiles estimated along the z-axis by using the CMUT and Vermon probes. Six probe-to-flow orientations, distributed by columns, were chosen to test the method when (a) only one velocity component was nonzero (left column), (b) two velocity components were different from zero (columns 2-4), or (c) all three velocity components contributed to the velocity module (columns 5 and 6). In particular, the azimuth angle was set to 45 • in the cases shown in the three rightmost columns, which implies that the lateral and elevation velocity components should be identical.

Simulations
The differences between the profiles produced by the two probes can be appreciated in Figure 5. The velocity components (v x , v y ) were better estimated with the CMUT probe, the profiles of which more closely approached the reference. Similarly, the more pronounced underestimation of the lateral velocity components in the Vermon probe yielded a corresponding underestimation of v Module (bottom panels). It is also worth noting that the estimated lateral velocity components (v x and v y , second and third rows) obtained for ϑ A = 45 • gave similar profiles, confirming their expected symmetry.
To visualize that the method behaved correctly even outside the axis of the probe, the 3-D representations of the flow distributions are shown in Figure 6 for the CMUT probe and for ϑ D = 90 • , ϑ A = 0 • . Here, the surface view of the x-z plane for y = 0 is reported on the left and the y-z plane for x = 0 on the right.     The relative bias, B v , and standard deviation, σ v , evaluated according to Equations (1)-(3) for all simulated conditions are reported in Table 2. For the CMUT probe, B v and σ v were always lower than 9.6% and 11.4%, respectively. On average, B v Module and σ v Module were 6.5% ± 8.0%, respectively. The results obtained with the Vermon probe, in general characterized by larger errors (B v Module and σ v Module resulted, on average, 14.4% ± 8.7%), were, however, less dependent on the beam-to-flow orientation, with a relative bias ranging (for the velocity module) between 12.6% and 16.3%. Table 2. Relative bias and mean standard deviation for simulations.

Experiments
Particular care was taken to manually align the probe to the tube and obtain the same conditions used for simulation. However, from the analysis of the results, we noticed a slight misalignment of about 1 • in a couple of cases, which is within the accuracy of the position system. Hence  Figure 7 shows the velocity profiles experimentally obtained by using the Vermon probe. Qualitatively, the overall trend of the estimated profiles was the same observed from the simulations: v z was estimated with good accuracy, while v z and v y suffered from higher underestimation. The errors reported in Table 3 indicate that, on average, the relative error B v Module = 11.5% was lower (roughly 3%) compared with simulations, even if the standard deviation was higher (10.9%). As highlighted by the plots and confirmed in Table 3, when ϑ A = 45 • , there was no mismatch between the estimates of the v z and v y components. The difference between the bias of the two velocity components was lower than 1%. Appl. Sci. 2020, 10, x FOR PEER REVIEW 9 of 14

Discussion and Conclusions
A novel 3-D high-frame-rate vector flow imaging method has been presented. The method is an extension of the 2-D approach proposed in [12], which estimates the velocity through the 2-D frequency domain phase shifts corresponding to the displacement of 2-D kernels extracted from consecutive RF frames. The proposed approach involves the transmission of unsteered plane waves from a bidimensional probe and parallel beamforming in reception. Then, the volume of beamformed data is divided into 3-D kernels and, for each of them, the 3-D vector flow is estimated.
Simulations were performed for two 256-element spiral probes having comparable dimensions (~1 cm): the first one was based on a 1024-element piezoelectric probe manufactured by Vermon; the second one was based on a CMUT prototype that is currently under development [39]. These probes were chosen since they are illustrative of the performance achievable by the proposed method at different frequencies (3.7 and 7.0 MHz). Furthermore, it was possible to connect the Vermon probe to the ULA-OP 256 research scanner to perform experimental tests. Simulations and experiments were conducted at depths around 16 mm, which are compatible with both probes and with possible peripheral vascular applications. The performance was assessed, for different beam-to-flow angles,   11.5 ± 10.9

Discussion and Conclusions
A novel 3-D high-frame-rate vector flow imaging method has been presented. The method is an extension of the 2-D approach proposed in [12], which estimates the velocity through the 2-D frequency domain phase shifts corresponding to the displacement of 2-D kernels extracted from consecutive RF frames. The proposed approach involves the transmission of unsteered plane waves from a bidimensional probe and parallel beamforming in reception. Then, the volume of beamformed data is divided into 3-D kernels and, for each of them, the 3-D vector flow is estimated.
Simulations were performed for two 256-element spiral probes having comparable dimensions (~1 cm): the first one was based on a 1024-element piezoelectric probe manufactured by Vermon; the second one was based on a CMUT prototype that is currently under development [39]. These probes were chosen since they are illustrative of the performance achievable by the proposed method at different frequencies (3.7 and 7.0 MHz). Furthermore, it was possible to connect the Vermon probe to the ULA-OP 256 research scanner to perform experimental tests. Simulations and experiments were conducted at depths around 16 mm, which are compatible with both probes and with possible peripheral vascular applications. The performance was assessed, for different beam-to-flow angles, in terms of bias and standard deviation. Even if both experiments and simulations were conducted for a maximum flow speed of 7.5 cm/s-suggested by the limited entrance length of the experimental flow phantom-the PRF was accordingly set to a low value of 500 Hz so that, in the worst case (ϑ D = 80 • for the high-frequency CMUT probe), the produced Doppler shift along the z direction was about 50% the aliasing limit. Hence, the obtained results indicate that the possible proximity to the aliasing limit is not a crucial issue in the proposed method.
As reported in Table 2 simulations have shown that the CMUT probe, in the presented cases, performed better than the Vermon one (mean bias ± standard deviation: 6.5% ± 8.0% vs 14.4% ± 8.7%). This difference can be attributed to the characteristics of the two probes, namely, the similar equivalent apertures (~6 mm) and different central frequencies, which are known to give better spatial resolution for higher-frequency (CMUT) probes. In addition, the CMUT probe results are consistent with those (10% ± 5%) obtained for the 2-D method with a 6 MHz linear array probe [12]. This is particularly remarkable considering the noticeable differences between the arrays used in the two cases, which involve elements of dimensions passing from 0.2 × 6 mm for the linear array to about 0.3 × 0.3 mm for the 2-D arrays.
The Vermon experimental results confirmed the simulation results, with a slightly better bias but a slightly higher standard deviation, which was probably due to the nonideality of the measurement setup. In particular, the peristaltic pump introduced flow fluctuations that could not be sufficiently attenuated with the flow stabilization system. Even if a bigger pulsation dampener or a higher hydraulic resistance might have been used to limit the flow fluctuations, we believe that the nonideality of the experimental setting had a limited impact on the final results and did not affect their value.
The computational cost of the proposed method can be assessed by referring to the analysis carried out for the 2-D method [12]. Indeed, as the 3-D VFI is a two-step extension of the 2-D method, the overall computational cost remains of the order O(N). This is much lower than the 3-D cross-correlation-based algorithms, natively O(N 3 ), which may be decreased through appropriate optimizations to reach an order of complexity of at least O(N 2 ). Even if the proposed method was tested offline (i.e., by postprocessing the acquired raw data), it is part of the "high-frame-rate" family, since one full (Doppler) frame can be obtained after each TX event. Furthermore, the algorithm for 3-D VFI is highly parallelizable and is suited for implementation on GPUs, as already shown for 2-D VFI in [40]. The main limitations of parallel computing are linked to data transfer rates and the number of global memory accesses, which both need the proper and careful design of memory coalescing. Considering also the increasing computational power of GPU-based devices, in the near future, the implementation of 3-D VFI should be feasible in real time, targeting input frame rates of at least 7 kHz (i.e., sufficient to estimate flow speeds up to 100 cm/s).
One of the main limitations of this work was using unsteered plane waves, which limited the extension of the investigated region due to the small size of the probe aperture. For example, as seen in the right panel of Figure 6, the paraboloid representative of the velocity module was partially cut off, since only 3.6 mm (about one-half of the pipe diameter) could be investigated along both the x and y axes. A possible alternative to extend the region of interest could be the transmission of diverging waves, although it should be taken into account that they involve a worse signal-to-noise ratio (SNR) [41,42].
Finally, it has to be highlighted that, in this study, the novel method was implemented by using sparse spiral arrays rather than full 2-D arrays [24], which allowed us to perform the first experimental tests. However, it unavoidably involved a reduced SNR and a lower sensitivity to blood flow. This could be partially compensated by using coded imaging techniques [43] or in-probe preamplifiers. Likewise, doubling the number of active elements by exploiting two synchronized ULA-OP 256 systems [34,35] could improve the system sensitivity and, at the same time, counteract the unavoidable grating lobes in the transmitted ultrasound beams associated with spiral arrays [42]. By taking these aspects into considerations, the results of this paper appear extremely encouraging and confirm that sparse arrays may be a relatively cheap solution for the implementation of 3-D imaging on systems based on a limited number of channels.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
In the 2-D VFI HFR method proposed in [12], each RF frame is divided into (possibly partially overlapped) blocks of M B × N B pixels, which are weighted with a Hann's window to limit deleterious edge effects. The same-located blocks of two consecutive frames, s mn ands mn , are obtained at time t and t = t + τ, respectively. If the scatterers inside a block cover a relatively short distance during the interval τ,s mn can be considered as a 2-D shifted version of s mn . Hence, the ∆z and ∆x displacements can be computed from a 1-D discrete Fourier transform (DFT) for the z direction and a 2-D DFT for the x direction, as follows: where f m and f n are the spatial frequencies in the z and x directions, respectively, while ∆φ f m and ∆φ f m f n are the phase shifts in the frequency domain. In this work, displacements were estimated at multiple frequencies to improve the robustness of estimates. Hence, k f m = 3 frequencies were uniformly distributed on a 20% bandwidth around f 0 , while k f n = 10 frequencies were distributed on negative and positive spatial frequencies in a bandwidth between 10% and 40% of the Shannon's spatial frequency. Then, the displacements were averaged as follows: Finally, the velocity components were given by v z = ∆z δz τ v x = ∆x δx τ (A5)