1. Introduction
Abdominal ultrasound (US) requires a large field-of-view with high image quality at all depths because abdominal organs examined by US imaging are of various sizes and located at various depths [
1]. For example, the gallbladder and common bile duct, which are located at shallow depths (2–7 cm), need to be reconstructed with a sufficiently high spatial resolution to estimate the wall thickness, while the liver and kidney, which are located at both shallow and deep depths (5–20 cm), also require high spatial and contrast resolutions. When conventional focusing (CF) is used, however, the transmit beam has a fixed focal depth, which exhibits a higher spatial resolution and contrast of the image in the vicinity of the focal depth but lower image quality at other depths. Consequently, when scanning the entire abdomen, clinicians must constantly adjust the focal depth up and down to the region of interest with one hand while holding a transducer with the other hand. This constant manual adjustment of the focus prolongs the examination time.
A simple method of enhancing the image quality over various depths is to increase the number of transmit foci per scanline [
2]. In the multifocus technique, multiple beams focused at different depths are successively transmitted to reconstruct a single scanline of an image. Consequently, the frame rate decreases in inverse proportion to the number of foci. Although this technique is commonly used for linear array imaging with a short view depth (<8 cm), it is difficult to apply to abdominal US imaging, where the view depth exceeds 15 cm, because of the long acquisition time. If three foci per scanline are used for a B-mode image with 256 scanlines and a depth of 20 cm, the frame rate drops to approximately 5 Hz, which is too slow for real-time US scanning.
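The 5 Hz figure can be reproduced with a short calculation. The sketch below assumes a sound speed of 1540 m/s (the value used later in Section 2.1) and no idle time between emissions; the function name `multifocus_frame_rate` is illustrative only.

```python
def multifocus_frame_rate(depth_m, n_scanlines, n_foci, sound_speed=1540.0):
    """Frame rate (Hz) when every scanline needs one emission per focus.

    Assumes each emission occupies exactly one round trip to the maximum
    depth, with no extra time margin (a simplifying assumption).
    """
    round_trip_s = 2.0 * depth_m / sound_speed  # pulse travels down and back
    emissions_per_frame = n_scanlines * n_foci
    return 1.0 / (emissions_per_frame * round_trip_s)


three_foci = multifocus_frame_rate(depth_m=0.20, n_scanlines=256, n_foci=3)
single_focus = multifocus_frame_rate(depth_m=0.20, n_scanlines=256, n_foci=1)
print(f"3 foci: {three_foci:.1f} Hz, 1 focus: {single_focus:.1f} Hz")
```

Under these assumptions, a single focus gives roughly three times the frame rate of the three-focus case, which is the inverse proportionality described above.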
Another method is synthetic transmit focusing (STF), in which multiple low-resolution images obtained by transmitting an unfocused beam or a widely diverging beam before and after a tight focus are compounded coherently to achieve dynamic transmit focusing at every imaging point. Synthetic aperture (SA) imaging is the most investigated and widespread STF technique. In SA imaging, a virtual source (VS) is usually generated in front of the array transducer, and spherical wavefronts before and after the VSs are used for STF [
3,
4,
5]. Since it was adapted for medical imaging in the 1980s–90s [
3,
6,
7], numerous studies have verified that SA imaging provides a high-resolution image over all depths compared to conventional focusing [
8,
9,
10,
11]. Consequently, it has been implemented as an innovative advanced beamformer on commercial high-end US systems, such as
nSIGHT by Philips or cSound by GE [
12,
13]. However, this technique requires the reconstruction of dozens to hundreds of times more scanlines per frame than the CF method, which substantially increases the computational cost. In addition, it suffers from motion artifacts because high-quality SA imaging often requires approximately 100 emissions because of grating lobe problems [
8,
14], and tissue or hand motion could cause incoherence among the low-resolution images that are subsequently compounded and lower the performance of dynamic transmit focusing. Moreover, motion artifacts can be more problematic in abdominal ultrasonography due to the large field-of-view and long acquisition time, which hinders the fast scanning and capturing of various abdominal organs.
When VSs are set behind the transducer array, diverging waves (DWs) are transmitted through multiple elements covering a broad imaging region. In DW imaging (DWI), a small number of DWs, usually 3–20, with different VS positions in the lateral direction are transmitted and coherently compounded for a single frame [
15,
16,
17], thus leading to fast acquisition and fewer motion artifacts. Owing to the broad coverage of DWs, DWI is generally used for fast cardiac imaging with a phased array [
16,
18,
19,
20,
21]. Because DWI offers dynamic transmit focusing with a small number of emissions, it can also be used to enhance the image quality of abdominal US, avoiding significant motion artifacts. Consequently, a recent paper suggested the use of DW for ultrafast abdominal imaging [
22].
When the VSs are placed at infinity behind the array, plane waves (PWs) are transmitted. PW imaging (PWI) employs STF using the PWs steered at different angles [
23]. It has been widely used as a means of ultrafast imaging for shear wave elastography [
24,
25,
26] and blood flow imaging [
27,
28] or fast 3D scanning [
29,
30,
31]. Although most PWI studies focus only on its ultrafast imaging capability, a few studies have reported its potential for high-resolution B-mode imaging [
23,
26,
32]. In addition, PWI is more advantageous than DWI in terms of the spatial resolution [
19,
22] because PW STF can theoretically provide a uniform beam width over all depths [
26,
33]. However, PWI has not been widely used for large sectorial imaging because of the small coverage of PW compared to that of DW, which might be one of the reasons why PWI has not been implemented for abdominal imaging. To test the feasibility of PWI for this unexplored application, in our earlier study [
34,
35], we optimized PWI for the sectorial field-of-view and conducted simulation and phantom experiments. That study showed that PWI enhances both the image quality and the frame rate in convex array imaging, with results similar to those of linear array imaging, when the PW angles, transmit aperture size, and synthesized PWs for each imaging point are properly selected.
The objective of this paper is to test the clinical applicability of the optimized PWI for abdominal ultrasonography by evaluating its in vivo image quality and comparing its quality with that of DWI and CF. To the best of our knowledge, this is the first in vivo study of PWI for abdominal ultrasonic imaging. Phantom images, in vivo images, and videos of the liver, kidney, and gallbladder of 30 healthy volunteers were obtained using PWI, DWI, and CF. First, the phantom images were used to measure the spatial resolution and image contrast for the quantitative evaluation. Then, in vivo abdominal US images were acquired from 30 healthy volunteers and assessed by three radiologists in terms of spatial resolution, image contrast, noise, and artifacts. In addition, in vivo video clips were also evaluated by radiologists to assess image quality under hand motion.
2. Materials and Methods
2.1. Imaging Techniques
Three imaging techniques were compared: CF, DWI, and PWI. The transmit configuration of each imaging technique is listed in
Table 1. In CF imaging, traditional line-by-line scanning was performed using a focused beam with a focal depth of 10 cm and an F-number of 5.0. The focal depth and the F-number were chosen so that the spatial resolution was as uniform as possible over the entire field-of-view, allowing a fair comparison with the two-way dynamic focusing techniques (DWI and PWI). In addition, 128 focused beams were used for CF, and 32 VSs and 32 PWs were employed for DWI and PWI, respectively.
Figure 1 illustrates the VSs used for DWI and PWs used for PWI. The red dots in
Figure 1a show the 32 VSs, and the orange arcs represent the propagation of a DW originated from the first VS over time. The red lines in
Figure 1b show the 32 steered PWs, and the orange lines illustrate the propagating PW. The blue arrows in
Figure 1 show the directions of the DW and PW. The VS positions (
Table 1) were chosen to employ the full aperture (yellow shaded area in
Figure 1a) for each DW transmission. The outermost VS was positioned at
x = ± 20 mm (
Figure S1) considering the 6 dB acceptance angle of the element of the convex array transducer used in this experiment. In PWI, the transmit aperture size was limited, indicated by a yellow shaded area in
Figure 1b, such that it did not exceed the acceptance angle of the transducer element for the optimal PWI as proposed in our previous study [
35].
In sector imaging, PW has a smaller beam propagation region (smaller coverage) compared with DW (orange shaded areas in
Figure 1) because of the limited transmit aperture and a constant direction of propagation. Although this property reduces the number of synthesized waves per imaging pixel, a PW does not diverge and thus maintains its intensity throughout propagation, leading to a greater penetration depth than that of DW. More importantly, the small propagation region reduces the number of computations required in beamforming.
Table 1 shows the normalized number of computations for each imaging technique when a B-mode image with 1039 × 256 pixels was reconstructed. This value was obtained by normalizing the number of beamforming (channel-summation) operations required for a single compounded image using DWI or PWI by the value required for a single image using CF. As is well known, synthetic imaging (DWI and PWI) requires many more computations than traditional focusing (CF). Note that PWI with 32 PWs requires 2.9 times fewer computations than DWI with 32 VSs, as shown in
Table 1.
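Given the imaging depth d, the sound speed c, and the time margin before the next emission τ, the acquisition rate for a single frame can be calculated by
$$\mathrm{FR}_{\mathrm{acq}} = \frac{1}{N\left(2d/c + \tau\right)},$$
where N is the number of transmissions per frame (128 for CF and 32 for DWI and PWI).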
The acquisition frame rates of CF, DWI, and PWI are 31.9 fps, 127.7 fps, and 127.7 fps, respectively, when d = 15 cm, c = 1540 m/s, and τ = 50 μs (Table 1). However, during the in vivo data acquisition with the US system, the display frame rates (i.e., the frame rates at which the B-mode image was updated on the screen) were 26.6 fps, 10 fps, and 15 fps for CF, DWI, and PWI, respectively, which were lower than the acquisition rates because of the limited number of receive channels and computing power of the system. Although 2.9 times fewer computations were required for PWI than for DWI (Table 1), the display frame rate of PWI supported by the system was only 1.5 times higher than that of DWI because of the limited channel count of the system. The reasons for the low display frame rates are further explained in the Discussion section. Note, however, that the display frame rate could be raised up to the acquisition frame rate by improving the computational algorithms in the beamforming process and upgrading the computational resources of the US system.
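The reported acquisition frame rates can be checked numerically. The sketch below assumes that each emission occupies one round trip to depth d plus the margin τ, so that the rate is 1/[N(2d/c + τ)] for N emissions per frame; this form is an assumption, chosen because it reproduces the values quoted above. The function name is illustrative.

```python
def acquisition_frame_rate(n_emissions, depth_m, sound_speed, margin_s):
    """Acquisition frame rate (fps) for n_emissions transmit events per frame."""
    time_per_emission_s = 2.0 * depth_m / sound_speed + margin_s
    return 1.0 / (n_emissions * time_per_emission_s)


# Emission counts taken from the text: 128 focused beams, 32 VSs, 32 PWs.
emissions = {"CF": 128, "DWI": 32, "PWI": 32}
rates = {
    name: acquisition_frame_rate(n, depth_m=0.15, sound_speed=1540.0, margin_s=50e-6)
    for name, n in emissions.items()
}
for name, fps in rates.items():
    print(f"{name}: {fps:.1f} fps")
```

With d = 15 cm, c = 1540 m/s, and τ = 50 μs, this gives about 31.9 fps for CF and 127.7 fps for DWI and PWI.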
For all the imaging techniques, the transmit voltage was 80 V, the receive F-number was 1.0, and the 50% Tukey window was used for the receive apodization in the beamforming process. In DWI and PWI, only the imaging points reached by the DW or PW were calculated as the low-resolution image and compounded for the final image as described in [
35].
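The receive apodization described above can be sketched as follows. The example assumes a flat element layout (the actual transducer is convex), a hypothetical element spacing, and that "50% Tukey" means a taper fraction of 0.5; all function names are illustrative and not taken from the system software.

```python
import math


def tukey_weight(u, alpha=0.5):
    """Tukey window value at normalized position u (0 = aperture centre, |u| = 1 = edge)."""
    a = abs(u)
    if a > 1.0:
        return 0.0  # element lies outside the active receive aperture
    if a <= 1.0 - alpha:
        return 1.0  # flat central part of the window
    # cosine taper over the outer `alpha` fraction of the half-aperture
    return 0.5 * (1.0 + math.cos(math.pi * (a - (1.0 - alpha)) / alpha))


def receive_apodization(element_x_mm, point_x_mm, depth_mm, f_number=1.0, alpha=0.5):
    """Apodization weights for one imaging point; aperture width = depth / F-number."""
    half_aperture_mm = depth_mm / (2.0 * f_number)
    return [tukey_weight((x - point_x_mm) / half_aperture_mm, alpha) for x in element_x_mm]


# Hypothetical layout: 9 elements spaced 2.5 mm apart, centred on x = 0.
elements = [2.5 * (i - 4) for i in range(9)]
weights = receive_apodization(elements, point_x_mm=0.0, depth_mm=20.0)
print([round(w, 2) for w in weights])
```

Because the aperture grows linearly with depth at a fixed F-number, more elements contribute to deeper imaging points until the physical array width is reached.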
2.2. System and Method for Data Acquisition
A research US system (E-cube 12R, Alpinion Medical Systems, Republic of Korea) with a convex array transducer (SC1-6, Alpinion Medical Systems, Republic of Korea) was used for data acquisition. The transducer has 128 elements and a center frequency of 3.6 MHz, and the system has a 128-channel transmit board and a 64-channel receive board. For DWI and PWI, which require full-channel reception, the same emission was repeated twice so that the echoes could be received by the first and second sets of 64 channels in turn. Thus, unfortunately, the acquisition frame rate of DWI and PWI in this study was half the maximum acquisition frame rate (127.7/2 = 63.85 fps). Beamforming and image processing were conducted on a graphics processing unit (GPU) (GeForce GTX 1080, NVIDIA, CA, USA) installed in the system by using the CUDA computing platform.
A commercial phantom (Model 539, ATS laboratories Inc., Bridgeport, CA, USA) was used for the phantom study. For the phantom images of CF, DWI, and PWI, radio-frequency (RF) data were acquired by fixing the transducer on the phantom. A cross-section of the phantom including point and cyst targets was selected, and three images of the same cross-section were reconstructed using the three imaging techniques (CF, DWI, and PWI).
In vivo abdominal ultrasonic images of the gallbladder, liver, and kidney were collected from 30 healthy male volunteers by a radiologist under institutional review board approval at Seoul Saint Mary’s Hospital. Written informed consent was obtained from all volunteers. The radiologist obtained abdominal ultrasonic images and videos of each volunteer using CF, DWI, and PWI sequentially, keeping the three images or videos (CF, DWI, and PWI) on the same cross-section as closely as possible. Misaligned sets were excluded, and 52 image sets (a total of 156 images) were evaluated: 14 sets for the gallbladder, 18 sets for the liver, and 20 sets for the kidney. For the video evaluation, 22 video sets (a total of 66 clips) were chosen, and each video contains real-time images of the right hepatic lobe, gallbladder, and right kidney. The three video clips (CF, DWI, and PWI) of each set were synchronized to show the same cross-section at the same time point as closely as possible. The length of the synchronized videos was between 3 and 9 s.
2.3. Beamforming and Postprocessing of Image for the Evaluation
For the still images, the RF channel data were stored and beamforming and postprocessing were conducted offline. In the beamforming process, the RF data were demodulated to the base band, downsampled by a factor of 4, and then beamformed using the parameters shown in
Section 2.1. To flatten the uneven brightness of the image across depths within an image and across different imaging techniques, automatic time-gain-compensation (TGC) was applied to all the images as in [
8]. The imaging region was axially divided into 5 zones, and the 5 representative gain values were determined by the reciprocal of the median brightness of each zone. TGC was applied after the spline interpolation of the 5 gain values.
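The automatic TGC steps above can be sketched in a few lines. To stay within the Python standard library, this sketch swaps the spline interpolation for plain linear interpolation between zone centres, which is a deliberate simplification and not the method used in the study. It also assumes a strictly positive envelope image stored as a list of rows (depth along the first index) with at least as many rows as zones; the function name is illustrative.

```python
from statistics import median


def auto_tgc_gains(image, n_zones=5):
    """Per-row gain curve from reciprocal zone medians (linear interpolation)."""
    n_rows = len(image)
    edges = [round(k * n_rows / n_zones) for k in range(n_zones + 1)]
    centres, gains = [], []
    for k in range(n_zones):
        zone = [v for row in image[edges[k]:edges[k + 1]] for v in row]
        centres.append(0.5 * (edges[k] + edges[k + 1] - 1))  # middle row of the zone
        gains.append(1.0 / median(zone))  # reciprocal of the median brightness
    curve = []
    for r in range(n_rows):
        if r <= centres[0]:
            curve.append(gains[0])  # hold the first gain above the first centre
        elif r >= centres[-1]:
            curve.append(gains[-1])  # hold the last gain below the last centre
        else:
            k = max(i for i in range(n_zones - 1) if centres[i] <= r)
            t = (r - centres[k]) / (centres[k + 1] - centres[k])
            curve.append((1.0 - t) * gains[k] + t * gains[k + 1])
    return curve


# Synthetic image: 10 rows x 4 columns whose brightness halves every 2 rows.
image = [[2.0 ** -(r // 2)] * 4 for r in range(10)]
curve = auto_tgc_gains(image)
flattened = [[v * g for v in row] for row, g in zip(image, curve)]
print([round(g, 2) for g in curve])
```

In this synthetic case, the gain rises with depth to offset the decaying brightness, which is the flattening effect the automatic TGC is meant to achieve.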
In the log compression, which highly affects the contrast of an image, the max value was automatically chosen to be 50 dB and 40 dB above the median brightness of the entire image for the phantom and in vivo images, respectively. The dynamic range was 80 dB and 57 dB for the phantom and in vivo images, respectively.
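The log-compression settings can be illustrated with a minimal sketch. It assumes that the input is a flat list of positive envelope amplitudes (hence the factor of 20 in the decibel conversion) and that the output is a display value clipped to [0, 1]; both are assumptions for illustration, as is the function name.

```python
import math
from statistics import median


def log_compress(envelope, max_above_median_db, dynamic_range_db):
    """Map envelope amplitudes to display values in [0, 1]."""
    db = [20.0 * math.log10(v) for v in envelope]
    top_db = median(db) + max_above_median_db  # brightest displayed level
    bottom_db = top_db - dynamic_range_db      # darkest displayed level
    return [min(1.0, max(0.0, (x - bottom_db) / dynamic_range_db)) for x in db]


# In vivo settings from the text: maximum 40 dB above the median, 57 dB dynamic range.
samples = [1.0, 10.0, 100.0, 1000.0, 0.001]  # hypothetical amplitudes
display = log_compress(samples, max_above_median_db=40.0, dynamic_range_db=57.0)
print([round(v, 3) for v in display])
```

Anchoring the maximum to the median brightness makes the displayed range follow the overall image level, so images from different techniques are compressed consistently.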
The RF channel data for in vivo videos could not be stored due to the limited storage capacity. The videos were obtained by recording displayed B-mode images on the screen. Because the automatic TGC was not implemented on the online reconstruction software in the system and the radiologist arbitrarily adjusted the gain during the acquisition, the brightness of the on-screen images among DWI, PWI, and CF was quite different. Thus, the automatic TGC was applied on a log scale to the recorded video clips. For this reason, unfortunately, the image contrast of video could not be evaluated because the brightness of the screen-captured video was already clipped with different ranges before the post TGC control.
2.4. Image and Video Evaluation
Three radiologists with 10 years, 8 years, and 5 years of abdominal ultrasonography experience assessed the phantom images and the in vivo images and videos of the human abdomen. The radiologists were asked to score each image or video on a 5-point Likert scale (1: very poor, 2: poor, 3: average, 4: good, and 5: very good) in terms of 4 evaluation items (‘spatial resolution’, ‘contrast’, ‘noise’, and ‘artifacts’). The videos were not assessed in terms of ‘contrast’ because some grayscale values were saturated due to the unavailability of raw data as described in
Section 2.3.
In the phantom study, 3 images (1 set) of a cross-section of the phantom were reconstructed using CF, DWI, and PWI. The 3 images were randomly ordered without labels and presented to the evaluators. For the in vivo study, 156 images (52 sets) were randomly ordered and evaluated individually without any information about the patients and imaging techniques. For the assessment of in vivo videos (22 sets), the three synchronized videos of each set were played together side by side in random order. The radiologists could rewind and play back the videos freely during the assessment.
For the phantom study, the spatial resolution and contrast were also quantitatively measured. The spatial resolution was measured as the lateral width of the −6 dB contour of a point target [35]. The contrast ratio (CR) was calculated from the mean intensities of the background speckle and cyst regions [36].
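A contrast measurement of this kind can be sketched as below. The exact expression used in the study is not recoverable from the text here, so the sketch assumes one common definition, CR = 20 log10(mean background / mean cyst) in decibels for amplitude data; the function name and sample values are hypothetical.

```python
import math
from statistics import mean


def contrast_ratio_db(background, cyst):
    """Contrast ratio in dB, assuming CR = 20*log10(mean_background / mean_cyst)."""
    return 20.0 * math.log10(mean(background) / mean(cyst))


# Hypothetical envelope samples from a speckle region and an anechoic cyst.
background_samples = [1.0, 1.2, 0.8]
cyst_samples = [0.10, 0.12, 0.08]
cr_db = contrast_ratio_db(background_samples, cyst_samples)
print(f"CR = {cr_db:.1f} dB")
```

With this definition, a darker cyst relative to the surrounding speckle yields a larger CR.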
2.5. Statistical Analysis
The Wilcoxon rank-sum test was used because it is known to be suitable for a Likert scale evaluation [
37,
38]. Because the absolute Likert scale values depend strongly on each person’s interpretation of the scale, the test was applied to each evaluator’s scores separately. Three pairs of data (CF versus (vs.) DWI, CF vs. PWI, and DWI vs. PWI) were tested to determine whether PWI offers better image quality than CF imaging and performance comparable to that of DWI. The mean score difference between each pair of imaging techniques was also obtained. For example, the mean score difference between PWI and DWI (P vs. D) was calculated as
$$\overline{\Delta S}_{P\,\mathrm{vs.}\,D} = \frac{1}{N}\sum_{n=1}^{N}\left(S_{P}^{(n)} - S_{D}^{(n)}\right),$$
where $S_{P}^{(n)}$ and $S_{D}^{(n)}$ are the scores of the n-th image or video clip reconstructed by PWI and DWI, respectively, and $N$ is the number of images or video clips.
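The mean score difference for one evaluator can be computed as in the sketch below. The Likert scores are made up for illustration and are not data from the study; the function name is likewise hypothetical.

```python
def mean_score_difference(scores_a, scores_b):
    """Mean of the per-item score differences (a minus b) for one evaluator."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both techniques must be scored on the same items")
    return sum(a - b for a, b in zip(scores_a, scores_b)) / len(scores_a)


# Hypothetical 5-point Likert scores for four image sets from one radiologist.
pwi_scores = [4, 5, 3, 4]
dwi_scores = [3, 4, 3, 4]
p_vs_d = mean_score_difference(pwi_scores, dwi_scores)
print(f"P vs. D: {p_vs_d:+.2f}")
```

A positive value means that the first technique was rated higher on average; computing it per evaluator avoids mixing different interpretations of the scale.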
4. Discussion
In this paper, we demonstrated that PWI (1) provides significantly enhanced image quality with a 4-fold higher acquisition rate than line-by-line CF and (2) provides comparable performance with 2.9 times fewer computations than DWI, based on quantitative and qualitative evaluations of phantom and in vivo images. In the phantom study, the spatial resolution at depths ≥ 100 mm was enhanced (~0.5 mm) and the contrast of cyst targets was improved (~2 dB higher on average) when using DWI and PWI compared with CF (
Figure 3 and
Figure 4,
Table 2). In the in vivo study, the radiologists assessed the still images of 52 sets and the video clips of 22 sets, including liver, gallbladder, and kidney.
Comparing PWI and CF, in the image evaluation (
Figure 5 and
Figure 6), radiologist 1 rated PWI significantly higher than CF for all evaluation items and radiologists 2 and 3 recognized the significantly improved image quality of PWI in terms of ‘resolution’, ‘contrast’, and ‘noise’ items (
p < 0.05). In the video evaluation (
Video S1 and
Figure 7), radiologist 1 found a significant enhancement in PWI in terms of ‘resolution’, ‘contrast’, and ‘noise’, while radiologist 2 found significant enhancements in terms of ‘noise’ compared to CF.
In addition to enhanced image quality, the fast acquisition rate is another advantage of PWI over CF. As the number of transmissions in PWI is 4 times lower than that in CF (Table 1), the acquisition rate, which is bounded by the speed of sound in tissue, is 4-fold higher than that of CF. This advantage of using a small number of emissions reduces the likelihood of motion artifacts, such as blurring and distortion, which are major issues in synthetic imaging.
A comparison between PWI and DWI showed that PWI had slightly better spatial resolution and DWI had slightly better contrast and reduced noise and artifacts (
Figure 6 and
Figure 7). Similar results were reported by Tong et al. [
19] and Kang et al. [
22]. However, the score differences between PWI and DWI were quite small and none were significant (
p > 0.1,
Table 3 and
Table 4). Therefore, these findings imply that PWI is able to provide a comparable image quality to DWI in sector imaging.
More importantly, PWI required approximately 3 times fewer computations than DWI (Table 1). For sector imaging, DW is usually chosen to achieve dynamic transmit focusing, which might be related to the larger field-of-view of sector imaging compared with linear-scan imaging and the broader coverage region (beam propagation region in
Figure 1) of DW compared with PW. In this paper, however, we found that PWI can provide comparable image quality with a much lower amount of computations compared to that of DWI when the PW angles and transmit aperture size are carefully selected as in [
35].
4.1. Dependence on Evaluators
From the statistical analysis of the in vivo images and videos (
Table 3 and
Table 4), significant differences were found most often in the assessment of radiologist 1, while the fewest significant differences among the three radiologists were found in the evaluation results of radiologist 3. This outcome might be associated with the evaluators’ clinical experience. Radiologists 1, 2, and 3 had 10 years, 8 years, and 5 years of experience, respectively, and the most experienced radiologist gave the scores with the largest variance (the variance in the image evaluation scores was 1.16, 0.79, and 0.79 for radiologists 1, 2, and 3, respectively). The more experienced radiologists might have assessed the images with greater confidence, resulting in more significant differences in many items.
4.2. Real-time Realization
Similar to other STF imaging techniques, DWI and PWI require massive computations because dozens of scanlines should be reconstructed per single transmission and reception event, while CF requires a one- or two-scanline reconstruction per event (
Table 1). Thus, this computational load makes the real-time implementation of STF imaging challenging, although both DWI and PWI have a high acquisition frame rate. In this case, the lower number of computations of PWI compared with DWI can be beneficial.
Parallel processors can be successfully utilized for STF imaging to accelerate the reconstruction process because beamforming intrinsically performs the same operation on multiple data points. Software-based beamformers based on GPUs have been widely employed for STF imaging [
13,
39,
40] as well as for conventional B-mode imaging, functional imaging, or three-dimensional imaging [
29,
30,
41,
42]. We also utilized a GPU for fast reconstruction of DWI and PWI. Although the display frame rate (real-time frame rate) of the system used in this study fell short of the acquisition frame rate, the process could be accelerated if the system supports a full channel reception and the online B-mode reconstruction software is further optimized, such as by using concurrent data copy and kernel execution. Indeed, using GeForce GTX 1080, it took 41.3 ms and 14.6 ms to compute a single synthesized (i.e., compounded) frame from channel data for DWI with 32 VSs and PWI with 32 PWs, respectively. Considering that parallel computing and data transfer technology is rapidly advancing, PWI with at least a 60-fps frame rate will soon be achievable.
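The reported reconstruction times translate into upper bounds on the real-time frame rate. The sketch below assumes that frame reconstruction is the only bottleneck (data transfer and display overheads are ignored), so the numbers are optimistic limits, not measured rates; the function name is illustrative.

```python
def max_frame_rate(reconstruction_ms):
    """Upper bound on frame rate (fps) if reconstruction is the only bottleneck."""
    return 1000.0 / reconstruction_ms


# Reconstruction times per compounded frame, as reported in the text.
dwi_fps = max_frame_rate(41.3)  # DWI with 32 VSs
pwi_fps = max_frame_rate(14.6)  # PWI with 32 PWs
speedup = 41.3 / 14.6
print(f"DWI: {dwi_fps:.1f} fps, PWI: {pwi_fps:.1f} fps, ratio: {speedup:.2f}")
```

Under this assumption, the PWI reconstruction time alone would allow more than 60 fps, while DWI would be limited to roughly 24 fps; the time ratio of about 2.8 is close to the 2.9-fold difference in computations.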
4.3. Limitation of this Study
Despite the fast acquisition rates of PWI and DWI (
Table 1), the display frame rates of PWI and DWI were lower than that of CF (26.6 fps, 10 fps, and 15 fps for CF, DWI, and PWI, respectively) in this study because of the limited channel count and computing power of the system. The display frame rates of PWI and DWI were low mainly because these techniques (1) need full-aperture reception (128 channels) and (2) require 10–30 times more computations (Table 1) than CF. In CF, 64 channels were sufficient to receive the echoes of a focused US beam from a straight scanline. However, DWI and PWI required full 128-channel reception to collect the echoes of a wide US beam reflected from a broad region. Unfortunately, the system supported only 64 reception channels, and thus twice as many transmit-receive sequences were performed to obtain 128-channel data with the 64-channel system. In addition, despite the use of a GPU for beamforming, the data transfer time and image reconstruction time for PWI and DWI were longer than the US echo acquisition time, which further decreased the display frame rates of PWI and DWI.
For the still image evaluation, the B-mode image was reconstructed offline from the stored RF channel data, and thus the frame rate of the image was affected only by the limited number of receive channels. Hence, the still images of CF, PWI, and DWI were acquired at rates of 31.9 fps, 63.85 fps, and 63.85 fps, respectively. For the video evaluation, the screen-captured videos were used, and thus the frame rate of each video was the same as the display frame rate (26.6 fps, 10 fps, and 15 fps for CF, DWI, and PWI, respectively). These limited frame rates of the images and videos might have affected the evaluation results. Note that despite this unfavorable condition (a lower frame rate than possible), DWI and PWI received better scores than CF. If 128-channel acquisition were available, the motion artifacts in DWI and PWI would be further reduced. In addition, if real-time reconstruction were realized with a reconstruction frame rate close to the acquisition frame rate, the system noise present in the B-mode image would also be reduced by frame averaging because more frames could be averaged within a fixed averaging time period for the image persistence.
Although we optimized the parameters for each imaging technique (the focal depth and F-number of CF for a uniform resolution over depths, the VS positions of DWI for full-aperture transmission, and the PW angles and aperture size of PWI according to our previous study [
35]), only a single set of parameters for each imaging technique was used to evaluate the image quality in this study. More exhaustive comparisons with changes in various parameters might be needed because the number and directions (or angles) of synthesized waves are major determinants of image quality in PWI and DWI.