High Resolution, High Contrast Beamformer Using Minimum Variance and Plane Wave Nonlinear Compounding with Low Complexity

The plane wave compounding (PWC) is a promising modality to improve the imaging quality and maintain the high frame rate for ultrafast ultrasound imaging. In this paper, a novel beamforming method is proposed to achieve higher resolution and contrast with low complexity. A minimum variance (MV) weight calculated by the partial generalized sidelobe canceler is adopted to beamform the receiving array signals. The dimension reduction technique is introduced to project the data into lower dimensional space, which also contributes to a large subarray length. Estimation of multi-wave receiving covariance matrix is performed and then utilized to determine only one weight. Afterwards, a fast second-order reformulation of the delay multiply and sum (DMAS) is developed as nonlinear compounding to composite the beamforming output of multiple transmissions. Simulations, phantom, in vivo, and robustness experiments were carried out to evaluate the performance of the proposed method. Compared with the delay and sum (DAS) beamformer, the proposed method achieved 86.3% narrower main lobe width and 112% higher contrast ratio in simulations. The robustness to the channel noise of the proposed method is effectively enhanced at the same time. Furthermore, it maintains a linear computational complexity, which means that it has the potential to be implemented for real-time response.


Introduction
Ultrasound imaging has been widely used in medical clinical diagnosis due to the advantages of non-invasive, high efficiency and low cost. The traditional focus scanning mode makes the imaging result of rapid-moving tissues and organs blurred because of the low frame rate. This affects the diagnostic accuracy of clinicians. Ultrafast imaging mode is able to overcome such problems since it can provide a high frame rate (thousands of frames per second). Thus, it is more advantageous in the applications of 3D and 4D ultrasound imaging, elastography, doppler blood flow imaging, and so on [1][2][3]. The plane wave compounding (PWC) is an effective way to realize ultrafast ultrasound imaging, which was originally proposed by Lu and Cheng [4,5], and then further improved by Montaldo et al. [6]. By combining low-resolution imaging results from different transmitting steering angles, a high-resolution imaging result can be generated. More transmitting beams can generate higher imaging quality, while the frame rate will be decreased accordingly, and vice versa. Therefore, how to achieve the better imaging performance while maintaining the high frame rate of the PWC is of great significance for the further applications of the ultrafast ultrasound imaging in the medical ultrasound imaging field.
Better imaging performance means better imaging resolution and better imaging contrast. As is known, the minimum variance (MV) method proposed by Capon in 1969, achieved a certain effect on improving the imaging resolution, while the imaging contrast was still unsatisfactory [7]. Several studies have focused on modifying the MV method and applying it to PWC imaging. Zhao et al. proposed the joint transmitting-receiving (JTR) MV beamformer, in which two MV weights are calculated to obtain the improvement of the resolution [8]. A mixed transmitting-receiving (MTR) proposed by Wang et al. further enhanced the imaging quality by redefining the MV optimization problem [9]. Nguyen et al. proposed a spatial coherence approach to implement the MV beamformer using datacompounded methods [10]. Other methods combining the MV with the coherence factor (CF) were also widely studied to further improve the beamforming performance [11][12][13]. However, the calculation complexity of above MV-based methods is high mainly due to the estimation and inversion of the covariance matrix. This limits the application of these beamformers in real-time imaging systems.
To alleviate the computational load of the MV, several studies have been proposed. Asl et al. introduced the Toeplitz structured covariance matrix to reduce the calculation amount [14]. Beam-space methods have also been proposed to project received signals to a lower dimensional space and the MV is then implemented on the new space [15,16]. Another low complexity MV beamformer implemented with a partial generalized sidelobe canceler (GSC) for phased array imaging was proposed by Deylami et al. in 2019. The partial GSC divided the weight vector into one constant and one adaptive weight, so that the optimization process could be implemented with lower complexity on the adaptive part [17]. Although these methods reduce the calculation amount, their imaging quality still needs to be improved for the PWC.
The delay multiply and sum (DMAS) is an effective method to improve the imaging contrast, which was first introduced to the ultrasound field by Matrone et al. in 2015 [18]. The spatial coherence of signals was introduced to the beamforming process and the sidelobe clutter was suppressed owing to the low coherence relative to the desired signal. However, the imaging resolution and computational complexity still could not meet requirements for the PWC. Several improvement strategies for the DMAS are proposed in recent years [19][20][21]. Mozaffarzadeh et al. proposed a double-stage DMAS to further increase the contrast by using the DMAS to replace the delay and sum (DAS) algebra in the expansion formula. However, the burden of multiplications is even heavier with this approach [22], which limits its further applications. In addition, a real-time implementation of the DMAS has been used for calculating the amount of reduction with no negative effect on the imaging result [23,24]. The DMAS used as nonlinear compounding, named as the frame multiply and sum (FMAS), was also introduced to improve the contrast for PWC imaging [25]. Nevertheless, there is still room for the improvement in the imaging resolution and contrast while retaining the real-time ability.
In this paper, we tend to propose a beamforming scheme to improve the imaging quality for the PWC. A cascade structure including adaptive weighting and nonlinear compounding stages is proposed. In stage 1, a MV weighting vector of the receiving array dimension is calculated to obtain high resolution images through the partial GSC structure [17]. We name it the pGSC for short in the following sections. The specified submatrix spatial smoothing process is implemented and the multi-wave covariance matrix is calculated by averaging each covariance result of different transmissions [8]. Afterwards, the blocking and dimension reduction matrices are selected to determine the weighting vector, and the equation expansion step is also adopted to further alleviate the computational burden. The following stage is a nonlinear compounding procedure, which is mainly designed for the higher imaging contrast. An efficient second-order signed delay multiply and sum (2-sDMAS) method is proposed to compound the beamforming result of each transmitting event. Then, it is referred to as second-order signed frame multiply and sum . A multiply and sum (MAS) process is used to replace the simple DAS algebra in the real-time implementation equation of the DMAS. Afterwards, the sign of the traditional DAS result is adopted to make a correction to the previous compounding result. Through two stages, a higher imaging quality can be obtained with a fast imaging speed. The rest of this paper is arranged as follows. Background of related beamforming methods is presented in Section 2. The specific implementation of the proposed method is illustrated in Section 3. Section 4 shows the setup, data acquisition, evaluation metrics and results of experiments. In Section 5, further discussions and explanations are given. At last, a conclusion is made in Section 6.

Data Model of Plane Wave Compounding
The PWC is a process of recombining plane wave imaging results from different transmissions. Here, we assume that a transducer with M elements is used to transmit N plane waves. Then, the echo signal is recorded into a 2-D data matrix X(n): . .
where x i , j (n) represents the echo data from the ith transmitting event received by the jth element and x i [n] is the corresponding data vector of the ith transmission. The term n is the time index. By using the traditional DAS method, we can get the output of coherent plane wave compounding. The mathematical expression is:

Delay Multiply and Sum Beamformer
The DMAS beamformer suppresses uncorrelated signals by calculating the spatial cross-correlation of received array signals [18]. For a single transmission, received array signals can be expressed as: where (·) H represents the matrix conjugate transpose. Then the beamforming output of the DMAS is calculated as: where x j (n) and x l (n) are delayed signals received by the element j and element l, respectively. M is the number of transducer elements. f denotes a bandpass (BP) filter, which aims to attenuate the direct current (DC) and higher frequency components while preserving the second harmonic component generated by the multiplication process [18]. In addition, the implementation of the DMAS was further optimized [23,24]. Each echo data was rescaled by applying a signed square root operation. It is expressed as: Then, the beamforming output expression of the DMAS is changed to the following equation: Based on the Equation (6), a simplified implementation form is expressed as follows:

Minimum Variance Beamformer
The MV beamformer calculates adaptive weights through a constraint process that minimizes the power of the beamforming output [7]. We assume that the echo signal of a transmitting event is x(n); as in Equation (3), the output can be expressed as: where w MV is the MV weighting vector. The constraint equation is as follows: where R[n] is the receiving array covariance matrix and a is the steering vector. The solution of Equation (9) can be calculated by the Lagrange multiplier approach: The covariance matrix, R[n], is usually estimated as: Spatial smoothing and diagonal loading techniques are adopted to provide an invertible covariance matrix and a satisfactory robustness [26]. Then, the final covariance matrix can be expressed as:R where x l is the lth subarray, L is the subarray length, ε represents the diagonal loading factor, and I is the identity matrix. ε is usually described as ∆ × trace( 1 x l (n)x H l (n))/L , and the typical range of ∆ is 0.01-0.2. Afterwards, R[n] would be replaced byR[n] in Equations (9) and (10).

Partial Generalized Sidelobe Canceler
The generalized sidelobe canceller (GSC) can be regarded as an equivalent form to implement the MV beamformer. It divides the weight vector into two parts: the upper constant and the lower adaptive vectors. In the whole structure, the beamforming output can be obtained by subtracting the noise and interference of the lower branch from desired signals of the upper branch [27,28].
In a single transmitting event, x[n] in Equation (3) represents the signal vector, and the output of the GSC can be expressed as: Then, the constraint process of the MV in Equation (9) can be rewritten as: Sensors 2021, 21, 394

of 19
The solution of w a [n] is: where w q is the constant vector. B is a blocking matrix andR[n] represents the covariance matrix after spatial smoothing and diagonal loading. The pGSC is implemented by combining the dimension reduction and the GSC [29]. A transformation matrix T can be put after the blocking matrix B, then the new partial blocking matrix would beB M×NN = B M×M−1 T M−1×NN . NN defines the degrees of freedom and B in Equations (13)-(15) would be replaced byB [17].

Methods
The overview of the proposed method is shown in Figure 1. A high resolution and contrast ultrasound image can be obtained through a two-stage cascade structure. The solution of [ ] w a n is: where w q is the constant vector. B is a blocking matrix and ˆ[ ] R n represents the covariance matrix after spatial smoothing and diagonal loading. The pGSC is implemented by combining the dimension reduction and the GSC [29]. A transformation matrix T can be put after the blocking matrix B , then the new partial blocking matrix would be . NN defines the degrees of freedom and B in Equations (13)-(15) would be replaced by B [17].

Methods
The overview of the proposed method is shown in Figure 1. A high resolution and contrast ultrasound image can be obtained through a two-stage cascade structure.

Stage 1: Receiving Array Weighting
The data model of PWC imaging is as shown in Equation (1). Two dimensions of the matrix represent the receiving array and transmitting events, respectively. The submatrix smoothing process is implemented to smooth and de-correlate receiving array signals firstly. It is expressed as follows: where L is the subarray length and M is the element number. The term X l represents the lth overlapping submatrix: where , ( ) i l x n is used to represent the signal recorded in the original matrix in Equation (1).
Then, the multi-wave receiving covariance matrix [ ] R n is calculated by averaging over N transmissions [8]. The expression of [ ] R n is as follows:

Stage 1: Receiving Array Weighting
The data model of PWC imaging is as shown in Equation (1). Two dimensions of the matrix represent the receiving array and transmitting events, respectively. The submatrix smoothing process is implemented to smooth and de-correlate receiving array signals firstly. It is expressed as follows: where L is the subarray length and M is the element number. The term X l represents the lth overlapping submatrix: where x i,l (n) is used to represent the signal recorded in the original matrix in Equation (1). Then, the multi-wave receiving covariance matrix R[n] is calculated by averaging over N transmissions [8]. The expression of R[n] is as follows: where T stands for the lth receiving subarray vector of the ith transmitting event. Then the final receiving covariance matrixR[n] is obtained by adding a diagonal loading factor: The term ∆ is between 0.01 and 0.2 and I is the identity matrix.
The blocking matrix B with size L × (L − 1) aims to block off desired signals in the adaptive road. Thus, it must satisfy: where a is the steering vector of ones. In this paper, B is set as follows [30]: The discrete cosine transform matrix has been researched to represent original signals with lower dimensions by maintaining more energy [16]. It is adopted as the dimension reduction matrix T: By placing T after B, the partial blocking matrixB is generated: Afterwards, Equation (10) is rewritten as: The mathematical expansion [17] for the further calculation reduction is also implemented as follows: After w a is calculated, the weighting vector of the receiving array can be obtained: where w q is set to a hamming window vector.
Weighting the receiving aperture of X R in Equation (18), receiving dimension beamforming results can be calculated as follows:

Stage 2: Nonlinear Compounding
After obtaining the beamforming result of each transmission as in Equation (26), we rescale each term by applying a signed square root operation: Then a second signed square root operation is implemented as follows: Afterwards, the MAS is used to replace the DAS algebra in Equation (7) and the second-order FMAS (2-FMAS) is defined as: Finally, the sign of the traditional DAS result is used to make a correction to the output of 2-FMAS [31]. Combined with the BP filter, the beamforming result can be obtained as:

Experimental Setup
The proposed beamforming method is evaluated using the PWC through simulations and experiments. All simulated data were acquired by Field II [32], based on the MATLAB (R2019b) platform. All phantom and in vivo data were obtained through the Verasonics (Vantage 128, Verasonics, Redmond, WA, USA), which is a standard commercial ultrasound machine that cannot be modified. It is intended to be used as a research laboratory tool to acquire, store, display and analyze data. It follows the security verifications: "IEC 61010-1 3rd Edition (2010)  For simulations, both point targets and anechoic cyst were adopted. The central frequency of the transducer was 5 MHz, and the array had 128 elements and 0.3 mm pitch. During the transmitting process, 49 beams steering from −12 to 12 • with an interval of 0.5 • were generated to scan the imaging area. The excitation pulse was a 2-cycle sinusoid wave at the central frequency. The sampling frequency was set to 40 MHz. Five point targets at the depth of 20, 30, 40, 50 and 60 mm were simulated to evaluate the imaging resolution. A circular anechoic cyst with a radius of 2.5 mm centered at (x, z) = (0 mm, 50 mm) was also simulated to evaluate the imaging contrast. There were 200,000 scatters distributing in the cyst phantom randomly.
In experiments, a L11-4v transducer (Verasonics, Redmond, WA, USA) was used to acquire the phantom and in vivo data on the Verasonics platform. The parameter settings of the probe and transmitting process were the same as with the simulations. It is worth mentioning that the sampling frequency was firstly set to 20 MHz, and then the data was resampled at 40 MHz. The phantom data contained both point and cyst data, which were acquired using a CIRS calibration phantom (Model 040GSE, Computerized Imaging Reference System Inc., Norfolk, VA, USA). In vivo data from a 28-year-old male contained carotid arteries, parotid gland and thyroid data. Throughout the experiment, the mechanical and thermal index were kept to a minimum according to the ALARA principle since we have not made any changes to the instrument. The whole process is safe, noninvasive and non-radiation to the human body and the participant has acknowledged this.
Results of different methods: DAS, FMAS, 2-FMAS, 2-sFMAS, pGSC and 2-sFMAS pGSC will be shown in Section 4.2. First four methods are compared to test the effectiveness of the proposed nonlinear compounding stage in further reducing the sidelobe level and increasing the contrast. The pGSC was performed to evaluate the performance on narrowing the main lobe width and improving the resolution of the receiving array weighting stage. Finally, the 2-sFMAS pGSC was compared with previous methods, aiming to verify that our proposed scheme can achieve better imaging results in terms of both the resolution and contrast.
Another experiment was conducted to evaluate the robustness to noise of different beamformers. The anechoic cyst data in simulations were used and white Gaussian noise with the signal-to-noise ratio (SNR) of 10 dB, 0 dB and −10 dB were added.
Some parameters used in the experiment are illustrated here. The freedom of dimension reduction NN was set to 2, and the subarray length L is then selected as M -NN + 1 [17]. The selection of NN will be further discussed in the following part of this paper. The diagonal loading factor parameter ∆ = 0.01 was also used. The F-number was set to 1.5 in all beamformers and all figures are presented with −70 dB dynamic range. The BP filter was implemented by a 0.5 tapered Tukey window with the bandwidth between 3 and 12 MHz.
Evaluation metrics are explained here. The full width of the main lobe (FWHM), which is equally defined as the main lobe width at −6 dB, is used to evaluate the imaging resolution. For the quantitative measurement of the imaging contrast, the contrast ratio (CR) and contrast-to-noise ratio (CNR) are adopted in anechoic cyst results [33]. These two metrics are defined as:  Figure 2e shows that the pGSC has apparent advantages of narrower main lobe width, but sidelobe amplitudes are maintained high. Figure 2f is the result of the FMAS pGSC, narrower main lobe width and  Figure 2g shows the best performance both on main lobe width and sidelobe levels, which indicates that the proposed method can bring high resolution and high contrast at the same time.
Sensors 2021, 21, x FOR PEER REVIEW 9 of 20 main lobe width of FMAS-based methods is similar to that of the DAS. Figure 2e shows that the pGSC has apparent advantages of narrower main lobe width, but sidelobe amplitudes are maintained high. Figure 2f is the result of the FMAS pGSC, narrower main lobe width and lower sidelobe levels are obtained compared with the pGSC in Figure 2e. The result of the 2-sFMAS pGSC in Figure 2g shows the best performance both on main lobe width and sidelobe levels, which indicates that the proposed method can bring high resolution and high contrast at the same time. In order to analyze results quantitatively, the lateral response of different beamformers at the depth of z = 30 mm and z = 50 mm is presented in Figure 3, and corresponding FWHM values are listed in Table 1. The proposed method achieves FWHM values of 0.08 mm/0.07 mm, which are much smaller than that of the conventional DAS method. From Figure 3, the best performance in narrowing the main lobe width and reducing sidelobe amplitudes of the proposed method is shown, especially for grating lobes. In addition, the 2-sFMAS shows a well-shaped curve without flatting at the top compared with the 2-FMAS. In order to analyze results quantitatively, the lateral response of different beamformers at the depth of z = 30 mm and z = 50 mm is presented in Figure 3, and corresponding FWHM values are listed in Table 1. The proposed method achieves FWHM values of 0.08 mm/0.07 mm, which are much smaller than that of the conventional DAS method. From Figure 3, the best performance in narrowing the main lobe width and reducing sidelobe amplitudes of the proposed method is shown, especially for grating lobes. In addition, the 2-sFMAS shows a well-shaped curve without flatting at the top compared with the 2-FMAS.

Anechoic Cyst
Anechoic cyst results of different beamformers are shown in Figure 4. Obvious noise exists inside cysts in DAS and pGSC results, as shown in Figure 4a,e. The better performance in reducing noise are achieved by the FMAS and FMAS pGSC in Figure 4b,f, but room for improvement still exists. In comparison with the FMAS, the 2-FMAS shows effectiveness in further suppressing the internal noise, which indicates that the second-order operation is powerful in reducing off-axis scatter and sidelobes. Figure 4d is the sign correction result, a visible better speckle background and darker cyst can be seen in comparison with the 2-FMAS. Figure 4g shows a good performance similar to that of the 2-sFMAS. It means that the proposed method has superiority in the noise reduction and contrast improvement.  CRs and CNRs are calculated in Table 2. The values show that the FMAS achieves a higher CR at the cost of the CNR compared with the DAS and pGSC. The 2-FMAS further increases the CR, while the CNR decreases slightly in comparison with the FMAS method. The 2-sFMAS achieves the highest CR, and the CNR is better than those of the FMAS and 2-FMAS. The proposed 2-sFMAS pGSC achieves 112% and 121% higher CR than those of the DAS and pGSC, respectively, while the CNR is relatively reduced. It can also be noted that the proposed method performs slightly poor on the CR compared with only 2-sFMAS, but much better than only pGSC. Table 2. Contrast ratio (CR) and contrast-to-noise ratio (CNR) for simulated cysts. CRs and CNRs are calculated in Table 2. The values show that the FMAS achieves a higher CR at the cost of the CNR compared with the DAS and pGSC. The 2-FMAS further increases the CR, while the CNR decreases slightly in comparison with the FMAS method. The 2-sFMAS achieves the highest CR, and the CNR is better than those of the FMAS and 2-FMAS. The proposed 2-sFMAS pGSC achieves 112% and 121% higher CR than those of the DAS and pGSC, respectively, while the CNR is relatively reduced. It can also be noted that the proposed method performs slightly poor on the CR compared with only 2-sFMAS, but much better than only pGSC. Results of different beamformers are shown in Figure 5. Similar to simulation results, pGSC-based methods show effectiveness in narrowing the main lobe as shown in Figure 5e-g, which indicate the higher spatial resolution compared with DAS and FMAS-based methods. Lateral response across the depth of z = 10 mm and z = 29 mm is shown in Figure 6, and the corresponding FWHM values are listed in Table 3. The proposed 2-sFMAS pGSC achieves 81.6% smaller FWHM than that of the DAS, which is consistent to simulated results. The pGSC-based methods achieve much smaller main lobe width compared with those FMAS-based methods. Lateral response across the depth of z = 10 mm and z = 29 mm is shown in Figure 6, and the corresponding FWHM values are listed in Table 3. The proposed 2-sFMAS pGSC achieves 81.6% smaller FWHM than that of the DAS, which is consistent to simulated results. The pGSC-based methods achieve much smaller main lobe width compared with those FMAS-based methods.

Beamformer CR CNR
Lateral response across the depth of z = 10 mm and z = 29 mm is shown in Figure 6, and the corresponding FWHM values are listed in Table 3. The proposed 2-sFMAS pGSC achieves 81.6% smaller FWHM than that of the DAS, which is consistent to simulated results. The pGSC-based methods achieve much smaller main lobe width compared with those FMAS-based methods.

Cyst Phantom
In Figure 7, cysts phantom imaging results are presented. Figure 7c is the result of the 2-FMAS method, it shows less noise inside cysts and clearer anechoic cysts edges compared with the FMAS in Figure 7b. From Figure 7d,g, it can be seen that the 2-sFMAS and 2-sFMAS pGSC further reduce the internal noise and provide clearer edges of cysts, which are consistent with the simulated cyst results.  The statistical results of the CR and CNR are shown in Table 4. The 2-sFMAS gets the highest CR, which is 189% higher than the DAS, 126% higher than the FMAS. The proposed 2-sFMAS pGSC achieves a little smaller CR than that of 2-sFMAS, but it is still 183% higher than that of the DAS. It also can be seen that the FMAS-based methods result in a higher CR at the expense of the CNR, which is more obvious in the deeper anechoic cyst.  The statistical results of the CR and CNR are shown in Table 4. The 2-sFMAS gets the highest CR, which is 189% higher than the DAS, 126% higher than the FMAS. The proposed 2-sFMAS pGSC achieves a little smaller CR than that of 2-sFMAS, but it is still 183% higher than that of the DAS. It also can be seen that the FMAS-based methods result in a higher CR at the expense of the CNR, which is more obvious in the deeper anechoic cyst.

In Vivo Study
Four sets of in vivo data are used to confirm the effectiveness of the proposed method. In Figure 8, the top row shows the right carotid artery imaging results and the bottom row presents the left carotid artery results. The anechoic circular structure in the figure is the cross-section of the blood vessel. The lower noise level inside the blood vessel and the more complete hyperechoic tissue preserved outside the blood vessel mean the better performance of the beamformer. Figure 8a,e present the DAS and pGSC results, respectively, which both show visible noise inside blood vessels. The imaging results of the 2-sFMAS and 2-sFMAS pGSC are shown in Figure 8d,g. They show the best performance in reducing noise inside vessels among all beamformers.  The quantitative assessment is also carried out, CRs and CNRs calculated between the rectangle region in white and black are listed in Table 5. The CR values of the DAS and pGSC are relatively small, while the CR values of the FMAS-based methods are significantly increased. The 2-sFMAS and the 2-sFMAS pGSC methods achieve higher CR The quantitative assessment is also carried out, CRs and CNRs calculated between the rectangle region in white and black are listed in Table 5. The CR values of the DAS and pGSC are relatively small, while the CR values of the FMAS-based methods are significantly increased. The 2-sFMAS and the 2-sFMAS pGSC methods achieve higher CR values, which are 32.46/55.98 dB and 25.33/48.66 dB, respectively. However, it is worth noting that the value of the CNR has been sacrificed. Another two datasets including human thyroid and parotid gland are also used to further evaluate the performance of different beamformers. The results are presented in Figure 9. The proposed method performs well on distinguishing the boundary between the parotid hyperechoic structure and the speckle background. Another two datasets including human thyroid and parotid gland are also used to further evaluate the performance of different beamformers. The results are presented in Figure 9. The proposed method performs well on distinguishing the boundary between the parotid hyperechoic structure and the speckle background.

Robustness to Noise Evaluation
In ultrasound imaging system, the thermal noise usually exists in channels, which will affect the performance of the beamforming result. Therefore, the robustness against noise and interference experiment is necessary for evaluation. The data of the simulated anechoic cyst was used, and Gaussian white noise with SNR = 10, 0, and −10 dB were added to test imaging results of different beamformers. From Figure 10c,f, it can be seen

Robustness to Noise Evaluation
In ultrasound imaging system, the thermal noise usually exists in channels, which will affect the performance of the beamforming result. Therefore, the robustness against noise and interference experiment is necessary for evaluation. The data of the simulated anechoic cyst was used, and Gaussian white noise with SNR = 10, 0, and −10 dB were added to test imaging results of different beamformers. From Figure 10c,f, it can be seen that the proposed 2-sFMAS and 2-sFMAS pGSC methods perform best on the noise reduction inside the cyst under different noise levels. The edge of the cyst is clearer compared with other methods. Figure 10b,e show that the imaging results of the FMAS and FMAS pGSC are significantly degraded when the noise level is increased, although they are better than the DAS and pGSC. Statistical results are shown in Table 6. The proposed method obtained a 55.52 dB CR under the −10 dB noise level, which is still acceptable for tissue imaging. Overall, the proposed method is least affected by noise and has the best performance on the noise reduction.

Discussion
The results of simulations, experiments, and in vivo studies have been carried out. From results, the proposed method shows its effectiveness in narrowing the main lobe width and reducing sidelobe levels. Compared with the traditional DAS, the proposed method achieves a higher resolution and contrast, as shown in Figure 2a,g and Figure  4a,g, respectively. The higher resolution is owing to the receiving array weighting stage, as comparison of Figure 2a,e. In this stage, the adaptive weighting vector calculated through the pGSC structure plays an important role in reducing the main lobe width. Compared with the original MV method, the pGSC breaks the upper limitation of the subarray length during the smoothing process [17]. A larger subarray means that more effective aperture information is utilized, this also contributes to the high resolution.

Discussion
The results of simulations, experiments, and in vivo studies have been carried out. From results, the proposed method shows its effectiveness in narrowing the main lobe width and reducing sidelobe levels. Compared with the traditional DAS, the proposed method achieves a higher resolution and contrast, as shown in Figures 2a,g and 4a,g, respectively. The higher resolution is owing to the receiving array weighting stage, as comparison of Figure 2a,e. In this stage, the adaptive weighting vector calculated through the pGSC structure plays an important role in reducing the main lobe width. Compared with the original MV method, the pGSC breaks the upper limitation of the subarray length during the smoothing process [17]. A larger subarray means that more effective aperture information is utilized, this also contributes to the high resolution. Meanwhile, the multiwave covariance matrix is performed by integrating the information of all transmitting angles. This makes the covariance matrix more accurate and only one weighting vector is calculated. Combining the dimension reduction and the equation expansion step, the calculation amount remains relatively low.
The stage 2 shows its good performance in the noise rejection and contrast enhancement as shown in Figures 4 and 7. In comparison with the FMAS, the 2-sFMAS shows better contrast as shown in Figure 4b,d. This is due to the replacement of the DAS by the MAS in the real-time reformulation of the FMAS, which achieves the secondary suppression of off-axis signals, thereby enhancing the contrast. In addition, the signal distortion and the flatten main lobe have been avoided through the sign correction process, which further improved the contrast performance. Furthermore, our method achieved better anti-noise performance when different levels of Gaussian white noise were added compared with the DAS and FMAS, as shown in Figure 10. The stage 2 contains additional square root and absolute value operations, but no more multiplications are added, so that it still maintains the ability of real-time implementation. Through the two stages, a high resolution and contrast image for the PWC is achieved. Furthermore, thanks to the dimension reduction, the formula expansion in the stage 1, and the real-time implementation in the stage 2, our proposed method maintains a low computational complexity.
For the selection of the dimension reduction freedom NN, it was set to 2 in this paper [16,17], and the results show that it performs well in all experiments. If NN is increased, more computations are required since the weight vector dimension and the number of subarrays are increased. With additional experimental explorations, we have found that increasing NN does not straightforwardly improve the imaging result, especially for the contrast. Meanwhile, NN = 1 also is not a good choice. It means that the degree of freedoms is reduced to one, which will limit the anti-noise performance of the proposed method [17]. The transducer frequency also affects imaging results. We have explored that the proposed method would be more advantageous on the imaging resolution at a lower frequency compared with other DAS and FMAS-based methods. It indicates that our method can make up for the poor imaging resolution of low frequency signals better.
The proposed method achieves a linear computational complexity. For one imaging point, an N*M data matrix and NN = 2 are used to implement the beamforming process. In the stage 1, the calculation amount mainly comes from the calculation of w a . In Equation (17) In the stage 2, the square root, the absolute value and the sign of the DAS output are required and the process of multiplication in pairs is avoided. As a result, the complexity is expressed as N. In conclusion, the proposed method can be implemented by 6NL + M + N multiplications. It avoids the large calculation amount of the estimation and inversion of the covariance matrix through the dimension reduction and equation expansion steps. Thus, it could obtain faster imaging speed compared with JTR-based methods with the cubic computational complexity [8,12].
The advantages of the proposed method come from three aspects. First, a high resolution and contrast image can be achieved. Second, the robustness to noise can be further enhanced because of the good performance in suppressing noises. Third, the method is promising to be real-time implemented. However, there is still room for improvements. From the anechoic cyst simulation and experiment, it can be seen that the CNR value of the proposed method has been reduced. FMAS-based methods bring the measurement of the backscattered signal coherence into the beamforming chain, and thus achieve a better rejection of the uncorrelated noise. Although the proposed method further improves the performance on reducing the low-coherence noise, the over-suppression phenomenon is relatively more serious, resulting in the decrease of the CNR. Therefore, improving the CNR will be an aspect of our future work.
As for practical applications of the method, it is beneficial for the identification of tissues in medical clinical diagnosis. It could assist doctors to improve the accuracy of diagnosis of tumors, cysts and other diseased tissues. It also appears to be a promising tool for the cardiac ultrasound imaging [34]. The low computational complexity means the fast imaging speed. Combined with the GPU acceleration, it is more advantageous to be implemented in real time [35]. The high image quality also indicates that the number of frames can be saved, which is quite important to the imaging of rapid-moving tissues and organs. Furthermore, it may be beneficial for the ultrafast power doppler imaging, which is well-established to evaluate blood flows quantitatively with high frame rates [36,37]. The amplitude of the blood flow signal is much smaller than that of the tissue signal, and it is more susceptible to noise interference. Our method could provide good noise suppression and improve the effect of blood flow visualization. Other high frame rate strategies like 3D and 4D imaging, and transient elastography can also be further investigated with our beamforming method [1,6].

Conclusions
In this paper, a novel beamforming scheme is proposed to provide the high resolution and contrast image for the PWC. In our proposed method, a receiving weighting stage is implemented to beamform receiving array signals, which contributes to the high resolution. Then, an improved nonlinear compounding technique is proposed to compound the beamforming result of each transmitting event. This stage is beneficial for the noise suppression and contrast improvement. Both stages are implemented in low complexity forms. Thus, the proposed method achieves a linear computational complexity, which is more promising to meet the need of real-time imaging. Simulations, experiments and in vivo studies have been carried out to evaluate the effectiveness of our proposed method. The results confirm that our method is useful in improving the imaging quality. The robustness to the channel noise is also enhanced.  Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to confidentiality reasons.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: 2-DMAS Second-order delay multiply and sum 2-FMAS Second-order frame multiply and sum 2-sDMAS Second-order signed delay multiply and sum 2-sFMAS Second-order signed frame multiply and sum