1. Introduction
With its compact size, cost-effectiveness, all-weather operation, velocity-measuring capability, and long detection range, millimeter wave (mmWave) radar has been extensively employed in autonomous driving applications [1,2,3,4]. The emerging high-resolution, four-dimensional imaging radar (4D radar), a leading candidate for next-generation 77 GHz automotive mmWave radar, can provide high-quality 4D point clouds in the range, azimuth, elevation, and time dimensions [5,6], i.e., three dimensions in the space domain and one dimension of velocity information. By taking advantage of multiple-input, multiple-output (MIMO) array design, 4D mmWave radar can form an overall array with a combined radiation pattern. The use of multiple antennas is optimized for accuracy, angular resolution, and signal-to-noise ratio (SNR), which enhances object detection performance. The MIMO approach requires that the chosen waveforms be uncorrelated so that the received signals can be separated. Research on MIMO waveform design can be broadly classified into three technologies: time division multiplexing (TDM) [7,8], frequency division multiplexing (FDM) [9,10], and code division multiplexing (CDM) [11,12].
Enterprises involved in the 4D mmWave radar industry range from conventional suppliers, like Bosch, NXP, and ZF, to a host of burgeoning tech companies, such as Arbe, Huawei, and Oculii, and even original equipment manufacturers (OEMs) like Tesla. For example, three 4D mmWave radar products are shown in Figure 1: the NXP product in Figure 1a, the Tesla product in Figure 1b, and the Arbe product in Figure 1c. For fairness and comprehensiveness, these products were selected as typical 4D mmWave radars from a supplier, an OEM, and an advanced R&D provider, respectively. It is important to note that the uniform array design of these products has practical significance for their processing, manufacturing, and usage, because scattered and irregular array elements cannot accommodate the feeder design or the placement and layout of the RF chips.
The methodologies for further improving imaging performance can be categorized into hardware-based and software-based approaches: (1) increasing the number of TR channels by cascading mmWave radar chips or integrating more antennas onto a single chip, and (2) increasing the angular resolution by virtually expanding the antenna aperture through software design [13] or by improving the processing algorithms, including traditional algorithms such as FFT and CFAR [14,15] and learning-based algorithms [16,17].
From the hardware aspect, more antennas mean higher SNR and more degrees of freedom for the direction of arrival (DOA) estimator. Many DOA estimators have been studied in recent decades, such as the FFT-based estimator [18], the MUSIC-based estimator [19], and the ESPRIT-based estimator [20]. However, MUSIC, ESPRIT, and their variants are parametric methods that require a priori knowledge of the number of signals and incur substantial computational cost [21]. In particular, when the assumed number of signals is incorrect, these methods fail to achieve the expected performance. Since the FFT-based method involves less computation and can be easily deployed in uniform linear array (ULA) applications without a priori information, it is considered the main estimator in real products.
The FFT-based DOA strategy is adopted in our work, which determines the direction for improving the MIMO array. This means that an improved MIMO array should consider antenna deployment to satisfy the FFT operation, which requires a uniform or quasi-uniform linear array.
Based on the analysis above, this paper aims to improve 4D mmWave radar imagery. Our work focuses on radio frequency (RF) hardware design, considering aperture utilization, beam quality, and SNR, and on detection software, including the dynamic CFAR and an adaptive detection threshold that varies with the environment. In summary, the key considerations for improving imagery are increasing antenna performance and enhancing the flexibility of the MIMO array. Manufacturability is also an important factor in our work, in contrast to research that is concerned only with the novelty of the proposed method.
Due to its high resolution and better environmental perception capabilities, 4D mmWave radar can compensate for the shortcomings of cameras in automotive applications. Therefore, point cloud improvement for 4D mmWave radar is of practical importance. At the same time, to realize target perception, the MIMO array technique is introduced to achieve better imaging resolution and detection performance. The Tesla 4D mmWave radar utilizes 48 virtual channels to form a virtual array of small size, particularly in the vertical direction. The radar design scheme proposed in this paper can integrate more antennas to form a larger virtual array, achieving better horizontal resolution than the Tesla system while maintaining the same horizontal resolution as the Arbe system. However, the Arbe 4D mmWave radar uses six cascaded chips, which is more than the design in this paper and results in a higher cost. Furthermore, the antenna design scheme proposed in this paper is optimized for manufacturability and is easy to deploy.
Thus, a point cloud improvement method for 4D mmWave radar imagery is proposed with four aspects in this paper. In brief, the following are four major contributions that we have made:
(1) To achieve higher horizontal resolution, improved beam-forming performance, and better manufacturability, the genetic algorithm (GA) is employed to deploy the MIMO antenna array, obtaining an optimized layout in constrained space.
(2) To achieve higher SNR and cover the performance of the whole array design in beam width, main lobe width, and sidelobe level (SLL), the antenna element of the MIMO array is carefully designed.
(3) Assuming a non-uniform distribution of the environment clutter, an optimized peak spectrum search criterion is proposed along with the dynamic CFAR algorithm.
(4) Focused on system manufacturing, the methods for analyzing the ambiguity function (AF) and the shooting and bouncing rays (SBR) tracing are introduced, and an mmWave radar system is realized based on the proposed method; its performance has been validated through practical experiments.
The rest of this paper is organized as follows: Section 2 describes the basic principles of 4D mmWave radar imagery, focusing on the signal processing. In Section 3, a point cloud improvement method for 4D mmWave radar imagery is studied, including the antenna design, array design, and peak spectrum optimization. Section 4 presents the experimental results, including the simulation and the self-developed radar system, which verify the effectiveness of the proposed method. Section 5 concludes the paper.
2. Basic Theory of the 4D mmWave Radar Imagery
The mmWave radar point cloud imagery diagram is shown in Figure 2. First, a 2D FFT is performed on the recorded echo data to obtain the Range-Doppler Map (RDM) of the detected target. Second, the RDMs generated from different transceiver channels are summed to improve the SNR, and CFAR detection is performed to obtain the target position in the RDM. Third, the azimuth and elevation angles are estimated using a DOA estimator, typically the FFT-based method. Finally, the 4D point cloud of the target is generated by converting the detected range, velocity, azimuth angle, and elevation angle from spherical coordinates into Cartesian coordinates.
2.1. RDM Generation
Generally, the 4D mmWave radar emits a wideband modulated waveform and acquires the echo power reflected from the target, since a wideband signal carries more information according to information theory. The commonly used waveforms are continuous wave (CW), pulse continuous wave (Pulse CW), frequency modulated continuous wave (FMCW), and stepped frequency continuous wave (SFCW) in TDM mode, as well as orthogonal frequency-division multiplexing (OFDM) in FDM or CDM mode. Assuming the emitted signal is modulated using one of these waveforms, the echo signal can be written as follows [22]:
$$y(t) = \alpha\, s(t - \tau) + w(t)$$
where $\alpha$ is a complex scalar whose magnitude represents attenuation due to antenna gain, path loss, and the radar cross section (RCS) of the target; $w(t)$ is the additive noise; $s(t)$ is the emitted signal waveform; and $\tau$ is the time delay of the signal being emitted, reflected from the target, and received by the radar system. Specifically, a single FMCW signal at the carrier frequency can be written as follows:
$$s(t) = \exp\left[ j 2\pi \left( f_c t + \tfrac{1}{2} \mu t^2 \right) \right]$$
where $f_c$ is the carrier frequency, and $\mu$ is the frequency modulation slope.
The information of the first dimension, i.e., range, of 4D mmWave radar can be extracted using the matched filter $h(t) = s^*(-t)$, which maximizes the SNR at the output. Thus, the filtered signal is the correlation of $y(t)$ and $s(t)$, and the time delay $\tau$, which implies the range information, corresponds to the peak of the filtered signal.
Then, the estimated range can be obtained using $R = c\tau/2$, where $c$ is the speed of light.
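Suppose the emitted waveform is FMCW. The receiver output with $N$ samples per pulse and $M$ consecutive pulses can then be written as follows:
$$y(n, m) \approx \alpha \exp\left\{ j 2\pi \left[ \left( \frac{2\mu R}{c} + f_d \right) \frac{n}{f_s} + f_d\, m T + \frac{2 f_c R}{c} \right] \right\} + w(n, m)$$
where $R$ represents the range from target to radar; $f_d$ is the Doppler shift caused by target movement; $f_s$ is the sampling frequency; $T$ is the chirp duration; $n$ is the fast time index; and $m$ is the slow time index. Here, $\alpha$ is the complex amplitude of the echo, $f_c$ is the carrier frequency, $\mu$ is the frequency modulation slope, $c$ is the speed of light, and $w(n, m)$ is the additive noise.
The beat frequency $f_b = 2\mu R / c$ can be obtained using an FFT along the fast time dimension, where it is coupled with the Doppler frequency $f_d$. Then, the range gate can be determined as $R = c f_b / (2\mu)$. The second FFT along the slow time dimension can be applied to estimate the Doppler shift, which indicates the target velocity with $v = \lambda f_d / 2$, where $\lambda$ is the wavelength.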
After the previous two FFTs on the echo signal, an RDM can be obtained, representing the range and velocity information. Then, target detection can be carried out with the RDM using a detector.
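The two-FFT procedure above can be illustrated with a minimal NumPy sketch. It simulates a noise-free dechirped FMCW return from a single point target and recovers range and velocity from the RDM peak. The function names and every numeric default (slope, sampling rate, chirp duration, array sizes) are illustrative assumptions and are not taken from the radar system described in this paper.

```python
import numpy as np

C = 3e8  # speed of light in m/s

# All numeric defaults below are illustrative assumptions, not values from the paper.
def simulate_beat_signal(rng_m, vel_mps, fc=77e9, slope=30e12, fs=10e6,
                         t_chirp=50e-6, n_fast=256, n_slow=64):
    """Noise-free dechirped FMCW samples of one point target, shape (n_fast, n_slow)."""
    f_dop = 2.0 * vel_mps * fc / C              # Doppler shift, 2v / lambda
    n = np.arange(n_fast)[:, None]              # fast-time index
    m = np.arange(n_slow)[None, :]              # slow-time index
    phase = ((2.0 * slope * rng_m / C + f_dop) * n / fs
             + f_dop * m * t_chirp + 2.0 * fc * rng_m / C)
    return np.exp(2j * np.pi * phase)

def range_doppler_map(samples):
    """2D FFT: fast time -> range bins, slow time -> Doppler bins (zero Doppler centered)."""
    rd = np.fft.fft(samples, axis=0)
    rd = np.fft.fftshift(np.fft.fft(rd, axis=1), axes=1)
    return np.abs(rd)

def peak_to_range_velocity(rdm, fc=77e9, slope=30e12, fs=10e6, t_chirp=50e-6):
    """Convert the strongest RDM cell into range (m) and radial velocity (m/s)."""
    n_fast, n_slow = rdm.shape
    r_bin, d_bin = np.unravel_index(np.argmax(rdm), rdm.shape)
    f_beat = r_bin * fs / n_fast                        # beat frequency of the peak bin
    f_dop = (d_bin - n_slow // 2) / (n_slow * t_chirp)  # Doppler frequency of the peak bin
    return C * f_beat / (2.0 * slope), (C / fc) * f_dop / 2.0

rdm = range_doppler_map(simulate_beat_signal(rng_m=30.0, vel_mps=5.0))
est_range, est_velocity = peak_to_range_velocity(rdm)
```

With these assumed parameters, the estimates are quantized to the bin spacing (about 0.2 m in range and 0.6 m/s in velocity), and the small range offset caused by range-Doppler coupling is left uncorrected.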
2.2. FFT-Based DOA Estimation
The location of a target is often described in terms of a spherical coordinate system in mmWave radar sensing, denoted as $(R, \theta, \phi)$, where $\theta$ denotes the azimuthal angle and $\phi$ denotes the elevation angle. The values of $(\theta, \phi)$ are typically estimated through a DOA estimator, such as an FFT-based estimator, a MUSIC-based estimator, or an ESPRIT-based estimator. As mentioned above, since the FFT-based method requires less computation and can be easily applied to a ULA without a priori information, it is the preferred estimator in real applications.
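For simplicity, take the 2D problem as an example, and suppose $(R_k, \theta_k)$ is the range and angle of the $k$th target in spherical coordinates with velocity $v_k$. With the far-field approximation, the delay time from a transmitter located at the origin to the $k$th target and back to the receiver located at $ld$ can be written as follows:
$$\tau_{lk} = \frac{2 (R_k + v_k t) + l d \sin\theta_k}{c}$$
where $d$ is the spacing between adjacent antenna elements in the linear array deployment, and $l$ is the element index.
After rearranging, the receiver output described in three dimensions can be written as follows:
$$y(n, m, l) \approx \sum_{k=1}^{K} \alpha_k \exp\left\{ j 2\pi \left[ \left( \frac{2\mu R_k}{c} + f_{d,k} \right) \frac{n}{f_s} + f_{d,k}\, m T + \frac{l d \sin\theta_k}{\lambda} + \frac{2 f_c R_k}{c} \right] \right\} + w(n, m, l)$$
where $K$ is the total number of detected targets; $\alpha_k$ and $f_{d,k}$ are the complex amplitude and Doppler shift of the $k$th target; $\lambda$ is the wavelength; and $f_c$, $\mu$, $f_s$, $T$, $n$, and $m$ are the carrier frequency, frequency modulation slope, sampling frequency, chirp duration, fast time index, and slow time index introduced in Section 2.1.
Since a new dimension is introduced, the target DOA, which is implied by the term $l d \sin\theta_k / \lambda$, can be estimated by applying another FFT along the antenna element dimension, whose elements are linearly located in space. Here, the FFT-based estimator is used to obtain $\theta_k$.
Similarly, the elevation angle $\phi_k$ can also be estimated through a fourth FFT when solving the 3D problem. In fact, 2D angle estimation of $\theta_k$ and $\phi_k$ can be performed separately or jointly.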
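As a rough illustration of the FFT-based DOA estimator described above, the following NumPy sketch applies a zero-padded FFT across the elements of a uniform linear array and maps the spatial-frequency peak back to an angle. The function name `fft_doa`, the half-wavelength spacing, the 16-element array, and the FFT length are assumptions made for this example only.

```python
import numpy as np

def fft_doa(snapshot, d_over_lambda=0.5, n_fft=256):
    """Angular spectrum of one ULA snapshot via a zero-padded spatial FFT.

    Returns (angles in degrees, spectrum magnitude) over the visible region.
    """
    spectrum = np.fft.fftshift(np.fft.fft(snapshot, n_fft))
    u = np.fft.fftshift(np.fft.fftfreq(n_fft))   # spatial frequency d*sin(theta)/lambda
    sin_theta = u / d_over_lambda
    visible = np.abs(sin_theta) <= 1.0           # keep physically valid angles only
    return np.degrees(np.arcsin(sin_theta[visible])), np.abs(spectrum[visible])

# Hypothetical example: 16 elements, one far-field target at 20 degrees azimuth.
element_index = np.arange(16)
snapshot = np.exp(2j * np.pi * 0.5 * element_index * np.sin(np.radians(20.0)))
angles, spectrum = fft_doa(snapshot)
estimated_angle = angles[np.argmax(spectrum)]
```

Zero-padding only refines the angle grid; the achievable resolution is still set by the array aperture, which is why the array design in Section 3 targets a larger virtual aperture.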
2.3. Radar Point Cloud Generation
Since the parameters of 4D mmWave radar detection, including
,
,
, and
, are correctly extracted, the target can be easily represented in terms of the spherical coordinates, which are
. However, these parameters are always converted into Cartesian coordinates for a more intuitive visual effect. Consequently, the final 4D information for each target at the same detection point can be expressed as follows [
23]:
where
,
, and
denote the forward, horizontal, and elevation range of the
th target from the radar.
represents the velocity of the
th target.
The visual representations of these generated points in Cartesian coordinates are rendered in a 3D user interface (UI) supported by a graphics engine, such as the Point Cloud Library (PCL) [24]. The final visualized points are drawn in 3D and colored according to each point's velocity. The point sets from one frame or multiple frames are called a point cloud, which is similar to the LiDAR output but with velocity information.
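The coordinate conversion described in this subsection can be sketched as follows. The axis convention used here (x forward, y horizontal, z up, azimuth measured from the forward axis, elevation measured from the horizontal plane) is an assumption chosen to match the forward, horizontal, and elevation ranges named above; the function name is hypothetical.

```python
import numpy as np

def spherical_to_cartesian(rng_m, azimuth_rad, elevation_rad, velocity_mps):
    """Convert per-target (R, theta, phi, v) into rows of (x, y, z, v).

    Assumed convention: x forward, y horizontal, z up.
    """
    r = np.asarray(rng_m, dtype=float)
    az = np.asarray(azimuth_rad, dtype=float)
    el = np.asarray(elevation_rad, dtype=float)
    v = np.asarray(velocity_mps, dtype=float)
    x = r * np.cos(el) * np.cos(az)   # forward range
    y = r * np.cos(el) * np.sin(az)   # horizontal range
    z = r * np.sin(el)                # elevation range
    return np.stack([x, y, z, v], axis=-1)

# Two hypothetical detections: one straight ahead, one to the side and slightly raised.
points = spherical_to_cartesian([10.0, 25.0],
                                np.radians([0.0, 30.0]),
                                np.radians([0.0, 5.0]),
                                [1.5, -2.0])
```

Each row of `points` is one 4D point; stacking the rows of one or several frames gives the point cloud that is passed to the visualization.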
3. The Point Cloud Improvement Method
The basic theory of 4D mmWave radar imagery is adopted in real products, such as single-chip radars. In this paper, the AWR2243 chip with three TXs and four RXs is used as the main RF component [25]. A four-chip cascade design is adopted, so the final array has 12 TXs and 16 RXs, forming a total of 192 TR channels. The proposed antenna design, array design, and CFAR modification can also be used in other 4D mmWave radar designs, since the improvement principles are built upon the basic theory.
The overall goal is to achieve higher SNR, better resolution, and more robust target points through the antenna design, array design, and CFAR modification. The details are given in the following subsections.
3.1. Array Design
The goal of the MIMO array design is to achieve higher resolution in the horizontal direction, better beam-forming (BF) performance, and better manufacturability. To ensure phase-coherent accumulation in BF mode, the antenna locations are constrained to multiples of λ/2 and then optimized using a genetic algorithm (GA) [26].
More antennas are deployed in the horizontal direction to achieve higher azimuthal resolution, whereas fewer antennas are deployed in the vertical direction. A uniform distribution is adopted since it approximates the performance of the minimum-redundancy array (MRA), which achieves maximum resolution with a small number of antennas [27].
In [28], the maximum sidelobe level is proven to be a critical constraint, as exceeding it causes the array configuration to generate ambiguities in DOA estimation. Thus, the azimuthal resolution is optimized with a GA to determine the antenna locations in the horizontal direction, with the designed constraints written as follows:
where
represents the beam pattern of the MIMO array;
represents the sidelobe level of the beam pattern; and
and
represent the horizontal and vertical positions of the
th antenna, respectively.
is the angular resolution in azimuthal direction calculated using the Rayleigh criterion that is defined as
, where
is the aperture length of the virtual array in
dimension [
29].
The beam pattern of the MIMO array
can be written as the function of
and
, and it yields the following:
where
is the wave number.
The goal of Equation (8) is to minimize the SLL of the array beam pattern, which can be synthesized by the responses of the virtual antennas calculated using phase center approximation theory, i.e.,
where
and
are the numbers of TX and RX antennas, respectively.
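To make the optimization target more concrete, the sketch below forms a virtual array from TX and RX positions and evaluates the peak sidelobe level of its boresight beam pattern, which is the kind of quantity a GA fitness function could score. It assumes that each virtual element sits at the sum of one TX and one RX position (expressed in wavelengths) and that all elements are uniformly weighted; the function names and the 3 TX by 4 RX example layout are illustrative and are not the optimized layout of this paper.

```python
import numpy as np

def virtual_positions(tx_pos, rx_pos):
    """Virtual element positions as all pairwise TX + RX sums (units: wavelengths)."""
    tx = np.asarray(tx_pos, dtype=float)
    rx = np.asarray(rx_pos, dtype=float)
    return (tx[:, None] + rx[None, :]).ravel()

def beam_pattern_db(positions, u):
    """Normalized boresight array factor in dB over u = sin(theta), uniform weights."""
    af = np.exp(2j * np.pi * np.outer(u, positions)).sum(axis=1)
    magnitude = np.abs(af) / len(positions)
    return 20.0 * np.log10(np.maximum(magnitude, 1e-12))

def peak_sidelobe_db(pattern_db):
    """Highest level outside the main lobe, which is bounded by its two nearest minima."""
    peak = int(np.argmax(pattern_db))
    left = peak
    while left > 0 and pattern_db[left - 1] < pattern_db[left]:
        left -= 1
    right = peak
    while right < len(pattern_db) - 1 and pattern_db[right + 1] < pattern_db[right]:
        right += 1
    sidelobes = np.concatenate([pattern_db[:left], pattern_db[right + 1:]])
    return float(sidelobes.max())

# Hypothetical layout: 3 TX spaced by 2 wavelengths, 4 RX spaced by half a wavelength.
positions = virtual_positions([0.0, 2.0, 4.0], [0.0, 0.5, 1.0, 1.5])
u_grid = np.linspace(-1.0, 1.0, 2001)
sll = peak_sidelobe_db(beam_pattern_db(positions, u_grid))
```

This example layout yields a filled 12-element half-wavelength virtual ULA, so its sidelobe level is close to the familiar value of about -13 dB for a uniformly weighted array. A GA would perturb the TX positions and keep the candidates with lower sidelobes that still satisfy the resolution constraint.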
Part of the GA optimization results is given in Figure 3, showing the antenna positions of the first and final generations. The locations of the RX antennas are established and locked using a ladder array design. Then, the positions of the TX antennas, located in the bottom row of the array, are optimized iteratively to form different generations.
Finally, comparisons of the zoomed array patterns and AF maps are given in Figure 4. The zoomed array patterns are shown in Figure 4a–c, and the AF maps are shown in Figure 4d–f, corresponding to the first, 15th, and final generations, respectively. The final angular resolution reaches approximately one degree. At this point, the array configuration reaches a stable state while meeting the requirements of a narrow main beamwidth and low sidelobes, demonstrating a good radiation pattern and energy-focusing ability.
The physical positions of the TX and RX antennas are also depicted in Figure 5. The steps shown in Figure 5 represent a scheme for forming an approximately uniform quasi-planar array. Due to the high integration of the mmWave chip, the antenna array usually needs to be deployed on a multi-layer printed circuit board (PCB) along with several other electronic components. This means the antenna layout inevitably has to compromise with other critical circuit structures.
Thus, achieving an ideal 2D-MIMO planar array is always difficult. However, the stepped layout provides greater flexibility and serves as an adaptive option for MIMO array design. Some studies on MIMO array deployment provide alternative methods for locating the antennas linearly [30], whereas others use more antennas to improve the system resolution through simple stacking and cascading [31,32]. The stepped array is a convenient choice for deploying the feeders, as its parallel arrangement avoids crossing or overlapping. Another consideration is that the feeders need to be equal in length, and this length is minimized in view of the 0.15 dB/cm propagation loss. This is also the reason we chose the stepped layout of the RX antennas in the vertical direction.
The gain in the E plane, the gain in the H plane, the AF map, and the beam pattern of the designed array are shown in Figure 6a, Figure 6b, Figure 6c, and Figure 6d, respectively. The designed field-of-view (FoV) of the antenna is 100 degrees in the E plane (horizontal) and 30 degrees in the H plane (vertical).
According to the results shown in Figure 6, the designed parameters of the antenna array are given in Table 1.
3.2. Antenna Design
To further improve the performance of the MIMO array, this paper adopts the microstrip antenna, which is commonly used in autonomous driving. The beam pattern is carefully designed to achieve higher SNR and to support the performance of the entire array design in terms of beam width, main lobe width, and SLL. The microstrip antenna is designed on a Rogers 3003 substrate with a thickness of 0.127 mm. The thickness of the copper layer on the surface is 0.035 mm. Electroless nickel immersion gold (ENIG) is adopted as the surface finish of the copper layer. The antenna has a center frequency of 77.8 GHz and a bandwidth from 76.4 GHz to 79.1 GHz. Lower phase noise can be obtained when the 76–78 GHz band is used.
To improve the main lobe level and achieve the beamwidth required by the FoV, a double U-shaped series-fed microstrip antenna is designed and adjusted according to Dolph–Chebyshev synthesis, as shown in Figure 7, and the detailed parameters are listed in Table 2. The characteristics of the designed antennas are shown in Figure 8 with the S-parameter and gain.
Moreover, an electromagnetic band gap (EBG) structure is introduced between closely spaced antennas to suppress mutual coupling; that is, several rows of rectangular EBG structures are placed between antenna units with small spacing, as shown in Figure 9.
The coupling level is effectively suppressed using the EBG structures, which can be seen from
Figure 10. The parameters
and
are given with EBG in
Figure 10a and without EBG in
Figure 10b, where
represents the coupling level between antenna 1 and
. It can be seen that the EBG structure brings more than 15 dB isolation improvement.
The antenna gains with and without EBG are also analyzed in Figure 11. It can be seen that the EBG structure has little influence on the antenna pattern. Moreover, when the TDM waveform is adopted, the mutual coupling effect is further reduced. The EBG structure is considered for areas of the antenna design that do not meet the isolation requirement.
3.3. Dynamic CFAR
To further improve the quality of the point cloud, a better CFAR detector is needed. The basic mechanism of target detection in RDMs is to judge whether a target exists using the CFAR method, which is shown in Figure 12. The training cells and guard cells are marked in green and blue, and they represent the background clutter and the target signal, respectively. When the number of training cells is small, the threshold fluctuates rapidly. Conversely, when the number of training cells is large, the threshold cannot follow the local background clutter level properly.
The decision thresholds
and
are calculated in range and Doppler dimensions, with the judgment written as follows:
In general, the guard region should be wider than the extent of the target signal so that the training cells sample only clutter. The specific parameters should be adjusted in practical applications according to this strategy.
However, a target with weak reflection usually cannot be detected when the training cells contain high power, which decreases the accuracy of spatial target detection. Therefore, the key point is to determine an adequate edge and threshold for the detection.
In the case of 2D target detection, CFAR with fixed parameters struggles to achieve accurate performance when the environment is complicated by non-uniformly distributed clutter. Thus, it is necessary to distinguish objects in different clutter environments in order to perform sensitive detection and capture the detailed characteristics of the target accurately.
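For reference, a fixed-parameter detector of the kind discussed above can be sketched as a one-dimensional cell-averaging CFAR. This is a generic textbook variant, not the detector proposed in this paper; the window sizes, the false alarm probability, and the threshold factor (which assumes exponentially distributed clutter power) are illustrative assumptions.

```python
import numpy as np

def ca_cfar_1d(power, n_train=8, n_guard=2, pfa=1e-3):
    """Cell-averaging CFAR along one dimension; returns a boolean detection mask.

    n_train and n_guard are the numbers of training and guard cells on EACH side
    of the cell under test (CUT). Edge cells without a full window are skipped.
    """
    power = np.asarray(power, dtype=float)
    n_ref = 2 * n_train
    # Threshold factor for exponentially distributed clutter power (assumption).
    alpha = n_ref * (pfa ** (-1.0 / n_ref) - 1.0)
    detections = np.zeros(len(power), dtype=bool)
    half = n_train + n_guard
    for cut in range(half, len(power) - half):
        leading = power[cut - half:cut - n_guard]
        lagging = power[cut + n_guard + 1:cut + half + 1]
        clutter_level = (leading.sum() + lagging.sum()) / n_ref
        detections[cut] = power[cut] > alpha * clutter_level
    return detections

# Hypothetical profile: flat unit clutter with one strong target in cell 32.
profile = np.ones(64)
profile[32] = 100.0
mask = ca_cfar_1d(profile)
```

Because the threshold is scaled from the average of all training cells, a clutter edge inside the window biases the estimate, which is the limitation that motivates the region-aware detector introduced next.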
The dynamic CFAR can be used in uniformly distributed clutter environments or complicated environments with non-uniformly distributed clutter. In the case of a complicated environment, the edge areas are identified with reference to the reference window. Then, the location region of the CUT is judged as either a strong or weak clutter region. Finally, two detectors are introduced to perform target detection, and the results of both areas are merged. The detailed diagram is shown in
Figure 13. The CUT is located in region 1, and the parameters of region 1 are used to calculate the detection threshold, which are the variance of the clutter power
and the length
, when the variance of the clutter power
and the length
present the parameters of region 2.
The maximum likelihood estimation method is used to determine the clutter edge position, which can be described as follows:
where
are the sampling points of the RD map, and
is the hypothesis of the clutter distribution.
Taking logarithms of both sides of (12), the maximum likelihood estimation of the threshold can be written as follows:
Abbreviate
for
and find the gradient on both sides of (13), using the following expression:
The estimates of
and
can be obtained as follows:
Then, the expression of
can be arranged and rewritten as follows:
Ignoring the constants, the split position of regions 1 and 2 can be optimized and written as follows:
After determining the split location of regions 1 and 2, the SOCA-CFAR algorithm is applied to detect targets. Building on the point cloud imagery process described above, the proposed method estimates the boundaries of non-uniform clutter regions and uses a dynamic threshold for each region to improve the point cloud accuracy.
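The clutter edge search and the smallest-of thresholding can be sketched as below. This is a simplified illustration, not the exact estimator derived above: it assumes the power samples in each region are independent and exponentially distributed, so that after dropping constants the log-likelihood of a split at index M reduces to -(M ln m1 + (N - M) ln m2), where m1 and m2 are the sample means of the two regions. The function names, the minimum region length, and the threshold factor are hypothetical.

```python
import numpy as np

def estimate_clutter_edge(power, min_len=4):
    """Maximum likelihood split index M: power[:M] is region 1, power[M:] is region 2.

    Assumes exponentially distributed power samples in each region (illustrative).
    """
    power = np.asarray(power, dtype=float)
    n = len(power)
    best_split, best_loglik = None, -np.inf
    for split in range(min_len, n - min_len + 1):
        mean1 = max(power[:split].mean(), 1e-12)
        mean2 = max(power[split:].mean(), 1e-12)
        loglik = -(split * np.log(mean1) + (n - split) * np.log(mean2))
        if loglik > best_loglik:
            best_split, best_loglik = split, loglik
    return best_split

def soca_threshold(leading, lagging, alpha):
    """Smallest-of cell-averaging threshold: scale the smaller of the two window means."""
    return alpha * min(np.mean(leading), np.mean(lagging))

# Hypothetical reference window: weak clutter (power 1) followed by strong clutter (power 10).
window = np.concatenate([np.ones(20), 10.0 * np.ones(12)])
edge = estimate_clutter_edge(window)
threshold = soca_threshold(window[:edge], window[edge:], alpha=3.0)
```

In a region-aware detector, the estimated edge would decide which reference cells belong to the same region as the CUT, so that a weak target in the low-clutter region is not masked by strong clutter on the other side of the edge.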