During signal preprocessing, the phase signal corresponding to the target range bin is extracted through Range-FFT and static clutter removal. The proposed MDCA method is then applied to adaptively select the optimal receive channel and range bin. Subsequent steps include phase unwrapping, phase difference computation, and impulse noise suppression to extract the vital sign signal.
Next, the VMD algorithm is applied to the preprocessed signal to separate the heartbeat component, completing the signal decomposition stage. Finally, the FIIB algorithm is employed in this context to estimate the heart rate. Notably, FIIB performs both coarse and fine frequency estimations, significantly improving measurement precision and frequency resolution.
This processing framework effectively enhances the SNR and mitigates various interferences in complex indoor environments through a multi-stage procedure, thereby achieving high-accuracy HR estimation. The following subsections detail each processing step.
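The phase-related preprocessing steps named above can be sketched as follows. This is a minimal illustration, assuming the complex slow-time samples of one range bin are already available after the Range-FFT and clutter removal. The function names `unwrap_phase` and `phase_difference` and the synthetic data are hypothetical, and impulse noise suppression is omitted.

```python
import cmath
import math

def unwrap_phase(wrapped):
    """Remove 2*pi jumps so that consecutive samples differ by less than pi."""
    out = [wrapped[0]]
    for p in wrapped[1:]:
        # Wrap the increment into [-pi, pi) before accumulating it.
        step = (p - out[-1] + math.pi) % (2 * math.pi) - math.pi
        out.append(out[-1] + step)
    return out

def phase_difference(phase):
    """First-order difference between consecutive unwrapped phase samples."""
    return [b - a for a, b in zip(phase, phase[1:])]

# Hypothetical samples of one range bin whose phase grows by 2 rad per frame.
samples = [cmath.exp(1j * 2.0 * n) for n in range(6)]
wrapped = [cmath.phase(s) for s in samples]  # values confined to [-pi, pi]
unwrapped = unwrap_phase(wrapped)
diffs = phase_difference(unwrapped)
```

Unwrapping restores the continuous phase trajectory, and the difference step then removes slow drift before the vital sign components are analyzed.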
3.1. Multi-Dimensional Coherent Accumulation Algorithm
Radar-detected heartbeat signals are extremely weak and easily affected by respiratory harmonics, cross-modulation, and noise. Signal accumulation is thus essential for enhancing signal quality and enabling accurate HR estimation. Although many modern radars (e.g., those from TI, NXP, and other manufacturers) support multi-channel reception, most HR detection methods still use single-channel processing and thus underutilize the available information. In the following, we present a signal accumulation scheme that exploits multi-channel information. For concreteness, this study adopts the widely used Texas Instruments (TI) AWR1642 millimeter-wave radar as a representative example; the algorithm is applicable to similar multi-antenna systems.
Most radar-based spatial domain systems still rely on single-channel processing. Although some studies (e.g., [19]) use direct summation of multi-channel signals (referred to here as equal gain combination (EGC)), this approach often underperforms due to channel variability. For instance, as shown in Figure 3a, signals obtained from four antennas exhibit noticeable differences: the waveforms from antennas 3 and 4 deviate substantially in shape from those of antennas 1 and 2, rendering simple signal summation ineffective.
Spatial domain processing faces several key challenges: vital sign signals vary across channels, resulting in low correlation [17], and phase unwrapping is nonlinear and complex [37]. As a result, direct multi-channel summation often fails to deliver optimal performance. To address the limitations of the EGC scheme, we propose a spatial domain coherent accumulation (SDCA) method that follows the “fewer but better” principle. Instead of using all channels, SDCA adaptively selects a small number of high-quality channels, thereby improving accumulation gain and signal quality.
During heartbeat signal detection, the entire radar data frame is partitioned into n distinct groups, with each group containing J frames. Based on empirical analysis, we set . Within each group, a reference range bin is selected from the J frames. Additionally, for each frame, the range bin with the maximum energy across all channels is dynamically selected as the reference. The specific processing steps for each group of data are outlined as follows.
The selection of the range bin is performed first, where the energy of the phase signals from the four channels is calculated as:
$$E_{k,i} = \sum_{j=1}^{J} \left| \varphi_{k,i}(j) \right|^{2}$$
where $E_{k,i}$ denotes the signal energy of the $i$th range bin in the $k$th channel, $\varphi_{k,i}(j)$ represents the phase signal in the $j$th frame, and $J$ is the total number of frames in each group. The reference channel and reference range bin are then selected according to the maximum energy criterion:
$$\left(k_{\mathrm{ref}},\, i_{\mathrm{ref}}\right) = \arg\max_{k,\,i} E_{k,i}$$
where $i_{\mathrm{ref}}$ is the chosen reference range bin, and $k_{\mathrm{ref}}$ is the corresponding reference channel. Since the AWR1642 radar is equipped with four antennas, the antenna index $k$ ranges from 1 to 4. A 256-point FFT is performed along the fast-time dimension, so the range bin index $i$ spans from 1 to 256. Both the reference channel and range bin are selected once per group, i.e., once every $J$ frames.
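The maximum energy criterion can be illustrated with a short sketch. It assumes one group of phase data stored as a nested list `phase[k][i][j]` (channel, range bin, frame); this layout and the function name `select_reference` are hypothetical, and the indices are zero-based rather than the one-based indices used in the text.

```python
def select_reference(phase):
    """Return (k_ref, i_ref, energy) of the channel/range-bin pair with maximum energy.

    phase[k][i] is the list of J phase samples of channel k and range bin i
    (hypothetical layout for one group of J frames).
    """
    best = None
    for k, channel in enumerate(phase):
        for i, series in enumerate(channel):
            energy = sum(v * v for v in series)  # energy over the J frames
            if best is None or energy > best[0]:
                best = (energy, k, i)
    energy, k_ref, i_ref = best
    return k_ref, i_ref, energy

# Toy group: 2 channels, 3 range bins, J = 4 frames.
group = [
    [[0.1, -0.1, 0.1, -0.1], [0.5, -0.4, 0.5, -0.4], [0.0, 0.1, 0.0, 0.1]],
    [[0.2, -0.2, 0.2, -0.2], [0.9, -0.8, 0.9, -0.8], [0.1, 0.0, 0.1, 0.0]],
]
k_ref, i_ref, e_max = select_reference(group)
```

In this toy group the second channel and second range bin carry the largest energy, so they become the reference pair for the whole group.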
After determining the reference channel, an optimal channel for signal accumulation is selected from the remaining three receiving channels of the AWR1642 radar. Following the approach in [10], the optimal channel is chosen based on the correlation coefficient between the reference channel and each candidate channel. Specifically, let $\mathbf{x}$ and $\mathbf{y}$ denote the phase signal vectors obtained after phase unwrapping from the reference channel and another channel, respectively. The Pearson correlation coefficient between these two vectors is then computed as:
$$\rho(\mathbf{x}, \mathbf{y}) = \frac{\operatorname{cov}(\mathbf{x}, \mathbf{y})}{\sigma_{\mathbf{x}}\, \sigma_{\mathbf{y}}}$$
where $\operatorname{cov}(\mathbf{x}, \mathbf{y})$ represents the covariance of these vectors, and $\sigma_{\mathbf{x}}$ and $\sigma_{\mathbf{y}}$ are the corresponding standard deviations. Next, we identify the candidate channel with the highest correlation coefficient:
$$l = \arg\max_{k \neq k_{\mathrm{ref}}} \rho_{k}$$
where $\rho_{k}$ denotes the correlation coefficient between the $k$th channel and the reference channel $k_{\mathrm{ref}}$, and $l$ represents the index of the channel corresponding to the maximum correlation coefficient. Finally, we evaluate whether the correlation coefficient $\rho_{l}$ exceeds the predefined threshold $\rho_{\mathrm{th}}$. If $\rho_{l} > \rho_{\mathrm{th}}$, we accumulate the phase signal by directly summing the phase signals of these two channels (channels $l$ and $k_{\mathrm{ref}}$). Otherwise, we double the phase signal of $k_{\mathrm{ref}}$. This selection process can be expressed as:
$$\varphi_{\mathrm{acc}} = \begin{cases} \varphi_{\mathrm{ref}} + \varphi_{\mathrm{opt}}, & \rho_{l} > \rho_{\mathrm{th}} \\ 2\,\varphi_{\mathrm{ref}}, & \text{otherwise} \end{cases}$$
where $\varphi_{\mathrm{ref}}$ and $\varphi_{\mathrm{opt}}$ represent the phase signals of the reference channel and the optimal channel, respectively. Based on extensive simulations, we set the threshold $\rho_{\mathrm{th}} = 0.8$ in our experiments. Channels with correlation coefficients above 0.8 are selected for accumulation, as this threshold consistently provides stable and effective performance. A lower threshold such as 0.6 may introduce less correlated channels, leading to interference, while a higher threshold like 0.9 may exclude too many channels, weakening the benefit of multi-channel accumulation. Therefore, $\rho_{\mathrm{th}} = 0.8$ offers a reasonable trade-off between selectivity and accumulation effectiveness.
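The channel selection and accumulation rule can be sketched as follows. This is a minimal illustration, assuming each channel is a list of unwrapped phase samples with non-zero variance; the function names `pearson` and `sdca` and the toy waveforms are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # the 1/n factors cancel

def sdca(channels, k_ref, threshold=0.8):
    """Sum the reference channel with its best-correlated partner, or double it."""
    ref = channels[k_ref]
    scores = {k: pearson(ref, ch) for k, ch in enumerate(channels) if k != k_ref}
    best = max(scores, key=scores.get)
    if scores[best] > threshold:
        return [a + b for a, b in zip(ref, channels[best])], best
    return [2.0 * a for a in ref], None  # no partner passes the threshold

ref = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
similar = [0.1, 0.9, 0.0, -1.1, 0.1, 1.0, -0.1, -0.9]
dissimilar = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]

acc, partner = sdca([dissimilar, ref, similar], k_ref=1)
acc_fallback, partner_fallback = sdca([dissimilar, ref], k_ref=1)
```

Doubling the reference signal in the fallback branch keeps the amplitude scale of the output comparable whether or not a partner channel is found.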
Figure 3b compares the original signal and the enhanced signal after applying SDCA, demonstrating improved displacement amplitude via spatial accumulation.
SDCA leverages multi-antenna data to enhance signal quality through optimal channel selection. However, since human reflections span multiple adjacent range bins [20], extending the SDCA approach to incorporate signals from these distributed range bins [25] can be expected to further improve HR detection performance.
Although [20] proposed selecting adjacent bins with high correlation (e.g., based on a Pearson coefficient threshold), we observe that, after spatial accumulation, the correlation between bins frequently decreases, limiting the applicability of this method. Similarly, refs. [21,22] explored multi-bin or region accumulation, but when combined with SDCA these approaches yield little gain or add computational burden.
To address this, we propose a simple yet effective method: After selecting the reference bin, only the adjacent bin (either preceding or following) with the highest energy is used for accumulation, ensuring improved performance with minimal complexity.
The MDCA procedure proposed in this study consists of the following steps:
Spatial domain processing: First, select the channel with the highest energy as the reference channel. Then, identify the channel that has the highest correlation with the reference channel as the optimal channel. If the correlation coefficient exceeds a predefined threshold, the phase signals from these two channels are summed to perform SDCA.
Temporal domain processing: Determine the range bin with the highest energy as the reference bin. Then, of the two bins adjacent to the reference bin, select the one with the higher energy. For each of these two bins, apply the spatial domain accumulation described above.
Joint temporal–spatial accumulation: Finally, sum the signals from the two selected range bins (after spatial domain accumulation) to complete the joint accumulation across both the temporal and spatial domains.
By processing each data group using the above method, the proposed MDCA approach can be effectively implemented. This method fully leverages both spatial and temporal information for signal accumulation, thereby significantly improving the accuracy of heart rate estimation.
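The three steps can be combined in a compact sketch. It assumes one group of data stored as `phase[k][i]` (a list of phase samples for channel k and range bin i), at least two channels, and non-constant signals. It also assumes that the reference channel is re-selected by energy within each of the two bins, which the description above leaves open. All function names are hypothetical.

```python
import math

def energy(x):
    return sum(v * v for v in x)

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def sdca_bin(phase, i, threshold=0.8):
    """Spatial accumulation within range bin i."""
    series = [channel[i] for channel in phase]
    k_ref = max(range(len(series)), key=lambda k: energy(series[k]))
    others = [k for k in range(len(series)) if k != k_ref]
    best = max(others, key=lambda k: pearson(series[k_ref], series[k]))
    if pearson(series[k_ref], series[best]) > threshold:
        return [a + b for a, b in zip(series[k_ref], series[best])]
    return [2.0 * a for a in series[k_ref]]

def mdca(phase, threshold=0.8):
    """Joint accumulation over the reference bin and its stronger neighbour."""
    n_bins = len(phase[0])
    bin_energy = [max(energy(ch[i]) for ch in phase) for i in range(n_bins)]
    i_ref = max(range(n_bins), key=bin_energy.__getitem__)
    neighbours = [i for i in (i_ref - 1, i_ref + 1) if 0 <= i < n_bins]
    i_adj = max(neighbours, key=bin_energy.__getitem__)
    acc_ref = sdca_bin(phase, i_ref, threshold)
    acc_adj = sdca_bin(phase, i_adj, threshold)
    return [a + b for a, b in zip(acc_ref, acc_adj)], i_ref, i_adj

# Toy group: 2 channels, 3 range bins, scaled copies of one waveform.
w = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
toy = [
    [[0.10 * v for v in w], [1.0 * v for v in w], [0.5 * v for v in w]],
    [[0.05 * v for v in w], [0.8 * v for v in w], [0.4 * v for v in w]],
]
accumulated, i_ref, i_adj = mdca(toy)
```

Because only one neighbouring bin is added, the cost grows by a single extra spatial accumulation per group.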
Figure 4 illustrates the frequency spectra of the single-channel signal and the signal after MDCA enhancement. The red dashed line represents the single-channel method, the blue solid line corresponds to the MDCA method proposed in this study, the green dashed line marks the reference respiration rate, and the black dashed line indicates the reference heart rate. It can be observed that the main spectral peaks remain located at the reference respiratory and heart rates after enhancement. Meanwhile, the blue line exhibits significantly higher amplitude than the red dashed line, indicating that the vital sign components are effectively enhanced by MDCA.
To quantify the enhancement effect, we also calculated the frequency-domain SNR, defined as the ratio between the spectral peak amplitude at the target frequency (heartbeat) and the average noise amplitude in neighboring bins, excluding known harmonics. The specific formula is provided in (11) and follows conventions in radar-based vital sign studies:
where
is a Discrete Fourier Transform (DFT) of signal
,
is the signal power,
,
,
is the maximum amplitude position within the frequency range of 0.8–2 Hz,
is the sampling frequency, and
N is the number of sampling points. In the example of Figure 4, the unenhanced signal has an SNR of −7.27 dB, whereas the MDCA-enhanced signal achieves −2.60 dB, corresponding to an improvement of 4.67 dB.
To further assess the effectiveness of MDCA, a more detailed spectral analysis was performed. Although the true frequency components remain unchanged, we observed that in some cases, spurious peaks in the unprocessed signal can be stronger than the actual physiological peaks, potentially leading to incorrect frequency estimation. As illustrated in Figure 5, with a reference heart rate of 78 bpm (1.3 Hz), the single-channel signal exhibits a stronger peak near 1.37 Hz, resulting in a deviation of 0.07 Hz (approximately 4.2 bpm). After applying MDCA, the heart rate component is significantly enhanced through accumulation and becomes the dominant peak, thereby improving the estimation accuracy.
Consequently, the SNR is substantially improved by MDCA, which provides a stronger foundation for subsequent precise frequency estimation.
3.3. Spectrum Analysis
The fundamental component of the HR signal is often closely spaced in frequency with respiratory harmonics, cross-modulation components, and other physiological or environmental interferences. In many cases, the frequency separation between the HR fundamental and nearby interference is very narrow, which poses a significant challenge for conventional spectral analysis methods such as the FFT. These methods suffer from limited resolution, making it difficult to distinguish and accurately extract the true HR component. Although the MUSIC and estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithms have super-resolution capability, their heavy computational requirements make them difficult to adopt for HR detection. Consequently, an algorithm that combines super-resolution capability, high spectral analysis precision, and low computational complexity is needed for accurate and efficient heart rate estimation.
To overcome the aforementioned challenges, we adopt the FIIB algorithm for HR estimation. FIIB, originally developed for direction-of-arrival (DOA) estimation in the spatial domain, effectively combines super-resolution capability with low computational complexity, making it particularly suitable for radar-based physiological monitoring.
Following the approach in [39], we employ an iterative “estimate–subtract” strategy, where a coarse estimation stage first identifies dominant frequency components using the FFT, and a fine estimation stage then iteratively refines these results to minimize estimation bias and improve resolution. The complete workflow of the FIIB algorithm is summarized in Algorithm 1.
Algorithm 1 Frequency Estimation Based on FIIB
1: Initialization:
2: for l = 1 to L do
3:
4:
5: end for
6:
7:
8: while and not converged do
9:     for l = 1 to L do
10:         if then ▹ Coarse Estimate
11:             for to do
12:                 subtract the previously estimated sources from the DFT coefficients
13:             end for
14:             locate the highest peak of the resulting spectrum
15:             form the normalized coarse frequency estimate
16:         else ▹ Accurate Estimate
17:             for do
18:
19:                 compute the DFT coefficients of the complex exponential signal
20:             end for
21:
22:
23:         end if
24:
25:     end for
26:
27: end while
28: Output: for
Specifically, for the lth source, we first subtract the previously estimated sources in line 12 and then locate the highest peak of the spectrum in line 14, which reduces the impact of the other signals on the current (lth) signal. Note that in line 12,
represents the DFT coefficient of the
ith frequency component at the normalized frequency
. This relationship is formally expressed as:
where
denotes Fourier transform of
ith frequency component which can be expressed as:
The normalized coarse estimate obtained in line 15 serves as input for the spectral refinement in lines 19 to 22, where Fourier coefficient interpolation is applied to improve estimation accuracy, yielding the fine estimate [40].
To refine the measurement range, line 19 calculates the DFT coefficients of complex exponential signal at frequency .
These coefficients are subsequently subtracted from the spectral leakage components associated with other estimated frequency points, which are expressed as:
where the leakage DFT coefficients can be expressed as:
Subsequently, spectral interpolation is performed using the leakage-compensated DFT coefficients
to estimate the frequency offset. Let
denote the true frequency deviation between the
lth complex exponential signal and the maximum spectral line obtained from coarse estimation, where
. The frequency offset of the complex exponential signal is then given by
, and the spectral estimation result for the
th iteration is expressed as:
Finally, the frequency estimate of the lth complex exponential component is refined by incorporating the estimated offset. The associated complex amplitude is then calculated by compensating for spectral leakage from the other components at this frequency. The estimation proceeds sequentially from the strongest to the weakest signal, and this process is repeated for all components over Q iterations.
The FIIB algorithm requires only a single FFT computation and combines a small number of Fourier coefficients with interpolation techniques, thereby effectively reducing the computational complexity.
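The estimate-subtract idea can be illustrated with a simplified sketch. It is not the listing of Algorithm 1: it subtracts the other components in the time domain instead of removing their leakage from stored DFT coefficients, it runs a fixed number of passes instead of testing convergence, and it uses a half-bin Fourier coefficient interpolator for the fine step. The names `dtft` and `fiib_sketch` and the test tones are hypothetical.

```python
import cmath

def dtft(x, f):
    """DTFT of x at normalized frequency f (cycles per sample)."""
    return sum(v * cmath.exp(-2j * cmath.pi * f * n) for n, v in enumerate(x))

def fiib_sketch(x, n_tones, n_iter=10):
    """Estimate frequencies and complex amplitudes of n_tones complex exponentials."""
    N = len(x)
    freqs = [0.0] * n_tones
    amps = [0j] * n_tones
    for q in range(n_iter):
        for l in range(n_tones):
            # Subtract: remove the current estimates of all other tones.
            resid = [
                v - sum(amps[i] * cmath.exp(2j * cmath.pi * freqs[i] * n)
                        for i in range(n_tones) if i != l)
                for n, v in enumerate(x)
            ]
            if q == 0:
                # Coarse estimate: strongest DFT bin of the residual.
                m = max(range(N), key=lambda k: abs(dtft(resid, k / N)))
                freqs[l] = m / N
            else:
                # Fine estimate: interpolate on coefficients half a bin either side.
                x_plus = dtft(resid, freqs[l] + 0.5 / N)
                x_minus = dtft(resid, freqs[l] - 0.5 / N)
                delta = 0.5 * ((x_plus + x_minus) / (x_plus - x_minus)).real
                freqs[l] += delta / N
            # Complex amplitude of tone l at its current frequency estimate.
            amps[l] = dtft(resid, freqs[l]) / N
    return freqs, amps

# Two noise-free complex tones that do not fall on DFT bin centres.
N = 64
f_true = [10.3 / N, 20.7 / N]
a_true = [1.0, 0.8]
x = [sum(a * cmath.exp(2j * cmath.pi * f * n) for a, f in zip(a_true, f_true))
     for n in range(N)]
freqs, amps = fiib_sketch(x, n_tones=2)
```

The pure-Python DTFT keeps the sketch dependency-free; a practical implementation would compute the coarse spectrum with a single FFT, as the text describes.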
3.3.1. Frequency Estimation Performance Evaluation
To analyze the frequency estimation performance of FIIB, we considered a simulation scenario involving dual-frequency signals defined as:
where
is the sampling period,
and
are the amplitudes of the two signals.
and
denote the frequencies of the two signals. The sampling frequency is set to
Hz, and the number of samples
. The noise is additive white Gaussian noise (AWGN), and the SNR varies from −15 dB to 10 dB in steps of 1 dB. In all examples, Monte Carlo experiments were performed independently for 5000 times, and the performance of the FIIB method was compared with that of the FFT and MUSIC algorithms. The mean squared error (MSE) was used in the experiments to evaluate the accuracy of the frequency estimation, defined as:
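$$\mathrm{MSE} = \frac{1}{M} \sum_{m=1}^{M} \left( \hat{f}_{m} - f \right)^{2}$$
where $\hat{f}_{m}$ is the frequency estimated in the $m$th trial, $M$ denotes the number of Monte Carlo trials, which is set to 5000 in this study, and $f$ is the true frequency. The simulation parameters are set according to the actual conditions of radar monitoring of HR,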
,
,
,
, and $\Delta f$ is the frequency resolution of the FFT, which can be written as:
$$\Delta f = \frac{f_{s}}{N}$$
Figure 6 presents the MSE curves plotted against the SNR for the three algorithms. The results demonstrate that the FIIB algorithm approaches the Cramér-Rao Bound (CRB) more closely when the SNR exceeds −6 dB. While the MUSIC algorithm achieves comparable measurement accuracy to FIIB, it does so at the expense of significantly higher computational complexity. As a result, the FIIB algorithm offers superior spectral analysis performance combined with low computational cost, rendering it particularly well-suited for heart rate detection applications.
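The MSE evaluation can be sketched for the plain FFT peak estimator. All numerical values below (sampling frequency, number of samples, tone frequency, SNR, and trial count) are hypothetical choices for illustration, not the settings used in the paper, and a single complex tone is used instead of the dual-frequency signal.

```python
import cmath
import random

def fft_peak_freq(x, fs):
    """Frequency (Hz) of the largest-magnitude DFT bin."""
    N = len(x)
    mags = [abs(sum(v * cmath.exp(-2j * cmath.pi * k * n / N) for n, v in enumerate(x)))
            for k in range(N)]
    return mags.index(max(mags)) * fs / N

def mse(estimates, f_true):
    """Mean squared error of the estimates with respect to the true frequency."""
    return sum((f - f_true) ** 2 for f in estimates) / len(estimates)

def monte_carlo_mse(f_true, fs, N, snr_db, trials, seed=0):
    rng = random.Random(seed)
    # Standard deviation per real/imaginary part for a unit-amplitude tone.
    sigma = (10 ** (-snr_db / 10) / 2) ** 0.5
    estimates = []
    for _ in range(trials):
        x = [cmath.exp(2j * cmath.pi * f_true * n / fs)
             + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
             for n in range(N)]
        estimates.append(fft_peak_freq(x, fs))
    return mse(estimates, f_true)

fs, N = 20.0, 64          # hypothetical sampling frequency (Hz) and sample count
delta_f = fs / N          # FFT frequency resolution
f_true = 4.3 * delta_f    # about 1.34 Hz, deliberately off the bin grid
result = monte_carlo_mse(f_true, fs, N, snr_db=10, trials=50)
```

At this SNR the peak bin is always the nearest one, so the error is dominated by the grid offset of 0.3 times the resolution. Interpolation-based estimators such as FIIB are designed to remove exactly this floor.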
3.3.2. Resolution and Computational Complexity Comparison
Figure 7a,b compares the performance of MUSIC and FFT algorithms at an SNR of 20 dB, using frequency intervals set at 0.75 and 0.5 times the FFT resolution, respectively. FFT fails to resolve the closely spaced frequencies in both scenarios due to its limited resolution. In contrast, MUSIC successfully resolves the 0.75 times interval but fails at the 0.5 times interval. Notably, FIIB employs an efficient ‘estimate–subtract’ approach, which does not directly produce a conventional spectral plot. As indicated in Table 1, FIIB accurately resolves targets at both 0.75 and 0.5 times the FFT resolution, with an RMSE (defined as the square root of the MSE) below 0.0026 Hz, demonstrating its super-resolution capability and estimation accuracy.
Further analysis reveals that even at a frequency spacing of 0.8 times the FFT resolution, the FFT method fails to resolve two closely spaced frequency components. The MUSIC algorithm successfully resolves signals starting at 0.7 times the FFT resolution, whereas the proposed FIIB algorithm consistently resolves the two targets across all tested intervals ranging from 0.4 to 0.8 times the FFT resolution.
Additionally, we compared the computational complexity of FFT, MUSIC, and FIIB algorithms on a laptop equipped with an Intel i5-8265U CPU and 8 GB RAM. Their execution times are approximately 0.001 s, 26 s, and 0.06 s, respectively. MUSIC is significantly more computationally demanding than the other two algorithms, primarily due to the eigenvalue decomposition involved. While FFT is the fastest, it lacks super-resolution capabilities and suffers from lower accuracy. FIIB provides a favorable trade-off, combining high accuracy and super-resolution with low computational complexity, rendering it well-suited for real-time heart rate measurement applications.