1. Introduction
As AIoT [1] and pervasive sensing technology have advanced rapidly, the smart home, smart industry, and smart security fields have grown quickly over the past few years. This growth has exposed several problems (e.g., abnormal crowd flow leading to stampede events, inaccurate business analysis in critical areas, and a low level of indoor security). Accordingly, numerous scholars worldwide have used sensors to study people counting [2,3] and population density [4] in a fixed region of interest (ROI). In [5], features extracted from video images were used to address the people counting problem. However, the critical disadvantage of video-image-based algorithms is that they are easily affected by lighting conditions, viewpoint diversity, and spatial complexity, so their performance fluctuates widely. Under extreme conditions (e.g., fog and smoke), video-image-based algorithms perform poorly or even fail to work, and they also risk disclosing personal privacy. Moreover, most video-image-based algorithms use high-resolution images or videos as training samples and therefore consume large amounts of memory and computation. Thus, researchers have turned their attention to Radio Frequency (RF)-based algorithms [6,7]. RF-based people counting methods [8] are mainly WiFi-based and Radio Frequency Identification (RFID)-based. WiFi-based methods fall into two groups: those that use the Received Signal Strength Indicator (RSSI) [9] of WiFi, and those that use the variation of Channel State Information (CSI) [10] subcarriers in the WiFi physical layer. However, since WiFi signals are easily disturbed by multipath effects, they cannot reflect subject information in a timely and accurate manner, which leads to low recognition accuracy. RFID-based [7,8] people counting methods are inconvenient for daily use and maintenance because of the considerable number of RF sensor devices required. Besides RF-based methods, there are also detection methods based on Passive Infrared (PIR) detectors [11] and thermal imaging [12]. PIR-based methods are simple in principle and equipment and have low algorithmic complexity; however, they are susceptible to detection failures caused by the surrounding temperature. Thermal imaging-based methods can be applied in low-light environments, but they are as susceptible to ambient temperature as PIR-based methods. Furthermore, they consume more computational resources because their algorithms are more complex.
Ultra-Wide Band (UWB) technology operates in the 3.1–10.6 GHz communication band, which the U.S. FCC authorized in 2002 [13]. UWB uses a higher and wider frequency band than other communication technologies. The popular Bluetooth and WiFi technologies use narrow bands at 2.4 GHz and 5 GHz, respectively, and transmit at comparatively high power, as shown in Figure 1. IR-UWB radar transmits non-sinusoidal pulses with durations in the nanosecond (ns) to picosecond (ps) range. Such pulses occupy bandwidths of up to several gigahertz, so the maximum data rate can reach several hundred megabits per second. In other words, UWB uses impulse radio at the physical layer of its air interface to raise the physical-layer data rate. A comparison of the transceivers of conventional narrowband communication systems and IR-UWB shows a clear difference in implementation. Narrowband systems generally use sinusoidal carrier modulation to shift the spectrum; the channel carries the modulated RF signal, and the receiver must down-convert it stage by stage and demodulate it to recover the original information. In contrast, IR-UWB directly transmits narrow wideband pulses after spectrum shaping. The channel carries baseband signals, and the receiver is mainly a correlation detector, which is much simpler in structure than a traditional narrowband receiver. IR-UWB technology solves significant propagation problems that affect conventional wireless technology and offers insensitivity to channel fading, low power spectral density of the transmitted signal, low system complexity, and centimeter-level positioning accuracy. Therefore, UWB technology offers low signal strength, high resolution, excellent non-contact performance, and strong anti-interference performance. In recent years, applications based on UWB technology have emerged rapidly, including the detection of vital signs such as breathing and heartbeat [14], non-contact recording of target trajectories [15], and target localization in large indoor scenes [16].
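To make the correlation-detector idea concrete, the following minimal Python sketch generates a Gaussian monocycle as a stand-in for an IR-UWB pulse, embeds an attenuated copy in a noisy received frame, and recovers its delay by sliding correlation against the pulse template. It is an illustration only and not the transceiver design discussed in this paper: the pulse shape, the sampling rate `FS_HZ`, the pulse width `SIGMA_S`, the delay, the attenuation, and the noise level are all assumed values.

```python
import math
import random

def gaussian_monocycle(n_samples, fs_hz, sigma_s):
    """First derivative of a Gaussian, centred in the window (assumed pulse shape)."""
    centre = (n_samples - 1) / 2.0
    pulse = []
    for n in range(n_samples):
        t = (n - centre) / fs_hz
        pulse.append(-t / sigma_s ** 2 * math.exp(-t * t / (2 * sigma_s ** 2)))
    peak = max(abs(v) for v in pulse)
    return [v / peak for v in pulse]  # normalise to unit peak amplitude

def correlate(received, template):
    """Sliding correlation of the template against the received frame."""
    m = len(template)
    return [sum(received[k + i] * template[i] for i in range(m))
            for k in range(len(received) - m + 1)]

def estimate_delay(received, template):
    """Return the lag (in samples) at which the correlation is largest."""
    scores = correlate(received, template)
    return max(range(len(scores)), key=lambda k: scores[k])

# Assumed parameters: 20 GS/s sampling and a pulse width of roughly 100 ps.
FS_HZ = 20e9
SIGMA_S = 100e-12
template = gaussian_monocycle(41, FS_HZ, SIGMA_S)

# Synthetic received frame: Gaussian noise plus one attenuated echo.
rng = random.Random(0)
true_delay = 120  # hypothetical echo position, in samples
received = [rng.gauss(0.0, 0.05) for _ in range(400)]
for i, v in enumerate(template):
    received[true_delay + i] += 0.8 * v

estimated_delay = estimate_delay(received, template)
print(true_delay, estimated_delay)
```

Because the receiver only multiplies and accumulates against a known template, no carrier recovery or staged down-conversion is needed, which is the structural simplification described above.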
In recent years, many scholars have focused on people counting methods based on pulsed radar [17] and Frequency Modulated Continuous Wave (FMCW) radar [18], because these methods are affected little or not at all by light and ambient temperature, resolve people and objects well, and avoid issues such as personal privacy. IR-UWB radar-based people counting methods generally fall into two branches. The first is Line of Interest (LOI) counting, which detects people standing on the same line at a fixed angle; for example, the authors of [19,20] used IR-UWB radar sensors to count people along a line. The second is ROI counting, i.e., counting people within a fixed region of fixed size, which is the focus of this paper. In [21], the authors use an iterative algorithm to detect the local maxima of IR-UWB radar signals to count people. In [22], the authors simulate a theoretical model of UWB signals. In [23], the authors proposed an algorithm based on principal clustering to analyze how the distribution of selected amplitudes varies with distance and number of people. In [24,25], the authors detect the individual signal caused by each person in the received radar echoes and thereby determine the number of people. However, because of the surrounding environment, the received radar echoes often contain clutter caused by multipath effects, so this detection approach does not count individuals accurately.
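The weakness of echo-detection counting can be illustrated with a small Python sketch. It counts local maxima above a fixed threshold in a synthetic echo envelope, and then shows how a single multipath ghost echo inflates the count. This is a generic peak counter written for illustration, not the algorithm of any cited work; the envelope shape, the threshold, the minimum peak separation, and the echo positions and amplitudes are all assumptions.

```python
import math

def count_peaks(envelope, threshold, min_separation):
    """Return indices of local maxima above threshold, at least min_separation bins apart."""
    candidates = [i for i in range(1, len(envelope) - 1)
                  if envelope[i] > threshold
                  and envelope[i] >= envelope[i - 1]
                  and envelope[i] > envelope[i + 1]]
    # Keep the strongest peaks first, dropping weaker ones that are too close.
    candidates.sort(key=lambda i: envelope[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_separation for j in kept):
            kept.append(i)
    return sorted(kept)

def synthetic_envelope(n_bins, echoes, width=3.0):
    """Sum of Gaussian-shaped echoes given as (centre_bin, amplitude) pairs."""
    env = [0.0] * n_bins
    for centre, amplitude in echoes:
        for i in range(n_bins):
            env[i] += amplitude * math.exp(-((i - centre) ** 2) / (2 * width ** 2))
    return env

# Two hypothetical people at range bins 40 and 90.
people = [(40, 1.0), (90, 0.7)]
clean = synthetic_envelope(200, people)

# Same scene plus one multipath ghost echo at bin 130 (assumed amplitude 0.4).
with_ghost = synthetic_envelope(200, people + [(130, 0.4)])

clean_peaks = count_peaks(clean, threshold=0.3, min_separation=5)
ghost_peaks = count_peaks(with_ghost, threshold=0.3, min_separation=5)
print(len(clean_peaks), len(ghost_peaks))
```

Raising the threshold to reject the ghost would also reject weak echoes from distant or partially occluded people, which is why a single fixed threshold is hard to tune.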
Many scholars have worked on problems related to electromagnetic interference (EMI), scattering, and multipath effects in recent years. The article [26] investigates the suppression of strong electromagnetic interference, including near-field EMI, without a priori information in non-cooperative scenarios; the proposed algorithm effectively suppresses strong EMI without a priori knowledge while still capturing the target signal. In [27], the authors proposed a method to effectively solve the near-field electromagnetic scattering of scatterers under external field irradiation. The method is based on the Helmholtz equation discretized using the finite element method. To deal with reverberation and interference fields, the authors of [28] combined the minimum variance distortionless response (MVDR) beamformer with Multi-Channel Linear Prediction (MCLP) to achieve active noise reduction. To deal effectively with multipath propagation, the article [29] addresses the reverberation of radar echo data and proposes an alternative to conventional reverberation estimation that operates on time-frequency sequences. To address the multipath effect in IR-UWB radar echoes, the authors of [23] used a Probabilistic Model (PM) to analyze how the amplitude of selected radar echoes varies with distance and thereby determine the number of individuals in the target area. Still, this algorithm is not applicable when there are too many people in the target area. The authors of [30] proposed an algorithm that counts the individuals in a crowded target area using image feature extraction from 2D signals. However, this algorithm cannot handle IR-UWB radar signals with low SNR.
Existing domestic and international solutions for analyzing IR-UWB radar signals are complex and still struggle to accurately count the people in the ROI from low-SNR signals. For this reason, this study proposes an effective 1DCNN-LSTM algorithm that accurately detects the number of targets even in low-SNR environments with many people. In [31], Convolutional Neural Network (CNN) [32] techniques were adopted to automatically filter noise from received signals and images while extracting effective features by constructing a model. Existing research suggests that CNNs are suited to extracting features without temporal dependencies [33] or features that show significant local differences. However, the received signals are time series with strong temporal dependencies, so the CNN technique should not be adopted alone.
Long Short-Term Memory (LSTM) networks are a particular type of Recurrent Neural Network (RNN) with cyclic feedback designed to process time series [34]. Thus, the LSTM layer can encode information about class-specific features across time [35]. Accordingly, considering the temporal characteristics of IR-UWB radar signals, this study proposes a model architecture that combines CNN and LSTM networks.
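The data flow of such a combined model can be sketched in plain Python: a 1-D convolution with ReLU and max pooling extracts local features from a radar frame, an LSTM then reads the pooled feature sequence step by step, and a softmax layer turns the final hidden state into class probabilities. The sketch below is a forward pass with random, untrained weights and no bias terms. The layer sizes, the number of classes, the synthetic input frame, and all function names are hypothetical and do not reproduce the architecture detailed in Section 3.

```python
import math
import random

def conv1d_relu(signal, kernels):
    """Valid 1-D cross-correlation per kernel, followed by ReLU."""
    k = len(kernels[0])
    return [[max(0.0, sum(signal[t + i] * kern[i] for i in range(k)))
             for t in range(len(signal) - k + 1)]
            for kern in kernels]

def max_pool(feature_maps, size):
    """Non-overlapping max pooling along time for each channel."""
    return [[max(fm[t:t + size]) for t in range(0, len(fm) - size + 1, size)]
            for fm in feature_maps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_last_hidden(sequence, weights, hidden_size):
    """Run a single-layer LSTM (biases omitted) and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x_t in sequence:
        z = x_t + h  # concatenate current input with previous hidden state
        gates = {name: [sum(w * v for w, v in zip(row, z)) for row in weights[name]]
                 for name in ("i", "f", "o", "g")}
        for j in range(hidden_size):
            i_g = sigmoid(gates["i"][j])      # input gate
            f_g = sigmoid(gates["f"][j])      # forget gate
            o_g = sigmoid(gates["o"][j])      # output gate
            g_g = math.tanh(gates["g"][j])    # candidate cell update
            c[j] = f_g * c[j] + i_g * g_g
            h[j] = o_g * math.tanh(c[j])
    return h

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(signal, params):
    """CNN feature extraction, then LSTM over time, then a dense softmax classifier."""
    maps = max_pool(conv1d_relu(signal, params["kernels"]), params["pool"])
    sequence = [[fm[t] for fm in maps] for t in range(len(maps[0]))]
    h = lstm_last_hidden(sequence, params["lstm"], params["hidden"])
    logits = [sum(w * v for w, v in zip(row, h)) for row in params["dense"]]
    return softmax(logits)

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes; class k is taken to mean "k people in the ROI".
N_CHANNELS, KERNEL, POOL, HIDDEN, N_CLASSES = 4, 5, 2, 8, 6
params = {
    "kernels": rand_matrix(N_CHANNELS, KERNEL),
    "pool": POOL,
    "hidden": HIDDEN,
    "lstm": {g: rand_matrix(HIDDEN, N_CHANNELS + HIDDEN) for g in "ifog"},
    "dense": rand_matrix(N_CLASSES, HIDDEN),
}

# Synthetic damped oscillation standing in for one radar frame.
frame = [math.sin(0.3 * n) * math.exp(-0.02 * n) for n in range(64)]
probabilities = forward(frame, params)
predicted_count = max(range(N_CLASSES), key=lambda k: probabilities[k])
```

With untrained weights the predicted class is meaningless; the point is only that the convolution sees short local windows while the LSTM state carries information across the whole frame.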
The remainder of this paper is organized as follows. Section 2 models the IR-UWB radar signals. Section 3 describes the proposed people counting classification model in detail. Section 4 presents the performance assessment of the experiments and the analysis of the results. The last section summarizes the work of this study.
5. Discussion
This study aimed to design an algorithm that counts people in the ROI with high accuracy and is easy to implement. Conventional IR-UWB people counting methods rely on a single threshold, which makes the optimal threshold difficult to set, and they cannot determine the number of people at low SNR under the effects of multipath and noise. In this study, a 1DCNN-LSTM-based excess kurtosis people counting system using IR-UWB radar signals was proposed. Using actual IR-UWB radar sensors, we validated the algorithm’s performance in two open environments. We found that the greater a subject’s weight, the more reflected signals the radar picks up; this increases the number of pulse frames and may raise the counting error rate of the algorithm. To control for the variability caused by weight, we used the same volunteers (as shown in Table 3) as experimental subjects throughout. In this paper, the ROI simulates practical application scenarios such as shopping malls, halls, outdoor areas, and other open environments, and we set the experimental environment to a 6 m × 6 m open space. Even so, when we applied the proposed 1DCNN-LSTM algorithm to the complex environment of the laboratory, we found that it remained effective, because we use the running average method to eliminate static background clutter and a Butterworth filter to improve the SNR of the IR-UWB radar signals. IR-UWB radar signals in complex environments are mainly affected by non-static noise, so smoothing filters are a sensible way to suppress it.
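The running average step mentioned above can be sketched as follows. A common form of the method, assumed here, keeps an exponentially weighted background estimate c_k = αc_{k-1} + (1 − α)r_k for each range bin and subtracts it from the current frame r_k, so that static reflections cancel while moving targets remain. The weighting factor `alpha`, the synthetic background, and the moving target are illustrative assumptions and not the settings used in the experiments; the Butterworth filtering step is not shown.

```python
import math

def remove_static_clutter(frames, alpha):
    """Subtract an exponentially weighted running average from each frame.

    frames: list of frames, each a list of samples over range bins.
    alpha:  weight of the previous clutter estimate (assumed value, 0 < alpha < 1).
    """
    clutter = list(frames[0])  # initialise the background with the first frame
    cleaned = []
    for frame in frames:
        clutter = [alpha * c + (1.0 - alpha) * r for c, r in zip(clutter, frame)]
        cleaned.append([r - c for r, c in zip(frame, clutter)])
    return cleaned

N_BINS, N_FRAMES, ALPHA = 50, 40, 0.9

# Static background reflections (walls, furniture), identical in every frame.
background = [0.5 * math.cos(0.2 * i) for i in range(N_BINS)]

# Case 1: static scene only; the output should be essentially zero.
static_only = [list(background) for _ in range(N_FRAMES)]
cleaned_static = remove_static_clutter(static_only, ALPHA)

# Case 2: a hypothetical target moving one range bin per frame, starting at bin 5.
frames = []
for k in range(N_FRAMES):
    frame = list(background)
    frame[5 + k] += 1.0
    frames.append(frame)
cleaned = remove_static_clutter(frames, ALPHA)

# Strongest residual in the last frame marks the current target position.
peak_bin = max(range(N_BINS), key=lambda i: abs(cleaned[-1][i]))
print(peak_bin)
```

A larger `alpha` makes the background estimate adapt more slowly, which preserves slow-moving targets better but takes longer to absorb changes in the static scene.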
This study has the following limitations. The data were collected only in a stationary scene, the effects of scene migration were ignored, and only an empty environment was considered rather than a complex experimental scene with more clutter (e.g., computers and obstacles). Accordingly, subsequent research will address the following two points. (1) The network will be trained with data augmentation or with other datasets to improve the generalization of the proposed algorithm across scenes. (2) Experimental studies will be conducted in complex experimental scenarios to refine and optimize the algorithm.