Deep Learning-Based Indoor Two-Dimensional Localization Scheme Using a Frequency-Modulated Continuous Wave Radar

In this paper, we propose a deep learning-based indoor two-dimensional (2D) localization scheme using a 24 GHz frequency-modulated continuous wave (FMCW) radar. In the proposed scheme, deep neural network and convolutional neural network (CNN) models that use different numbers of FMCW radars were employed to overcome the limitations of the conventional 2D localization scheme that is based on multilateration methods. The performance of the proposed scheme was evaluated experimentally and compared with the conventional scheme under the same conditions. According to the results, the 2D location of the target could be estimated with the proposed single-radar scheme, whereas two FMCW radars were required by the conventional scheme. Furthermore, the proposed CNN scheme with two FMCW radars produced an average localization error of 0.23 m, while the error of the conventional scheme with two FMCW radars was 0.53 m.


Introduction
Technology for estimating the location of workers in indoor environments has been studied for accident prevention and convenience at construction and industrial sites. Furthermore, indoor localization systems are being developed for the realization of the fourth industrial revolution [1][2][3]. For example, if a worker attempts to enter a dangerous area, an indoor localization system can prevent accidents by estimating the location of the worker and warning them. Furthermore, human-robot collaboration, in which human workers and robots pool their skills for flexible manufacturing, has recently been noted as a future manufacturing trend [4]. An indoor localization system is considered necessary for the realization of safe human-robot collaboration.
Because a global navigation satellite system is not available indoors, various indoor localization systems using radio signals, such as Wi-Fi, Zigbee, RFID, Bluetooth, ultra-wide bandwidth (UWB) radar, and frequency-modulated continuous wave (FMCW) radar, have been introduced [5][6][7][8][9][10][11]. The multilateration method and the fingerprint method are well known as localization techniques that use radio signals such as Wi-Fi, Zigbee, RFID, and Bluetooth. Because the multilateration method is based on range estimation, its localization performance depends on the accuracy of the estimated distance to the target [12]. In the fingerprint method, meanwhile, the received signal strength of radio signals is collected at all points of interest, and the real-time data at a specific position are correlated with the precollected data to estimate its location [13]. However, it is known that the range and the localization accuracy of both the multilateration and the fingerprint methods with Wi-Fi, Zigbee, and Bluetooth are inferior to those of localization schemes that use radars, which provide high time resolution for estimating distances and locations [11,14,15,16]. It is also known that the UWB radar has the disadvantage of limited coverage relative to the FMCW radar, which computes the difference between the transmission and reception frequencies generated by the time delay and then calculates the distance to the target [14][15][16]. Indoor localization using FMCW radars has been introduced in various frequency bands (e.g., 24 GHz, 77 GHz, etc.), and the performance varies depending on the frequency band. In most cases, with high bandwidth, the hardware is calibrated to increase target detection accuracy.
In conventional two-dimensional (2D) localization schemes that use FMCW radar, multilateration methods that use time of flight (TOF) and a joint TOF and direction of arrival (DOA) scheme have been introduced [11,17]. State-of-the-art conventional schemes can provide relatively low errors in location estimation if the distance to the target is accurately estimated. However, the distance estimation tends to be somewhat inaccurate due to random occurrences in indoor environments [15,16]. To overcome this limitation, a distance estimation scheme that exploits the deep learning technology of artificial neural networks is introduced to improve the accuracy of distance estimation in [18]. By applying deep learning technology to the data received by FMCW radar, the data can be classified in terms of different distances to the target even with noise and clutter, and thus an accurate distance can be estimated.
We propose a deep learning-based indoor 2D localization scheme using 24 GHz FMCW radar. In the proposed scheme, the deep learning technology of artificial neural networks is employed to overcome the limitations of the conventional 2D localization scheme based on multilateration methods. We also consider two different models, which are the deep neural network (DNN) and the convolutional neural network (CNN), and two different numbers of FMCW radars to analyze the performance of the proposed scheme. The performance of the proposed scheme is evaluated experimentally and compared with the conventional 2D localization scheme that is based on multilateration under the same conditions.
The remainder of this paper is organized as follows: In Section 2, we briefly describe the 2D localization system using FMCW radar and review the conventional scheme. In Section 3, the proposed scheme is presented in detail. In Section 4, the performance of the proposed scheme is evaluated experimentally and then compared with the conventional scheme under the same conditions, and the results are discussed. Finally, the conclusions of the study are summarized in Section 5.

System Overview
The FMCW radar emits a continuous wave that changes frequency linearly over time; the transmitted signal is reflected by the target and returns to the radar. Compared with the currently generated signal, the received signal has a different frequency, which depends on the signal's travel time. The 2D localization system that uses FMCW radar is shown in Figure 1. As shown in the figure, one of the two FMCW radars is placed at each end, and the received data are collected to estimate the location of the human target, which is positioned at one of 25 different points over a monitoring area of 5 m × 6 m. The experimental data are used for both the conventional scheme in Section 2.2 and the proposed scheme in Section 3. To evaluate the 2D localization performance, the monitoring area is divided into 25 different points that are positioned 1 m apart in the form of a 2D lattice, as shown in Figure 1.
The experiments were conducted in the corridor on the sixth floor of the general office building at Kwangwoon University to collect the data and compare the performance of the proposed scheme with that of the conventional scheme. The FMCW radar used was the EVALKIT SMR-334, 24 GHz. The target was a human, and one or two FMCW radars were placed at each end of the corridor, as shown in Figures 2 and 3. Data were collected at all 25 points for two FMCW radars by changing the direction every 30 s to capture four directions (front, back, and both sides). The FFT data from a single radar were collected in an array of size 1 × 64. For each point, 200 pieces of FFT data were collected, and a total data set of 5000 pieces of FFT data was collected for all the points. Figure 3 shows a schematic of the experimental configuration using two FMCW radars. Generally, two radars are required for 2D localization, and if one of the two radars is broken, the 2D location cannot be estimated.
We also conducted an experiment using only one radar to show that 2D localization is still possible with just one radar for the proposed scheme, whereas two radars are necessary for the conventional scheme.
The experiments were conducted through serial communication between radar and laptop, and wireless communication between laptop and server. Firstly, the two radars were connected to the laptop, respectively, to collect data received by the radar via serial communication. The collected data were sent to the precreated server in real time, and then the data were sent from the server to the database. Figure 4 shows a brief representation of the experimental system configuration.
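The ranging principle described at the start of this section (a linear frequency sweep whose echo differs in frequency from the currently transmitted signal by an amount set by the travel time) can be sketched as follows. This is a minimal illustration of the standard FMCW range relation, not the processing performed by the radar kit used in the experiments; the sweep bandwidth and sweep duration below are hypothetical values chosen only for the example, since the source does not state them.

```python
C = 299_792_458.0  # speed of light in m/s


def range_to_beat_frequency(range_m: float, bandwidth_hz: float, sweep_s: float) -> float:
    """Beat frequency produced by a static target at the given range.

    The round-trip delay is tau = 2R/c, and a linear chirp with slope B/T
    shifts by (B/T) * tau during that delay.
    """
    return 2.0 * range_m * bandwidth_hz / (C * sweep_s)


def beat_frequency_to_range(beat_hz: float, bandwidth_hz: float, sweep_s: float) -> float:
    """Invert the relation above: R = c * f_b * T / (2 * B)."""
    return C * beat_hz * sweep_s / (2.0 * bandwidth_hz)


# Hypothetical sweep parameters (assumptions, not taken from the paper).
BANDWIDTH_HZ = 250e6
SWEEP_S = 1e-3

beat = range_to_beat_frequency(3.0, BANDWIDTH_HZ, SWEEP_S)
print(f"beat frequency for a 3 m target: {beat:.1f} Hz")
print(f"recovered range: {beat_frequency_to_range(beat, BANDWIDTH_HZ, SWEEP_S):.3f} m")
```

In practice the beat frequency is read from the peak of the FFT of the mixed signal, which is why each radar frame in the experiments is stored as an FFT array.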

Conventional Scheme
The conventional 2D localization scheme, based on the multilateration method, uses estimated distance information to find the location of the target. Two circles can be drawn using an estimated distance of the location from two radars, and the intersection of the two circles is determined to be the location of the target, as shown in Figure 5 [19]. To compare this performance with that of the proposed schemes, which will be presented in Section 3, we used the same data obtained in Section 2.1 to evaluate the conventional scheme.
As shown in Figure 5, the intersection of the arc passing through the coordinates where the target is located relative to radar 1 and the arc passing through the coordinates where the target is located relative to radar 2 can be obtained by solving the quadratic Equations (1) and (2) [19,20], where r_1 and r_2 are the estimated distances from each radar, as shown in Figure 4. The coordinates of the estimated location of the target are given by Equations (3) and (4). The estimated distances between the target and the radars can be obtained using the conventional fast Fourier transform max value index-based distance estimation scheme or a deep learning model [18]. In our study, the deep learning model is used to estimate the distances, r_1 and r_2, to the target. The location of the target is then estimated by substituting the estimated distances into Equations (3) and (4). Note that we use the Pythagorean theorem to calculate the distance between the coordinates of each point and each radar for the ground truth of the distance. For example, the r_1 of the (1,1) point is √2 and the r_2 is √26. The r_1 and r_2 for other points can be calculated the same way.
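The two-circle intersection described above can be sketched as follows. The radar coordinates are an assumption: the text does not give them, and the sketch places radar 1 at (0, 0) and radar 2 at (0, 6) because that layout reproduces the stated ground-truth distances of √2 and √26 for the point (1,1). The function names are hypothetical, and the closed form below is the generic solution for two circles centred on the same axis, which may differ in form from the authors' Equations (1) to (4).

```python
import math

# Assumed layout (not stated in the text): radar 1 at (0, 0), radar 2 at (0, 6).
RADAR_SPACING_M = 6.0


def ground_truth_distances(x: float, y: float, spacing: float = RADAR_SPACING_M):
    """Pythagorean distances from a grid point to radar 1 and radar 2."""
    r1 = math.hypot(x, y)
    r2 = math.hypot(x, y - spacing)
    return r1, r2


def locate(r1: float, r2: float, spacing: float = RADAR_SPACING_M):
    """Intersect the circles x^2 + y^2 = r1^2 and x^2 + (y - d)^2 = r2^2.

    Subtracting the two equations removes x and gives y directly; x then
    follows from the first circle. The positive root is taken because the
    monitoring area is assumed to lie on the x > 0 side of the radars.
    """
    y = (r1 ** 2 - r2 ** 2 + spacing ** 2) / (2.0 * spacing)
    # Noisy distance estimates can make the circles miss each other;
    # clamping keeps the square root defined in that case.
    x = math.sqrt(max(r1 ** 2 - y ** 2, 0.0))
    return x, y


r1, r2 = ground_truth_distances(1, 1)
print(r1, r2)          # sqrt(2) and sqrt(26) for the point (1,1)
print(locate(r1, r2))  # recovers approximately (1.0, 1.0)
```

With exact distances the intersection returns the true grid point; any error in the estimated r_1 or r_2 shifts the intersection, which is why the accuracy of the conventional scheme depends directly on the accuracy of the distance estimates.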
The localization performance of the conventional scheme is shown in Figure 6. In this figure, class 1 indicates the point (1,1), class 2 indicates the point (1,2), while class 25 represents the point (5,5). As shown in the figure, the average localization error increases as the distance between the target and the radars increases, and the total average estimation error is approximately 0.53 m.
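The class numbering used in the figures can be written as a small mapping. The text fixes only three anchors (class 1 is (1,1), class 2 is (1,2), class 25 is (5,5)); the sketch below assumes that the classes in between follow the same pattern, with the second coordinate varying fastest. The function names are hypothetical.

```python
GRID_SIZE = 5  # 5 x 5 lattice of measurement points, 1 m apart


def class_to_point(cls: int) -> tuple[int, int]:
    """Map a class index in 1..25 to its (first, second) grid coordinates."""
    if not 1 <= cls <= GRID_SIZE * GRID_SIZE:
        raise ValueError(f"class index out of range: {cls}")
    first, second = divmod(cls - 1, GRID_SIZE)
    return first + 1, second + 1


def point_to_class(first: int, second: int) -> int:
    """Inverse mapping from grid coordinates back to the class index."""
    if not (1 <= first <= GRID_SIZE and 1 <= second <= GRID_SIZE):
        raise ValueError(f"point outside the grid: ({first}, {second})")
    return (first - 1) * GRID_SIZE + second


for cls in (1, 2, 25):
    print(cls, class_to_point(cls))
```

Under this assumed ordering, higher class indices correspond to larger first coordinates.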

Proposed Scheme
In this section, we propose a deep learning-based indoor 2D localization scheme using a 24 GHz FMCW radar. In the proposed scheme, the deep learning technology of artificial neural networks is employed to overcome the limitations of the conventional 2D localization scheme based on multilateration methods. To achieve enhanced localization performance, two different models, the DNN and the CNN model, are proposed using different numbers of FMCW radars.
Firstly, we propose the DNN model, which includes two cases. In the first DNN model case, the collected data are used with two FMCW radars, and the input layer is set to 128 units, because it combines two pieces of collected data for each point. Note that the FFT data from a single radar are collected in an array of size 1 × 64. For the second DNN model case, the collected data are used with only one radar, and the input layer is set to 64 units. As for the second case with a single radar, we tried to estimate the 2D location of the target using only one radar pattern. These two DNN model cases consist of the same layers, except for the number of units in the input layer. Figure 7 shows a network configuration diagram of the proposed DNN model where (1) represents the case with only one piece of radar data in the input layer, and (2) represents the case with two pieces of radar data. Note that the hyperparameter configurations of both cases are the same, as summarized in Table 1. Based on the experience accumulated from previous studies, we selected the best performing hyperparameters [18].
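The two DNN input layouts described above can be sketched as follows. The helper name is hypothetical, and the concatenation order (radar 1 followed by radar 2) is an assumption, since the text states only that the two 1 × 64 arrays are combined into 128 input units.

```python
FFT_BINS = 64  # each radar frame is an FFT array of size 1 x 64


def dnn_input(radar1_fft, radar2_fft=None):
    """Build the flat DNN input vector: 64 values (one radar) or 128 (two radars)."""
    for frame in (radar1_fft, radar2_fft):
        if frame is not None and len(frame) != FFT_BINS:
            raise ValueError(f"expected {FFT_BINS} FFT bins, got {len(frame)}")
    if radar2_fft is None:
        return list(radar1_fft)
    # Assumed order: radar 1 bins first, then radar 2 bins.
    return list(radar1_fft) + list(radar2_fft)


# Placeholder frames standing in for measured FFT magnitudes.
frame_a = [0.0] * FFT_BINS
frame_b = [1.0] * FFT_BINS
print(len(dnn_input(frame_a)))           # 64
print(len(dnn_input(frame_a, frame_b)))  # 128
```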

Secondly, we propose a 1D CNN model, which consists of two cases. The CNN model is known to be trainable while retaining the spatial information of the image [21,22]. Because CNNs exhibit excellent performance by extracting features from raw data during image classification, a 1D CNN was recently developed to reduce the computational complexity of 1D signals [23]. Similar to the proposed DNN model with two cases, we propose a 1D CNN model with two cases, and set the number of CNN channels to one and two, respectively, and classify the data as 1D or 2D input for each case. The input shape for the CNN model using one channel is set to (64,1) in 1D form, while the input shape for the CNN model using two channels is set to the 2D form of (64,2). These two cases for the CNN model consist of the same layers except for the input shape, as shown in Figure 8.
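The difference between the two CNN input shapes can be illustrated with nested lists. In this sketch, which uses a hypothetical helper name, each of the 64 FFT bins becomes one position along the sequence axis, and the radars supply the channel values at that position. Pairing bin i of radar 1 with bin i of radar 2 is an assumption about how the (64,2) input is arranged.

```python
FFT_BINS = 64


def cnn_input(radar1_fft, radar2_fft=None):
    """Arrange FFT frames as (64, channels) for a 1D CNN.

    One radar gives shape (64, 1); two radars give shape (64, 2), where
    position i holds the i-th bin of each radar side by side.
    """
    frames = [radar1_fft] if radar2_fft is None else [radar1_fft, radar2_fft]
    for frame in frames:
        if len(frame) != FFT_BINS:
            raise ValueError(f"expected {FFT_BINS} FFT bins, got {len(frame)}")
    return [list(values) for values in zip(*frames)]


# Placeholder frames standing in for measured FFT magnitudes.
one = cnn_input(list(range(FFT_BINS)))
two = cnn_input(list(range(FFT_BINS)), list(range(100, 100 + FFT_BINS)))
print(len(one), len(one[0]))  # 64 1
print(len(two), len(two[0]))  # 64 2
print(two[3])                 # [3, 103]
```

Keeping the two radars as channels, rather than concatenating them end to end as in the DNN case, lets a convolution kernel see the same range bin from both radars at once.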
The hyperparameters of the CNN model are the same as those of the proposed DNN model, as summarized in Table 1.
The same dataset is used for both the DNN and the CNN models. As mentioned in Section 2.1, we collected a dataset of 5000 pieces of FFT data for the 25 points, and split them into training data, validation data, and test data. The training data and the validation data were used for algorithmic learning, while the test data were used to evaluate the performance of the proposed model and were not involved in learning. The 5000 pieces of data collected through experiments were first divided into learning and test data at a ratio of 8:2. Subsequently, the divided learning data were again divided into training and validation data at a ratio of 8:2. In other words, the model used 3200 pieces of training data and 800 pieces of validation data, while 1000 pieces of test data were used for the performance evaluation of the proposed model. Figure 9 shows a diagram of the dataset split.
Furthermore, the 5000 pieces of data were divided on a per-class basis across the 25 classes. This is because, if the data are not split by class, one of the 25 classes can end up empty in a subset. The data were split into the 25 classes and then randomized using a random function in TensorFlow.
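The two-stage 8:2 split and the per-class division can be sketched as follows. This is an illustration only: the source randomizes the data with a TensorFlow random function, whereas the sketch swaps in Python's standard `random` module so that it stays self-contained, and the function name and seed are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_ratio=0.2, val_ratio=0.2, seed=0):
    """Split sample indices class by class into train, validation, and test sets.

    First 20% of each class goes to test, then 20% of the remainder goes to
    validation, so no class can end up empty in any subset.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_ratio)
        n_val = round((len(indices) - n_test) * val_ratio)
        test += indices[:n_test]
        val += indices[n_test:n_test + n_val]
        train += indices[n_test + n_val:]

    # Shuffle each subset so samples are no longer grouped by class.
    for subset in (train, val, test):
        rng.shuffle(subset)
    return train, val, test


# 25 classes with 200 FFT frames each, as in the collected dataset.
labels = [cls for cls in range(1, 26) for _ in range(200)]
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 3200 800 1000
```

Each class contributes 128 training, 32 validation, and 40 test frames, which reproduces the 3200, 800, and 1000 totals stated above.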

Performance Evaluation
After training the DNN and the CNN models using the experimental data, validation and testing were carried out. For the DNN model, there were two cases that used data from only one radar (DNN_radar_1) or used data from both radars (DNN_radar_2). In the CNN models, the two cases had either one channel (CNN_channel_1) or two channels (CNN_channel_2). A graphical representation of the validation accuracy of each model is shown in Figure 10. As shown in the figure, the DNN_radar_1 model achieved a validation accuracy of approximately 56%, whereas the DNN_radar_2 model achieved a validation accuracy of approximately 80%. Similarly, the validation accuracy of the CNN_channel_1 model was approximately 65%, whereas the validation accuracy of CNN_channel_2 was approximately 90%.
Table 2 shows a comparison of the validation accuracy and the average localization error using test data for both the conventional scheme and the proposed schemes. As shown in the table, the average localization error of the DNN_radar_1 model is approximately 1.30 m, while that of the DNN_radar_2 model is approximately 0.89 m. It is evident that the performance of the proposed DNN scheme was not improved compared with that of the conventional scheme. However, it is worth noting that the 2D location of the target can be estimated using a single radar in the proposed scheme, while two radars are required in the conventional scheme. As for the proposed CNN schemes, the average localization error of the CNN_channel_1 model was approximately 0.77 m, while that of the CNN_channel_2 model was approximately 0.23 m. According to the results, the proposed CNN scheme with two FMCW radars can provide enhanced localization performance compared with the conventional scheme as well as the other proposed schemes.
Therefore, even when using the same data set, it is shown that the average localization error can be reduced from 0.53 m to 0.23 m by using the proposed CNN scheme with two FMCW radars. Although the proposed CNN scheme also improves 2D localization performance with a single radar compared with the DNN scheme, it is worth noting that the proposed schemes with two radars enhance the localization performance more than the schemes with one radar.
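An average localization error of the kind reported above can be computed as sketched below. The text does not define the metric explicitly, so this sketch assumes it is the mean Euclidean distance between each estimated location and the corresponding ground-truth point. The function name is hypothetical and the sample coordinates are toy values, not experimental results.

```python
import math


def mean_localization_error(true_points, estimated_points):
    """Mean Euclidean distance (in metres) between paired true and estimated points."""
    if not true_points or len(true_points) != len(estimated_points):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(math.dist(t, e) for t, e in zip(true_points, estimated_points))
    return total / len(true_points)


# Toy example: two exact estimates and one that lands on the neighbouring grid point.
truth = [(1, 1), (1, 2), (5, 5)]
estimates = [(1, 1), (1, 3), (5, 5)]
print(mean_localization_error(truth, estimates))
```

For a classification model, a misclassified sample contributes the distance between the predicted and true grid points, so a model that confuses only neighbouring points can have a low average error even when its accuracy is below 100%.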
Meanwhile, the validation accuracy graphs in Figure 10 show that the CNN model has less variance as the epochs increase, while the DNN model has greater variance. This is because the CNN model maintains the geometry of the input/output data in each layer, unlike the DNN model, and thus it can effectively recognize features with neighboring values while retaining spatial information in the data [24]. Therefore, we conclude that the proposed CNN models provide more efficient learning than the DNN models, and this results in higher accuracy due to effective feature extraction. By comparing the performance of the four models, we also conclude that the CNN model with two channels was the most accurate and had the lowest average error. Figure 11 shows the estimated location of the target by the CNN model with two channels for the test data. As shown in the figure, the estimated location was very close to the ground truth point for all 25 points.
Figure 11. Results of location estimation with the proposed CNN model with two channels.
Figure 12 shows the average localization error for the proposed CNN model with two channels. In the figure, class 1 indicates the point (1,1), and class 2 indicates the point (1,2), while class 25 represents the point (5,5). Contrary to the results of the conventional scheme in Figure 6, the average localization error does not increase as the distance between the target and the radars increases, and the total average estimation error was approximately 0.23 m.
Figure 13 shows a comparison of the average localization error for the conventional scheme and the proposed CNN scheme using two channels. In the figure, the localization error is compared point by point. As shown in the figure, the average localization error of the proposed scheme is generally less than that of the conventional scheme, and the difference is remarkable in higher classes. Therefore, we can expect reliable localization performance from the proposed scheme, regardless of the real location of the target. Moreover, enhanced localization performance for remote points can be achieved using the proposed scheme compared with the conventional scheme.
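The class numbering and the per-point error comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the row-by-row numbering over a 5 × 5 grid is inferred from the three examples given (class 1 is (1,1), class 2 is (1,2), class 25 is (5,5)), the error is taken as the Euclidean distance between the true and estimated grid points, and `spacing_m` is a hypothetical grid spacing because the text does not state the distance between adjacent points.

```python
import math
from collections import defaultdict

GRID_COLS = 5       # assumption: 5 x 5 grid, classes numbered row by row
NUM_CLASSES = 25


def class_to_point(k):
    """Map a class index in 1..25 to a grid point (i, j).

    Consistent with the examples in the text:
    class 1 -> (1, 1), class 2 -> (1, 2), class 25 -> (5, 5).
    """
    if not 1 <= k <= NUM_CLASSES:
        raise ValueError(f"class index out of range: {k}")
    return ((k - 1) // GRID_COLS + 1, (k - 1) % GRID_COLS + 1)


def per_class_mean_error(true_classes, pred_classes, spacing_m=1.0):
    """Average localization error per true class.

    spacing_m is a hypothetical distance between adjacent grid points;
    the error is the Euclidean distance between true and estimated points.
    """
    errors = defaultdict(list)
    for t, p in zip(true_classes, pred_classes):
        ti, tj = class_to_point(t)
        pi, pj = class_to_point(p)
        errors[t].append(spacing_m * math.hypot(ti - pi, tj - pj))
    return {k: sum(v) / len(v) for k, v in sorted(errors.items())}


# Toy labels only (not measured data): one of two class-1 samples
# is misclassified as the neighbouring class 2.
toy_result = per_class_mean_error([1, 1, 25], [1, 2, 25])
print(toy_result)
```

Computing the same dictionary for two schemes and comparing the values key by key gives a point-by-point comparison of the kind shown in Figure 13.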

Conclusions
In this paper, we proposed a deep learning-based indoor 2D localization scheme using a 24 GHz FMCW radar to achieve better localization accuracy than the conventional 2D localization scheme based on multilateration. In the proposed scheme, DNN and CNN models with either one or two FMCW radars were employed to overcome the limitations of the conventional 2D localization scheme.
Experiments were conducted in the corridor of a general office building at Kwangwoon University, and the received data were collected to estimate the location of a human target, which was positioned at one of 25 different points within a monitoring area of 5 m × 6 m. According to the results, the 2D location of the target could be estimated with a single radar using the proposed scheme, while two FMCW radars were required for the conventional scheme. Furthermore, the proposed CNN scheme with two FMCW radars produced an average localization error of 0.23 m, whereas the conventional scheme with two FMCW radars produced an average localization error of 0.53 m. Thus, even for the same data set, the average localization error was reduced from 0.53 m to 0.23 m by applying the proposed CNN scheme with two FMCW radars. The localization error was also compared point by point, which showed that the average localization error of the proposed scheme was generally lower than that of the conventional scheme and that the difference was more remarkable in higher classes. Therefore, we can expect reliable localization performance from the proposed scheme, regardless of the real location of the target. Moreover, enhanced localization performance was achieved for remote points using the proposed scheme relative to the conventional scheme.
In future research, we will develop a regression model with substantial training data for more accurate localization performance, and we will also conduct research for estimating the location of any targets not included in the training data.
Author Contributions: Conceptualization, software, K.P. and J.L.; formal analysis and writing-original draft preparation, K.P.; writing-review and editing and supervision, Y.K. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.