Recognition of Blocking Categories for UWB Positioning in Complex Indoor Environment

The recognition of the non-line-of-sight (NLOS) state is a prerequisite for alleviating NLOS errors and is crucial to ensure positioning accuracy. Recent studies only identify the line-of-sight (LOS) state and the NLOS state, but ignore the contribution of occlusion categories to spatial information perception. This paper proposes a bidirectional search algorithm based on maximum correlation, minimum redundancy, and minimum computational cost (BS-mRMRMC). The optimal channel impulse response (CIR) feature set, which can identify the NLOS and LOS states as well as the blocking categories, is determined by setting constraint thresholds on both the maximum evaluation index and the computational cost. The identification of blocking categories provides more effective information for the indoor space perception of ultra-wide band (UWB). Based on the vector projection method, the hierarchical structure of a decision tree support vector machine (DT-SVM) is designed to verify the recognition accuracy of each category. Experiments show that the proposed algorithm has an average recognition accuracy of 96.7% over the occlusion categories, which is better than those of three other algorithms based on the same number of UWB CIR signal characteristics.


Research Status
Outdoor positioning technologies such as the Global Positioning System (GPS) achieve good results outdoors, but they cannot achieve good positioning results indoors. Indoor position perception plays an important role in many applications such as tracking, so it is important and necessary to investigate indoor positioning technologies further. Among the existing indoor-positioning systems, ultra-wide band (UWB) technology has become one of the most promising methods due to its high precision, good time-delay resolution, low power consumption, and high robustness in complex indoor environments. In particular, it has significant advantages over other indoor positioning technologies in terms of accuracy [1]. The UWB positioning system shows the best performance under line-of-sight (LOS) propagation conditions, where the positioning accuracy can reach centimeter level. However, due to the complexity of the indoor environment, there are often barriers between the fixed base station (FS) and mobile base station (MS), namely non-line-of-sight (NLOS) propagation conditions, which block the signal propagation path between FS and MS and introduce a positive deviation [2]. UWB propagation under NLOS causes the measured distance between FS and MS to be greater than the actual distance, resulting in a sharp decline in the final positioning accuracy. Therefore, the recognition of the NLOS status is essential, and much of the related work has focused on the selection of high-dimensional features. In addition, some scholars also studied other methods for feature selection [26]. For example, a genetic algorithm was used for the selection of UWB CIR features [27]. ReliefF, a classical sorting algorithm based on feature distance, was introduced [28]. A backward recursive feature selection algorithm, SVM recursive feature elimination (SVM-RFE), was also proposed [29].
However, the computational time complexity of these algorithms is higher than that of feature selection methods based on mutual information. Therefore, for higher feature dimensions, feature selection methods based on mutual information perform better.
All the above methods only study the recognition of LOS and NLOS propagation in the indoor environment and the influence of the NLOS state on the positioning accuracy of UWB. However, the contribution of blocking category information to the perception of indoor spatial information is ignored, and the recognition of different blocking categories has not been studied in depth. In addition, the aforementioned CIR signal feature selection methods applied to UWB often do not comprehensively consider the calculation cost of different features together with the number of CIR signal features. Some algorithms cannot achieve good feature selection results when the feature dimension is low. Meanwhile, the first feature selected by these algorithms is not necessarily the feature that contributes the most to the whole, and the computational cost of some features is considerably higher than that of others.

Contributions
In order to address these problems, this paper not only studies the recognition of LOS and NLOS states, but also considers the contribution of the effective recognition of each blocking category to indoor spatial information perception, and realizes the recognition of different blocking categories, which provides more useful information for indoor positioning and indoor information perception. In the experiment, the blocking categories of the NLOS state are expanded to include water, human body, metal, wall and wooden board, to reflect the complex indoor environment. In addition, 30 effective signal points are extracted from the CIR to improve the calculation efficiency of the eight features, and the influence of different features on the various blocking categories is analyzed.
In addition, considering the detection speed in practical applications, the calculation cost of different features is introduced into the feature evaluation index. A bidirectional search algorithm based on the feature evaluation index of maximum relevance, minimum redundancy and minimum calculation cost is proposed to determine the optimal feature subset. Furthermore, based on the vector projection method, the measurement of separability between classes is realized, and the decision tree support vector machine (DT-SVM) hierarchy is designed to verify the recognition accuracy of each blocking category. More importantly, compared with the research of other scholars, this paper studies not only LOS and NLOS state recognition but also the recognition of different blocking categories in the NLOS state. Experiments have verified that the method proposed in this paper is superior to other algorithms in the recognition of LOS and NLOS states, and has achieved good recognition results for all blocking categories, with an average accuracy of 96.7%. This paper provides a new research method for subsequent UWB indoor positioning, indoor information perception construction and target tracking.

Hardware Equipment and Experimental Environment
This test uses two UWB devices: a fixed base station (BS) as the receiving device and a mobile base station (MS) as the transmitting device. A DW1000 chip is used in each UWB base station, and the time difference of arrival (TDOA) method is used to calculate the distance. The operating frequency range of the DW1000 used by the BS and MS is 3244 MHz to 6999 MHz, the maximum transmission power density is −35 dBm/MHz, and the communication rate is 6.8 Mbps. The experiment was conducted in a large experimental center with a relatively complicated indoor environment (shown in Figure 1). The whole room is divided into data acquisition areas for the training set and data acquisition areas for testing. In a complex indoor environment, the blocking categories usually include metal blocking (such as iron cabinets), wooden blocking (such as tables and chairs), human blocking, and water blocking (such as fish tanks and buckets of water). Therefore, these blocking materials were set up in the experimental environment to verify the blocking effects of different materials on UWB. The collected data fall into six categories: the LOS condition (no blocking), water blocking, people blocking, metal blocking, wall blocking and wood board blocking. The physical conditions for each blocking category are as follows. A metal iron cabinet measuring 70 cm × 50 cm × 200 cm is placed between the MS and the BS, and the blocking angle of the iron cabinet is adjusted to produce frontal and side blocking, to achieve metal blocking; 1 to 2 people stand between the MS and BS at different angles to achieve human blocking. A transparent plastic water tank (80 cm × 40 cm × 80 cm) is filled with water and placed between the MS and BS to achieve water blocking. The BS is placed in the laboratory room and the MS in the corridor, separated by a concrete wall, to achieve wall blocking.
A wooden board (150 cm × 15 cm × 150 cm) is placed between the MS and BS to achieve wood board blocking.
With the same distance between MS and BS, in the data collection area of the training set, each blocking category was repeatedly tested 10,000 times, and a total of 60,000 CIR samples were collected as the training set. In the data collection area of the test set, each blocking category was repeatedly tested 1000 times and a total of 6000 CIR samples were collected as the test set. The true distance between BS and MS is measured by Leica DISTO D5 laser rangefinder.

CIR Description
Channel state information (CSI) captures the whole process of the signal from the transmitter to the receiver, including signal fading, multipath channels and the transmission path, and is usually used to describe the transmission process of a wireless channel. CSI can be measured by the channel impulse response (CIR), which is expressed as:

h(t) = Σ_{i=1}^{N} a_i δ(t − τ_i)

where N is the total number of propagation paths, a_i and τ_i are the amplitude and time delay of path i, and δ is the impulse function. The CIR signals received by the BS are shown in Figure 2. It can be seen that most of the 1015 signal points are noise (blue point area), while only a few carry useful information (red point area). To improve system efficiency, only 100 points of CIR data in the red area are collected.
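As an illustration, the impulse-sum model above can be sampled on a uniform time grid; the following is a minimal sketch with hypothetical path amplitudes and delays.

```python
import numpy as np

def synthetic_cir(amplitudes, delays, n_points=1015, dt=1.0):
    """Sample h(t) = sum_i a_i * delta(t - tau_i) on a uniform grid.

    Each impulse is placed in its nearest sample bin -- a common
    discrete-time approximation of the ideal impulse model.
    """
    h = np.zeros(n_points)
    for a, tau in zip(amplitudes, delays):
        idx = int(round(tau / dt))
        if 0 <= idx < n_points:
            h[idx] += a
    return h

# Hypothetical two-path channel: a strong first path and a weaker echo.
h = synthetic_cir(amplitudes=[1.0, 0.4], delays=[10.0, 25.0])
```

In a real measurement the 1015 accumulator points would come from the receiver rather than from this generator.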
The 100 CIR signal points of the six data classes (no blocking, water blocking, people blocking, metal blocking, concrete wall blocking and wood board blocking) are shown in Figure 3. It can be clearly seen from Figure 3a that the CIR signal differences between the classes are concentrated on 30 points, which is confirmed in [15], while the remaining points do not contribute much discriminative information. Therefore, this paper extracts these 30 useful signal points from the 100 points, as shown in Figure 3b.
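One way to isolate such an informative segment is to keep the fixed-length window with the highest energy; this is a sketch, and the paper's exact choice of the 30 points may differ.

```python
import numpy as np

def extract_useful_points(cir, n_keep=30):
    """Return the n_keep-sample window of `cir` with the highest energy.

    Sliding-window energy is computed with a convolution; keeping the
    window that maximises sum(|cir|^2) mimics discarding the noise-only
    part of the CIR.
    """
    power = np.abs(np.asarray(cir, dtype=float)) ** 2
    window_energy = np.convolve(power, np.ones(n_keep), mode="valid")
    start = int(np.argmax(window_energy))
    return cir[start:start + n_keep], start

# Toy CIR: 100 points, with the "useful" region at indices 40-54.
cir = np.zeros(100)
cir[40:55] = np.linspace(1.0, 0.1, 15)
segment, start = extract_useful_points(cir)
```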

Construct CIR Signal Features
By analyzing the CIR signals in the LOS and NLOS states under different blocking materials, it can be seen that the maximum amplitude of the signal in the NLOS state is usually smaller than that in the LOS state due to the blocking and attenuation of the signal during propagation, and the energy of the first path is much less than in the LOS state because of signal attenuation. In addition, the reflected, diffracted or dispersed pulses caused by blocking merge after the first path, so the path amplitudes after the first path are usually larger under NLOS than under LOS, making the NLOS data closer to the mean, with a smaller standard deviation. At the same time, because blocking causes delayed propagation in the NLOS state, the rise time, mean excess delay and RMS delay spread are selected as features. Kurtosis and skewness are selected as features due to the energy dispersion in the LOS and NLOS states. The discrete mathematical models of the above eight features are as follows:

(1) The energy of the CIR signal:

E = Σ_{i=1}^{N} |h(t_i)|², E_f = Σ_{i=1}^{N_f} |h(t_i)|²

where E and E_f represent the energy of the whole CIR signal and the energy of the CIR signal up to the first peak, respectively, and N and N_f represent the total number of points and the number of points up to the first peak, respectively.
(2) Maximum amplitude:

r_max = max_{1≤i≤N} |h(t_i)|

(3) The rise time to the maximum:

t_rise = t_{r_max} − t_{first}

where t_{first} is the arrival time of the first path and t_{r_max} is the time at which the maximum amplitude occurs.
(4) The standard deviation:

σ = ( (1/N) Σ_{i=1}^{N} (|h(t_i)| − u)² )^{1/2}

where u is the mean of |h(t)|.
(5) Mean excess delay:

τ_m = Σ_{i=1}^{N} t_i |h(t_i)|² / Σ_{i=1}^{N} |h(t_i)|²
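A minimal sketch of computing these features from a magnitude-only CIR segment, using standard definitions; the paper's exact discretization of each feature may differ.

```python
import numpy as np

def cir_features(h, dt=1.0):
    """Eight CIR features from a magnitude CIR segment (standard
    definitions; a hedged reconstruction of the paper's formulas)."""
    a = np.abs(np.asarray(h, dtype=float))
    t = np.arange(len(a)) * dt
    p = a ** 2
    energy = p.sum()
    peak = int(np.argmax(a))                      # index of maximum amplitude
    first_path_energy = p[:peak + 1].sum()        # energy up to the first peak
    r_max = a[peak]
    t_low = t[int(np.argmax(a >= 0.1 * r_max))]   # first sample above 10% of peak
    t_rise = t[peak] - t_low                      # rise time to the maximum
    u = a.mean()
    std = a.std()
    tau_m = (t * p).sum() / energy                # mean excess delay
    tau_rms = np.sqrt((((t - tau_m) ** 2) * p).sum() / energy)  # RMS delay spread
    kurtosis = ((a - u) ** 4).mean() / a.var() ** 2
    skewness = ((a - u) ** 3).mean() / std ** 3
    return {"energy": energy, "first_path_energy": first_path_energy,
            "r_max": r_max, "t_rise": t_rise, "std": std,
            "tau_m": tau_m, "tau_rms": tau_rms,
            "kurtosis": kurtosis, "skewness": skewness}

feats = cir_features([0.0, 0.1, 0.5, 1.0, 0.6, 0.3, 0.1])
```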
This paper calculates the cumulative distribution of the different occlusion categories under each feature. Figure 4 shows the cumulative distribution of each feature for the different occlusion categories over the 30-point CIR signal, indicating that the first path energy and the maximum amplitude under the LOS condition are larger than under the NLOS condition, while the rise time, standard deviation, mean excess delay and RMS delay spread under the LOS condition are significantly smaller than under the NLOS condition. However, it can also be seen from Figure 4 that there is no significant difference between the features of LOS and wood blocking. Meanwhile, the difference between the measured distance and the actual distance when blocked by the 15 cm thick wood is similar to the difference under LOS, which verifies that wood has little effect on the attenuation of the incident electromagnetic field when the indoor wood moisture content is low. Therefore, the following work mainly focuses on feature selection and the best feature set for no blocking, water blocking, people blocking, metal blocking and wall blocking.
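Empirical cumulative distributions like those compared in Figure 4 can be computed from feature samples as follows (a minimal sketch):

```python
import numpy as np

def empirical_cdf(samples):
    """Return sorted feature values and their empirical CDF F(x_(i)) = i/n,
    as used to compare a feature's distribution across blocking categories."""
    x = np.sort(np.asarray(samples, dtype=float))
    f = np.arange(1, len(x) + 1) / len(x)
    return x, f

x, f = empirical_cdf([3.0, 1.0, 2.0, 4.0])
```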

System Model Description
Given a dataset D containing k samples, each sample is described by n = 8 CIR signal features: the first path energy, maximum amplitude, rise time, standard deviation, mean excess delay, RMS delay spread, kurtosis and skewness. The notation used in this paper is summarized in Table A1. The selected feature subset collects the indices s ∈ {1, 2, . . . , n} of the features that meet the set criteria for evaluating features.

Mutual Information and Relevance Definitions
The relationships between features can be divided into relevance, redundancy, interaction and independence according to mutual information. Definitions of feature relevance, redundancy, independence, irrelevance and interactivity are given in [25].
Mutual information (MI) I(X; Y) is used to describe the degree of relevance between two random variables X = {x_1, x_2, . . . , x_n} and Y = {y_1, y_2, . . . , y_m} [19]:

I(X; Y) = Σ_{x∈X} Σ_{y∈Y} p(x, y) log [ p(x, y) / (p(x) p(y)) ]

where p(x, y) is the joint probability density of x and y. When I(X; Y) is large, the relevance between X and Y is strong; otherwise, the relevance is weak. In particular, I(X; Y) = 0 means that X and Y are independent of each other.
Conditional mutual information (CMI) reflects the relevance between different features under the same label. Suppose there are three random variables X, Y and L = {l_1, l_2, . . . , l_d}, with conditional probability densities p(x|l), p(y|l) and p(x, y|l). Given L, the conditional mutual information of X and Y about L is:

I(X; Y | L) = Σ_{l∈L} p(l) Σ_{x∈X} Σ_{y∈Y} p(x, y | l) log [ p(x, y | l) / (p(x | l) p(y | l)) ]

Based on the relevance between features and categories, a measurable multivariate model F can be obtained, whose entries are the mutual information between each feature and the corresponding category; the relevance between the 8 features and the label is shown in Figure 5.
In general, the matrix F = [I_ij] can be obtained by standardizing and centralizing the data.
Calculating the feature mutual information matrix Q = F · F^T yields an n × n symmetric matrix, in which each element Q_ij represents the mutual information between features i and j under the same category, as shown in Figure 6.
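The mutual information entries used to build F and Q can be estimated from feature samples with a histogram; this is a minimal illustration, since the paper does not specify its estimator.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in histogram estimate of I(X; Y) in nats for two 1-D samples."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)           # marginal of X
    py = pxy.sum(axis=0, keepdims=True)           # marginal of Y
    nz = pxy > 0                                  # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
mi_self = mutual_information(x, x)                     # strongly relevant pair
mi_ind = mutual_information(x, rng.normal(size=2000))  # near-independent pair
```

Evaluating this estimator for every feature-label pair fills F, and Q follows as F · F^T.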

Construction of Bidirectional Search Algorithm Based on Maximum Correlation, Minimum Redundancy, and Minimum Computational Cost (mRMRMC) Feature Evaluation Criteria
The mRMR (max-relevance and min-redundancy) algorithm [23] was proposed on the basis of the mutual information-based feature selection algorithm (MIFS). The mRMR algorithm not only considers the relevance D(x_i, L) between features and labels, but also considers the redundancy R(x_i, x_j) between features under the same label.
D(X, L) = (1 / (|X||L|)) Σ_{x_i∈X} Σ_{l∈L} I(x_i; l), R(X) = (1 / |X|²) Σ_{x_i, x_j∈X} I(x_i; x_j)

where |L| represents the number of labels in tag set L, and |X| represents the number of features in feature set X.
The mRMR evaluation index is:

J_mRMR(x) = I(l; x) − (1 / |B|) Σ_{x_i∈B} I(x_i; x)

where I(l; x) is the mutual information between feature x and category label l, I(x_i; x) is the mutual information between feature x and the selected feature x_i, and B represents the selected feature subset.
As the number of features in the selected subset grows, the computation of the system increases accordingly, so the time cost of feature computation is introduced. Big O notation is usually used to express time complexity, but since only the highest-order term is retained and coefficients are ignored, it can only roughly describe the cost. Therefore, in this paper the computational time complexity of each feature is expressed by counting the number of statement executions for that feature. From the calculation formula of each feature, the number of executed statements is obtained, and the calculation cost of each feature is obtained after ignoring the constant term.
C(B) = Σ_{x_i∈B} β_i c(x_i)

where C(B) represents the time cost of the selected feature set B, c(x_i) represents the calculated cost of feature x_i, and β_i is the relative weight of the calculation cost. Finally, the comprehensive evaluation criterion based on maximum relevance, minimum redundancy and minimum computational cost (mRMRMC) is determined as:

J_mRMRMC(x) = I(l; x) − (1 / |B|) Σ_{x_i∈B} I(x_i; x) − ω c(x)

where x ∈ B_s represents a feature in the selection area and ω is an adjustable constant indicating the weight of the calculation cost in the evaluation index; under this evaluation index, the feature maximizing J_mRMRMC is added to the selected set B.
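A minimal sketch of the mRMRMC evaluation index; the function names are hypothetical and the exact placement of the cost weight is a hedged reading of the text.

```python
def j_mrmrmc(x, label, selected, cost, mi, omega=0.1):
    """Evaluation index J(x) = relevance - mean redundancy - omega * cost.

    `mi(a, b)` is any mutual-information estimate between two samples;
    `selected` holds the already-chosen features, `cost` is the
    computation cost of candidate x.
    """
    relevance = mi(label, x)
    redundancy = (sum(mi(s, x) for s in selected) / len(selected)
                  if selected else 0.0)
    return relevance - redundancy - omega * cost
```

With ω = 0 this reduces to the ordinary mRMR index.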

The Optimal Feature Set Is Determined Based on Multiple Threshold Constraints
In traditional feature selection algorithms based on mutual information, such as mRMR, MIFS and IG, the feature with the greatest relevance to the category is put into the selected feature subset as the first feature. However, these methods often ignore the combination effect between features, and do not consider whether the selected first feature is the feature that contributes the most to the whole. In this paper, a bidirectional search strategy based on mRMRMC, namely the BS-mRMRMC algorithm, is proposed. First, the relevance D_all(X, L) of all features with the labels is calculated; then each feature x_i is deleted in turn to obtain the relevance D_miss(x_i)(x_i, L) of the feature set with the labels after x_i is missing. The feature that causes the greatest change in relevance is selected as the first feature of the selected combination. After that, the features in the candidate set B_s are selected in forward order, the feature with the largest evaluation index J_mRMRMC(x_i) being added to the selected set B, until the best feature subset is finally determined. If deleting x_i causes the greatest loss of relevance D_all(X, L) − D_miss(x_i)(x_i, L), then feature x_i provides the most information for the whole, so x_i is taken as the first feature of the selected set, B(1) = x_i.
Reverse deletion ensures that the first feature selected is the one that provides the most information for the whole, and avoids the situation in which, due to combination effects, the feature most relevant to the label is not the feature that provides the most information for the whole.
This paper makes the system model more efficient and stable by adding three constraint conditions: minimum relevance constraint, computational cost constraint and maximum evaluation index constraint. The selection of the optimal feature subset based on BS-mRMRMC algorithm under three constraints is shown in Figure 7.

Input:
Sample data set D, feature set X, category tag set L;

Output:
Best subset of features B_best.
Step 1: Select the first feature of the subset by the reverse deletion strategy. Step 2: Select the remaining features by the forward search strategy, with the relevance between features and labels calculated according to formulas (9), (14) and (15).
Constraint 1: minimum relevance constraint. Set the threshold of relevance between feature and label η_d. Once the relevance of a feature to the label satisfies D(x_i, L) < η_d, the feature x_i is directly discarded to reduce the complexity of later calculation and improve the system efficiency. η_d is selected according to the relevance between the global feature vector and the label, where 0 < λ_d ≤ 1 represents the parameter of the minimum relevance threshold. Constraint 2: computational cost constraint. Set the computational cost constraint threshold η_c. For x_j = argmax_{x_i∈B_s} (J_mRMRMC(x_i)), if C(B) + c(x_j) > η_c, the best subset of features is directly determined as B_best = B. The computing cost threshold η_c is determined according to the complexity of computing the system features and the computing power of the system in practical applications.
where 0 < λ_c ≤ 1 represents the parameter of the computing cost threshold. Constraint 3: maximum evaluation index constraint. Set the maximum evaluation index constraint threshold η_j. Once the index reaches η_j, indicating that the recognition accuracy of the system can reach the actual requirement after x_j is added, the best subset of features is B_best = B ∪ x_j.
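The reverse-deletion step and the forward search under the three constraints can be sketched as follows; this is a hypothetical, simplified implementation, with `mi` and `cost` supplied by the caller and all names illustrative rather than the paper's exact notation.

```python
def bs_mrmrmc(features, label, mi, cost, eta_d, eta_c, eta_j, omega=0.1):
    """Bidirectional search: reverse deletion picks the first feature,
    then a forward search adds features under the three constraints.

    `features` maps name -> sample data; `mi(a, b)` estimates mutual
    information; `cost` maps name -> computation cost.
    """
    # Constraint 1: discard features weakly relevant to the label.
    cand = {k: v for k, v in features.items() if mi(label, v) >= eta_d}

    def total_relevance(keys):
        return sum(mi(label, cand[k]) for k in keys)

    # Reverse deletion: the first feature is the one whose removal
    # loses the most relevance with the label.
    full = total_relevance(cand)
    first = max(cand, key=lambda k: full - total_relevance(
        [j for j in cand if j != k]))
    selected, spent = [first], cost[first]
    rest = [k for k in cand if k != first]

    while rest:
        def score(k):
            red = sum(mi(cand[s], cand[k]) for s in selected) / len(selected)
            return mi(label, cand[k]) - red - omega * cost[k]
        best = max(rest, key=score)
        s = score(best)
        if spent + cost[best] > eta_c:   # Constraint 2: cost budget reached
            break
        selected.append(best)
        spent += cost[best]
        rest.remove(best)
        if s >= eta_j:                   # Constraint 3: requirement already met
            break
    return selected

# Toy example with a fake MI function driven by a relevance table.
rel = {"a": 0.9, "b": 0.5, "c": 0.05}
features = {k: k for k in rel}
def mi(u, v):
    if "L" in (u, v):
        return rel[v if u == "L" else u]
    return 0.1                            # small pairwise redundancy
cost = {"a": 1.0, "b": 1.0, "c": 1.0}
sel = bs_mrmrmc(features, "L", mi, cost, eta_d=0.1, eta_c=10.0, eta_j=99.0)
```

In the toy run, "c" is discarded by Constraint 1, "a" wins the reverse-deletion step, and "b" is added by the forward search.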

Decision Tree Support Vector Machine (DT-SVM) Classifier Design
SVM, one of the most commonly used classifiers, was proposed by Vapnik et al. on the basis of statistical learning theory [30]. Due to its strong learning and generalization ability, SVM has been widely used in multi-classification problems with small samples. However, SVM was originally designed to handle two-class tasks. For multi-classification support vector machines, an N-class task can be converted into multiple two-class tasks, as in one-against-all SVM (OAA-SVM) [31], one-against-one SVM (OAO-SVM) [32], decision directed acyclic graph SVM (DDAG-SVM) [33], and decision tree SVM (DT-SVM) [34]. For N-class classification problems, DDAG-SVM and DT-SVM need to construct fewer decision surfaces, which greatly improves the training and detection speed, but their recognition accuracy is not ideal due to the accumulation of classification errors. To solve this problem, both [35] and [36] proposed using the particle swarm optimization algorithm to design the DT-SVM. Although this improves the classification accuracy, it also increases the computational complexity, and the benefit mainly appears when the number of classes is large. The structure of the DT-SVM classifier has been designed by various methods [37-40], but problems such as limited accuracy improvement and limited adaptability remain.
Therefore, considering the actual number of categories in this paper, the vector projection method is adopted to measure the separability between classes, and accordingly the hierarchical structure of the DT-SVM classifier is designed to recognize each blocking category. The method based on vector projection was proposed by Li et al. [41]; with it, the number of intersecting samples between two classes can be better distinguished.
Let D_l = {d_l1, d_l2, . . . , d_lk_l} be the sample set of class l, containing k_l samples, where each sample d_li = (x_i1, x_i2, . . . , x_ij), j = 1, 2, . . . , n, contains n features, and d_li is projected onto the feature direction of its class. Let m_l denote the projection center of class l; the Euclidean distance from d_li to m_l is ||m_l − d_li||_2, and r_l denotes the largest such distance in class l. Then d = ||m_1 − m_2||_2 represents the distance between the two classes D_1 = {d_11, d_12, . . . , d_1i} and D_2. To measure the separability between classes, the separability measure value se_12 is calculated. If r_1 + r_2 ≥ d, the number of samples in X_1 that satisfy d − r_2 ≤ ||m_1 − d_1i||_2 ≤ r_1 and ||m_1 − d_1i||_2 ≤ d is n_1. In the same way, the number of samples in X_2 that satisfy d − r_1 ≤ ||m_2 − d_2i||_2 ≤ r_2 and ||m_2 − d_2i||_2 ≤ d is n_2. The separability measure between classes, se_12, is then defined from n_1 and n_2.
Step 1: Let d = 1. Compute the separability measure se_ij between each pair of classes from the class samples, i, j = 1, 2, ..., l, i ≠ j, and construct the separability measure matrix SE.
Here SE = [se_ij] is a symmetric l × l matrix with zero diagonal, whose last two rows read (se_{l−1,1}, se_{l−1,2}, ..., 0, se_{l−1,l}) and (se_{l,1}, se_{l,2}, ..., se_{l,l−1}, 0). Note that if r_1 + r_2 < d, the two classes do not intersect and se_12 = 0.
Step 2: Select the class corresponding to the largest separability measure in SE as class L(d), then delete the elements of that row and column. Step 3: Let d = d + 1, and repeat Step 2 to obtain the inter-class ordering of all categories. Step 4: Let d = 1, take the sample set of class L(d) as the positive samples of the sub-classifier and the samples of the remaining classes as the negative samples; train the SVM classifier and record the node information node(d); remove the class L(d) sample set from the sample set. Step 5: Let d = d + 1, and repeat Step 4 until all sub-classifiers are trained. The final DT-SVM multi-classifier is shown in Figure 8.
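The ordering of Steps 1–3 can be sketched as follows. This is a sketch under an assumed selection rule: at each level the remaining class with the largest summed separability from the other remaining classes is peeled off first; the paper's exact criterion and the SVM training of Steps 4–5 are omitted.

```python
import numpy as np

def tree_order(SE):
    """Return the order in which classes are peeled off the DT-SVM tree.

    SE: symmetric (l, l) separability matrix with zero diagonal. At each
    level the remaining class with the largest summed separability is taken
    as the positive class of the next sub-classifier (an assumed rule,
    for illustration only).
    """
    remaining = list(range(SE.shape[0]))
    order = []
    while len(remaining) > 1:
        sub = SE[np.ix_(remaining, remaining)]      # delete removed rows/cols
        best = remaining[int(np.argmax(sub.sum(axis=1)))]
        order.append(best)                           # positive class of node(d)
        remaining.remove(best)
    order.append(remaining[0])                       # last class needs no node
    return order
```

The most separable class is split off at the root, so the classes hardest to distinguish are handled deepest in the tree, limiting error accumulation at the upper nodes.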
The flow chart of the whole method is shown in Figure 9. First, data collection and preprocessing were carried out; then the best feature subset was obtained with the BS-mRMRMC algorithm; finally, the classification hierarchy of the multi-classifier was determined by the vector projection method, and the performance of the classifier was evaluated with four indexes.

Results and Discussion
In this section, the effectiveness of the proposed method is verified by the features selected in the best subset and by the evaluation indexes of the recognition results. As shown in Figure 10, three feature selection algorithms, IG [20], ReliefF [28] and SVM-RFE [29], are compared with the proposed BS-mRMRMC algorithm. Four system performance evaluation indexes are selected to evaluate the feature subsets chosen by the four methods. The meanings of the parameters used in the performance evaluation are shown in Table 1.
The four evaluation indexes are overall accuracy, precision, recall and a comprehensive evaluation index. Overall accuracy evaluates the detection accuracy over both positive and negative samples; precision evaluates the misjudgment of positive samples; recall evaluates the missed detection of positive samples; and the comprehensive evaluation index evaluates the overall identification performance of the system.

(1) Overall accuracy: Accuracy = (TP + TN) / (TP + TN + FP + FN). (2) Precision: P = TP / (TP + FP). (3) Recall: R = TP / (TP + FN). (4) Comprehensive evaluation index F-Score: F = (1 + α^2) · P · R / (α^2 · P + R); usually α = 1 is taken, giving F1 = 2 · P · R / (P + R). The feature ordering produced by the four methods is shown in Table 2. It can be seen from the table that the feature most relevant to the label is not necessarily the feature that contributes most to the overall information, which also verifies the effectiveness of the reverse-deletion search strategy used in this paper to determine the first feature. Table 2. Feature selection ranking of the four algorithms.

(Table 2 columns: Algorithms; Feature Selection Sorting.)
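In code, the four evaluation indexes reduce to the usual confusion-matrix formulas; a minimal sketch using the TP/FP/TN/FN counts of Table 1:

```python
def metrics(tp, fp, tn, fn, alpha=1.0):
    """Overall accuracy, precision, recall and F-score from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)                      # misjudgment of positives
    recall = tp / (tp + fn)                         # missed detection of positives
    f = (1 + alpha**2) * precision * recall / (alpha**2 * precision + recall)
    return accuracy, precision, recall, f           # alpha = 1 gives F1
```

For multi-class recognition, these per-class values are averaged over the categories to obtain the average indexes reported below.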

Best Subset Selection Based on Different Constraints
First, the paper verifies the number of features in the optimal feature subset determined by each method under different constraints on computational cost and maximum detection accuracy. Different η_c and η_j are obtained by setting different threshold parameters for the computational cost and the maximum evaluation index, λ_c = [0.1, 0.2, ..., 1] and λ_j = [0.5, 0.55, ..., 0.95]. As Figure 11 shows, the number of features in the best feature subset selected by BS-mRMRMC is smaller than that of the other methods under higher detection accuracy requirements and lower computational cost budgets; that is, the method can achieve the same or higher detection accuracy with fewer features. For example, when the constraints are η_c ≤ 0.5C(X) and η_j ≥ 0.9, the best subset selected by the proposed algorithm contains only 2 features; when the constraints are changed to η_c ≤ 0.5C(X) and η_j ≥ 0.95, it contains only 3. The other algorithms select more features under the same constraint conditions. It can be concluded that, when the computational cost budget is low and the required precision is high, the best subset determined by the proposed algorithm contains fewer features than those of the other three algorithms.

Figure 11. Surface graphs of the number of features in the best feature subset for the four methods under different computational cost constraints η_c and maximum evaluation index constraints η_j.

Comparison and Analysis of Results with Different Numbers of Features
In this section, the recognition results for the different categories are compared according to the feature ordering in Table 2. The four panels of Figure 12 show surface diagrams of the four evaluation indexes for identifying each category by the four methods when one to eight features are selected successively. Figure 12a shows that our algorithm achieves higher detection accuracy with fewer features: when fewer than 4 features are selected, the identification accuracy of each category is significantly better than IG and SVM-RFE, and slightly better than ReliefF. Figure 12b-d show that, with fewer than 4 selected features, our method also yields better precision, recall and comprehensive evaluation index for most categories than the other three algorithms. This result is consistent with the previous section: under the constraints η_c ≤ 0.5C(X) and η_j ≥ 0.95, the proposed algorithm selects an optimal subset of only 3 features, achieving higher detection accuracy at lower computational cost with fewer features. Figure 13 shows line charts comparing the average accuracy, average precision, average recall and average comprehensive evaluation index of the four methods when different numbers of features are selected. When fewer than 4 features are selected, the indicators of the BS-mRMRMC algorithm are better than those of the other algorithms; as more features are added, the indicators of the different algorithms' feature sets tend to converge. Because the IG algorithm does not consider the redundancy between features, its detection accuracy decreases when the third feature is added in Figure 13, which also verifies the importance of considering the redundancy between features. In addition, the first feature selected can be regarded as the feature with the greatest contribution to overall performance.
The recognition accuracy under the first feature alone is better than that of the other algorithms, which also verifies the effectiveness of the bidirectional search strategy. However, although the average recognition accuracy of the first feature selected by BS-mRMRMC is high, its average recall is poor, indicating that this feature is strongly correlated with other categories, which can be verified from Figure 5. As a result, many positive samples are missed and identified as negative samples, and there is a certain contradiction between recall and precision. Nevertheless, the average comprehensive evaluation index F1, which combines precision and recall, shows that the first feature selected by BS-mRMRMC is better. The purpose of feature selection is to obtain the best and smallest feature set while the system performance meets the constraint requirements.
Compared with the ReliefF method, our algorithm comprehensively considers both the correlation between features and categories and the redundancy between features; the limitation of ReliefF is that it cannot effectively remove redundant features. As Figure 13 also shows, our method needs to select only three features, whereas ReliefF needs five to meet the same index requirements. Under the same constraints, our algorithm selects the feature set with the largest correlation, the smallest redundancy, and fewer features. When the number of features in the selected best feature subset is 3, a comparison of the identification indicators of each category under the four methods is shown in Table 3. Although the recognition indexes of a few categories are slightly lower than those of other algorithms, the BS-mRMRMC algorithm is better for most categories, and the average index of each category exceeds 95%.

Conclusions
In this paper, a large amount of CIR data of a UWB device with five different blocking materials under LOS and NLOS conditions was collected in a large experimental center, and eight features were extracted from the CIR to analyze the influence of different features on different occlusion categories. Selecting 30 effective CIR signal points for feature extraction effectively improves the computational efficiency of feature extraction. The proposed BS-mRMRMC method introduces the computational cost into the mRMRMC feature evaluation criteria and uses a bidirectional search strategy for feature selection. Multiple experiments show that, under the same constraints, BS-mRMRMC better selects the optimal feature subset with maximum correlation, minimum redundancy, minimum computational cost and fewer features. Meanwhile, with the same number of selected features, the BS-mRMRMC method recognizes each category more accurately than the other three methods; when only three features are selected, the average recognition accuracy for each blocking category reaches 96.7%. More importantly, this paper identifies not only the LOS and NLOS states but also the different blocking categories under NLOS, which provides a new research approach for subsequent UWB indoor positioning, indoor information perception and target tracking.