Adaptive Residual Weighted K-Nearest Neighbor Fingerprint Positioning Algorithm Based on Visible Light Communication

The weighted K-nearest neighbor (WKNN) algorithm is a commonly used fingerprint positioning method, the difficulty of which lies in how to choose the value of K to obtain the minimum positioning error. In this paper, we propose an adaptive residual weighted K-nearest neighbor (ARWKNN) fingerprint positioning algorithm based on visible light communication. Firstly, the target matches the fingerprints according to the received signal strength indication (RSSI) vector. Secondly, K is set dynamically according to the matched RSSI residual. Simulation results show that the ARWKNN algorithm reduces the average positioning error compared with random forest (81.82%), extreme learning machine (83.93%), artificial neural network (86.06%), grid-independent least square (60.15%), self-adaptive WKNN (43.84%), WKNN (47.81%), and KNN (73.36%). These results were obtained with the signal-to-noise ratio set to 20 dB and the Manhattan distance used in a two-dimensional (2-D) space. The ARWKNN algorithm based on the Clark distance and the minimum maximum distance metrics produces the minimum average positioning error in 2-D and 3-D, respectively. Compared with the self-adaptive WKNN (SAWKNN), WKNN and KNN algorithms, the ARWKNN algorithm achieves a significant reduction in the average positioning error while maintaining similar algorithm complexity.


Introduction
Positioning systems can be divided into outdoor positioning systems (OPS) and indoor positioning systems (IPS). An OPS usually uses the global positioning system (GPS) to obtain the coordinates of the target. Since the GPS signal cannot penetrate walls and other obstacles, GPS cannot be applied in indoor positioning scenes [1]. As a supplement to OPS, IPS has attracted increasing attention among researchers. At present, there are two main research areas in IPS. One is based on radio frequency communication technology, such as radio frequency identification (RFID) [2], wireless sensor networks (WSN) [3], ultra-wideband (UWB) [4], wireless fidelity (WiFi) [5], and Bluetooth [6]. The other is based on visible light communication (VLC) [7]. IPS can also be divided into range-based IPS and range-free IPS. Range-based methods include time of arrival (TOA), angle of arrival (AOA), and received signal strength indication (RSSI) [8,9]. Range-free IPS usually uses fingerprint positioning.
The rest of this paper is organized as follows: the ARWKNN algorithm is proposed in Section 2. Simulation results are shown and discussed in Section 3. Finally, Section 4 concludes this paper.
Notation: Matrices and vectors are in boldface. The field of real numbers is denoted by $\mathbb{R}$. $\|\cdot\|_2$ is the $\ell_2$ norm of a vector, $|\cdot|$ is the absolute value, and $\lceil\cdot\rceil$ denotes the rounding-up operator. The transpose operation is denoted by $[\cdot]^T$.

System Model
The positioning model is shown in Figure 1a. If there are $M_{total}$ LEDs in the room, the target checks them and selects the $M$ LEDs that have the highest RSSI for positioning. For simplicity, we assume that the target appears in a 3-D space with $M$ LEDs. The coordinates of the $M$ LEDs are $\beta_i = [x_{LED-i}, y_{LED-i}, z_{LED-i}]^T$, for $i = 1, 2, \ldots, M$. It is assumed that the $M$ LEDs are evenly distributed on the same horizontal plane, i.e., $z_{LED-i} = z_{LED}$, where $z_{LED}$ is the height from the floor to the LEDs. $\alpha_i \in \mathbb{R}^{3\times1}$ represents the angle of the $i$th LED. $\theta_j \in \mathbb{R}^{3\times1}$ and $\gamma_j \in \mathbb{R}^{3\times1}$ represent the coordinate and angle of the $j$th fingerprint point, respectively, for $j = 1, 2, \ldots, N$, where $N$ is the number of fingerprint points. Suppose the target moves in the interval from $h_L$ to $h_H$ along the z-axis, where $h_L$ and $h_H$ are the minimum and maximum vertical distances from the floor to the target, respectively.
We use $S$ to denote the spacing of the fingerprints, as shown in Figure 1a. The indices $m$, $n$ and $l$ represent the collection directions of fingerprints along the x-axis, y-axis and z-axis, respectively; their meanings are shown in Table 1. To make this easier to understand, an example is given in Figure 1b, where columns are arranged from left to right (in the positive direction of the x-axis), rows are arranged from bottom to top (in the positive direction of the y-axis), and dimensions are arranged from low to high (in the positive direction of the z-axis). The starting point of fingerprint collection is $\theta_{init} = [x_{init}, y_{init}, z_{init}]^T$, where $x_{init}$, $y_{init}$, and $z_{init}$ are given by:
Then, in the positioning space, the coordinates corresponding to the fingerprint point in the $l$ dimension, the $m$ column and the $n$ row are

$$
\begin{cases}
x_{fin-m} = x_{init} + S(m-1), & m = 1, 2, \ldots, \lceil L_1/S \rceil + 1 \\
y_{fin-n} = y_{init} + S(n-1), & n = 1, 2, \ldots, \lceil L_2/S \rceil + 1 \\
z_{fin-l} = z_{init} + S(l-1), & l = 1, 2, \ldots, \lceil L_3/S \rceil + 1
\end{cases}
\tag{2}
$$

where $L_1 = \max(x_{LED-i}) - \min(x_{LED-i})$, $L_2 = \max(y_{LED-i}) - \min(y_{LED-i})$ and $L_3 = h_H - h_L$. Then the distance $d_{l,m,n-i}$ between each fingerprint point and the $i$th LED can be obtained as:
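As an illustration of the grid construction in Equation (2), the following minimal sketch enumerates the fingerprint coordinates and their distances to the LEDs. It rests on several assumptions that are not confirmed by the text: the per-axis point count is taken as the rounded-up ratio plus one, the distance is taken to be the Euclidean distance between a fingerprint point and an LED, and the ordering of the points (x fastest, z slowest) is chosen only for convenience. The function names `fingerprint_grid` and `led_distances` are hypothetical.

```python
import math
import numpy as np

def fingerprint_grid(x_init, y_init, z_init, S, L1, L2, L3):
    """Return an (N, 3) array of fingerprint coordinates on a regular grid.

    Assumption: the number of points per axis is ceil(L/S) + 1.
    """
    def count(length):
        # round() guards against floating-point noise such as 19.999999999
        return math.ceil(round(length / S, 9)) + 1

    xs = x_init + S * np.arange(count(L1))   # columns (index m)
    ys = y_init + S * np.arange(count(L2))   # rows (index n)
    zs = z_init + S * np.arange(count(L3))   # dimensions (index l)
    # Assumed ordering: z varies slowest, then y, then x
    zz, yy, xx = np.meshgrid(zs, ys, xs, indexing="ij")
    return np.column_stack([xx.ravel(), yy.ravel(), zz.ravel()])

def led_distances(points, leds):
    """Assumed Euclidean distance from every fingerprint point to every LED.

    points: (N, 3), leds: (M, 3); returns an (N, M) array.
    """
    return np.linalg.norm(points[:, None, :] - leds[None, :, :], axis=2)
```

For example, `fingerprint_grid(0, 0, 0, 0.5, 2.0, 1.0, 0.0)` yields a single-layer grid of 5 × 3 points, which corresponds to the 2-D case with a fixed receiver height.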

Fingerprint Matrix Construction
We use $\Phi \in \mathbb{R}^{M\times N}$ to denote the measurement matrix of the fingerprints, where $N$ is the number of fingerprint points.
Sensors 2020, 20, 4432 5 of 23
Each entry $\phi_{l,m,n-i}$ of $\Phi$ represents the RSSI, which is given by:

$$
\phi_{l,m,n-i} = 10\log_{10} P_{l,m,n-i} \tag{6}
$$

where $P_{l,m,n-i}$ represents the optical power value from the $i$th LED received by the fingerprint point in the $l$ dimension, $m$ column and $n$ row within the positioning area.

Measurement Vector
Suppose the coordinates of the targets in 3-D are $\Psi_k = [x_{target-k}, y_{target-k}, z_{target-k}]^T$, for $k = 1, 2, \ldots, C$, where $C$ represents the number of targets. The received signal strength vector $Y_k$ of the $M$ LEDs collected by the $k$th target has entries $Y_{k,i}$, for $i = 1, 2, \ldots, M$, given by

$$
Y_{k,i} = 10\log_{10} P_{k,i}
$$

where $P_{k,i}$ represents the optical power value of the $i$th LED received by the $k$th target.

Measurement Model
In this paper, the measurement matrix $\Phi$ and the measurement vector $Y_k$ are generated by the Lambertian radiation model. Because the LEDs are distributed on the ceiling, communication between a fingerprint point and an LED is mainly line-of-sight (LoS). Without loss of generality, this paper only considers the LoS Lambertian radiation model, which is widely adopted in papers such as [12,28,30-32]. The received light power value of the fingerprint point is:
where $P_{Re}$ represents the received light power value; $P_{Tr}$ represents the transmit power of the LED; $d$ is the distance between the transmitter and the receiver; $T_s$ and $g$ are the optical filter gain and the optical concentrator gain, respectively; $b$ is the Lambertian order; $\lambda_{1/2}$ is the half-power angle of the LED; and $A_{PD}$ is the effective detection area of the photodetector (PD). The field of view (FOV) of the PD is defined as $\omega_{FOV}$, with $0 < \omega_i < \omega_{FOV}$. $\lambda_i$ and $\omega_i$ are the radiation and incident angles, measured from the transmitter's normal and the receiver's normal, respectively, as shown in Figure 1a.
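The symbols listed above match the LoS Lambertian expression commonly used in the VLC literature, $P_{Re} = P_{Tr}\,\frac{(b+1)A_{PD}}{2\pi d^2}\cos^b(\lambda)\,T_s\,g\,\cos(\omega)$ inside the FOV, with $b = -\ln 2/\ln(\cos\lambda_{1/2})$. The sketch below assumes that standard form; it is not taken from this paper, and the function names and default gains are hypothetical.

```python
import math

def lambertian_order(half_power_angle):
    """Lambertian order b from the LED half-power angle (radians)."""
    return -math.log(2.0) / math.log(math.cos(half_power_angle))

def received_power(p_tr, d, rad_angle, inc_angle, half_power_angle,
                   a_pd, t_s=1.0, g=1.0, fov=math.pi / 2):
    """Assumed standard LoS Lambertian received power (same unit as p_tr).

    rad_angle and inc_angle are the radiation and incident angles in radians.
    Outside the receiver field of view the received power is taken as zero.
    """
    if not (0.0 <= inc_angle <= fov):
        return 0.0
    b = lambertian_order(half_power_angle)
    return (p_tr * (b + 1.0) * a_pd / (2.0 * math.pi * d ** 2)
            * math.cos(rad_angle) ** b * t_s * g * math.cos(inc_angle))

def rssi_db(power):
    """RSSI as 10*log10 of the received optical power."""
    return 10.0 * math.log10(power)
```

With a half-power angle of 60 degrees the Lambertian order evaluates to 1, which is a convenient sanity check for an implementation.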

Channel Access Method
Each LED independently transmits a unique identification (ID) code; however, signals sent from different LEDs interfere with each other at the receiver. In order to separate the power received from different LEDs, we use time division multiplexing [20,31,32]; in a real scenario, different modulation frequencies can also be used, as in Guo et al. [11] and Alam et al. [12]. The M LEDs have synchronous frames [20,31], and different LEDs use different time slots to transmit signals within each frame cycle: when one LED transmits its ID code, the other LEDs emit a constant light intensity (CLI) for illumination purposes only. The frame structure is shown in Figure 2. After photoelectric conversion, a high-pass filter can be used to filter out the power from the other LEDs [20].
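The slot allocation described above can be sketched as follows. This is only an illustration under the assumption that each frame contains exactly one slot per LED; the actual frame layout is the one given in Figure 2, and the function name `tdm_frame` is hypothetical.

```python
def tdm_frame(num_leds):
    """Build one frame as a list of slots.

    Assumption: one slot per LED per frame. In slot s, LED s transmits its
    ID code while every other LED emits constant light intensity (CLI).
    """
    return [
        {led: ("ID" if led == slot else "CLI") for led in range(num_leds)}
        for slot in range(num_leds)
    ]
```

Because only one LED is modulated in any slot, the receiver can attribute the varying part of the received signal in that slot to a single LED.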

Setting of K
According to the principle of fingerprint positioning, the purpose of positioning is to find the K fingerprint points that are closest to the target. In different experimental environments, K generally takes different values: in Xue et al. [15], the optimal positioning accuracy is obtained when K = 5; in Alam et al. [12] and Zhang et al. [28], when K = 4; and in Van et al. [25], when K = 3. What these works have in common is that K is a fixed value. In this paper, N fingerprint points are evenly distributed in the 2-D or 3-D space. At a given time, the K fingerprint points closest to the target are used to estimate its position, which is the KNN fingerprint positioning algorithm. For example, when the target coincides exactly with a fingerprint point, as shown in Figure 3a, the optimal positioning accuracy is obtained when K = 1. When the target falls on the straight line formed by two fingerprint points, as shown in Figure 3b, the optimal value is K = 2. When the target is in a triangular area formed by three fingerprint points, as shown in Figure 3c, the optimal value is K = 3. If a 3-D fingerprint map is adopted, the target is located with high probability in a minimum cube formed by 8 fingerprint points, i.e., K = 8, as shown in Figure 3d.

Figure 4 shows the positioning error of five targets at different 3-D positions using the WKNN algorithm as K increases from 1 to 8. The positioning error is minimal at K = 4 for target 1, K = 3 for target 2, K = 8 for target 3, K = 1 for target 4, and K = 6 for target 5. It can also be seen from Figure 4 that the positioning error fluctuates with K, with no monotonic increasing or decreasing relationship.
In 2-D visible light localization, the average positioning error based on the WKNN algorithm can be minimized when K = 3 or K = 4, e.g., [12,25,28]. In 3-D visible light localization, the average positioning error based on the WKNN algorithm can be minimized when K = 8, which will be discussed in Section 3. A minimum average positioning error does not mean that the positioning error of each individual target is the smallest, so a dynamic K value can effectively reduce the positioning error of different targets. To address this issue, this paper proposes an adaptive residual weighted K-nearest neighbor fingerprint positioning algorithm, called the ARWKNN fingerprint positioning algorithm.

ARWKNN Algorithm
The WKNN fingerprint positioning algorithm is based on the shortest RSSI distance between the fingerprint and the target position. The positioning error of the WKNN algorithm is affected by the weights of the fingerprint points, and these weights are affected by the value of K. If the optimal K can be obtained, the positioning error can be reduced, so a novel ARWKNN algorithm is proposed in this paper. The pseudo-code of the ARWKNN algorithm is shown in Algorithm 1. If only Steps 1, 2 and 5 of Algorithm 1 are considered, it is the WKNN algorithm; if, in addition, the location of the target in Step 5 is estimated by averaging the coordinates of the K fingerprints, it is the KNN algorithm. In contrast to the KNN and WKNN algorithms, the ARWKNN algorithm also performs Steps 3 and 4 of Algorithm 1. There is no prior information about the location of the target, that is, the value of $\Psi_k$ is unknown, but the fingerprint matrix $\Phi$ and the target RSSI measurement vector $Y_k$ are known, so K can be selected adaptively by matching the residual between the measured and calculated RSSI values. The purpose of Steps 3 and 4 is therefore to obtain the optimal K, i.e., the K corresponding to the smallest RSSI matching residual. Because the maximum number of neighboring fingerprint points $K_{max}$ is much smaller than the total number of fingerprint points N, the ARWKNN algorithm achieves a large reduction in the average positioning error while maintaining similar algorithm complexity, which will be discussed in Section 3.4.

Algorithm 1. ARWKNN algorithm
Input: the maximum number of nearest-neighbor fingerprints $K_{max}$, the fingerprint matrix $\Phi$, and the $k$th target measurement vector $Y_k$.
Output: the coordinates of the $k$th target, i.e., $\Psi_k$.

Step 1: Calculate the distance from the $k$th target to the N fingerprint points, where $r = 1$ represents the Manhattan distance and $r = 2$ represents the Euclidean distance.
Steps 2 and 3: Within the loop (closed by "end for"), $A \in \mathbb{R}^{M\times K}$ collects the K columns of the fingerprint matrix $\Phi$ selected according to the index set $I$. Calculate the $k$th target RSSI vector via the K nearest-neighbor fingerprints, for $t = 1, 2, \ldots, K$. Calculate the matched RSSI residual between the measured and calculated RSSI values, and calculate the sum of the absolute values of the residuals.
Step 4: Output the K value.
Step 5: Calculate the coordinates of the $k$th target, where $\theta_{I(t)}$ represents the coordinates of the corresponding fingerprint point found according to the index set $I$.
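The steps of Algorithm 1 can be sketched as below. Since the step equations are not reproduced in this copy, several details are assumptions rather than the authors' exact formulation: the distance is the Minkowski distance of order $r$ between RSSI vectors, the index set $I$ holds the $K_{max}$ nearest fingerprints in ascending order of distance, the weights are normalized inverse distances, and the selected K is the one that minimizes the sum of absolute RSSI residuals. The function name `arwknn` and the small constant `eps` are hypothetical.

```python
import numpy as np

def arwknn(Phi, theta, y, k_max, r=1, eps=1e-12):
    """Sketch of adaptive-residual WKNN under the stated assumptions.

    Phi:   (M, N) fingerprint RSSI matrix, one column per fingerprint point
    theta: (N, D) fingerprint coordinates
    y:     (M,)   RSSI vector measured by the target
    Returns (estimated coordinates, selected K).
    """
    # Step 1: distance of order r between the measurement and each fingerprint
    dist = np.sum(np.abs(Phi - y[:, None]) ** r, axis=0) ** (1.0 / r)
    # Step 2 (assumed): index set of the k_max nearest fingerprints
    k_max = min(k_max, Phi.shape[1])
    idx = np.argsort(dist, kind="stable")[:k_max]

    best = None
    # Step 3: for each candidate K, rebuild the RSSI vector and its residual
    for k in range(1, k_max + 1):
        sel = idx[:k]
        w = 1.0 / (dist[sel] + eps)        # assumed inverse-distance weights
        w = w / w.sum()
        y_hat = Phi[:, sel] @ w            # RSSI vector implied by K neighbors
        residual = np.sum(np.abs(y - y_hat))
        if best is None or residual < best[0]:
            best = (residual, k, w, sel)

    # Step 4: K with the smallest residual; Step 5: weighted coordinates
    _, k_opt, w, sel = best
    return w @ theta[sel], k_opt
```

The loop only touches the $K_{max}$ nearest fingerprints, so the extra cost over WKNN is small compared with the distance computation over all N fingerprint points in Step 1.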

Simulation Analysis
In this section, the ARWKNN algorithm is compared with the RF [14], ELM [16], ANN [17], GI-LS [11], SAWKNN [19], WKNN [12] and KNN [15,25] algorithms. The basic principle of fingerprint positioning based on the RF, ELM, ANN, and GI-LS machine-learning algorithms is as follows [11,13]. Firstly, the positioning area is divided into equally spaced grid points according to the sampling interval S, RSSI measurements are obtained by placing the receiver at the different grid points, and each grid point represents a category. Secondly, machine-learning algorithms are trained to recognize the category to which each grid point belongs. Thirdly, the RSSI measurements obtained in the online phase are fed to the trained model to predict the location of the target.

Error Definition
Suppose the actual coordinates of the targets are $\Psi_k \in \mathbb{R}^{3\times1}$. The positioning error $E_k$ is defined as:
and the average positioning error $E_{APE}$ is defined as:
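A minimal sketch of these two quantities follows, assuming that the positioning error is the $\ell_2$ distance between the actual and estimated coordinates of a target and that the average positioning error is the mean of the per-target errors over the C targets. These definitions are assumptions based on the notation of this paper, and the function names are hypothetical.

```python
import numpy as np

def positioning_errors(true_pos, est_pos):
    """Per-target error, assumed to be the Euclidean distance.

    true_pos, est_pos: array-like of shape (C, D).
    """
    diff = np.asarray(true_pos, dtype=float) - np.asarray(est_pos, dtype=float)
    return np.linalg.norm(diff, axis=1)

def average_positioning_error(true_pos, est_pos):
    """Mean of the per-target positioning errors over all C targets."""
    return float(np.mean(positioning_errors(true_pos, est_pos)))
```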

Noise Model of Visible Light Communication (VLC)
In indoor VLC, the noise $\sigma_{noise}$ includes shot noise $\sigma_{shot}$ and thermal noise $\sigma_{thermal}$ [33], which are given by:
where $q$ is the elementary charge, $R_{PD}$ is the responsivity of the PD, $B$ is the equivalent noise bandwidth, $P_r$ indicates the received power from the M LEDs, $k_B$ is the Boltzmann constant, $T_K$ is the absolute temperature, $G_0$ is the open-loop gain, $\eta$ is the fixed capacitance of the PD, $I_{bg}$ is the background light current, $\Gamma$ is the channel noise factor, $g_m$ is the field effect transistor (FET) transconductance, and $I_2$ and $I_3$ are the noise bandwidth factors.
According to the noise model, the signal-to-noise ratio (SNR) is given by [32] SNR(dB) = 10 log 10 R PD
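The symbols defined above match the shot and thermal noise expressions widely used in the indoor VLC literature: $\sigma_{shot}^2 = 2qR_{PD}P_rB + 2qI_{bg}I_2B$ and $\sigma_{thermal}^2 = \frac{8\pi k_BT_K}{G_0}\eta A_{PD}I_2B^2 + \frac{16\pi^2k_BT_K\Gamma}{g_m}\eta^2A_{PD}^2I_3B^3$, with the SNR taken as $(R_{PD}P_r)^2/\sigma_{noise}^2$. The sketch below assumes these forms. The default parameter values are illustrative placeholders and are not the values used in this paper; the function names are hypothetical.

```python
import math

Q = 1.602176634e-19   # elementary charge (C)
K_B = 1.380649e-23    # Boltzmann constant (J/K)

def noise_variance(p_r, B, R_pd=0.54, I_bg=5.1e-3, I2=0.562, I3=0.0868,
                   T_k=295.0, G0=10.0, eta=1.12e-6, A_pd=1e-4,
                   Gamma=1.5, g_m=30e-3):
    """Assumed shot + thermal noise variance (A^2); SI units throughout.

    All defaults are illustrative placeholders, not the paper's settings.
    eta is the PD capacitance per unit area (F/m^2), A_pd the PD area (m^2).
    """
    shot = 2.0 * Q * R_pd * p_r * B + 2.0 * Q * I_bg * I2 * B
    thermal = ((8.0 * math.pi * K_B * T_k / G0) * eta * A_pd * I2 * B ** 2
               + (16.0 * math.pi ** 2 * K_B * T_k * Gamma / g_m)
               * eta ** 2 * A_pd ** 2 * I3 * B ** 3)
    return shot + thermal

def snr_db(p_r, B, R_pd=0.54, **kwargs):
    """SNR in dB, assumed to be 10*log10((R_pd * p_r)^2 / noise variance)."""
    sigma2 = noise_variance(p_r, B, R_pd=R_pd, **kwargs)
    return 10.0 * math.log10((R_pd * p_r) ** 2 / sigma2)
```

Under this model every noise term grows with B, so for a fixed received power the SNR falls as the bandwidth increases, and it rises with the received power.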

Simulation Parameters
Without loss of generality, we suppose $\alpha_i = [0, 0, -1]^T$ and $\gamma_j = [0, 0, 1]^T$, i.e., $\cos(\lambda_i) = \cos(\omega_i) = h_{l,m,n-i}/d_{l,m,n-i}$, where $h_{l,m,n-i}$ is the z-axis distance from the fingerprint point in the $l$ dimension, the $m$ column and the $n$ row to the $i$th LED; this setting is widely adopted in papers such as [12,20,28]. The parameter setting of the Lambertian radiation model is as follows:
For simplicity, unless otherwise specified, we only consider the 2-D case, with S = 20 cm. To obtain the optimal classification accuracy of the ANN, ELM, and RF algorithms and the optimal positioning accuracy of the KNN, WKNN, and SAWKNN algorithms, the parameters are obtained through offline training and learning, as follows. In the KNN, WKNN, ARWKNN and SAWKNN algorithms, $K_{max} = 4$; the impact of different $K_{max}$ values on the average positioning error is discussed in Section 3.4. For the optimal number of hidden nodes and trees, the classification method is the same as that in Guo et al. [11], i.e., each grid point represents a category, and cross-validation is adopted with experience-based adjustment. For the number of hidden nodes, cross-validation covers the range from 100 to 700 with a step size of 50. For the number of trees, it covers the range from 10 to 50 with a step size of 5. After comprehensive evaluation of the positioning accuracy and classification accuracy, the optimal numbers of hidden nodes and trees are selected to be 600 and 40, respectively. The impact of $\gamma_{th}$ on the average positioning error is shown in Figure 5. It can be seen from Figure 5 that the minimum average positioning error is achieved when $\gamma_{th}$ is within the range [30%, 50%], so $\gamma_{th}$, which denotes the threshold below which two RSSI difference values can be considered similar [19], is set to 40%.
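The cross-validation search ranges described above can be enumerated as in the following sketch. Only the ranges and step sizes come from the text; the helper name `candidates` is hypothetical, and the scoring of each candidate is not shown because the procedure is not specified beyond cross-validation.

```python
def candidates(start, stop, step):
    """Inclusive list of candidate hyperparameter values."""
    return list(range(start, stop + 1, step))

# Search ranges stated in the text
hidden_node_candidates = candidates(100, 700, 50)   # ELM / ANN hidden nodes
tree_candidates = candidates(10, 50, 5)             # RF trees
```

This gives 13 candidate values for the number of hidden nodes and 9 for the number of trees, and the selected values of 600 and 40 both lie on these grids.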

Result Analysis
We only consider positioning in this paper, so B = 640 kHz is sufficient to label $3.4 \times 10^{38}$ LEDs [34], which far exceeds actual needs. The SNR experimental results are shown in Table 2. If B = 640 kHz, the typical SNR for indoor visible light communication ranges from 42.97 to 60.92 dB, with an average value of 52.45 dB. In addition to indoor positioning, LEDs can also provide high-speed data transmission; if B = 100 MHz, the average SNR can still reach 28.86 dB. When $P_{tr}$ = 6 W, the average positioning errors of the eight algorithms are analyzed for B within 50 MHz to 400 MHz, and the results are shown in Figure 6. As the modulation bandwidth increases, the average positioning errors of the eight algorithms increase: the higher the modulation bandwidth, the lower the SNR and the higher the average positioning errors. As only positioning is considered in this paper, a very high modulation bandwidth is not necessary. With a high modulation bandwidth, it may be more suitable to modulate the transmission signal of the LED by modified orthogonal frequency division multiplexing (OFDM) to achieve indoor positioning [22,35,36], but this is beyond the scope of this paper. It can also be seen from Figure 6 that when B is within 50 MHz to 400 MHz, the average positioning error based on the ARWKNN algorithm is the smallest.
When B = 100 MHz, the average positioning errors of the eight algorithms are analyzed for $P_{tr}$ within 1 W to 6 W, and the results are shown in Figure 7. As $P_{tr}$ increases, the average positioning errors of the eight algorithms decrease. When $P_{tr}$ = 3 W, the average positioning errors of the eight algorithms are close to convergence. The higher the transmitting power, the higher the SNR and the smaller the average positioning errors. It can also be seen from Figure 7 that when $P_{tr}$ is within 1 W to 6 W, the average positioning error based on the ARWKNN algorithm is the smallest.
The average positioning errors of the eight algorithms under different SNRs are compared, and the simulation results are shown in Figure 8. As shown in Figure 8, when SNR = 10 dB, the average positioning errors of the eight algorithms are large due to severe noise interference. As the SNR increases, the average positioning errors of the eight algorithms decrease. When SNR = 20 dB, the average positioning errors of the eight algorithms are close to convergence.
Since fingerprint positioning based on the RF, ELM and ANN algorithms can only determine the category of the target, its positioning error is larger than that of the WKNN algorithm. When the SNR is higher than 15 dB, the average positioning error based on the ARWKNN algorithm is the smallest. Due to lighting requirements and LoS communication, within the typical SNR range of indoor visible light communication, the average positioning error based on the ARWKNN algorithm is significantly lower than that of the RF, ELM, ANN, GI-LS, SAWKNN, WKNN and KNN algorithms. The average positioning error based on the SAWKNN algorithm is lower than that of the WKNN algorithm. The GI-LS algorithm uses the complementary advantages of the KNN, RF, and ELM classifiers to weight the estimation results; its average positioning error is lower than that of the KNN, RF, ELM and ANN algorithms, but higher than that of the WKNN, ARWKNN and SAWKNN algorithms.
It can be seen from Table 3 that, compared with the RF, ELM, ANN, GI-LS, SAWKNN, WKNN and KNN algorithms, the average positioning error based on the ARWKNN algorithm is reduced. When SNR = 20 dB, the simulation results of the cumulative distribution function (CDF) are shown in Figure 9. It can be seen from Figure 9 that the CDF of positioning errors based on the ARWKNN algorithm is significantly better than that of the RF, ELM, ANN, GI-LS, SAWKNN, WKNN and KNN algorithms. The KNN algorithm is one of the simplest of all machine-learning algorithms.
Compared with the RF, ELM, ANN and GI-LS algorithms, fingerprint positioning based on the ARWKNN algorithm has not only lower complexity but also lower positioning error. Fingerprint positioning based on machine-learning algorithms requires a large amount of data for training and learning. If there are not enough training data, the positioning error will be large, while a large amount of training data increases the complexity of the algorithm. Compared with the SAWKNN, WKNN, and KNN algorithms, the ARWKNN algorithm can significantly reduce the average positioning error while maintaining similar algorithm complexity, which will be discussed in the section on algorithm complexity analysis.
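A CDF comparison such as the one in Figure 9 can be produced from per-target positioning errors with an empirical CDF. The sketch below is a generic illustration and does not reproduce the data of this paper; the function names are hypothetical.

```python
import numpy as np

def empirical_cdf(errors):
    """Return sorted errors x and cumulative probabilities p.

    p[i] is the fraction of targets whose error is at most x[i].
    """
    x = np.sort(np.asarray(errors, dtype=float))
    p = np.arange(1, x.size + 1) / x.size
    return x, p

def fraction_within(errors, threshold):
    """Fraction of targets whose positioning error is at most threshold."""
    return float(np.mean(np.asarray(errors, dtype=float) <= threshold))
```

An algorithm whose CDF curve lies further to the left, i.e., reaches a given probability at a smaller error, has the better error distribution.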
When SNR = 20 dB, the average positioning errors of the ARWKNN, SAWKNN, WKNN, and KNN algorithms are analyzed for different values of $K_{max}$; in the WKNN and KNN algorithms, K is a fixed value, that is, K = $K_{max}$. The simulation results in 2-D and 3-D are shown in Figures 10 and 11, respectively. As can be seen from Figure 10, when $K_{max}$ is within 1 to 8, the optimal K for the WKNN algorithm in 2-D is 3 or 4, similar to the experimental results in most papers. This is consistent with the fact that the target is located with high probability in a minimum triangle or square formed by 3 or 4 fingerprint points. It can also be seen from Figure 10 that when $K_{max}$ is greater than 3, the average positioning error based on the ARWKNN algorithm is significantly lower than that of the KNN, WKNN, and SAWKNN algorithms. From Figure 11, it can be seen that as $K_{max}$ increases from 1 to 12, the average positioning error based on the ARWKNN algorithm decreases. Beyond $K_{max}$ = 8, the average positioning error is not significantly reduced as $K_{max}$ continues to increase. Therefore, a reasonable value of $K_{max}$ is 8.
Figure 11 also shows that the average positioning error of the KNN and WKNN algorithms is smallest when K_max = 8, which is consistent with the target being located, with high probability, inside the smallest cube formed by 8 fingerprint points. In addition, when K_max is greater than 6, the average positioning error of the ARWKNN algorithm is significantly lower than that of the KNN, WKNN, and SAWKNN algorithms, and the advantage of the ARWKNN algorithm becomes more obvious as K_max increases.
The average positioning errors of the ARWKNN, SAWKNN, WKNN, and KNN algorithms are also analyzed as the sampling interval S varies; the results in 2-D and 3-D are shown in Figures 12 and 13, respectively. It can be seen that as S decreases from 40 cm to 20 cm, in both 2-D and 3-D, the average positioning error of the ARWKNN algorithm is significantly lower than that of the KNN, WKNN, and SAWKNN algorithms, and the larger the S, the more obvious the advantage. As S decreases to 5 cm, the average positioning errors of the four algorithms tend to be the same.
The lower the value of S, the larger the number of fingerprint points N that must be acquired, and the more complex the algorithm becomes.
When SNR = 20 dB, to analyze the robustness of the algorithm, the fingerprints adopt a non-uniform distribution structure, i.e., the RSSI values in the fingerprint map are chosen randomly at different sampling ratios SR. The average positioning errors of the ARWKNN, SAWKNN, WKNN, and KNN algorithms are analyzed as the fingerprint sampling ratio SR varies; the results in 2-D and 3-D are shown in Figures 16 and 17, respectively. It can be seen that as SR increases from 50% to 100%, in both 2-D and 3-D, the average positioning error of the ARWKNN algorithm is significantly lower than that of the KNN, WKNN, and SAWKNN algorithms, and the larger the SR, the smaller the average positioning errors of all four algorithms. When SR = 50%, the average positioning errors of the four algorithms are analyzed over time t; the results in 2-D and 3-D are shown in Figures 18 and 19, respectively. It can be seen that as t increases from 1 to 50, in both 2-D and 3-D, the average positioning error of the ARWKNN algorithm is significantly lower than that of the KNN, WKNN, and SAWKNN algorithms. As can be seen from Figures 16-19, the ARWKNN algorithm has good robustness: even when the fingerprint sampling ratio is only 50%, it still achieves lower positioning errors.
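The non-uniform fingerprint structure described above can be illustrated with a short sketch. It assumes that a sampling ratio SR means keeping a random fraction SR of the fingerprint points, drawn without replacement; the function name `subsample_fingerprints`, the 10 x 10 grid, and the placeholder RSSI vectors are hypothetical and are not taken from the paper.

```python
import random

def subsample_fingerprints(fingerprints, sampling_ratio, seed=0):
    """Return a random subset of the fingerprint list, drawn without replacement.

    Assumption: a sampling ratio SR keeps round(SR * N) of the N fingerprints.
    """
    if not fingerprints:
        raise ValueError("fingerprint list must not be empty")
    if not 0.0 < sampling_ratio <= 1.0:
        raise ValueError("sampling_ratio must be in (0, 1]")
    n_keep = max(1, round(sampling_ratio * len(fingerprints)))
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    keep = sorted(rng.sample(range(len(fingerprints)), n_keep))
    return [fingerprints[i] for i in keep]

# Hypothetical 10 x 10 grid; each fingerprint is (position, RSSI vector).
grid = [((x, y), [0.0, 0.0, 0.0, 0.0]) for x in range(10) for y in range(10)]
half = subsample_fingerprints(grid, sampling_ratio=0.5)
print(len(half), "of", len(grid), "fingerprints kept")
```

Running a positioning algorithm on `half` instead of `grid` would correspond to the SR = 50% case, under the assumption stated above.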
The WKNN fingerprint positioning algorithm is based on the shortest RSSI physical distance between the fingerprint and the target position. Step 5 of the ARWKNN algorithm shows that the positioning error is affected by the weight of each fingerprint point, and this weight in turn depends on the distance metric. It is therefore necessary to analyze the impact of different distance metrics on the positioning error.
In addition to Euclidean distance (ED) and Manhattan distance (MD), there are other distance metrics [12,37], such as minimum maximum distance (MMD), squared Euclidean distance (SED), Chebyshev distance (CHD), squared-chord distance (SCD), Wave Hedges distance (WHD), Lorentzian distance (LD), Matusita distance (MTD), squared chi-squared distance (SCSD), Canberra distance (CAD), and Clark distance (CLD).
If the same γ_th value is used for different distance metrics, the positioning error of the SAWKNN algorithm is greatly affected, so this section does not consider the SAWKNN algorithm. When SNR = 20 dB, we investigated 30 distance metrics and selected the 12 with the best performance; the results are shown in Tables 4 and 5. It can be seen from Tables 4 and 5 that when the KNN algorithm is used for positioning, the ED and SED metrics produce the minimum average positioning error in both 2-D and 3-D. For the WKNN algorithm in 2-D, our results are similar to the experimental results of Alam et al. [12]: the SCD and SCSD metrics produce the minimum average positioning error. In 3-D, however, the SED metric produces the minimum average positioning error. When the ARWKNN algorithm is used for positioning, the CLD metric produces the minimum average positioning error in 2-D and the MMD metric produces the minimum average positioning error in 3-D. As far as the authors know, this is the first work to report the impact of the CLD and MMD metrics on the positioning error of a fingerprint positioning algorithm. It can also be seen from Table 4 that the smallest average positioning errors of the KNN, WKNN, and ARWKNN algorithms are 4.84 cm, 2.03 cm, and 1.45 cm, respectively.
Compared with the KNN and WKNN algorithms, in 2-D, the ARWKNN algorithm reduces the minimum average positioning error by 70.04% and 28.57%, respectively. It can also be seen from Table 5 that the smallest average positioning errors of the KNN, WKNN, and ARWKNN algorithms are 4.46 cm, 3.05 cm, and 2.18 cm, respectively. Compared with the KNN and WKNN algorithms, in 3-D, the ARWKNN algorithm reduces the minimum average positioning error by 51.12% and 28.52%, respectively. In both 2-D and 3-D, the average positioning errors of the ARWKNN algorithm proposed in this paper are smaller than those of the KNN and WKNN algorithms under all 12 distance metrics.
Figure 20 shows the cumulative distributions of positioning errors for the ED and CLD metrics with various S values. As can be seen from Figure 20, in 2-D, the CLD metric produces a smaller positioning error than the ED metric. In addition, the positioning error of the ED metric increases faster than that of the CLD metric as S becomes larger. Figure 21 shows the cumulative distributions of positioning errors for the ED and MMD metrics with various S values. As can be seen from Figure 21, in 3-D, the MMD metric produces a smaller positioning error than the ED metric. In addition, the positioning error of the ED metric increases faster than that of the MMD metric as S becomes larger. ED is a commonly used distance metric; however, as can be seen from Tables 4 and 5, it is not the most accurate metric for calculating weights when the WKNN and ARWKNN algorithms are used for positioning.
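The percentage reductions quoted for Tables 4 and 5 can be checked directly from the best average errors reported in the text. The short sketch below assumes the reduction is computed as (baseline - ARWKNN) / baseline; the helper name `relative_reduction` is introduced here only for illustration.

```python
def relative_reduction(baseline_cm, proposed_cm):
    """Percentage by which proposed_cm is smaller than baseline_cm."""
    return 100.0 * (baseline_cm - proposed_cm) / baseline_cm

# Best average positioning errors (cm) as quoted in the text for Tables 4 and 5.
best_errors = {
    "2-D": {"KNN": 4.84, "WKNN": 2.03, "ARWKNN": 1.45},
    "3-D": {"KNN": 4.46, "WKNN": 3.05, "ARWKNN": 2.18},
}

for dim, errs in best_errors.items():
    for baseline in ("KNN", "WKNN"):
        pct = relative_reduction(errs[baseline], errs["ARWKNN"])
        print(f"{dim}: ARWKNN vs {baseline}: {pct:.2f}% lower")
```

The four values printed agree with the 70.04%, 28.57%, 51.12%, and 28.52% figures stated in the text.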
The cumulative distributions of the optimal K in 2-D and 3-D are shown in Figures 22 and 23, respectively. As can be seen from Figures 22 and 23, the optimal K values differ among the 200 targets, and the cumulative distributions of the optimal K also differ among the five distance metrics. The cumulative distributions of the optimal K for ED, MMD, and CLD are very close to one another, as are those for SED and SCD.
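For reference, several of the distance metrics compared in this section can be written compactly in code. The forms below are the commonly used textbook definitions and are an assumption here; they may differ in detail from the definitions used in the paper. MMD is left out because its exact form is not assumed. The RSSI vectors are taken to have strictly positive entries so that the ratio-based metrics are well defined, and the example vectors are made up.

```python
import math

# Assumed textbook forms; x and y are RSSI vectors with strictly positive entries.
def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def squared_chord(x, y):
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(x, y))

def matusita(x, y):
    return math.sqrt(squared_chord(x, y))

def wave_hedges(x, y):
    return sum(abs(a - b) / max(a, b) for a, b in zip(x, y))

def lorentzian(x, y):
    return sum(math.log(1.0 + abs(a - b)) for a, b in zip(x, y))

def squared_chi_squared(x, y):
    return sum((a - b) ** 2 / (a + b) for a, b in zip(x, y))

def canberra(x, y):
    return sum(abs(a - b) / (a + b) for a, b in zip(x, y))

def clark(x, y):
    return math.sqrt(sum((abs(a - b) / (a + b)) ** 2 for a, b in zip(x, y)))

METRICS = {
    "SED": squared_euclidean, "CHD": chebyshev, "SCD": squared_chord,
    "MTD": matusita, "WHD": wave_hedges, "LD": lorentzian,
    "SCSD": squared_chi_squared, "CAD": canberra, "CLD": clark,
}

# Made-up RSSI vectors, used only to exercise the functions.
target_rssi = [1.0, 4.0]
fingerprint_rssi = [4.0, 4.0]
for name, metric in METRICS.items():
    print(f"{name}: {metric(target_rssi, fingerprint_rssi):.4f}")
```

Because CAD and CLD normalize each component by the sum of the two RSSI values, they weight relative rather than absolute differences, which is one plausible reason their ranking of neighbors can differ from that of ED.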
The complexity of the KNN and WKNN algorithms mainly depends on the size of N and on the sorting operation of Step 2 in Algorithm 1. Compared with the KNN and WKNN algorithms, the ARWKNN algorithm additionally performs the loop of Step 3 and the min function of Step 4 in Algorithm 1. The time complexity of Step 3 plus Step 4 depends on the size of K_max. Since K_max is much smaller than N, that is, the number of neighboring fingerprint points is much smaller than the total number of fingerprint points, the complexity of the ARWKNN algorithm is similar to that of the KNN and WKNN algorithms. In 3-D, with K_max = 8, the average computing time for 200 targets is analyzed as S varies; the results are shown in Table 6. It can be seen that, for the same S, the average computing times of the KNN, WKNN, SAWKNN, and ARWKNN algorithms are almost the same. It can also be seen from Figure 13 that as S decreases, the average positioning errors of the four algorithms decrease, but the complexity of the algorithms increases. Therefore, depending on the actual situation, a trade-off between power consumption and positioning error can be made by selecting an appropriate S.
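The complexity argument above can be made concrete with a minimal sketch of a residual-driven choice of K. This is an illustration under stated assumptions, not the paper's Algorithm 1: it assumes inverse-distance weights, Manhattan distance in RSSI space, and a residual defined as the RSSI distance between the measurement and the RSSI predicted at the candidate position. The callable `predict_rssi`, the function names, and the toy channel model `toy_rssi` are all hypothetical.

```python
def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def wknn_position(neighbors):
    """Inverse-distance weighted mean of positions; neighbors = [(dist, pos), ...]."""
    eps = 1e-12  # avoids division by zero when a fingerprint matches exactly
    weights = [1.0 / (d + eps) for d, _ in neighbors]
    total = sum(weights)
    dim = len(neighbors[0][1])
    return tuple(
        sum(w * pos[i] for w, (_, pos) in zip(weights, neighbors)) / total
        for i in range(dim)
    )

def adaptive_k_position(target_rssi, fingerprints, predict_rssi, k_max):
    # Sort all N fingerprints by RSSI distance: the dominant cost, O(N log N).
    ranked = sorted((manhattan(target_rssi, rssi), pos) for pos, rssi in fingerprints)
    best = None
    # Try K = 1..k_max and keep the K with the smallest residual (assumed
    # definition); this loop touches at most k_max neighbors per iteration.
    for k in range(1, min(k_max, len(ranked)) + 1):
        pos = wknn_position(ranked[:k])
        residual = manhattan(target_rssi, predict_rssi(pos))
        if best is None or residual < best[0]:
            best = (residual, k, pos)
    return best[1], best[2]

# Toy channel model (NOT a VLC Lambertian model): four emitters at the corners.
LEDS = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]

def toy_rssi(pos):
    return [1.0 / (1.0 + (pos[0] - lx) ** 2 + (pos[1] - ly) ** 2) for lx, ly in LEDS]

fingerprints = [
    ((float(x), float(y)), toy_rssi((float(x), float(y))))
    for x in range(5) for y in range(5)
]
k, estimate = adaptive_k_position(toy_rssi((1.3, 2.6)), fingerprints, toy_rssi, k_max=8)
print("chosen K:", k, "estimate:", estimate)
```

In this sketch the extra work relative to a fixed-K WKNN is the loop over at most `k_max` candidate values, which is small next to sorting all N fingerprints whenever `k_max` is much smaller than N.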

Conclusions
At present, the classical KNN and WKNN algorithms are mainly aimed at 2-D positioning and assume that the height of the target above the floor is known, although it is not feasible to know this height in advance. The linear least-squares method and the Newton-Raphson method are suitable for solving 2-D coordinates. Solving for 3-D coordinates is a non-convex optimization problem that can easily fall into a locally optimal solution. In this paper, the shortcomings of the fingerprint positioning algorithm and the trilateration method are discussed, and an adaptive residual weighted K-nearest neighbor fingerprint positioning algorithm is proposed. Compared with the fingerprint positioning algorithm based on compressed sensing, the range-based WKNN algorithm can achieve high-precision positioning under a low-density LED layout. Compared with the RF [14], ELM [16], ANN [17], and GI-LS [11] machine-learning algorithms, fingerprint positioning based on the ARWKNN algorithm has both lower complexity and lower positioning error. The impact of the LED modulation bandwidth, the LED transmit power, the signal-to-noise ratio, the maximum number of neighboring fingerprints, the sampling interval, the number of LEDs, the sampling ratio, and the distance metric on the positioning error is analyzed in detail. The distribution of the optimal K and the complexity of the algorithm are also analyzed in detail. Simulation results show that the ARWKNN algorithm based on the CLD and MMD metrics produces the smallest average positioning error in 2-D and 3-D, respectively. Compared with the SAWKNN [19], WKNN [12], and KNN [15,25] algorithms, the ARWKNN algorithm significantly reduces the average positioning error while maintaining similar algorithm complexity.
Owing to lighting requirements and LoS communication, the typical SNR of indoor visible light communication is relatively high. However, the RF, ELM, ANN, GI-LS, SAWKNN, WKNN, KNN, and ARWKNN algorithms all have higher positioning errors under low-SNR conditions. Our next step is to design an efficient noise-filtering algorithm to achieve higher positioning accuracy under low-SNR conditions. LED communication can achieve not only high-precision positioning but also high data rates. In future work, we will consider using modified OFDM to achieve high-precision positioning with high modulation bandwidth, and we will validate the algorithm in a real scenario.