Acoustic TDOA Measurement and Accurate Indoor Positioning for Smartphone

Global navigation satellite signals work well in open outdoor areas. However, because these signals are weak, continuous and reliable positioning indoors is challenging. In this paper, we develop a hybrid system that combines radio signals and acoustic signals to achieve decimeter-level indoor positioning. Specifically, the acoustic transmitters are synchronized and use different codes, and our decoding scheme requires only a simple cross-correlation operation without time-frequency analysis. Secondly, acoustic signals are reflected by glass, walls and other obstacles in the indoor environment, which seriously affects the accuracy of time difference of arrival (TDOA) measurement. We therefore develop a robust first path detection algorithm to obtain reliable TDOA measurements. Finally, we combine the maximum likelihood (ML) algorithm with the proposed TDOA measurement method to obtain the location of the smartphone. We carried out static positioning experiments for smartphones in two scenes. The experimental results show that the average positioning error of the system is less than 0.5 m. Our system has the following advantages: (1) smartphone access; (2) an unlimited number of users; (3) easily deployed acoustic nodes; (4) decimeter-level positioning accuracy.


Introduction
Indoor positioning technology underpins emergency safety, crowd monitoring, precision marketing, entertainment and daily life, and other human social needs [1]. At present, global navigation satellite systems (GNSS) can provide accurate positioning in open outdoor areas. However, mobile devices can hardly receive GNSS signals indoors because buildings block them, so GNSS alone cannot provide effective indoor positioning [2]. Common indoor positioning technologies include Wi-Fi [3], ultra-wide band (UWB) [4], optical, and geomagnetic positioning [5,6]. Each has its own advantages and disadvantages, summarized as follows. Wi-Fi positioning is mainly based on received signal strength (RSS): a fingerprint is established from signal strength characteristics [7]. This method is affected by the complex indoor topological environment. In recent years, Wi-Fi round trip time (RTT) has attracted the attention of scholars. In Wi-Fi RTT positioning [8], the user obtains distance information by measuring the round trip time between the mobile phone and the router, and the mobile device can then be located by trilateration. Wi-Fi RTT requires mobile phone manufacturers to expose the underlying Wi-Fi signal information to users. However, Wi-Fi signals are vulnerable to indoor multipath [9]. Wi-Fi RTT positioning [10] also has the following application limitations: user capacity is limited, and personal information security cannot be guaranteed. At present, only a few mobile phones support Wi-Fi RTT. Wi-Fi positioning also suffers from co-channel interference [11]. A narrow band pulse

Basic Acoustic Positioning System
The acoustic positioning system is composed of a wireless scheduler, acoustic nodes that act as signal transmitters, and a smartphone that acts as the signal receiver whose location is to be estimated. The wireless scheduler synchronizes all acoustic nodes on a unified time axis. The acoustic nodes transmit signals, and the microphone in the smartphone receives them. The hardware architecture of the acoustic node and the scheduler is shown in Figures 1 and 2. The acoustic node is composed of a wireless receiver, an FPGA, a digital-to-analog converter (DAC) and a speaker. The scheduler is mainly composed of an FPGA, a wireless transmitter, and a key module. When the wireless receiver module receives the signaling sent by the scheduler, the FPGA in the acoustic node starts the interrupt arbitration mechanism and drives the speaker to send the signal. The benefits of our system are the following: (1) The acoustic nodes can be easily synchronized to within 0.5 ms by the low-cost wireless scheduler. Such synchronization supports sub-meter positioning accuracy in the acoustic positioning system. (2) Because the acoustic signals are broadcast, there is no limit on the number of receivers, which is beneficial for large numbers of users in large-scale positioning scenarios.

To describe the hardware in more detail, the DAC module converts the digital signal into the analog acoustic signal and drives the speaker membrane, whose vibrations are transmitted through the air. The key module is reserved to start or stop the system. The FPGA in the acoustic node has two functions: (1) storing the signal without losing data; (2) controlling and managing the different modules.
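As a rough sanity check of benefit (1), a timing offset between nodes maps to a range-difference error through the speed of sound. The worked figure below takes c = 340 m/s (the value the paper quotes later for 15 °C) and treats 0.5 ms as a worst-case offset, which is an assumption about how the stated bound applies:

```latex
\Delta d_{\mathrm{sync}} = c \cdot \Delta t_{\mathrm{sync}}
  = 340\ \mathrm{m/s} \times 0.5 \times 10^{-3}\ \mathrm{s}
  = 0.17\ \mathrm{m}
```

Under these assumptions the synchronization error alone contributes less than 0.2 m to a distance difference, which is consistent with the sub-meter target.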

Signal Design
It has been noted that the chirp signal has good autocorrelation characteristics, which improve ranging resolution and reception sensitivity. The chirp signal can also resist a certain degree of multipath fading, and multiple reflected acoustic signals can be distinguished with an appropriate signal processing model. Therefore, we choose the chirp signal as the signal emitted by the acoustic node. The chirp signal is defined as follows:

$$s_i(t) = A\cos\left(2\pi\left(f_0 t + \frac{k}{2T}t^2\right)\right), \quad 0 \le t \le T \tag{1}$$

where $f_0$ is the starting frequency, $f_c$ is the cut-off frequency, $A$ is the amplitude of the signal, and $T$ is the duration of the chirp signal. Let $k = f_c - f_0$. If $k > 0$, the chirp signal is an up-chirp. If $k < 0$, the chirp signal is a down-chirp.

In order to distinguish the four acoustic nodes, we design four chirp signals with different frequency bands. In order to reduce the impact of environmental noise, all acoustic nodes transmit signals with frequencies above 15 kHz. The signal transmitted by acoustic node 1 ends at 15 kHz, the signal transmitted by acoustic node 2 starts at 16.5 kHz, the signal transmitted by acoustic node 3 sweeps from 19 kHz to 18 kHz, and the signal transmitted by acoustic node 4 sweeps from 19.5 kHz to 20.5 kHz.

Figure 3 shows the time-frequency distribution of the signal corresponding to each acoustic node. We design the chirp signals according to the following principles: (1) Up-chirps and down-chirps alternate in adjacent frequency bands. For example, if s1 is an up-chirp, s2 should be a down-chirp, and if s1 is a down-chirp, s2 should be an up-chirp. (2) Adjacent coding frequency bands are separated by 500 Hz. (3) Each signal lasts 10 ms to avoid complex calculations. The encoding method in Figure 3 is advantageous for decoding, because we only need cross-correlation to detect the received signals. To illustrate the effect of cross-correlation on decoding, Figure 4 shows the sequentially received signal s(t), which includes s1, s2, s3 and s4.

We correlate the prior template signals (s1, s2, s3, and s4) with the received signal s(t). After cross-correlation, the signals are decoded and their time delays can be estimated as well, as shown in Figure 5.
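The signal design and correlation-based decoding described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the 48 kHz sampling rate is an assumption, the band edges for nodes 1 and 2 (16 kHz to 15 kHz and 16.5 kHz to 17.5 kHz) are inferred from the stated design rules rather than read from the text, and the correlation is computed directly in the time domain for brevity.

```python
import math

FS = 48_000   # assumed sampling rate in Hz (not stated in the text)
T = 0.010     # chirp duration from the text: 10 ms

# (start, stop) frequencies in Hz. Nodes 3 and 4 follow the text; the values
# for nodes 1 and 2 are assumptions inferred from the design principles.
BANDS = {1: (16_000, 15_000), 2: (16_500, 17_500),
         3: (19_000, 18_000), 4: (19_500, 20_500)}


def chirp(f0, fc, duration=T, fs=FS, amplitude=1.0):
    """Linear chirp A*cos(2*pi*(f0*t + k*t^2/(2T))) with k = fc - f0."""
    k = fc - f0
    n = int(round(duration * fs))
    return [amplitude * math.cos(
                2 * math.pi * (f0 * (i / fs) + k * (i / fs) ** 2 / (2 * duration)))
            for i in range(n)]


def xcorr_abs(received, template):
    """Absolute value of the sliding cross-correlation (direct form)."""
    m = len(template)
    return [abs(sum(received[lag + j] * template[j] for j in range(m)))
            for lag in range(len(received) - m + 1)]


def decode(received, templates):
    """Return {node: (lag_of_peak, peak_value)} for every template."""
    out = {}
    for node, tpl in templates.items():
        r = xcorr_abs(received, tpl)
        lag = max(range(len(r)), key=r.__getitem__)
        out[node] = (lag, r[lag])
    return out


templates = {node: chirp(f0, fc) for node, (f0, fc) in BANDS.items()}

# Toy received buffer: node 3's chirp starts at sample 300, silence elsewhere.
received = [0.0] * 1200
for j, v in enumerate(templates[3]):
    received[300 + j] += v

result = decode(received, templates)
best_node = max(result, key=lambda node: result[node][1])
```

In this toy case the template of node 3 gives by far the largest peak, and the lag of that peak is the delay of the chirp in samples. For buffers as long as the 2 s segments used in the paper, an FFT-based correlation would be the practical choice.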
In practice, the acoustic nodes transmit signals in turn to avoid signal aliasing. All acoustic nodes in this paper are scheduled by the scheduler. The scheduler transmits a wireless trigger signal every two seconds, and when the acoustic nodes receive the trigger signal, they transmit their chirp signals in turn at 200 ms intervals. The specific scheduling diagram is shown in Figure 6.
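Because the nodes fire in fixed slots, the known slot offsets have to be removed before arrival times from different nodes can be compared. The sketch below shows that bookkeeping under stated assumptions: node 1 is taken to fire at the trigger and node i to follow (i - 1) slots later, and the distances and phone clock offset are made-up illustrative values. The time differences it returns are the TDOAs defined in the next section.

```python
SLOT = 0.200            # s between consecutive node transmissions (from the text)
TRIGGER_PERIOD = 2.0    # s between wireless triggers (from the text)
SPEED_OF_SOUND = 340.0  # m/s, the value the paper quotes for 15 degrees Celsius


def emission_offsets(num_nodes=4, slot=SLOT):
    """Emission time of each node relative to the trigger.

    Assumes node 1 fires at the trigger and node i fires (i - 1) slots later.
    """
    return {i: (i - 1) * slot for i in range(1, num_nodes + 1)}


def tdoa_from_arrivals(arrivals, slot=SLOT):
    """Remove the slot offsets: TDOA_i = (a_i - a_1) - (i - 1) * slot."""
    a1 = arrivals[1]
    return {i: (a - a1) - (i - 1) * slot
            for i, a in arrivals.items() if i != 1}


# Hypothetical node-to-phone distances in metres.
distances = {1: 2.0, 2: 3.7, 3: 5.4, 4: 3.7}
# The phone clock is not synchronized to the nodes; this offset is unknown
# to the phone and cancels when arrival times are differenced.
phone_clock_offset = 12.345

offsets = emission_offsets()
arrivals = {i: phone_clock_offset + offsets[i] + distances[i] / SPEED_OF_SOUND
            for i in distances}
tdoas = tdoa_from_arrivals(arrivals)
```

With these values the three results are 5 ms, 10 ms and 5 ms, independent of the phone clock offset, which is why only the nodes need to share a time axis.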

Robust TDOA Measurement
TOA and TDOA are important parameters for trilateration in indoor positioning. The TOA of an acoustic signal is the time between the signal being sent by the acoustic node and being received by the microphone. At present, TOA-based acoustic positioning systems require time synchronization between transmitting nodes and receivers. Once TOA is converted into distance information, the target can be located by trilateration. TDOA is the difference between the travel times of acoustic signals sent from different sources to the target. TDOA measurement does not require synchronization between acoustic nodes and receivers, only synchronization among the acoustic nodes. Therefore, the microphone inside the smartphone can act as the receiver. Similarly, the TDOA can be converted into a distance difference, which is then used to estimate the target position.

The measurement accuracy of TOA and TDOA is the key to positioning accuracy. In practical applications, however, the signal is reflected by walls and the ground (the multipath effect), which degrades TOA and TDOA estimation. Reliable TDOA estimation is therefore an important part of this work. As shown in Figure 7, the maximum peak obtained by the correlation operation lags behind the first path because of the multipath effect, so distinguishing reflected signals from direct signals is also key to positioning accuracy. Since TDOA does not need clock synchronization between the mobile phone and the acoustic nodes, we use TDOA information to achieve indoor positioning for smartphones.

For four acoustic nodes, we can obtain four TOAs, and hence three TDOAs by the following formula:

$$TDOA_i = TOA_i - TOA_1, \quad i = 2, 3, 4 \tag{2}$$

According to Formula (2), the precondition for obtaining TDOA is to obtain the TOA of each signal. As mentioned above, acoustic signals are reflected by glass, walls and other obstacles in the environment, and distinguishing reflected signals from direct signals is one of the problems addressed in this work. To handle the multipath problem, we have developed a reliable TOA detection method. To obtain the TOA of the signal emitted by each acoustic node, we follow the process shown in Figure 8.
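To make the first-path idea concrete, the sketch below picks the earliest significant local peak before the global maximum of a correlation magnitude. It is a generic threshold-based illustration, not the authors' first path extraction algorithm; the function name, the `ratio` and `search_back` parameters, and the 48 kHz sampling rate are all hypothetical.

```python
FS = 48_000  # assumed sampling rate in Hz (not stated in the text)


def first_path_index(corr, ratio=0.5, search_back=200):
    """Index of the earliest local peak that reaches ratio * max(corr).

    Only the search_back samples before the global maximum are examined.
    If no earlier peak qualifies, the global maximum itself is returned.
    """
    peak = max(range(len(corr)), key=corr.__getitem__)
    threshold = ratio * corr[peak]
    start = max(1, peak - search_back)
    for i in range(start, peak):
        is_local_peak = corr[i] >= corr[i - 1] and corr[i] >= corr[i + 1]
        if corr[i] >= threshold and is_local_peak:
            return i
    return peak


# Toy correlation magnitude: a weaker direct path at sample 40 followed by a
# stronger reflection at sample 55, on top of a small noise floor.
corr = [0.05] * 100
corr[40] = 0.7
corr[55] = 1.0

idx = first_path_index(corr)
toa = idx / FS  # arrival time in seconds relative to the start of the buffer
```

Taking the global maximum here would report sample 55, which is 15 samples late (about 0.31 ms, or roughly 0.11 m at 340 m/s). The threshold trades missed weak direct paths against false early detections, so it would need tuning on real recordings.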
The effective acoustic data segment contains the signals of all acoustic nodes, and its length is 2 s. The whole process is divided into three steps, as shown in Figure 8.

In step 2, we use the cross-correlation operation to decode the signal and then extract it. The cross-correlation is calculated as follows:

$$R_i(t) = \left| F^{-1}\left[ F\big(y(t)\big) \cdot F^{*}\big(s_d(t)\big) \right] \right| \tag{3}$$

where $R_i(t)$ is the absolute value of the cross-correlation result, $s_d(t)$ is a prior template signal, $y(t)$ is the received signal, $F$ and $F^{-1}$ are the Fourier transform and its inverse transform, respectively, and $F^{*}$ is the complex conjugate of $F$. After the signal is decoded, it can be extracted. Rough extraction of the signal is as follows (Algorithm 1):

In the third step, we cross-correlate the signal extracted in the second step with the prior signal to obtain the correlation function $R_i(t)$. We develop a stable first path extraction method based on $R_i(t)$. The first path information can be converted into TOA information. The first path extraction algorithm is as follows (Algorithm 2):

(1) Set variable GD = 1, where GD represents the number of the group.
(2) for n = 1:70

When we have obtained four TOAs, we can obtain three TDOAs by Formula (2). Because the TOA detection process produces a few abnormal values, the TDOA measurements can also be affected. TDOA measurement is the precondition of smartphone positioning, so robust TDOA measurement is essential. We have therefore added an optimization step to suppress abnormal TDOA measurements. The algorithm is as follows (Algorithm 3):

Through the entire process described above, we obtain robust TDOA measurements. When the speed of sound is known, TDOA can be converted into a distance difference. With three or more valid TDOA observations, a smartphone can locate its own position. Figure 9 shows the fundamental mathematical model used in this article, where $S_i$ represents acoustic node $i$ and $R$ represents the internal microphone of the smartphone. TDOA needs to be converted into a distance difference for positioning. The formula is as follows:

Static Robust Positioning Algorithm Based on TDOA
$$\Delta d_i = c \cdot TDOA_i = d_i - d_1, \quad i = 2, 3, 4 \tag{4}$$

where $c$ is the speed of sound (340 m/s at 15 °C), $\Delta d_i$ is the distance difference, and $d_1$ is the distance from the smartphone to the reference node. In this paper, acoustic node 1 is the reference node. Suppose $r(x, y)$ is the actual coordinate of the smartphone and $s_i(x_i, y_i)$ is the coordinate of acoustic node $i$. The distance $d_i$ from the smartphone to acoustic node $i$ is calculated as follows:

$$d_i = \sqrt{(x - x_i)^2 + (y - y_i)^2} \tag{5}$$

Substituting Formula (5) into Formula (4) gives Formula (6):

$$\Delta d_i = \sqrt{(x - x_i)^2 + (y - y_i)^2} - \sqrt{(x - x_1)^2 + (y - y_1)^2} \tag{6}$$

All anchor positions are assumed to be known. Let $p_i$ be the actual measured distance difference, which is obtained from the TDOA measurement. Although we have optimized the TDOA detection algorithm, we cannot completely eliminate the TDOA measurement error, so $p_i$ also includes errors:

$$p_i = \Delta d_i + n_i \tag{7}$$

where $n_i$ is the measurement error. To relate the measured distance differences $p_i$ to the smartphone position $r(x, y)$, we combine Formulas (5) to (7) to obtain three equations, one for each of $i = 2, 3, 4$. The least squares criterion is then usually used to estimate the position $\hat{R}(\hat{x}, \hat{y})$:

$$(\hat{x}, \hat{y}) = \arg\min_{(x, y)} \sum_{i=2}^{4} \left( p_i - (d_i - d_1) \right)^2 \tag{8}$$

Solving Formula (8) is clearly a nonlinear problem. To observe the performance of positioning algorithms, we carried out a static positioning simulation. We choose two commonly used positioning algorithms for comparison: the Chan method and the ML method. The Chan method is a linear estimator that converts the nonlinear problem into a linear one. The ML method is a nonlinear estimator that solves the optimization problem directly. These two methods are representative.
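The nonlinear least-squares problem above can be solved iteratively. The sketch below uses plain Gauss-Newton iterations on the range-difference residuals; for independent Gaussian errors of equal variance this minimizer coincides with the ML estimate. The text does not say which optimizer the authors used, so the solver choice is an assumption, as are the anchor coordinates (corners of a 4.8 m square with the reference node at (0, 4.8)) and the starting guess at the room center. With this assumed layout the noise-free TDOAs for the point (0.8, 4.0) come out close to the 8.7 ms, 13.3 ms and 8.7 ms reported for test point C, but the actual node coordinates are not given in the text.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, the value the paper quotes for 15 degrees Celsius


def solve_tdoa(anchors, p, guess, iters=50, tol=1e-9):
    """Gauss-Newton estimate of (x, y) from range differences.

    anchors[0] is the reference node; p[i - 1] is the measured d_i - d_0.
    Raises ZeroDivisionError if the geometry makes the normal matrix singular.
    """
    x, y = guess
    for _ in range(iters):
        d = [math.hypot(x - ax, y - ay) for ax, ay in anchors]
        a11 = a12 = a22 = b1 = b2 = 0.0
        for i in range(1, len(anchors)):
            # Partial derivatives of (d_i - d_0) with respect to x and y.
            jx = (x - anchors[i][0]) / d[i] - (x - anchors[0][0]) / d[0]
            jy = (y - anchors[i][1]) / d[i] - (y - anchors[0][1]) / d[0]
            r = p[i - 1] - (d[i] - d[0])  # residual of measurement i
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * r
            b2 += jy * r
        det = a11 * a22 - a12 * a12
        dx = (a22 * b1 - a12 * b2) / det  # solve the 2x2 normal equations
        dy = (a11 * b2 - a12 * b1) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < tol:
            break
    return x, y


# Assumed anchor layout (reference node first) and an illustrative position.
anchors = [(0.0, 4.8), (0.0, 0.0), (4.8, 0.0), (4.8, 4.8)]
true_xy = (0.8, 4.0)
dist = [math.hypot(true_xy[0] - ax, true_xy[1] - ay) for ax, ay in anchors]
p = [dist[i] - dist[0] for i in range(1, 4)]          # noise-free differences
tdoa_ms = [1000.0 * v / SPEED_OF_SOUND for v in p]    # same values in ms

estimate = solve_tdoa(anchors, p, guess=(2.4, 2.4))
```

Gauss-Newton needs a reasonable starting point and can fail for poor geometry or a guess placed on an anchor; a damped variant such as Levenberg-Marquardt is a common, more forgiving alternative.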
We simulate and analyze these two positioning methods in two scenes to select a robust positioning algorithm. The first scene is 4.8 m × 4.8 m in size, and the second scene is 6.4 m × 7.9 m in size. We set the standard deviation of the noise to 1. The distribution of acoustic nodes and test points is shown in Figure 10.

From Figure 10, it can be seen that the positioning accuracy of the Chan method is lower than that of the ML method when the standard deviation of the noise is 1. Therefore, the Chan method is not adopted. With the standard deviation of the noise set to 1, we compile statistics on the positioning results of point A in scenario 1 and point B in scenario 2. Figure 11 shows the positioning results obtained with the ML algorithm.

From Figure 11, it can be seen that the positioning result is affected by Gaussian noise. The ML localization method exhibits high stability with the standard deviation of the noise set to 1. To illustrate this, we use the ML method to run a positioning simulation for 5 test points in scenario 1. The cumulative distribution function (CDF) diagram is shown in Figure 12. From Figure 12, the static positioning results of the ML method are relatively stable, so we chose ML for the static positioning experiment. The positioning error used in this article is defined as follows:

$$PE = \sqrt{(x - \hat{x})^2 + (y - \hat{y})^2} \tag{9}$$

where PE is the positioning error, $(x, y)$ is the true position, and $(\hat{x}, \hat{y})$ is the estimate.
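The positioning error and the CDF used to summarize it are straightforward to compute. The following sketch shows one way to do it; the helper names are hypothetical and the four position estimates are made-up numbers for illustration, not results from the paper.

```python
import math


def positioning_error(true_xy, est_xy):
    """Euclidean distance between the true and the estimated position."""
    return math.hypot(true_xy[0] - est_xy[0], true_xy[1] - est_xy[1])


def empirical_cdf(errors):
    """Sorted errors and the cumulative fraction of samples at each error."""
    xs = sorted(errors)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def fraction_below(errors, threshold):
    """Fraction of errors that do not exceed the threshold."""
    return sum(e <= threshold for e in errors) / len(errors)


# Made-up estimates around a true position of (0.8, 4.0), in metres.
true_xy = (0.8, 4.0)
estimates = [(1.1, 4.4), (0.8, 4.2), (0.5, 4.0), (0.8, 4.0)]
errors = [positioning_error(true_xy, e) for e in estimates]

xs, cdf = empirical_cdf(errors)
share = fraction_below(errors, 0.35)
```

Plotting `cdf` against `xs` gives a curve of the kind shown in a CDF diagram, and `fraction_below` reads off a single point of that curve.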

Experimental Parameters
In this paper, we chose two experimental scenarios. Scenario 1 is the lounge on the fourth floor of Luojia Laboratory, Wuhan University. Scenario 2 is the lobby on the second floor of Luojia Laboratory, Wuhan University. The distribution of acoustic nodes and test points in scenario 1 is shown in Figure 12a, and the distribution in scenario 2 is shown in Figure 13. In scenario 2, we divide the test points into five areas. The parameters of the experimental signal are designed in advance and are shown in Table 1. In both experimental scenarios, the acoustic nodes are deployed in a rectangular layout.

Experimental Results and Analysis
In the experiment, we used a smartphone to record 5 min of acoustic data at each test point. The recorded data were processed and analyzed on the MATLAB platform. Each effective acoustic data segment is 2 s long. To describe the results of each processing stage more clearly, we take test point C (0.8 m, 4 m) in scenario 1 as an example. First, we apply four different filters (FIR1, FIR2, FIR3 and FIR4) to the effective acoustic data segment (this process corresponds to step 1 of Figure 8), which yields four filtered signals, recorded as F1, F2, F3 and F4. To illustrate the effect of the four filters, we drew the time-frequency diagram.
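The text does not describe how FIR1 to FIR4 are designed. As one possible reading of step 1, the sketch below builds a Hamming-windowed-sinc band-pass filter for a single node band and checks its gain inside and outside that band. The 48 kHz sampling rate, the 401-tap length and the helper names are assumptions chosen for illustration; the 18 kHz to 19 kHz band is the one stated for acoustic node 3.

```python
import cmath
import math


def bandpass_fir(f_lo, f_hi, fs, num_taps=401):
    """Hamming-windowed-sinc band-pass FIR taps (num_taps must be odd)."""
    mid = (num_taps - 1) / 2

    def lowpass_tap(fc, n):
        # Impulse response of an ideal low-pass filter with cutoff fc in Hz.
        x = n - mid
        if x == 0:
            return 2 * fc / fs
        return math.sin(2 * math.pi * fc * x / fs) / (math.pi * x)

    taps = []
    for n in range(num_taps):
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        # Band-pass = low-pass(f_hi) minus low-pass(f_lo), then windowed.
        taps.append((lowpass_tap(f_hi, n) - lowpass_tap(f_lo, n)) * window)
    return taps


def gain(taps, freq, fs):
    """Magnitude of the filter's frequency response at freq (Hz)."""
    return abs(sum(h * cmath.exp(-2j * math.pi * freq * n / fs)
                   for n, h in enumerate(taps)))


FS = 48_000                              # assumed sampling rate in Hz
fir3 = bandpass_fir(18_000, 19_000, FS)  # band stated for acoustic node 3
in_band = gain(fir3, 18_500, FS)         # centre of the pass band
out_of_band = gain(fir3, 17_000, FS)     # inside the band assumed for node 2
```

The tap count matters here: adjacent bands are only 500 Hz apart, so the filter needs a transition width narrower than that, which at 48 kHz calls for a few hundred taps with a Hamming window.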
It can be seen from Figure 15 that it is not practical to use time-frequency characteristics to distinguish acoustic nodes, because the time-frequency characteristics are fuzzy. Nor is it practical to use them to obtain TDOA information, because the computation is complex. So, we correlate prior template signals with the filtered signals (s1 is cross-correlated with F1, s2 with F2, s3 with F3, and s4 with F4). The cross-correlation operation is illustrated in Figure 16.
Figure 16. Cross-correlation operation between prior signals and filtered signals: (a) s1 is cross-correlated with F1; (b) s2 is cross-correlated with F2; (c) s3 is cross-correlated with F3; (d) s4 is cross-correlated with F4.
From Figure 16, it can be seen that the four signals can be decoded by a simple cross-correlation operation. To ensure the robustness of decoding, we implement the decoding method using Algorithm 1. Algorithm 1 includes not only decoding but also signal extraction (this process corresponds to step 2 of Figure 8). After P1, P2, P3 and P4 are extracted using Algorithm 1, we use Algorithm 2 for TOA estimation (this process corresponds to step 3 of Figure 8). In the experiment, we obtained 100 results for each test point. To make the TDOA measurement results at test point C easier to see, we display the first 10 TDOA measurement results in Figure 17.
Figure 17. TDOA measurement results for the first ten measurements at test point C.
From Figure 17, it can be seen that we obtain three TDOAs each time. At test point C, the actual TDOAs are 8.7 ms, 13.3 ms, and 8.7 ms, respectively, and our TDOA measurement values are close to the actual values. The TDOAs differ because the distances between the smartphone and the four acoustic nodes differ.
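The decoding idea described above (correlate each prior template with its filtered signal, then difference the peak positions) can be sketched as follows. This is a noiseless toy under stated assumptions: the sampling rate, chirp bands, template length, delays, and function names are all hypothetical, and the sketch uses the plain strongest-peak rule rather than the paper's Algorithms 1 and 2.

```python
import math

FS = 48_000  # assumed sampling rate in Hz (hypothetical)


def chirp(f0, f1, n, fs=FS):
    """Linear chirp sweeping from f0 to f1 Hz over n samples."""
    rate = (f1 - f0) * fs / n  # sweep rate in Hz per second
    return [math.sin(2 * math.pi * (f0 * t + 0.5 * rate * t * t))
            for t in (i / fs for i in range(n))]


def cross_correlate(signal, template):
    """Sliding dot product of the template over the signal (valid lags only)."""
    m = len(template)
    return [sum(signal[lag + k] * template[k] for k in range(m))
            for lag in range(len(signal) - m + 1)]


def toa_samples(filtered, template):
    """Arrival time in samples, taken as the lag of the strongest peak."""
    corr = cross_correlate(filtered, template)
    return max(range(len(corr)), key=lambda i: abs(corr[i]))


def embed(template, delay, total):
    """Place the template at `delay` samples inside a silent buffer."""
    out = [0.0] * total
    out[delay:delay + len(template)] = template
    return out


# Hypothetical templates for two nodes in separate bands.
s1 = chirp(12_000, 14_000, 128)
s2 = chirp(15_000, 17_000, 128)
# Simulated filtered signals F1 and F2 with known arrival delays.
F1 = embed(s1, 300, 1200)
F2 = embed(s2, 718, 1200)

toa1 = toa_samples(F1, s1)
toa2 = toa_samples(F2, s2)
tdoa_ms = (toa2 - toa1) / FS * 1000.0
print(f"TOA1={toa1} samples, TOA2={toa2} samples, TDOA={tdoa_ms:.3f} ms")
```

With four nodes, the same differencing against one reference node gives three TDOAs per measurement, which matches the three values reported per measurement in the text.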
To further explain the improvement in TDOA measurement accuracy brought by Algorithms 2 and 3, we use the average of the absolute values of the TDOA measurement errors (simply referred to as the TDOA measurement average error). We assume that the speed of sound is 340 m/s, so a TDOA error of 1 ms corresponds to a distance error of 34 cm. The TDOA measurement average error histograms are shown in Figures 18 and 19.
From Figure 18, it can be seen that the TDOAs obtained using the cross-correlation algorithm always have large errors. At test point 1, TDOA3 has a large average measurement error. At test point 2, TDOA1 and TDOA2 have large average measurement errors. At test point 3, TDOA1 has a large average measurement error. At test point 4, TDOA3 has a large average measurement error. At test point 5, TDOA1 has a large average measurement error. Therefore, any of TDOA1, TDOA2, and TDOA3 may have a large error. When we use the ML algorithm for positioning with these measurements, divergence occurs (i.e., the positioning result tends to infinity). Hence, stable and reliable TDOA measurements are a prerequisite for positioning. In indoor spaces, the strongest path may lag behind the direct path. This phenomenon is the essential reason for the large TDOA measurement errors in Figure 18. We developed Algorithms 2 and 3 to reduce these errors.
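The effect of a strongest path that lags behind the direct path can be shown with a small synthetic example. The sketch below contrasts the strongest-peak rule with a generic threshold-based first-path search. It is not the paper's Algorithm 2 or 3. The threshold ratio, search window, sampling rate, and correlation values are assumptions chosen only for illustration.

```python
FS = 48_000            # assumed sampling rate in Hz (hypothetical)
SPEED_OF_SOUND = 340.0  # m/s, as assumed in the text


def strongest_peak(corr):
    """Index of the largest correlation magnitude."""
    return max(range(len(corr)), key=lambda i: abs(corr[i]))


def first_path(corr, ratio=0.5, window=600):
    """Earliest local maximum within `window` samples before the strongest
    peak whose magnitude reaches `ratio` times the strongest peak.
    Falls back to the strongest peak if none is found."""
    mag = [abs(c) for c in corr]
    peak = strongest_peak(corr)
    threshold = ratio * mag[peak]
    for i in range(max(1, peak - window), peak):
        if mag[i] >= threshold and mag[i] >= mag[i - 1] and mag[i] >= mag[i + 1]:
            return i
    return peak


# Synthetic correlation output: a weaker direct path at sample 200
# and a stronger reflection at sample 500.
corr = [0.0] * 1000
corr[200] = 0.6
corr[500] = 1.0

bias_samples = strongest_peak(corr) - first_path(corr)
bias_ms = bias_samples / FS * 1000.0
bias_m = bias_ms / 1000.0 * SPEED_OF_SOUND
print(f"strongest-peak bias: {bias_ms:.2f} ms, about {bias_m:.3f} m")
```

In this toy case the strongest-peak rule locks onto the reflection, so the arrival time is late by several milliseconds, which is the kind of error that the text attributes to plain cross-correlation.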
Figure 19. TDOA measurement average error in scenario 1 (using Algorithms 2 and 3).
As can be seen from Figure 19, the TDOA measurement average error has decreased significantly compared to Figure 18. At test point 1, the largest average error is that of TDOA2 (0.6 ms). At test point 2, it is that of TDOA3 (1.4 ms). At test point 3, it is that of TDOA3 (1.2 ms). At test point 4, it is that of TDOA1 (1.5 ms). At test point 5, it is that of TDOA3 (0.56 ms). All TDOA measurement average errors are less than 1.6 ms (corresponding to a distance difference error of 54.4 cm), which provides a prerequisite for achieving decimeter-level positioning in scenario 1. When we use the ML algorithm for positioning, the results will be more stable. To further illustrate the better stability of our TDOA measurement method, we display the data in Figures 18 and 19 as line charts.
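As a worked check of the conversion used above, with the assumed sound speed c = 340 m/s and the symbols Δt (TDOA error) and Δd (distance difference error) introduced here only for illustration:

```latex
\Delta d = c\,\Delta t
\quad\Rightarrow\quad
\Delta d\big|_{\Delta t = 1\,\mathrm{ms}} = 340\,\mathrm{m/s}\times 10^{-3}\,\mathrm{s} = 0.34\,\mathrm{m},
\qquad
\Delta d\big|_{\Delta t = 1.6\,\mathrm{ms}} = 340\,\mathrm{m/s}\times 1.6\times 10^{-3}\,\mathrm{s} = 0.544\,\mathrm{m} = 54.4\,\mathrm{cm}.
```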
As can be seen from Figure 20, the proposed method has better stability for TDOA measurement. In Figure 20a, the measurement average error of TDOA1 varies between 0.2 ms and 1.6 ms, that of TDOA2 between 0.2 ms and 1 ms, and that of TDOA3 between 0 ms and 1.5 ms. In Figure 20b, the measurement average error of TDOA1 varies between 0 ms and 23 ms, that of TDOA2 between 0 ms and 23 ms, and that of TDOA3 between 0 ms and 12 ms. So, the proposed method shows good stability and, at the same time, a relatively low measurement error in all TDOA measurements. After we obtain the TDOAs, we perform localization using the ML algorithm. We refer to the floor tiles to obtain an approximate true value of the smartphone's position. Then we calculate the CDF for the two methods (Proposed-ML and Cross-correlation-ML), as shown in Figure 21.
In scenario 1, we found that the localization results of the Cross-correlation-ML method at test points 2 and 4 tend toward infinity. Therefore, Figure 22 does not show the Cross-correlation-ML localization results for test points 2 and 4. With the algorithm proposed in this paper, high-precision localization results were obtained in the localization experiment in scenario 1. Each test point in scenario 1 was located 100 times, and we used box plots to display the localization results obtained with the proposed algorithm.
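The localization step can be sketched under a common simplifying assumption: if the TDOA errors are independent, zero-mean Gaussian with equal variance, the maximum likelihood estimate coincides with a nonlinear least-squares fit of the range differences. The sketch below solves that fit with Gauss-Newton iterations. The solver choice, node coordinates, starting point, and function name are assumptions made for illustration and are not taken from the paper. Only the test point (0.8 m, 4 m) comes from the text.

```python
import math


def solve_tdoa_ml(nodes, range_diffs, start, iters=20, tol=1e-12):
    """Gauss-Newton least squares for 2D TDOA positioning.

    nodes[0] is the reference node; range_diffs[i] is the measured
    (distance to nodes[i + 1]) minus (distance to nodes[0]), in meters.
    """
    x, y = start
    ax0, ay0 = nodes[0]
    for _ in range(iters):
        r0 = math.hypot(x - ax0, y - ay0)
        a = b = c = g0 = g1 = 0.0  # entries of J^T J and J^T r
        for (ax, ay), d in zip(nodes[1:], range_diffs):
            ri = math.hypot(x - ax, y - ay)
            res = (ri - r0) - d
            jx = (x - ax) / ri - (x - ax0) / r0
            jy = (y - ay) / ri - (y - ay0) / r0
            a += jx * jx
            b += jx * jy
            c += jy * jy
            g0 += jx * res
            g1 += jy * res
        det = a * c - b * b
        if abs(det) < 1e-15:  # degenerate geometry, stop iterating
            break
        dx = -(c * g0 - b * g1) / det
        dy = -(a * g1 - b * g0) / det
        x += dx
        y += dy
        if dx * dx + dy * dy < tol:
            break
    return x, y


# Hypothetical rectangular deployment (meters) and a noiseless example.
nodes = [(0.0, 0.0), (6.0, 0.0), (6.0, 8.0), (0.0, 8.0)]
true_pos = (0.8, 4.0)
dist = [math.hypot(true_pos[0] - ax, true_pos[1] - ay) for ax, ay in nodes]
range_diffs = [d - dist[0] for d in dist[1:]]  # equals TDOA times 340 m/s

est_x, est_y = solve_tdoa_ml(nodes, range_diffs, start=(3.0, 4.0))
print(f"estimate: ({est_x:.4f}, {est_y:.4f})")
```

An iterative solver of this kind depends on the quality of its inputs. Grossly biased range differences can keep it from settling, which is consistent with the divergence reported for Cross-correlation-ML.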
As can be seen from Figure 22, the Proposed-ML algorithm brings the accuracy of the indoor acoustic positioning system to the decimeter level in scenario 1. In the preparatory work, the coordinates of the acoustic nodes and test points were determined using computer software and calibrated with a plumb line. In scenario 1, five test points were used, each with known coordinates. Measurements at these test points were triggered every two seconds by the scheduler, and each point was tested 100 times. The positioning error was evaluated using Formula (9). To further verify the accuracy of the system, data were collected at 20 different test points in scenario 2. Each test point was tested 100 times, and the average positioning errors of the 20 test points are shown in Figure 23.
According to Figure 23, the proposed method is still applicable to scenario 2. The average positioning error of the test points is 0.36 m in Region 1, 0.34 m in Region 2, 0.32 m in Region 3, 0.40 m in Region 4, and 0.29 m in Region 5. Therefore, sub-meter-level positioning accuracy can still be achieved in scenario 2. Our system and method provide a good solution for indoor positioning.
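The evaluation steps above can be illustrated with a short sketch. Formula (9) is not reproduced in this section, so the Euclidean distance between the estimated and reference positions is used here as an assumed stand-in. The sample estimates and function names are invented for illustration. Only the five regional averages are taken from the text.

```python
import math
import statistics


def positioning_error(estimate, truth):
    """Euclidean distance in meters (assumed form of the error metric)."""
    return math.hypot(estimate[0] - truth[0], estimate[1] - truth[1])


def empirical_cdf(errors):
    """Sorted (error, cumulative fraction) pairs."""
    xs = sorted(errors)
    return [(x, (i + 1) / len(xs)) for i, x in enumerate(xs)]


def fraction_within(errors, bound):
    """Share of errors that do not exceed `bound`."""
    return sum(e <= bound for e in errors) / len(errors)


# Hypothetical position estimates around test point C (0.8 m, 4 m).
truth = (0.8, 4.0)
estimates = [(0.9, 4.1), (0.7, 3.8), (1.1, 4.4), (0.8, 4.0)]
errors = [positioning_error(e, truth) for e in estimates]
cdf = empirical_cdf(errors)
share = fraction_within(errors, 0.3)

# Regional average errors reported for scenario 2 (meters).
region_means = [0.36, 0.34, 0.32, 0.40, 0.29]
mean_of_regions = statistics.fmean(region_means)
print(f"share within 0.3 m: {share:.2f}, mean of regional averages: {mean_of_regions:.3f} m")
```

The mean of the five regional averages is about 0.342 m. It equals the overall average over all test points only if every region contains the same number of test points, which this section does not state.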

Conclusions
This article describes a smartphone positioning system based on acoustic localization. The encoding scheme we designed is simple, and decoding only requires simple cross-correlation operations. The signal frequency designed in this article is much higher than the frequency of environmental noise. The accuracy and stability of TDOA measurement are greatly affected by multipath effects, and traditional cross-correlation algorithms cannot solve this problem. Building on cross-correlation algorithms, we developed Algorithms 1-3, with which we not only decoded the acoustic node signals but also obtained stable and high-precision TDOA measurement results. For positioning, we chose the maximum likelihood algorithm as the basic positioning algorithm. Static positioning experiments were conducted in two scenarios, achieving decimeter-level average positioning accuracy.
In the future, we still need to solve the following problems: (1) Large-scene smartphone positioning: our acoustic nodes are easy to deploy, so positioning in large indoor spaces is feasible. However, large indoor spaces suffer from the near-far effect. We plan to overcome this problem using a normalization method. (2) Dynamic positioning: the acoustic signal is susceptible to Doppler effects. We plan to address this with methods from the field of communication, such as carrier frequency offset compensation. (3) Switching between dynamic positioning and static positioning: when positioning is performed, the user may be stationary or moving. In the moving state, we can use an extended Kalman filter to improve positioning accuracy. We plan to use TOA information to detect the movement distance and determine whether the user is stationary. (4) Adaptive extraction of valid acoustic data segments: this article does not study adaptive extraction. However, signals from acoustic nodes can be encoded and decoded, which makes adaptive extraction possible. (5) Smartphone outside the rectangle of four nodes: when the smartphone is outside the area enclosed by the acoustic nodes, fingerprint positioning can be used. In addition, increasing the number of acoustic nodes and using time-division, space-division, and code-division technologies can also achieve localization.
Author Contributions: B.C. contributed the analysis of experimental data, the writing of manuscript and the programming design. J.W. did reviewing, and experimental assistance. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: Not applicable.