Positioning Energy-Neutral Devices: Technological Status and Hybrid RF-Acoustic Experiments

Abstract: The digital transformation is driving the uptake of Internet-of-Things technologies and raises questions about how well we know the positions of many of these things. A review of indoor localization technologies summarized in this paper shows that, with conventional RF-based techniques, a significant challenge exists in achieving good accuracy with low power consumption at the device side. We present hybrid RF-acoustic approaches as an interesting alternative: the slow propagation speed of sound allows for accurate distance measurements, while RF can easily provide synchronization, data, and power to the devices. We explain how the combination of adequate signaling, realizing a late wake-up of the devices, with backscattering could position energy-neutral devices. Experiments in a real-life testbed confirmed the potential of 10 cm accuracy based on RF-harvested energy. Nonetheless, these also expose open challenges to be resolved in order to achieve accurate 3D positioning.


Introduction
Digital technologies are increasingly adopted in diverse sectors to increase efficiency and to offer new and improved services. Through IoT technology, a huge number of common objects are given an ID in the virtual world. Ever-more autonomous vehicles (from cobots to drones) handle these IoT devices. For many applications and services, in business, societal, and personal contexts, information about the position of these IoT devices is essential. Examples include finding people in emergency situations, localizing equipment and goods, offering context-relevant information, and delivering proximity-restricted services. Indoor positioning technology has progressed in recent decades. However, a major hurdle in the uptake of these systems is the energy they require from the devices to be located. Moreover, the quest for improving the accuracy and reliability of position information leads to more complex processing, which logically counteracts potential power savings. Disruptive methods are needed to ultimately enable the indoor positioning of energy-neutral devices, which can operate purely based on the energy they harvest from their environment (as further defined in Section 2). This paper elaborates on the status of positioning technologies for devices with a very limited energy budget, and in particular, demonstrates the potential in novel hybrid RF-acoustic solutions to achieve energy neutrality.
The contributions of this paper are:

1. A comprehensive overview of the up-to-date state of the art of indoor positioning technologies.

2. A discussion of different hybrid RF-acoustic systems, zooming in on methods to achieve the positioning of energy-neutral devices.

3. Presentation and analysis of ranging experiments demonstrating the opportunities and open challenges to accurate 3D positioning.

Energy-neutral operation hence creates the opportunity to eliminate the need for a battery in a device, which in turn reduces cost and maintenance and enables a small form factor. Furthermore, the device could in theory remain localizable over an unlimited lifetime, since the necessary energy is harvested from its environment. Energy can be harvested from a variety of sources and through different techniques [1], and recently, solutions were also specifically designed for small IoT devices [2,3]. In view of the objective to position size-constrained devices indoors, solar energy is typically not a viable option. Coupled inductive power transfer is a very efficient solution for wireless charging in close proximity of devices, reaching >90% efficiency; however, the coupling deteriorates rapidly over distances larger than a few centimeters. Harvesting mechanical energy can only be relied on when devices are repeatedly and significantly moving and the form factor of the device allows the integration of a rather bulky harvester. Acoustic power transfer (APT) is considered for reaching devices where inductive or RF transfer is prohibited, e.g., through human skin for biomedical implants or when metallic surfaces obstruct RF waves [4]. However, the power density is low (<0.1 nW cm⁻²) [4] and the conversion to electrical energy requires large transducers with rather low efficiencies. In recent years, RF power transfer (RFPT) has received significant interest as radio waves offer the advantage of being ubiquitous, and energy can hence be obtained over a large area, even in inaccessible places. RF power provides a density varying between 0.2 nW cm⁻² and 1 µW cm⁻² [5]. Practically, this results in a received power in the range of µW at the size-constrained sensor or actuator node. This makes the harvesting of RF signals highly suitable for the remote powering of small devices from a distance of several meters in indoor environments. Nonetheless, since the total amount of energy that can be transferred is very limited, innovative approaches for the positioning of these devices need to be developed.

Indoor Positioning Technologies
Indoor positioning is made possible via many technologies, each with their own advantages and disadvantages. Several overview papers are available, with some having over 300 references [6], reviewing the vast amount of positioning research from different points of view. The subjects vary from system requirements such as robustness [7] and accuracy for mobile targets [8] to dedicated positioning methods such as fingerprinting [9]. This section mainly discusses the current SOTA of acoustic and hybrid RF-acoustic systems, and compares them with the capabilities of UWB on the one hand, which achieves best-in-class accuracy among RF-based technologies, and RFID on the other hand, which provides energy-neutral operation. An overview of innovative indoor positioning systems selected from the perspective of this paper is presented in Table 1. In addition to the accuracy, the low-power character of the mobile node is also examined. Most research does not consider strict energy constraints, although these are required to allow operation purely based on harvested energy. A checkmark (✓) is given when a system has a power consumption below 100 µW. It should be noted that a quantitative comparison of reported results is not straightforward. Different room sizes and materials in experiments strongly influence the reflections that waves encounter in a venue and, consequently, the typical accuracy and precision achieved in the positioning. In this paper, we will further assess precision by means of the CDF of the distance error.
RFID is currently the most established and deployed technology for positioning energy-neutral devices. Here, the RF signal is used for activating and positioning the device at the same time. A backscattering mechanism is used, altering the characteristics of an incident RF wave based on some unique parameter(s) of the passive device. This technology offers an elegant and low-complexity approach. However, the accuracy of RFID localization systems is constrained by the limited read range, and false or missed detections occur due to multipath [10]. In [11], an overview is given of current RFID positioning systems, dividing them into three classes based on the underlying positioning technology. For RSSI-based systems, the accuracy is very limited, in the order of meters for larger distances, and the reliability is strongly affected by multipath effects and tag orientation [12,13]. For smaller distances, sub-decimeter accuracy can be achieved with machine learning algorithms using extensive data sets [11]. PoA methods typically show a better localization accuracy, but currently suffer from two main issues: ambiguity due to phase periodicity and phase offset [14].
The most promising RF-based technologies that meet many quality metrics for indoor positioning are typically based on UWB signaling. In UWB localization, the system usually relies on the ToF of sub-nanosecond pulses to calculate the distance between the transmitter and the target [15]. Since UWB uses a very large frequency bandwidth (>500 MHz), a high resolution in time and thus distance can be obtained, resulting in a positioning accuracy in the order of tens of centimeters [16]. In addition, the high bandwidth provides strong resistance against multipath fading and consequently makes UWB highly attractive for indoor environments. The average power consumption of commercial UWB systems depends on the device mode, but even in idle mode, this value, which is heavily influenced by the design, is larger than 5 mW [17]. There is still room for lower-power design optimization for this technology. In order to coexist with narrowband system architectures, the spectral power density in UWB is limited to −41.3 dBm/MHz by regulations. Unfortunately, this is insufficient to remotely power UWB tags at useful distances [15,18]. In combination with the higher power consumption, this makes the design of an energy-neutral UWB tag practically impossible at the time of writing.
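As a rough check on the two UWB figures quoted above, the following sketch integrates the −41.3 dBm/MHz limit over a bandwidth and converts a bandwidth into a distance scale. The 500 MHz bandwidth and the limit come from the text; the function names and the c/B rule of thumb for resolution are illustrative assumptions, not taken from the cited works.

```python
import math

C_LIGHT = 299_792_458.0          # m/s, speed of light
PSD_LIMIT_DBM_PER_MHZ = -41.3    # regulatory limit quoted in the text


def total_power_dbm(bandwidth_mhz, psd_dbm_per_mhz=PSD_LIMIT_DBM_PER_MHZ):
    """Power of a flat spectrum at the PSD limit, integrated over the bandwidth."""
    return psd_dbm_per_mhz + 10 * math.log10(bandwidth_mhz)


def dbm_to_microwatt(p_dbm):
    """Convert dBm to microwatt (0 dBm = 1 mW = 1000 uW)."""
    return 10 ** (p_dbm / 10) * 1e3


def range_resolution_m(bandwidth_hz):
    """Rule-of-thumb resolution: distance travelled by light in 1/B seconds."""
    return C_LIGHT / bandwidth_hz


p_dbm = total_power_dbm(500)     # assumed flat 500 MHz wide emission
print(f"{p_dbm:.1f} dBm, i.e. {dbm_to_microwatt(p_dbm):.0f} uW transmitted at most")
print(f"{range_resolution_m(500e6):.2f} m resolution scale at 500 MHz")
```

Under these assumptions the transmitter itself radiates only a few tens of microwatts before any path loss, which illustrates why remote powering of a tag from a UWB signal alone is out of reach.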
Extensive research was conducted on acoustic localization systems [19–21]. As acoustic signals have a relatively low propagation speed (343 m/s), centimeter positioning accuracy can be achieved based on ToF measurements without the need for high-resolution clocks and high-speed, and hence power-hungry, electronics [22–29]. Figure 2 depicts a typical hybrid RF-acoustic positioning set-up with at least three fixed nodes for 2D positioning and four for 3D positioning. These fixed nodes have a known position in the system; we call them anchors, and they are also referred to as beacons in the literature. More anchor nodes may be desired to obtain better coverage and precision. The mobile node operating in energy-neutral mode can be implemented as a tag. Distance measurements from the anchors to the mobile node, hereafter called ranging, are at the basis of determining the position of the mobile node. When using TDoA measurements [24,30–35], no synchronization is needed between the mobile node and anchor node, which is an additional advantage compared to the ToF system. Nevertheless, the anchor nodes must still be mutually synchronized. Usually, ultrasonic signals are used in acoustic localization systems given that these are not audible to humans.

Acoustic-based positioning systems were developed, for example, the LOCATE-US system [19,30], and more R&D is being conducted on custom designs, for example for ultrasonic transducers [33,34], to provide low-power functionality, since these ultrasonic transducers can require a lot of power. The emergence of MEMS technology is highly interesting due to its ultra-low power character. Smartphones are often used as mobile nodes in acoustic localization systems since they are widespread and have a built-in speaker, microphone, and available computing power [31,32]. However, these are not low-power devices and are not designed to deal with ultrasonic sound signals, as the audio components are usually limited to the audible region given their primary functions. Moreover, RF-based positioning for smartphones is more straightforward, given the presence of appropriate antennas.
A disadvantage of acoustic localization in contrast to, e.g., UWB, is that acoustic signals are susceptible to room acoustics (reverberation) and current low-power acoustic electronics have limited bandwidth compared to RF. Reverberation in acoustic signaling is the main source of self-interference, resulting in ambiguity in acoustic indoor positioning systems. When tests are performed in an ideal free-space environment, e.g., through simulation [23], then high accuracy can be achieved, which shows that acoustic systems can be an interesting technology for positioning. In highly reverberating situations, optimizations to improve accuracy such as Doppler compensation [36], likelihood expressions [24], or additional anchor nodes to increase spatial diversity [30,37] are required.
In some acoustic localization architectures, the acoustic channel is combined with an RF channel to form a hybrid system. Early hybrid systems adopted the RF signal for both synchronization and communication. A previous overview paper from Holm et al. [38] bundles these first-generation RF-acoustic positioning systems. The papers discussed in this work [22,25–29,35,39] focus more on recent energy-efficient systems. Although the energy efficiency of wireless acoustic sensor nodes has significantly progressed through developments in MEMS and CMUT microphones and low-energy computing, current hybrid RF-acoustic solutions have not achieved fully energy-neutral operation for long-range (>10 m), centimeter-accurate, multiple-access operation, with RF communication as the main limiting factor [40]. Active low-power wireless standards such as Ant+ [25,26] achieve a long range but are still power hungry, taking up a large part of the energy budget in energy-constrained devices. RFID, although passive and making use of energy harvesting, has a limited reading range for communication [28,35] or synchronization [29]. A promising hybrid method makes use of the backscattering technology refined in RFID systems, and directly communicates the received acoustic signals back [41]. In [27], the acoustic signal coming from a CMUT transceiver is modulated around the carrier, making it vulnerable to self-interference and prohibiting multiple access when several mobile nodes are present.

Table 1. Overview of innovative indoor positioning systems.

Technology     Ref.       Method       Accuracy   Power (mobile node)
Acoustic       [23]       ToF          0.05       -
Acoustic       [31]       TDoA         20         -
Acoustic       [32]       TDoA         50         -
Acoustic       [33]       TDoA         3          100 µW
Acoustic       [34]       TDoA         <0.5       -
Acoustic       [24]       ToF, TDoA    <24        -
Acoustic       [24]       AoA          <400       -
RF-Acoustic    [22]       ToF          <1         150 mW
RF-Acoustic    [39]       RSS, AoA     <50        -
RF-Acoustic    [25,26]    ToF          <0.2       21.7 mW
RF-Acoustic    [27]       ToF          <1         passive
RF-Acoustic    [28]       ToF          13         81 µW
RF-Acoustic    [29]       ToF          5          -
RF-Acoustic    [35]       TDoA         <0.5       26.4 mW

Hybrid Acoustic-RF Approaches for Positioning Energy-Neutral Devices
Unambiguously positioning a mobile device in 3D space requires the distance to at least four anchor nodes. These anchor nodes are distributed across the room and their locations are known by the positioning system (Figure 2). Particular challenges in positioning and tracking arise when the energy budget of the mobile device is very constrained. Clear insights into the mobile device's energy budget are therefore essential. In this section, we inspect the different components in a hybrid RF-acoustic positioning system and identify which key parameters affect the energy consumption. Based on this analysis, an optimized system design is presented. In the following subsections, we first focus on the setup options and signaling approaches for hybrid acoustic-RF positioning, and then on tailored backscattering technology.
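To illustrate how four ranges fix a 3D position, the sketch below uses a common linearized least-squares formulation. This is one generic way to solve the problem, not necessarily the solver used in this work; the anchor coordinates and the function name `multilaterate` are hypothetical.

```python
import numpy as np


def multilaterate(anchors, ranges):
    """Estimate a 3D position from >= 4 anchor positions and measured ranges.

    Subtracting the range equation of the first anchor from the others
    removes the quadratic term |p|^2 and leaves a linear system in p.
    """
    anchors = np.asarray(anchors, dtype=float)
    ranges = np.asarray(ranges, dtype=float)
    a0, d0 = anchors[0], ranges[0]
    A = 2.0 * (anchors[1:] - a0)
    b = (np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2)
         - ranges[1:] ** 2 + d0 ** 2)
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position


# Hypothetical room: four anchors that do not lie in one plane.
anchors = np.array([[0.0, 0.0, 0.0],
                    [5.0, 0.0, 0.0],
                    [0.0, 4.0, 0.0],
                    [0.0, 0.0, 3.0]])
true_position = np.array([2.0, 1.0, 1.5])
ranges = np.linalg.norm(anchors - true_position, axis=1)  # noiseless ranges

estimate = multilaterate(anchors, ranges)
print(estimate)
```

With exactly four anchors the linear system is square, so the anchors must not be coplanar; additional anchors turn it into an overdetermined least-squares problem, which is how extra anchors can improve precision in the presence of ranging noise.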

Setup
Two approaches can be followed to perform the necessary distance measurements. In the first, the anchor nodes emit the acoustic signals, which are subsequently captured by the mobile device through a microphone. In the second, the mobile device broadcasts an acoustic signal and the anchor nodes fulfill the role of acoustic receiver.
Given the energy constraints of the mobile device, the transmission of acoustic signals is preferably left to the anchor nodes. For acoustic signals to be picked up at useful distances, the power consumption of speakers easily amounts to several hundred milliwatts or even a couple of watts. In contrast, MEMS microphones only have a power consumption in the order of a few milliwatts. This leaves a reception window for the mobile device that can be two or three orders of magnitude longer than the time window for transmission before both energy requirements equalize. Considering a LOS distance in the order of 10 m for a typical room, such a long reception window will generally not be needed. On the other hand, some energy-demanding signal processing may be involved in the receiver, in contrast to transmission, e.g., for filtering or information extraction. Depending on the needs of the application, this speaker-microphone power consumption imbalance will impact the architectural decisions.
In terms of physical dimensions, preference is again mainly given to the anchor-transmission setup. Since speakers are relatively large, a small MEMS microphone is preferred for the mobile device. Moreover, the small dimensions of the microphone cause the beam pattern to be omnidirectional, even at low ultrasound frequencies, which guarantees acoustic pick-up from all directions. At the same operating frequency, a larger speaker will exhibit a more directional beam pattern. Consequently, multiple speakers may have to be used at the mobile device to ensure acoustic transmission to sufficient anchor nodes, which increases cost and complexity.

Signaling
Obtaining accurate distance measurements with low energy consumption in the mobile device requires a well-considered design of the system signaling. To this end, two classes of ToF-based hybrid RF-acoustic ranging strategies are discussed: (I) schemes performing an early wake-up; and (II) schemes realizing a late wake-up.
Two class I, i.e., early wake-up, signaling schemes are presented in Figure 3. The distance measurements are initiated by the anchor nodes. In the first case, the anchor node sends out an RF signal to indicate the start of acoustic transmission, and the mobile device times the propagation delay. In the second case, the anchor node transmits an acoustic signal, waits for a confirmation of reception from the mobile device, and measures the ToF itself. Either way, the mobile device does not know when to expect the acoustic wave from the anchor node. If a maximum distance of 10 m is assumed, the mobile device may need to wait actively for a full period of 29.15 ms. From an energy consumption perspective, these scenarios are not optimal; however, they can be easily deployed in situations where there is no prior knowledge of the expected range or when the distances can differ significantly.

Figure 3. Two hybrid RF-acoustic ranging strategies considering acoustic transmission by the anchor node(s). The ToF is either measured by (a) the mobile device M or (b) the anchor node A. In each case, the mobile device needs to actively wait for the arrival of the acoustic signal, which entails poor energy efficiency.
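The 29.15 ms figure follows directly from the speed of sound. A minimal sketch of the early wake-up arithmetic for case (a), with hypothetical function names and with the RF propagation time neglected:

```python
V_SOUND = 343.0  # m/s, propagation speed of sound used in the text


def tof_distance(t_rf_arrival_s, t_acoustic_arrival_s, v=V_SOUND):
    """Case (a): the RF signal marks the start of the acoustic transmission.

    The RF travel time (nanoseconds indoors) is neglected, so the delay
    between both arrivals is taken as the acoustic time of flight.
    """
    return v * (t_acoustic_arrival_s - t_rf_arrival_s)


def max_listen_time_ms(max_range_m, v=V_SOUND):
    """Longest time the mobile device must stay awake to cover max_range_m."""
    return 1e3 * max_range_m / v


print(f"{max_listen_time_ms(10.0):.2f} ms active wait for a 10 m range")
print(f"{tof_distance(0.0, 0.010):.2f} m for a 10 ms acoustic delay")
```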
Rather than actively timing the propagation of an acoustic signal, class II, i.e., late wake-up, signaling schemes have been introduced to improve energy efficiency. Such an energy-efficient ranging scheme was proposed in [42,43] and is shown in Figure 4. At a time T_0, an anchor node starts transmitting an ultrasonic chirp of duration τ_TX. Mobile devices M1 and M2 remain in a sleep mode to limit their energy consumption. At time T_0 + τ_TX, the end of the acoustic transmission, the anchor node broadcasts a signal through RF, waking up the mobile devices quasi-immediately. Once active, both mobile devices capture a part of the propagating chirp signal for a time interval τ_RX. Depending on the distance to the anchor node, a different frequency interval of the ultrasonic chirp signal will be picked up. Each mobile device sends the sampled microphone signal back to the anchor node and goes back to sleep mode afterwards. The distance information is contained in the captured part of the chirp signal and can be obtained through correlation with the transmitted chirp signal. Using this late wake-up signaling scheme, a mobile device no longer has to wait actively for the acoustic signal to arrive. A fixed wake-up interval τ_RX of 1 ms is employed in [42], which contrasts sharply with the active time in the aforementioned early wake-up signaling schemes. One critical side note, however, is that the maximum ranging distance is limited by the duration of the chirp signal τ_TX: a device farther away than the distance sound travels in τ_TX has not yet been reached by the chirp at wake-up. Increasing τ_TX, in turn, negatively affects the signal correlation due to self-interference.

Figure 4. Depending on the distance to the anchor node (A), each mobile device (M1/M2) captures a different part of the emitted acoustic chirp signal at RF wake-up. In this case, M2 is located at a larger distance from the anchor node than M1. Both mobile devices only sample the acoustic signal during a short time period τ_RX, minimizing the energy consumption.
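The late wake-up principle can be sketched in a small noiseless simulation: the snippet captured at wake-up is located inside the reference chirp by normalized correlation, and its offset maps to a distance. The sampling rate, chirp band and chirp duration below are illustrative assumptions (only τ_RX = 1 ms and the speed of sound come from the text), and the test distance is chosen to fall on the sample grid; noise, reverberation and sub-sample delays are ignored.

```python
import numpy as np

V_SOUND = 343.0              # m/s
FS = 200_000                 # Hz, sampling rate (assumption)
TAU_TX = 30e-3               # s, chirp duration (assumption)
TAU_RX = 1e-3                # s, wake-up window, as in [42]
F_START, F_STOP = 40e3, 20e3 # Hz, frequency-decreasing chirp (assumption)


def chirp(t):
    """Linear chirp at times t (seconds since T_0); silent outside [0, TAU_TX)."""
    rate = (F_STOP - F_START) / TAU_TX
    phase = 2 * np.pi * (F_START * t + 0.5 * rate * t ** 2)
    return np.where((t >= 0) & (t < TAU_TX), np.cos(phase), 0.0)


def capture(distance_m):
    """Samples heard by a mobile node that wakes up at T_0 + TAU_TX."""
    n = np.arange(int(round(TAU_RX * FS)))
    # Sound arriving at wake-up left the anchor distance/v seconds earlier.
    t_emit = TAU_TX - distance_m / V_SOUND + n / FS
    return chirp(t_emit)


def estimate_distance(snippet):
    """Locate the snippet in the reference chirp and convert offset to range."""
    ref = chirp(np.arange(int(round(TAU_TX * FS))) / FS)
    corr = np.correlate(ref, snippet, mode="valid")
    window_energy = np.convolve(ref ** 2, np.ones(len(snippet)), mode="valid")
    k = int(np.argmax(corr / np.sqrt(window_energy)))  # start index in chirp
    return V_SOUND * (TAU_TX - k / FS)


d_true = V_SOUND * 1750 / FS   # about 3 m, aligned to the sample grid
d_est = estimate_distance(capture(d_true))
print(f"true {d_true:.4f} m, estimated {d_est:.4f} m")
```

In this sketch the ranging limit is visible in the arithmetic: the captured part starts τ_TX − d/v into the chirp, so the largest measurable distance is v·τ_TX, about 10.3 m for the assumed 30 ms chirp. Because a 1 ms snippet covers only a narrow part of the sweep, neighbouring correlation peaks are almost as high as the true one, so a practical receiver would need to handle noise and sub-sample offsets more carefully than this example does.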

RF Backscattering
Antenna scattering theory has gained a lot of interest in recent years as a compelling research domain for (near) passive communications. Stemming from the RFID research wave of the 1990s, where the backscattering of incoming antenna signals was used for identification [44], more recent research [45] has focused on data-intensive wireless communication. Eliminating active RF transceivers could resolve major hurdles in massive IoT and its future applications, targeting near passive communication between sensor/actuator devices and the Internet [45,46]. Here, we elaborate on the potential of backscattering approaches to achieve full energy neutrality for ranging operations, the results of which can be further processed to determine the position of devices. To this end, we first clarify the antenna scattering principles and their major parameters, and we then apply these to the case of backscattering modulation to be used for hybrid RF-acoustic ranging.

Antenna Scattering
In this paper, we adopt customized backscattering for the energy-neutral localization of a mobile device, by which it will communicate the information embedded in the received acoustic signals back to a central system. Backscattering is based on radar principles. The far-field parameter used to characterize the scattering properties of a radar target is the RCS (σ). According to [47], it is defined as a fictive area intercepting that amount of power, which, when scattered isotropically, produces at the receiver a density which is equal to that scattered by the actual target. This depends on the relative position between the target and transmitter/receiver, target geometry and material, frequency, angular orientation and transmitter/receiver polarization [48].
A more intuitive approach for the RCS in a bistatic setup, in which the transmitter and receiver at the infrastructure side are separated, is depicted in Figure 5. In this bistatic setup, R_1 represents the distance between the transmitting antenna and the backscatter device and R_2 is the distance between the backscatter device and receiving antenna. The transmitted power (P_t) is observed by a target at a distance R_1 from the source. The incident power (P_i) that is intercepted by the target is determined by the incident power density (W_i) and the cross-section σ, so that the captured power is:

$$P_i = \sigma W_i \qquad (1)$$

This intercepted power is either reradiated as scattered power or absorbed as heat. Isotropic reradiation delivers a scattered power density W_s at distance R_2 from the target given by:

$$W_s = \frac{\sigma W_i}{4\pi R_2^2} \qquad (2)$$

Combining the above equations with the radar range equation gives the relation between the radar cross-section, the power transmitted by the transmitting antenna, and the power received by the receiving antenna (P_r) as a function of the different distances and the RCS in the case of polarization-matched and aligned antennas:

$$\frac{P_r}{P_t} = \sigma\,\frac{G_t G_r}{4\pi}\left(\frac{\lambda}{4\pi R_1 R_2}\right)^2 \qquad (3)$$

where G_t and G_r are the gains of the transmitting and receiving antennas and λ is the wavelength. A digital or analog signal can be backscattered by altering this RCS value. The above equation provides a description of the RCS but does not clearly state how it can be altered at the hardware level. Green [49] introduced a load-dependent radar cross-section defined as:

$$\sigma = \frac{\lambda^2 G^2}{4\pi}\left|A_S - \Gamma\right|^2 \qquad (4)$$

with G the gain of the backscatter antenna and the load-dependent term:

$$\Gamma = \frac{Z_i - Z_a^{*}}{Z_i + Z_a} \qquad (5)$$

where Z_a is the antenna impedance and Z_i is the load impedance. This term refers to the antenna mode and depends on the power absorbed in the load of a lossless antenna and the power which is reradiated by the antenna due to load mismatch. The second mode that can be derived from Equation (4) is the structural mode (A_S) scattering term, which depends on the antenna geometry and materials [50,51].

Figure 5. Simplified visualization of the radar cross-section (RCS).
By varying the antenna load, only the antenna mode of the RCS is altered, since the structural mode is fixed by the antenna geometry and materials; this results in a changing electromagnetic field at the receiving antenna. For analog signals, a linear load, such as the transimpedance of a JFET, can be changed depending on the incoming signal. For digital signals, the difference in RCS between a '1' and a '0' should be optimized for a maximal reading distance. Ideally, this optimization should account for both the antenna and structural modes. However, complex measurements are necessary to determine the structural mode of the RCS. As a practical solution, the antenna load is commonly switched between the short and open state, trading a potentially improved reading distance for ease of use.
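The short/open choice can be made concrete with the commonly used power-wave reflection coefficient Γ = (Z_load − Z_a*)/(Z_load + Z_a). The sketch below is illustrative only: the 73 Ω antenna impedance and the 1e12 Ω stand-in for an open circuit are assumptions, and the helper names are hypothetical. Because the structural term does not depend on the load, the difference between two backscatter states is governed by |Γ_1 − Γ_2|.

```python
def reflection_coefficient(z_load, z_antenna):
    """Power-wave reflection coefficient between an antenna and its load."""
    return (z_load - z_antenna.conjugate()) / (z_load + z_antenna)


def modulation_depth(gamma_1, gamma_2):
    """Distance between two load states; the structural term cancels out."""
    return abs(gamma_1 - gamma_2)


Z_ANT = complex(73.0, 0.0)      # ohm, hypothetical resonant dipole
Z_SHORT = complex(0.0, 0.0)     # short circuit
Z_OPEN = complex(1e12, 0.0)     # numerical stand-in for an open circuit
Z_MATCH = Z_ANT.conjugate()     # conjugate-matched load

gamma_short = reflection_coefficient(Z_SHORT, Z_ANT)
gamma_open = reflection_coefficient(Z_OPEN, Z_ANT)
gamma_match = reflection_coefficient(Z_MATCH, Z_ANT)

print("short vs open   :", modulation_depth(gamma_short, gamma_open))
print("short vs matched:", modulation_depth(gamma_short, gamma_match))
```

Under these assumptions, switching between short and open gives the largest possible separation between the two states without any knowledge of the structural mode, which is consistent with the ease-of-use argument above.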

Backscatter Modulation
In the system described in Section 3.2, the acoustic signals contain the necessary data to perform the ranging measurements. RF backscattering exploits the quasi-instantaneous propagation of electromagnetic waves and can directly transmit the information embedded in the received acoustic signals to an anchor node for further processing. To this end, the frequency content of the received acoustic signals must be modulated onto an RF signal.
The load switching of a single-tone incident wave can be seen as the simplest form of ASK modulation, i.e., OOK. It is easy to implement and therefore inexpensive. However, this modulation method has a major drawback: it is highly susceptible to noise. In the time domain, an RF source sends out the carrier wave S(f_c, t), a single-tone sine wave with a frequency of f_c, and the backscatter device changes the RCS at a frequency of ∆f_a. The latter represents the received acoustic chirp in the time window τ_RX. A_1 and A_2 represent the path loss of the electromagnetic waves. In the bistatic setup, a separate antenna receives the sum of the carrier wave and the backscattered OOK-modulated signal (Equation (6)). As backscattering is a mixing process, the OOK backscattered signals appear on the positive and negative sides of the single-tone carrier wave in the spectrum. In other words, the reflected and incoming wave spectra overlap. These reflected waves suffer from self-interference as ∆f_a is relatively small, and without proper cancellation, they cannot be distinguished from the carrier wave due to the limited resolution of the receiver. With increasing distance, these signals ultimately become too weak to be separated from the noise floor.
A solution for the self-interference problem is achieved through frequency translation backscatter [52], where the mixing capabilities of the backscatter principle are used to its advantage. This requires an extra component on the backscatter device: a local oscillator generating a sine wave at a frequency f_clo. In the frequency domain, this moves the signal away from the direct carrier to f_c ± f_clo, on both sides of the carrier. On these two mirror frequencies, the received acoustic chirp signal again appears as a double-sideband-modulated signal. In total, four mirror frequencies of this acoustic signal can be seen in Figure 6. Assuming the RCS variation is defined by a sine wave, the second term of Equation (6) can be rewritten with the product-to-sum identities. The added local oscillator contributes to a higher power consumption; however, it is an excellent compromise for the gained reading distance. Additionally, as this proposed method does not require any complex self-interference cancellation mechanism on a hardware or software level, a simple radio can be used to demodulate the backscatter signal. A major downside of this method occurs when extending the approach to a multiple-access system. By adapting the local oscillator frequency, FDMA could be adopted to localize multiple nodes at the same time. However, due to the double sideband modulation, the maximum number of simultaneously localized nodes, defined by both the acoustic chirp and RF bandwidth, is halved. Complex single sideband mixers, such as in [53], with two separate loads, could cancel out one band but at the same time consume more energy on the mobile node.

Figure 6. Frequency spectra of the combined direct and backscattered received RF signals without the local oscillator at the backscatter device (left) and with it (right). The blue and orange lines in both figures represent the spectra for, respectively, a small and a large resolution bandwidth. By adding the local oscillator, the backscattered signals were shifted away from the carrier, improving distinctiveness and enabling reception at lower spectral resolution.
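The four mirror frequencies can be reproduced numerically. The sketch below assumes a received signal of the form A_1·sin(2πf_c t) + A_2·sin(2πf_c t)·sin(2πf_clo t)·sin(2πf_a t), i.e., pure sine waves for the carrier, the local oscillator and a single acoustic tone standing in for the chirp. This form, the scaled-down carrier frequency and all numeric values are illustrative assumptions rather than the paper's exact expressions.

```python
import numpy as np

FS = 4_000_000      # Hz, simulation sampling rate (assumption)
N = 4000            # samples, giving 1 kHz bin spacing
F_C = 1_000_000     # Hz, scaled-down stand-in for the RF carrier (assumption)
F_CLO = 200_000     # Hz, local oscillator on the backscatter device (assumption)
F_A = 30_000        # Hz, single tone standing in for the acoustic chirp
A1, A2 = 1.0, 0.1   # hypothetical path-loss factors

t = np.arange(N) / FS
carrier = np.sin(2 * np.pi * F_C * t)
# Load variation: local oscillator gated by the acoustic signal (sine model).
load = np.sin(2 * np.pi * F_CLO * t) * np.sin(2 * np.pi * F_A * t)
received = A1 * carrier + A2 * carrier * load

spectrum = np.abs(np.fft.rfft(received)) / (N / 2)   # single-sided amplitude
freqs = np.fft.rfftfreq(N, d=1 / FS)
strongest = np.sort(freqs[np.argsort(spectrum)[-5:]])
print(strongest / 1e3, "kHz")
```

By the product-to-sum identities, the triple product splits into four components at f_c ± f_clo ± f_a, each with amplitude A_2/4, so the five strongest spectral lines are the carrier and the four mirror frequencies, all at least f_clo − f_a away from the carrier.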

Hardware Implementation, Power Consumption and Backscatter Demodulation
Here, we put the theory into practice and present the implementation of a frequency shifted, load switching backscatter mechanism in hardware. In addition, we evaluate the power consumption of the contributing components and introduce the signal demodulation.

Hardware Implementation and Power Review
Previous implementations [53–55] demonstrate the backscatter capabilities for IoT applications based on custom ICs. Their small form factor and ultra-low power consumption, often orders of magnitude lower than systems built with lumped components, are attractive features. Due to time and resource constraints in prototyping for R&D, we used off-the-shelf components, selecting options with low energy consumption in both active and sleep mode. An overview of this system is depicted in Figure 7 and can be divided into two parts: (I) the acoustic section, in which the analog ultrasound signals are captured and transformed into a digitized version, and (II) the RF section, which uses these digital signals to switch the load of the antenna.
The acoustic section consists of:

• MEMS microphone (Vesper VM1000): key specifications in addition to its low power consumption include its short wake-up time (below 200 µs), high sensitivity, and wide frequency response. In-house measurements show that frequencies over 80 kHz can be received sufficiently, much higher than stated in the datasheet, which focuses on the audible domain. (Note that this frequency range holds for the tested batch of microphones. No guarantee can be given that future batches of this microphone type will behave similarly.)

• Amplifier (TI TLV341): this low-noise amplifier has a high UGB, providing a gain up to 34.8 dB for a 40 kHz signal in a single stage. Amplifying and filtering before zero-crossing detection by the comparator is more noise resilient and therefore gives better results. The single-supply rail-to-rail capability of this opamp makes for an easy implementation and for maximum signal amplitude swings.

• Comparator (TI TLV7031): the rise and fall time are specified below 5 nanoseconds, giving a quasi-instant change of state when a zero-crossing occurs.
The output of the comparator containing the digitized microphone signal is connected to an SPDT-switch (AD ADG839). This acts as a multiplexer, passing either the local oscillator signal (AD LTC6906) with a frequency of f clo or a low-voltage signal. The local oscillator frequency is set with two resistors, and can take values between 10 kHz and 1 MHz. Implementing double sideband modulation, we accommodate for a total bandwidth of 80 kHz defined by the maximum chirp frequency f a,max = 40 kHz. With the local oscillator bandwidth limitation, this allows for only 12 backscatter nodes to be active at the same time in the same acoustic area. The last part of this RF hardware section are the RF load switches (ADG904). A dipole PCB antenna consists of two λ/4 copper strips with a small gap between them. For this antenna, at least two load switches are required to correctly short and open these two sides. The ADG904 has four multiplexer channels. This means that there are two more multiplexer inputs available on both of these chips, and other resistors, inductors and capacitors could be used as a complex serial load between the two copper traces, enabling additional antenna loads for improved reading distance, as proposed in [56]. An LDO voltage converter is missing from the schematic overview in Figure 7a. This LDO regulator converts a higher input voltage coming from an energy buffer into a lower voltage for a DC-offset in the acoustic signal and for powering the low-voltage components.  The total energy consumption and awake time of the chosen hardware components are shown in Table 2. Current measurements were performed with nanoampere precision on an Otii Arc precision power analyzer and used for the stable power consumption calculations. The reference voltage was fixed at 2.5 V. A Keysight DSO-X 2002A digital scope with reference resistor and precision power source was chosen to measure the transient currents of the system at startup. 
For the active power consumption, a wake-up time (τ_rx) of only 1 ms was set. Measurements were performed in [57] to calculate the necessary power budget for RF energy harvesting. From Table 2, a total energy consumption below 5 µJ can be calculated for a single distance measurement. For complete energy-neutral operation, the current hardware should be extended with an energy harvesting circuit (e.g., E-Peas AEM 40940) and an RF wake-up detector for synchronized wake-ups. In active mode, the total power consumption is 755 µW. Without a reference wake-up and in always-active mode, this system could last for 894.04 h, or 37.25 days, on a lithium CR2032 coin cell battery with a nominal voltage of 3 V and a capacity of 225 mAh. With an RF wake-up, where the wake-up time is 986 µs, the awake time is limited to 1 ms, and the measured transient energy consumption is 1.66 µJ, the lifetime of such a node could be extended to 213,646.13 h, or 8901.92 days, or 24.38 years, without taking the power consumption of the wake-up circuit and the self-discharge of the battery into account.
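The lifetime figures above can be checked with a few lines of arithmetic. The sketch below uses only the values stated in the text, plus one assumption that the text does not state: a rate of one distance measurement per second. That rate is chosen here because it reproduces the quoted duty-cycled lifetime; all variable names are illustrative.

```python
BATTERY_CAPACITY_MAH = 225.0   # CR2032 capacity from the text
BATTERY_VOLTAGE_V = 3.0        # nominal voltage from the text
ACTIVE_POWER_W = 755e-6        # total power consumption in active mode
WAKE_UP_TIME_S = 986e-6        # wake-up time with RF wake-up
AWAKE_TIME_S = 1e-3            # awake time per measurement
TRANSIENT_ENERGY_J = 1.66e-6   # measured transient energy per wake-up
MEASUREMENT_PERIOD_S = 1.0     # ASSUMPTION: one measurement per second

# Battery energy in joules: mAh -> Ah -> coulombs, times the voltage.
battery_energy_j = BATTERY_CAPACITY_MAH * 1e-3 * 3600 * BATTERY_VOLTAGE_V

# Always-active mode: the node draws the active power continuously.
always_on_hours = battery_energy_j / ACTIVE_POWER_W / 3600

# Duty-cycled mode: active power during wake-up and awake time,
# plus the transient energy, once per measurement period.
energy_per_measurement_j = (
    ACTIVE_POWER_W * (WAKE_UP_TIME_S + AWAKE_TIME_S) + TRANSIENT_ENERGY_J
)
duty_cycled_hours = (
    battery_energy_j / energy_per_measurement_j * MEASUREMENT_PERIOD_S / 3600
)

print(f"always active: {always_on_hours:.2f} h")
print(f"energy per measurement: {energy_per_measurement_j * 1e6:.2f} uJ")
print(f"duty cycled: {duty_cycled_hours:.2f} h")
```

Under these assumptions, the energy per measurement stays below the 5 µJ budget mentioned above. As in the text, the wake-up circuit and battery self-discharge are not modeled.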

Backscatter Demodulation
With the addition of the local oscillator on the backscatter device, demodulation needs to be performed at the bistatic receiver. For this type of backscattering, two types of demodulation can be performed:
1. OOK demodulation. The digitized acoustic signal drives a multiplexer that forwards either the local oscillator or no signal at all. Consequently, the load is switched at the local oscillator frequency f_clo or it is not switched. In the spectrum, this appears as a signal f_clo away from the RF carrier frequency that is turned on and off. As mentioned previously, OOK is very susceptible to noise, and the drift of the local oscillators can make the amplitude demodulation on the receiving radio impossible.
2. FSK demodulation. As the acoustic chirp ∆f_a can be considered as a signal modulated in frequency, this chirp signal can be observed on both sides of the local oscillator frequency. The demodulation is done by performing a frequency translation and a decimating FIR filter on one of the sidebands. With this, only the portion of the wideband signal with the frequency-decreasing chirp signal is saved to a buffer for later use.
In this paper, FSK is used for demodulating the backscattered acoustic signal. In Figure 8, a snapshot of both the spectrum and the received signal in the time domain can be observed. The two sides of the spectrum are plotted, clearly showing the powerful RF carrier wave, the local oscillator frequency, and the two acoustic sidebands.
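The sideband selection used for FSK demodulation (frequency translation followed by a decimating FIR filter) can be illustrated with a minimal, standard-library sketch. This is not the GNU Radio flowgraph used in the experiments: the sample rate, oscillator frequency, filter length, cutoff, and decimation factor are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import cmath
import math

FS_HZ = 400e3    # ASSUMPTION: complex baseband sample rate
F_LO_HZ = 100e3  # ASSUMPTION: local oscillator offset from the RF carrier


def lowpass_taps(num_taps, cutoff_hz, fs_hz):
    """Hamming-windowed sinc low-pass FIR, normalized to unity DC gain."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff_hz / fs_hz
        else:
            ideal = math.sin(2 * math.pi * cutoff_hz / fs_hz * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def translate_and_decimate(samples, fs_hz, shift_hz, taps, decimation):
    """Move shift_hz down to 0 Hz, low-pass filter, keep every decimation-th output."""
    mixed = [s * cmath.exp(-2j * math.pi * shift_hz * n / fs_hz)
             for n, s in enumerate(samples)]
    out = []
    # Only the outputs that survive decimation are computed.
    for start in range(len(taps) - 1, len(mixed), decimation):
        out.append(sum(taps[k] * mixed[start - k] for k in range(len(taps))))
    return out


def tone(freq_hz, n_samples):
    """Unit-amplitude complex tone, used here as a stand-in for one sideband."""
    return [cmath.exp(2j * math.pi * freq_hz * n / FS_HZ) for n in range(n_samples)]


taps = lowpass_taps(101, 15e3, FS_HZ)
center_hz = F_LO_HZ + 30e3  # middle of the upper sideband (20 to 40 kHz above f_clo)
upper = translate_and_decimate(tone(F_LO_HZ + 30e3, 2000), FS_HZ, center_hz, taps, 8)
lower = translate_and_decimate(tone(F_LO_HZ - 30e3, 2000), FS_HZ, center_hz, taps, 8)
print(abs(upper[-1]), abs(lower[-1]))
```

A tone in the selected upper sideband passes through at close to full amplitude, whereas a tone in the mirrored lower sideband is strongly attenuated, so only one copy of the chirp is kept at the reduced sample rate.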

Experiments Demonstrating Opportunities and Challenges in Hybrid RF-Acoustic Positioning
To analyze the approach from Section 3 for realistic positioning scenarios, experiments were performed in a true-to-life test environment. In this section, we first describe the test environment and the hardware used in the bistatic measurement setup. Three types of measurements were then conducted. In the first two, the distance between the microphone-speaker pair and between the transmitting and backscatter antenna, respectively, is increased. In the third, the angle between the transmitting and backscatter antenna is changed. This paper limits itself to ranging capabilities when acoustic and RF LOS is assured. With median errors below 10 cm for most distances, these experiments show the potential of low-power hybrid RF-acoustic ranging for 3D positioning.

Experimental Environment and Measurement Setup
The experiments are performed in a multi-functional measurement infrastructure named Techtile, depicted in Figure 9b. This is an 8 × 4 × 2.4 m modular structure built for distributed sensing and communication technologies [58], with 140 tiles covering the walls, ceiling, and floor, each equipped with a power, radio, processing, and communication module. A DAQ setup with 384 single-ended, synchronized, 16-bit analog input channels enables the fast sampling (3.3 MS/s) of sensors and actuators. This offers a true-to-life, modular testbed for both acoustic and RF-based research.
The acoustic characteristics of this room are defined by the RT60 value, i.e., the time it takes for the room to attenuate an impulse signal by 60 dB, and the linked critical distance. With an RT60 value of 0.41 s above 8 kHz for a volume of 77 m³ [59], this room has a critical distance of only 0.75 m. At this distance, the direct and reverberated sound signals exhibit equal SPL, making this an acoustically harsh environment to conduct measurements in. Additionally, SPL measurements over a 60 Hz to 80 kHz frequency range with a 10 s time window show different sources contributing to the ambient noise. In the audible domain, peaks over 60 dB can be detected close to the server and networking rack, whilst a 40.9 dB peak is measured in the ultrasonic domain, coming from switched-mode power supplies.
• Processing unit for audio transmission (@anchor node): The acoustic signal is generated on a Raspberry Pi 4 running a Debian-based operating system. The C library PiGPIO is used to send out the chirp, with a frequency between 40 kHz and 20 kHz, by binary switching a GPIO pin for 30 ms. The audio signal is amplified with an off-the-shelf amplifier with a frequency range over 45 kHz and transmitted by a Fostex FT17H ultrasonic tweeter with an HPBW below 30° in the xy-plane at 25 kHz. Similar to the RF source, the speaker is mounted on a pole and directed towards the MEMS microphone of the backscatter device.
• Backscatter device (@mobile device): Consists of both RF and audio components and was detailed in Section 3.4.
• Processing unit with SDR (@anchor node): The same Raspberry Pi 4 is used as the processing unit, running Python-based GNU Radio in parallel with custom C++ code for timing and acoustic signal generation. The backscattered RF waves were received by a dipole antenna connected to an Ettus B210 USRP SDR, and converted, filtered, and handled by signal-processing blocks in GNU Radio. These FM-demodulated signals are stored on the hard drive for subsequent range calculations. Interrupt-based precise timing between the start of the acoustic chirp and the RF wake-up time is handled by the software as well, with a maximum measured offset of 10 µs.
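As a rough check on what the 10 µs maximum timing offset means for ranging, and assuming a speed of sound of about 343 m/s (a textbook value for air at roughly 20 °C, not a value taken from the measurements), the corresponding worst-case distance error is

$$\Delta d = v_s \, \Delta t \approx 343\,\mathrm{m/s} \times 10\,\mu\mathrm{s} \approx 3.4\,\mathrm{mm},$$

which is small compared with the centimeter-level ranging errors discussed in the experiments.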

Conducted Experiments
Three types of experiments were conducted to test the accuracy and precision of the ranging method. The first two test the distance-dependent performance between the backscatter device and the audio or RF source. In the third test, the angle of arrival (α) between the RF source and device is varied. All measurements were performed in quasi-ideal circumstances, as there is a direct LOS between the source and receiver, and both audio and RF sources and receivers are pointed towards each other. Interfering sources were kept to a minimum, but could not be totally excluded, as the room was used under normal conditions, with persons walking in and out and other equipment being turned on and off at certain moments in time. The current setup lacks the RF wake-up system described in the previous section. This means that in the reported experiments, there was a continuous backscatter communication of the received acoustic signal. To simulate the synchronization and the limited awake time of the mobile node, a 1 ms sampling window was chosen at the end of the acoustic chirp.

Ranging Audio
As sound intensity is subject to the inverse square law, with sound pressure halving when the distance doubles, a first experiment of interest investigates how well the ranging method performs when the audio distance (∆x_a) is increased by moving the backscatter device in a linear, stepwise motion. The RF source and accompanying antenna are moved along, keeping the distance (∆x_rf < 2 m) and angle of the incoming RF wave (α) similar in all test cases. This minimizes the potential influence of the RF source and receiver spacing. Nevertheless, since the receiving antenna of the anchor node is connected through an SDR with the centrally positioned processing unit, there is still some unavoidable influence of the increased RF distance. Figure 10 plots the CDF of the measured error for 10 distances ranging from 50 to 500 cm. For each distance, at least 150 measurements were performed. The error is the absolute difference between the calculated hybrid ranging measurement and a reference value measured with a laser-based instrument. The results show an initial steep slope, followed by a sudden increase in error for the outliers. This leads to the initial conclusion that, at the acoustic level, the ranging performs well. Only at the 350 cm distance in Figure 10a can a more gradually increasing curve be observed. This anomaly can be attributed to path loss or reverberation in the acoustic or the RF medium, as better results are obtained with larger distances. However, the initial rapid incline of this curve shows that the system is capable of calculating the correct distance, which rules out errors in the calculations or constructive interference at another fixed frequency. At the larger distances between the acoustic source and backscatter device, the method behaves as expected and the error increases with the distance (∆x_a). A detail of the acoustic CDF plot is depicted in Figure 10b. Here, it can be seen that most of the experiments have a P90 value below 20 cm.
All except two have a median error within 10 cm, a favorable result for indoor positioning, where outlier detection algorithms can be used to optimize the results. The 400 cm curves on both plots show that even for larger distances, excellent results can be obtained. For the largest two distances, a plateau can be noticed on the CDF plot. This plateau can be explained by the fact that the index of the maximum correlation coefficient, which is used to estimate the distance, wrongly points to a peak other than the one at the actual distance, as can be seen in Figure 11. The red line on this correlation plot marks the actual distance, whilst the orange line indicates the peak selected based on the overall maximum. For this sample set, adopting a more advanced peak selection method would improve the distance error drastically. Similar results can be deduced from Table 3. The difference between the mean and P50 error can be attributed to the high outlier errors. How early these outliers occur is described by the P90 value. The last metric in this table is the mean value of the STMR. Here, the second-highest correlation peak is divided by the highest, which gives an indication of how distinct the correlation peak is.
$$\mathrm{STMR} = \frac{\max_2\left[(f \star g)(t)\right]}{\max\left[(f \star g)(t)\right]}$$
Here, f(t) is the received backscatter signal, g(t) is the transmitted chirp signal, and max_2 is the second-highest value of the cross-correlation between these two. Values closer to 1 indicate that these peaks have similar heights, which is often caused by noise-induced backscatter signals. For ranging calculations with a high STMR, refined peak selection methods could have a great impact on the results.
Figure 11. Example of a correlation plot between the received RF samples and the original transmitted chirp for an interdistance of ∆x_a = 5 m. The red line is the actual distance, whilst the orange line is the selected maximum peak for the used distance estimation.
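The correlation-based distance estimate and the STMR metric can be sketched on synthetic, noise-free data as follows. This is a minimal illustration, not the processing chain used for the reported results. It assumes a 100 kS/s sample rate, a speed of sound of 343 m/s, an integer-sample time of flight, and a 1 ms wake-up window that closes exactly when the chirp emission ends; treating every local maximum of the correlation as a peak is also an assumption. All names are hypothetical.

```python
import math

FS_HZ = 100e3                       # ASSUMPTION: sample rate of the stored signal
CHIRP_S = 30e-3                     # chirp duration from the text
F_START_HZ, F_STOP_HZ = 40e3, 20e3  # frequency-decreasing chirp
WINDOW_S = 1e-3                     # awake (sampling) window from the text
SPEED_OF_SOUND_M_S = 343.0          # ASSUMPTION: air at roughly 20 degrees C


def chirp(n_samples):
    """Linear chirp g(t), sampled at FS_HZ."""
    rate = (F_STOP_HZ - F_START_HZ) / CHIRP_S
    return [math.sin(2 * math.pi * (F_START_HZ * t + 0.5 * rate * t * t))
            for t in (n / FS_HZ for n in range(n_samples))]


def normalized_correlation(snippet, template):
    """Correlation coefficient of the snippet against every template window."""
    m = len(snippet)
    snippet_norm = math.sqrt(sum(x * x for x in snippet))
    corr = []
    for lag in range(len(template) - m + 1):
        window = template[lag:lag + m]
        window_norm = math.sqrt(sum(x * x for x in window))
        dot = sum(a * b for a, b in zip(snippet, window))
        corr.append(dot / (snippet_norm * window_norm))
    return corr


def second_to_max_ratio(corr):
    """Second-highest local correlation peak divided by the highest one."""
    peaks = sorted((corr[i] for i in range(1, len(corr) - 1)
                    if corr[i - 1] < corr[i] >= corr[i + 1]), reverse=True)
    return peaks[1] / peaks[0]


template = chirp(round(CHIRP_S * FS_HZ))
window_len = round(WINDOW_S * FS_HZ)
wake_index = len(template) - window_len  # window closes when the chirp ends

# Synthetic measurement: the node hears the chirp delayed by the time of flight.
true_tof_samples = 875                   # 8.75 ms, i.e. about 3 m
heard_from = wake_index - true_tof_samples
snippet = template[heard_from:heard_from + window_len]

corr = normalized_correlation(snippet, template)
best_lag = max(range(len(corr)), key=corr.__getitem__)
tof_s = (wake_index - best_lag) / FS_HZ
distance_m = SPEED_OF_SOUND_M_S * tof_s
stmr = second_to_max_ratio(corr)
print(f"distance: {distance_m:.3f} m, STMR: {stmr:.3f}")
```

Because the 1 ms snippet is narrowband compared with the full chirp, neighboring correlation peaks tend to be of similar height, so noise can move the overall maximum to a wrong peak. This is the situation in which a refined peak selection method would be expected to help.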

Ranging-RF
A second parameter to test is the influence of the distance between the RF source and the antenna of the backscatter device. In the first set of acoustic measurements, this distance was kept above the far-field distance and below 2 m. In the next measurements, we gradually increase the distance ∆x_rf while keeping the distance between the acoustic transmitter and receiver constant. The influence of the inter-RF distance can be seen in two selected CDF plots, for ∆x_a = 300 cm and 400 cm, respectively, in Figure 12. The left plot shows that for smaller values of ∆x_rf, some interference at a fixed frequency occurs. This results in a more prominent peak than the correct distance peak after correlation. This interference was only noted at this distance, and its influence diminishes with increasing distance. This diminution potentially highlights the influence of the RF wave's incoming angle (α), as this angle decreases when the RF source moves linearly away from the backscatter device, in parallel with the backscatter-audio source axis. From the second CDF plot, it can be understood that for a well-performing ranging measurement, there is no influence from the interference or the distance, as some larger distances of ∆x_a show better results. However, as not all lines follow the same course, the interference effect must come from somewhere else. In this measurement set, the angle of arrival of the incoming RF wave (α) was kept as consistent as possible but was not actively monitored.
To assess this influence of the angle, a last set of measurements was performed. Here, the backscatter device is kept at a constant distance of 250 cm from the acoustic source. The RF source antenna moves around this device on a 190 cm radius, with the incoming angle (α) going from 0° to 180° in steps of 27.5°. With this increasing angle, the ranging performance drops drastically, as can be seen in Figure 13, which confirms our previously stated hypothesis. For angles of 27.5° and above, the curves behave similarly to those in Figure 12a. Again, some fixed-frequency interference in the received backscatter signal can be observed, with a larger influence on the cross-correlation and ranging measurements, resulting in the known plateaus. Only at 90° does the shape of the plot differ and incline more uniformly. Here, the source antenna is perpendicular to the PCB dipole antenna. From a backscatter perspective, only a small surface of this dipole antenna can be seen by the source antenna, leading to poor absorption and reflection capabilities. This can be attributed to the directivity of the half-wave dipole backscatter antenna, which is normally embedded in the extended radar range equation from which the incident power density W_i is derived. Half-wave dipoles have a three-dimensional, toroid-shaped radiation pattern. In this measurement setup, the radiation peaks when the transmitting antenna faces the backscatter dipole (α = 0°) and goes towards zero when it is positioned perpendicular to the antenna (α = 90°). The deviations in the previous ranging measurements depicted in Figures 10a and 12a can thus be explained by the incoming angle, which, in most cases, is best kept low. It is therefore better to position the RF source alongside the acoustic source rather than opposite it. Another solution would be to position the dipole antenna on the perpendicular axis, since it has an omnidirectional radiation pattern in this plane.
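The angular dependence described above can be illustrated with the textbook far-field pattern of an ideal half-wave dipole. The sketch below is not derived from the measurements: it assumes an ideal thin dipole and maps the incoming angle to the dipole's polar angle as θ = 90° − α, so that α = 0° is broadside and α = 90° lies along the dipole axis, following the description in the text. The function name is hypothetical.

```python
import math


def half_wave_dipole_power(alpha_deg):
    """Normalized power pattern of an ideal half-wave dipole.

    ASSUMPTION: alpha = 0 deg is broadside, so the polar angle measured
    from the dipole axis is theta = 90 deg - alpha.
    """
    theta = math.radians(90.0 - alpha_deg)
    sin_theta = math.sin(theta)
    if abs(sin_theta) < 1e-9:
        return 0.0  # null along the dipole axis
    field = math.cos(0.5 * math.pi * math.cos(theta)) / sin_theta
    return field * field


for alpha in (0.0, 27.5, 55.0, 82.5, 90.0):
    power = half_wave_dipole_power(alpha)
    level = f"{10 * math.log10(power):6.1f} dB" if power > 0 else "  null"
    print(f"alpha = {alpha:5.1f} deg -> {level}")
```

In a backscatter link, the antenna pattern affects both the reception of the incident wave and the re-radiation towards the receiver, so the loss in the complete link can be larger than this one-way pattern suggests.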

Conclusions and Road Ahead
The accurate positioning of energy-neutral devices could significantly extend current battery-driven location-based applications. A comprehensive overview of the active research in low-power indoor positioning systems shows that there are plenty of open research challenges and opportunities to progress towards the ambitious goal of the accurate and precise 3D positioning of energy-neutral devices. Our assessment shows that disruptive methods are needed, putting the energy neutrality of the mobile node as a fundamental objective in the design. A key example of this design approach can be found in RFID. However, its solid identification capabilities cannot be straightforwardly scaled up to positioning. The results presented in this paper demonstrate the very promising potential of combining slower-propagating acoustic waves with low-power backscatter RF communication technology to enable this sought-after feature in location-based services. The measurements of the proposed hybrid RF-acoustic ranging system show that this technology is mature enough to be exploited for 3D positioning. With an energy consumption below 5 µJ per distance measurement, energy harvesting methods should be able to provide enough power for standalone operation. The 10 cm median error could be reduced by implementing better peak selection methods, adopting omnidirectional backscatter antennas, and selecting optimal impedances for the load modulation of the antenna.
The road ahead requires, first of all, further R&D to develop the full 3D positioning solution for a possibly large number of energy-neutral devices. This requires (I) the implementation of the RF wake-up necessary for synchronization purposes; (II) the development of adequate algorithms to determine accurate 3D positions based on the ranging measurements; and (III) the extension of these ranging methods to provide multiple access capabilities. Furthermore, solutions to increase precision and reliability in challenging reflective indoor environments are to be researched. For example, simultaneous localization and mapping (SLAM) techniques could help combat bad outliers. In addition, physics-inspired learning methods that take into account high-level information on the surroundings, e.g., the position of walls, are also promising.
Data Availability Statement: Not applicable; the study does not report any data.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: DAQ