1. Introduction
Electric Vertical Take-off and Landing (eVTOL) aircraft are developing rapidly and are increasingly used in Urban Air Mobility (UAM). However, when flying in complex airspace or extreme weather conditions, they may become unresponsive because of electric propulsion system failure, communication failure, energy exhaustion, no-fly zones, or obstacles, and may crash at an unknown location with loss of both positioning and communication [1,2,3]. Existing emergency search and rescue systems typically rely on 406 MHz COSPAS-SARSAT emergency beacons and on airborne or ground-based surveillance networks, such as ADS-B and cellular/satellite communications, to complete the final geolocation. Geolocation may be delayed or suffer reduced gain in situations such as beacon failure, antenna proximity, high sea level, or geolocation obstruction [4,5]. High-Frequency (HF, 3–30 MHz) signals can propagate beyond the horizon via ionospheric skywave paths, covering thousands of kilometers. They can provide usable signal strength in areas with weak ground and satellite coverage or in disaster-stricken areas. Used for geolocation, they can narrow down the search area after a crash or forced landing, thereby improving the timeliness of search and rescue [6,7,8]. By utilizing these HF skywaves for passive geolocation, ground monitoring networks can quickly narrow the search area and guide emergency response teams, which can reduce rescue time. In this context, achieving accurate HF geolocation is crucial for eVTOL search and rescue missions after a crash.
Passive geolocation with HF skywaves has long been a critical capability in applications such as spectrum monitoring, emergency search-and-rescue, signal intelligence, and maritime surveillance [9]. Unlike very-high-frequency (VHF) or microwave bands, HF waves propagate via ionospheric skywave paths, often undergoing one or more reflections before reaching ground-based receivers [10]. This property allows HF signals to cover thousands of kilometers but also introduces substantial uncertainties in their propagation paths, delays, and angles of arrival [11]. Accurate geolocation under such complex multipath and multi-hop conditions is crucial for achieving timely situational awareness and interference source mitigation in modern electromagnetic environments [12,13,14].
Current research on HF passive geolocation mainly follows two technical routes. The first is based on Time Difference of Arrival (TDOA) measurements, where hyperbolic multilateration is used to estimate the emitter’s geolocation from synchronized multi-receiver observations [15,16,17]. Osypiuk, R. [18] enhanced multilateration (MLAT) algorithms by incorporating altitude data alongside TDOA measurements, improving geolocation accuracy and enabling optimized station deployment through nonlinear optimization analysis. Hybrid measurement-based indoor geolocation methods integrating advanced optimization algorithms can mitigate multipath and non-line-of-sight (NLOS) effects, improving geolocation accuracy, robustness, and convergence efficiency in complex environments [19,20]. However, such approaches rely on accurate estimation of individual delays, which becomes highly unreliable in low signal-to-noise ratio (SNR) or multipath conditions. The second route employs Direct Position Determination (DPD) or coherence-based spatial spectrum methods, which bypass explicit TDOA estimation by directly matching the received signals in the time–frequency domain [21,22,23,24]. Ma, F. [25] proposed a TDOA-based DPD method for asynchronous sensor networks that uses anchor sources to jointly calibrate clock biases and estimate source position via Gauss–Newton, achieving accurate localization with reduced computational load. Li, B. [26] introduced a DPD-based Non-homogeneous Data Fusion and Fast Position Update (NDFFPU) algorithm for tracking multiple unknown emitters with distributed arrays, achieving higher accuracy and lower complexity than conventional methods. Ma, F. [27] proposed a particle-filter-based DPD method for moving sources that directly estimates positions and velocities from received signals, achieving improved computational efficiency and estimation accuracy over traditional TDOA/FDOA localization. Weiss, A.J. [28] showed that a single-step DPD method for localizing a stationary transmitter with moving receivers outperforms conventional TDOA/FDOA geolocation in both efficiency and weak-signal accuracy, and derived compact Cramér–Rao lower bound (CRLB) expressions. Eshkevari, A. [29] improved DPD-based localization of co-channel transmitters by introducing a dynamic sensor array response (DSAR) model that accounts for range-dependent path loss, reducing spurious pseudo-spectral-function (PSF) peaks for minimum variance distortionless response (MVDR) and multiple signal classification (MUSIC) beamformers. Xie, J. [30] proposed a despreading DPD (DS-DPD) method for localizing Global Navigation Satellite System (GNSS) spoofers, which jointly exploits delay, Doppler, DOA and code sequence information to achieve order-of-magnitude accuracy gains even at very low interference-to-noise ratio (INR).
However, most of the above studies consider either receiver-geometry optimization or direct geolocation in isolation, rather than combining them in a unified framework. At the same time, the rapid development of the low-altitude economy is leading to an increasing number of eVTOL operations in mountainous and sparsely populated areas where satellite and terrestrial communication coverage can be weak or unavailable. When eVTOL vehicles suffer a serious malfunction or crash in such environments, quickly geolocating the aircraft to support emergency search-and-rescue becomes a critical task. To the best of our knowledge, existing HF skywave geolocation works have not explicitly addressed this eVTOL emergency-rescue scenario. There is also no complete solution that combines the receiver selection phase with the geolocation phase. These considerations motivate the two-stage “select-first, geolocation later” framework studied in this paper.
This paper proposes a two-stage ReSL-RSS (Receiver selection then geolocation with Random Spatial Spectrum (RSS)) framework (Stage A: receiver selection; Stage B: direct geolocation). This technology is designed for eVTOLs flying in low-altitude mountainous areas with sparse populations and no satellite geolocation signals. It also addresses emergency search and rescue scenarios for eVTOLs in situations where communication is interrupted due to connection loss, crashes, or malfunctions.
- (1) Stage A employs two receiver selection routes. Case 1 (selection with unknown biases) uses geometric observability to rank and select candidate receivers without relying on prior biases while keeping the choice of reference receiver consistent. Case 2 (selection with bias priors) builds on this foundation by applying prior constraints and robust weighting on NLOS biases, while also imposing spatial diversity conditions to improve availability and stability in obscured and low-SNR environments.
- (2) Stage B adopts an RSS-based direct HF skywave geolocation method centered on grid-based objective function matching over both geographic positions and virtual ionospheric heights. Different from conventional TDOA multilateration or DPD-type methods, the proposed RSS stage operates directly on synchronized multi-receiver I/Q samples and does not estimate any intermediate TOA or TDOA features, which makes the overall ReSL-RSS framework more robust to low SNR, ionospheric model uncertainty and NLOS-affected receivers.
- (3) Simulations and actual measurements are performed under the same settings. First, RSS geolocation is performed using the receivers selected by Stage A and, for comparison, randomly selected receivers. Then, the optimal receiver combination output by Stage A is used in simulations that compare the method with other algorithms, and finally actual measurements are performed.
The results show that, compared with other geolocation algorithms, this two-stage framework can consistently achieve lower geolocation root mean square error (RMSE), smaller median error, more concentrated error distribution, and fewer large error outliers.
2. System Model
We consider a passive HF geolocation scenario with a fixed unknown transmitter and multiple time-synchronized receivers located on the Earth’s surface. All coordinates are expressed in geodetic latitude–longitude form. Let the transmitter’s unknown location be $\mathbf{u} = (\varphi_u, \lambda_u)$, where $\varphi_u$ is the latitude and $\lambda_u$ is the longitude. For each receiver $m$ (with $m = 0, 1, \dots, M-1$), we denote its known position by $\mathbf{s}_m = (\varphi_m, \lambda_m)$. By convention, receiver $m = 0$ serves as the reference sensor. All receivers are assumed to share a common time base and sampling rate $f_s$, meaning they collect synchronized baseband in-phase/quadrature (I/Q) samples at sampling rate $f_s$. To avoid aliasing and preserve timing accuracy, $f_s$ is chosen to be at least twice the occupied signal bandwidth.
In the HF band, skywave propagation causes the transmitted signal to reach each receiver via one or more reflections from the ionosphere. Using the electromagnetic image method, a two-hop propagation path can be treated as an equivalent one-hop path with an extended propagation delay. Consequently, the overall channel propagation to receiver $m$ can be characterized by a sparse discrete impulse response consisting of a few distinct propagation paths. In HF skywave propagation, these effective paths can be further described by an equivalent single-hop virtual-height model. Specifically, the virtual reflection height $h$ is constrained to a physically reasonable range of 80–500 km, which covers typical mid-latitude E/F-layer virtual reflection heights for HF links reported in [6,11,12,13]. In our framework, $h$ is discretized on a grid and jointly searched together with the source position in Stage B, so that ionospheric variability is treated as an additional model parameter rather than as a fixed deterministic profile.
In Figure 1, high-frequency signals reach their destination via a single reflection and/or multiple reflections, where the reflection height is obtained from similar-triangle path geometry [31]. According to the electromagnetic image method, the two-hop path A–F1–E–F2–B is equivalent to the path A–C–B, and can therefore be modeled as a single-hop propagation mode with a longer propagation delay.
We model this impulse response as:

$$g_m[n] = \sum_{j=1}^{J} \alpha_{m,j}\,\delta[n - \tau_{m,j}] \tag{1}$$

In Equation (1), $n$ is the discrete time index and $J$ is the number of significant paths, with path index $j = 1, \dots, J$, and we take $j = 1$ to correspond to the direct one-hop path. The coefficient $\alpha_{m,j}$ is the complex gain of path $j$ for receiver $m$, encompassing the path’s attenuation and phase shift, and $\tau_{m,j}$ is the sample delay (travel time, in samples) for that path from the transmitter to receiver $m$. The delta function $\delta[n - \tau_{m,j}]$ indicates that each path contributes an impulse at delay $\tau_{m,j}$. Thus, Equation (1) expresses that the discrete-time channel impulse response at sensor $m$ can be regarded as a superposition of $J$ delayed impulses, each corresponding to one propagation path and weighted by a complex amplitude, which physically leads to a received signal composed of $J$ delayed replicas of the transmitted signal.
To compute the geometric propagation delay $\tau_{m,j}$ in Equation (2) for a given path, we consider a spherical Earth model with a virtual ionospheric reflection height for that path. Let $R_e$ denote the Earth’s radius, $c$ the speed of light, $h_{m,j}$ the virtual ionospheric height associated with path $j$ to receiver $m$, and let $\theta_m$ be the half central angle (at the Earth’s center) between the transmitter’s location $\mathbf{u}$ and the receiver’s location $\mathbf{s}_m$. Then the propagation delay in samples for path $j$ at receiver $m$ is given by:

$$\tau_{m,j} = \frac{2 f_s}{c}\sqrt{\left(R_e \sin\theta_m\right)^2 + \left(R_e + h_{m,j} - R_e\cos\theta_m\right)^2} \tag{2}$$

where $\theta_m$ in Equation (2) is related to the great-circle distance between transmitter and receiver on the Earth’s surface. The term $(R_e\sin\theta_m)^2$ corresponds to the squared horizontal distance between the transmitter and the reflection point, while $R_e + h_{m,j} - R_e\cos\theta_m$ represents the effective vertical path length (difference in radial distance) accounting for the signal’s reflection at height $h_{m,j}$ above the Earth. The factor $2 f_s / c$ in front accounts for the two legs of the path and converts the physical propagation time to a number of samples, since $f_s$ is in samples/second.
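The angle $\theta_m$ between the transmitter at $(\varphi_u, \lambda_u)$ and receiver $m$ at $(\varphi_m, \lambda_m)$ can be obtained via the spherical law of cosines. In particular, we have

$$\theta_m = \frac{1}{2}\arccos\!\big(\sin\varphi_m \sin\varphi_u + \cos\varphi_m \cos\varphi_u \cos(\lambda_m - \lambda_u)\big) \tag{3}$$

where $\varphi_m$ and $\lambda_m$ in Equation (3) are the latitude and longitude of receiver $m$, and $\varphi_u$, $\lambda_u$ are the latitude and longitude of the transmitter. Inside the arccos, $\sin\varphi_m \sin\varphi_u + \cos\varphi_m \cos\varphi_u \cos(\lambda_m - \lambda_u)$ is the dot product of the unit vectors pointing to the two locations, and $\theta_m$ is half of the arc-cosine of this quantity. The factor $1/2$ appears because, in the equivalent one-hop model, the reflection point lies above the midpoint of the path, so each leg subtends half of the full central angle. In essence, $2\theta_m$ represents the angular separation between transmitter and receiver as seen from the Earth’s center.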
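The geometry of Equations (2) and (3) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors’ implementation: it assumes the usual virtual-height geometry in which the reflection point sits above the midpoint of the path (so each leg subtends half of the central angle), a mean Earth radius of 6371 km, and angles given in radians. The function and constant names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0     # mean Earth radius (assumed value)
SPEED_OF_LIGHT = 299_792_458.0   # m/s


def half_central_angle(lat_tx, lon_tx, lat_rx, lon_rx):
    """Half of the great-circle central angle between two points (radians)."""
    cos_angle = (math.sin(lat_tx) * math.sin(lat_rx)
                 + math.cos(lat_tx) * math.cos(lat_rx) * math.cos(lon_rx - lon_tx))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return 0.5 * math.acos(cos_angle)


def skywave_delay_samples(lat_tx, lon_tx, lat_rx, lon_rx, height_m, fs_hz):
    """One-hop virtual-height propagation delay, expressed in samples."""
    theta = half_central_angle(lat_tx, lon_tx, lat_rx, lon_rx)
    horizontal = EARTH_RADIUS_M * math.sin(theta)
    vertical = EARTH_RADIUS_M + height_m - EARTH_RADIUS_M * math.cos(theta)
    path_length = 2.0 * math.hypot(horizontal, vertical)  # up-leg plus down-leg
    return fs_hz * path_length / SPEED_OF_LIGHT


# Example: the same link evaluated at two virtual heights (values are illustrative).
tx = (math.radians(30.0), math.radians(104.0))
rx = (math.radians(34.0), math.radians(109.0))
for h in (110e3, 300e3):
    print(h, skywave_delay_samples(*tx, *rx, height_m=h, fs_hz=48_000.0))
```

Because a multi-hop path is modeled as a single hop with a larger virtual height, the same function covers both cases simply by changing `height_m`.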
Using the delays defined above, we can express the relative arrival times of signals between different receivers. We define the time-difference-of-arrival (TDOA) of path $j$ at receiver $m$, relative to the one-hop path at the reference receiver (receiver 0), as

$$\Delta\tau_{m,j} = \tau_{m,j} - \tau_{0,1} \tag{4}$$

In other words, $\Delta\tau_{m,j}$ measures how much later, in samples, path $j$ arrives at receiver $m$ compared to the reference receiver’s direct path. Using this definition, we can model the discrete-time signal observed at receiver $m$ as a sum of delayed copies of the transmitted signal. Specifically,

$$x_m[t] = \sum_{j=1}^{J} \alpha_{m,j}\, s[t - \Delta\tau_{m,j}] + w_m[t] \tag{5}$$

where $s[t]$ in Equation (5) denotes the transmitted signal’s baseband waveform as a function of time index $t$, and $w_m[t]$ represents additive noise at receiver $m$. Equation (5) says that the received signal $x_m[t]$ is composed of $J$ copies of $s[t]$, each delayed by $\Delta\tau_{m,j}$ samples and scaled by the complex gain $\alpha_{m,j}$, plus noise. In particular, the term for $j = 1$ corresponds to the primary (one-hop) signal component arriving at time $\tau_{0,1}$ at the reference and $\tau_{m,1}$ at receiver $m$, so $\Delta\tau_{m,1}$ is the relative delay of the direct path between receiver $m$ and the reference.
It is often convenient to examine the signals in the frequency domain. Let $N$ be the length of a discrete Fourier transform (DFT) applied to the signals. Denote by $S[k]$ the $k$-th DFT bin of the transmitted signal $s[t]$, and by $X_m[k]$ and $W_m[k]$ the DFTs of $x_m[t]$ and $w_m[t]$, respectively. Then the frequency-domain observation at receiver $m$ can be written as

$$X_m[k] = \sum_{j=1}^{J} \alpha_{m,j}\, S[k]\, e^{-\mathrm{i}\, 2\pi k \Delta\tau_{m,j} / N} + W_m[k] \tag{6}$$

In Equation (6), each path $j$ thus contributes a term to $X_m[k]$ that is the transmitted spectrum $S[k]$ scaled by $\alpha_{m,j}$ and rotated in phase by an angle $-2\pi k \Delta\tau_{m,j}/N$ (here $\mathrm{i}$ denotes the imaginary unit, to avoid confusion with the path index $j$). This phase shift is due to the path’s time delay $\Delta\tau_{m,j}$. The additive noise in the frequency domain is $W_m[k]$. Equation (6) is simply the DFT of (5), and it shows that the effect of a propagation delay in the time domain is to introduce a frequency-dependent phase shift in the frequency domain.
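The delay-to-phase relationship behind Equation (6) can be checked numerically. The sketch below is illustrative only: it assumes an integer delay applied circularly, which is the case where the DFT shift property holds exactly (real skywave delays are generally fractional and linear, not circular), and it uses a naive $O(N^2)$ DFT to stay dependency-free. All names are hypothetical.

```python
import cmath


def dft(x):
    """Naive length-N DFT, X[k] = sum_t x[t] * exp(-2*pi*i*k*t/N)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def apply_delay_in_frequency(spectrum, delay):
    """Rotate bin k by the phase -2*pi*k*delay/N, as in the single-path term of Eq. (6)."""
    n = len(spectrum)
    return [spectrum[k] * cmath.exp(-2j * cmath.pi * k * delay / n) for k in range(n)]


signal = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 0.25]
delay = 3  # integer delay in samples

# Time-domain route: circularly delay the signal, then transform.
delayed = [signal[(t - delay) % len(signal)] for t in range(len(signal))]
lhs = dft(delayed)

# Frequency-domain route: transform first, then apply the phase ramp.
rhs = apply_delay_in_frequency(dft(signal), delay)

max_err = max(abs(a - b) for a, b in zip(lhs, rhs))
print("largest bin mismatch:", max_err)
```

The two routes agree up to floating-point rounding, which is why a hypothesized delay can be tested directly on the spectra without first estimating it in the time domain.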
The goal of the geolocation task is to estimate the transmitter’s coordinates $\mathbf{u} = (\varphi_u, \lambda_u)$ directly from the synchronized multi-channel observations $\{x_m[t]\}_{m=0}^{M-1}$, without explicitly estimating any intermediate delays (such as individual TOAs or TDOAs for each path). In later sections, we will achieve this direct localization by evaluating a random spatial spectrum over a grid of candidate transmitter positions (and associated virtual ionospheric heights) and then finding the location that maximizes a resulting joint statistic.
This system model also supports a two-stage processing pipeline (Stage A + Stage B) for improved performance. Stage A is a sensor selection stage, operating on the same pool of $M$ candidate receivers and following the above signal model. In Stage A, we algorithmically select a subset $\mathcal{A}$ of sensors (receivers) and designate one of them as the reference $r$, with the aim of optimizing the geometry of the network and improving robustness to mixed line-of-sight (LOS) and non-line-of-sight (NLOS) propagation conditions. Stage B then performs the direct localization on the chosen subset $\mathcal{A}$. In Stage B, we use only the selected sensors and reference $r$: we form TDOA hypotheses relative to $r$ (as in Equations (1)–(6), but now restricted to the chosen subset and using $r$ as the reference in place of receiver 0) and evaluate a spatial spectrum over candidate locations. Finally, we identify the source location by maximizing this spectrum. The following sections detail the techniques for Stage A (sensor selection) and Stage B (RSS-based direct localization), respectively.
3. Stage A: Sensor Selection Algorithm
Consider a set of $S$ candidate receivers (sensors), indexed by $i = 1, \dots, S$, with known positions $\mathbf{s}_i$. Each position $\mathbf{s}_i$ could be specified, for example, in Earth-centered coordinates or another convenient 3D coordinate system; the exact coordinate frame is not critical for the selection algorithm. Let $\mathbf{u}$ denote the (unknown) transmitter’s true position in the same coordinate system. In Stage A, we wish to choose a subset of exactly $K$ sensors from these $S$ candidates, comprising one reference sensor and $K-1$ ordinary sensors, to be used for geolocation in Stage B. We introduce binary indicator vectors $\mathbf{p} \in \{0,1\}^S$ and $\mathbf{q} \in \{0,1\}^S$ to represent the selection of the reference and ordinary sensors, respectively. Specifically, $p_i = 1$ if sensor $i$ is chosen as the reference (exactly one such $p_i$ must be 1), and $q_i = 1$ if sensor $i$ is chosen as one of the ordinary sensors. No sensor can be both reference and ordinary at the same time. These conditions can be written as selection constraints:

$$\sum_{i=1}^{S} p_i = 1, \qquad \sum_{i=1}^{S} q_i = K - 1, \qquad p_i + q_i \le 1 \ \ \text{for all } i \tag{7}$$

The first constraint ensures exactly one reference is selected, and the second ensures exactly $K-1$ ordinary sensors are selected, so that in total $K$ sensors are chosen. The third constraint prevents any sensor from being counted as both reference and ordinary. Together, these define the feasible set of selections $\mathcal{F}$. Each sensor $i$, when active, can provide a time-of-arrival (TOA) measurement of the signal. We model the TOA measured at sensor $i$ as

$$t_i = \tau_i + n_i \tag{8}$$

with

$$\tau_i = \frac{d_i + l_i}{c}, \qquad d_i = \lVert \mathbf{u} - \mathbf{s}_i \rVert \tag{9}$$

In Equations (8) and (9), $t_i$ is the observed TOA at sensor $i$, and it is composed of a true (noise-free) propagation time $\tau_i$ plus a noise term $n_i$. The term $\tau_i$ in Equation (9) represents the nominal TOA in the absence of noise, where $d_i$ is the true distance between the unknown source $\mathbf{u}$ and sensor $i$ (Euclidean norm of the difference in position vectors) and $l_i \ge 0$ is a bias representing excess path length due to NLOS propagation; if the path is strictly line-of-sight, then $l_i = 0$. The constant $c$ is the signal propagation speed, approximately the speed of light. The noise $n_i$ accounts for measurement errors or timing noise at sensor $i$ and is assumed to be zero-mean Gaussian with variance $\sigma_i^2$, i.e., $n_i \sim \mathcal{N}(0, \sigma_i^2)$.
If we have designated a particular sensor $r$ as the reference (meaning $p_r = 1$ for some $r$), we can form TDOA measurements between each ordinary sensor $i$ and the reference $r$. The TDOA between sensor $i$ and the reference $r$ is defined as $t_{i,r} = t_i - t_r$. Using the model in (8) and (9), this TDOA can be expressed as

$$t_{i,r} = \frac{(d_i - d_r) + (l_i - l_r)}{c} + (n_i - n_r) \tag{10}$$

In Equation (10), $d_r$ is the distance from the source to the reference sensor, and $l_r$ is the NLOS bias (if any) for the reference sensor’s path. Equation (10) shows that the true (noise-free) TDOA between sensor $i$ and $r$ is $\big((d_i - d_r) + (l_i - l_r)\big)/c$, and the noise on this TDOA is $n_i - n_r$. Note that if both $i$ and $r$ had purely LOS paths (no biases), the TDOA would simply be $(d_i - d_r)/c$ plus noise, reflecting the difference in distance from the source.
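We now stack the TDOA measurements from all selected ordinary sensors (relative to the reference) into a single vector for further processing. Let $\mathcal{O}$ denote the set of indices of the $K-1$ selected ordinary sensors, and let $r$ be the reference index with $p_r = 1$. We form the TDOA observation vector $\mathbf{z}$ as $\mathbf{z} = [\,t_{i,r}\,]_{i \in \mathcal{O}}$. Using Equation (10), this can be written in vector form as

$$\mathbf{z} = \mathbf{f}(\mathbf{u}, \mathbf{l}) + \mathbf{e} \tag{11}$$

In Equation (11), $\mathbf{f}(\mathbf{u}, \mathbf{l})$ is a deterministic vector-valued function of the source position $\mathbf{u}$ and the bias vector $\mathbf{l}$; specifically, the $i$th component of $\mathbf{f}$ (for $i \in \mathcal{O}$) is $\big((d_i - d_r) + (l_i - l_r)\big)/c$. The term $\mathbf{e}$ represents the combined measurement noise. Since each $t_{i,r}$ has noise $n_i - n_r$, the noise vector can be modeled as $\mathbf{e} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})$, where $\boldsymbol{\Sigma}$ is the $(K-1) \times (K-1)$ noise covariance matrix. From Equation (10), we can derive

$$\boldsymbol{\Sigma} = \mathrm{diag}\big(\sigma_i^2\big)_{i \in \mathcal{O}} + \sigma_r^2\, \mathbf{1}\mathbf{1}^{\mathsf T} \tag{12}$$

where $\mathbf{1}$ in Equation (12) is a vector of all ones. The diagonal entries of $\boldsymbol{\Sigma}$ are $\sigma_i^2 + \sigma_r^2$ for each $i \in \mathcal{O}$, and each off-diagonal entry is $\sigma_r^2$, because the reference’s noise $n_r$ appears in all TDOA measurements and is thus a common source of correlation between any pair of TDOAs.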
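Next, we derive expressions for the Fisher information to guide our sensor selection. We first consider the line-of-sight only case (i.e., assume all biases $l_i = 0$, meaning all selected sensors including the reference have unobstructed LOS paths). In this case, the only unknown of interest is the source position $\mathbf{u}$. We can write the TDOA model of Equation (11) in the simplified form

$$\mathbf{z} = \mathbf{f}_0(\mathbf{u}) + \mathbf{e}$$

where $\mathbf{f}_0(\mathbf{u})$ corresponds to the purely LOS range differences. The Jacobian matrix of the vector function $\mathbf{f}_0$ with respect to the source position $\mathbf{u}$ is denoted $\mathbf{J}_u$. The $i$th row of $\mathbf{J}_u$ (for a selected sensor $i \in \mathcal{O}$) is given by the partial derivative of the $i$th TDOA (relative to $r$) with respect to $\mathbf{u}$. From Equation (10), focusing on the LOS part ($\mathbf{l} = \mathbf{0}$), we have $f_{0,i}(\mathbf{u}) = (d_i - d_r)/c$, where $d_i = \lVert \mathbf{u} - \mathbf{s}_i \rVert$ and $d_r = \lVert \mathbf{u} - \mathbf{s}_r \rVert$. Differentiating this with respect to the components of $\mathbf{u}$, we obtain:

$$\frac{\partial f_{0,i}}{\partial \mathbf{u}} = \frac{1}{c}\left( \frac{(\mathbf{u} - \mathbf{s}_i)^{\mathsf T}}{d_i} - \frac{(\mathbf{u} - \mathbf{s}_r)^{\mathsf T}}{d_r} \right) \tag{13}$$

where the row vector on the right-hand side of Equation (13) is exactly the $i$th row of $\mathbf{J}_u$. Intuitively, $(\mathbf{u} - \mathbf{s}_i)/d_i$ is the unit vector pointing from sensor $i$ toward the source, and $(\mathbf{u} - \mathbf{s}_r)/d_r$ is the unit vector pointing from the reference $r$ toward the source. Thus, the row in Equation (13) is proportional to the difference between these two unit direction vectors, scaled by $1/c$. Stacking all selected sensors’ rows, we get the full Jacobian matrix $\mathbf{J}_u \in \mathbb{R}^{(K-1) \times 3}$.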
Meanwhile, if we consider the dependence of the TDOA model on the bias terms $l_i$, the Jacobian with respect to the bias vector $\mathbf{l}$ can also be written down. From Equation (10), each ordinary sensor $i$’s TDOA depends on $l_i$ with coefficient $1/c$, and on the reference’s bias $l_r$ with coefficient $-1/c$. Therefore, the Jacobian with respect to the biases for the selected sensors (including the reference) is

$$\mathbf{J}_l = \frac{1}{c}\begin{bmatrix} \mathbf{I}_{K-1} & -\mathbf{1} \end{bmatrix} \tag{14}$$

where $\mathbf{I}_{K-1}$ in Equation (14) is the $(K-1) \times (K-1)$ identity matrix with rows corresponding to the ordinary sensors’ biases $l_i$ for $i \in \mathcal{O}$, and $-\mathbf{1}$ is a $(K-1) \times 1$ column vector corresponding to the derivative with respect to the reference bias $l_r$. Thus, each row of $\mathbf{J}_l$ has a $1/c$ in the column for its own bias $l_i$ and a $-1/c$ in the column for $l_r$, indicating that a unit increase in $l_i$ increases $t_{i,r}$ by $1/c$, while a unit increase in $l_r$ decreases all $t_{i,r}$ by $1/c$.
Now we can formulate the Fisher Information Matrix (FIM) for the source localization problem under various assumptions:
Pure LOS case (no biases): In this scenario ($\mathbf{l} = \mathbf{0}$ known), the only unknown parameters are the coordinates of $\mathbf{u}$. The FIM for estimating $\mathbf{u}$, given a particular choice of sensors (encoded by $(\mathbf{p}, \mathbf{q})$), is

$$\mathbf{F}_u(\mathbf{p}, \mathbf{q}) = \mathbf{J}_u^{\mathsf T} \boldsymbol{\Sigma}^{-1} \mathbf{J}_u \tag{15}$$

where the $3 \times 3$ matrix in Equation (15) quantifies the information (inverse variance) available about the source position $\mathbf{u}$. The CRLB for unbiased estimation of $\mathbf{u}$ under these conditions is given by the inverse of this matrix: $\mathrm{CRLB}(\mathbf{u}) = \mathbf{F}_u^{-1}$. In practice, one might use a scalar summary of the CRLB matrix (for example, the trace or the largest eigenvalue) as a measure of the expected localization error for the chosen sensors.
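Unknown bias case (NLOS present, no prior): In this case, the bias values $l_i$ for NLOS paths are unknown parameters to be estimated jointly with $\mathbf{u}$. We refer to this scenario as Case 1 (position and sensor biases unknown, with no prior information on biases). We then have an augmented parameter vector $\boldsymbol{\eta} = [\mathbf{u}^{\mathsf T}, \mathbf{l}^{\mathsf T}]^{\mathsf T}$ that includes both the source position and all selected bias terms. The FIM for this joint parameter vector (prior to incorporating any prior distributions) can be constructed in block-matrix form using the Jacobians $\mathbf{J}_u$ and $\mathbf{J}_l$ derived above:

$$\mathbf{F}_{\eta} = \begin{bmatrix} \mathbf{J}_u^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_u & \mathbf{J}_u^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_l \\ \mathbf{J}_l^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_u & \mathbf{J}_l^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_l \end{bmatrix} \tag{16}$$

The top-left block $\mathbf{J}_u^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_u$ corresponds to information about $\mathbf{u}$ (as in the pure LOS case), the bottom-right block $\mathbf{J}_l^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_l$ corresponds to information about the biases $\mathbf{l}$, and the off-diagonal block $\mathbf{J}_u^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_l$ (and its transpose) couples the two. Since our ultimate interest is in $\mathbf{u}$ (the source location), we can obtain an effective Fisher information matrix for $\mathbf{u}$ alone by marginalizing out (or Schur-complementing) the nuisance parameters $\mathbf{l}$. The effective information matrix for $\mathbf{u}$ in this unknown-bias scenario is given by the Schur complement of the $\mathbf{l}$-block in Equation (16):

$$\mathbf{F}_{\mathrm{eff}}^{(1)} = \mathbf{J}_u^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_u - \mathbf{J}_u^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_l \big(\mathbf{J}_l^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_l\big)^{-1} \mathbf{J}_l^{\mathsf T}\boldsymbol{\Sigma}^{-1}\mathbf{J}_u \tag{17}$$

A practical and often overlooked consequence follows: if the reference is LOS ($l_r = 0$), $\mathbf{F}_{\mathrm{eff}}^{(1)}$ depends only on the LOS rows of $\mathbf{J}_u$; unmodeled NLOS rows do not increase information about $\mathbf{u}$. This motivates choosing a reference that is likely LOS and discounting suspected NLOS rows during selection.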
Unknown bias case with priors: In some situations, we may have prior knowledge or statistical estimates for the NLOS biases $l_i$, for example from historical data or auxiliary sensors that can gauge NLOS conditions. We refer to this scenario as Case 2 (position unknown, with prior distributions available for the sensor biases). We can incorporate this prior information in the FIM as additional “information” about the parameters. Suppose we have an independent prior for each bias $l_i$ for $i \in \mathcal{O} \cup \{r\}$, with prior variance $\sigma_{l,i}^2$, so $\sigma_{l,i}$ is the standard deviation of our prior belief about bias $l_i$. We can form a diagonal information matrix for these priors as:

$$\mathbf{F}_{\mathrm{prior}} = \begin{bmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathrm{diag}\big(\{1/\sigma_{l,i}^2\}_{i \in \mathcal{O}},\ 1/\sigma_{l,r}^2\big) \end{bmatrix} \tag{18}$$

In $\mathbf{F}_{\mathrm{prior}}$, the top-left block is all zeros because we assume no prior information on the true source position $\mathbf{u}$, and the diagonal matrix in the bottom-right block corresponds to the information (inverse variance) we have for each bias: it contains $1/\sigma_{l,i}^2$ for each selected ordinary sensor $i$ and $1/\sigma_{l,r}^2$ for the reference’s bias.
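The generalized Fisher information matrix when considering both the data and the bias priors is then the sum of the information from the measurements in Equation (16) and the prior information above. Partition $\tilde{\mathbf{F}} = \mathbf{F}_{\eta} + \mathbf{F}_{\mathrm{prior}}$ conformably with $\boldsymbol{\eta} = [\mathbf{u}^{\mathsf T}, \mathbf{l}^{\mathsf T}]^{\mathsf T}$ as:

$$\tilde{\mathbf{F}} = \begin{bmatrix} \tilde{\mathbf{F}}_{uu} & \tilde{\mathbf{F}}_{ul} \\ \tilde{\mathbf{F}}_{lu} & \tilde{\mathbf{F}}_{ll} \end{bmatrix} \tag{19}$$

In Equation (19), $\tilde{\mathbf{F}}_{uu}$ is the $3 \times 3$ block corresponding to $\mathbf{u}$, $\tilde{\mathbf{F}}_{ll}$ the $K \times K$ block for $\mathbf{l}$, and $\tilde{\mathbf{F}}_{ul} = \tilde{\mathbf{F}}_{lu}^{\mathsf T}$ the cross terms. Then the effective Fisher information for $\mathbf{u}$ (marginalizing out $\mathbf{l}$ with its priors considered) is:

$$\mathbf{F}_{\mathrm{eff}}^{(2)} = \tilde{\mathbf{F}}_{uu} - \tilde{\mathbf{F}}_{ul}\, \tilde{\mathbf{F}}_{ll}^{-1}\, \tilde{\mathbf{F}}_{lu} \tag{20}$$

This formula is analogous to Equation (17), but now $\tilde{\mathbf{F}}_{ll}$ incorporates both the measurement information and the prior information about the biases. Because we have some knowledge of the biases (through the prior), even NLOS sensors can contribute some information about $\mathbf{u}$, unlike the prior-free case, although the contribution is reduced depending on the uncertainty of the bias (encoded in $\sigma_{l,i}^2$).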
With these Fisher information matrices in hand, we can formalize the Stage A sensor selection problem. The goal is to choose $\mathbf{p}$ and $\mathbf{q}$ (i.e., choose which sensor is the reference and which are ordinary) to minimize the expected localization error of Stage B. As a metric, one convenient choice is the trace of the position error covariance matrix (the trace of the CRLB). Other criteria, such as the log-determinant of the CRLB or the largest eigenvalue (worst-case axis), could also be used; such criteria typically lead to a similar selection.
We formulate two versions of the selection problem corresponding to the scenarios above:
(Case 1) Selection with unknown biases: We assume some sensors could be NLOS (unknown biases) and no bias priors are used. We then want to minimize the trace of the position-only CRLB accounting for unknown biases, which is based on $\mathbf{F}_{\mathrm{eff}}^{(1)}$ from Equation (17):

$$\min_{(\mathbf{p}, \mathbf{q}) \in \mathcal{F}} \ \mathrm{tr}\Big\{ \big[\mathbf{F}_{\mathrm{eff}}^{(1)}(\mathbf{p}, \mathbf{q})\big]^{-1} \Big\} \tag{21}$$

(Case 2) Selection with bias priors: We assume we have prior information for biases. The selection aims to minimize the trace of the CRLB using $\mathbf{F}_{\mathrm{eff}}^{(2)}$ from Equation (20):

$$\min_{(\mathbf{p}, \mathbf{q}) \in \mathcal{F}} \ \mathrm{tr}\Big\{ \big[\mathbf{F}_{\mathrm{eff}}^{(2)}(\mathbf{p}, \mathbf{q})\big]^{-1} \Big\} \tag{22}$$
Each of these is a combinatorial optimization problem: we are searching over all choices of one reference and $K-1$ others out of $S$ to minimize the chosen objective. The problem can be computationally challenging if $S$ is large, since the number of subsets to check grows combinatorially. In practice, one can use heuristic or relaxed optimization methods. For example, one approach is to use a convex relaxation, such as formulating an equivalent semidefinite programming (SDP) problem and then using randomized rounding to obtain a binary solution, to get a near-optimal solution offline. Alternatively, a faster greedy algorithm can be used for online selection: for instance, one can start with an empty set and iteratively add the sensor that most improves the objective (greedy forward selection), or start with a random set and then iteratively swap sensors in and out to improve the objective. These algorithms work directly with the information matrices ($\mathbf{F}_{\mathrm{eff}}^{(1)}$ or $\mathbf{F}_{\mathrm{eff}}^{(2)}$ as appropriate) and enforce the selection constraints of Equation (7) at each step.
In implementing these selection strategies, it is often useful to have an initial rough estimate of the source location $\mathbf{u}$. The matrix $\mathbf{J}_u$ and the resulting Fisher information depend on $\mathbf{u}$ only through the geometry terms $(\mathbf{u} - \mathbf{s}_i)/d_i$ in Equation (13). In a practical deployment, such a coarse estimate $\hat{\mathbf{u}}_0$ can be obtained directly from the TOA/TDOA measurements used in Stage A: once the HF distress signal is received, one receiver is chosen as a time reference and each active sensor forms a TOA or TDOA measurement according to the model in Section 3. These TDOA values define a family of hyperbolic curves with foci at the receiver positions. By intersecting these hyperbolas once, using all available receivers, we obtain a coarse multilateration solution $\hat{\mathbf{u}}_0$. This simple hyperbolic multilateration ignores NLOS biases and does not iterate, so its accuracy is limited, but it provides a point inside the region of interest. Using this approximate $\hat{\mathbf{u}}_0$ to evaluate $\mathbf{J}_u$ therefore supplies a representative geometry for the sensor-selection criteria without requiring any accurate prior knowledge of the true eVTOL position.
To summarize the above procedure, the following Algorithm 1 outlines a generic implementation of Stage A for both Case 1 and Case 2.
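| Algorithm 1 Stage A: Receiver Selection (Case 1/Case 2) |
| 1: | Input: candidate receivers $i = 1, \dots, S$ with positions $\mathbf{s}_i$; approx. source $\hat{\mathbf{u}}_0$; selection budget $K$ (one reference and $K-1$ ordinary); noise covariance $\boldsymbol{\Sigma}$; (Case 2) bias prior variances $\sigma_{l,i}^2$; mode $\in$ {Case 1, Case 2}; optional diversity constraints. |
| 2: | Output: selected subset $\mathcal{A}$ with $\lvert\mathcal{A}\rvert = K$, including reference $r$. |
| 3: | Using $\hat{\mathbf{u}}_0$ and $\{\mathbf{s}_i\}$, construct the Jacobians $\mathbf{J}_u$ and $\mathbf{J}_l$ and build the Fisher information blocks (Section 3). |
| 4: | Define a function $\mathrm{FIM}(\mathbf{p}, \mathbf{q})$ that, for any feasible binary selection $(\mathbf{p}, \mathbf{q})$ satisfying $p_i, q_i \in \{0, 1\}$, $\sum_i p_i = 1$, $\sum_i q_i = K-1$, $p_i + q_i \le 1$, returns the position-only FIM $\mathbf{F}_{\mathrm{eff}}^{(1)}$ if mode = Case 1, and $\mathbf{F}_{\mathrm{eff}}^{(2)}$ otherwise. |
| 5: | $v^\star \leftarrow +\infty$; $(\mathbf{p}^\star, \mathbf{q}^\star) \leftarrow \varnothing$. |
| 6: | for all feasible $(\mathbf{p}, \mathbf{q})$ that select exactly one reference and $K-1$ ordinary receivers and satisfy the diversity constraints do |
| 7: | $\mathbf{F} \leftarrow \mathrm{FIM}(\mathbf{p}, \mathbf{q})$ |
| 8: | $v \leftarrow \mathrm{tr}(\mathbf{F}^{-1})$ {trace of the position-only CRLB} |
| 9: | if $v < v^\star$ then |
| 10: | $v^\star \leftarrow v$; $(\mathbf{p}^\star, \mathbf{q}^\star) \leftarrow (\mathbf{p}, \mathbf{q})$ |
| 11: | end if |
| 12: | end for |
| 13: | $\mathcal{A} \leftarrow \{ i : p_i^\star + q_i^\star = 1 \}$ {chosen $K$ receivers} |
| 14: | $r \leftarrow$ the index $i$ with $p_i^\star = 1$ {the single reference} |
| 15: | Return: $\mathcal{A}$ and $r$. |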
4. Stage B: Geolocation Method
Stage B employs a Random Spatial Spectrum (RSS)-based direct localization algorithm that works directly with synchronized HF receiver data under the equivalent single-hop skywave model. Classical DPD [32,33,34] techniques are also single-step estimators that use the same received data as conventional two-step schemes, but they usually assume LOS or NLOS propagation and fixed sensor geometry and construct a global maximum-likelihood or subspace cost function by coherently combining array outputs over candidate emitter locations, sometimes together with delay/Doppler parameters. In contrast, the RSS stage in our framework evaluates a spatial spectrum on a joint grid of ground positions and virtual ionospheric heights by computing cross-spectral coherence terms between the reference and each selected receiver and aggregating them into a joint spatial spectrum. In this way, Stage B does not explicitly estimate any TOA/TDOA or FDOA parameters and remains consistent with the virtual-height skywave model and the receiver subset determined in Stage A.
Let $\mathcal{A}$ be the subset of sensor indices chosen in Stage A, with $\lvert\mathcal{A}\rvert = K$. Denote the reference sensor chosen by Stage A as $r$, and let $\mathcal{O} = \mathcal{A} \setminus \{r\}$ be the set of the remaining selected sensors (the ordinary sensors). By construction, $\lvert\mathcal{O}\rvert = K - 1$. For each selected sensor $m \in \mathcal{A}$, we have a stream of synchronized baseband samples $x_m[t]$ (as described in the system model). We can transform these to the frequency domain; let $N_B$ denote the length of the DFT we apply (this could be equal to the $N$ used in Section 2 or a different value as needed), and let $\mathcal{K}$ be the set of frequency bins we will use for localization (for example, we might exclude very low-frequency bins or very noisy bins). Then for each selected sensor $m$, we have frequency-domain data $X_m[k]$ for $k \in \mathcal{K}$.
Stage B proceeds by evaluating an RSS on a grid of candidate source positions and possible path heights, then finding the maximizer of this spectrum as the estimated source location. We define the search grid as follows:
Let $\mathcal{U} = \{u_p\}_{p=1}^{P}$ be a set of $P$ candidate source positions (particles) to test. These could be, for example, points on or near the Earth’s surface in the geographic region of interest (forming a latitude–longitude grid, possibly with an altitude dimension if altitude is uncertain).
Let $\mathcal{H} = \{h_q\}_{q=1}^{Q}$ be a set of $Q$ candidate virtual ionospheric heights for the signal paths. This grid allows the algorithm to consider different effective reflection heights for the paths.
With these grids, the RSS algorithm works as follows.
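Path-delay computation for candidates: For any hypothesized source position $u \in \mathcal{U}$ and any hypothesized virtual reflection height $h \in \mathcal{H}$, we can compute the expected propagation delay (in samples) from $u$ to sensor $m$ by an equation analogous to (2). Namely,
$$\tau_m(u,h) = \frac{2 f_s}{c}\sqrt{R_e^2 + (R_e+h)^2 - 2R_e(R_e+h)\cos\!\left(\frac{\tilde{\theta}_m(u)}{2}\right)}, \tag{23}$$
where $f_s$ is the sampling rate and $c$ is the speed of light.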
In Equation (23), $R_e$ is the Earth’s radius (same as before), and $\tilde{\theta}_m(u)$ represents the central angle (at the Earth’s center) between the candidate position $u$ and sensor $m$’s location $s_m$. This $\tilde{\theta}_m(u)$ can be computed in the same way as $\theta_m$ in (3), except that $u$ is now a variable point that need not equal the true source. Throughout the paper, $\theta_m$ is reserved for the central angle associated with the (unknown) true source position in the channel model of Equations (2) and (3), whereas $\tilde{\theta}_m(u)$ denotes the central angle associated with a hypothesized candidate location $u$ when evaluating the Stage B grid. Essentially, $\tau_m(u,h)$ is the modeled propagation delay (in samples) from the candidate source at $u$ to sensor $m$, assuming a single-hop path that reaches an altitude of $h$ before coming down to $m$. (This formulation also effectively models multi-hop paths by using an equivalent single hop with a larger $h$, just as in Stage A.)
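The delay model described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual mirror-reflection geometry (reflection at height `h` above the path midpoint) and a haversine central angle; the constants, the function names `central_angle` and `path_delay_samples`, and the sampling rate `fs` are illustrative assumptions rather than the paper's implementation.

```python
import math

R_E = 6_371_000.0    # mean Earth radius in metres (assumed value)
C0 = 299_792_458.0   # speed of light in m/s


def central_angle(lat1, lon1, lat2, lon2):
    """Central angle in radians between two surface points (inputs in degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    hav = (math.sin((p2 - p1) / 2) ** 2
           + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2.0 * math.asin(min(1.0, math.sqrt(hav)))


def path_delay_samples(theta, h, fs):
    """Delay in samples of a single-hop path with central angle theta (rad)
    and virtual reflection height h (m), assuming a mirror reflection above
    the path midpoint."""
    half_path = math.sqrt(R_E ** 2 + (R_E + h) ** 2
                          - 2.0 * R_E * (R_E + h) * math.cos(theta / 2.0))
    return fs * 2.0 * half_path / C0


# Hypothetical candidate and sensor, 8 degrees of longitude apart.
theta = central_angle(30.0, 110.0, 30.0, 118.0)
one_hop = path_delay_samples(theta, 300e3, fs=20_000.0)
# A larger virtual height stands in for a longer (e.g. multi-hop) path.
higher = path_delay_samples(theta, 600e3, fs=20_000.0)
```

Raising `h` for a fixed geometry lengthens the modeled path, which is how a larger virtual height can absorb a multi-hop route in the equivalent single-hop model.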
Using $\tau_m(u,h)$, we can generate hypothesized TDOAs for each pair of paths (one for sensor $m$ and one for the reference $r$) as follows. For each ordered pair of heights $(h_q, h_{q'}) \in \mathcal{H} \times \mathcal{H}$, define
$$\Delta\tau_m(u;q,q') = \tau_m(u,h_q) - \tau_r(u,h_{q'}). \tag{24}$$
In Equation (24), $\Delta\tau_m(u;q,q')$ is a hypothesized time-difference-of-arrival: it assumes that the signal to sensor $m$ traveled via an ionospheric layer of height $h_q$, and the signal to the reference $r$ traveled via height $h_{q'}$. By considering all combinations $(q,q')$, we allow the path to $m$ to involve a different number of hops (or a different reflection height) than the path to the reference. In HF propagation, for instance, one receiver might receive a one-hop signal while another simultaneously receives a two-hop signal from the same transmitter; including both $q$ and $q'$ in the grid allows the algorithm to match such situations by trying a larger virtual height for one path.
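As a small illustration of the height-pair hypotheses, the sketch below builds the full table of delay differences from two per-height delay lists. The function name `tdoa_table` and the delay values are made up for illustration; in practice the inputs would come from the path-delay model evaluated at one candidate position.

```python
def tdoa_table(tau_m, tau_r):
    """Return table[q][q2] = tau_m[q] - tau_r[q2], in samples.

    tau_m[q]  : modeled delay to sensor m for the q-th candidate height
    tau_r[q2] : modeled delay to the reference for the q2-th candidate height
    """
    return [[tm - tr for tr in tau_r] for tm in tau_m]


# Hypothetical delays (samples) for three candidate heights, lowest first.
tau_m = [60.0, 64.0, 75.0]
tau_r = [50.0, 53.0, 62.0]
table = tdoa_table(tau_m, tau_r)

# table[2][0]: sensor m via the largest virtual height (for example an
# equivalent two-hop path) while the reference uses the lowest one.
mixed_case = table[2][0]
```

Keeping all Q x Q entries, rather than only the diagonal, is what lets one receiver be matched with a different hop structure than the reference.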
Per-sensor coherence calculation: Next, for each candidate position $u$ and each sensor $m \in \mathcal{K}_o$, we quantify how well the data from sensor $m$ aligns with the data from the reference $r$ if the transmitter were at $u$ with a particular pair of path heights. We do this via a cross-spectral coherence calculation. Specifically, for a given $u$ and a given pair $(q,q')$, we compute
$$C_m(u;q,q') = \frac{1}{|\mathcal{B}|}\sum_{k \in \mathcal{B}} \frac{X_r[k]\,X_m^{H}[k]}{|X_r[k]|^2 + \varepsilon}\,\exp\!\left(-j\,\frac{2\pi k}{N_B}\,\Delta\tau_m(u;q,q')\right). \tag{25}$$
In Equation (25), the term $X_r[k]\,X_m^{H}[k]$ is the complex product of the reference’s DFT with the conjugate (denoted by $H$ for Hermitian transpose) of sensor $m$’s DFT at frequency bin $k$, that is, the cross-power spectrum between the reference and sensor $m$ at that frequency. We then divide by $|X_r[k]|^2 + \varepsilon$, where $|X_r[k]|^2$ is the power at the reference in that bin; this normalization down-weights frequency bins where the reference signal is very weak (the small positive term $\varepsilon$ is added to avoid division by zero or numerical instability when $|X_r[k]|^2$ is extremely small). Next, we multiply by $\exp(-j 2\pi k\,\Delta\tau_m(u;q,q')/N_B)$, which is a phase rotation compensating for the hypothesized TDOA $\Delta\tau_m(u;q,q')$ at that frequency. If the guess $\Delta\tau_m(u;q,q')$ is correct for the true propagation path difference, this phase factor will align the signal from sensor $m$ with the reference’s signal in that frequency bin. Finally, we average over all frequency bins $k \in \mathcal{B}$ (hence the factor $1/|\mathcal{B}|$) to obtain the coherence measure $C_m(u;q,q')$. In essence, $C_m(u;q,q')$ measures the correlation between sensor $m$’s received signal and the reference’s signal when we time-shift one relative to the other by the hypothesized delay $\Delta\tau_m(u;q,q')$. If the transmitter were truly at $u$ and the path heights were $h_q$ and $h_{q'}$ for $m$ and $r$ respectively, we would expect $C_m(u;q,q')$ to be large (its magnitude close to 1, say). If the hypothesis is wrong, the signals will not align well and the magnitude $|C_m(u;q,q')|$ will be small.
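The behavior of this coherence measure can be checked on synthetic spectra. The sketch below is a minimal pure-Python version of the computation described for Equation (25), under one assumed sign convention: sensor m's spectrum is the reference spectrum delayed by `true_dtau` samples, so the compensating rotation uses a negative exponent. The function name `coherence`, the DFT length, the bin range, and the delay values are illustrative assumptions.

```python
import cmath
import math
import random


def coherence(X_r, X_m, bins, dtau, n_dft, eps=1e-12):
    """Average normalised cross-spectrum after compensating a delay
    difference of dtau samples (sensor m minus reference)."""
    acc = 0j
    for k in bins:
        cross = X_r[k] * X_m[k].conjugate()      # cross-power spectrum
        weight = abs(X_r[k]) ** 2 + eps          # stabilised reference power
        acc += cross / weight * cmath.exp(-2j * math.pi * k * dtau / n_dft)
    return acc / len(bins)


random.seed(0)
n_dft, bins, true_dtau = 256, range(8, 120), 7.3

# Unit-magnitude reference spectrum with random phases (synthetic data).
X_r = [cmath.exp(1j * random.uniform(0.0, 2.0 * math.pi)) for _ in range(n_dft)]
# Sensor m receives the same signal true_dtau samples later.
X_m = [X_r[k] * cmath.exp(-2j * math.pi * k * true_dtau / n_dft)
       for k in range(n_dft)]

matched = abs(coherence(X_r, X_m, bins, true_dtau, n_dft))
mismatched = abs(coherence(X_r, X_m, bins, true_dtau + 20.0, n_dft))
```

With the correct delay hypothesis every bin rotates back to the same phase and the magnitude approaches 1; with a hypothesis that is off by 20 samples the per-bin phasors spread around the unit circle and largely cancel.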
We then aggregate over all height pairs to get an overall coherence score for sensor $m$ at location $u$. We define
$$S_m(u) = \sum_{q=1}^{Q}\sum_{q'=1}^{Q}\left|C_m(u;q,q')\right|. \tag{26}$$
Equation (26) sums the magnitudes of the coherence metrics over all combinations of assumed heights for sensor $m$ and the reference. One can view $S_m(u)$ as a sensor-specific “likelihood” or scoring function indicating how plausible it is, based on sensor $m$’s data (in comparison to the reference), that the transmitter is located at $u$. If $u$ is incorrect, we expect that no choice of $(q,q')$ will yield a consistently strong coherence, so the sum of magnitudes will remain low. If $u$ is near the true transmitter location, then there should be at least one pair $(q,q')$ (the one corresponding to the actual propagation scenario) that yields a strong coherent alignment between $m$ and $r$, thereby giving a large contribution to the sum.
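A toy example of this aggregation is given below. The helper `sensor_score` and both coherence tables are invented for illustration: one table has a single strongly coherent height pair, as expected near the true transmitter, and the other has only weak entries.

```python
def sensor_score(coh_table):
    """Sum of coherence magnitudes over all height pairs (rows: height index
    for sensor m, columns: height index for the reference)."""
    return sum(abs(c) for row in coh_table for c in row)


# Hypothetical 2 x 2 coherence tables for one sensor.
near_truth = [[0.05 + 0.02j, 0.04j],
              [0.93 - 0.30j, 0.06 + 0.0j]]   # one height pair aligns well
far_away = [[0.05 + 0.0j, 0.04j],
            [0.03 - 0.02j, 0.06 + 0.0j]]     # no height pair aligns

score_near = sensor_score(near_truth)
score_far = sensor_score(far_away)
```

Because magnitudes are summed, the weak entries add a small floor to every candidate, while a single well-matched pair lifts the score clearly above that floor.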
Joint spectrum computation and maximization: Finally, we combine the per-sensor scores $S_m(u)$ from all the selected sensors $m \in \mathcal{K}_o$ (i.e., all the ordinary sensors, excluding the reference) to form a joint spatial spectrum $J(u)$. We define
$$J(u) = \prod_{m \in \mathcal{K}_o} S_m(u). \tag{27}$$
In Equation (27), $J(u)$ is the product of the scores from each of the $K-1$ sensors, excluding the reference. The idea behind using a product is that $u$ should be a strong candidate only if all the selected sensors have good alignment (high coherence) when the transmitter is hypothesized to be at $u$. If even one sensor has a very low $S_m(u)$, the product will be zero or very small, indicating that $u$ is unlikely because it fails to match the data at that sensor. Thus, $J(u)$ will peak at the location $u$ that best explains the timing across the entire network of sensors simultaneously. Once we have computed $J(u)$ for all candidate points $u \in \mathcal{U}$, we pick the maximizer as our estimated source location:
$$\hat{u} = \arg\max_{u \in \mathcal{U}} J(u), \tag{28}$$
where the position $\hat{u}$ defined in Equation (28) is the output of Stage B’s coarse search. Often, it is beneficial to refine this estimate further. One common refinement technique is to take the initial estimate $\hat{u}$ and then perform a local search in its vicinity. For example, we can generate a small “cloud” of trial points around $\hat{u}$ by adding random perturbations (e.g., Gaussian-distributed offsets with covariance matrix $\Sigma_{\mathrm{loc}}$) to $\hat{u}$. Let $\mathcal{U}_{\mathrm{loc}}$ denote these local trial positions (with $\hat{u}$ possibly included). We then evaluate $J(u)$ on this smaller set of points and pick the best among them, which gives an updated $\hat{u}$. Because this second search covers only a local neighborhood (and we already have a good starting point), we can afford to make the perturbations fine-grained (small) and use more points $P_{\mathrm{loc}}$ if needed, to home in on a more precise estimate. This two-stage grid approach (a coarse global search followed by a finer local search around the best point) improves both accuracy and robustness without incurring the full cost of a uniformly dense global search.
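The coarse-then-local search can be sketched as follows. This is only an illustration of the search pattern: `toy_J` is a made-up stand-in for the joint spectrum (a product of two bell-shaped per-sensor scores), the perturbations are isotropic Gaussian with standard deviation `sigma` (that is, a diagonal covariance is assumed), and all names and numbers are hypothetical.

```python
import math
import random


def joint_spectrum(scores):
    """Product of the per-sensor scores at one candidate position."""
    return math.prod(scores)


def coarse_then_local(J, grid, sigma, n_local, rng):
    """Pick the best grid point, then the best point in a Gaussian cloud
    around it. The coarse winner is kept in the cloud, so the refined
    point is never worse than the coarse one."""
    coarse = max(grid, key=J)
    cloud = [coarse] + [(coarse[0] + rng.gauss(0.0, sigma),
                         coarse[1] + rng.gauss(0.0, sigma))
                        for _ in range(n_local)]
    return coarse, max(cloud, key=J)


def toy_J(u):
    # Stand-in spectrum peaking at (3.3, 6.7); not the real Stage B spectrum.
    d2 = (u[0] - 3.3) ** 2 + (u[1] - 6.7) ** 2
    return joint_spectrum([math.exp(-d2), math.exp(-0.5 * d2)])


grid = [(float(x), float(y)) for x in range(11) for y in range(11)]
coarse, refined = coarse_then_local(toy_J, grid, sigma=0.3, n_local=200,
                                    rng=random.Random(1))
```

One design note: with many sensors the product of small scores can underflow, in which case summing logarithms of the scores gives the same maximizer; with only a handful of ordinary sensors the plain product is usually adequate.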
Optionally, once we have a final location estimate $\hat{u}$, we might also be interested in the likely virtual reflection heights for each path. The algorithm naturally provides a way to estimate these as well: for each sensor $m \in \mathcal{K}_o$ and even for the reference $r$, we can find which height $h \in \mathcal{H}$ best explains that sensor’s data given that the source is at $\hat{u}$. One simple approach is to choose
$$\hat{h}_m = \arg\max_{h_q \in \mathcal{H}} \sum_{q'=1}^{Q}\left|C_m(\hat{u};q,q')\right|. \tag{29}$$
In other words, for each sensor $m$, we consider all coherence values where the height for sensor $m$ is fixed to $h_q$ and the reference’s height varies over $\mathcal{H}$. We sum the magnitudes over all $h_{q'} \in \mathcal{H}$ to get a total coherence score for sensor $m$ assuming a particular height $h_q$. We then pick the height that maximizes this score as $\hat{h}_m$. This can be interpreted as the algorithm’s estimate of the effective ionospheric reflection height for the path to sensor $m$. It is worth noting that this height estimate is a byproduct and not required for localization; however, it can be valuable for understanding the propagation scenario or for further processing.
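The height read-out described above amounts to a row-wise sum followed by an argmax. A minimal sketch, with an invented coherence table and a hypothetical helper name `estimate_height`:

```python
def estimate_height(coh_table, heights):
    """Pick the candidate height for sensor m whose row of coherence
    magnitudes (summed over all reference heights) is largest.

    coh_table[q][q2]: coherence at the estimated location for sensor-m
    height index q and reference height index q2.
    """
    row_scores = [sum(abs(c) for c in row) for row in coh_table]
    q_best = max(range(len(heights)), key=row_scores.__getitem__)
    return heights[q_best]


heights = [250e3, 300e3, 350e3]          # candidate virtual heights in metres
coh_at_estimate = [[0.10, 0.05, 0.02],   # hypothetical magnitudes
                   [0.20, 0.90, 0.10],
                   [0.05, 0.10, 0.03]]
h_hat = estimate_height(coh_at_estimate, heights)
```

Summing over the reference's heights marginalizes out the unknown reference path, so the estimate for sensor m does not depend on first fixing a height for the reference.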
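To provide a concise guide for implementation, the following Algorithm 2 summarizes the above RSS-based direct geolocation procedure.
| Algorithm 2 Stage B: RSS-based Direct Geolocation on $\mathcal{U} \times \mathcal{H}$ |
| Input: $\mathcal{K}$ with reference $r$; streams $X_m[k]$ for $m \in \mathcal{K}$, $k \in \mathcal{B}$; particles $\mathcal{U}$, heights $\mathcal{H}$; stabilizer $\varepsilon$. |
| 2: | for $u \in \mathcal{U}$ do |
| for $m \in \mathcal{K}_o$ do |
| 4: | $S_m(u) \leftarrow 0$. |
| for $(q,q') \in \{1,\dots,Q\}^2$ do |
| 6: | compute $\Delta\tau_m(u;q,q')$ via (24); |
| accumulate $C_m(u;q,q')$ via (25); |
| 8: | $S_m(u) \leftarrow S_m(u) + |C_m(u;q,q')|$. |
| end for |
| 10: | end for |
| $J(u) \leftarrow \prod_{m \in \mathcal{K}_o} S_m(u)$. |
| 12: | end for |
| $\hat{u} \leftarrow \arg\max_{u \in \mathcal{U}} J(u)$; optionally refine locally around $\hat{u}$ and update $\hat{u}$. |
| 14: | Output: $\hat{u}$ (and, optionally, $\hat{h}_m$ via (29)). |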
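The loop structure of Algorithm 2 can be exercised end to end on synthetic data. The following is a hedged sketch rather than the authors' implementation: it assumes the mirror-reflection delay geometry, noise-free unit-magnitude spectra, and invented sensor coordinates, heights, and sampling rate. The names `rss_stage_b`, `delay`, and `central_angle` are hypothetical.

```python
import cmath
import math

R_E = 6_371_000.0    # mean Earth radius in metres (assumed value)
C0 = 299_792_458.0   # speed of light in m/s


def central_angle(a, b):
    """Central angle (rad) between two (lat, lon) points given in degrees."""
    la, lb = math.radians(a[0]), math.radians(b[0])
    dlon = math.radians(b[1] - a[1])
    hav = (math.sin((lb - la) / 2) ** 2
           + math.cos(la) * math.cos(lb) * math.sin(dlon / 2) ** 2)
    return 2.0 * math.asin(min(1.0, math.sqrt(hav)))


def delay(u, s, h, fs):
    """Assumed mirror-reflection single-hop delay (samples) from u to s."""
    th = central_angle(u, s)
    half = math.sqrt(R_E ** 2 + (R_E + h) ** 2
                     - 2.0 * R_E * (R_E + h) * math.cos(th / 2.0))
    return 2.0 * fs * half / C0


def rss_stage_b(X, ref, sensors, grid, heights, bins, n_dft, fs, eps=1e-9):
    """Grid search of Algorithm 2; returns (best candidate, {candidate: J})."""
    ordinary = [m for m in X if m != ref]
    # Delay-independent factor: normalised cross-spectrum per sensor and bin.
    w = {m: {k: X[ref][k] * X[m][k].conjugate() / (abs(X[ref][k]) ** 2 + eps)
             for k in bins} for m in ordinary}
    J = {}
    for u in grid:
        tau_r = [delay(u, sensors[ref], h, fs) for h in heights]
        J[u] = 1.0
        for m in ordinary:
            score = 0.0
            for h in heights:
                tau_m = delay(u, sensors[m], h, fs)
                for tr in tau_r:
                    dtau = tau_m - tr                      # TDOA hypothesis
                    c = sum(w[m][k] * cmath.exp(-2j * math.pi * k * dtau / n_dft)
                            for k in bins) / len(bins)     # coherence
                    score += abs(c)                        # per-sensor score
            J[u] *= score                                  # joint spectrum
    return max(J, key=J.get), J


# Synthetic scenario (all values hypothetical).
sensors = {0: (30.0, 110.0), 1: (35.0, 120.0), 2: (25.0, 118.0), 3: (32.0, 125.0)}
true_u, fs, n_dft, bins = (30.0, 118.0), 20_000.0, 256, range(8, 120)
heights = [250e3, 300e3, 350e3]
true_h = {0: 300e3, 1: 300e3, 2: 350e3, 3: 250e3}   # paths differ in height
S = [cmath.exp(1j * 0.37 * k * k) for k in range(n_dft)]  # unit-magnitude spectrum
X = {m: [S[k] * cmath.exp(-2j * math.pi * k
                          * delay(true_u, sensors[m], true_h[m], fs) / n_dft)
         for k in range(n_dft)] for m in sensors}
grid = [(lat, lon) for lat in (29.0, 29.5, 30.0, 30.5, 31.0)
        for lon in (117.0, 117.5, 118.0, 118.5, 119.0)]

est, J = rss_stage_b(X, 0, sensors, grid, heights, bins, n_dft, fs)
```

In this noise-free setup the true grid point is guaranteed a per-sensor score of at least about 1, because its correct height pair aligns every bin. Whether it is also the unique maximizer depends on the sensor geometry, bandwidth, and height grid, so the sketch does not claim a particular value for `est`.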
From the pseudocode above, the dominant computational cost in Stage B comes from evaluating the spatial spectrum $J(u)$ on the joint grid $\mathcal{U} \times \mathcal{H}$ using the selected receiver subset $\mathcal{K}$. For each candidate location $u \in \mathcal{U}$ and each receiver $m \in \mathcal{K}_o$, the algorithm loops over all height pairs $(q,q')$ to compute delay hypotheses and coherence terms and then aggregates them into $S_m(u)$ and $J(u)$. In other words, the overall complexity grows approximately linearly with the number of candidate locations $P$ and selected receivers $K$, and quadratically with the number of virtual-height candidates $Q$ through the $(q,q')$ loop; counting the bin average in (25), this is on the order of $O(P\,(K-1)\,Q^2\,|\mathcal{B}|)$ operations. In the emergency eVTOL scenarios considered in Section 5, Stage A restricts $\mathcal{K}$ to only five receivers and confines $\mathcal{U}$ to a small search rectangle in the region of interest, while $\mathcal{H}$ contains only a few tens of virtual-height candidates consistent with Section 2, so the grid search remains numerically stable and computationally tractable in our experiments.
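To make the scaling concrete, the short sketch below counts the complex multiply-accumulate operations of the nested loops. Only the receiver count of five comes from the text; the grid size, the number of heights, and the number of frequency bins are hypothetical values chosen for illustration, and `stage_b_cost` is an illustrative helper.

```python
def stage_b_cost(P, K, Q, n_bins):
    """Complex multiply-accumulates for one pass over the grid: P candidates,
    K - 1 ordinary receivers, Q * Q height pairs, n_bins frequency bins."""
    return P * (K - 1) * Q * Q * n_bins


# K = 5 as stated in the text; P, Q and n_bins are assumed values.
full = stage_b_cost(P=10_000, K=5, Q=20, n_bins=512)
fewer_heights = stage_b_cost(P=10_000, K=5, Q=10, n_bins=512)
ratio = full // fewer_heights
```

Halving the number of height candidates cuts the count by a factor of four, whereas halving the number of candidate positions only halves it, which is why keeping the height grid small matters most for the cost.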