Target Detection for Synthetic Aperture Radiometer Based on Satellite Formation Flight

Synthetic aperture interferometers formed by satellite formations have been adopted to improve spatial resolution. Because the number of satellites and the integration time are limited, the resulting sparse baselines can distort the reconstructed images, which generates false targets or causes true targets to be missed. When a target on the Earth is observed from a geostationary orbit, it usually occupies only one pixel and is almost submerged by noise. Considering the slowly varying characteristics of the observation area, and combining historical observation data with the motion characteristics of the target itself, a target detection method based on multi-frame snapshot images is proposed. Firstly, the observation background is estimated using multi-frame historical data, and background elimination is used to suppress the background noise. Then, potential targets are selected using the local brightness temperature characteristics of the targets. Lastly, the target motion tracks are applied to erase false targets and correct the positions of missed targets. Simulation experiments have been conducted, and the false alarm rate and the missing alarm rate are counted for randomly distributed targets.


Introduction
A synthetic aperture interferometer radiometer (SAIR) is a typical imaging system that samples in a spatial frequency domain and obtains spatial domain images via inverse Fourier transform. SAIRs have the advantage of having fewer requirements in terms of weather conditions and operating time, making them widely used in meteorology, ocean exploration, and other remote sensing fields. The first on-board SAIR, the Microwave Imaging Radiometer with Aperture Synthesis (MIRAS) of the Soil Moisture and Ocean Salinity (SMOS) mission, worked on a low Earth orbit (LEO) of 600-800 km, covering a swath width of 1000 km, and had a ground spatial resolution of less than 50 km [1,2]. In order to provide continuous dynamic monitoring for regional areas, some geostationary orbit (GEO) SAIRs were proposed. These included GeoSTAR from the Jet Propulsion Laboratory of NASA [3], the Geostationary Atmospheric Sounder (GAS) from the European Space Research and Technology Center of the ESA [4], and the Geostationary Interferometric Microwave Sounder (GIMS) from the National Space Science Center, Chinese Academy of Sciences [5].
In recent years, SAIRs have started being used for target detection and recognition. Liu [6] demonstrated that the root-mean-square error of reconstructed images with targets with rapid brightness temperature (BT) changes, such as tropical cyclones (TCs), was closely related to observation frequency and imaging period. Chen [7] proposed a method for detecting higher-order moving targets using a rotating scanning SAIR (RS-SAIR) equipped with a linear sparse array. Yang and Hu [8] utilized the kernel method to predict the observation background by designing a new robust loss function, and a constant false to compute the baseline vectors between the satellites. Then, the (u, v) coverage percentage based on relative orbit elements is established.
To detect moving ships on the ocean, the satellites are chosen to operate in the geostationary orbit for the continuous exploration of the observation area. For the convenience of inter-satellite communication and relative measurement, the configuration of a subsatellite circle-shaped distribution is selected. At the same time, the satellites of this configuration can provide good instantaneous (u, v) coverage.
Assuming that the mother satellite has the orbit elements $(a_0, e_0, i_0, \Omega_0, \omega_0, M_0)$ and the kth daughter satellite has the orbit elements $(a_k, e_k, i_k, \Omega_k, \omega_k, M_k)$, the relative orbit elements can be calculated using Equation (1), where $n_* = \sqrt{\mu / a_*^3}$ is the mean motion of the satellite; D denotes the relative average drift rate; $\Delta e = (\Delta e_x, \Delta e_y)$ represents the relative eccentricity vector; $\Delta i = (\Delta i_x, \Delta i_y)$ is the relative inclination vector; and ∏ is the difference in the mean argument of latitude.
As Figure 1 shows, the subpoints of the daughter satellites are distributed around the mother satellite on a circle of radius r, and the initial phase of the kth satellite is $\varphi_k$. The subsatellite plane track of the kth satellite is given in Equation (2), which is equivalent to the relative orbit elements $(D_k, \Delta e_{xk}, \Delta e_{yk}, \Delta i_{xk}, \Delta i_{yk}, ∏_k)$. The formation's stability requires $D_k = 0$ and $∏_k = 0$, and the remaining elements are listed in Equation (3). Therefore, by configuring different $\varphi_k$ for the satellites, we can obtain different samples in the (u, v) domain, as presented in Equation (4).
Building on the optimization objective function proposed by Cornwell [19], a new optimization function is introduced in Equation (5):
$$m(r_1, r_2, \ldots, r_n) = -\sum_{i,j,k,l} \log\left(\left|u_{i,j} - u_{k,l}\right|\right) \qquad (5)$$
From Equation (5), we can obtain the formation configuration solution with the optimal (u, v) coverage.
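The link between the satellite phases and the (u, v) coverage can be illustrated with a small sketch. It is not the authors' optimizer: it assumes that the mother satellite sits at the origin of the subsatellite plane, that the kth daughter sits at $r(\cos\varphi_k, \sin\varphi_k)$, and that the cost is the negative sum of log distances between (u, v) samples, in the spirit of Cornwell's criterion. The function names, the wavelength of 0.0126 m, and the phase list are hypothetical.

```python
import math
from itertools import combinations

def formation_positions(radius_m, phases_rad):
    """Mother satellite at the origin, daughters on a circle (assumed geometry)."""
    return [(0.0, 0.0)] + [(radius_m * math.cos(p), radius_m * math.sin(p))
                           for p in phases_rad]

def uv_samples(positions, wavelength_m):
    """Pairwise baselines in wavelengths; each pair yields (u, v) and (-u, -v)."""
    samples = []
    for (x1, y1), (x2, y2) in combinations(positions, 2):
        u, v = (x2 - x1) / wavelength_m, (y2 - y1) / wavelength_m
        samples += [(u, v), (-u, -v)]
    return samples

def coverage_cost(samples, eps=1e-9):
    """Negative sum of log distances between samples (lower means more spread).

    eps avoids log(0) when two samples coincide (redundant baselines).
    """
    return -sum(math.log(math.hypot(u1 - u2, v1 - v2) + eps)
                for (u1, v1), (u2, v2) in combinations(samples, 2))

# Hypothetical, non-optimized phases for 10 daughter satellites (radians).
hypothetical_phases = [0.0, 0.5, 1.3, 1.9, 2.8, 3.3, 4.1, 4.6, 5.2, 5.9]
samples = uv_samples(formation_positions(1000.0, hypothetical_phases), 0.0126)
cost = coverage_cost(samples)
```

A phase optimizer would evaluate `coverage_cost` for candidate phase sets and keep the set with the lowest value; coincident samples are penalized heavily through the `eps` term.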

Resolution and Sensitivity with Imaging
In this section, we firstly discuss the principle of SAIR imaging, and then the resolution and sensitivity of the SAIR are introduced.

Imaging
The imaging principle of the SAIR is based on the van Cittert-Zernike theorem, which states that the spatial coherence function, also called the visibility, is proportional to the Fourier components of the brightness distribution. The visibility can be measured via cross-correlation between a pair of spatially separated antenna elements. The spatial domain is represented by the direction cosines of the incidence angle, $(\xi, \eta) = (\sin\theta\cos\varphi, \sin\theta\sin\varphi)$, and the spatial frequency domain is represented by the baselines (u, v), which are the distances between antennas measured in wavelengths. The visibility is defined in Equation (6), where K is a constant related to the receiver's characteristics and the system bandwidth; $G(\xi, \eta)$ is the antenna power pattern; $T_B(\xi, \eta)$ is the brightness temperature in kelvin; $r_h$ is the fringe-washing function; and $f_0$ is the central frequency.
By applying the simplifying definition of Equation (7) and neglecting spatial decorrelation effects (i.e., $r_h = 1$), the brightness temperature map can be reconstructed by computing the inverse Fourier transform of the measured visibility, as in Equation (8). However, with a limited number of antennas, the (u, v) plane is only sampled at discrete points, and the inadequate sampling of the frequency space severely degrades the reconstructed image.
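The effect of incomplete (u, v) sampling can be shown with a toy example. The sketch below is a simplification, not the paper's imaging chain: it assumes point sources, K = 1, a flat antenna pattern, no obliquity factor, and $r_h = 1$, so the visibility reduces to a plain Fourier sum. The grid, baselines, and function names are hypothetical.

```python
import cmath

def visibilities(sources, baselines):
    """V(u, v) = sum over point sources of T * exp(-j 2 pi (u xi + v eta))."""
    return [sum(t * cmath.exp(-2j * cmath.pi * (u * xi + v * eta))
                for xi, eta, t in sources)
            for u, v in baselines]

def reconstruct(vis, baselines, grid):
    """Direct inverse Fourier sum over the sampled (u, v) points only."""
    n = len(baselines)
    return [[sum(V * cmath.exp(2j * cmath.pi * (u * xi + v * eta))
                 for V, (u, v) in zip(vis, baselines)).real / n
             for xi in grid]            # columns follow xi
            for eta in grid]            # rows follow eta

grid = [k * 0.005 for k in range(-4, 5)]       # 9 x 9 direction-cosine grid
source = [(grid[5], grid[2], 1.0)]             # one point target (xi, eta, T)

# Well-filled coverage: a 5 x 5 lattice of baselines.
full = [(u, v) for u in range(-40, 41, 20) for v in range(-40, 41, 20)]
image_full = reconstruct(visibilities(source, full), full, grid)

# Sparse coverage: baselines along u only, so eta cannot be resolved.
sparse = [(u, 0) for u in range(-40, 41, 20)]
image_sparse = reconstruct(visibilities(source, sparse), sparse, grid)
```

With the full lattice the image has a single peak at the source pixel `image_full[2][5]`. With baselines along u only, every pixel in column 5 reaches the same value, which is a simple instance of the false targets that sparse baselines can create.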

Spatial Resolution and Sensitivity
The satellite formation with SAIR forms a spatial interferometric array, and its spatial resolution is determined by the formation distance, as given in Equation (9).
Sensors 2023, 23, 6348
As Equation (9) shows, the spatial resolution is inversely proportional to the aperture size, where res is the spatial resolution, λ is the detection wavelength, and D is the aperture diameter. For a circle-shaped subsatellite distribution, D equals 2r when two satellites are diametrically opposite on the circle.
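As a rough numerical check of the inverse proportionality, the sketch below evaluates λ/D for a formation with r = 1 km. The exact form of Equation (9), the operating frequency of 23.8 GHz (the text only states K-band), and the nominal geostationary altitude of 35,786 km are assumptions made for illustration.

```python
import math

C = 299_792_458.0            # speed of light, m/s
GEO_ALTITUDE_M = 35_786e3    # nominal geostationary altitude (assumed)

def angular_resolution_rad(freq_hz, aperture_m):
    """Assumed diffraction-type relation: res = wavelength / D."""
    return (C / freq_hz) / aperture_m

def ground_resolution_m(freq_hz, aperture_m, range_m=GEO_ALTITUDE_M):
    """Small-angle footprint of one resolution cell at nadir."""
    return angular_resolution_rad(freq_hz, aperture_m) * range_m

# D = 2r with r = 1 km; 23.8 GHz is a hypothetical K-band frequency.
res_deg = math.degrees(angular_resolution_rad(23.8e9, 2 * 1000.0))
ground_m = ground_resolution_m(23.8e9, 2 * 1000.0)
```

Under these assumptions the angular resolution is on the order of $10^{-4}$ degrees and the nadir footprint is a few hundred meters; doubling D halves both values.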
For a single radiometer, the sensitivity is given by Equation (10), where $T_s$ represents the system noise temperature, $T_R$ represents the receiver noise temperature, $T_A$ represents the average brightness temperature, B is the signal bandwidth of the receiver, and $\tau_a$ is the integration time for the visibility data. For a SAIR formed by a satellite formation with $n_a$ antennas, in which the visibility data are averaged over a time $\tau_a$ and the whole observation covers a time interval $\tau_0$, the sensitivity of the array is given by Equation (11), where A is the effective collecting area of the elemental antenna and $A_{syn}$ is the effective area of the beam created by the antenna array. By increasing the number of antennas $n_a$ and extending the time $\tau_a$, better sensitivity can be obtained.
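The dependence of the single-radiometer sensitivity on bandwidth and integration time can be sketched with the textbook ideal total-power radiometer relation, $\Delta T = T_s / \sqrt{B \tau_a}$ with $T_s = T_R + T_A$. Whether Equation (10) has exactly this form is an assumption, and the numerical values below (500 K receiver noise, 300 K scene, 200 MHz bandwidth) are hypothetical; only the 5 s integration time is taken from the simulation scenario.

```python
import math

def radiometer_sensitivity(t_receiver_k, t_antenna_k, bandwidth_hz, tau_s):
    """Ideal total-power radiometer: delta_T = (T_R + T_A) / sqrt(B * tau)."""
    return (t_receiver_k + t_antenna_k) / math.sqrt(bandwidth_hz * tau_s)

# Hypothetical receiver parameters with a 5 s integration time.
delta_t = radiometer_sensitivity(500.0, 300.0, 200e6, 5.0)

# Quadrupling the integration time halves the noise-equivalent temperature.
delta_t_long = radiometer_sensitivity(500.0, 300.0, 200e6, 20.0)
```

The square-root law is the reason a longer integration time improves sensitivity only slowly, which matters when the target moves between frames.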

Image SNR
The targets to be detected are moving ships. Because of the limited spatial resolution, each target occupies only one pixel or subpixel and is treated as a point target. An image reconstructed from sparse sampling has a poor signal-to-noise ratio (SNR). Therefore, a neighborhood is defined around each potential target, and the SNR is calculated within this neighborhood instead of over the whole image. The SNR of a potential target's neighborhood is calculated using Equation (12), where $T_{tar}$ is the brightness temperature of the potential target and $T_{around}$ is the average brightness temperature of the potential target's neighborhood, which is chosen as an 11 × 11 pixel area.
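The neighborhood-based calculation can be sketched as follows. The expression used here is an assumption and may differ from Equation (12): it takes the contrast between $T_{tar}$ and $T_{around}$, divides it by the standard deviation of the neighborhood, and expresses the ratio in decibels. The function name and the test image are hypothetical.

```python
import math
import statistics

def neighborhood_snr_db(image, row, col, half=5):
    """Assumed SNR: |T_tar - T_around| over the neighborhood std, in dB.

    half=5 gives the 11 x 11 neighborhood; the window is clipped at the
    image border and the target pixel itself is excluded from the statistics.
    """
    rows, cols = len(image), len(image[0])
    around = [image[r][c]
              for r in range(max(0, row - half), min(rows, row + half + 1))
              for c in range(max(0, col - half), min(cols, col + half + 1))
              if (r, c) != (row, col)]
    t_around = statistics.fmean(around)
    sigma = statistics.pstdev(around)
    return 20.0 * math.log10(abs(image[row][col] - t_around) / sigma)

# Hypothetical 11 x 11 scene: a 299/301 K checkerboard with a 255 K target.
scene = [[299.0 if (r + c) % 2 == 0 else 301.0 for c in range(11)]
         for r in range(11)]
scene[5][5] = 255.0
snr_db = neighborhood_snr_db(scene, 5, 5)
```

In this example the neighborhood mean is 300 K and its standard deviation is 1 K, so the result equals 20 log10(45).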

Target Detection Algorithm and Process
The synthetic aperture radiometer based on satellite formation flight involves several key technologies, including satellite formation design and high-precision relative position measurement, the calibration of the instruments' amplitude and phase to maintain consistency and stability, and image reconstruction and target detection under sparse sampling. Target detection, especially for point targets under sparse sampling, is one of these key technologies. This paper focuses on the detection of moving ships using sparse sampling.
As Equation (8) indicates, all frequency components must be sampled to achieve a faithful reconstruction of the image. Otherwise, the reconstructed image is severely degraded, which leads to negative effects such as false targets and missed true targets due to sparse baselines and target movement. These effects make it difficult to detect targets from a single snapshot. However, if a series of snapshots of the observed area can be acquired in a short period of time, the background remains relatively consistent across the snapshots, and the only difference is the position of the moving targets.
Under these conditions, a target detection method based on image sequences is put forward. Firstly, background estimation and elimination are performed, which removes the aliasing noise of the background to some extent. Then, a local extremum operation is applied to detect potential targets. Lastly, targets are selected based on motion tracking to remove the noise caused by the targets' movement. The overall process is illustrated in Figure 2, where varying colors indicate different levels of brightness temperature and the targets to be detected are flagged with rectangular boxes.

Background Estimation and Elimination
Assume that targets have appeared in the observation area at moment k. Over a short period of time, the background brightness temperature of the observation area, $T_{back}$, can be estimated using the historical data and the average value of the N-frame reconstructed images during that period, as Equation (13) shows. Taking into account the slowly varying characteristics of the observation area over a short and continuous period, the background elimination operation is used to remove the background interference. As Equation (14) shows, $T^{k}_{obj}$ is the result after the background elimination operation.
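The two steps can be sketched in a few lines. The sketch assumes that the background estimate is a plain pixel-wise mean of N target-free frames and that elimination is a pixel-wise subtraction; the exact forms of Equations (13) and (14), the function names, and the 2 × 2 test frames are assumptions.

```python
def estimate_background(frames):
    """Pixel-wise mean over N reconstructed frames (assumed form of T_back)."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]

def eliminate_background(frame, background):
    """Pixel-wise difference between the current frame and the background."""
    return [[t - b for t, b in zip(frame_row, back_row)]
            for frame_row, back_row in zip(frame, background)]

# Hypothetical 2 x 2 brightness temperature frames in kelvin.
history = [[[300.0, 298.0], [301.0, 299.0]],
           [[300.0, 300.0], [299.0, 299.0]]]
current = [[300.0, 299.0], [255.0, 299.0]]   # a cold ship in the lower-left pixel

t_back = estimate_background(history)
t_obj = eliminate_background(current, t_back)
```

After subtraction only the ship pixel keeps a large (negative) residual, which is what the later local extremum step looks for.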

Potential Target Selection Based on Local Extremum Operation
After the background elimination operation, noise caused by the movement of the targets appears, and the aliasing noise of the targets still exists. Considering the brightness temperature characteristics of the targets, a local extremum operation is used to select potential targets. A dilation operation is first applied to the image. We choose a structuring element B that covers the 7 × 7 neighborhood of the central element; the elements of B are all ones, except for the center element, which is zero. Equations (16) to (18) describe the process of selecting potential targets, where $c \in \{-1, +1\}$, $T^{k}_{dilation}(l, m)$ is the result after dilation, and $T^{k}_{extremum}(l, m)$ is the result of non-extremum suppression. When the parameter c is set to −1, the local minimum points are selected; otherwise, the local maximum points are selected. $P^{k}_{l,m}$ records the position (l, m) and the local extremum value $T^{k}_{extremum}(l, m)$ of the potential target. The number of potential targets $P^{k}_{l,m}$ depends strongly on the detection threshold ϑ and the standard deviation $\sigma(E^{k}_{obj})$. If a low threshold is chosen, all targets can be detected, but noise points will also be treated as targets; if a high threshold is chosen, some targets may be missed. A reasonable threshold therefore needs to be set based on the target and noise characteristics.
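A minimal sketch of the selection step is given below. It assumes that a pixel is kept when c times its value strictly exceeds the dilation of c times the image over the 7 × 7 neighborhood without the center, and when it also exceeds ϑ times the standard deviation of the whole residual image. This thresholding rule, the default ϑ = 3, and the function name are assumptions rather than the exact content of Equations (16) to (18).

```python
import statistics

def select_potential_targets(image, c=-1, theta=3.0, half=3):
    """Return (l, m, value) for local extrema that pass the threshold.

    c = -1 selects local minima (cold ships on a warm sea), c = +1 maxima.
    half = 3 gives the 7 x 7 neighborhood; the window is clipped at borders.
    """
    rows, cols = len(image), len(image[0])
    sigma = statistics.pstdev([v for row in image for v in row])
    targets = []
    for l in range(rows):
        for m in range(cols):
            # Dilation with a structuring element whose center is zero:
            # the maximum of c*T over the neighbors, excluding the pixel itself.
            neighbours = [c * image[r][s]
                          for r in range(max(0, l - half), min(rows, l + half + 1))
                          for s in range(max(0, m - half), min(cols, m + half + 1))
                          if (r, s) != (l, m)]
            value = c * image[l][m]
            if value > max(neighbours, default=float("-inf")) and value > theta * sigma:
                targets.append((l, m, image[l][m]))
    return targets

# Hypothetical residual image: one strong cold target and one weak noise dip.
residual = [[0] * 9 for _ in range(9)]
residual[4][4] = -45
residual[1][7] = -2
potential = select_potential_targets(residual)
```

The weak dip at (1, 7) is rejected because a deeper minimum lies inside its 7 × 7 neighborhood, so only the pixel at (4, 4) is returned.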
It should be noted that the contrast in brightness temperature between targets and aliasing noise may decrease in certain frames after background elimination, which may lead to most targets being missed. To address this issue, we calculate the sum of the local extremum values of the potential targets in each frame, $T^{k}_{sumextremum}$, and analyze this quantity over consecutive frames: if outliers exist, the corresponding frame is considered a missed-detection moment, and all potential targets in this frame are neglected.
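The text does not state which outlier criterion is applied to $T^{k}_{sumextremum}$. The sketch below uses one common choice as an assumption: a frame is flagged when its sum deviates from the median of all frames by more than k times the median absolute deviation. The factor k = 3, the function name, and the sample values are hypothetical.

```python
import statistics

def missed_detection_frames(sum_extremum, k=3.0):
    """Indices of frames whose extremum sum is an outlier (median/MAD rule)."""
    med = statistics.median(sum_extremum)
    mad = statistics.median([abs(x - med) for x in sum_extremum])
    return [i for i, x in enumerate(sum_extremum) if abs(x - med) > k * mad]

# Hypothetical per-frame sums of local extremum values (kelvin).
sums = [-450, -440, -455, -120, -448, -452]
missed = missed_detection_frames(sums)
```

Here the fourth frame (index 3) has a much weaker total contrast than the others and is the only one flagged, so its potential targets would be discarded before track matching.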

Target Confirmation Based on the Motion Track
The motion characteristics of the potential targets are combined to confirm their validity. Herein, we introduce the hypothesis that the target's motion track is continuous, without any abrupt changes. Based on this hypothesis, a matching matrix $M^{k+1}_{k} = [a_{ij}]$ is built for the potential targets in two adjacent frames at non-missed-detection moments. The matrix has as many rows as there are potential targets $P^{k}_{l,m}$ and as many columns as there are potential targets $P^{k+1}_{l,m}$.
The parameter $a_{ij}$ in the matrix records the relationship between the ith potential target in $P^{k}_{l,m}$ and the jth potential target in $P^{k+1}_{l,m}$. It depends on the positions of the two targets, the maximum velocity of targets $\upsilon_{max}$, and the time interval τ between two frames, and $a_{ij}$ can be seen as the velocity of the target. For a given target, the velocity varies gradually, without any sudden changes.
The matching matrix $M^{k+1}_{k}$ records the motion information of potential targets. K + 1 frames of images generate K matching matrices. By analyzing the series of matching matrices $M^{k+1}_{k}$, the target's motion track can be obtained. Furthermore, the target's position can be amended, and the target's position at missed-detection moments can be estimated using the motion track.
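One way to build such a matrix is sketched below. The definition of $a_{ij}$ is an assumption: the entry holds the speed implied by moving from target i in frame k to target j in frame k + 1, and it is set to `None` when that speed exceeds $\upsilon_{max}$. Using `None` for infeasible pairs, the function name, and the sample positions are choices made for this illustration.

```python
import math

def matching_matrix(targets_k, targets_k1, v_max, tau, pixel_m):
    """Rows: targets in frame k; columns: targets in frame k + 1.

    Each entry is the implied speed in m/s, or None if it exceeds v_max.
    """
    matrix = []
    for l1, m1 in targets_k:
        row = []
        for l2, m2 in targets_k1:
            speed = math.hypot(l2 - l1, m2 - m1) * pixel_m / tau
            row.append(speed if speed <= v_max else None)
        matrix.append(row)
    return matrix

# Hypothetical pixel positions; 227 m pixels, 20 s frame period, 13 m/s limit.
frame_k = [(10, 10), (40, 40)]
frame_k1 = [(10, 11), (41, 40), (80, 80)]
m = matching_matrix(frame_k, frame_k1, v_max=13.0, tau=20.0, pixel_m=227.0)
```

Each row here has exactly one feasible entry, so both targets are matched uniquely, and the isolated detection at (80, 80) has no feasible predecessor and would be a candidate false target. Chaining the feasible entries of successive matrices gives candidate tracks, and a track with a nearly constant speed is consistent with the continuity hypothesis.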

Satellite Formation and Geometry Configuration
A SAIR composed of 11 satellites, namely 1 mother satellite and 10 daughter satellites, was used for the simulation. Each satellite carried a Y-shaped SAIR to observe the area of interest, and the frequency was assigned to the K-band as an example. In order to satisfy the interferometric requirements and the baseline measurement precision, the time synchronization error between satellites should correspond to less than half of the wavelength. In addition, each satellite was equipped with a high-stability atomic clock, a GNSS receiver, and a communication antenna to establish the inter-satellite link. All daughter satellites communicate with the mother satellite. Differential positioning technology based on GNSS was used to obtain precise and accurate relative positioning among the satellites. All daughter satellites receive the calibration signal sent by the mother satellite to perform amplitude and phase calibration. The related engineering solutions are described in [15,20].
The flight formation of the satellites was maintained on a circle with a radius of approximately 1 km, and the phases of the daughter satellites were obtained by optimizing Equation (5). The optimization result and the orbit elements are presented in Tables 1 and 2.

Simulation Scenario
Fifty moving ships with hundred-meter scales were randomly distributed on the ocean. The brightness temperature of the ocean was about 292~302 K, and the brightness temperature of the ships was about 250~260 K. The targets moved at a uniform linear speed of 20~25 knots during a short observation interval. The integration time was 5 s, and the observation interval was about 20 s; that is, the frame period of the imaging was 20 s. We used 12 h of historical visibility sampling data without targets and 20 s of visibility sampling data with targets to generate the reconstructed image.
The antenna aperture size was 0.15 m with a 4.77° field of view (FOV), enabling it to observe an area with a diameter of 3000 km on the Earth's surface. The spatial resolution of the system was 3.6 × 10^−4°, and the ground resolution was 227 m. Table 3 lists the primary parameters of the simulation scenario. We conducted multiple sets of random experiments, and each experiment generated 10 dynamic frames with targets moving 1 pixel per frame. The simulation period was about 200 s.
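The statement that targets move about 1 pixel per frame can be checked from the quoted speed, frame period, and ground resolution. The sketch below only performs this unit conversion; the function name is hypothetical.

```python
KNOT_MS = 1852.0 / 3600.0   # one knot in meters per second

def pixels_per_frame(speed_knots, frame_period_s, pixel_m):
    """Displacement between consecutive frames, expressed in pixels."""
    return speed_knots * KNOT_MS * frame_period_s / pixel_m

# 20 to 25 knots, 20 s between frames, 227 m ground resolution.
slow = pixels_per_frame(20.0, 20.0, 227.0)
fast = pixels_per_frame(25.0, 20.0, 227.0)
```

Both values are close to one pixel (roughly 0.9 and 1.1), which is consistent with the scenario description and explains why adjacent-frame matching can rely on small displacements.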
Taking one set of experiments for analysis, the brightness temperature of the observation area is shown in Figure 4a, and the reconstructed brightness temperature image using 12 h historical sampling data is shown in Figure 4b.
In Figure 5, the ten original frames are shown in column (a), and two of the targets in part of the observation area are flagged with rectangles as examples. Column (b) depicts the images reconstructed by the SAIR based on the satellite formation flight; these two targets were buried in the noise. Column (c) shows the images after the background elimination operation, in which the potential targets are highlighted but are still not easy to detect. Column (d) shows the potential targets after the local extremum operation. Finally, the motion characteristics of the 10 images were combined, the targets were confirmed, and the trajectories were determined, as shown in Figure 6.

Target Detection Performance
A total of 30 sets of simulation experiments were conducted to test the method's performance. In addition to the SNR, the false alarm rate and the missing alarm rate were introduced to analyze the detection results. The false alarm rate is the percentage of false targets among all detected targets, and the missing alarm rate is the percentage of missed targets among all true targets. In Figure 7, the blue curve represents the average SNR of the reconstructed images, the green curve represents the average SNR of the images after background elimination, and the red curve represents the average SNR of the images after the local extremum operation. The local extremum operation retained the minimum value in each local area and set the other values to zero. In this way, the noise was suppressed, and the SNR was improved.
The average false alarm rate and average missing alarm rate in 30 sets of experiments are shown in Figure 8.
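The two rates follow directly from their definitions. The sketch below assumes that a detection counts as true only when its identifier (for example, a pixel position) exactly matches a true target; the function name and the sample sets are hypothetical.

```python
def alarm_rates(detected, truth):
    """Return (false alarm rate, missing alarm rate) as fractions.

    False alarm rate: false detections / all detections.
    Missing alarm rate: missed true targets / all true targets.
    """
    detected, truth = set(detected), set(truth)
    false_targets = detected - truth
    missed = truth - detected
    far = len(false_targets) / len(detected) if detected else 0.0
    mar = len(missed) / len(truth) if truth else 0.0
    return far, mar

# Hypothetical experiment: 50 true ships, 47 found, plus 3 false detections.
true_ids = set(range(50))
detected_ids = set(range(47)) | {100, 101, 102}
far, mar = alarm_rates(detected_ids, true_ids)
```

In this example both rates are 6%; averaging such pairs over the 30 sets of experiments gives the kind of summary shown in Figure 8.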

Discussion
The target detection process included background elimination, a local extremum operation to detect potential targets, and an image series to confirm the targets.
The specific threshold for the local extremum operation was determined using the normalized reconstructed images. In this paper, we set the threshold ϑ to 80. By comparison, with a threshold of 70, the number of potential targets would be in the thousands, which is not consistent with the simulation scenario. With a threshold of 90, the number of potential targets would be in the dozens, and true targets would be lost. Therefore, 80 was selected for the simulation scenario.
As Figure 7 shows, the average SNR of the reconstructed image series was −47.8 dB. After background elimination, the average SNR rose to 9.0 dB, and after the local extremum (filtering) operation, it rose to 19.15 dB. Background elimination therefore significantly enhanced the SNR of the targets in the degraded images, and the local extremum operation improved it further. In the 5th, 16th, 17th, and 24th sets of experiments, more targets were submerged by target movement noise and target brightness temperature difference noise in some frames after background elimination, and the missing alarm rate was above 10%, especially in the 22nd experiment, which had the worst missing alarm rate of about 24%. The method performed well in most experiments; the average missing alarm rate over all experiments was 5.2%, and the average false alarm rate was 4.8%.

Conclusions
A target detection method was proposed for a synthetic aperture radiometer based on satellite formation flight. By exploiting the slowly varying characteristics of the observation area and combining historical observation data with the motion characteristics of the targets, the method provides an effective means of mitigating the problems caused by sparse sampling. Errors such as baseline measurement errors and phase errors between satellites make detection more difficult, and related research will be performed in the future.