Sensor Control in Anti-Submarine Warfare—A Digital Twin and Random Finite Sets Based Approach

Since the submarine has become the major threat to maritime security, there is an urgent need to find a more efficient method of anti-submarine warfare (ASW). The digital twin theory is one of the most outstanding information technologies, and has been quite popular in recent years. The most influential change produced by digital twin is the ability to enable real-time dynamic interactions between the simulation world and the real world. Digital twin can be regarded as a paradigm by means of which selected online measurements are dynamically assimilated into the simulation world, with the running simulation model guiding the real world adaptively in reverse. By combining digital twin theory and random finite sets (RFSs) closely, a new framework of sensor control in ASW is proposed. Two key algorithms are proposed for supporting the digital twin-based framework. First, the RFS-based data-assimilation algorithm is proposed for online assimilating the sequence of real-time measurements with detection uncertainty, data association uncertainty, noise, and clutters. Second, the computation of the reward function by using the results of the proposed data-assimilation algorithm is introduced to find the optimal control action. The results of three groups of experiments successfully verify the feasibility and effectiveness of the proposed approach.


Introduction
Submarines are the main combat forces of modern maritime warfare, and the major threats to maritime security. Anti-submarine warfare (ASW) is a type of warfare that depends on surface warships, aircraft, or submarines to fight against enemy submarines. The key to ASW is to quickly identify and localize as many enemy submarines as possible. Sensor control is a key technology for success in ASW, so we focus on the innovation of the online sensor control method. Much work has been done to apply simulation-based approaches in naval warfare research, but there are few effective methods for combining simulation technologies with the real ASW in real time. In this paper, we study how to control the sensor of anti-submarine ships in ASW by employing simulation theory, random finite set (RFS) theory, and digital twin theory.
The sensor control problem is also known as the sensor management problem. The sensor equipped on the anti-submarine ship can perform many different actions, including moving to designated areas, searching in certain directions, etc. In real ASW, the anti-submarine ship usually takes tactical actions to estimate the accurate distance and guarantee observability [1][2][3]. Different actions have different effectiveness; some actions are effective for sensing the submarine, while others are not. Here the goal of sensor control is to ensure maximum efficiency of sensors and provide more accurate measurements to the simulation system. Finding the optimal control action is an urgent need in practical applications. In this paper, the objective of sensor control in ASW is to choose control actions online so that the utility of sensors is maximized. Here, sensor control means sequential decision making.
The backbone technology of digital twin is simulation for prediction, evaluation and analysis [24]. Simulation-based prediction with high confidence is the fundamental function of digital twin. The vision of the digital twin itself refers to a comprehensive functional description, evaluation, and prediction, together with all available operational data of an entity, target, or system, which includes more or less all information that could be useful in the current and subsequent phases [26]. The digital twin in ASW is used not only to describe and predict the behaviors of the submarine in the real ASW, but also to derive and evaluate solutions and courses of action (COAs) relevant to the real ASW.
The digital twin theory has been quite popular in recent years, but most of the research related to it only focuses on the theoretical research and requirement analysis. The practical application and implementation of digital twin has rarely been mentioned. This paper is the first attempt to apply and implement the digital twin theory. Simulation models for ASW are very complicated now, but still fail to describe the real ASW accurately. The main reason is the separation of simulation system and the real system, and the second reason is the failure to simultaneously integrate online measurements into the running simulation models.
In the proposed digital twin-based framework of sensor control in ASW, the online measurements are dynamically assimilated into the running simulation models, and the running simulation models guide the real ASW process in reverse. The intuitive application of digital twin for ASW is to obtain estimated states or parameters of the real ASW system by combining the real-time measurements with a simulation model [27]. Since real-time measurements can indicate the latest updated states of the real system, we focus on the problem of effectively assimilating continuous streams of data into running simulation models. At the same time, we have also studied the computation of reward function by using the results of the proposed data-assimilation algorithm.
The rest of the paper is structured as follows. We give the digital twin and RFS-based framework of sensor control in ASW in Section 2. The proposed RFS-based models for digital twin are introduced in Section 3. The RFS-based data-assimilation algorithm is detailed in Section 4. Section 5 describes the computation of the reward function by using the results of the proposed data-assimilation algorithm. Experimental results are detailed in Section 6, and the conclusions are given in Section 7.

Digital Twin and RFS-Based Framework of Online Sensor Control
As it is shown in Figure 1, the digital twin in ASW is mainly used for decision making. The digital twin can be regarded as a virtual equivalent or dynamic digital representation of the real ASW [28]. The simulated ASW in this framework is used to predict the emergent behaviors in the real ASW, evaluate the COAs and choose the best one for the operator. The simulated ASW evolves with the real ASW along the whole life cycle and integrates the currently available and commonly required data and knowledge. We can get the prediction, evaluation, and analysis of an enemy submarine by means of precise simulations. Digital twin can assist in ensuring information continuity throughout the whole operation, sensor control, and system behavior predictions in ASW based on simulations. To improve the coordination between the simulation system and the real system for sensor control in ASW, we propose the digital twin-based framework of online sensor control by incorporating RFS theory. The technical view of the proposed framework is given in Figure 2. Since there are two constituent objects (simulated space and physical space) in digital twin, in this paper, we propose two corresponding technologies to support the implementation of the digital twin-based framework: one is the RFS-based data-assimilation algorithm for assimilating real-time measurements into the running simulation model, and the other one is the computation of the reward function by using the results of the proposed data-assimilation algorithm for finding the optimal control action.
As shown in Figure 2, the proposed feedback control loop incorporates real-time measurements into the running simulation model while dynamically managing the physical sensors to refine measurements. The physical ASW space provides the simulation inputs to the simulation model, and it also provides the real states to the physical sensors.
The RFS-based simulation model provides the predicted states to the RFS-based measurement model. The simulation model helps to analyze potential alternative solutions for the anti-submarine ship, and evaluate the impact of possible control actions for the online sensor control method. The RFS-based measurement model characterizes the behaviors of the physical sensor on the anti-submarine ship. It uses the predicted states outputted by the RFS-based simulation model to generate the predicted measurements and provides them to the data-assimilation process. The RFS-based simulation model and measurement model are the main elements of the simulated ASW space.
The digital twin is supported by data-assimilation algorithm which can incorporate real-time measurements into the running simulation models for more accurate prediction of the physical ASW system, and it can also evaluate the COAs [29]. RFS-based data-assimilation process is the foundation of digital twin and is in charge of fusing real-time measurements to estimate the states of the physical ASW space [30]. There are two tasks for the data-assimilation process, the first is to dynamically update the current simulation states of the physical ASW space and provide the updated states to the simulation model for subsequent simulation running; the second is to provide the updated states and corresponding weights to the reward function computation module for computing the reward function.

RFS-Based Modeling of the Simulated ASW
The simulated ASW in digital twin depends on two kinds of models to describe the physical ASW: one is the Markov transition density-based simulation model; the other one is the measurement likelihood-based measurement model. Here we use the RFS-based simulation model to model the state transition of the enemy submarine, and the RFS-based measurement model to model the physical sensor of anti-submarine ship.

RFS-Based Data Model
Conventional estimation techniques fail to support digital twin for ASW, because many sophisticated simulation models of ASW cannot provide the analytical mathematical structures for deriving the functional forms of probability distribution. The sequential Monte Carlo (SMC) method which is also named as particle filter (PF), is the most widely used data-assimilation algorithm in recent years [29]. PF-based data-assimilation algorithm in traffic simulation is presented by Xu in [31] and Wu in [32]. PF-based data-assimilation algorithm in wildfire simulation is presented by Hu in [33,34]. They all use the same non-parametric statistic inference method based on PF, because PF has no assumptions on the distribution and linearity of the simulation model.
The conventional PF-based data-assimilation algorithm depends on the vector-based representation of data, including states and measurements. This representation gives the vector-based data-assimilation algorithm the following essential disadvantages:
1. It assumes that the studied system is a single dynamic system that is permanently active. It cannot be used for a dynamic system that switches on and off randomly. Switching is quite common in submarine activity; for example, a submarine may enter and leave a battle area at random instants.
2. It assumes that detection is perfect, with no false detections and no missed detections, and it also requires the number and ordering of measurements to be designated in advance. Furthermore, it cannot jointly estimate the number of submarines and the states of each submarine.
Being different from the conventional vector-based representation, the RFS-based representation can take the more complex situations into consideration. The RFS-based representation of states can support the transition of submarines from one mode to another. RFS-based representation of states enables the submarine number to be constantly varying. For example, completely new submarines can enter a scene randomly. Submarines can likewise leave a scene, as when disappearing behind some other occlusion, or they can be damaged or destroyed. The RFS-based representation of measurements enables us to take the imperfection of the sensors into consideration. The sensors on the anti-submarine ship can fail to generate a measurement of the submarine state, or pick up false measurements. Since there is no information on which state generates the measurement, and the number of measurements is a random variable, RFS rather than vector-based representation can be more useful.

RFS-Based Measurement Model
The measurement model is in charge of mapping the simulation states to the measurements collected by the sensor. The sensor on the anti-submarine ship provides measurements with imperfect enemy submarine detection including noises, clutters, and missed detections. Detection uncertainty and clutters in the measurements can be described by the union of Bernoulli RFS and Poisson RFS as introduced in [35,36]. The sensor sequentially gives an unordered finite set of measurements $Z_k \subset \mathcal{Z}$, and never indicates which of these measurements is generated from the enemy submarine.
The measurements produced by the sensor can be mathematically modeled as an RFS $Z = \{z_1, z_2, \ldots, z_m\}$. Its major advantage is that both the number of measurements $m = |Z|$ and the values of the constituent vectors $z \in Z$ in the measurement space $\mathcal{Z} \subseteq \mathbb{R}^{n_z}$ are random. It makes no assumptions on the order of detections in the RFS $Z$. The measurements for ASW can be represented as $Z = C \cup W$, where $C$ is the Poisson random finite subset of false detections and $W$ is the Bernoulli random finite subset corresponding to the enemy submarine.
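The decomposition $Z = C \cup W$ can be illustrated by drawing one realization of the measurement set. The following is only a minimal sketch under stated assumptions: a scalar bearing measurement, clutter uniform over $[-\pi, \pi]$, a Poisson clutter rate `lam`, a constant detection probability `p_d`, and Gaussian measurement noise with standard deviation `sigma`. The function names and the measurement function `h` are hypothetical and not taken from the paper.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_measurement_set(x, h, sigma, p_d, lam, rng=None):
    """Return one realization of Z = C u W as an unordered list of bearings.

    x is None when the submarine is absent (X is the empty set).
    h maps a state to its noiseless measurement (hypothetical helper).
    """
    rng = rng or random
    # Clutter C: Poisson number of points, uniform in bearing (assumption).
    measurements = [rng.uniform(-math.pi, math.pi)
                    for _ in range(sample_poisson(lam, rng))]
    # Target-originated W: Bernoulli, present with probability p_d.
    if x is not None and rng.random() < p_d:
        measurements.append(rng.gauss(h(x), sigma))
    # The sensor never reveals which element is target-originated.
    rng.shuffle(measurements)
    return measurements
```

Calling `sample_measurement_set(None, ...)` yields clutter only, while a non-empty state may or may not contribute a measurement, which is exactly the ambiguity the set-valued representation is meant to carry.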
The behavior of the sensor can be described by the conventional likelihood function g(z|x) which characterizes the probability that measurement z is generated by the submarine state x.
Here we have $g(z|x_k) = \mathcal{N}(z; h(x_k), \sigma_\theta^2)$. If a measurement $z \in Z_k$ is generated from an enemy submarine, then the relationship between $z$ and the submarine state $x_k$ can be described by the following equation:
$$z = h(x_k) + \omega_k$$
where $\omega_k$ is zero-mean independent Gaussian noise with variance $\sigma_\theta^2$, and $h(x_k)$ models the true submarine bearing or range at time $k$. In the remainder of this section, we derive the specific mathematical form of the measurement model, denoted by $\phi(Z|X)$, where $X$ is the Bernoulli RFS of simulation states. The measurement model $\phi(Z|X)$ can be written in two specific forms, one for $X = \emptyset$ and the other for $X = \{x\}$.
Firstly, if the enemy submarine does not exist in the real operational area, the measurement set consists only of clutter. This means that $X = \emptyset$ and $Z = C \cup \emptyset = C$. We model the number of clutters in the measurement set by the following Poisson distribution:
$$P\{|C| = m\} = \frac{e^{-\lambda}\lambda^{m}}{m!}, \quad m = 0, 1, 2, \ldots$$
where $\lambda$ is the average number of clutters. Clutters are modeled as independent identically distributed random vectors conditioned on $|C|$. The values of these random vectors are taken from the measurement space $\mathcal{Z}$ with probability density function (PDF) $c(z)$. Here $c(z) = (2\pi)^{-1}$ is the time-invariant spatial distribution of clutters. The measurement model can be written as follows:
$$\phi(Z|\emptyset) = \kappa(Z) = e^{-\lambda} \prod_{z \in Z} \lambda\, c(z) \tag{3}$$
Secondly, if the enemy submarine exists in the real operational area with state $X = \{x\}$, the measurement RFS $W$ corresponding to the enemy submarine can be modeled as a Bernoulli RFS conditioned on $X = \{x\}$. If the submarine fails to be detected, we have $W = \emptyset$. If the submarine is detected and causes a measurement $z$, we have $W = \{z\}$. To derive the measurement model $\phi(Z|\{x\})$, we should specify the PDF of the RFS $W$ conditioned on $\{x\}$. This PDF can be modeled as follows:
$$\eta(W|\{x\}) = \begin{cases} 1 - p_D(x), & W = \emptyset \\ p_D(x)\, g(z|x), & W = \{z\} \\ 0, & \text{otherwise} \end{cases}$$
where $p_D(x)$ is the probability of detecting the submarine state $x$.
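The measurement model $\phi(Z|\{x\})$ can be represented as $\phi(Z|\{x\}) = \sum_{W \subseteq Z} \eta(W|\{x\})\, \kappa(Z \setminus W)$, where $\setminus$ is the set-difference operation and $\kappa$ was defined by (3). Since $W$ is a Bernoulli RFS, only the terms $W = \emptyset$ and $W = \{z\}$ with $z \in Z$ contribute to the summation, so it is computable, and we get the measurement model for the case $X = \{x\}$ as follows:
$$\phi(Z|\{x\}) = \kappa(Z) \left[ 1 - p_D(x) + p_D(x) \sum_{z \in Z} \frac{g(z|x)}{\lambda\, c(z)} \right]$$
In summary, the RFS-based measurement model for ASW can be represented as follows:
$$\phi(Z|X) = \begin{cases} \kappa(Z), & X = \emptyset \\ \kappa(Z) \left[ 1 - p_D(x) + p_D(x) \sum_{z \in Z} \dfrac{g(z|x)}{\lambda\, c(z)} \right], & X = \{x\} \end{cases} \tag{7}$$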
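The two-case measurement model can be evaluated directly. The sketch below assumes a scalar measurement, a Gaussian likelihood $g(z|x)$ with standard deviation `sigma`, a constant detection probability `p_d`, and the standard Poisson clutter density $\kappa(Z) = e^{-\lambda}\prod_{z \in Z}\lambda c(z)$. All function and parameter names are illustrative, and angle wrapping of bearings is deliberately ignored.

```python
import math


def gaussian_pdf(z, mean, sigma):
    """Scalar Gaussian density, used here as the likelihood g(z|x)."""
    return math.exp(-0.5 * ((z - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def clutter_density(Z, lam, c):
    """kappa(Z) = exp(-lam) * prod over z in Z of lam * c(z)."""
    value = math.exp(-lam)
    for z in Z:
        value *= lam * c(z)
    return value


def measurement_likelihood(Z, x, h, sigma, p_d, lam, c):
    """phi(Z|X): pass x=None for X = empty set, otherwise the single state x."""
    kappa = clutter_density(Z, lam, c)
    if x is None:
        return kappa  # clutter-only case
    # Sum over the hypotheses "measurement z is target-originated".
    ratio_sum = sum(gaussian_pdf(z, h(x), sigma) / (lam * c(z)) for z in Z)
    return kappa * (1.0 - p_d + p_d * ratio_sum)
```

Writing the singleton case as $\kappa(Z)$ times a bracket avoids enumerating subsets $W \subseteq Z$ explicitly; the bracket is what later becomes the per-particle weight factor in an SMC implementation.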

RFS-Based Simulation Model
Digital twin requires a simulation model that can predict possible future states. To give a unified description of the enemy submarine presence/absence in the operational area and its kinematic state, we employ the Bernoulli RFS-based simulation model to describe the dynamics of the enemy submarine at discrete time $k$. The state space is $\emptyset \cup \mathcal{S}(\mathcal{X})$, where $\mathcal{S}(\mathcal{X})$ is the set of all singletons $\{x\}$ with $x \in \mathcal{X}$. Here we use a PDF to denote the uncertainty of the submarine's states, which evolve according to the following state space model in discrete time:
$$x_k = f_{k|k-1}(x_{k-1}) + v$$
where $f_{k|k-1}$ denotes the deterministic part of the true evolution equation, which maps the state to the next time step, and $v$ is the stochastic part of the true evolution that we fail to capture deterministically; it is what makes assimilating online measurements necessary. If the enemy submarine exists in the real operational area, the state $X_k$ is a singleton and can be modeled by a Markov process whose Markov transition density is denoted by $\pi_{k|k-1}(x_k|x_{k-1})$. The submarine state vector is adopted as follows:
$$\mathbf{x}_k^t = [x_k^t, \dot{x}_k^t, y_k^t, \dot{y}_k^t]^\top$$
where $(x_k^t, y_k^t)$ and $(\dot{x}_k^t, \dot{y}_k^t)$ are respectively the enemy submarine position and velocity in Cartesian coordinates. The state vector of the anti-submarine ship $\mathbf{x}_k^o$ is similarly represented by $\mathbf{x}_k^o = [x_k^o, \dot{x}_k^o, y_k^o, \dot{y}_k^o]^\top$. The motion model is the nearly constant velocity (NCV) model and relies on the relative state vector, which is defined as follows:
$$x_k = \mathbf{x}_k^t - \mathbf{x}_k^o$$
The specific mathematical form of $\pi_{k|k-1}(x_k|x_{k-1})$ is described by the state space model as follows:
$$x_{k+1} = F_k x_k - S_{k+1,k} + \varepsilon_k$$
where $F_k$ is the deterministic part of the true evolution equation, $S_{k+1,k}$ is a deterministic matrix related to the effect of anti-submarine ship acceleration, and $\varepsilon_k \sim \mathcal{N}(0, Q_k)$ is the stochastic part of the true evolution equation. Here $T_k = t_{k+1} - t_k$ is the sampling interval. The specific forms of $F_k$ and $Q_k$ are as follows:
$$F_k = I_2 \otimes \begin{bmatrix} 1 & T_k \\ 0 & 1 \end{bmatrix}, \qquad Q_k = I_2 \otimes \varpi \begin{bmatrix} T_k^3/3 & T_k^2/2 \\ T_k^2/2 & T_k \end{bmatrix}$$
where $\otimes$ is the Kronecker product, and $\varpi$ is the intensity of process noise. Here we assume $T_k = T = \text{const}$, so we get $F_k = F$ and $Q_k = Q$.
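The Kronecker-product construction of the NCV matrices can be sketched in a few lines. This is an illustration only: it assumes the state ordering $[x, \dot{x}, y, \dot{y}]$ (the ordering for which $I_2 \otimes [\,\cdot\,]$ gives the usual NCV blocks), and the names `kron`, `ncv_matrices`, and `propagate_mean` are hypothetical helpers rather than anything defined in the paper.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a_ij * b_kl for a_ij in a_row for b_kl in b_row]
            for a_row in a for b_row in b]


def ncv_matrices(T, intensity):
    """Return (F, Q) of the nearly constant velocity model for sampling interval T."""
    identity2 = [[1.0, 0.0], [0.0, 1.0]]
    F = kron(identity2, [[1.0, T], [0.0, 1.0]])
    Q = kron(identity2, [[intensity * T ** 3 / 3.0, intensity * T ** 2 / 2.0],
                         [intensity * T ** 2 / 2.0, intensity * T]])
    return F, Q


def propagate_mean(F, x):
    """Noise-free one-step prediction F @ x for a state given as a list."""
    return [sum(f * xi for f, xi in zip(row, x)) for row in F]
```

For example, with `T = 2.0` a relative state `[0.0, 1.0, 10.0, -1.0]` is propagated to `[2.0, 1.0, 8.0, -1.0]` before the ownship-manoeuvre term and the process noise are applied.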
To characterize the enemy submarine's appearance and disappearance during the simulation interval, we use a binary random variable $\xi_k \in \{0, 1\}$ to denote its existence. Here $\xi_k = 1$ indicates that the enemy submarine exists at time $k$, and $\xi_k = 0$ indicates that it does not exist at time $k$. The dynamics of $\xi_k$ are described by a first-order two-state Markov chain with transitional probability matrix $\Xi$, as shown in Figure 3. The elements of $\Xi$ are the transition probabilities from $\xi_k$ (rows) to $\xi_{k+1}$ (columns), and $\Xi$ is defined as follows:
$$\Xi = \begin{bmatrix} 1 - p_b & p_b \\ 1 - p_s & p_s \end{bmatrix}$$
where $p_b = P\{\xi_{k+1} = 1 | \xi_k = 0\}$ is the probability that the submarine appears in the operational area during the simulation interval, and $p_s = P\{\xi_{k+1} = 1 | \xi_k = 1\}$ is the probability that the submarine is still in the operational area during the simulation interval. Here $p_b$ and $p_s$ are assumed to be known.
If the submarine appears during the simulation interval $T_k$, $b_{k|k-1}(x)$ is used to denote its birth PDF. Now we derive the RFS-based simulation model of the enemy submarine. Since we represent the submarine's simulation states by using a Bernoulli RFS, we consider two different situations for the RFS-based simulation model.
Firstly, if the enemy submarine does not exist in the real operational area at time $k-1$, we have $X_{k-1} = \emptyset$. The submarine can then appear in the operational area with probability $p_{b,k}$ and have kinematic state $x_k$ with PDF $b_{k|k-1}(x_k)$, or remain absent from the operational area with probability $1 - p_{b,k}$. The simulation model $f_{k|k-1}(X_k|\emptyset)$ for state $X_k$ is specified as follows:
$$f_{k|k-1}(X_k|\emptyset) = \begin{cases} 1 - p_{b,k}, & X_k = \emptyset \\ p_{b,k}\, b_{k|k-1}(x_k), & X_k = \{x_k\} \\ 0, & \text{otherwise} \end{cases} \tag{14}$$
Secondly, if the enemy submarine exists in the real operational area at time $k-1$, which means $X_{k-1} = \{x_{k-1}\}$, it can survive to the next time step with probability $p_{s,k}(x_{k-1})$ and transit to $x_k$ with PDF $\pi_{k|k-1}(x_k|x_{k-1})$, or disappear with probability $1 - p_{s,k}(x_{k-1})$. Thus, the simulation model for state $X_k$ at time step $k$ is characterized by
$$f_{k|k-1}(X_k|\{x_{k-1}\}) = \begin{cases} 1 - p_{s,k}(x_{k-1}), & X_k = \emptyset \\ p_{s,k}(x_{k-1})\, \pi_{k|k-1}(x_k|x_{k-1}), & X_k = \{x_k\} \\ 0, & \text{otherwise} \end{cases} \tag{15}$$
From (14) and (15), we can see that the Bernoulli RFS is either an empty set or a nonempty set with one element only. The probabilities of these two cases can be modeled respectively as $1 - q$ and $q$. If it has one element, the element is spatially distributed over $\mathcal{S}(\mathcal{X}) \subseteq \mathbb{R}^{n_x}$ according to the standard PDF $s(x)$. So, the simulation model for the simulation state denoted by the Bernoulli RFS $X_k$ at time $k$ can be given by:
$$f_{k|k}(X_k|Z_{1:k}) = \begin{cases} 1 - q_{k|k}, & X_k = \emptyset \\ q_{k|k}\, s_{k|k}(x_k), & X_k = \{x_k\} \\ 0, & \text{otherwise} \end{cases} \tag{16}$$
where $q_{k|k} = p\{|X_k| = 1 | Z_{1:k}\}$ is the probability of the submarine's existence in the real operational area, and $s_{k|k}(x_k) = p(x_k|Z_{1:k})$ is the spatial PDF of the submarine.
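The two situations of the Bernoulli simulation model amount to a simple sampling rule for one simulation step. The sketch below assumes constant birth and survival probabilities `p_b` and `p_s`; `sample_birth` and `sample_transition` are hypothetical callbacks that stand in for draws from the birth PDF and the Markov transition density.

```python
import random


def step_bernoulli_state(x_prev, p_b, p_s, sample_birth, sample_transition, rng=None):
    """Advance a Bernoulli RFS state by one step.

    x_prev is None for the empty set, otherwise the single kinematic state.
    Returns None (submarine absent) or the new kinematic state.
    """
    rng = rng or random
    if x_prev is None:
        # Absent at k-1: appears with probability p_b, otherwise stays absent.
        return sample_birth(rng) if rng.random() < p_b else None
    # Present at k-1: survives with probability p_s and moves, otherwise disappears.
    return sample_transition(x_prev, rng) if rng.random() < p_s else None
```

Representing the empty set as `None` keeps presence/absence and the kinematic state in one variable, mirroring the unified description the Bernoulli RFS is meant to provide.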

RFS-Based Data-Assimilation Algorithm
Unforeseen operational entities can enter the designated real operational area and critical assets can be destroyed, thereby invalidating the current simulation setting [37]. It is possible that the simulation execution may not proceed as expected because of the dynamic and changing environment. Since assimilating real-time measurements into the running simulation models can significantly improve the accuracy of the simulation results, the implementation of digital twin theory depends on the simulation system's use of the information from the real world.
Here an RFS-based data-assimilation algorithm is proposed for online assimilation of the sequence of measurement sets in the presence of noise, false alarms, data association uncertainty, and detection uncertainty. The proposed RFS-based data-assimilation algorithm overcomes the limitations of the standard vector-based data-assimilation algorithms described above.

Data Assimilation with RFS-Based Models
Since data assimilation usually updates the posterior distribution of simulation states by using the simulation model and measurement model, it can be regarded as a Bayesian inference procedure from a probabilistic point of view [38]. Here we also use the RFS-based models and Bayesian inference for data assimilation.
The RFS-based data assimilation recursively estimates the posterior PDF of the submarine's states by using the RFS-based simulation model, measurement model, and online measurements. It usually consists of two stages, the prediction stage and the update stage. Since we have a Bernoulli RFS-based simulation model (16), the posterior PDF at time $k$, denoted by $f_{k|k}(X_k|Z_{1:k})$, is completely specified by two terms:
• the posterior existence probability of the submarine, $q_{k|k} = p\{|X_k| = 1 | Z_{1:k}\}$;
• the posterior spatial PDF of $X_k = \{x\}$, denoted by $s_{k|k}(x_k) = p(x_k|Z_{1:k})$.
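Denote the posterior PDF of simulation states at time $k-1$ as $f_{k-1|k-1}(X_{k-1}|Z_{1:k-1})$. The RFS-based prediction equation of the data-assimilation process is:
$$f_{k|k-1}(X_k|Z_{1:k-1}) = \int f_{k|k-1}(X_k|X_{k-1})\, f_{k-1|k-1}(X_{k-1}|Z_{1:k-1})\, \delta X_{k-1} \tag{18}$$
At time $k-1$, the posterior PDF is characterized by the pair $(q_{k-1|k-1}, s_{k-1|k-1}(x))$, so the prediction equation of data assimilation can be derived from (18) as:
$$q_{k|k-1} = p_b\,(1 - q_{k-1|k-1}) + p_s\, q_{k-1|k-1} \tag{19}$$
$$s_{k|k-1}(x) = \frac{p_b\,(1 - q_{k-1|k-1})\, b_{k|k-1}(x) + p_s\, q_{k-1|k-1} \int \pi_{k|k-1}(x|x')\, s_{k-1|k-1}(x')\, dx'}{q_{k|k-1}} \tag{20}$$
In Bayesian theory, the updated PDF is calculated by:
$$f_{k|k}(X_k|Z_{1:k}) = \frac{\phi_k(Z_k|X_k)\, f_{k|k-1}(X_k|Z_{1:k-1})}{\int \phi_k(Z_k|X)\, f_{k|k-1}(X|Z_{1:k-1})\, \delta X} \tag{21}$$
Given the measurement model (7) and the prediction Equations (19) and (20), the update equation can be derived from (21), for a state-independent detection probability $p_D(x) = p_D$, as:
$$q_{k|k} = \frac{1 - \delta_k}{1 - \delta_k\, q_{k|k-1}}\, q_{k|k-1} \tag{22}$$
$$s_{k|k}(x) = \frac{1 - p_D + p_D \sum_{z \in Z_k} \dfrac{g_k(z|x)}{\lambda\, c(z)}}{1 - \delta_k}\, s_{k|k-1}(x) \tag{23}$$
where
$$\delta_k = p_D \left( 1 - \sum_{z \in Z_k} \frac{\int g_k(z|x)\, s_{k|k-1}(x)\, dx}{\lambda\, c(z)} \right) \tag{24}$$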
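As a hedged numerical illustration of how the existence probability moves through one prediction and update cycle, the worked example below uses the standard Bernoulli filter recursion (assumed here to be what the prediction and update equations of this section express) with illustrative values that are not taken from the experiments: $p_b = 0.1$, $p_s = 0.9$, a state-independent $p_D = 0.9$, a prior $q_{k-1|k-1} = 0.6$, and an empty measurement set $Z_k = \emptyset$.

```latex
\begin{align*}
q_{k|k-1} &= p_b\,(1 - q_{k-1|k-1}) + p_s\, q_{k-1|k-1}
           = 0.1 \times 0.4 + 0.9 \times 0.6 = 0.58, \\
\delta_k  &= p_D \Big( 1 - \sum_{z \in Z_k} \frac{I_k(z)}{\lambda\, c(z)} \Big)
           = 0.9 \qquad (Z_k = \emptyset), \\
q_{k|k}   &= \frac{1 - \delta_k}{1 - \delta_k\, q_{k|k-1}}\, q_{k|k-1}
           = \frac{0.1 \times 0.58}{1 - 0.9 \times 0.58}
           = \frac{0.058}{0.478} \approx 0.121 .
\end{align*}
```

A scan that returns no measurements at all, despite a high detection probability, therefore pulls the existence probability down sharply, which is the behavior one would expect from a sensor that rarely misses.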

SMC-Based Calculation
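In general, the RFS-based prediction and update equations cannot be solved analytically [39]. Here we propose an SMC-based calculation. We use the particle system $\{w_k^{(i)}, x_k^{(i)}\}_{i=1}^{N}$ to approximate the spatial PDF $s_{k|k}(x)$. Here $x_k^{(i)}$ is the state of particle $i$ and $w_k^{(i)}$ is its weight. As $s_{k|k}(x)$ is a conventional PDF, the corresponding weights of the particles should be normalized, $\sum_{i=1}^{N} w_k^{(i)} = 1$. As shown in Figure 4, the RFS-based simulation model first runs to the next time point and generates the predicted states of ASW. Weight updating relies on the difference between the online measurements and the predicted measurements generated from the predicted states. Suppose that at time $k-1$ the submarine's existence probability is $q_{k-1|k-1}$, and the spatial PDF is approximated by
$$s_{k-1|k-1}(x) \approx \sum_{i=1}^{N} w_{k-1}^{(i)}\, \delta_{x_{k-1}^{(i)}}(x) \tag{25}$$
where $\delta_c(x)$ is the Dirac delta function concentrated at state $c$. The predicted existence probability $q_{k|k-1}$ can be computed by (19). According to (20) and (25), the predicted spatial PDF depends on the sum of two terms. Consequently $s_{k|k-1}(x)$ can be approximated by the particle system as a weighted point mass representation:
$$s_{k|k-1}(x) \approx \sum_{i=1}^{N+B_k} w_{k|k-1}^{(i)}\, \delta_{x_{k|k-1}^{(i)}}(x)$$
where the particles are drawn from two importance densities, $\rho_k$ for persisting particles and $\beta_k$ for birth particles:
$$x_{k|k-1}^{(i)} \sim \begin{cases} \rho_k(x|x_{k-1}^{(i)}, Z_k), & i = 1, \ldots, N \\ \beta_k(x|Z_k), & i = N+1, \ldots, N+B_k \end{cases} \tag{27}$$
with weights
$$w_{k|k-1}^{(i)} = \begin{cases} \dfrac{p_s\, q_{k-1|k-1}}{q_{k|k-1}}\, \dfrac{\pi_{k|k-1}(x_{k|k-1}^{(i)}|x_{k-1}^{(i)})}{\rho_k(x_{k|k-1}^{(i)}|x_{k-1}^{(i)}, Z_k)}\, w_{k-1}^{(i)}, & i = 1, \ldots, N \\ \dfrac{p_b\,(1 - q_{k-1|k-1})}{q_{k|k-1}}\, \dfrac{b_{k|k-1}(x_{k|k-1}^{(i)})}{\beta_k(x_{k|k-1}^{(i)}|Z_k)}\, \dfrac{1}{B_k}, & i = N+1, \ldots, N+B_k \end{cases}$$
where $B_k$ is the number of submarine-birth particles drawn from the importance density $\beta_k$. ...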

The simplest choice of importance density $\rho_k(x|x_{k-1}, Z_k)$ is the transitional density $\pi_{k|k-1}(x_k|x_{k-1})$. If there is little prior knowledge of the action plan of the enemy submarine, we should assume that the enemy submarine can appear anywhere in the state space $\mathcal{S}(\mathcal{X})$. So, we model $b_{k|k-1}(x)$ by using the uniform distribution over $\mathcal{S}(\mathcal{X})$. The birth importance density $\beta_k$ in (27) needs to have the same support as $b_{k|k-1}(x)$ (i.e., the entire $\mathcal{S}(\mathcal{X})$) [40].
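With these simplest choices, where persisting particles are drawn from the transition density and birth particles from the birth density itself, the importance ratios cancel and the predicted weights depend only on $p_b$, $p_s$ and the previous existence probability. The following sketch shows that special case; it assumes constant $p_b$ and $p_s$, at least one birth particle, and a nonzero predicted existence probability. The callbacks `sample_transition` and `sample_birth` are hypothetical.

```python
import random


def predict_particles(particles, weights, q_prev, p_b, p_s,
                      sample_transition, sample_birth, n_birth, rng=None):
    """Prediction step of a Bernoulli particle filter with rho = pi and beta = b.

    Returns (q_pred, predicted_particles, predicted_weights); n_birth must be >= 1.
    """
    rng = rng or random
    # Predicted existence probability: birth from absence plus survival from presence.
    q_pred = p_b * (1.0 - q_prev) + p_s * q_prev
    persisting = [sample_transition(x, rng) for x in particles]
    births = [sample_birth(rng) for _ in range(n_birth)]
    # Importance ratios equal one under the stated choice of densities.
    w_persist = [p_s * q_prev / q_pred * w for w in weights]
    w_birth = [p_b * (1.0 - q_prev) / q_pred / n_birth] * n_birth
    return q_pred, persisting + births, w_persist + w_birth
```

If the input weights sum to one, the returned weights also sum to one, since the persisting and birth groups carry total mass $p_s q_{k-1|k-1}/q_{k|k-1}$ and $p_b(1-q_{k-1|k-1})/q_{k|k-1}$ respectively.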
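The update step of the data-assimilation algorithm is implemented according to (22)-(24). First, for every $z \in Z_k$, the integral $I_k(z) = \int g_k(z|x)\, s_{k|k-1}(x)\, dx$, which appears in (24), is approximately calculated as follows:
$$I_k(z) \approx \sum_{i=1}^{N+B_k} w_{k|k-1}^{(i)}\, g_k(z|x_{k|k-1}^{(i)})$$
Then, based on (24), $\delta_k$ can be calculated by
$$\delta_k \approx p_D \left( 1 - \sum_{z \in Z_k} \frac{I_k(z)}{\lambda\, c(z)} \right)$$
The submarine's existence probability is updated by using (22), and the corresponding weights are updated according to (23):
$$\tilde{w}_{k|k}^{(i)} \approx \left[ 1 - p_D + p_D \sum_{z \in Z_k} \frac{g_k(z|x_{k|k-1}^{(i)})}{\lambda\, c(z)} \right] w_{k|k-1}^{(i)} \tag{31}$$
These weights should be normalized to get the normalized importance weights:
$$w_{k|k}^{(i)} = \frac{\tilde{w}_{k|k}^{(i)}}{\sum_{j=1}^{N+B_k} \tilde{w}_{k|k}^{(j)}} \tag{32}$$
for $i = 1, \ldots, N + B_k$. Finally, we resample $N$ times from $\{w_{k|k}^{(i)}, x_{k|k-1}^{(i)}\}_{i=1}^{N+B_k}$ to avoid sample degeneracy.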
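The update step can be sketched compactly for weighted particles. The code below is an illustration under stated assumptions rather than the authors' implementation: it assumes a constant detection probability `p_d`, takes the likelihood `g(z, x)` and clutter PDF `c(z)` as caller-supplied functions, and uses plain multinomial resampling through `random.choices` where a systematic scheme could equally be used.

```python
import random


def bernoulli_update(particles, weights, q_pred, Z, g, p_d, lam, c):
    """Update existence probability and particle weights with measurement set Z.

    Returns (q_upd, normalized_weights). Assumes not (p_d == 1 and all ratios zero).
    """
    # Per-particle sum over z of g(z|x) / (lam * c(z)).
    ratios = [sum(g(z, x) / (lam * c(z)) for z in Z) for x in particles]
    # delta = p_d * (1 - sum_z I(z) / (lam c(z))), with I(z) = sum_i w_i g(z|x_i).
    delta = p_d * (1.0 - sum(w * r for w, r in zip(weights, ratios)))
    q_upd = (1.0 - delta) * q_pred / (1.0 - delta * q_pred)
    unnormalized = [(1.0 - p_d + p_d * r) * w for w, r in zip(weights, ratios)]
    total = sum(unnormalized)
    return q_upd, [u / total for u in unnormalized]


def resample(particles, weights, n, rng=None):
    """Draw n particles with replacement, proportionally to their weights."""
    rng = rng or random
    return rng.choices(particles, weights=weights, k=n)
```

After resampling, the `n` surviving particles would be given equal weights `1 / n` before the next prediction step.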

Computation of Reward Function
We propose an online sensor control method that uses the predicted states, updated states, and their corresponding weights generated by the proposed data-assimilation algorithm. Online sensor control is mainly used for finding the optimal control action from a set of admissible control actions. Thus, it means sequential decision making, where each decision generates measurements that provide additional information for data assimilation. In the digital twin for ASW, the control action is determined in the presence of uncertainty both in the measurement space and the state space.
Here an information theoretic approach is proposed for online sensor control. In this approach, the posterior PDF is used to represent the uncertain states, and the reward function is regarded as a measure of the information gain related to each action.

Derivation of Reward Function
In this paper, the online sensor control means the online selection of headings for individual anti-submarine ship, to maximize the use efficiency of its measurement system. Here control actions are ranked by using the quantity of information predicted to be gained from their execution. The data-assimilation enhanced simulation model is used to rapidly predict the possible output information of alternative control actions.
The reward function is used to measure the information gain in comparison with the current information state. The information gain can be characterized by using various information measures [41]. The Fisher information is typically used as a criterion for optimization in the absence of detection uncertainty [42][43][44]. Here we use the Rényi divergence-based reward function. The Rényi information divergence provides a way to measure the dissimilarity between two probability densities [45]. The Rényi divergence between any two probability densities $p_0(x)$ and $p_1(x)$ is described as:
$$I_\alpha(p_1, p_0) = \frac{1}{\alpha - 1} \log \int p_1^{\alpha}(x)\, p_0^{1-\alpha}(x)\, dx \tag{33}$$
where $\alpha \geq 0$ is the factor that reflects how much we emphasize the tails of the two probability distributions. Let $u_k \in U_k$ denote the control action chosen for controlling the sensor at time $t_k$ in order to collect the future measurements at $t_{k+1}$. Here $U_k$ denotes the set of admissible control actions at time $t_k$. In general, both the simulation model $f_{k+1|k}$ and the measurement model $\phi_{k+1}$ depend on the control action $u_k \in U_k$. Then the prediction Equation (18) and update Equation (21) for RFS-based data assimilation can be rewritten as follows:
$$f_{k+1|k}(X_{k+1}|Z_{1:k}, u_{0:k}) = \int f_{k+1|k}(X_{k+1}|X_k, u_k)\, f_{k|k}(X_k|Z_{1:k}, u_{0:k-1})\, \delta X_k \tag{34}$$
$$f_{k+1|k+1}(X_{k+1}|Z_{1:k+1}, u_{0:k}) = \frac{\phi_{k+1}(Z_{k+1}|X_{k+1}, u_k)\, f_{k+1|k}(X_{k+1}|Z_{1:k}, u_{0:k})}{\int \phi_{k+1}(Z_{k+1}|X, u_k)\, f_{k+1|k}(X|Z_{1:k}, u_{0:k})\, \delta X} \tag{35}$$
The optimal control action to be applied at time $k$ is defined by maximizing the expected Rényi information divergence according to
$$u_k = \arg\max_{v \in U_k} E\left[ \varphi\big(v, p, Z_{k+1}(v)\big) \right] \tag{36}$$
where $\varphi(v, p, Z)$ is the real-valued reward function associated with the control action $v$, which results in the predicted PDF $p$ and the future measurement set $Z$. Online sensor control via (36) tries to obtain the maximum reward based on a single future step. This is done by anticipating possible future measurements. To find the optimal control action for the anti-submarine ship to take next, we should use the RFS-based simulation model to predict the system states that would result if the control action $u_k$ were chosen. In addition, we should generate the predicted measurement set before actually receiving the measurement set $Z_{k+1}$.
Hence the calculation of the expected value of the Rényi divergence for each possible control action is closely related to the data-assimilation process. The future measurement set $Z_{k+1}(v)$ supports the computation of the reward function $\varphi$. Since $Z_{k+1}(v)$ is obtained only after the control action has been executed, it is uncertain at decision time. To overcome the impact of this uncertainty, (36) employs the expectation operator $E$. The reward function $\varphi(u_k, p, Z)$ in (36) is adopted as the Rényi divergence between:
• the predicted PDF $f_{k+1|k}(X_{k+1}|Z_{1:k}, u_{0:k})$ given by (34), which is based on action $u_k$, and
• the predicted future posterior $f_{k+1|k+1}(X_{k+1}|Z_{1:k+1}, u_{0:k})$ given by (35), obtained by using the new measurement set $Z_{k+1}$ collected after the sensor has been controlled to take action $u_k$.
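We simplify the notation of the reward function $\varphi$ by suppressing its second and third arguments. Based on (33), the reward function $\varphi$ can be represented by:
$$\varphi(u_k) = \frac{1}{\alpha - 1} \log \int \left[ f_{k+1|k+1}(X_{k+1}|Z_{1:k+1}, u_{0:k}) \right]^{\alpha} \left[ f_{k+1|k}(X_{k+1}|Z_{1:k}, u_{0:k}) \right]^{1-\alpha} \delta X_{k+1} \tag{37}$$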
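To make the behavior of the Rényi divergence concrete, the sketch below evaluates it for two discrete probability vectors, which is the form the integral takes once densities are replaced by weights on a common support. It is restricted to $0 < \alpha < 1$ as a simplifying assumption, and the function name is illustrative.

```python
import math


def renyi_divergence(p1, p0, alpha=0.5):
    """Renyi divergence between discrete distributions p1 and p0 on a shared support.

    Restricted to 0 < alpha < 1; returns math.inf for disjoint supports.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("this sketch only supports 0 < alpha < 1")
    # Terms where either probability is zero contribute nothing for 0 < alpha < 1.
    total = sum((a ** alpha) * (b ** (1.0 - alpha))
                for a, b in zip(p1, p0) if a > 0.0 and b > 0.0)
    if total == 0.0:
        return math.inf
    return math.log(total) / (alpha - 1.0)
```

The divergence is zero when the two distributions coincide and grows as the updated distribution concentrates relative to the predicted one, which is why it can serve as a measure of information gain. For $\alpha = 0.5$ it is symmetric in its two arguments.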

Data-Assimilation-Based Computation
The expected reward function $E[\varphi(u_k)]$ does not have a closed-form analytic solution. Thus, we employ a numerical approximation based on the data-assimilation results. This is where the SMC-based implementation of data assimilation becomes valuable: by adopting it, the optimal control action can be selected quickly. Furthermore, it makes the data-assimilation process and online sensor control interdependent and interoperable.
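Since the predicted PDF and the updated PDF in the RFS-based data-assimilation process are Bernoulli PDFs, let $f_{k+1|k}(X_{k+1}|Z_{1:k}, u_{0:k})$ and $f_{k+1|k+1}(X_{k+1}|Z_{1:k+1}, u_{0:k})$, which feature in (37), be specified by the pairs $(q_{k+1|k}, s_{k+1|k}(x))$ and $(q_{k+1|k+1}, s_{k+1|k+1}(x))$ respectively. So, the predicted PDF and the updated PDF can be written as follows:
$$f_{k+1|k}(X|Z_{1:k}, u_{0:k}) = \begin{cases} 1 - q_{k+1|k}, & X = \emptyset \\ q_{k+1|k}\, s_{k+1|k}(x), & X = \{x\} \end{cases} \tag{38}$$
$$f_{k+1|k+1}(X|Z_{1:k+1}, u_{0:k}) = \begin{cases} 1 - q_{k+1|k+1}, & X = \emptyset \\ q_{k+1|k+1}\, s_{k+1|k+1}(x), & X = \{x\} \end{cases} \tag{39}$$
According to the rules of the set integral, the reward function defined in (37) can be simplified to:
$$\varphi(u_k) = \frac{1}{\alpha - 1} \log \left[ (1 - q_{k+1|k+1})^{\alpha} (1 - q_{k+1|k})^{1-\alpha} + q_{k+1|k+1}^{\alpha}\, q_{k+1|k}^{1-\alpha} \int s_{k+1|k+1}^{\alpha}(x)\, s_{k+1|k}^{1-\alpha}(x)\, dx \right] \tag{40}$$
According to (36), the optimal control action can be selected by maximizing the expected value:
$$u_k = \arg\max_{v \in U_k} E[\varphi(v)] \tag{41}$$
Now we use SMC to obtain the numerical implementation of (41). First, we obtain the values of the reward function for a sequence of predicted future measurement sets $Z_{k+1}(v)$. Then we compute the expected value of $\varphi(v)$ by calculating the sample mean of the obtained values. Here $Z_{k+1}(v)$ is obtained after taking control action $v \in U_k$. Each realization of $Z_{k+1}(v)$ is generated from the predicted PDF represented by $(q_{k+1|k}, s_{k+1|k}(x))$ by using the RFS-based measurement model.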
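Based on the SMC-based implementation of the proposed data-assimilation algorithm, we select $M$ predicted submarine states from the particle system $\{w_{k+1|k}^{(i)}, x_{k+1|k}^{(i)}\}$. So, we can get $M$ predicted ideal measurements for the computation of the reward function. The key to computing (40) is the computation of the following integral:
$$\int s_{k+1|k+1}^{\alpha}(x)\, s_{k+1|k}^{1-\alpha}(x)\, dx \tag{42}$$
Using the SMC-based implementation of the proposed data assimilation, it can be computed as follows. Let $s_{k+1|k}(x)$ be approximated by $\{w_{k+1|k}^{(i)}, x_{k+1|k}^{(i)}\}$. By taking control action $v$, we can obtain a sample of future measurement sets $\{Z_{k+1}^{(j)}(v)\}_{j=1}^{M}$, consisting of $M$ submarine-originated noiseless measurement sets. According to the proposed RFS-based data-assimilation algorithm, for each $Z_{k+1}^{(j)}(v)$, $s_{k+1|k+1}(x)$ can be approximated by the particle system $\{w_{k+1|k+1}^{(i,j)}, x_{k+1|k}^{(i)}\}$, where $w_{k+1|k+1}^{(i,j)}$ is computed according to (31)-(32). Since both particle systems share the same support, Equation (42) can be approximated as:
$$\int s_{k+1|k+1}^{\alpha}(x)\, s_{k+1|k}^{1-\alpha}(x)\, dx \approx \sum_{i} \big(w_{k+1|k+1}^{(i,j)}\big)^{\alpha} \big(w_{k+1|k}^{(i)}\big)^{1-\alpha}$$
In conclusion, the computation of the reward function assigned to every control action $v \in U_k$ for the RFS-based online sensor control method is as follows:
$$E[\varphi(v)] \approx \frac{1}{M(\alpha - 1)} \sum_{j=1}^{M} \log \left[ \big(1 - q_{k+1|k+1}^{(j)}\big)^{\alpha} (1 - q_{k+1|k})^{1-\alpha} + \big(q_{k+1|k+1}^{(j)}\big)^{\alpha} q_{k+1|k}^{1-\alpha} \sum_{i} \big(w_{k+1|k+1}^{(i,j)}\big)^{\alpha} \big(w_{k+1|k}^{(i)}\big)^{1-\alpha} \right]$$
where $q_{k+1|k+1}^{(j)}$ is the existence probability updated with $Z_{k+1}^{(j)}(v)$, followed by the application of (41).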
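The particle-based reward and the action selection can be sketched as follows. This is an illustration under stated assumptions: predicted and updated weights live on the same particles, $0 < \alpha < 1$, and both existence probabilities lie strictly between 0 and 1. The callback `simulate(v)` is hypothetical; it is assumed to return the predicted pair for action `v` together with one updated pair per simulated future measurement set.

```python
import math


def bernoulli_renyi_reward(q_pred, w_pred, q_upd, w_upd, alpha=0.5):
    """Renyi divergence between updated and predicted Bernoulli PDFs on shared particles."""
    # Particle approximation of the integral of s_upd^alpha * s_pred^(1 - alpha).
    overlap = sum((wu ** alpha) * (wp ** (1.0 - alpha))
                  for wu, wp in zip(w_upd, w_pred))
    inner = (((1.0 - q_upd) ** alpha) * ((1.0 - q_pred) ** (1.0 - alpha))
             + (q_upd ** alpha) * (q_pred ** (1.0 - alpha)) * overlap)
    return math.log(inner) / (alpha - 1.0)


def select_action(actions, simulate, alpha=0.5):
    """Pick the action with the largest sample-mean reward.

    simulate(v) -> (q_pred, w_pred, [(q_upd, w_upd), ...])  # hypothetical callback
    """
    best_action, best_value = None, -math.inf
    for v in actions:
        q_pred, w_pred, updates = simulate(v)
        value = sum(bernoulli_renyi_reward(q_pred, w_pred, q_u, w_u, alpha)
                    for q_u, w_u in updates) / len(updates)
        if value > best_value:
            best_action, best_value = v, value
    return best_action, best_value
```

Because the reward reuses the weights already produced by the data-assimilation update, evaluating one more candidate action costs only one additional prediction and $M$ weight updates, with no new particles to draw.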

Simulation Experiments
To verify the effectiveness of the proposed digital twin and RFS-based approach for sensor control in ASW, we carried out three groups of experiments: the first concerns the data-assimilation algorithm, the second concerns online sensor control with a single submarine, and the third concerns online sensor control with multiple submarines. The data-assimilation experiment verifies the correctness and effectiveness of the proposed RFS-based data-assimilation algorithm for assimilating online measurements into the simulation system. The two groups of online sensor control experiments build on the data-assimilation experiment and verify the proposed sensor control method for ASW. They use the prediction results of the data assimilation to control the real sensor of the anti-submarine ship.

Data-Assimilation Experiment
As shown in Figure 5, we adopt the identical-twin experiment to evaluate the proposed data-assimilation algorithm [34,46]. In this experiment, the RFS-based simulation model of the enemy submarine is first run with a designated initial setting, and the measurements corresponding to the simulation results are recorded. These simulation results are regarded as the real states of the physical ASW, and the recorded measurements are regarded as the real-time measurements generated by the real sensor. We assimilate the measurements using the proposed data-assimilation algorithm and then compare the assimilated simulation states with the real states. Three terms are used to present the experimental results: the real submarine state, the simulated one, and the assimilated one. A real submarine state is the simulated state from which the measurements are recorded. A simulated submarine state is the simulation result obtained from biased initial parameters, for example an imprecise process noise intensity; this reflects the fact that a submarine simulation usually starts from parameters that differ from those of the real submarine in ASW. Here "biased" means that the parameters differ from those used to generate the real submarine state. Finally, an assimilated submarine state is the data-assimilation-enhanced simulation result based on the same biased initial parameters as the simulated one. The goal of this experiment is to show that assimilating measurements makes the assimilated submarine state more accurate than the simulated one.

Experimental Setup
In the real ASW, the submarine moves at a speed of approximately 5 knots. The scan repetition time of the sensor on the anti-submarine ship is 30 s. The probability of detection is assumed to be Gaussian distributed with mean 0 and covariance σ D = 5000. The number of clutters per scan is assumed to be Poisson distributed with the mean value λ = 1. The parameters of the data-assimilation process are as follows: particle number N = 5000, birth probability p b = 0.01. The initial parameters for the enemy submarine and the anti-submarine ship are presented in Table 1.
The performance measure of the experimental results is the positional root mean square (RMS) error, defined as follows: where P is the total number of Monte Carlo runs, (x k|k , y k|k ) is the assimilated (or simulated) submarine state at time k in the pth run, and (x k , y k ) is the ground truth.
Figure 6a displays the real, simulated, and assimilated submarine trajectories, averaged over 500 Monte Carlo runs. Figure 6b displays the RMS error curves averaged over the same 500 runs. This experiment tests the effectiveness of the proposed data-assimilation algorithm when the process noise intensity, initial speed, heading, and position are biased. Figure 6 shows that the simulated state deviates strongly from the real submarine state because of the erroneous initial parameters, whereas the assimilated submarine state is much closer to the real one. By using the proposed data-assimilation algorithm, the assimilated submarine state overcomes the erroneous initial parameters and matches the real submarine state with much smaller errors.
Figure 7a illustrates a typical single-run result for the submarine's existence probability obtained by the data-assimilation algorithm with = 0.2. The red dotted line q k|k = 1 is the ground truth of the submarine's existence probability, which means that the enemy submarine is present in the real ASW throughout the run. The existence probability shown in Figure 7a grows to 1 after some time steps and remains high for the rest of the simulation. When a detection is missed, it drops but stays above 0.8. Figure 7b shows the submarine's existence probability averaged over 500 Monte Carlo simulations. As time evolves, the assimilated existence probability gradually approaches 1. The occasional missed detections and false detections do not markedly affect the performance of the data-assimilation algorithm in this application.
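As an illustration of the positional RMS error used above, the following Python sketch computes one error value per time step from P Monte Carlo runs. It assumes the usual form of the measure, namely the square root of the squared position error averaged over the runs; the function name and the data layout (a list of runs, each a list of (x, y) positions) are hypothetical.

```python
import math


def positional_rms_error(estimates, truth):
    """Positional RMS error per time step.

    estimates: list of P runs, each a list of (x, y) estimates per time step.
    truth:     list of (x, y) ground-truth positions per time step.
    """
    num_runs = len(estimates)
    errors = []
    for k, (x_true, y_true) in enumerate(truth):
        # Sum of squared position errors at time step k over all runs.
        squared_sum = sum(
            (run[k][0] - x_true) ** 2 + (run[k][1] - y_true) ** 2
            for run in estimates
        )
        errors.append(math.sqrt(squared_sum / num_runs))
    return errors


if __name__ == "__main__":
    truth = [(0.0, 0.0), (1.0, 1.0)]
    runs = [[(3.0, 4.0), (1.0, 1.0)], [(3.0, 4.0), (1.0, 1.0)]]
    print(positional_rms_error(runs, truth))
```

The same function can be applied to the simulated and to the assimilated states, which gives the two kinds of curves compared in Figure 6b.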
The results of the data-assimilation experiment show that the proposed data-assimilation algorithm can successfully assimilate the online measurements into the running simulation model of ASW. In the following section, we analyze the sensitivity of the proposed data-assimilation algorithm.

Sensitivity Analysis
The influence of the particle number N on the overall performance of the proposed data-assimilation algorithm is studied by using different particle numbers. Figure 8a shows the RMS position errors averaged over 500 Monte Carlo simulations for different values of N. The RMS position error decreases as N increases. However, once N exceeds a certain value, its influence on the RMS position error becomes very small. This is consistent with standard particle filter theory. Figure 8b shows the influence of N on the estimated probability of submarine existence q k|k for k = 1, 2, · · · , 80. In this application, N has very little influence on the estimated probability of submarine existence.
To estimate the influence of the mean value λ of clutters on the performance of the data-assimilation algorithm, Figure 9a,b shows the RMS position errors and the assimilated probability of submarine existence obtained for different λ. Figure 9a shows that when λ is smaller than a certain value, its influence on the RMS position error is very limited. As λ increases further, the RMS position error gradually increases too. Figure 9b shows that as λ increases, the error of the estimated probability of submarine existence also increases, and the assimilated probability of submarine existence takes more time to approach the ground truth. These results agree with Equations (22), (23) and (31): when λ increases, the updated weight w of the particles representing the true submarine states decreases, which increases the RMS position error.
We also compare the performance of the data-assimilation algorithm for different settings of the maximum detection probability p D,Max . The results are shown in Figure 10a,b. The larger p D,Max is, the smaller the errors in RMS position and in the probability of submarine existence are. According to (31), if p D,Max increases, the updated weights for the submarine-generated measurements increase, which also improves the accuracy of data assimilation.
The influence of p s on the performance of the data-assimilation algorithm is shown in Figure 11a,b. Increasing p s improves the accuracy of both the estimated submarine states and the estimated probability of submarine existence. From Equation (19), if p s increases, the predicted existence probability q also increases. From Equation (28), if p s increases, the predicted weights of the surviving particles increase, which reduces the RMS position error.

Online Sensor Control Experiment with Single Submarine
To verify the effectiveness of the proposed online sensor control method, we use a scenario where the anti-submarine ship trajectory consists of only two constant velocity motion legs. The online sensor control is conducted at the end of the first leg, when the choice is between different turns at different headings. In this experiment, there is only one enemy submarine in the operational area.

Experimental Setup
During the first leg, the speed of the anti-submarine ship is 4 knots and the course is −50°. As shown in Figure 12, at the end of the first leg (k = 50), the anti-submarine ship needs to choose a new course for the second leg. We verify the performance of the proposed online sensor control method by using the detection parameters p D,Max = 0.98, σ D = 5000 and λ = 5. To find the best option for the anti-submarine ship heading among the 24 second-leg options, we need the RMS position error at time step k = 51 for each admissible course θ = −170°, −155°, · · · , 175°. We obtained these RMS position errors by fixing the value of θ and conducting 500 Monte Carlo runs for each θ to compute the averaged RMS position errors.
The set of admissible control actions U k is determined as follows. If the current position of the anti-submarine ship is u k = [χ k ψ k ] T , its future admissible locations are:
\[
U_k = \left\{ \left( \chi_k + V_{ship} \cos(l \Delta_\theta + \theta_0),\; \psi_k + V_{ship} \sin(l \Delta_\theta + \theta_0) \right);\; l = 1, \cdots, N_\theta \right\} \tag{46}
\]
where Δ θ = 2π/N θ is the selected course step size, θ 0 = −50° is the initial course of the anti-submarine ship, N θ = 24, and V ship = 4 knots. The anti-submarine ship can keep its current course (case l = 24) or move on one of the other courses, so 24 control actions are considered at the end of the first motion leg. We first obtain the RMS position errors for all the courses at the time step at which the control actions are executed. Then we test the online sensor control method by counting the number of times out of 500 Monte Carlo runs that it chooses each particular course.
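The construction of the admissible set in (46) can be sketched in Python as follows. The function names are hypothetical, and `step_length` is an assumption: it stands for the distance the ship covers between two control decisions, which (46) writes simply as V ship. The second helper lists the corresponding courses in degrees, wrapped to the interval [−180°, 180°).

```python
import math


def admissible_actions(position, step_length, initial_course_deg=-50.0, num_courses=24):
    """Candidate future ship positions, one per admissible course, as in (46)."""
    chi, psi = position
    course_step = 2.0 * math.pi / num_courses
    theta0 = math.radians(initial_course_deg)
    actions = []
    for l in range(1, num_courses + 1):
        course = l * course_step + theta0
        actions.append(
            (chi + step_length * math.cos(course), psi + step_length * math.sin(course))
        )
    return actions


def admissible_courses_deg(initial_course_deg=-50.0, num_courses=24):
    """Admissible courses in degrees, wrapped to [-180, 180)."""
    step_deg = 360.0 / num_courses
    courses = []
    for l in range(1, num_courses + 1):
        raw = initial_course_deg + l * step_deg
        courses.append((raw + 180.0) % 360.0 - 180.0)
    return courses


if __name__ == "__main__":
    print(sorted(admissible_courses_deg()))
    print(admissible_actions((0.0, 0.0), step_length=1.0)[-1])
```

With the values used in this experiment (θ 0 = −50°, N θ = 24), the wrapped courses run from −170° to 175° in 15° steps, and the last candidate (l = 24) keeps the current course.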

Experimental Results
The RMS position errors for different courses are plotted in Figure 13. It indicates that the second-leg courses of 70° and 85° are preferred in this experiment. After finding the best control actions for the second-leg heading, we let the anti-submarine ship make its own decision by using the proposed online sensor control method. Figure 14 shows the number of times out of 500 Monte Carlo runs that the online sensor control method chose each particular course. Figures 13 and 14 together show that the proposed online sensor control method gives a suitable control decision by using the digital twin-based framework of sensor control in ASW. The results support the correctness and effectiveness of the proposed digital twin and RFS-based framework. In the following section, we study the influence of the number M of prediction measurements on the results.

Sensitivity Analysis
We analyze the influence of the number M of prediction measurements on the proposed online sensor control method. Table 2 shows the results for different values of M. From Table 2, we can see that as M increases, the number of times the good control actions (courses 70° and 85°) are chosen increases, and the number of times the bad control actions are chosen decreases. However, the effect is limited: even a large increase in M improves the performance of the method only slightly.
Table 2 (partial):
45  23  37  28  22  15  17  11  9  5  7  6  8  2
−65°  45  42  27  25  32  30  32  33  30  29  30  30  31

Online Sensor Control Experiment with Multiple Submarines
In this experiment, we verify the effectiveness of the proposed online sensor control method by a scenario where the anti-submarine ship tracks multiple submarines. Here we control the range-only sensor of the anti-submarine ship by using the proposed digital twin and RFS-based method.

Experimental Setup
The state of a single submarine at time step k is represented by is the distance between the sensor and the submarine at p k . In this experiment, the operational area is a square of side s = 1200 m, R 0 = 300 and h = 0.0002 m −1 . The measurement is generated by where ω is zero-mean white Gaussian measurement noise with deviation σ ω = σ 0 + β ‖p k − u o k ‖ 2 . In this experiment, σ 0 = 1 m and β = 5 × 10 −5 m −1 . The clutters are modeled as a Poisson RFS. The intensity of clutters is modeled by the uniform density κ(z) = λ · c(z) with mean λ = 5.
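A range-only measurement process with the noise and clutter parameters quoted above can be sketched in Python as follows. Several parts are assumptions rather than the paper's model: the measurement is taken to be the sensor-to-submarine distance plus noise, the clutter ranges are drawn uniformly between 0 and the diagonal of the 1200 m square, and all function names are hypothetical. The detection probability, which involves R 0 and h, is not modeled here.

```python
import math
import random


def range_measurement(sub_pos, sensor_pos, rng, sigma0=1.0, beta=5e-5):
    """Noisy range: distance plus Gaussian noise whose deviation grows with distance."""
    distance = math.dist(sub_pos, sensor_pos)
    sigma = sigma0 + beta * distance ** 2
    return distance + rng.gauss(0.0, sigma)


def poisson_sample(rng, mean_count):
    """Draw a Poisson-distributed count with Knuth's multiplication method."""
    threshold = math.exp(-mean_count)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def clutter_ranges(rng, mean_count=5.0, max_range=1200.0 * math.sqrt(2.0)):
    """Poisson number of clutter ranges, uniform on [0, max_range] (assumed support)."""
    number = poisson_sample(rng, mean_count)
    return [rng.uniform(0.0, max_range) for _ in range(number)]


if __name__ == "__main__":
    rng = random.Random(0)
    scan = [range_measurement((600.0, 800.0), (0.0, 0.0), rng)] + clutter_ranges(rng)
    print(scan)
```

With these parameters the noise deviation is 1 m at zero range and 51 m at a range of 1000 m, which is why moving the sensor closer to a submarine yields more accurate measurements.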
In this experiment, there are 5 moving enemy submarines in the operational area. As shown in (46), we control the sensor by finding the optimal course. To validate the proposed method, we compare it with a baseline in which the control vector is selected at random from the set U k . Each method is run 10 times, and the averaged OSPA errors (order parameter p = 2 and cutoff c = 100 m) are compared.
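For readers unfamiliar with the OSPA error, the following Python sketch implements its standard definition for two finite sets of positions, with the order p = 2 and cutoff c = 100 m used here as defaults. The function name is hypothetical, and the optimal assignment is found by brute force over permutations, which is adequate for a handful of submarines but is not how a production implementation would do it.

```python
import itertools
import math


def ospa(truth, estimates, p=2.0, c=100.0):
    """OSPA distance between two finite sets of points (brute-force assignment)."""
    if not truth and not estimates:
        return 0.0
    # 'small' has m points, 'large' has n >= m points.
    small, large = sorted((truth, estimates), key=len)
    m, n = len(small), len(large)
    # Best assignment of every point in 'small' to a distinct point in 'large',
    # with each distance cut off at c.
    best_cost = min(
        sum(min(c, math.dist(small[i], large[j])) ** p for i, j in enumerate(perm))
        for perm in itertools.permutations(range(n), m)
    )
    # Each unassigned point is penalized with the cutoff c.
    return ((best_cost + (c ** p) * (n - m)) / n) ** (1.0 / p)


if __name__ == "__main__":
    true_positions = [(0.0, 0.0), (500.0, 500.0)]
    estimated_positions = [(3.0, 4.0)]
    print(ospa(true_positions, estimated_positions))
```

Because a missing or spurious estimate costs the full cutoff c, the OSPA error penalizes errors in the estimated number of submarines as well as errors in their positions.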

Experimental Results
In this experiment, the anti-submarine ship controls the sensor by running the sensor control method every T c = 5 time steps. We compare the performance by using the OSPA error at every time step. The mean OSPA errors of the two methods are given in Figure 15. The proposed digital twin and RFS-based online sensor control method effectively reduces the OSPA error, and the OSPA error of the proposed method also gradually decreases as time evolves in this application.
Figure 16 gives the estimated numbers of submarines in 10 Monte Carlo simulations generated by using the proposed data-assimilation algorithm and the different online sensor control methods. The black line represents the true number of submarines, and the data points of different shapes represent the estimated number of submarines at each time step of the 10 Monte Carlo simulations. Figure 16a is the result of the proposed sensor control method, and Figure 16b is the result of the random control method. The proposed sensor control method performs much better than the random control method in estimating the number of submarines.
Figure 17 gives the estimated submarine states in 10 Monte Carlo simulations generated by using the proposed data-assimilation algorithm and the different online sensor control methods. The black lines represent the true submarine states, and the data points of different shapes represent the estimated submarine states. Figure 17a is the result of the proposed sensor control method, and Figure 17b is the result of the random control method. The proposed sensor control method also performs much better than the random control method in estimating the submarine states. The paths of the anti-submarine ship for the two methods are shown in Figure 18. Different colors represent different Monte Carlo simulations, and the red points represent the true trajectories of the submarines. The proposed sensor control method successfully guides the sensor close to the enemy submarines, where it obtains more accurate and more reliable measurements.

Sensitivity Analysis
In this section, the performance of the proposed online sensor control method is further examined by sensitivity analysis. We analyze the influence of the following parameters on its performance: the time interval T c between two sensor control actions, the factor α, the number M of prediction measurements, and the particle number N for each submarine.
Figure 19 shows that as T c increases, the mean OSPA error also increases, so decreasing T c improves the performance of the proposed online sensor control method. Figure 20 gives the OSPA error for various values of the parameter α; in this application, α has little influence on the performance of the proposed sensor control method. The influence of the number M of prediction measurements is shown in Figure 21, and it is also small. The reason is that the performance of the proposed method depends not only on the reward function but also on the data-assimilation algorithm. As shown in Figure 22, the particle number N does affect the performance of the proposed method: increasing N leads to a more reasonable choice of control action.

Conclusions
In this paper, we studied a digital twin-based framework of sensor control in ASW. We first combined the simulated ASW with the real ASW by employing digital twin theory. We then proposed an RFS-based data-assimilation algorithm to dynamically incorporate online measurements generated from the real ASW. Finally, we enabled the simulation system to control the sensor in ASW by deriving and implementing the data-assimilation-based reward function. The proposed data-assimilation algorithm has the potential to overcome the limitations of conventional vector-based algorithms, because it can jointly estimate the number of enemy submarines and the state of each enemy submarine. We tested the proposed data-assimilation algorithm by using the identical-twin experiment. The results show that the proposed algorithm can assimilate the input measurements and improve the accuracy of the simulation results. We tested the proposed online sensor control method with two groups of experiments, covering a scenario with a single submarine and a scenario with multiple submarines. The results show that the proposed online sensor control method can effectively control the sensor. This paper can be regarded as an application of digital twin in ASW, and the methods can also be applied to other applications.