Multi-Ship Control and Collision Avoidance Using MPC and RBF-Based Trajectory Predictions

The field of automatic collision avoidance for surface vessels has been an active field of research in recent years, aiming for the decision support of officers in conventional vessels, or for the creation of autonomous vessel controllers. In this paper, the multi-ship control problem is addressed using a model predictive controller (MPC) that makes use of obstacle ship trajectory prediction models built on the RBF framework and is trained on real AIS data sourced from an open-source database. The usage of such sophisticated trajectory prediction models enables the controller to correctly infer the existence of a collision risk and apply evasive control actions in a timely manner, thus accounting for the slow dynamics of a large vessel, such as container ships, and enhancing the cooperation between controlled vessels. The proposed method is evaluated on a real-life case from the Miami port area, and its generated trajectories are assessed in terms of safety, economy, and COLREG compliance by comparison with an identical MPC controller utilizing straight-line predictions for the obstacle vessel.


Introduction
In the last two decades, research on automatic collision avoidance and optimal path planning for surface vessels has intensified, driven by the ever-growing density of maritime traffic in narrow waterways, such as gulfs, ports, and canals [1]. Motivated by the design of autonomous surface vehicles (ASV) controllers, but also aiming for the decision support of officers on watch in conventional vessels [2], control and optimization tools that ensure the safety and the cost effectiveness of navigational actions are being intensively developed. These tools are perceptive of the surrounding environment through arrays of sensors, radars, and other positioning and communication aids. In this context, the automatic identification system (AIS) encompasses most of the aforementioned technologies in order to gather positioning and other vessel data. The already vast AIS comprises an everexpanding worldwide maritime trajectory dataset, which is made available by vessels, port authorities, and other platforms in charge of efficient and safe maritime path planning. Given the fact that the majority of vessel accidents are related to erroneous handling rather than equipment failure or environmental conditions [3], these tools aim to phase out the human officers on watch as vessel controllers, or at least augment their navigational decision-making using optimization-and prediction-based methods.
The formulation of the trajectory optimization problem used in collision avoidance controllers must take multiple aspects of vessel navigation into account, while being perceptive of their surrounding environment in real time. The generation of control actions that will result in a trajectory remaining sufficiently clear from any stationary or moving objects is not the sole objective; an efficient controller should also ensure the Sensors 2021, 21, 6959 2 of 23 economy of the control actions, as well as the adherence to the collision avoidance rules, commonly known as the COLREGs [4]. Multiple collision avoidance controllers have been proposed that fulfil the aforementioned specifications; in [5], a hierarchical multiobjective optimization problem is formulated, which generates an intermediate waypoint for the controlled vessel while accounting for the good seamanship rules. In [6], a fuzzy-Bayesian collision avoidance controller is formulated capable of addressing multiple obstacle vessels at once. In [7], optimal trajectories for the collision avoidance problem are generated using a B-Spline-based search algorithm. Lastly, in [8], a collision avoidance controller utilizes a probabilistic method in order to infer the one-step-ahead position of obstacle vessels, while also accounting for non-COLREG-compliant obstacle vessels.
In general, it has been observed that controllers that are not model-based can have trouble incorporating crucial aspects of the trajectory optimization problem, thus compromising practicality. Without a working model of the controlled vessel, its maneuvering capabilities cannot be easily included in the formulation, and neither can the effect of environmental conditions be quantified [9,10]. For these reasons, model predictive control (MPC) emerges as an effective control method for the problem at hand because it utilizes a model of the plant in order to compute an optimal control trajectory based on the predicted trajectory of other ships in the vicinity. As a framework, MPC can account for the uncertainties of both the utilized model of the plant and the trajectory prediction models of other ships, while also incorporating all possible control objectives (such as navigational risk, course smoothness, or deviation from the original path) in a single cost function. Some collision avoidance controllers based on MPC have been proposed in the literature; a robust MPC controller utilizing straight-line obstacle vessel trajectory predictions is proposed in [9], capable of COLREG compliance and handling of multiple obstacles. In [11], motion planning for an autonomous vessel using a sampling-based MPC method takes place. In [12], an MPC controller for the collision avoidance task is built by approximating the behavior of an LQR controller, thus ensuring asymptotic stability of the system. In [13], a neural network used to approximate the MPC response for the generation of COLREG compliant trajectories for multi ship encounters is presented. In addition, MPC has been integrated in distributed control frameworks of multi-ship schemes; for example, a distributed MPC scheme has been employed for a multi-vessel formation controller with collision avoidance capabilities [14], or for the robust distributed control of multiple vessels operating for the inter-terminal transport of containers [15].
It becomes apparent that for the scope of the collision avoidance task, information about the future trajectories of other ships plays a central role. Prevalent in non-data driven methodologies already used for the vessel trajectory prediction (VTP) problem is the first principles-based modeling technique [16], carrying a number of significant shortcomings, such as their inherent complexity, which has a greater negative impact due to the fact that the model is usually employed multiple times within the duration of each MPC sample. In order to simplify the solution of the employed kinematic differential equations and facilitate the real-time prediction of future states, these types of models are usually created using several assumptions that try to approximate real-world conditions, but also make the final model far less accurate. Therefore, one should employ a more sophisticated, data-driven approach for the creation of effective trajectory prediction models that are included in MPC controllers. Machine learning has answered the call of producing highly accurate models, which may be easily integrated in predictive frameworks through the use of black-box modeling, and more specifically, artificial neural network (NN) approaches [17]. NNs employ different architectures in order to remap the original non-linear problem to a higher-dimensional input space and approximate its dynamics utilizing standard functions. In this context, various NN techniques have been successfully utilized in control frameworks solving the vessel trajectory prediction problem.
Feedforward NN architectures, most commonly represented by the multilayer perceptrons (MLPs), have been employed to solve the vessel trajectory prediction problem as in [18,19], where MLP NNs are trained using the well-established backpropagation algo-rithm outperforming rival methodologies, i.e., linear models and Kalman filters. In [19], a real AIS dataset gathered from the confined space of a river waterway is used to approximate the vessel dynamics in such environments. Backpropagation has been the baseline of more efficient training methods as in [20], where different computational intelligence approaches like differential evolution, genetic algorithms, and swarm-based techniques are used to modify the original backpropagation algorithm in order to create more accurate feedforward NN models. Other NN architectures, like support vector machines (SVMs), have been employed in conjunction with computational intelligence optimization techniques, i.e., the particle swarm optimization (PSO) algorithm, on AIS datasets to solve the vessel trajectory prediction problem [21]. In most cases, the inherent abilities of NN architectures that can meet the standard of high accuracy are limited to a one-step ahead prediction horizon, in the sense that multi-step ahead predictions would require an approximation of unknown future states to be made and present an error enlarged through propagation to the end of the prediction horizon. Such an error would become critically high after a small number of steps, rendering the control framework useless.
To overcome this problem, long-term trajectory prediction approaches have been devised with the inclusion of memory features, such as the recurrent neural networks (RNNs), with their most notable representative, i.e., the long short-term memory (LSTM) NNs already used in the context of the vessel trajectory prediction problem [22][23][24][25]. Besides trajectory modeling and prediction in open waters, advances have also been made in crowded port waters as in [26], where another modification of the RNNs, namely the bidirectional gated recurrent unit, is used to address the vessel trajectory prediction problem, outperforming standard NN methods in such scenarios. Gated recurrent units are promising candidates for predicting the collective behavior of vessel fleets [27].
Radial basis function (RBF) networks form a unique NN architecture belonging to the general feedforward NN category. RBFs differ from other NN architectures, having simpler structures, employing faster training algorithms, and usually producing more accurate models than MLPs. Within the context of vessel trajectory prediction, RBFs have been integrated in control frameworks by approximating unknown vessel parameters [28][29][30]. Recently, RBFs have been applied on real AIS data in order to produce highly accurate models for one-step and multi-step ahead predictions [31], showing their potential in being integrated to receding horizon control methodologies.
Remarkably, in the collision avoidance research literature, there are no instances where the multi-step-ahead trajectory prediction of moving obstacles is addressed in such a systematic manner; usually these trajectories are either known a priori, or there are no obstacle ships present whatsoever. An exception occurs in [9], where straight line trajectory predictions are employed, based on estimates of current course and speed for the moving obstacle. In this work, a multi-ship MPC controller utilizing RBF obstacle trajectory prediction models trained using real AIS data is presented for the collision avoidance task. The main contributions of this work are as follows: first, we introduce a novel MPC scheme for collision avoidance control, where nonlinear data-driven models are used to predict the trajectories of obstacle ships; to the authors' best knowledge, this is the first such instance in the literature. The previous state-of-the-art approach of using straight-line obstacle trajectory predictions may have yielded satisfactory approximation results in open sea case studies, where ships are expected to travel in a straight line, but is of limited practical use for the cases of narrow gulfs, ports, or canals, where ships are expected to maneuver while navigating through, thus resulting to nonlinear trajectories. Secondly, the aforementioned nonlinear models are trained using an AIS data-driven methodology, which, once again, constitutes an original approach in the context of vessel collision avoidance. Using real trajectories as training data increases the practicality of the proposed scheme, since the resulting trajectory prediction models capture the dynamics of real vessels. The proposed method is tested in a collision avoidance case study at the Miami port area, and its performance is illustrated by the comparison with an MPC controller employing straight-line obstacle prediction models, which corresponds to the current state-of-the-art approach [9].
The paper is structured as follows. In Section 2, the AIS-data-driven methodology for the creation of the RBF trajectory prediction models is presented. In Section 3, some preliminaries on maritime collision avoidance and optimal trajectory generation are described, and later, the proposed method is presented. In Section 4, the case study based on the port of Miami is outlined, and the simulation results are discussed in depth. Lastly, in Section 5, concluding remarks are made.

Radial Basis Function Neural Networks
RBF NNs have been successfully employed to approximate nonlinear system dynamics in order to predict future system states in numerous diverse applications [32,33]. Their success can be mainly attributed to their structure, which is simpler when compared to other NN architectures, as they comprise a single hidden layer, attached linearly to the network output. This property not only allows for using very fast training algorithms, but also makes RBF NNs suitable for integration in MPC schemes, as (a) it facilitates the controller design [34], and (b) helps to solve the MPC online optimization problem in shorter computational times [35], thus rendering such schemes applicable to systems with fast dynamics [36]. Another property that makes RBF NNs a popular modeling method in predictive control schemes is related to their increased approximation capabilities [37,38]. Indeed, notwithstanding their simple structure, RBF NNs have proven to be more accurate in modeling various nonlinear systems when compared to other machine learning methods, including MLPs, support vector machines (SVMs), random forests, etc. [39]. Especially within the context of the vessel modeling and control problems, RBFs have already shown great potential in modeling unknown vessel parameters [28,29], in order to create models capable to be integrated in trajectory tracking control algorithms. Furthermore, in a recent publication [31], it has been shown that RBF NNs trained with the fuzzy means (FM) algorithm outperform other data-driven techniques such as MLPs when modeling vessel trajectories based on AIS data.
Training an RBF NN is a process consisting mainly of two phases. The first phase is performed by applying a clustering technique on the training dataset in order to identify the centerpoint and optimized parameters of a number of radially symmetric basis functions called nodes. The incorporation of radial basis functions (e.g., Gaussian, quadratic, etc.) is the first main difference between RBFs and other feedforward NNs. The linear combination of these nodes produces the output prediction of the network. Finding the node weights is the goal of the second phase, a problem trivially solved by least squares.
A typical RBF network can be seen in Figure 1. The structure comprises three distinct layers, the first of which is the input layer and has the sole purpose of distributing the N inputs to the L nodes of the hidden layer. The second point differentiating RBFs from other architectures is the existence of only one hidden layer of N-dimensional nodes. In order to produce a predictionŷ(k) given an input datapoint x(k), at first the Euclidean norm is used to calculate the activity µ l (x(k)) for every node l by using the difference between the k-th input vector x(k) and the l-th node centerû l , such that The activity acts as input to the free parameters of each node according to the chosen RBF. The hidden node response vector z(k) for the k-th datapoint is given by z(k) = [g(µ 1 (x(k))), g(µ 2 (x(k))), ..., g(µ L (x(k)))] (2) where g corresponds to the chosen activation function. Note that in this work, a thin plate spline function is employed g(µ l (x(k))) = µ 2 l (x(k)) · log µ l (x(k)) (3) due to the fact that it is an established choice as an RBF kernel producing models of high accuracy [40], but also because there are no tunable parameters other than the actual input to the function. Such parameters would require optimization techniques to be included in the training process, e.g., employment of the Gaussian function would need kernel width optimization, which is usually performed by cpu-intensive iterative algorithms or suboptimal trial-and-error techniques. Finally, the network's prediction is calculated by linearly combining the hidden note responses such thatŷ where w is a vector containing the node weights. where g corresponds to the chosen activation function. Note that in this work, a thin plate spline function is employed due to the fact that it is an established choice as an RBF kernel producing models of high accuracy [40], but also because there are no tunable parameters other than the actual input to the function. Such parameters would require optimization techniques to be included in the training process, e.g., employment of the Gaussian function would need kernel width optimization, which is usually performed by cpu-intensive iterative algorithms or suboptimal trial-and-error techniques. Finally, the network's prediction is calculated by linearly combining the hidden note responses such that where w is a vector containing the node weights.

The Fuzzy Means (Fm) Training Algorithm
For a given training dataset where Y denotes the real outputs, and after the hidden node responses Z are formulated, the weight vector w can be trivially calculated by least squares in matrix form thus completing the second training phase in one easy step. The first phase of training requires a clustering algorithm to be applied to the training dataset, in which case the fuzzy means (FM) algorithm is a great candidate for this task [39]. Following the notation of the previous example, let us suppose a system with N normalized input variables i x . At first, each input variable domain must be partitioned into s 1-D fuzzy sets (FS). Each fuzzy subspace l A , where 1,2,..., N l s = , is formed by the selected N s fuzzy sets according to the respective input variable. This process creates a Ndimensional grid, where all intersection points, also called nodes, are candidates to become RBF centerpoints. The FM algorithm undertakes the task of selecting the appropriate candidate nodes that will be assigned as RBF centers. To perform the selection procedure, a membership function

The Fuzzy Means (Fm) Training Algorithm
For a given training dataset where Y denotes the real outputs, and after the hidden node responses Z are formulated, the weight vector w can be trivially calculated by least squares in matrix form thus completing the second training phase in one easy step.
The first phase of training requires a clustering algorithm to be applied to the training dataset, in which case the fuzzy means (FM) algorithm is a great candidate for this task [39]. Following the notation of the previous example, let us suppose a system with N normalized input variables x i . At first, each input variable domain must be partitioned into s 1-D fuzzy sets (FS). Each fuzzy subspace A l , where l = 1, 2, . . . , s N , is formed by the selected s N fuzzy sets according to the respective input variable. This process creates a N-dimensional grid, where all intersection points, also called nodes, are candidates to become RBF centerpoints. The FM algorithm undertakes the task of selecting the appropriate candidate nodes that will be assigned as RBF centers. To perform the selection procedure, a membership function Sensors 2021, 21, 6959 6 of 23 determines whether an input vector lies within the domain of an RBF centered around a candidate node. In the simple case where all input variable spaces are equally partitioned, the following distance metric can be used to assign N-dimensional spherical domains to each candidate node where x(k) is the k-th input vector, a l i,ji is the centerpoint of fuzzy subspace A l , and δa is the sphere radius, which is the same for each input. The FM algorithm uses a fast non-iterative procedure to find a subset of the subspaces, so that the final RBF NN's hidden layer comprises only the fuzzy sets, which sufficiently cover all training datapoints, in the sense that each datapoint is included in at least one fuzzy set. For an in-depth description of the FM algorithm, the reader may refer to [39].

Data Preprocessing
Best modeling practices mandate that a training dataset should be error-and noisefree, a case that is far from truth when using data from AIS transceivers [41][42][43][44][45]. AIS data are irregularly sampled and contain heavy noise, missing data, and erroneous values. Thus, before employing any modeling technique, rigorous preprocessing is in order.
The Marine Cadastre service (www.marinecadastre.gov, accessed on 25 July 2021) has been the source of all data used in this work. MarineCadastre.gov is a is a service that gathers and publicly provides AIS data to marine planning initiatives. In this work, data from all days between 1 July 2019 and 30 June 2020 have been included and filtered to keep vessels sailing an area around the port of Miami covered by the geolocation rectangle defined by the latitudes of 25.720 • through 25.840 • and the longitudes of −80.145 • through −80.042 • . To conform to the initial assumption of similar size and similar dynamics, we allowed only cargo ships sailing on engine power into the dataset, further filtering the dataset to yield a total of 180 vessels.
To address the problems of sample irregularity, noise, and erroneous values, the dataset was resampled to 120 s, which was deemed enough to capture the high inertia dynamics of large cargo ships. The interpolation technique applied on the data to perform the resampling was the Akima piecewise cubic interpolation [46], which is quite effective on geolocation data, performing a mild denoising as well. A heuristic that rejects very far-off outlier values due to GPS errors was also applied. The trajectories were split in data samples, with each one containing ten consecutive vessel positions. Note that each trajectory's starting point should be the last point of the previous one resulting in an overlap of one point, but this final position will be used as the model's output, so no actual overlapping exists within the input data. The resampling and splitting process yielded a total of about 14 k samples from 3.1 k resampled trajectories of the initial 180 vessels. Algorithm 1 depicts the step-by-step procedure of preprocessing.

Algorithm 1 Preprocessing Stage
1: Process each entry in the common dataset so that it contains only the following: Vessel ID, timestamp, latitude, and longitude. Reject all other information. 2: Sort dataset by vessel ID and sort each vessel data by date. 3: Apply resampling and outlier filtering on the data of each vessel to achieve a resampling of 120 s. 4: Split vessel data into trajectories containing ten consecutive vessel positions each. 5: Create final preprocessed dataset, which should contain the vessel ID and final 10-position trajectories.

Modeling Procedures
AIS transceiver equipped vessels are able to record and exchange timestamped information, including geolocation, speed, direction, vessel identification, and specifications data. Vessel trajectory prediction algorithms integrated with collision avoidance techniques can incorporate trajectory prediction models in order to identify imminent threats and navigate safely and efficiently within heavily crowded port areas or open seas.
Let us suppose an available AIS dataset, comprising an arbitrary number of T v trajectories for a total of V vessels, where v = 1, 2, . . . , V. Let us also suppose that the included trajectories contain an arbitrary number of K v,t AIS messages AISm v,t k (timestamped geolocation and other data). In this work, for simplicity reasons, we employ the following format in AIS messages where k = 1, 2, . . . , K v,t , and Ts v,t k denotes the message timestamp, while y v,t k and x v,t k are the respective latitude and longitude contained in the k-th AIS message for the t-th trajectory of the v-th vessel. The fact that there are unknown parameters, e.g., the state and controls of the vessels, prohibits the use of kinematics in calculating future vessel states. Nevertheless, the vessel dynamics exist in the information hidden within the dataset and can be extracted and, in most cases, approximated by using a black-box modeling technique such as RBF NNs. We can assume that a common underlying pattern exists in the dynamics of samesize vessels executing similar maneuvers, for example, when approaching or leaving a port, when berthing, when crossing waterway paths, etc. Thus, if a suitable dataset of sufficient size is made available, an RBF NN can be trained to perform one-step-ahead predictions about a vessel's future geolocation by using past AIS messages as seen in the following equations where N is the number of past AIS messages given as inputs to the RBF NN. The accuracy and simple structures of FM-trained RBF NNs make them ideal to be integrated in multi-step-ahead predictive control formulations. Receding horizon techniques require models of very high accuracy due to the inevitable error enlargement through propagation. This effect appears when the incorporated model is expected to recurrently make future predictions based on its own previous output throughout the prediction horizon. The problem is further worsened with the increase of the prediction horizon and can ultimately drive the control algorithm to failure. Another important point to be noted is that the linear combination used to produce an RBF NN's prediction is a simple and very fast calculation, a fact that benefits model predictive control (MPC) frameworks [36], which require an optimization problem to be solved iteratively and expect a significant number of model predictions to be made in order to converge to the optimal solution at each timestep.
Delta values of the last position of each sample were used as the model's output, while the first nine positions were the model's input The ∆ŷ v,t k+1 and ∆x v,t k+1 values may be added to the last input position to calculate the final predicted vessel position. Based on the above procedure, the results of the modeling process produced an RBF model of very high accuracy [31]. The step-by-step procedure of the modeling stage can be seen in Algorithm 2. Note that the number of past inputs was determined after a trial-and-error procedure, where several RBF models were trained using a different number of inputs. After testing inputs in the range of 3 to 15 past vessel positions, data obtained on model performance showed that using less than nine inputs produced models with reduced prediction accuracy, while using more than nine inputs increased the model's complexity without any significant accuracy gain compared to the model using nine inputs.
so that each trajectory sample is in the form Moreover, a series of tests has been performed by the recurrent application of this model based on a horizon of 5 timesteps for all trajectories of the testing subset, where, at each successive timestep, the model had to use an increasing number of its own previous predictions. As the model uses more of its past predictions, accuracy decreases due to the enlargement of the propagated prediction error. Such a test can provide intuition on the models' ability to be incorporated in receding horizon predictive frameworks. The quality metrics used for these tests were the root mean squared error (RMSE) and the root mean squared haversine formula distance (RMSHFD). The haversine formula is commonly used to measure great circle distances on spherical surfaces. Table 1 presents the performance metrics obtained after the recurrent application of the chosen model in order to make predictions for the full length of the trajectories included in the testing subset of the training procedure. Mean RMSE values for the two outputs of the model, namely the latitude and longitude, are provided in degrees, wherein can be seen that the error lies in the order of 1.5 thousandth of a degree. The mean RMSHFD metric shows the respective error margin in meters when combining the two model outputs to get the actual predicted future vessel position for all tested trajectories. More details on the modeling procedure for the one-step ahead models, including detailed results and comparison with other machine learning approaches, can be found in [31].

Preliminaries on Maritime Collision Avoidance and Trajectory Generation
The objective of maritime collision avoidance is the generation of a risk-free trajectory that the controlled vessel should follow. A well-defined and effective method of assessing collision risk in the near future is the closest point of approach (CPA). Stemming from the concept of the CPA, two metrics are defined: time to CPA (TCPA) and distance to CPA (DCPA) (see Figure 2). A discussion regarding the quick calculation of TCPA and DCPA using the line-of-sight (LOS) distance between the controlled vessel and the obstacle ship is presented in [5]. These metrics depict the urgency of the collision danger of vessel i with another vessel j as well as its magnitude, and by specifying lowest acceptable thresholds d min and t min concerning the minimum DCPA and minimum TCPA, respectively, one can construct a risk cost function, as presented in [5].
step ahead models, including detailed results and comparison with other machine learning approaches, can be found in [31].

Preliminaries on Maritime Collision Avoidance and Trajectory Generation
The objective of maritime collision avoidance is the generation of a risk-free trajectory that the controlled vessel should follow. A well-defined and effective method of assessing collision risk in the near future is the closest point of approach (CPA). Stemming from the concept of the CPA, two metrics are defined: time to CPA (TCPA) and distance to CPA (DCPA) (see Figure 2). A discussion regarding the quick calculation of TCPA and DCPA using the line-of-sight (LOS) distance between the controlled vessel and the obstacle ship is presented in [5]. These metrics depict the urgency of the collision danger of vessel i with another vessel j as well as its magnitude, and by specifying lowest acceptable thresholds and concerning the minimum DCPA and minimum TCPA, respectively, one can construct a risk cost function, as presented in [5].
Here, is a scaling parameter, and denotes the trajectory matrix containing the x-y position of the i vessel for every timestep = ⋮ ⋮ .
By combining TCPA and DCPA, the spatial-temporal nature of a maritime collision risk with vessel i is successfully reflected. The physical interpretation of Equation (12) is that a candidate trajectory with larger minimum distance from an obstacle ship occurring at an earlier time will always be safer than a path with a smaller minimum distance and/or earlier time of occurrence. Common values for and are 10 min and 0.6 nm; because the present paper is concerned with collision avoidance in busy waterways such as ports, a lower value of 0.4 nm is used. In any case, Equation (12) can be readily incorporated in the cost function of an MPC optimization problem formulation.
A second item in the domain of trajectory generation is efficiency. Vessels should strive to not deviate too much from their original course when addressing a collision risk Here, a 0 is a scaling parameter, and T i denotes the trajectory matrix containing the x-y position of the i vessel for every timestep x n y n   .
By combining TCPA and DCPA, the spatial-temporal nature of a maritime collision risk with vessel i is successfully reflected. The physical interpretation of Equation (12) is that a candidate trajectory with larger minimum distance from an obstacle ship occurring at an earlier time will always be safer than a path with a smaller minimum distance and/or earlier time of occurrence. Common values for t min and d min are 10 min and 0.6 nm; because the present paper is concerned with collision avoidance in busy waterways such as ports, a lower d min value of 0.4 nm is used. In any case, Equation (12) can be readily incorporated in the cost function of an MPC optimization problem formulation.
A second item in the domain of trajectory generation is efficiency. Vessels should strive to not deviate too much from their original course when addressing a collision risk with another vessel. The efficiency of the generated trajectory T i for vessel i can be reflected by calculating the sum of absolute deviations from the original trajectory T OG,i Next, an important requirement to be fulfilled when addressing the problem of collision avoidance are the COLREGs [4]. The implementation of the COLREGs restricts the domain of possible candidate paths according to the type of encounter, for example 'head-on,' 'crossing,' and 'overtaking.' Head-on vessels should pass each other on the port side, while a vessel crossing from the starboard side should be given the right of way. A visual depiction of the encounter rules takes place in Figure 3. Multiple approaches for the modeling of the COLREG rules have been made in the literature [2,5,8], although these are usually concerned with a one-step-ahead calculation. However, for the case of an MPC controller, in order to ensure COLREG compliance for a candidate trajectory, all of its waypoints must be taken into account. By assuming that the LOS angle is increasing in the anti-clockwise direction, one needs to evaluate whether the LOS angles of each sequential trajectory timestep position are increasing monotonically, in order to confirm the compliance of the trajectory for the 'head-on' and 'give-way' situations.
the anti-clockwise direction, one needs to evaluate whether the LOS angles of each sequential trajectory timestep position are increasing monotonically, in order to confirm the compliance of the trajectory for the 'head-on' and 'give-way' situations.
The idea is depicted in Figure 4, where a head-on encounter between vessels i and j occurs; here, the LOS angles for trajectory monotonically increase, and therefore, it is deemed as compliant. In contrast, the monotonically decreasing LOS angles of the trajectory confirm its non-compliance as per the COLREGs intentions. A penalty for noncompliance of a vessel i encountering a vessel j in a 'head-on' or 'give-way' situation can therefore be formulated, where is the LOS angle vector, calculated for each trajectory point of encountering vessel i and the current position of encountered vessel j.   The idea is depicted in Figure 4, where a head-on encounter between vessels i and j occurs; here, the LOS angles for trajectory T i monotonically increase, and therefore, it is deemed as compliant. In contrast, the monotonically decreasing LOS angles of the T i trajectory confirm its non-compliance as per the COLREGs intentions. A penalty for noncompliance of a vessel i encountering a vessel j in a 'head-on' or 'give-way' situation can therefore be formulated, where a LOS ij is the LOS angle vector, calculated for each trajectory point of encountering vessel i and the current position of encountered vessel j.
Next, an important requirement to be fulfilled when addressing the problem of collision avoidance are the COLREGs [4]. The implementation of the COLREGs restricts the domain of possible candidate paths according to the type of encounter, for example 'headon,' 'crossing,' and 'overtaking.' Head-on vessels should pass each other on the port side, while a vessel crossing from the starboard side should be given the right of way. A visual depiction of the encounter rules takes place in Figure 3. Multiple approaches for the modeling of the COLREG rules have been made in the literature [2,5,8], although these are usually concerned with a one-step-ahead calculation. However, for the case of an MPC controller, in order to ensure COLREG compliance for a candidate trajectory, all of its waypoints must be taken into account. By assuming that the LOS angle is increasing in the anti-clockwise direction, one needs to evaluate whether the LOS angles of each sequential trajectory timestep position are increasing monotonically, in order to confirm the compliance of the trajectory for the 'head-on' and 'give-way' situations.
The idea is depicted in Figure 4, where a head-on encounter between vessels i and j occurs; here, the LOS angles for trajectory monotonically increase, and therefore, it is deemed as compliant. In contrast, the monotonically decreasing LOS angles of the trajectory confirm its non-compliance as per the COLREGs intentions. A penalty for noncompliance of a vessel i encountering a vessel j in a 'head-on' or 'give-way' situation can therefore be formulated, where is the LOS angle vector, calculated for each trajectory point of encountering vessel i and the current position of encountered vessel j.   . Head-on situation between vessels i and j; the LOS angle can be used to assess the COLREG compliance of a candidate trajectory.
Next, the generated vessel trajectory, apart from being safe and COLREG compliant, should also take into account the maneuvering capabilities of the controlled vessel, i.e., it should be guaranteed that the trajectory is technically possible to be tracked by the vessel. The feasible search domain of the trajectory optimization problem can be constructed by a purely geometric approach in the case of a one-step-ahead calculation, such as in [5], where the design variables are the vessel's next position and course. However, the extension of this geometric approach to multiple-steps-ahead requires the application of nonlinear constraints that would bound every sequential vessel position with its previous one, in order to enforce technical feasibility. For this reason, a model-based approach is preferred. The Nomoto models constitute a class of vessel course models that are tailored for this task, and have been widely adopted, not only for the design of collision avoidance schemes [47], but also for path tracking controllers [48]. The 1st order linear Nomoto model is shown as follows: Here, ω is the angular velocity of the vessel, while a is the control input to the vessel's rudder. The maneuvering capabilities of the vessel are reflected by the T s and K s constants, called time constants and rudder gain constants, respectively, while typical values are in the [0.5, 2] range for both. Solving the differential Equation (16) by assuming constant rudder angle input for a t time interval, the 1st order linear Nomoto model can be discretized as follows [8]: Here, ∆θ is the course change that would occur if a control input of a was applied and held for a time period of t. By setting this time period t as the discretization interval ∆t, a course model can be used to create a discrete vessel position model as follows Here, θ k , x k , y k is the current course, horizontal displacement, and vertical displacement according to a global reference frame, respectively, while V k is the vessel velocity. The discretization interval ∆t can be set according to the simulation resolution required. Equation (18) constitute a discrete position model L i for the i-th vessel, with input vector u i = a V and state vector x i = θ x y . By evaluating the discrete vessel position model L i for {1, 2, . . . , n} consecutive timesteps, where n the total timesteps, a trajectory T i can be created for the i-th vessel, as shown in Equation (13).

Collision Avoidance with Mpc and Obstacle Trajectory Prediction Models
The MPC framework has demonstrated its aptitude in handling the uncertainties and nonlinearities of the collision avoidance problem multiple times in the literature [9,49]; however, no other works have incorporated a nonlinear data-driven obstacle trajectory prediction model in their formulation. In MPC, the optimal moves of the controlled vessels are calculated for multiple steps ahead by solving a constrained optimization problem, with constraints in real time, for each controller sample time t cst . The cost function of the optimization problem is constituted by two horizons, namely the prediction horizon h p and the control horizon h c ; the first accounts for the total discrete timesteps ahead that the model can be evaluated, while the second for the number of timesteps that the control variables can be modified. Given a set of controlled vessels V c = {1, 2, . . . , N c } and a set of non-controlled or obstacle vessels V o = {1, 2, . . . , N o } where N c and N o are the total number of controlled and non-controlled vessels, respectively, the optimization problem's cost function can be formulated as the summation of all the cost functions of the respective controlled vessels for the kth timestep: Here, U(k) is the input matrix and is created by the horizontal concatenation of the input vectors of all controlled vessels V c , up until the control horizon h c : Next X c (k) and X o (k) are the controlled and non-controlled vessel state matrices, respectively, and are created by the horizontal concatenation of the state vectors of all controlled and non-controlled vessels V c and V o , respectively, up to the prediction horizon h p .
For simplicity, because consecutive states x i (k) up to x i k + h p − 1 constitute a single trajectory T i (k), one can write X c (k) and X o (k) as the concatenation of the trajectories of the respective vessel sets V c , V o as per Equation (13): Next, C i (k) is the cost function of the i-th controlled vessel, formulated as follows: Here, X(k) is the vertical concatenation of the two state matrices X c (k), X o (k), containing the trajectories of all vessels The cost function is comprised by two terms F i and G, each concerned with the prediction and control horizon, respectively. The presence of G term, weighted by the a G parameter, encourages the smoothness of the control actions and, consequently, the generated trajectories of the controlled vessels Term F i consolidates the collision avoidance and course keeping objectives, and is specific to the i-th vessel In Equation (26), f r,ij is the collision risk between the i-th and the j-th vessel, as calculated using their respective trajectories X i (k), X j (k) by applying Equation (12), and f d,i is the deviation from the original trajectory T OG, i , as expressed in Equation (14). Both terms are weighted by the a r and a d weighting parameters, respectively. Since we are concerned with the safety of the generated trajectory throughout the whole prediction horizon, the mean collision risk from all vessels in V\i is evaluated, in contrast to other approaches [5], where only the maximum collision risk at time k is minimized. This way, all possible collision risks are addressed and reduced simultaneously, thus avoiding the adverse possibility of evading one collision risk and increasing another. Moreover, the reason that risk avoidance is used as a control objective in Equation (27) and not as a hard optimization constraint is to ensure that the MPC optimization problem of Equation (20) will not fail in the case of the existence of an inescapable collision risk; as shown in Equation (12), risk is a function of distance to CPA, meaning that the controller will continue to attempt to maximize that distance, thus fulfilling the control intention in such an encounter. However, in order to guarantee that collisions will be avoided, one more constraint to the MPC optimization problem is added by setting an emergency distance d e (where d e < d min ); to be more specific, the vector N contains the DCPAs of all controlled vessels V c , which are required to be above the emergency distance.
At this point, it must be noted that since the state matrix X(k) consolidates all vessel trajectories, controlled and non-controlled alike, a degree of cooperation is induced between the respective controlled vessels V c . Lastly, returning to the optimization problem denoted in Equation (20), the U(k) input matrix is bounded by the upper and lower matrices U u , U l , respectively. The vector P contains the COLREG non-compliance penalties for the controlled vessels V c as calculated in Equation (15), which are required to be zero via an equality constraint.
The next item to be addressed regarding the MPC formulation is the used model that maps the input variables U to the state variables of the controlled vessels X c . Here, the 1st order linear Nomoto model is used, as described in Equation (18), with the addition of input noise that accounts for modeling error e and environmental parameters: Here, G is a random variable sampled from a Gaussian distribution with a standard deviation of σ. Finally, the state matrix of the non-controlled vessels X o (k) is assumed to be unknown for the scope of this research, and thus, an estimation is required, based on past positions. For this task, the RBF prediction model presented in Section 2.3 is employed for each non-controlled vessel j, and its trajectory for the k-th timestep is estimated using its past nine positions: Next, in order to alleviate a possible computational burden for the MPC optimization problem, an important assumption should be made. The formulation of the control scheme as-presented would give rise to a high-dimensional search space for the MPC optimization problem, thus greatly hindering its effective solution. It is assumed then that all vessels retain their initial speed, with the only controllable variable being the vessel's rudder angle; this way, the total number of control variables is reduced by half. This approach to the collision avoidance problem has occurred in the literature [5], and is not simplistic for two reasons: first, good seafaring practice dictates that course change maneuvers are preferred over speed ones, not only because they conserve energy, but also because they better emphasize the intentions of the vessel to outside observers, such as other vessels in the vicinity. Second, since large container ships will be examined in the scope this case study, their large longitudinal inertia [48] confirms the assumption that the speed remains almost constant during the timeframe of a typical collision avoidance maneuvering scenario. Therefore, for the scope of this paper, the input matrix U at timestep k is formulated as follows: where a i (k) is the rudder angle of vessel i at timestep k.
Having defined all aspects of the MPC optimization problem, a reiteration of the challenges of the collision avoidance control problem and how they are addressed by the controller is in order: firstly, the goal of the control design is to generate trajectories for the controlled vessels that are risk-free (Equation (12)), smooth (Equation (26)), COLREGcompliant (Equation (15)), and do not deviate from the original course (Equation (14)). Possible collision risks are assessed by utilizing trajectory predictions for non-controlled (obstacle) vessels in the vicinity. The controllable variables are the rudder angles of the vessels (vessel speed is considered constant), while a discrete 1st order Nomoto model (Equation (28)) is used for the modeling of the vessel dynamics, which was also infused with a noise signal for the purpose of accounting for uncertainties and environmental factors. The aforementioned vessel dynamics model has been compared to its higher-order nonlinear counterparts in [50], and it was shown that vessel course inaccuracies occur only for high yaw rates. Given the fact that the proposed collision avoidance method is concerned with large vessels with slow dynamics, the used vessel dynamics model is deemed adequate for the case. In addition, MPC has shown to be robust against model uncertainties or input noise [36]. Finally, the constraints that must be adhered to when searching for the optimal solution (Equation (20)) are the technical bounds on the controlled variables (i.e., maximum and minimum rudder angles) and the COLREG compliance of the result trajectory.

Control Framework
Having presented the proposed MPC controller, this section describes its integration within a general control framework. As shown in Figure 5, the framework is comprised by an offline and an online process. The offline process corresponds to the RBF trajectory prediction model training, using data from a specific area of interest (for example, a port)naturally then, it could be undertaken by the port authority. The online process corresponds to the real-time control of autonomous vessels in the presence of obstacle vessels in the area of interest. The MPC collision avoidance controller, as described in Section 3.2, is integrated here and is supplied with real time trajectory predictions of all obstacle vessels in order to calculate the optimal control actions for the controlled vessels. Since the RBF trajectory prediction model has been trained offline in the port authority premises, it is sensible to place the MPC controller there too, and to communicate the computed control actions per control timestep via a communications link with the controlled vessels. Figure 6 demonstrates this concept.
The MPC optimization problem described in Equation (20) is solved using the sequential quadratic programming (SQP) algorithm, which involves iterative calls to the objective function [35]. As shown in Figure 5, the integration of the MPC controller in the control framework requires the calculation of the obstacle vessel trajectory predictions for every controller timestep. Therefore, two main sources of computational complexity arise: the first is the evaluation of the RBF trajectory prediction model, which is shown to be in the order of magnitude of milliseconds [31], meaning that multiple obstacle vessels can be accounted for by the control scheme. The second is the solution of the optimization problem (Equation (20)) by the SQP algorithm, which is known to converge quickly and with few objective function calls [51]. It is concluded that a typical controller timestep duration, comprised by the two aforementioned sources, will not exceed the order of magnitude of seconds, which is considered reasonable given the slow dynamics of large vessels.  The MPC optimization problem described in Equation (20) is solved using the sequential quadratic programming (SQP) algorithm, which involves iterative calls to the objective function [35]. As shown in Figure 5, the integration of the MPC controller in the control framework requires the calculation of the obstacle vessel trajectory predictions for every controller timestep. Therefore, two main sources of computational complexity arise: the first is the evaluation of the RBF trajectory prediction model, which is shown to be in the order of magnitude of milliseconds [31], meaning that multiple obstacle vessels can be accounted for by the control scheme. The second is the solution of the optimization problem (Equation (20)) by the SQP algorithm, which is known to converge quickly and with few objective function calls [51]. It is concluded that a typical controller timestep duration, comprised by the two aforementioned sources, will not exceed the order of magnitude of seconds, which is considered reasonable given the slow dynamics of large vessels.

Case Study
In this section, the performance of the proposed multi-ship MPC controller is assessed using real-life obstacle ship trajectories, which were sourced and preprocessed as described in Section 2.4. In order to underline the importance of using sophisticated  The MPC optimization problem described in Equation (20) is solved using the sequential quadratic programming (SQP) algorithm, which involves iterative calls to the objective function [35]. As shown in Figure 5, the integration of the MPC controller in the control framework requires the calculation of the obstacle vessel trajectory predictions for every controller timestep. Therefore, two main sources of computational complexity arise: the first is the evaluation of the RBF trajectory prediction model, which is shown to be in the order of magnitude of milliseconds [31], meaning that multiple obstacle vessels can be accounted for by the control scheme. The second is the solution of the optimization problem (Equation (20)) by the SQP algorithm, which is known to converge quickly and with few objective function calls [51]. It is concluded that a typical controller timestep duration, comprised by the two aforementioned sources, will not exceed the order of magnitude of seconds, which is considered reasonable given the slow dynamics of large vessels.

Case Study
In this section, the performance of the proposed multi-ship MPC controller is assessed using real-life obstacle ship trajectories, which were sourced and preprocessed as described in Section 2.4. In order to underline the importance of using sophisticated  Figure 6. Communications within the proposed control framework.

Case Study
In this section, the performance of the proposed multi-ship MPC controller is assessed using real-life obstacle ship trajectories, which were sourced and preprocessed as described in Section 2.4. In order to underline the importance of using sophisticated trajectory prediction models in the context of collision avoidance controller design, the proposed method is compared to an MPC controller that uses straight-line predictions for the trajectories of obstacle ships based on their current course and speed [9]. To this end, two crossing scenarios are examined, while performance indicators of the generated trajectories are extracted and discussed in detail. The simulations were coded and executed on Matlab 2020b, on a computer with an Intel i7 processor and 16 GB RAM. The simulation sample time is 30". Lastly, the tuning and parameters of the methods are shown in Table 2, while the vessel parameters are shown in Table 3.

Multi-Ship Collision Avoidance Control for the Miami Port
For this case study, two controllable vessels are chosen, moving in parallel to each other and encountering an obstacle vessel moving into the port of Miami. For the performance evaluation of the two controllers, two scenarios are created; the first contains a head-on encounter type, while the second an overtaking maneuver that changes into a crossing encounter as time progresses. In the first scenario, the two controlled vessels are leaving the port of Miami at a course of 110 • , when they encounter a single obstacle on their starboard side, which, in turn, is looking to enter the port. In the second scenario, the two controlled vessels are overtaking an obstacle vessel on her port side when, suddenly, she turns portside in order to enter the port of Miami, crossing into their intended path. The challenge posed by the two scenarios is that the two controllable vessels should maintain a safe distance between each other and the obstacle vessel, while also navigating smoothly and without unnecessary deviation from their original course. It should also be noted that the obstacle vessel is non-controllable and, therefore, follows a predetermined path, without considering other vessels.
The response of the MPC controller utilizing straight-line prediction models (hereby referred to as 'MPC-SLP') for the first scenario is shown in the left column of the subfigures within Figure 7 for the 3-, 9-, 10-, and 16.5-min timesteps. The response of the proposed MPC controller utilizing RBF prediction models (hereby referred to as 'MPC-RBFP') for the same scenario and same time instances are shown in the right column of the subfigures within Figure 8. Next, the responses of MPC-SLP and MPC-RBFP for the second scenario are shown in the left and right subfigure columns of Figure 8, respectively, for the 6-, 12-, 13.5-, and 17-min timesteps. In the aforementioned response figures, the red and blue dotted lines denote the original, undisturbed trajectory for controlled vessels 1 & 2, respectively, while the black dotted line shows the predetermined path that the obstacle ship will follow as the simulation progresses. Next, the red and blue dashed lines denote the trajectory that the controlled vessels intend to follow, as calculated by the current MPC iteration, while the black dashed line shows the current trajectory prediction of the obstacle ship, as utilized by the MPC controller. The grey dashed circles have a radius of d min and denote the safe ship domain for the two controlled vessels; should any vessel enter another's domain at any time, a collision risk arises. Lastly, the red-colored and blue-colored rectangles mark the controlled vessels 1 & 2 positions, respectively, while the grey rectangle marks the obstacle ship's position; it should be noted that the markers are not to-scale with the real dimensions of the vessels, since they have been enlarged for graphical convenience.

Discussion
Firstly, in order to assist the discussion in this subsection, distance plots are generated for the controlled vessels that are in closest proximity with the obstacle ship for each scenario (see Figure 9). In addition, the performance metrics for each controller in each scenario are shown in Table 4.  The left subfigure column refers to the MPC-SLP scheme, while the second to the MPC-RBFP scheme. Note that for scenario 1 (a1, b1) and for scenario 2 (a2, b2), the MPC-SLP violates the lower limit on distance from CPA; therefore, its trajectories are deemed unsafe.  (14). (2) As calculated by Equation (26). (3) As calculated by Equation (12). (4) As calculated by Equation (24).
Next, the performance of the two controllers is assessed in an overtaking/crossing encounter in scenario 2. Here, the effect of the used trajectory prediction models is once again eminent: At timestep 6 (see Figure 8(a1,b1)), MPC-RBFP calculates a sharp control move to port-side for controlled vessel 1 in anticipation of the obstacle ship's crossing towards the port of Miami; in contrast, MPC-SLP applies a lower rate of steering for controlled vessel 1, because the straight-line trajectory prediction places its CPA with the The left subfigure column refers to the MPC-SLP scheme, while the second to the MPC-RBFP scheme. Note that for scenario 1 (a1,b1) and for scenario 2 (a2,b2), the MPC-SLP violates the lower limit on distance from CPA; therefore, its trajectories are deemed unsafe.  (1) As calculated by Equation (14). (2) As calculated by Equation (26). (3) As calculated by Equation (12). (4) As calculated by Equation (24).
For the head-on encounter of scenario 1, the correct trajectory prediction of the obstacle ship proves vital for the success of the proposed scheme. Considering timestep 3 (see Figure 7(a1,b1)), the MPC-RBFP scheme is already applying evasive control actions, since the correct inference of the general direction of the obstacle ship has given rise to a possible collision risk in the near future. In contrast, the MPC-SLP controller does not apply any control actions yet, because, based on the straight-line prediction model that it utilizes, the obstacle vessel will continue north and, thus, remain well clear of the controlled vessels. For the same reason, it takes MPC-SLP another 5 minutes in order to correctly assess the collision risk and apply decisive control actions, but by then it is too late; by timestep 9 (see Figure 7(a3,b3)), controlled vessel 2 reaches its CPA with the obstacle ship, with a DCPA of 680 m for controlled vessel 2, well below the acceptable minimum distance d min , as shown in Figure 9(a1). In contrast, the MPC-RBFP controller generates a smooth, safe, and consistent trajectory, owed to the correct trajectory prediction of the obstacle vessel. Not only does it reach an acceptable DCPA of 751 m for controlled vessel 2, but it also manages to apply consistent control actions and not significantly deviate from the original course, as shown in Table 4.
Next, the performance of the two controllers is assessed in an overtaking/crossing encounter in scenario 2. Here, the effect of the used trajectory prediction models is once again eminent: At timestep 6 (see Figure 8(a1,b1)), MPC-RBFP calculates a sharp control move to port-side for controlled vessel 1 in anticipation of the obstacle ship's crossing towards the port of Miami; in contrast, MPC-SLP applies a lower rate of steering for controlled vessel 1, because the straight-line trajectory prediction places its CPA with the obstacle ship at a later time instance. This failure to correctly place the CPAs has adverse effects on vessel 2 trajectory too, since it is displaced unnecessarily to the left in false anticipation of a collision risk. In addition, the obstacle ship crosses into the domain of controlled vessel 1 (see Figure 8(a2)) once it changes course towards the Miami port at timestep 8 . On the other hand, the MPC-RBFP scheme places controlled vessel 1 in a better position to narrowly evade the breach of its safe domain (see Figure 8(b2)) throughout the simulation. This performance is owed to the trajectory that the RBF model generated for the obstacle vessel, placing its predicted CPA much closer to the real CPA for both controlled vessels. Moreover, it should be noted that for scenario 2, unnecessary deviations from the original course are avoided for controlled vessel 2, as indicated by the total deviation values in Table 4. In general, the proposed method achieves a lower overall cost for the generated trajectories, as shown in Table 4, while obtaining a certain degree of cooperation between the two vessels, where one makes way for the other in anticipation of their upcoming evasive maneuvers. Moreover, the results show that the proposed method exhibits robust characteristics against environmental effects, which are modeled as input noise in the vessel dynamic model for the scope of the simulations, while accounting for COLREGs. Lastly, the average CPU time evaluation of the MPC calculation was recorded as 7s for both scenarios, which is well within the allocated simulation controller timestep t cst of 60 s, proving, in fact, that the proposed method is scalable to a greater number of controlled and obstacle vessels.

Conclusions
In this paper, a multi-ship MPC controller utilizing RBF obstacle ship trajectory prediction models trained on real AIS data is proposed for the collision avoidance task in busy ports or waterways. The proposed method is compared to an MPC controller using straight-line obstacle ship trajectory prediction models for a real simulation case for the port of Miami. The simulations have shown that the incorporation of a trajectory prediction model with a moderate degree of accuracy greatly benefits the performance of a collision avoidance controller; this is due to the fact that a collision risk can be detected earlier and in time, so that it can be accommodated by the slow maneuvering dynamics of larger vessels such as container ships. Moreover, this early detection enables the planning of a more economic trajectory for the controlled vessels and enables their better cooperation. Data Availability Statement: In this work, the publicly available AIS dataset provided by the Marine Cadastre service (www.marinecadastre.gov, accessed on 25 July 2021) has been used for trajectory modeling purposes.