Operational Monitoring of Illegal Fishing in Ghana through Exploitation of Satellite Earth Observation and AIS Data

Abstract: Over the last decade, West African coastal countries, including Ghana, have experienced extensive economic damage due to illegal, unreported and unregulated (IUU) fishing activity, estimated at about USD 100 million in losses each year. Illegal, unreported and unregulated fishing poses an enormous threat to the conservation and management of the dwindling fish stocks, causing multiple adverse consequences for fisheries, coastal and marine ecosystems and for the people who depend on these resources. The Integrated System for Surveillance of Illegal, Unlicensed and Unreported Fishing (INSURE) is an efficient and inexpensive system that has been developed for the monitoring of IUU fishing in Ghanaian waters. It makes use of fast-delivery Earth observation data from the synthetic aperture radar instrument on Sentinel-1 and the Multi Spectral Imager on Sentinel-2, detecting objects that differ markedly from their immediate background using a constant false alarm rate test. Detections are matched to, and verified by, Automatic Identification System (AIS) data, which provide the location and dimensions of ships that are legally operating in the region. Matched and unmatched data are then displayed on a web portal for use by coastal management authorities in Ghana. The system has a detection success rate of 91% for AIS-registered vessels, and a fast throughput, processing and delivering information within 2 h of acquiring the satellite overpass. However, over the 17-month analysis period, 75% of SAR detections have no equivalent in the AIS record, suggesting significant unregulated marine activity, including vessels potentially involved in IUU fishing. The INSURE system demonstrated its efficiency in Ghana's exclusive economic zone and it can be extended to the neighbouring states in the Gulf of Guinea, or other geographical regions that need to improve fisheries surveillance.


Introduction
The rising global demand for fish resources has made West African waters a hot spot for industrial fishing fleets from all over the world [1]. These waters are nursery grounds and migratory paths for commercially high-value fish species, such as skipjack, yellowfin, and bigeye tuna species. However, over the last decades, the Gulf of Guinea (GoG) has seen a significant decline in marine fish capture, putting at risk the sustainable exploitation of natural resources in this region [2]. In part, this decline is due to massive bycatch and discard problems, which have reduced marine resources and caused
The development of this system is the result of a collaboration between Plymouth Marine Laboratory (PML) and the Economic Community of West African States Coastal and Marine Resources Management Centre (ECOWAS Marine Centre), based at the University of Ghana, in Accra.

Methods
The INSURE system is designed to operationally manage and report on a synthetic data stream that comprises data from EO and AIS sources. A schematic diagram of the system is presented in Figure 1. The system benefits from using both active microwave and optical remote sensing data provided by the Sentinel-1 SAR and Sentinel-2 MSI, respectively. By cross-checking the remote sensing observations with information obtained from AIS, non-cooperative vessels are revealed and reported via the web-based Geographic Information System, referred to here as the INSURE web portal.

The system processing chain manages multiple processes in a scheduled, automated fashion, including: database management, checking for new scene availability, downloading and staging of EO scenes, scene calibration, scene masking and mapping, vessel detection, checking for AIS data availability, AIS data ingestion, vessel matching, image generation and statistical reporting, and staging for portal visualisation. The processing chain interfaces with the European Space Agency (ESA) Sentinel catalogue via the Copernicus API hub (https://scihub.copernicus.eu/apihub/), querying the availability of new Sentinel-1 and Sentinel-2 scenes. It also ingests AIS databases, delivered via FTP from the University of Ghana. The processing chain, which is predominantly written in Python, runs in an automated fashion every hour.
The ordering and scheduling of processing chain tasks is managed by a centralised SQLite database, which stores a set of keys against the required stages, as well as metadata on the progress of a given scene through the chain. This metadata also includes status information on the success of each stage, and error information in the case of stage failures.
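The stage-tracking database described above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table layout, the stage names in `STAGES`, and the helper functions are all hypothetical: the source states only that a centralised SQLite database stores keys against the required stages together with status and error metadata.

```python
import sqlite3

# Hypothetical stage keys; the source lists the tasks but not how they are named.
STAGES = ["download", "calibrate", "mask", "detect", "match_ais", "publish"]


def open_db(path=":memory:"):
    """Create (if needed) the table that tracks each scene through the chain."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS scene_stage (
               scene_id TEXT NOT NULL,
               stage    TEXT NOT NULL,
               status   TEXT NOT NULL DEFAULT 'pending',
               error    TEXT,
               PRIMARY KEY (scene_id, stage))"""
    )
    return conn


def register_scene(conn, scene_id):
    """Add one 'pending' row per stage for a newly discovered scene."""
    conn.executemany(
        "INSERT OR IGNORE INTO scene_stage (scene_id, stage) VALUES (?, ?)",
        [(scene_id, stage) for stage in STAGES],
    )
    conn.commit()


def record_result(conn, scene_id, stage, ok, error=None):
    """Store the outcome of a stage, including error text on failure."""
    conn.execute(
        "UPDATE scene_stage SET status = ?, error = ? "
        "WHERE scene_id = ? AND stage = ?",
        ("done" if ok else "failed", error, scene_id, stage),
    )
    conn.commit()


def next_stage(conn, scene_id):
    """Return the first stage that is not yet 'done', or None when finished."""
    status = dict(
        conn.execute(
            "SELECT stage, status FROM scene_stage WHERE scene_id = ?",
            (scene_id,),
        )
    )
    for stage in STAGES:
        if status.get(stage) != "done":
            return stage
    return None
```

An hourly scheduler could call `next_stage` for every registered scene and run only the returned stage, so that a failed stage is retried on the next run without repeating the stages that already succeeded.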

EO Data Sources
The INSURE system takes advantage of EO data from the Sentinel-1 and, preliminarily, Sentinel-2 series of satellites. The EO data are sourced from the ESA Sentinel API hub and downloaded on an automated basis.

Data Coverage
The main properties of the EO data available at the ESA hub are summarised in Table 1. Only ascending passes are available for the Sentinel-1A and Sentinel-1B sensors over Ghana. These passes are separated by a 6-day time interval, and the data can be combined to reduce the overall revisit time interval from 12 days (for each sensor) to 6 days (for two sensors). The frequency of observations can be further improved by combining Sentinel-1 SAR and Sentinel-2 MSI optical data, but dense cloud cover over the region for most of the year prevents a reduction of the revisit time on a regular basis. Hence, the revisit time of the Sentinel-1 and Sentinel-2 sensors does not permit daily observation of non-cooperative vessels in Ghana. Nevertheless, the information delivered by the Sentinel-1 and Sentinel-2 sensors is essential for extending the capabilities of monitoring IUU fishing in Ghana. For the Ghanaian coastal region, Sentinel-1A ascending pass scenes were available with some limitations in coverage (Figure 2a). Recent augmentation of Sentinel-1 coverage with Sentinel-1B scenes (Figure 2b) helped to address this limitation, providing more extensive coverage in the offshore environment around Accra.
The ascending pass for Sentinel-1 traverses the West African coast at around 18:10. This timing is not ideal as the majority of smaller fishing vessels have landed catches by the early afternoon, and therefore we record many of these vessels only when they have returned to port. Larger vessels, with the capability to operate overnight and across multiple days may still be detected at sea, irrespective of the overpass time. Supplementary coverage from Sentinel-1 descending passes (at around 06:00 local time) would improve the system's ability to monitor the activity of the smaller ships, but is currently not available.
The timing of Sentinel-2A and Sentinel-2B MSI sensors is more appropriate for fishing vessel observation in the Ghanaian coastal zone, with both sensors scheduled to acquire data during their descending pass at around 10:20 AM. Regional sensor coverage for Sentinel-2 is illustrated in Figure 2c,d.

Sentinel-1 SAR Data
The INSURE processing chain uses Level 1, high-resolution, ground range detected (GRDH) products, acquired in interferometric wide swath (IW) mode with dual polarisation (VV+VH). The data in these products are mapped to ground range coordinates and multi-look processed for speckle noise reduction by ESA prior to downloading. The IW GRDH products provide spatial resolution up to 20 m in range and azimuth, a 12-day revisit time for the ascending orbits available for Ghana coverage, and all-weather, day-and-night observation capabilities due to the low sensitivity of Sentinel-1 to cloud cover.
Remote Sens. 2019, 11, 293
The block diagram in Figure 3 illustrates the Sentinel-1 data processing chain. The GRDH data are initially pre-processed to remove distortions at image edges and reduce thermal noise [17]. Radiometric calibration was applied to convert the data to absolute backscatter values. The images were then mapped into the geographic coordinate system and cropped to cover only the area of interest. These operations were implemented using the ESA SeNtinel Applications Platform (SNAP) software package and associated Sentinel-1 Toolbox [18]. Next, land pixels in the image were flagged by a land mask (Figure 4). A small buffer zone was defined around the land mask border to reduce land mask errors along the coastline near complex landscape features [7], implemented using a morphological operator [19]. Initially, the "SRTM 3sec" (NASA Shuttle Radar Topography Mission) digital elevation model (DEM) was tested for generation of a land-water mask, but preliminary results demonstrated that the accuracy of these data was not sufficient for masking out land in Sentinel-1 images unless the width of the buffer zone was >500 m. In SUMO ship detection algorithm, Greidanus
Here, a more accurate land-water mask is generated by stacking and merging 62 Sentinel-1 SAR images at VH polarisation, available for the Ghana EEZ region (Figure 2e) over a 6-month time period. The images were preliminarily calibrated and mapped to geographical coordinates using SNAP [18]. A composite product was generated by taking the median of all the overlapping images. A threshold was then applied to the composite Sentinel-1 data to generate a land-water map. The threshold, $\sigma^0 = 0.0433$, was selected using Otsu's method [20] and then fine-tuned by visual analysis of the thresholding results.
The remaining artefacts and errors were corrected by manual editing of the binary image. This approach allows the buffer zone width to be reduced to 20 m and improves mask fidelity in the Ghanaian coastal zone, e.g., in lagoons and river estuaries.
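The land-water mask construction described above (median composite, Otsu threshold, buffer zone) can be sketched in plain Python on small nested lists. This is only an illustration under stated assumptions: the function names, the 64-bin histogram and the square structuring element used for the buffer are hypothetical, and the SNAP pre-processing, the visual fine-tuning of the threshold and the manual editing steps are not reproduced.

```python
from statistics import median


def median_composite(stack):
    """Per-pixel median of co-registered images; None marks missing coverage."""
    rows, cols = len(stack[0]), len(stack[0][0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [img[r][c] for img in stack if img[r][c] is not None]
            if values:
                out[r][c] = median(values)
    return out


def otsu_threshold(values, bins=64):
    """Otsu's method: histogram split that maximises between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0 = sum0 = 0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += i * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        between = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if between > best_var:
            best_var, best_bin = between, i
    return lo + (best_bin + 1) * width


def land_mask(composite, threshold, buffer_px=0):
    """True = land: pixels brighter than the threshold, grown by buffer_px."""
    rows, cols = len(composite), len(composite[0])
    land = [[v is not None and v > threshold for v in row] for row in composite]
    if buffer_px == 0:
        return land
    # Morphological dilation with a square element (an assumed shape).
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = any(
                land[rr][cc]
                for rr in range(max(0, r - buffer_px), min(rows, r + buffer_px + 1))
                for cc in range(max(0, c - buffer_px), min(cols, c + buffer_px + 1))
            )
    return out
```

Taking the median across the stack means that a vessel present in only one acquisition does not leak into the mask, whereas permanently bright pixels (land and fixed structures) survive the compositing.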
Pixels that are classified as open water are then processed to detect vessels. A two-stage algorithm was implemented for automatic processing of Sentinel-1 images. A constant false alarm rate (CFAR) detector [7] was applied at the first stage (pre-screening) to identify objects that differed substantially from their immediate background (Figure 3). A fully-developed multiplicative model of SAR speckle was adopted for the CFAR detection. In this model the intensity of speckle was assumed to be directly proportional to the measured backscatter signal [21]. Following this model, the variance of the speckle noise $\sigma_s^2$ in the SAR image was defined as $\sigma_s^2 = I/L$, where $L$ is the equivalent number of looks and $I$ is the signal intensity [22]. The CFAR detection statistic $S_d$ was defined by Equation (1), in which $I_t$ and $I_b$ are the target and background intensities and $T$ is the constant threshold that defines the false alarm rate of the CFAR detector. We applied a ring-shaped sliding window to estimate the background intensity in the neighbourhood of each processed pixel. The size of the sliding window was set to 600 m. The target intensity parameter was estimated using a much smaller window (30 m), centred at the processed pixel.
A safety gap of 150 m between the background and target windows was provided to ensure that the target pixels did not affect the estimates of background intensity. By applying target and background windows of rounded shape, we achieved an equal contribution of image pixels from all directions. The target and background intensities were estimated using a sliding window sample median. By replacing the average estimate in the CFAR detector [7] with this local median estimate, we improved its robustness for complex backgrounds and for areas with high vessel density, e.g., at the entrances to harbours and sea ports. To merge vertical transmit and vertical receive polarisation data (VV) with vertical transmit and horizontal receive polarisation data (VH), we applied a multi-channel detection technique [7]. The statistic $S_d$ in Equation (1) was calculated separately for the VV and VH polarisations, and two independent detection signatures $S_d^{VV}$ and $S_d^{VH}$ were produced. These signatures were then merged by applying the geometric mean,

$$S_d = \sqrt{S_d^{VV}\, S_d^{VH}}. \qquad (2)$$

The CFAR threshold was then applied to the geometric mean given by Equation (2), rather than to the individual bands.
There is no established view in the literature on the best method for combining co- and cross-polarisation signals for the vessel detection task [7]. The experimental results of vessel detection in Sentinel-1 images collected over the Mediterranean Sea by Santamaria et al. [11] suggested that the contribution of co-polarised (HH and VV) data is low compared to the cross-polarised components (HV and VH). However, the results of other studies reported by Crisp [7] show that HH polarisation is the best for vessel detection when the incidence angle is larger than 45 degrees, and VH or HV polarisation is better for incidence angles below 45 degrees. Nevertheless, the best choice of polarisation depends on many factors, such as wind speed and direction, incidence angle, vessel heading, dimensions, and construction material [7], and some of these parameters are not known a priori. Given that co- and cross-polarisation data are statistically independent, some improvement in detector performance can be achieved by merging these data. For example, the SUMO detector [9] applies a "logical or" operator to combine vessel detection results obtained for the co- and cross-polarisation components separately. This technique shows better performance when the cross-polarisation signal becomes weak, but may produce false alarms due to ship wakes. The approach based on the geometric mean, Equation (2), is less affected by ship wakes and other signals that are not observed in both the co- and cross-polarisation channels [23,24].
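The pre-screening stage can be sketched as follows. The form of the detection statistic is an assumption: Equation (1) is not reproduced in this text, so the sketch uses $(I_t - I_b)/\sqrt{I_b/L}$, which follows from the stated speckle variance $\sigma_s^2 = I/L$ but may differ from the published definition. The window radii are given in pixels and scaled down for a toy image, negative statistics are clipped to zero before the geometric mean is taken, and all function names are hypothetical.

```python
import math
from statistics import median


def ring_offsets(r_inner, r_outer):
    """Pixel offsets whose distance from the centre lies in [r_inner, r_outer]."""
    reach = int(math.ceil(r_outer))
    return [
        (dy, dx)
        for dy in range(-reach, reach + 1)
        for dx in range(-reach, reach + 1)
        if r_inner <= math.hypot(dy, dx) <= r_outer
    ]


def window_median(img, r, c, offsets):
    """Median of the in-bounds pixels selected by offsets (None if there are none)."""
    rows, cols = len(img), len(img[0])
    values = [
        img[r + dy][c + dx]
        for dy, dx in offsets
        if 0 <= r + dy < rows and 0 <= c + dx < cols
    ]
    return median(values) if values else None


def cfar_statistic(img, looks, target_r, gap_r, background_r):
    """Assumed statistic: (I_t - I_b) / sqrt(I_b / L), clipped at zero."""
    target = ring_offsets(0.0, target_r)       # small round target window
    ring = ring_offsets(gap_r, background_r)   # background ring beyond the gap
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            i_t = window_median(img, r, c, target)
            i_b = window_median(img, r, c, ring)
            if i_b is None or i_b <= 0:
                continue
            out[r][c] = max(0.0, (i_t - i_b) / math.sqrt(i_b / looks))
    return out


def detect_vessels(vv, vh, looks, threshold,
                   target_r=1.5, gap_r=4.0, background_r=8.0):
    """Threshold the geometric mean of the VV and VH statistics (Equation (2))."""
    s_vv = cfar_statistic(vv, looks, target_r, gap_r, background_r)
    s_vh = cfar_statistic(vh, looks, target_r, gap_r, background_r)
    return [
        [math.sqrt(a * b) > threshold for a, b in zip(row_vv, row_vh)]
        for row_vv, row_vh in zip(s_vv, s_vh)
    ]
```

Because the merged statistic is a product, a bright feature present in only one polarisation yields a merged value of zero and is rejected, which is the behaviour the text attributes to the geometric mean.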
Next, the contours of detected objects are retrieved from the binary data and connected components are labelled. These components represent groups of vessel pixels, and each of them is processed separately to estimate vessel coordinates, heading, length, and width parameters (Figure 3). Preliminary estimates of these parameters are obtained by applying the method of image moments to a small region around the cluster of vessel pixels detected by the CFAR algorithm [25,26].
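A minimal sketch of the image-moments estimate is given below. It uses the standard second-order central moments; converting the principal variances to length and width with the factor 12 assumes a uniformly filled rectangular target, which is an illustrative choice rather than something stated in the source. The function name and the `(x, y, weight)` input format are hypothetical.

```python
import math


def moments_shape(pixels):
    """pixels: iterable of (x, y, weight). Returns centroid, heading (deg), length, width."""
    pixels = list(pixels)
    m00 = sum(w for _, _, w in pixels)
    cx = sum(x * w for x, _, w in pixels) / m00
    cy = sum(y * w for _, y, w in pixels) / m00
    # Second-order central moments, normalised by the total weight.
    mu20 = sum(w * (x - cx) ** 2 for x, _, w in pixels) / m00
    mu02 = sum(w * (y - cy) ** 2 for _, y, w in pixels) / m00
    mu11 = sum(w * (x - cx) * (y - cy) for x, y, w in pixels) / m00
    # Orientation of the principal axis (ambiguous by 180 degrees).
    heading = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    # Eigenvalues of the covariance matrix give the spread along each axis.
    spread = math.hypot((mu20 - mu02) / 2, mu11)
    major = (mu20 + mu02) / 2 + spread
    minor = (mu20 + mu02) / 2 - spread
    # For a uniform rectangle the variance along a side of size s is s^2 / 12.
    length = math.sqrt(12 * major)
    width = math.sqrt(12 * max(minor, 0.0))
    return (cx, cy), math.degrees(heading), length, width
```

The returned dimensions are in pixel units and would need to be multiplied by the pixel spacing to obtain metres.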
The accuracy of vessel heading and dimension estimates is adversely affected by artefacts that appear around bright objects in SAR images. These artefacts are formed by the side-lobe effects of the Sentinel-1 sensor impulse response function (IRF) and can be seen around vessel structure elements, such as the bow, stern, funnel, and superstructure [27]. A morphological image-filtering technique was applied to suppress the IRF artefacts in Sentinel-1 images and improve accuracy. Preliminary estimates of vessel heading and dimensions were used to select the size of a rectangular structuring element. Then, a morphological operator was applied using the structuring element to suppress side-lobe artefacts in the image [28]. Next, the refined estimates of vessel location, dimension, and heading were obtained by applying an offset centre of gravity (OCOG) method. Accordingly, vessel length and width were estimated by Equation (3), where $S_x = \sum_{y=1}^{N_y} S_{x,y}/N_y$, $S_y = \sum_{x=1}^{N_x} S_{x,y}/N_x$, $S_{x,y}$ is the output of the morphological filter at pixel position $(x, y)$, and $N_x$ and $N_y$ are the dimensions of the vessel analysis window. Vessel horizontal and vertical coordinates were estimated by Equation (4). A refined estimate of vessel orientation was obtained in three steps. Firstly, the vessel analysis window of size $N_x \times N_y$ was split equally into left and right halves. Then, the OCOG method, Equation (4), was applied to each half to estimate the centre coordinates $(X_L, Y_L)$ and $(X_R, Y_R)$. Finally, an orientation angle was estimated by Equation (5), which gives an estimate of vessel orientation with an ambiguity of 180 degrees. Figure 5 illustrates the main processing stages, with VV- and VH-polarisation images of a vessel shown in Figure 5a, column 1, and CFAR signatures ($S_d^{VV}$ and $S_d^{VH}$) shown in Figure 5b, column 2. The merged signature $S_d$ and vessel binary mask are shown in Figure 5c, column 3. The mask in the figure does not match the vessel shape, as it is affected by side-lobe distortions.
The vessel signature and its binary mask after morphological processing are presented in Figure 5d.
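The OCOG refinement can be sketched as follows. Equations (3) to (5) are not reproduced in this text, so the sketch falls back on the textbook OCOG definitions (centre of gravity and equivalent width computed from squared profile values); these may differ in detail from the published equations. The image is assumed to be indexed as `img[y][x]`, the analysis window is assumed to be roughly aligned with the vessel, and both halves are assumed to contain signal. All function names are hypothetical.

```python
import math


def ocog(profile):
    """Textbook OCOG on a 1-D profile: (centre of gravity index, equivalent width)."""
    p2 = [v * v for v in profile]
    p4 = [v * v for v in p2]
    s2, s4 = sum(p2), sum(p4)
    if s2 == 0:
        return None, 0.0
    centre = sum(i * v for i, v in enumerate(p2)) / s2
    return centre, s2 * s2 / s4


def profiles(img):
    """Mean profiles S_x (one value per column) and S_y (one value per row)."""
    n_y, n_x = len(img), len(img[0])
    s_x = [sum(img[y][x] for y in range(n_y)) / n_y for x in range(n_x)]
    s_y = [sum(img[y][x] for x in range(n_x)) / n_x for y in range(n_y)]
    return s_x, s_y


def ocog_box(img):
    """Centre (x, y) and extents along x and y of the object in the window."""
    s_x, s_y = profiles(img)
    (x, len_x), (y, len_y) = ocog(s_x), ocog(s_y)
    return x, y, len_x, len_y


def orientation(img):
    """Angle (degrees, image coordinates) between the centres of the two halves."""
    half = len(img[0]) // 2
    left = [row[:half] for row in img]
    right = [row[half:] for row in img]
    x_l, y_l, _, _ = ocog_box(left)
    x_r, y_r, _, _ = ocog_box(right)
    # x_r is measured inside the right half, so shift it back by `half`.
    return math.degrees(math.atan2(y_r - y_l, (x_r + half) - x_l))
```

The angle returned by `orientation` cannot distinguish bow from stern, which matches the 180-degree ambiguity noted in the text.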
To reduce false alarms produced by the CFAR pre-screening algorithm, the estimated vessel dimension parameters are compared with values known a priori from the analysis of AIS data. Any detected objects with dimensions falling outside a predefined range of 20 to 1000 m are discarded. A static object detection algorithm [11] was applied to reduce false alarms associated with "ghost signals": periodic patterns produced in SAR images by bright objects, such as oil platforms at sea, or buildings and other objects on land. Some of these signals may be removed at the land masking stage, since the mask is created by merging multiple SAR images that may contain static objects (see Figure 4). The remaining objects are flagged as static if they are located within 100 m of the nearest cluster. The clusters are identified automatically by analysing the positions of vessels detected over a two-month interval. The list of cluster coordinates is updated iteratively after each new SAR scene is processed. The iteration starts with the earliest image. Objects detected in this image are placed in the same cluster if the distance between them does not exceed 100 m. The cluster centre position is updated by averaging the coordinates of the objects linked to the cluster. Objects detected in a new image are assigned to an existing cluster if their distance to its centre is less than 100 m. Then, the centre positions of all clusters are updated. Objects located further away from the existing clusters form new ones. This procedure is repeated for all objects detected in a two-month interval that includes 9 Sentinel-1 scenes. The time interval is limited so that the clusters can adjust to any changes that take place over longer periods. Each cluster of detected objects is flagged as static if the number of objects in the cluster is >3. The described procedure did not allow for discrimination between SAR ambiguities and fixed structures [26] due to the very small area of overlap between Sentinel-1 scenes with different geometry (see Figure 2).
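The iterative clustering of static objects can be sketched as below. Two simplifications are assumed: detections are given as planar coordinates in metres (a real implementation would first project latitude and longitude), and each cluster centre is updated as a running mean immediately after an assignment rather than once per scene. The function names and the dictionary layout are hypothetical.

```python
import math


def update_clusters(clusters, detections, radius_m=100.0):
    """Assign one scene's detections (x, y in metres) to clusters, in place."""
    for x, y in detections:
        nearest = min(
            clusters,
            key=lambda c: math.hypot(c["x"] - x, c["y"] - y),
            default=None,
        )
        if (nearest is not None
                and math.hypot(nearest["x"] - x, nearest["y"] - y) < radius_m):
            # Running mean keeps the centre equal to the average of its members.
            n = nearest["count"]
            nearest["x"] = (nearest["x"] * n + x) / (n + 1)
            nearest["y"] = (nearest["y"] * n + y) / (n + 1)
            nearest["count"] = n + 1
        else:
            clusters.append({"x": x, "y": y, "count": 1})
    return clusters


def static_clusters(scenes, radius_m=100.0, more_than=3):
    """Process scenes in chronological order; keep clusters with > more_than objects."""
    clusters = []
    for detections in scenes:
        update_clusters(clusters, detections, radius_m)
    return [c for c in clusters if c["count"] > more_than]


def is_static(x, y, static, radius_m=100.0):
    """Flag a detection that lies within radius_m of any static cluster centre."""
    return any(math.hypot(c["x"] - x, c["y"] - y) < radius_m for c in static)
```

Restricting `scenes` to the most recent two months, as the text describes, lets the set of static clusters follow changes such as a platform being installed or removed.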
For the objects that passed the false alarm test, we estimated additional features such as the number of vessel "peaks", defined as areas of local maxima, and their intensities. For fishing vessels, the number of peaks usually varies from two to three, and the peaks are typically aligned along the vessel length dimension. Joint analysis of peak number and location was applied to discriminate vessels moored together and to identify trans-shipment scenarios, in which illegally caught fish are transferred between vessels.
An example of applying the developed methodology for automatic vessel detection in Sentinel-1 SAR data is shown in Figure 6. The image shows vessels of different size concentrated at the entrance to Port Tema, Ghana. It is a challenging scene for the detection algorithm due to the high concentration of vessels in close proximity to each other. Nevertheless, the algorithm performs a successful detection, and accurately estimates orientation and dimension parameters for the majority of vessels visible in Figure 6.

Sentinel-2 MSI data
Optical ship detection is performed using Sentinel-2 MSI Level-1C data, either in its full swath or tiled form. Due to contamination by clouds for most of the year in the GoG region, Sentinel-2 data were given lower priority in the developed system.
A block-diagram of Sentinel-2 MSI data processing chain is shown in Figure 7. The Level-1C MSI data are first atmospherically corrected to generate bottom-of-atmosphere reflectance data. The Sen2Cor algorithm [29] is adopted for this task. An important feature of Sen2Cor is that it classifies pixels in Sentinel-2 scenes into 12 classes (e.g., clouds, cloud shadows, water, cirrus clouds), and provides confidence measures for the cloud class. The suitability of the Sen2Cor algorithm for processing Sentinel-2 MSI scenes over water has been demonstrated by Dörnhöfer [30].

Following atmospheric correction, the images are resampled to a 10-m spatial resolution and cropped to cover the region of interest. Open water areas are identified by applying the land mask generated from Sentinel-1 data ( Figure 4). After masking, the images are divided into tiles to be processed separately, in parallel. The tiles are partly overlapped to eliminate border effects.
A two-stage technique is applied to detect vessels in pre-processed Sentinel-2 MSI data. Firstly, the image is pre-screened to identify anomalies, that is, objects that stand out from the background (mostly open water and clouds) in terms of spectral properties. An approach based on the RX algorithm, originally proposed by Reed and Yu [31], was adopted for anomaly detection in Sentinel-2 data (Figure 7). The RX algorithm uses a multivariate Gaussian model to describe the statistical properties of the image background and a Mahalanobis distance metric to identify objects in the background. A modification of the RX algorithm was proposed by Schaum [32,33], who improved its efficiency by first projecting the multispectral data into principal component (PC) space, retaining the largest PC components that describe target variability, and normalising the transformed data to unit variance.
However, the performance of this modification is affected by clouds. To improve its performance, we screened the input data with a low-probability cloud mask generated by the Sen2Cor algorithm [29]. The complement of this mask was used to identify open water pixels, and the parameters of the PC transformation were estimated from the data masked as open water. Then, the spectral content of the image was transformed into PC space. The minimum number of PC components needed to detect anomalies was established experimentally by carrying out analysis of vessel detection on several test scenes. The largest four PC components were then selected (Figure 7).
Figure 6. Automatic vessel detection: (a) vessels at the entrance to Port Tema, Ghana (example Sentinel-1 pseudo-colour scene); (b) automatic vessel detection; (b), inset: estimated vessel dimensions and orientation.
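The core of RX pre-screening, a Mahalanobis distance from background statistics, can be sketched in plain Python. This is the basic global RX detector only: the PC projection and unit-variance normalisation of the modified algorithm are omitted, and the background samples are assumed to have already been selected as open water. The function names are hypothetical.

```python
def mean_and_cov(samples):
    """Sample mean vector and covariance matrix of background spectra."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    cov = [
        [
            sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [
        list(row) + [float(i == j) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("singular covariance matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def rx_scores(pixels, background):
    """Squared Mahalanobis distance of each pixel spectrum from the background."""
    mean, cov = mean_and_cov(background)
    inv = invert(cov)
    d = len(mean)
    scores = []
    for p in pixels:
        diff = [p[k] - mean[k] for k in range(d)]
        scores.append(
            sum(diff[i] * inv[i][j] * diff[j] for i in range(d) for j in range(d))
        )
    return scores
```

Once the data are projected onto principal components and scaled to unit variance, as in the modified algorithm, the covariance becomes the identity and the score reduces to a squared Euclidean norm, which is one reason the PC-space variant is cheaper to evaluate.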

Sentinel-2 MSI data
Optical ship detection is performed using Sentinel-2 MSI Level-1C data, either in its full swath or tiled form. Due to contamination by clouds for most of the year in the GoG region, Sentinel-2 were given lower priority in the developed system.
A block-diagram of Sentinel-2 MSI data processing chain is shown in Figure 7. The Level-1C MSI data are first atmospherically corrected to generate bottom-of-atmosphere reflectance data. The Sen2Cor algorithm [29] is adopted for this task. An important feature of Sen2Cor is that it classifies pixels in Sentinel-2 scenes into 12 classes (e.g., clouds, cloud shadows, water, cirrus clouds), and provides confidence measures for the cloud class. The suitability of the Sen2Cor algorithm for processing Sentinel-2 MSI scenes over water has been demonstrated by Dörnhöfer [30]. Following atmospheric correction, the images are resampled to a 10-m spatial resolution and cropped to cover the region of interest. Open water areas are identified by applying the land mask generated from Sentinel-1 data (Figure 4). After masking, the images are divided into tiles to be processed separately, in parallel. The tiles are partly overlapped to eliminate border effects.
A two-stage technique is applied to detect vessels in pre-processed Sentinel-2 MSI data. Firstly, the image is pre-screened to identify anomalies-objects that stand out from the background (mostly open water and clouds) in terms of spectral properties. The approach, based on the RX algorithm originally proposed by Reed and Yu [31], was adopted for anomaly detection in Sentinel-2 data At the detection stage, the connected pixels for each anomaly are grouped and processed separately. For each group, a bounding rectangle was calculated, and the coordinates, dimension, and orientation parameters were derived from its corner coordinates. A group of anomaly pixels was flagged as a false alarm if its estimated length or width was smaller or larger than the expected vessel size that was defined in range from 20 to 1000 m. Hence, large clouds incorrectly identified as vessels were removed. These processing stages are illustrated in Figure 8, columns (a) to (c). A Sentinel-2 image of a vessel (Figure 8a) was pre-screened using the modified RX algorithm [33] to obtain a vessel "signature" ( Figure 7). The RX algorithm uses a multivariate Gaussian model to describe statistical properties of image background and a Mahalanobis distance metric to identify objects in the background. A modification of the RX algorithm was proposed by Schaum [32,33], who improved its efficiency by preliminarily projecting multispectral data into principal component (PC) space, retaining the largest PC components that describe target variability and normalising transformed data to unit variances. However, the performance of this modification is affected by clouds. To improve its performance, we screened the input data with a low-probability cloud mask generated by the Sen2Cor algorithm [29]. The complement of this mask, generated by Sen2Cor, was used to identify open water pixels, and the parameters of the PC transformation were estimated from the data masked as open water. 
Then, the spectral content of the image was transformed into PC space. The minimum number of PC components needed to detect anomalies was established experimentally by carrying out analysis of vessel detection on several test scenes. The largest four PC components were then selected (Figure 7). At the detection stage, the connected pixels for each anomaly are grouped and processed separately. For each group, a bounding rectangle was calculated, and the coordinates, dimension, and orientation parameters were derived from its corner coordinates. A group of anomaly pixels was flagged as a false alarm if its estimated length or width was smaller or larger than the expected vessel size that was defined in range from 20 to 1000 metres. Hence, large clouds incorrectly identified as vessels were removed. These processing stages are illustrated in Figure 8, columns (a) to (c). A Sentinel-2 image of a vessel (Figure 8a) was pre-screened using the modified RX algorithm [33] to obtain a vessel "signature" (Figure 8b). The vessel mask (Figure 8c) was calculated by thresholding the vessel signature and grouping connected pixels. At the next processing stage, we aimed to reduce the number of false alarms in the pre-screening algorithm. Two major factors were responsible for producing false alarms: small clouds of low density and the edges of dense clouds. Small clouds can produce false alarms if they are not recognised as such by Sen2Cor, due to their small size and low density. To reduce alarms of this type, we modelled low-density cloud spectrum as a weighted average of the open water and dense cloud At the next processing stage, we aimed to reduce the number of false alarms in the pre-screening algorithm. Two major factors were responsible for producing false alarms: small clouds of low density and the edges of dense clouds. Small clouds can produce false alarms if they are not recognised as such by Sen2Cor, due to their small size and low density. 
To reduce alarms of this type, we modelled the low-density cloud spectrum as a weighted average of the open water and dense cloud spectra [8]. A two-component linear mixture model was applied, with the weights varying between 0 (open water) and 1 (dense clouds). To model the dense cloud spectrum, we used the average of the image pixels classified as high-probability clouds by Sen2Cor. The open water spectrum was estimated using the Sen2Cor low-probability cloud mask. Independent estimates of the water and cloud spectra were produced for each tile of the processed image.
A linear model of the background spectrum was then defined as the line in multispectral space joining the average spectral points of open water and dense cloud. Accordingly, any reflectance points located within a small distance of this line in multispectral space were classified as water or clouds. The Euclidean norm was used to calculate the distance. If the nearest distance to this line was above the threshold, the image pixels were classified as vessels. The threshold was selected by calculating and analysing the spectral distances for cloud and vessel objects in test Sentinel-2 MSI images. The optimal threshold value was estimated to be 3.0, based on visual analysis of the cloud and vessel discrimination results. An example of the spectral distance signature calculated for a vessel is shown in Figure 8d.
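The distance test described above can be illustrated as follows. This is a sketch only: the function name is hypothetical, and clipping the mixture weight to the interval [0, 1] (so that the distance is measured to the water-cloud segment rather than to the infinite line) is an assumption suggested by the stated weight range. The scaling of the spectra under which the threshold of 3.0 applies is not given in the text, so the threshold is left to the caller.

```python
import numpy as np

def distance_to_background_line(spectra, water_mean, cloud_mean):
    """Euclidean distance from each spectrum to the water-cloud mixing line.

    spectra:    (N, B) array of pixel spectra
    water_mean: (B,) average open-water spectrum
    cloud_mean: (B,) average dense-cloud spectrum
    """
    direction = cloud_mean - water_mean
    # Mixture weight of the closest point on the line: 0 = water, 1 = dense cloud
    weight = (spectra - water_mean) @ direction / (direction @ direction)
    weight = np.clip(weight, 0.0, 1.0)   # assumption: stay between the two end members
    nearest = water_mean + weight[:, None] * direction
    return np.linalg.norm(spectra - nearest, axis=1)

# Toy three-band example with made-up end members
water_mean = np.array([0.0, 0.0, 0.0])
cloud_mean = np.array([10.0, 0.0, 0.0])
spectra = np.array([[5.0, 0.0, 0.0],    # on the line: background
                    [5.0, 4.0, 0.0],    # 4 units off the line
                    [12.0, 0.0, 0.0]])  # beyond the cloud end member
distances = distance_to_background_line(spectra, water_mean, cloud_mean)
is_vessel = distances > 3.0
```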
Artefacts at the edges of contrail clouds, shown in Figure 9b, can also produce false alarms, as the spectrum of cloud edges can differ from the background spectra. These false alarms are difficult to model or predict, and cannot be recognised from their size signature, as many of them have dimensions comparable to those of the vessels of interest. However, cloud signatures in MSI images typically have smoother edges than vessels, a property that INSURE exploits for discrimination. The detail and edge sharpness of an object detected in an MSI image were estimated by applying a spatial median filter with a window size of 5 × 5 pixels to smooth out image details, and then subtracting the smoothed image from the original to obtain the detail image (Figure 8e). This operation was applied to each multispectral component, and the strongest detail within a small neighbourhood of the detected object was selected. The detected object was classified as a false alarm if this detail parameter was below the selected threshold. The threshold was set to 0.06 experimentally, by numerical analysis of scenes with vessels and clouds.
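A minimal NumPy sketch of this edge-sharpness test is given below. The helper names, the neighbourhood radius of 2 pixels, and the use of the absolute value of the detail image are assumptions made for illustration; only the 5 × 5 median window and the 0.06 threshold come from the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter(band, size=5):
    """Median filter with a size x size window; edges are padded by replication."""
    pad = size // 2
    padded = np.pad(band, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))

def detail_strength(image, row, col, size=5, radius=2):
    """Strongest detail value near (row, col) over all bands.

    image: (bands, H, W) array; the detail image is original minus smoothed.
    """
    best = 0.0
    for band in image:
        detail = band - median_filter(band, size)
        r0, r1 = max(row - radius, 0), row + radius + 1
        c0, c1 = max(col - radius, 0), col + radius + 1
        best = max(best, float(np.abs(detail[r0:r1, c0:c1]).max()))
    return best

# Sharp two-pixel target on a flat background (made-up values)
vessel_like = np.zeros((2, 20, 20))
vessel_like[:, 10, 10:12] = 0.5
# Smooth brightness ramp standing in for a soft cloud edge
cloud_like = np.tile(np.linspace(0.0, 0.2, 20), (2, 20, 1))

keep_vessel = detail_strength(vessel_like, 10, 10) >= 0.06
keep_cloud = detail_strength(cloud_like, 10, 10) >= 0.06
```

The sharp target survives the test, whereas the smooth ramp is rejected because the median filter leaves a linear gradient unchanged and the detail image is therefore flat.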
Remote Sens. 2018, 10, x FOR PEER REVIEW 14 of 29
An example of automatic vessel detection using this methodology is shown in Figure 10 for a Sentinel-2 image of multiple vessels approaching a local port in the coastal waters near Accra, Ghana. The scene also contains clouds of different sizes and densities, and has a varying reflectance background due to local variation in chlorophyll-a concentration. The detected vessels are shown in green. The proposed technique successfully estimates the position, dimensions, and orientation of most of the vessels observed in this challenging scene.

AIS Dataset
The Automatic Identification System (AIS) is a marine vessel tracking system that relays the position, dimensions, course, speed, rate of rotation, class, operator, and activity of a given vessel, all associated with a Maritime Mobile Service Identity (MMSI) number that is, in theory, unique. In practice, however, this is rarely the case. False registration of MMSI numbers (known as spoofing), mis-reporting of latitude and longitude coordinates, and a whole host of other "tricks" allow many vessels, apparently equipped with AIS systems, whether legally mandated or not, to avoid detection [6]. The AIS data are relayed to marine traffic systems either via satellite or coastal monitoring stations. The rate at which the AIS system reports positions is a function of vessel movement, with a higher reporting frequency at higher speed.
The INSURE system uses data from both coastal and satellite AIS sources, provided by the ECOWAS Coastal and Marine Resources Management Centre, situated at the University of Ghana. Only data for the Ghanaian coastal area were used in the system, pursuant to a data dissemination agreement with the commercial AIS provider. However, because satellite AIS data sources are used, the service can be extended to wider areas. The AIS data were provided for the times of nearly all Sentinel-1 satellite overpasses. Figure 11 gives an indication of the volume and type of AIS tracks under consideration. It shows that, for the times considered in this analysis, AIS signals were concentrated around the ports of Tema and Takoradi (and Lome, in Togo), as well as in key offshore areas associated with the oil and gas industry and shelf-based fishing sites.
No quality control of the AIS data was performed prior to ingestion by the INSURE processor. The AIS points that were present in a scene but did not correspond to a feature in the SAR image were flagged as spurious. However, these occurrences were rare, because the most common way to subvert an AIS location, by switching latitude and longitude coordinates, would move the vessel out of the scene (in this case far offshore in the Gulf of Guinea, directly south of Cote d'Ivoire, and out of the acquisition range of Sentinel-1 and Sentinel-2).

Vessel Tracks Interpolation
While coastal AIS systems are able to report data as frequently as every 30 seconds, satellite AIS data are typically reported every hour. Consequently, in the former case, the time difference between the satellite overpass and the nearest AIS time-stamp, T_diff, is likely to be of the order of seconds. In the latter case, however, it may be as much as 30 minutes. While T_diff remains small, linear interpolation of vessel position in time is unlikely to result in a substantial error, but as T_diff increases, alternative approaches become more appropriate. The INSURE matching routine uses a Centripetal Catmull-Rom Spline (CCRS) function [34], f(X, Y, T), to generate a smoother ship track that is more congruent with the movements an actual vessel may make (e.g., it precludes instantaneous turning). A single parameter, α, determines the allowed curvature of a line segment, and is varied such that the approach also takes into account the heading of the vessel as recorded at each point, producing an interpolated ship track like that shown in Figure 12a.

Setting α appropriately also prevents the interpolated curve from looping back on itself, a behaviour considered unlikely in this case. The CCRS function is built using all AIS points within 2 h either side of the satellite overpass. Where insufficient points exist to build the CCRS function, linear interpolation is used. In cases where no AIS data points exist before or after a pass, linear extrapolation based on vessel speed is used to estimate the vessel position, though with low confidence. Once vessel positions have been interpolated, those that have a legitimate, detectable position between the last recorded AIS position prior to the overpass (AIS_pre) and the first recorded AIS position after it (AIS_post) are denoted as "in T-window" and considered in the matching routine.
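For readers unfamiliar with the technique, a single Catmull-Rom segment can be evaluated with the standard recursive (Barry-Goldman) formulation, as sketched below. This is an illustration under assumptions, not the INSURE routine: the function name is hypothetical, positions are taken to be in a projected metric coordinate system, and neither the heading-dependent variation of α nor the mapping from overpass time to the curve parameter `u` is modelled here.

```python
import numpy as np

def catmull_rom_point(p0, p1, p2, p3, u, alpha=0.5):
    """Point on the Catmull-Rom segment between p1 and p2.

    u in [0, 1] is the fraction of the knot interval travelled from p1 to p2.
    alpha = 0.5 gives the centripetal parameterisation.
    Consecutive control points must be distinct, otherwise a knot interval is zero.
    """
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    # Knot values grow with the chord length raised to the power alpha
    t0 = 0.0
    t1 = t0 + np.linalg.norm(p1 - p0) ** alpha
    t2 = t1 + np.linalg.norm(p2 - p1) ** alpha
    t3 = t2 + np.linalg.norm(p3 - p2) ** alpha
    t = t1 + u * (t2 - t1)
    # Three levels of linear interpolation between neighbouring control points
    a1 = (t1 - t) / (t1 - t0) * p0 + (t - t0) / (t1 - t0) * p1
    a2 = (t2 - t) / (t2 - t1) * p1 + (t - t1) / (t2 - t1) * p2
    a3 = (t3 - t) / (t3 - t2) * p2 + (t - t2) / (t3 - t2) * p3
    b1 = (t2 - t) / (t2 - t0) * a1 + (t - t0) / (t2 - t0) * a2
    b2 = (t3 - t) / (t3 - t1) * a2 + (t - t1) / (t3 - t1) * a3
    return (t2 - t) / (t2 - t1) * b1 + (t - t1) / (t2 - t1) * b2

# Four consecutive AIS fixes (made-up coordinates in metres) describing a turn
track = [(0.0, 0.0), (1000.0, 0.0), (2000.0, 500.0), (2000.0, 1500.0)]
midway = catmull_rom_point(*track, u=0.5)
```

The curve passes exactly through the two inner fixes, so the interpolated track always honours the reported AIS positions; only the path between them is smoothed.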

Matching Routine
The catalogue of interpolated AIS vessel positions is sequentially compared with the catalogue of satellite vessel positions, with each satellite-detected vessel retaining a potential "match-link" to all AIS vessels that fall within a radius of influence. The radius was determined from the maximum potential speed of the vessel and the time interval between the AIS signal and the overpass time. All vessels outside this radius were discarded as mismatches. The vessels inside the radius were considered potential matches, and a score was calculated for each of them to characterise the quality of the match. The score is defined as a weighted average of the distance metrics of the vessel position and length features [13]:
$$S_{ij} = w_D \cdot d(\mathrm{Lat}_{EO}, \mathrm{Lon}_{EO}, \mathrm{Lat}_{AIS}, \mathrm{Lon}_{AIS}) + w_L \cdot d(\mathrm{len}_{EO}, \mathrm{len}_{AIS}), \qquad (6)$$
where $i$ and $j$ are vessel indices in the satellite and AIS catalogues, $d(\mathrm{Lat}_{EO}, \mathrm{Lon}_{EO}, \mathrm{Lat}_{AIS}, \mathrm{Lon}_{AIS})$ is the distance (in metres) between the vessel position in the satellite image $(\mathrm{Lat}_{EO}, \mathrm{Lon}_{EO})$ and the AIS position $(\mathrm{Lat}_{AIS}, \mathrm{Lon}_{AIS})$, interpolated to the image time as described in Section 2.2.2, and $d(\mathrm{len}_{EO}, \mathrm{len}_{AIS})$ is the difference (in metres) between the vessel length estimated from the image and that reported by AIS. $w_D$ and $w_L$ in Equation (6) are the weights of the position and length features, with $w_D + w_L = 1$. These weights were set to $w_D = 0.9$ and $w_L = 0.1$, giving more significance to vessel position errors.
In vessel-dense environments, sequential matching based on n-dimensional proximity does not necessarily yield the optimal solution. To solve this problem, the match assessment is performed repeatedly (by default 10,000 times), with the vessel selection sequence chosen at random. The total n-dimensional score of all matches made in each iteration is retained, and the lowest score (indicating the best matching) is recorded. Leaving potential candidate vessels unmatched biases the system towards a high score, ensuring that vessels are preferentially matched (Figure 12b).
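The randomised sequential matching described above can be sketched with the standard library alone. This is an illustrative reading of the procedure, not the INSURE code: the dictionary layout, the use of planar x/y coordinates in metres instead of latitude and longitude, the size of the penalty for an unmatched detection, and all function names are assumptions.

```python
import math
import random

W_D, W_L = 0.9, 0.1  # position and length weights quoted in the text

def radius_of_influence(max_speed_ms, time_gap_s):
    """Farthest distance a vessel could have travelled since its AIS report."""
    return max_speed_ms * time_gap_s

def score(distance_m, len_eo, len_ais):
    """Weighted match score; lower means a better match."""
    return W_D * distance_m + W_L * abs(len_eo - len_ais)

def match_once(eo, ais, radii, order, penalty):
    """Greedy matching of satellite detections to AIS vessels in a given order."""
    used, pairs, total = set(), {}, 0.0
    for i in order:
        best_j, best_s = None, None
        for j, vessel in enumerate(ais):
            if j in used:
                continue
            d = math.hypot(eo[i]["x"] - vessel["x"], eo[i]["y"] - vessel["y"])
            if d > radii[j]:
                continue  # outside the radius of influence: discarded
            s = score(d, eo[i]["length"], vessel["length"])
            if best_s is None or s < best_s:
                best_j, best_s = j, s
        if best_j is None:
            total += penalty  # assumed penalty for leaving a detection unmatched
        else:
            used.add(best_j)
            pairs[i] = best_j
            total += best_s
    return total, pairs

def best_matching(eo, ais, radii, iterations=10_000, penalty=1e6, seed=0):
    """Repeat the greedy pass with random orderings and keep the lowest total."""
    rng = random.Random(seed)
    order = list(range(len(eo)))
    best = None
    for _ in range(iterations):
        rng.shuffle(order)
        result = match_once(eo, ais, radii, order, penalty)
        if best is None or result[0] < best[0]:
            best = result
    return best

# Two detections and two AIS vessels where the processing order matters
eo = [{"x": 0.0, "y": 0.0, "length": 50.0}, {"x": 100.0, "y": 0.0, "length": 50.0}]
ais = [{"x": 40.0, "y": 0.0, "length": 50.0}, {"x": -200.0, "y": 0.0, "length": 50.0}]
radii = [radius_of_influence(10.0, 50.0)] * 2  # 500 m for both vessels
total, pairs = best_matching(eo, ais, radii, iterations=200)
```

In this toy case, processing detection 0 first gives a total score of 306, whereas processing detection 1 first gives 234, so the random re-ordering recovers the better overall assignment that a single greedy pass could miss.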

Information Delivery and Utilisation
The outputs of the EO and AIS data processing are disseminated through a web-based geographic information system (GIS). By making the data available through a web browser we minimise the amount of custom software that potential users of the system would have to install. The portal has been developed with ease of use as a priority. The layers available are shown with appropriate metadata and keywords allowing quick finding of data.
To visualise the data, we used two interfaces standardised by the Open Geospatial Consortium (OGC): Web Map Service (WMS) and Web Feature Service (WFS). The WMS is used for displaying imagery data. The data are processed into NetCDF files and served through the open-source THREDDS software (Unidata THREDDS). This software allows quick access to geospatial data, with the ability to manipulate the colour scale to highlight individual features. Where datasets have an associated time dimension, the portal displays a time-bar that allows very quick selection of the required date.
To visualise detected ships, we utilised the WFS standard. The data are converted into KML files and then into Shapefiles. These are registered in an open-source GeoServer environment, and exposed via the WFS. The data layers can be displayed on top of the Sentinel WMS imagery, acting as an information layer outlining which ships have been identified. Each detected entity carries metadata such as the estimated ship length, which are displayed when the individual ship is clicked. Overall the GIS portal allows for quick assessment of the data produced by the INSURE processing chain, enabling users to see the information about a potential identified vessel. A snapshot of the INSURE Web Portal is shown in Figure 13.
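As an illustration of how a client might retrieve the detection layer programmatically, the sketch below builds a standard WFS 2.0.0 GetFeature request. The endpoint URL and the layer name are hypothetical placeholders, since the actual INSURE service address and layer names are not given in the text; only the query parameters follow the OGC WFS key-value convention.

```python
from urllib.parse import urlencode

def wfs_getfeature_url(base_url, type_name, max_features=1000):
    """Build a WFS 2.0.0 GetFeature URL that asks for GeoJSON output."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,          # layer to query
        "count": max_features,           # upper limit on returned features
        "outputFormat": "application/json",
    }
    return f"{base_url}?{urlencode(params)}"

# Hypothetical endpoint and layer name, for illustration only
url = wfs_getfeature_url("https://example.org/geoserver/wfs", "insure:sar_detections")
```

The resulting URL can be opened with any HTTP client; each returned feature would be expected to carry attributes such as the estimated ship length mentioned above.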
Figure 13. The web portal interface. Note that the circular "lenses" on the right-hand side are provided to highlight the various decision layers available, and are not a live web-portal feature.

Validation
A prototype demonstration of the developed vessel monitoring system has been run in the Ghanaian EEZ. A proof of concept using Sentinel-2 MSI data for vessel monitoring was successfully evaluated using test scenes and found to be highly effective even for scenes severely affected by clouds. The processing chain for Sentinel-2 data is currently in the pre-operational phase.
Validation of the Sentinel-1 SAR ship detection algorithm was performed by comparison with AIS vessel signals, to ascertain how many "expected" vessel signatures were detected in the SAR imagery. In total, 232 Sentinel-1 satellite images were automatically processed and matched with AIS data for the period from July 2016 to December 2017. The proportion of SAR-detected vessels that match AIS data was calculated for this period and compared with estimates of the proportion of all vessels that use AIS. The results of this analysis show that the detection algorithm had a 91% success rate at detecting AIS vessels in the region (see Table 2). The statistics have not been corrected for the potential prevalence of "false" AIS vessel signatures, which have no SAR signal and would remain unmatched, though, as discussed above, we believe this number to be negligible. Figure 14 shows the distribution of SAR-to-AIS match distances over the 17-month period: 75% of matches are found within 300 m, and only a small proportion lie beyond 1000 m. As no quality control of the AIS data was performed, these larger errors could be caused by AIS "tricking", which allows many vessels, apparently equipped with AIS systems, to avoid detection [6]. In addition, vessels with large distance discrepancies tend to be associated with match-ups that rely on linear extrapolation of vessel tracks and/or satellite AIS data with larger time discrepancies (see Section 2.2.2).
Table 2 shows that a relatively small number of vessels reported by AIS were not matched with SAR data. This could be explained by false vessel location information provided by AIS, or by missed detections of vessels in SAR images, for example, due to their small size. To test this hypothesis, the distribution of the number of vessels missing from SAR images against their length was estimated; it is presented in Figure 15.
The lengths of the missed vessels are distributed almost uniformly between 10 and 350 m, with a small bias towards vessels shorter than 50 m. Hence, missed SAR detections cannot be explained by small vessel size.
Figure 15. The length distribution of vessels reported by AIS but not detected in SAR data.

Long-Term Analysis
In addition to its ability to provide information on a scene-by-scene basis, INSURE vessel detection data can provide an aggregate picture of vessel activity over a longer time period. Figure 16 shows a summary of all vessels detected in Sentinel-1 SAR imagery that are matched to AIS vessel tracking data (black spots) and unmatched vessels (red spots), over the test period from July 2016 to December 2017. More than three quarters of the vessels detected by Sentinel-1 have no AIS association, a pattern that is evident in most snapshot scenes. There may be several reasons for unmatched detections, such as false detections and tampering with the information broadcast by AIS. However, the proportion of errors of this type is relatively small, and the results may indicate that most of the unmatched vessels were either (i) operating without an AIS system, (ii) operating with an AIS system that was turned off, or (iii) operating with an AIS system that was either deliberately or unintentionally misreporting its position.
Figure 16. The number of vessels recorded in SAR matched to AIS versus those that exist in the SAR record but have no AIS counterpart. Over 75% of the SAR-detected vessels are not AIS registered. The size of each point corresponds to its detected size in either the SAR record, or as reported by the AIS transponder at the time of SAR sensor overpass.
Figure 11 shows the classification of all vessel positions extracted from the AIS record. Tankers, when not in transit, are predominantly clustered around the port of Lome in Togo, and in proximity to the offshore oil platforms around 3° W. Service vessels, mostly associated with the oil and gas industry, are similarly concentrated in the latter region, while cargo vessels operate in the proximity of major ports when not in transit. Fishing vessels, by comparison, are either confined to the ports and inshore regions or involved in open-ocean transit.
By comparing Figure 11 with Figure 16, we can see that the pattern of vessel clustering visible in the AIS signals is repeated in the unmatched vessels. Of particular note is the prevalence of small AIS-unmatched vessels around 3° W. This area is recognised as a rich fishing ground and is noted for drift fishing by the semi-industrial fleet. Further offshore, there are sporadic occurrences of larger unmatched vessels (50-100 m in length). We do not have sufficient AIS information to categorise the purpose of these vessels from size alone (see Section 3.3), but their presence in a deep-water environment, operating without AIS in near-dark conditions (since the SAR passes are around 18:00 local time), raises questions as to their purpose.

Discrimination of Fishing Vessels in SAR Images
When only Sentinel-1 data are considered, the temporal separation between consecutive SAR images of the coast off Ghana is typically of the order of 12 days for a single satellite, or 6 days when both Sentinel-1A and Sentinel-1B are used. Consequently, it is not possible to use satellite data to inform behavioural algorithms [35] that determine a given vessel's likely activity, as these typically require high-frequency positional information of the kind delivered by AIS systems. However, the INSURE SAR-based vessel detection approach returns key information about the nature of the detected vessel (e.g., dimensions, backscatter intensity, number of intensity peaks), and these values allow the likely activity of the vessel to be evaluated through statistical inference, given training data. Hence, by comparing the lengths of AIS-SAR matched vessels that are delineated as "fishing" with those recorded as "not-fishing", we can determine how likely an unknown vessel (e.g., one known from its SAR signal only) is to be a fishing boat, based on length alone. This univariate approach can be adapted to incorporate other verifiable metrics, such as vessel width and, potentially in the future, vessel configuration.
Figure 17 compares the probability distributions of the vessel lengths for AIS-SAR matched and unmatched cases, obtained by fitting a skewed Gaussian distribution to the underlying data. The probability distributions of vessel length for matched AIS were derived directly from AIS data, while the distribution for SAR was derived from the vessels detected in those images. Comparing all vessels (Figure 17a) shows a similar median length for the AIS and SAR data. The shapes of the probability envelopes are also similar, though a possible bimodal distribution in both sources suggests a partition between larger (length > 180 m) and smaller vessels (length < 120 m). The same patterns are evident in non-fishing vessels (Figure 17c).
The peak length of 180 m in the AIS histograms for all vessels (Figure 17a) and non-fishing vessels (Figure 17c) is most likely related to common bulk carrier vessels. It is not observed in the SAR histogram, as the measurements are scattered across several histogram bin intervals. This is also illustrated in the scatterplot in Figure 18, which shows the vessel length reported by AIS on the x-axis and the vessel length estimated from SAR on the y-axis.
Fishing vessels in Figure 17b tend to appear longer in the SAR record. The probability distributions from known vessel matches can be extrapolated to model the chance that an unknown vessel is a fishing vessel (Figure 17d). This model suggests that, while we can say with a high degree of certainty that a vessel longer than 200 m is not fishing, in the crucial 50 to 110 m length partition we have, at best, a 50-60% chance of correctly identifying a vessel as a fishing vessel. About 70% of all vessels identified as unknown in SAR (orange line in Figure 17d) are shorter than 110 m, suggesting that a significant proportion may be fishing vessels.
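One way to turn two fitted length distributions into a per-vessel probability is Bayes' rule, sketched below with skew-normal densities. This is an assumed formulation for illustration, since the text does not state how Figure 17d was computed. All distribution parameters and the prior are made-up placeholders, not values fitted to the INSURE data, so the numbers produced by this sketch should not be compared with the percentages quoted above.

```python
import math

def skew_normal_pdf(x, location, scale, shape):
    """Density of a skew-normal distribution (a 'skewed Gaussian')."""
    z = (x - location) / scale
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(shape * z / math.sqrt(2.0)))
    return 2.0 / scale * pdf * cdf

# Placeholder parameters (location, scale, shape); NOT fitted to real data
FISHING = (40.0, 40.0, 3.0)
NOT_FISHING = (90.0, 90.0, 2.0)

def probability_fishing(length_m, prior_fishing=0.5):
    """Posterior probability that a vessel of the given length is fishing."""
    p_f = prior_fishing * skew_normal_pdf(length_m, *FISHING)
    p_n = (1.0 - prior_fishing) * skew_normal_pdf(length_m, *NOT_FISHING)
    return p_f / (p_f + p_n)

p_short = probability_fishing(60.0)
p_long = probability_fishing(250.0)
```

With any reasonable choice of parameters, the posterior falls towards zero for very long vessels, which mirrors the qualitative behaviour described in the text.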
The positions of fishing vessels in the region reported by AIS are presented in Figure 1 in red. The map shows two main destinations for fishing vessels in the region, the ports of Takoradi and Tema, but there are no clusters of fishing activity. Figure 19 also illustrates the locations of non-cooperative vessels with lengths less than 110 m, detected by SAR but not identified by AIS. There are no obvious location clusters in the map showing fishing grounds, but we may conclude that most of the vessels are observed on the continental shelf close to the coast, and other vessels are clustered along the shelf break, which is the preferred location for fishing activity. The vessels matched with AIS data are mostly clustered at port areas and around 3° W (see Section 3.2). There is a relatively small number of fishing vessels detected in both SAR and AIS. Their locations are shown in Figure 19 with red circles.

Data Latency
Latency is an important parameter for the INSURE system and was estimated during the system demonstration period. The estimated time required for satellite data acquisition, processing, and delivery to a user through the web portal is shown in the Gantt chart in Figure 20. As shown in the figure, the total latency differed between the Sentinel-1 and Sentinel-2 sensors (about 10 h for Sentinel-1 SAR and 15 h for Sentinel-2 MSI). Most of this time is required for data acquisition and delivery through the Copernicus Scientific Data Hub: 7 h for Sentinel-1 SAR and 10 h for Sentinel-2 MSI. Querying the Sentinel Data Hub for new satellite data was the second most significant source of latency in the INSURE processing chain (see Figure 20). However, this latency can be reduced from 1 h to 5 min by changing the system query interval. The AIS database processing component, hosted at the University of Ghana, has a latency of approximately 2.30 h from acquisition period for the region of interest. The AIS data provider advertises that its service has the lowest latency, of 5-10 min. However, positional and navigational information for a large number of vessels was reported with a delay of 30 min to about 1 h. This may be the result of few ground stations being available to receive and process the data, or of a low probability of detection by the satellite, especially in areas heavily populated with ships.
For this reason, the latency cannot be reduced without loss of AIS information.
Overall, the analysis of the Gantt chart in Figure 20 shows that the total latency of the INSURE system is too high for near real-time (NRT) vessel surveillance applications. Large delays in the information processing chain increase the vessel search area and the chance of a vessel escaping the coast-guard patrol. By comparison, a delay of 2 h is usually considered acceptable for NRT surveillance applications. The INSURE system latency can be reduced to less than 2.5 h if the delay in accessing satellite data can be reduced to 1 h, e.g., by gathering data from local ground stations.
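The effect of latency on the search area can be illustrated with a simple worked calculation. The sketch below assumes that a vessel may travel in any direction at a constant speed after the satellite overpass, so that the region to be searched is a circle whose radius grows linearly with latency. The 10-knot speed is a hypothetical value chosen for illustration and is not taken from the INSURE data.

```python
import math

KNOT_KMH = 1.852  # one knot expressed in km/h


def search_area_km2(latency_h, speed_knots):
    """Area of the circle a vessel could have reached since the overpass."""
    radius_km = speed_knots * KNOT_KMH * latency_h
    return math.pi * radius_km ** 2


# Hypothetical transit speed; an assumption for illustration only.
ASSUMED_SPEED_KNOTS = 10.0

scenarios = [
    ("Sentinel-1 via data hub", 10.0),
    ("reduced data access delay", 2.5),
    ("NRT target", 2.0),
]
for label, latency_h in scenarios:
    area = search_area_km2(latency_h, ASSUMED_SPEED_KNOTS)
    print(f"{label:<26} {latency_h:>5.1f} h -> {area:,.0f} km^2")
```

Because the area scales with the square of the latency, reducing the delay from 10 h to 2 h shrinks the search area by a factor of 25 under these assumptions, whatever speed is chosen.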

Application for Ghanaian Fisheries
The experimental results presented in Section 3 demonstrated the effectiveness of the INSURE system and the advantages of using satellite Earth observation data, combined with AIS data to monitor vessel activity in the Ghanaian EEZ. The distribution of vessels within the EEZ, shown in Figures 11 and 16, is related to the freight transport and fishing activities occurring within adjacent waters. In addition, a small proportion of vessels operating in the region can also be related to activities in the upstream and downstream sectors of the oil and gas industry.
A report from UNCTAD [36] reviewing maritime transport in Africa suggests that a significant number of the maritime transport and port calls in West Africa occur in the Ivory Coast, Ghana, Togo and Nigeria, and involve bulk carriers, container ships and tankers operating between the continents of North America, Europe and Asia. Clearly, the map of SAR-detected vessels in Figure 16 highlights two of these ports in the western and eastern corridors off the coast of Ghana. These ports are important commercial hubs for cargo transport, serving not only Ghana, but acting as points of transit for goods en route to landlocked West African nations. The ship tracks form part of the major transport routes for different classes of vessels that operate in the Gulf of Guinea.
The vessel observation data, generated by the INSURE service during the demonstration period, indicate that the western continental shelf region has become a busy field for the exploration of hydrocarbons and the production of petroleum and natural gas. The Tano and West Cape Three Points oil blocks are traversed constantly by service vessels that support the operations of oil companies. A significant number of these vessels operate from the Port of Takoradi and travel back and forth up to 3° W, where they engage in oil and gas exploration and production. In AIS and satellite EO data this is evident in the observed cluster of service vessels with connecting trajectories with the Port of Takoradi, and the movement of vessels involved in the distribution and marketing of oil products between the major ports in Ghana and Togo.
Experimental data confirm that coastal waters are popular for both artisanal and industrial fishing fleets, mostly due to the broad coastal shelf especially at the western end of Ghana. This also supports the use of purse seine and drift gill nets deployed from canoes, which we are unable to detect here due to the wooden hull construction of the vessels and their comparatively small size.
Conversely, the industrial fishing sector is dominated by engine-powered vessels with lengths ranging from 20 m to 100 m; these vessels are clearly seen in Sentinel-1 SAR images. They operate beyond the 30-m depth Inshore Exclusive Zone on the continental shelf, as well as in deeper waters. They target fish of high commercial value such as tuna [37] and demersal species including squids [38]. The region is characterised by seasonal upwelling [39] that provides the right conditions for fish to aggregate and feed, making them easy targets for fishermen. The intense fishing activity, coupled with weak enforcement of fishing laws, promotes a widespread incidence of fishing infractions. There is also increasing competition for fish between the artisanal fishermen and the well-resourced industrial fishing fleets, which has major economic ramifications for the coastal dwellers involved in the fisheries sector.
The competition for fish, which puts the local fishermen at a disadvantage, has compelled some operators of these small fishing boats to move into restricted fishing areas. For example, the presence of fishing vessels within the oil fields is clearly shown in AIS data in Figure 11. The image shows footprints of small fishing vessels which were recently fitted with Class-B AIS transponders by the Fisheries Enforcement Unit of the Ministry of Fisheries and Aquaculture Development, Ghana. These small boats have been reported to target fish that are attracted to some of the stable structures such as rigs, buoys and cables used in exploration and production (E&P) operations. The use of high-energy lamps on the rigs also contributes to the aggregation of fish. These structures, which serve as fish aggregating devices (FADs) and are restricted from fishing and navigation by commercial fishing fleets, tend to have high fish abundance. Artisanal fishermen, although they are prohibited from moving close to these E&P platforms, are attracted to these restricted areas by the high abundance of fish.

Unmatched Detections
In the observation data there is a high proportion of SAR detections unmatched with AIS in the ports and at the anchorage. This is attributed to the saturation of the satellite AIS system in locations with high vessel density, where a high volume of AIS messages is broadcast [40]. A significant number of unmatched detections with approximate vessel lengths of 20 m to 100 m was also observed during the experiment, supporting the assertion that the EEZ of Ghana is vulnerable and has a high rate of non-cooperative vessels. These vessels cannot be identified with AIS and there is a high chance of "pirate fishing" in the area. The large number of vessels detected on the continental shelf corroborates observations that suggest the Ghanaian shelf is a major fishing area. Within these fishing areas, artisanal fishermen have constantly complained about the easy access by different unlicensed fishing vessels, which often engage in IUU fishing practices such as the use of dynamite and lights in fishing, pair trawling and sometimes transhipment to reefer vessels.
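The distinction between matched and unmatched detections can be illustrated with a minimal sketch of a distance-based association between SAR detections and AIS positions. It is not the matching procedure implemented in INSURE: it assumes that AIS positions have already been interpolated to the image acquisition time, uses a greedy nearest-neighbour assignment, and applies a hypothetical 500 m matching radius. The function names and coordinates are illustrative only.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def label_detections(detections, ais_positions, max_dist_m=500.0):
    """Label each SAR detection as 'matched' or 'unmatched'.

    Greedy nearest-neighbour association: each AIS position is used at
    most once. max_dist_m is a hypothetical threshold, not an INSURE value.
    """
    unused = set(range(len(ais_positions)))
    labels = []
    for lat, lon in detections:
        best, best_dist = None, max_dist_m
        for i in sorted(unused):
            dist = haversine_m(lat, lon, *ais_positions[i])
            if dist <= best_dist:
                best, best_dist = i, dist
        if best is None:
            labels.append("unmatched")
        else:
            unused.discard(best)
            labels.append("matched")
    return labels


# Illustrative coordinates only (latitude, longitude in degrees).
sar = [(4.9000, -1.7500), (4.5000, -2.5000)]
ais = [(4.9005, -1.7502)]
print(label_detections(sar, ais))
```

In areas of high vessel density, such as ports and anchorages, a greedy assignment of this kind can pair detections with the wrong AIS report, so a globally optimal assignment may be preferable there.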

Technical Challenges and Future Development
For a vessel monitoring system, the most important parameters are the latency of delivering the information to customers and the frequency with which this information is updated. The INSURE monitoring system uses the Copernicus Sentinel Data Hub, where the Sentinel-1 SAR and Sentinel-2 MSI data can be accessed within about 7-10 h after acquisition. This delay is not critical if the service is used for the analysis of historical data and assessment of IUU fishing activity in the region. However, for operational applications the achieved latency is too long, as it increases the uncertainty of vessel location and reduces the chance of vessel identification. The analysis of data processing latency in Section 3.4 shows that the total latency can be reduced to about 2.5 h if the satellite data are delivered by the ESA NRT service or through the nearest ground station.
The reliance on a single SAR sensor (Sentinel-1) also restricts the frequency of information update in the INSURE processing chain to one image every 6 days in low-latitude regions. This can be sufficient for tracking seasonal changes in IUU fishing activity throughout a year. However, more regular observations and control of fishing vessels in the Ghana EEZ are needed for the enforcement of fisheries laws in Ghana. The observation time interval can be reduced by processing Sentinel-2 MSI multispectral data in the INSURE system (see Section 2.1 "EO data sources" for discussion on sensor monitoring time). However, dense cloud cover over the Gulf of Guinea for most of the year restricts the application of satellite optical sensors. To improve the frequency of observations, the system can combine the data from additional SAR sensors, such as COSMO-SkyMed and Radarsat-2.
The INSURE system identifies non-cooperative vessels, but does not provide direct evidence of vessels being involved in IUU fishing. The capabilities of the system can be extended by using additional indications of vessel involvement in IUU fishing or piracy, through analysis of AIS and VMS tracking data and searching for specific patterns of vessel behaviour. Additional benefits will be achieved through integration of the INSURE system with discharge detection and polluter identification systems, to collect any evidence of unidentified fishing vessels discarding fish catch or pumping bilge waters.
Further benefits can be achieved by combining the INSURE data with marine environment information derived from ocean colour, sea surface temperature (SST) and Chl-a products. To evaluate the damage caused by IUU fishing, the locations of non-cooperative vessels can be combined with the maps of potential fishing zones (PFZ), generated by the ECOWAS Coastal and Marine Resource Management Centre (ECMRMC), the University of Ghana [41] (see Figure 21). The PFZ maps are developed from fish catch data derived from the catch log-books of industrial fishing fleets which target tuna, fishing vessel traffic information extracted from AIS data, and ocean surface parameters such as Chl-a, sea surface temperature, salinity, heights and currents. These parameters are input into a generalized additive model (GAM) to derive a forecast of potentially productive fishing areas within the continental shelves and oceanic regions of Western Africa. Figure 21 shows an example PFZ map generated by the ECMRMC.

Conclusions
An end-to-end system has been developed for monitoring non-cooperative vessels in the Ghanaian Exclusive Economic Zone. Non-cooperative vessels cannot be identified with AIS and there is a high chance that such vessels are involved in illegal fishing activity. The procedure uses fast-delivery data from the synthetic aperture radars on Sentinel-1 and the Multi Spectral Imager on Sentinel-2, with automatic download and processing for object detection. The detection algorithms have a high success rate, with 91% of registered vessels being matched to a satellite detection, and with coordinates for half the cases agreeing to within 100 m. The satellite data also yielded estimates for length and width of vessels that matched the distribution found in the area. Objects detected by satellite are then marked up as "static", "registered" or "suspect" and the information is provided to ECOWAS and the Fisheries Commission, Ghana, via a dedicated web portal, typically within 2-3 h from the ingestion time.
The capabilities of the INSURE system were demonstrated through a pilot study in the Ghana EEZ from July 2016 to December 2017. The system successfully identified non-cooperative vessels in the Ghana EEZ that do not use AIS or try to avoid identification by switching off and "tricking" the system. Over 75% of detected vessels were identified as non-cooperative. These preliminary experimental results illustrate the scale of the problem and demonstrate the importance of using Earth observation data in combination with AIS and VMS to monitor IUU fishing activity in Ghana's EEZ. Taking AIS data as an authentic record of ship movements, Boerder et al. [42] recognised the region off West Africa as one of the main hotspots for trans-shipment, incurring large losses to legal fisheries; our analysis, using EO data that cannot be tricked, suggests they may have significantly underestimated the size of the problem. While the INSURE system was developed for Ghana, it is not limited to this region and can be extended to include neighbouring states in the Gulf of Guinea or used for completely different geographical regions that need to improve fisheries surveillance. The system can be easily adapted to local user requirements, such as using AIS data available locally to reduce operational expenses. The presence of a web portal interface simplifies its integration into existing operational coastal monitoring environments.