Author Contributions
Conceptualization, S.K., J.v.V. and C.J.V.; methodology, S.K., J.v.V. and C.J.V.; software, S.K.; validation, S.K.; formal analysis, S.K.; investigation, S.K.; resources, S.K., J.v.V. and F.J.V.; data curation, S.K. and J.v.V.; writing—original draft preparation, S.K.; writing—review and editing, C.J.V., J.v.V., J.J.M. and F.J.V.; visualization, S.K.; supervision, C.J.V., J.v.V., F.J.V. and J.J.M.; project administration, F.J.V.; funding acquisition, J.v.V. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Distribution of the variable cloud fraction for the dataset used in this study.
Figure 2.
The NO₂ tropospheric column. Visualized day: 14 June 2019. Studied area: the Mediterranean Sea, bounded by the northern coasts of Libya and Egypt to the south and by the south coast of Crete to the north. Black lines indicate ship tracks based on AIS data. The red area on the right-hand side of the figure corresponds to the outflow from a variety of land-based sources of NO₂. For convenience of visualization, the data were not regridded: the figure shows the TROPOMI pixels at their native size.
Figure 3.
A list of days used for the dataset creation and the number of ships per day studied.
Figure 4.
(a) Ship track, estimated from AIS data records. The track covers the period from 2 h before the satellite overpass until the moment of the overpass. (b) Wind-shifted ship track: the ship track shifted in accordance with the speed and direction of the wind. The wind-shifted ship track indicates the expected position of the ship plume. A black arrow indicates the wind direction. In both images, the pixel size is 4.2 × 5 km.
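The wind shift described in panel (b) can be illustrated with a minimal sketch. It assumes a constant wind over the 2 h window and a flat-Earth conversion from metres to degrees; the function name `wind_shift_track`, the `(lat, lon, age_s)` track layout, and the example coordinates are hypothetical and not taken from the study.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used for a local flat-Earth conversion


def wind_shift_track(track, wind_u, wind_v):
    """Shift each track point by the distance the wind travels since emission.

    track  : list of (lat_deg, lon_deg, age_s), where age_s is the time between
             the ship passing that point and the satellite overpass (assumed layout).
    wind_u : eastward wind component in m/s (assumed constant).
    wind_v : northward wind component in m/s (assumed constant).
    """
    shifted = []
    for lat, lon, age_s in track:
        dx = wind_u * age_s  # eastward displacement in metres
        dy = wind_v * age_s  # northward displacement in metres
        dlat = math.degrees(dy / EARTH_RADIUS_M)
        dlon = math.degrees(dx / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
        shifted.append((lat + dlat, lon + dlon))
    return shifted


# Hypothetical track: positions 2 h, 1 h and 0 h before the overpass.
track = [(34.0, 25.0, 7200.0), (34.1, 25.2, 3600.0), (34.2, 25.4, 0.0)]
shifted = wind_shift_track(track, wind_u=5.0, wind_v=0.0)
```

The oldest track point moves furthest, while the ship position at the overpass time stays in place, which is why the shifted track fans away from the ship.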
Figure 5.
Ship sector definition pipeline. (a) Ship plume image: the TROPOMI signal for the area around the analyzed ship. Two ship plumes can be distinguished, but only one is of interest. (b) The signal enhanced by Moran’s I spatial auto-correlation statistic. (c) Ship track, estimated from AIS data records. The track covers the period from 2 h before the satellite overpass until the moment of the overpass. (d) Wind-shifted ship track: the ship track shifted in accordance with the speed and direction of the wind. The wind-shifted ship track indicates the expected position of the ship plume. A black arrow indicates the wind direction. (e) Extreme wind-shifted ship tracks, calculated from the wind information with assumed uncertainties. They define the borders of the ship sector. (f) The resulting ship sector: the ROI of the analyzed ship. In all images, the pixel size is 4.2 × 5 km.
Figure 6.
Sector normalization. We rotate the ship sectors so that all resulting sectors have the same orientation of 320°, independently of the ship’s original heading. We then rescale the image so that both coordinates range between 0 and 1. The gray area in each figure indicates a ship sector. The ship sector origin indicator shows the position of the ship at the moment of the satellite overpass. Two examples of original and rotated sectors are shown: one in the top row and one in the bottom row.
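The two normalization steps, rotation to a common orientation and rescaling to the unit square, can be sketched as follows. This is an illustration only: the angle convention (counter-clockwise from the +x axis), the use of per-axis min-max scaling, and the name `normalize_sector` are assumptions, not details confirmed by the caption.

```python
import math

TARGET_ORIENTATION_DEG = 320.0  # common orientation stated in the caption


def normalize_sector(points, origin, heading_deg):
    """Rotate sector points about the ship position, then rescale to [0, 1].

    points      : list of (x, y) in a local planar frame (assumed input).
    origin      : (x, y) of the ship at the satellite overpass.
    heading_deg : current sector orientation, measured counter-clockwise
                  from the +x axis (assumed convention).
    """
    angle = math.radians(TARGET_ORIENTATION_DEG - heading_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    ox, oy = origin
    rotated = [
        ((x - ox) * cos_a - (y - oy) * sin_a,
         (x - ox) * sin_a + (y - oy) * cos_a)
        for x, y in points
    ]

    def rescale(values):
        # Per-axis min-max scaling; a degenerate axis maps to 0.
        lo, hi = min(values), max(values)
        span = hi - lo
        return [(v - lo) / span if span else 0.0 for v in values]

    xs = rescale([p[0] for p in rotated])
    ys = rescale([p[1] for p in rotated])
    return list(zip(xs, ys))


# Hypothetical sector outline, oriented along the +x axis.
sector = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5), (2.0, -0.5)]
normalized = normalize_sector(sector, origin=(0.0, 0.0), heading_deg=0.0)
```

After this step, two sectors with the same shape but different headings map to the same coordinates, so features built on the normalized sector are comparable across ships.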
Figure 7.
Levels and sub-sectors. We construct features by dividing the normalized sector into sub-regions: levels and sub-sectors. For convenience of visualization, the figure was prepared using data points from one day of analysis.
Figure 8.
Classwise distribution of the two main features of the dataset: NO₂ and Moran’s I.
Figure 9.
Input data example for univariate threshold-based benchmarks. (a) Input data for the benchmark method NO₂ threshold. (b) Input data for the benchmark method Moran’s I threshold. At the top of the ship sector is an example of a cluster of low NO₂ values that was mistakenly enhanced by Moran’s I. (c) Input data for the benchmark method Moran’s I on high NO₂. In all images, the pixel size is 4.2 × 5 km.
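The mistaken enhancement of a low-value cluster noted for panel (b) follows from how a local Moran's I is built: it multiplies a pixel's deviation from the mean by the mean deviation of its neighbours, so a cluster of low values scores as high as a cluster of high values. The sketch below uses one common form of the local statistic with the 8 adjacent pixels as row-standardised neighbours; the exact neighbourhood and weights used in the study are not given here, so these are assumptions.

```python
def local_morans_i(grid):
    """Local Moran's I per pixel of a 2D grid (list of equal-length rows).

    Neighbours are the up-to-8 adjacent pixels with equal weights
    (an assumed neighbourhood definition).
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    result = [[0.0] * cols for _ in range(rows)]
    if variance == 0:
        return result  # a constant image has no spatial structure
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                grid[rr][cc] - mean
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            lag = sum(neighbours) / len(neighbours)  # mean neighbour deviation
            result[r][c] = (grid[r][c] - mean) * lag / variance
    return result


# Synthetic image: flat background, one high cluster, one low cluster.
grid = [[5.0] * 6 for _ in range(6)]
for r, c in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    grid[r][c] = 9.0  # high-value cluster (plume-like)
for r, c in [(4, 4), (4, 5), (5, 4), (5, 5)]:
    grid[r][c] = 1.0  # low-value cluster
moran = local_morans_i(grid)
```

In this synthetic image, the corner pixel of the high cluster and the corner pixel of the low cluster receive the same positive score, which is the kind of ambiguity that a method restricted to high values is meant to avoid.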
Figure 10.
Nested cross-validation—illustration scheme.
Figure 11.
Precision–recall curve based on 5-fold cross-validation. Dashed lines indicate the results obtained from univariate threshold-based methods that, in this study, we considered as benchmarks.
Figure 12.
Receiver Operating Characteristic (ROC) curve based on 5-fold cross-validation. Dashed lines indicate the results obtained from univariate threshold-based methods that, in this study, we considered as benchmarks.
Figure 13.
Coefficients of the features in the decision function of the linear models and the impurity-based feature importance values for tree-based models.
Figure 14.
Pearson correlations between the estimated (based on classification results) amounts of NO₂ emitted by each ship on a given day and a theoretical ship emission proxy. Black lines indicate a fitted linear trend. Grey lines show 30% deviations from the fitted linear trend.
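The quantities shown in this figure (a Pearson correlation, a least-squares trend, and a 30% band around it) can be computed as in the sketch below. The data are invented for illustration, and reading the grey lines as 0.7 and 1.3 times the fitted value is an assumption about how the band is drawn.

```python
import math


def pearson_and_fit(x, y):
    """Return (r, slope, intercept) for paired samples x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx  # ordinary least-squares slope of y on x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


def within_band(x, y, slope, intercept, tolerance=0.30):
    """Flag points lying within +/- tolerance of the fitted value (assumed band)."""
    flags = []
    for a, b in zip(x, y):
        fitted = slope * a + intercept
        flags.append(abs(b - fitted) <= tolerance * abs(fitted))
    return flags


# Hypothetical values: emission proxy (x) and estimated amount per ship (y).
proxy = [1.0, 2.0, 3.0, 4.0, 5.0]
estimated = [2.1, 3.9, 6.2, 7.8, 10.1]
r, slope, intercept = pearson_and_fit(proxy, estimated)
inside = within_band(proxy, estimated, slope, intercept)
```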
Figure 15.
The XGBoost classifier allows for the segmentation of plumes that were not recognized by the labeler. (a) TROPOMI NO₂ tropospheric vertical column density. Units: mol/m². The variable was part of the input to the machine learning models. The ship plume is difficult to distinguish by eye. (b) TROPOMI image enhanced by Moran’s I. The variable was part of the input to the machine learning models. After enhancement, the ship plume is easier to recognize. At the top of the ship sector is an example of a cluster of low NO₂ values that was enhanced incorrectly. (c) Segmentation results of the XGBoost model. Black pixels indicate pixels classified by the model as a “plume”. (d) Human labels. The absence of black pixels means that no pixels within the area were labeled as a plume. In all images, the pixel size is 4.2 × 5 km. Measurement date: 24 June 2019. Ship type: tanker. Ship length: 230 m. Average speed within the studied time scope: 14.27 kt.
Figure 16.
NO₂-based thresholding allows for distinguishing plumes accumulated within one pixel of the TROPOMI image. (a) TROPOMI NO₂ tropospheric vertical column density. Units: mol/m². (b) TROPOMI image enhanced by Moran’s I. At the top left of the ship sector is an example of a cluster of low NO₂ values that was enhanced incorrectly. (c) Segmentation results of the NO₂ threshold method. A black pixel is a pixel that was identified by the model as a plume. (d) Human labels. The absence of black pixels means that no pixels within the area were labeled as a plume. In all images, the pixel size is 4.2 × 5 km. Measurement date: 9 June 2019. Ship type: tanker. Ship length: 285 m. Average speed within the studied time scope: 15.4 kt.
Figure 17.
Distribution of the dataset features for images in which no ship plume was visually distinguishable and for images with a visually distinguishable ship plume.
Table 1.
Parameters applied for ship sector definition.
| Parameter | Value |
|---|---|
| Trace track duration | 2 h |
| Wind speed uncertainty | 5 m/s |
| Wind direction uncertainty | 40° |
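One plausible way to turn these uncertainties into extreme wind vectors is to take the corners of the speed and direction uncertainty box, as in the sketch below. How the study actually combines the two uncertainties is not stated in this table, so the corner enumeration, the direction convention (the direction the wind blows towards, counter-clockwise from east), and the name `extreme_winds` are assumptions.

```python
import math

WIND_SPEED_UNCERTAINTY = 5.0       # m/s, from the table above
WIND_DIRECTION_UNCERTAINTY = 40.0  # from the table above, assumed to be degrees


def extreme_winds(speed, direction_deg):
    """Return (u, v) wind vectors at the corners of the uncertainty box.

    speed         : wind speed in m/s.
    direction_deg : direction the wind blows towards, counter-clockwise
                    from east (assumed convention).
    """
    speeds = (max(speed - WIND_SPEED_UNCERTAINTY, 0.0),  # speed cannot be negative
              speed + WIND_SPEED_UNCERTAINTY)
    directions = (direction_deg - WIND_DIRECTION_UNCERTAINTY,
                  direction_deg + WIND_DIRECTION_UNCERTAINTY)
    winds = []
    for s in speeds:
        for d in directions:
            theta = math.radians(d)
            winds.append((s * math.cos(theta), s * math.sin(theta)))
    return winds


# Hypothetical wind of 5 m/s blowing towards the east.
corners = extreme_winds(speed=5.0, direction_deg=0.0)
```

Shifting the ship track with each of these vectors over the 2 h trace duration gives a set of displaced tracks whose outer envelope can serve as the sector border.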
Table 2.
The number of measurement points per class in the dataset.
| | No Plume | Plume |
|---|---|---|
| Number of pixels | 68,646 | 6980 |
| Number of images | 208 | 535 |
Table 3.
Results on the test set with 5-fold cross-validation. Bold font indicates the best result. The last three rows report the univariate threshold-based methods that, in this study, we considered as benchmarks.
| Model | AP | ROC-AUC |
|---|---|---|
| Linear SVM | 0.609 ± 0.063 | 0.935 ± 0.009 |
| Logistic | 0.610 ± 0.064 | 0.936 ± 0.010 |
| RBF SVM | 0.742 ± 0.031 | 0.951 ± 0.008 |
| Random Forest | 0.743 ± 0.030 | 0.952 ± 0.008 |
| XGBoost | **0.745 ± 0.030** | **0.953 ± 0.007** |
| NO₂ threshold | 0.375 ± 0.062 | 0.823 ± 0.017 |
| Moran’s I threshold | 0.493 ± 0.063 | 0.912 ± 0.011 |
| Moran’s I on high NO₂ | 0.607 ± 0.056 | 0.922 ± 0.010 |
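For a univariate threshold-based method, the feature value of each pixel can itself act as the score, and sweeping the threshold traces out the precision-recall and ROC curves summarised by the two metrics in this table. The sketch below computes average precision (as the recall-weighted sum of precisions over thresholds) and ROC-AUC (as the fraction of correctly ordered plume and no-plume pairs). The pixel values and labels are invented, and the study's own evaluation code may differ.

```python
def average_precision(labels, scores):
    """AP = sum over thresholds of (recall increase) * precision."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Pixels with the same score fall on the same side of any threshold.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        recall = tp / n_pos
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical pixel feature values (scores) and plume labels (1 = plume).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
ap = average_precision(labels, scores)
auc = roc_auc(labels, scores)
```

Because plume pixels are a small minority of the dataset, average precision is the more demanding of the two metrics, which is consistent with the larger spread of AP values between methods.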
Table 4.
Comparison between the estimated amount of NO₂ and the theoretically derived ship emission proxy, sorted by the achieved Pearson correlation. Italic font indicates baseline results.
| Segmentation Method | Pearson Correlation | Number of Detected Plumes |
|---|---|---|
| XGBoost | 0.834 | 371 |
| Manual Labeling | 0.781 | 334 |
| Random Forest | 0.775 | 436 |
| NO₂ threshold | 0.774 | 334 |
| Logistic | 0.766 | 452 |
| Linear SVM | 0.765 | 452 |
| RBF SVM | 0.757 | 447 |
| Moran’s I on high NO₂ | 0.733 | 422 |
| Moran’s I | 0.681 | 448 |
Table 5.
Average and standard deviation of the dataset features for images in which no ship plume was visually distinguishable and for images with a visually distinguishable ship plume.
| Variable Name | No Plume Image | Image with a Plume |
|---|---|---|
| Wind speed (m/s) | 5.47 ± 2.31 | 5.27 ± 2.00 |
| Ship speed (kt) | 16.83 ± 2.01 | 17.41 ± 2.04 |
| Ship length (m) | 279.92 ± 86.64 | 303.99 ± 82.79 |