Machine and Deep Learning Framework for Sargassum Detection and Fractional Cover Estimation Using Multi-Sensor Satellite Imagery
Abstract
1. Introduction
1.1. Remote Sensing for Sargassum Monitoring
1.2. Existing Detection Methodologies and Challenges
1.3. Study Rationale and Objectives
2. Data and Methods
2.1. Study Area
2.2. Satellite Data Acquisition and Specifications
2.3. Data Preprocessing and Spectral Library Generation
- For Landsat-8 Collection 2, the QA_PIXEL band was used [26]. This band provides a bit-packed representation of surface conditions. Pixels were masked and excluded from the analysis if their QA_PIXEL value indicated either Cloud or Cloud Shadow with medium or high confidence. This corresponds to any pixel where the Cloud Confidence bits are 10 (Medium) or 11 (High), or where the Cloud Shadow Confidence bits are 10 (Medium) or 11 (High). For example, pixel values such as 22,280 (High Conf Cloud) and 23,888 (High Conf Cloud Shadow) were removed.
 - For Sentinel-2 Level-2A, the Scene Classification Layer (SCL) was used [27]. This layer provides a per-pixel classification of scene content. Pixels were masked if they belonged to any of the following classes, which represent clouds, cloud shadows, or other unwanted atmospheric features: Cloud Shadows (Class 3), Cloud Medium Probability (Class 8), Cloud High Probability (Class 9), or Thin Cirrus (Class 10).
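
The two masking rules above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the bit positions (8-9 for cloud confidence, 10-11 for cloud shadow confidence) follow the Landsat Collection 2 QA_PIXEL layout, and the inputs are assumed to be integer arrays already read from the QA_PIXEL and SCL rasters.

```python
import numpy as np

# Landsat 8-9 Collection 2 QA_PIXEL: two-bit confidence fields.
CLOUD_CONF_SHIFT = 8     # bits 8-9: cloud confidence
SHADOW_CONF_SHIFT = 10   # bits 10-11: cloud shadow confidence
MEDIUM = 0b10            # 01 = low, 10 = medium, 11 = high

# Sentinel-2 L2A SCL classes masked in the text:
# 3 = cloud shadows, 8 = cloud medium probability,
# 9 = cloud high probability, 10 = thin cirrus.
SCL_MASKED_CLASSES = (3, 8, 9, 10)


def landsat_cloud_mask(qa_pixel: np.ndarray) -> np.ndarray:
    """True where cloud or cloud-shadow confidence is medium or high."""
    qa = qa_pixel.astype(np.uint16)
    cloud_conf = (qa >> CLOUD_CONF_SHIFT) & 0b11
    shadow_conf = (qa >> SHADOW_CONF_SHIFT) & 0b11
    return (cloud_conf >= MEDIUM) | (shadow_conf >= MEDIUM)


def sentinel2_scl_mask(scl: np.ndarray) -> np.ndarray:
    """True where the SCL class is one of the masked classes."""
    return np.isin(scl, SCL_MASKED_CLASSES)


# The two example values from the text, plus 21824, whose confidence
# fields are all low (01) and which has the clear bit set.
qa_values = np.array([22280, 23888, 21824])
print(landsat_cloud_mask(qa_values))   # [ True  True False]

scl_values = np.array([3, 4, 6, 8, 9, 10])
print(sentinel2_scl_mask(scl_values))  # [ True False False  True  True  True]
```

Decoding the confidence bits directly, instead of matching a list of integer values, keeps the rule valid for any combination of the remaining QA flags.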
 
2.4. Model Development
2.5. Spatial Analysis and Inter-Algorithm Agreement
3. Results
3.1. Model Performance
3.2. Qualitative Detection and Agreement Analysis
3.3. Quantitative Agreement Summary
3.3.1. Overall Inter-Ensemble Reliability
3.3.2. Pairwise Agreement and Detection Distribution
4. Discussion
4.1. Effectiveness of Multi-Sensor Integration and Data Harmonization
4.2. Model Performance and the Nature of the Test Set
4.3. Interpreting Inter-Algorithm Agreement and Model Behavior
4.4. Global Scalability and Generalizability
4.5. Limitations and Future Research
- Empirical Validation Workflow: The most critical next step is the rigorous validation of fractional cover estimates. A future project should implement a validation workflow involving: (a) planning UAV or drone survey campaigns over Sargassum aggregations to coincide with Landsat-8/Sentinel-2 overpasses; (b) co-registering the high-resolution UAV imagery with the satellite pixels; (c) calculating ground-truthed fractional cover from the UAV data; and (d) performing a statistical comparison to calibrate and validate the satellite-derived proxy maps.
 - Spatial Context: The next logical step is to integrate spatial context. Moving from a 1D (spectral) to a 2D (spectral-spatial) CNN architecture would allow the model to learn not only the spectral signature of Sargassum but also its characteristic texture and morphology, improving its ability to distinguish genuine Sargassum slicks from confounding features such as cloud edges or ship wakes.
 - Optimizing Spectral Inputs with Sentinel-2: Future work should explore using an expanded set of nine Sentinel-2 spectral bands, from the blue (Band 2) to the SWIR-1 (Band 11) region. This approach offers greater spectral resolution for discriminating Sargassum. Particular focus should be given to the four bands located between the red and NIR regions (i.e., the ‘Red-Edge’ bands 5, 6, 7, and the narrow NIR band 8a).
 - Semi-Supervised Learning: Given the resource-intensive nature of manual labeling, future research could explore semi-supervised learning techniques to alleviate this burden. By pre-training a model on a large corpus of unlabeled satellite imagery to learn the general statistics of ocean scenes, we could then fine-tune it on our smaller, high-quality labeled dataset to achieve greater robustness and generalization.
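
Steps (c) and (d) of the proposed validation workflow could look roughly like the sketch below. It assumes that the UAV imagery has already been classified into a binary Sargassum mask, that it is perfectly co-registered with the satellite grid, and that each satellite pixel covers an integer number of UAV pixels. The function names and the small arrays are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np


def uav_fractional_cover(uav_mask: np.ndarray, factor: int) -> np.ndarray:
    """Average a binary UAV mask over factor x factor blocks.

    Each block corresponds to one satellite pixel, so the block mean is
    the fraction of that pixel covered by Sargassum.
    """
    rows, cols = uav_mask.shape
    if rows % factor or cols % factor:
        raise ValueError("mask dimensions must be multiples of factor")
    blocks = uav_mask.astype(float).reshape(
        rows // factor, factor, cols // factor, factor
    )
    return blocks.mean(axis=(1, 3))


def agreement_stats(reference: np.ndarray, estimate: np.ndarray) -> dict:
    """Bias, RMSE and Pearson correlation of estimate against reference."""
    ref = reference.ravel()
    est = estimate.ravel()
    diff = est - ref
    return {
        "bias": float(diff.mean()),
        "rmse": float(np.sqrt((diff ** 2).mean())),
        "pearson_r": float(np.corrcoef(ref, est)[0, 1]),
    }


# Hypothetical 4 x 4 UAV mask aggregated to a 2 x 2 satellite grid.
uav_mask = np.array([
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])
reference_cover = uav_fractional_cover(uav_mask, factor=2)

# Hypothetical classifier probabilities for the same four pixels.
satellite_proxy = np.array([[0.8, 0.1], [0.0, 0.9]])
print(agreement_stats(reference_cover, satellite_proxy))
```

A real campaign would also need to account for geolocation error, the time offset between the UAV flight and the overpass, and partial pixels at the edge of the survey area.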
 
5. Conclusions
- High Efficacy on Curated Data: All five evaluated classifiers (RF, KNN, XGB, MLP, 1D-CNN) demonstrated excellent performance on the test set, underscoring the strong spectral separability of Sargassum in high-quality satellite imagery.
 - Probabilistic Output as a Viable Proxy: Using classifier probability outputs as a direct proxy for fractional cover is a practical and effective method for transitioning from simple binary detection to quantitative mapping of Sargassum distribution and density.
 - Ensemble Analysis is Crucial for Confidence Assessment: The inter-algorithm agreement analysis, particularly the Fleiss’ and Cohen’s Kappa statistics, proved invaluable. It moved the evaluation beyond simple accuracy metrics and revealed how the system actually behaves: detections tend to be made with either very high (unanimous) or very low (contentious) confidence. This provides a critical framework for assessing the reliability of detections in an operational context.
 - A Foundation for Operational Monitoring: The developed methodology provides a robust and well-vetted basis for operational Sargassum monitoring. The findings emphasize the importance of using a model ensemble and agreement metrics to produce nuanced, reliable data products for mitigating the impacts of Sargassum blooms.
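
For readers unfamiliar with the agreement statistics named above, the following sketch computes Cohen’s Kappa for one pair of classifiers and Fleiss’ Kappa for the full ensemble, using the standard textbook definitions. The function names and the small vote matrix are hypothetical; each row is a pixel, each column is one classifier’s binary label, and the code is not taken from the study.

```python
import numpy as np


def cohen_kappa(a: np.ndarray, b: np.ndarray) -> float:
    """Cohen's Kappa between two label vectors of equal length."""
    a, b = np.asarray(a), np.asarray(b)
    classes = np.union1d(a, b)
    p_observed = np.mean(a == b)
    # Chance agreement from each rater's marginal class frequencies.
    p_expected = sum(np.mean(a == c) * np.mean(b == c) for c in classes)
    # Undefined (division by zero) when both raters use a single class.
    return float((p_observed - p_expected) / (1.0 - p_expected))


def fleiss_kappa(votes: np.ndarray, n_classes: int = 2) -> float:
    """Fleiss' Kappa for an (n_pixels, n_raters) array of class labels."""
    votes = np.asarray(votes)
    n_pixels, n_raters = votes.shape
    # counts[i, c] = number of raters assigning pixel i to class c.
    counts = np.stack(
        [(votes == c).sum(axis=1) for c in range(n_classes)], axis=1
    )
    p_i = ((counts ** 2).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()
    p_j = counts.sum(axis=0) / (n_pixels * n_raters)
    p_e = (p_j ** 2).sum()
    # Undefined (division by zero) when every vote falls in one class.
    return float((p_bar - p_e) / (1.0 - p_e))


# Hypothetical votes: 4 pixels x 5 classifiers, 1 = Sargassum.
votes = np.array([
    [1, 1, 1, 1, 1],   # unanimous detection
    [0, 0, 0, 0, 0],   # unanimous non-detection
    [1, 1, 1, 0, 0],   # contested pixel
    [0, 0, 0, 0, 1],   # single dissenting classifier
])
print("Fleiss:", fleiss_kappa(votes))
print("Cohen (classifiers 0 and 4):", cohen_kappa(votes[:, 0], votes[:, 4]))
```

Fleiss’ Kappa summarizes agreement across all classifiers at once, while Cohen’s Kappa is computed per pair, which is why the two are reported side by side.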
 
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| CNN | Convolutional Neural Network | 
| DL | Deep Learning | 
| ESA | European Space Agency | 
| FAI | Floating Algae Index | 
| GASB | Great Atlantic Sargassum Belt | 
| KNN | K-Nearest Neighbors | 
| LAADS DAAC | Level-1 and Atmosphere Archive and Distribution System Distributed Active Archive Center | 
| L2SP | Level-2 Science Products | 
| ML | Machine Learning | 
| MLP | Multi-Layer Perceptron | 
| MODIS | Moderate Resolution Imaging Spectroradiometer | 
| MSI | Multispectral Instrument | 
| NASA | National Aeronautics and Space Administration | 
| NIR | Near-Infrared | 
| OLI | Operational Land Imager | 
| QGIS | Quantum Geographic Information System | 
| RF | Random Forests | 
| SR | Surface Reflectance | 
| SWIR | Short-Wave Infrared | 
| USGS | U.S. Geological Survey | 
| XGB | Extreme Gradient Boosting | 
Appendix A. Satellite Imagery Log
| Sensor | Acquisition Date | Scene Identifier (Path/Row or Tile ID) | 
|---|---|---|
| Landsat-8 OLI | 23 July 2015 | 016/046 | 
| Landsat-8 OLI | 23 July 2015 | 016/047 | 
| Landsat-8 OLI | 30 July 2015 | 017/046 | 
| Landsat-8 OLI | 30 July 2015 | 017/047 | 
| Landsat-8 OLI | 6 August 2015 | 018/047 | 
| Landsat-8 OLI | 6 August 2015 | 018/048 | 
| Landsat-8 OLI | 31 August 2015 | 017/048 | 
| Landsat-8 OLI | 11 January 2018 | 017/047 | 
| Landsat-8 OLI | 21 February 2018 | 016/047 | 
| Landsat-8 OLI | 21 February 2018 | 016/048 | 
| Landsat-8 OLI | 28 February 2018 | 017/047 | 
| Landsat-8 OLI | 28 February 2018 | 017/048 | 
| Landsat-8 OLI | 22 July 2018 | 017/047 | 
| Landsat-8 OLI | 4 November 2018 | 016/048 | 
| Sentinel-2 MSI | 6 July 2016 | T16QFJ | 
| Sentinel-2 MSI | 1 July 2017 | T16QFJ | 
| Sentinel-2 MSI | 28 July 2017 | T16QGH | 
| Sentinel-2 MSI | 12 August 2017 | T16QGJ | 
| Sentinel-2 MSI | 30 August 2017 | T16QFJ | 
| Sentinel-2 MSI | 21 September 2017 | T16QGJ | 
| Sentinel-2 MSI | 21 June 2018 | T16QFJ | 
| Sentinel-2 MSI | 23 June 2018 | T16QGJ | 
| Sentinel-2 MSI | 11 July 2018 | T16QEH | 
| Sentinel-2 MSI | 27 August 2018 | T16QGH | 
| Sentinel-2 MSI | 8 September 2020 | T16QEH | 
| Sentinel-2 MSI | 25 June 2021 | T16QEH | 
| Sentinel-2 MSI | 20 July 2021 | T16QEH | 
| Sentinel-2 MSI | 14 August 2022 | T16QEH | 
| Sentinel-2 MSI | 23 September 2022 | T16QEH | 
| Sentinel-2 MSI | 20 June 2023 | T16QEH | 
| Sentinel-2 MSI | 10 July 2023 | T16QEH | 
| Sentinel-2 MSI | 19 August 2023 | T16QEH | 
| Sentinel-2 MSI | 23 September 2023 | T16QEH | 
| Sentinel-2 MSI | 18 August 2024 | T16QEH | 
Appendix B. Supplementary Code Access and Details
Details on Hyperparameter Grids
Appendix C. Calculation of Macro-Averaged Performance Metrics
Example Calculation: K-Nearest Neighbors (KNN)
- True Positives (TP): 3331 (“Sargassum” correctly identified)
 - False Negatives (FN): 148 (“Sargassum” missed, predicted as “No Sargassum”)
 - False Positives (FP): 78 (“No Sargassum” misidentified as “Sargassum”)
 - True Negatives (TN): 55,255 (“No Sargassum” correctly identified)
 
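- When “No Sargassum” is treated as the positive class, the counts swap roles: the former TN becomes TP (55,255), the former FN becomes FP (148), and the former FP becomes FN (78).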
 
- Macro Precision =
 - Macro Recall =
 - Macro F1-Score =
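
The macro-averaged metrics can be recomputed from the four KNN counts listed above, as in the sketch below. It assumes the common convention in which each metric is computed per class and then averaged with equal weight, so macro F1 is the mean of the two per-class F1 scores. The helper name is hypothetical. If the reported metrics were computed with a different averaging convention, small differences from the rounded values in the performance table are possible.

```python
def class_metrics(tp: int, fp: int, fn: int):
    """Precision, recall and F1 for one class treated as positive."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# KNN confusion-matrix counts with "Sargassum" as the positive class.
TP, FN, FP, TN = 3331, 148, 78, 55255

# Per-class metrics; for "No Sargassum" the roles of the counts swap.
sarg_p, sarg_r, sarg_f1 = class_metrics(tp=TP, fp=FP, fn=FN)
nosarg_p, nosarg_r, nosarg_f1 = class_metrics(tp=TN, fp=FN, fn=FP)

# Macro average: unweighted mean over the two classes.
macro_precision = (sarg_p + nosarg_p) / 2
macro_recall = (sarg_r + nosarg_r) / 2
macro_f1 = (sarg_f1 + nosarg_f1) / 2
accuracy = (TP + TN) / (TP + TN + FP + FN)

print(f"Macro precision: {macro_precision:.4f}")
print(f"Macro recall:    {macro_recall:.4f}")
print(f"Macro F1-score:  {macro_f1:.4f}")
print(f"Accuracy:        {accuracy:.4f}")
```

Because both classes receive equal weight, the minority “Sargassum” class influences the macro averages far more than it influences overall accuracy.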
 
References
1. Wang, M.; Hu, C.; Barnes, B.; Mitchum, G.; Lapointe, B.; Montoya, J. The Great Atlantic Sargassum Belt. Science 2019, 365, 83–87.
2. Rodríguez-Martínez, R.; Jordán-Dahlgren, E.; Hu, C. Spatio-Temporal Variability of Pelagic Sargassum Landings on the Northern Mexican Caribbean. Remote Sens. Appl. Soc. Environ. 2022, 27, 100767.
3. Ody, A.; Thibaut, T.; Berline, L.; Changeux, T.; André, J.; Chevalier, C.; Blanfuné, A.; Blanchot, J.; Ruitton, S.; Stiger-Pouvreau, V.; et al. From In Situ to Satellite Observations of Pelagic Sargassum Distribution and Aggregation in the Tropical North Atlantic Ocean. PLoS ONE 2019, 14, e0222584.
4. Gower, J.; Young, E.; King, S. Satellite Images Suggest a New Sargassum Source Region in 2011. Remote Sens. Lett. 2013, 4, 764–773.
5. Triñanes, J.; Putman, N.; Goñi, G.; Hu, C.; Wang, M. Monitoring Pelagic Sargassum Inundation Potential for Coastal Communities. J. Oper. Oceanogr. 2021, 16, 48–59.
6. Arellano-Verdejo, J.; Lazcano-Hernandez, H.; Cabanillas-Terán, N. ERISNet: Deep Neural Network for Sargassum Detection Along the Coastline of the Mexican Caribbean. PeerJ 2019, 7, e6842.
7. Laval, M.; Belmouhcine, A.; Courtrai, L.; Descloitres, J.; Salazar-Garibay, A.; Schamberger, L.; Minghelli, A.; Thibaut, T.; Dorville, R.; Mazoyer, C.; et al. Detection of Sargassum from Sentinel Satellite Sensors Using Deep Learning Approach. Remote Sens. 2023, 15, 1104.
8. Marsh, R.; Skliris, N.; Tompkins, E.; Dash, J.; Almela, V.; Tonon, T.; Oxenford, H.; Webber, M. Climate-Sargassum Interactions Across Scales in the Tropical Atlantic. PLoS Clim. 2023, 2, e0000253.
9. Chandler, C.; Ávila Mosqueda, S.; Salas-Acosta, E.; Magaña-Gallegos, E.; Mancera, E.; Reali, M.; Barreda-Bautista, B.; Boyd, D.; Metcalfe, S.; Sjogersten, S.; et al. Spectral Characteristics of Beached Sargassum in Response to Drying and Decay over Time. Remote Sens. 2023, 15, 4336.
10. Hernández, W.; Morell, J.; Armstrong, R. Using High-Resolution Satellite Imagery to Assess the Impact of Sargassum Inundation on Coastal Areas. Remote Sens. Lett. 2021, 13, 24–34.
11. Lazcano-Hernandez, H.; Arellano-Verdejo, J.; Rodríguez-Martínez, R. Algorithms Applied for Monitoring Pelagic Sargassum. Front. Mar. Sci. 2023, 10, 1216426.
12. Roger, J.C.; Ray, J.P.; Vermote, E.F. MODIS Surface Reflectance User’s Guide: Collections 6 and 6.1. MODIS Land Surface Reflectance Science Computing Facility, Version 1.7 ed. Principal Investigator: Dr. Eric F. Vermote. 2023. Available online: https://modis-land.gsfc.nasa.gov (accessed on 22 October 2025).
13. Wang, M.; Hu, C. Mapping and Quantifying Sargassum Distribution and Coverage in the Central West Atlantic Using MODIS Observations. Remote Sens. Environ. 2016, 183, 350–367.
14. Sun, D.; Chen, Y.; Wang, S.; Zhang, H.; Qiu, Z.; Mao, Z.; He, Y. Using Landsat 8 OLI Data to Differentiate Sargassum and Ulva prolifera Blooms in the South Yellow Sea. Int. J. Appl. Earth Obs. Geoinf. 2021, 98, 102302.
15. Wang, M.; Hu, C. Satellite Remote Sensing of Pelagic Sargassum Macroalgae: The Power of High Resolution and Deep Learning. Remote Sens. Environ. 2021, 264, 112631.
16. Sun, Y.; Wang, M.; Liu, M.; Li, Z.; Chen, Z.; Huang, B. Continuous Sargassum Monitoring Across the Caribbean Sea and Central Atlantic Using Multi-Sensor Satellite Observations. Remote Sens. Environ. 2024, 309, 114223.
17. Hu, C. A Novel Ocean Color Index to Detect Floating Algae in the Global Oceans. Remote Sens. Environ. 2009, 113, 2118–2129.
18. Hu, C.; Feng, L.; Hardy, R.; Hochberg, E. Spectral and Spatial Requirements of Remote Measurements of Pelagic Sargassum Macroalgae. Remote Sens. Environ. 2015, 167, 229–246.
19. Podlejski, W.; Descloitres, J.; Chevalier, C.; Minghelli, A.; Lett, C.; Berline, L. Filtering Out False Sargassum Detections Using Context Features. Front. Mar. Sci. 2022, 9, 960939.
20. Xiao, Y.; Liu, R.; Kim, K.; Zhang, J.; Cui, T. A Random Forest-Based Algorithm to Distinguish Ulva prolifera and Sargassum from Multispectral Satellite Images. IEEE Trans. Geosci. Remote Sens. 2021, 60, 4201515.
21. Shin, J.; Lee, J.; Jang, L.; Lim, J.; Khim, B.; Jo, Y. Sargassum Detection Using Machine Learning Models: A Case Study with the First 6 Months of GOCI-II Imagery. Remote Sens. 2021, 13, 4844.
22. Hu, C.; Zhang, S.; Barnes, B.; Xie, Y.; Wang, M.; Cannizzaro, J.; English, D. Mapping and Quantifying Pelagic Sargassum in the Atlantic Ocean Using Multi-Band Medium-Resolution Satellite Data and Deep Learning. Remote Sens. Environ. 2023, 289, 113515.
23. Cui, B.; Zhang, H.; Jing, W.; Liu, H.; Cui, J. SRSe-Net: Super-Resolution-Based Semantic Segmentation Network for Green Tide Extraction. Remote Sens. 2022, 14, 710.
24. Claverie, M.; Ju, J.; Masek, J.; Dungan, J.; Vermote, E.; Roger, J.; Skakun, S.; Justice, C. The Harmonized Landsat and Sentinel-2 Surface Reflectance Data Set. Remote Sens. Environ. 2018, 219, 145–161.
25. Roy, D.; Kovalskyy, V.; Zhang, H.; Vermote, E.; Yan, L.; Kumar, S.; Egorov, A. Characterization of Landsat-7 to Landsat-8 Reflective Wavelength and Normalized Difference Vegetation Index Continuity. Remote Sens. Environ. 2016, 185, 57–70.
26. U.S. Geological Survey. Landsat 8-9 Collection 2 Level 2 Science Product Guide; Technical Report; Version 5.0; U.S. Geological Survey: Reston, VA, USA, 2024.
27. Copernicus. Sentinel-2 Products Specification Document; Technical Report S2-PDGS-TAS-DI-PSD; European Space Agency (ESA): Paris, France, 2024.
28. Cohen, J. A Coefficient of Agreement for Nominal Scales. Educ. Psychol. Meas. 1960, 20, 37–46.
29. Fleiss, J. Measuring Nominal Scale Agreement Among Many Raters. Psychol. Bull. 1971, 76, 378–382.
30. Congalton, R.; Green, K. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices, 3rd ed.; CRC Press: Boca Raton, FL, USA, 2019.
31. Landis, J.R.; Koch, G.G. The Measurement of Observer Agreement for Categorical Data. Biometrics 1977, 33, 159–174.
32. Aleissaee, A.; Kumar, A.; Anwer, R.; Khan, S.; Cholakkal, H.; Xia, G.; Khan, F. Transformers in Remote Sensing: A Survey. Remote Sens. 2023, 15, 1860.
33. Xia, J.; Romeiser, R.; Zhang, W.; Özgökmen, T. Use of Vision Transformer to Classify Sea Surface Phenomena in SAR Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 10937–10956.
 
| Feature | Sentinel-2 (MSI) | Landsat-8 (OLI/TIRS) | Aqua (MODIS) |
|---|---|---|---|
| Operator | ESA | NASA/USGS | NASA |
| Nominal Revisit | ∼5 days (S2A, S2B) | 16 days | 1–2 days |
| Swath Width (km) | 290 | 185 | 2330 |

| Channel | Sentinel-2 Band | Sentinel-2 SRes (m) | Sentinel-2 CW (nm) | Landsat-8 Band | Landsat-8 SRes (m) | Landsat-8 CW (nm) | MODIS Band | MODIS SRes (m) | MODIS CW (nm) |
|---|---|---|---|---|---|---|---|---|---|
| Blue | B2 | 10 | ∼490 | B2 | 30 | ∼482 | B3 | 500 | ∼469 |
| Green | B3 | 10 | ∼560 | B3 | 30 | ∼561 | B4 | 500 | ∼555 |
| Red | B4 | 10 | ∼665 | B4 | 30 | ∼655 | B1 | 250 | ∼645 |
| NIR | B8A | 20 | ∼865 | B5 | 30 | ∼865 | B2 | 250 | ∼859 |
| SWIR1 | B11 | 20 | ∼1610 | B6 | 30 | ∼1609 | B6 | 500 | ∼1640 |

| Harmonized Band | Landsat-8 Source Band | Harmonization Equation | 
|---|---|---|
| Blue (B2 equivalent) | Blue (B2) | |
| Green (B3 equivalent) | Green (B3) | |
| Red (B4 equivalent) | Red (B4) | |
| NIR (B8A equivalent) | NIR (B5) | |
| SWIR1 (B11 equivalent) | SWIR1 (B6) | |

| Statistic | Blue (Sarg) | Blue (No Sarg) | Green (Sarg) | Green (No Sarg) | Red (Sarg) | Red (No Sarg) | NIR (Sarg) | NIR (No Sarg) | SWIR1 (Sarg) | SWIR1 (No Sarg) |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean | 0.121 | 0.202 | 0.122 | 0.192 | 0.110 | 0.179 | 0.214 | 0.213 | 0.063 | 0.135 |
| Std Dev | 0.032 | 0.115 | 0.055 | 0.115 | 0.054 | 0.123 | 0.078 | 0.135 | 0.070 | 0.102 |
| Min | 0.060 | 0.025 | 0.013 | −0.019 | −0.011 | −0.037 | −0.002 | −0.040 | −0.050 | −0.058 |
| 25% | 0.089 | 0.135 | 0.065 | 0.129 | 0.053 | 0.117 | 0.157 | 0.129 | −0.015 | 0.076 |
| Median | 0.139 | 0.163 | 0.144 | 0.172 | 0.134 | 0.162 | 0.213 | 0.194 | 0.118 | 0.138 |
| 75% | 0.149 | 0.251 | 0.174 | 0.232 | 0.160 | 0.223 | 0.271 | 0.282 | 0.124 | 0.184 |
| Max | 0.238 | 0.970 | 0.213 | 0.907 | 0.210 | 0.958 | 0.398 | 1.036 | 0.155 | 0.814 |

| Algorithm | Accuracy | Precision (Macro) | Recall (Macro) | F1-Score (Macro) | Train Time (s) | 
|---|---|---|---|---|---|
| Random Forest | 0.9962 | 0.9898 | 0.9760 | 0.9828 | 2093.89 | 
| KNN | 0.9962 | 0.9877 | 0.9781 | 0.9828 | 38.89 | 
| XGBoost | 0.9964 | 0.9899 | 0.9776 | 0.9837 | 225.61 | 
| MLP | 0.9964 | 0.9879 | 0.9798 | 0.9838 | 11,378.22 | 
| 1D-CNN | 0.9948 | 0.9800 | 0.9734 | 0.9767 | 122.05 | 

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Echevarría-Rubio, J.M.; Martínez-Flores, G.; Morales-Pérez, R.A. Machine and Deep Learning Framework for Sargassum Detection and Fractional Cover Estimation Using Multi-Sensor Satellite Imagery. Data 2025, 10, 177. https://doi.org/10.3390/data10110177
        
