An Attention-Based Hybrid CNN–Bidirectional LSTM Model for Classifying Chlorophyll-a Concentration in Coastal Waters
Abstract
1. Introduction
2. Materials and Methods
2.1. GOT001 Marine Telemetry Station System
2.2. Field Measurements of Meteorological, Hydrochemical, and Biological Data
2.3. Feature Selection
2.4. Chl-a Classification for Prediction
2.5. CNN–LSTM
2.6. BiLSTM
2.7. Attention
2.8. Data Processing, Analysis, and Visualization
2.9. Model Performance Evaluation
2.10. Code Availability
3. Results
3.1. Meteorological, Hydrochemical, and Biological Results
3.2. Feature Selection and Chl-a Classification Results
3.3. Model Performance Results
4. Discussion
4.1. Meteorological, Hydrochemical, and Biological Characteristics
4.2. Interpretation of Feature Selection and Chl-a Classification
4.3. Interpretation of Model Performance
4.4. Limitations
4.5. Future Work
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Correction Statement
Appendix A
| Model | Layer Type | Configuration/Parameters |
|---|---|---|
| CNN–LSTM | Input layer | input_shape = (1, 3) (1 time step, 3 features) |
| | Conv1D | filters = 64, kernel_size = 1, activation = 'relu' |
| | MaxPooling1D | pool_size = 1 |
| | LSTM (1) | units = 64, return_sequences = True |
| | LSTM (2) | units = 32 |
| | Dense | units = 32, activation = 'relu' |
| | Dropout | rate = 0.3 |
| | Output dense | units = num_classes (3), activation = 'softmax' |
| | Compilation | optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'] |
| | Training | epochs = tuned, batch_size = 32, validation_split = 0.2 |
| | Evaluation | metrics = classification_report, confusion_matrix, accuracy |
| CNN–BiLSTM | Input layer | input_shape = (24, 3) (24 time steps, 3 features) |
| | Conv1D | filters = 64, kernel_size = 3, activation = 'relu', padding = 'same' |
| | MaxPooling1D | pool_size = 2 |
| | Bidirectional LSTM (1) | units = 64, return_sequences = True |
| | Bidirectional LSTM (2) | units = 32, return_sequences = True |
| | Dense | units = 32, activation = 'relu' |
| | Dropout | rate = 0.3 |
| | Output dense | units = num_classes (3), activation = 'softmax' |
| | Compilation | optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'] |
| | Training | epochs = tuned, batch_size = 32, validation_split = 0.2 |
| | Evaluation | metrics = classification_report, confusion_matrix, accuracy |
| CNN–BiLSTM–Attention | Input layer | input_shape = (24, 3) (24 time steps, 3 features) |
| | Conv1D | filters = 64, kernel_size = 3, activation = 'relu', padding = 'same' |
| | MaxPooling1D | pool_size = 2 |
| | Bidirectional LSTM (1) | units = 64, return_sequences = True |
| | Bidirectional LSTM (2) | units = 32, return_sequences = True |
| | Attention | self-attention mechanism: Attention()([x, x]) |
| | GlobalAveragePooling1D | aggregates sequence output |
| | Dense | units = 32, activation = 'relu' |
| | Dropout | rate = 0.3 |
| | Output dense | units = num_classes (3), activation = 'softmax' |
| | Compilation | optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'] |
| | Training | epochs = tuned, batch_size = 32, validation_split = 0.2 |
| | Evaluation | metrics = classification_report, confusion_matrix, accuracy |
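
The attention and pooling rows of the CNN–BiLSTM–Attention configuration can be illustrated with a small NumPy sketch. It is not the authors' code: it assumes that `Attention()([x, x])` uses unscaled dot-product scoring with the same sequence as query, key, and value (the default behavior of the Keras `Attention` layer), and it works on a single unbatched sequence. The function names and the random input are hypothetical.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_pool(x):
    """Dot-product self-attention followed by global average pooling.

    x has shape (time_steps, features), e.g. a BiLSTM output sequence.
    Returns (pooled, weights): pooled has shape (features,), and weights
    has shape (time_steps, time_steps) with every row summing to 1.
    """
    scores = x @ x.T                     # similarity of each step to every step
    weights = softmax(scores, axis=-1)   # one attention distribution per query step
    context = weights @ x                # attention-weighted mix of all steps
    pooled = context.mean(axis=0)        # average over the time axis
    return pooled, weights


# Illustrative shapes only: 24 input steps halved by pool_size = 2 give 12 steps,
# and a bidirectional LSTM with 32 units per direction gives 64 features.
rng = np.random.default_rng(0)
x = rng.normal(size=(12, 64))
pooled, weights = self_attention_pool(x)
print(pooled.shape, weights.shape)
```

The pooled vector is what the subsequent Dense layer would receive, so the pooling step is what reduces the per-time-step output to one prediction per 24-step window.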

| Abbreviation | Full Name (Unit) | Instrument/Model |
|---|---|---|
| Meteorological parameters | | |
| AIRTEMP | Air temperature (°C) | MaxiMet GMX501 |
| DEWPOINT | Dew point temperature (°C) | |
| PRECIP | Precipitation (mm) | |
| PRESSURE | Atmospheric pressure (hPa) | |
| RH | Relative humidity (%) | |
| WINDDIR | Real-time wind direction (degrees) | |
| WINDSPD | Real-time wind speed (m/s) | |
| SOLAR | Solar radiation (W/m²) | |
| Hydrochemical parameters | | |
| COND | Electrical conductivity (µS/cm) | Infinity–CTW |
| SAL | Salinity (PSU) | |
| SEATEMP | Sea surface temperature (°C) | |
| DO | Dissolved oxygen (mg/L) | Rinkow–w (AROW2) |
| DOSAT | Dissolved oxygen saturation (%) | |
| TURB | Turbidity (NTU) | Infinity–CLW |
| Hsig | Significant wave height (m) | Nortek AWAC 1 MHz |
| Hmax | Maximum wave height (m) | |
| CURDIR | Near surface current direction (degrees) | |
| CURSPD | Near surface current speed (m/s) | |
| TP | Top peak wave period (s) | |
| WAVEDIR | Wave direction (degrees) | |
| WLEVEL | Water level (m) | |
| pH | Power of hydrogen | Digital pH sensor |
| Biological parameter | | |
| Chl-a | Chlorophyll-a concentration (µg/L) | Infinity–CLW |

| Step | Description |
|---|---|
| Library import | Use pandas for data handling and RandomForestRegressor from scikit-learn for model training, along with additional packages for preprocessing and evaluation. |
| Data loading and preprocessing | Import the data, select the relevant features and target variables, and handle missing values. |
| Train–test split | Partition the dataset into training and testing subsets (80/20 split). |
| Model configuration and training | Initialize the RF with specific parameters (n_estimators = 100, random_state = 42, max_depth = 5, max_leaf_nodes = 10) and fit the model on the training set. |
| Performance evaluation | Evaluate model error using the mean absolute error (MAE) on both the training and testing sets. |
| Prediction visualization | Generate scatter plots of predicted vs. actual values to assess prediction trends. |
| Feature importance plot | Create a bar plot of feature importance scores to highlight key variables for each prediction target. |
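
The steps in this table can be sketched as follows. This is a minimal illustration, not the study's script: the data frame is synthetic, the column `CHLA` and the relationship between target and features are made up for the example, and the two plotting steps are omitted so that the block stays self-contained. Only the split ratio and the RF parameters are taken from the table.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the station records (hypothetical values).
rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "SEATEMP": rng.normal(29.9, 1.6, n),
    "SAL": rng.normal(30.6, 2.7, n),
    "TURB": rng.gamma(2.0, 5.0, n),
    "DO": rng.normal(6.0, 1.3, n),
})
# Made-up target that depends mainly on DO and TURB, plus noise.
df["CHLA"] = 0.8 * df["DO"] + 0.1 * df["TURB"] + rng.normal(0.0, 0.5, n)

# Preprocessing: drop rows with missing values, then separate features and target.
df = df.dropna()
X = df.drop(columns="CHLA")
y = df["CHLA"]

# 80/20 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# RF configured with the parameters listed in the table.
model = RandomForestRegressor(
    n_estimators=100, random_state=42, max_depth=5, max_leaf_nodes=10
)
model.fit(X_train, y_train)

# MAE on both subsets.
mae_train = mean_absolute_error(y_train, model.predict(X_train))
mae_test = mean_absolute_error(y_test, model.predict(X_test))
print(f"MAE train: {mae_train:.3f}, MAE test: {mae_test:.3f}")

# Impurity-based importance scores, highest first.
importance = pd.Series(model.feature_importances_, index=X.columns)
importance = importance.sort_values(ascending=False)
print(importance)
```

Comparing the training and testing MAE gives a quick check on overfitting before the importance ranking is used to choose model inputs.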

| Parameter | Unit | Mean ± SD | Min | Max |
|---|---|---|---|---|
| Meteorological parameters | | | | |
| AIRTEMP | °C | 28.39 ± 1.85 | 19.10 | 33.70 |
| DEWPOINT | °C | 25.27 ± 2.35 | 10.10 | 29.50 |
| PRECIP | mm | 0.16 ± 1.48 | 0.00 | 45.60 |
| PRESSURE | hPa | 1008.74 ± 2.77 | 999.10 | 1017.80 |
| RH | % | 83.34 ± 9.13 | 38.00 | 100.00 |
| WINDDIR | degrees | 162.90 ± 82.54 | 0.00 | 359.00 |
| WINDSPD | m/s | 3.93 ± 1.83 | 0.03 | 15.88 |
| SOLAR | W/m² | 219.66 ± 313.90 | 0.00 | 1431 |
| Hydrochemical parameters | | | | |
| COND | µS/cm | 51.56 ± 4.25 | 2.84 | 71.60 |
| SAL | PSU | 30.58 ± 2.67 | 5.14 | 43.24 |
| SEATEMP | °C | 29.85 ± 1.64 | 23.74 | 33.89 |
| DO | mg/L | 5.97 ± 1.32 | 0.00 | 11.95 |
| DOSAT | % | 95.50 ± 31.98 | 0.00 | 200.00 |
| TURB | NTU | 14.90 ± 78.49 | 0.00 | 995.89 |
| Hsig | m | 0.12 ± 0.10 | 0.00 | 2.38 |
| Hmax | m | 0.21 ± 0.16 | 0.02 | 3.01 |
| CURDIR | degrees | 165.75 ± 106.15 | 0.00 | 360.00 |
| CURSPD | m/s | 0.08 ± 0.05 | 0.00 | 0.51 |
| TP | s | 3.38 ± 0.98 | 0.60 | 33.33 |
| WAVEDIR | degrees | 199.36 ± 43.20 | 0.11 | 359.91 |
| WLEVEL | m | 0.13 ± 0.80 | −2.16 | 2.24 |
| pH | – | – | 7.00 | 9.00 |
| Biological parameter | | | | |
| Chl-a | µg/L | 3.79 ± 5.85 | 0.00 | 105.42 |
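
A summary of this form (mean ± SD, minimum, maximum per parameter) could be produced with pandas roughly as shown here. The values are hypothetical, and the use of the sample standard deviation (pandas' default, `ddof=1`) is an assumption, since the table does not state which estimator was used.

```python
import pandas as pd

# Hypothetical readings for two parameters; not data from the study.
df = pd.DataFrame({
    "AIRTEMP": [27.5, 28.4, 30.1, 29.0],
    "WINDSPD": [2.1, 3.9, 5.4, 4.2],
})

# One row per parameter with mean, sample SD, min, and max.
summary = df.agg(["mean", "std", "min", "max"]).T

# Combine mean and SD into a single display column.
summary["mean_sd"] = [
    f"{m:.2f} ± {s:.2f}" for m, s in zip(summary["mean"], summary["std"])
]
print(summary[["mean_sd", "min", "max"]])
```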

| Model | Epochs | Accuracy | Precision | Recall | F1 Score | Time/Step (s) |
|---|---|---|---|---|---|---|
| CNN–LSTM | 100 | 0.573 ± 0.014 | 0.580 ± 0.017 | 0.573 ± 0.012 | 0.570 ± 0.017 | 3.918 ± 0.354 |
| | 200 | 0.593 ± 0.002 | 0.597 ± 0.006 | 0.593 ± 0.006 | 0.593 ± 0.006 | 4.031 ± 0.159 |
| | 300 | 0.583 ± 0.020 | 0.590 ± 0.017 | 0.573 ± 0.021 | 0.573 ± 0.021 | 4.168 ± 0.028 |
| | 400 | 0.597 ± 0.006 | 0.597 ± 0.180 | 0.597 ± 0.006 | 0.593 ± 0.180 | 3.933 ± 0.180 |
| | 500 | 0.590 ± 0.001 | 0.600 ± 0.085 | 0.590 ± 0.085 | 0.590 ± 0.010 | 4.029 ± 0.085 |
| | 600 | 0.590 ± 0.002 | 0.600 ± 0.236 | 0.590 ± 0.236 | 0.590 ± 0.000 | 4.224 ± 0.236 |
| CNN–BiLSTM | 100 | 0.750 ± 0.017 | 0.750 ± 0.017 | 0.747 ± 0.023 | 0.747 ± 0.023 | 36.997 ± 0.736 |
| | 200 | 0.807 ± 0.006 | 0.807 ± 0.006 | 0.807 ± 0.006 | 0.807 ± 0.006 | 34.403 ± 1.991 |
| | 300 | 0.813 ± 0.012 | 0.813 ± 0.012 | 0.813 ± 0.012 | 0.813 ± 0.012 | 35.300 ± 1.819 |
| | 400 | 0.813 ± 0.012 | 0.820 ± 0.000 | 0.820 ± 0.000 | 0.820 ± 0.000 | 35.733 ± 1.137 |
| | 500 | 0.813 ± 0.006 | 0.813 ± 0.006 | 0.813 ± 0.006 | 0.813 ± 0.006 | 35.667 ± 1.002 |
| CNN–BiLSTM–Attention | 100 | 0.750 ± 0.017 | 0.750 ± 0.017 | 0.750 ± 0.017 | 0.750 ± 0.017 | 36.207 ± 0.131 |
| | 200 | 0.807 ± 0.006 | 0.807 ± 0.006 | 0.807 ± 0.006 | 0.807 ± 0.006 | 36.743 ± 0.649 |
| | 300 | 0.813 ± 0.006 | 0.813 ± 0.006 | 0.813 ± 0.006 | 0.813 ± 0.006 | 35.510 ± 1.585 |
| | 400 | 0.813 ± 0.006 | 0.813 ± 0.006 | 0.810 ± 0.000 | 0.810 ± 0.000 | 35.277 ± 1.704 |
| | 500 | 0.813 ± 0.006 | 0.810 ± 0.000 | 0.810 ± 0.000 | 0.810 ± 0.000 | 34.200 ± 0.300 |
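
Entries of the form mean ± SD in this table imply that each configuration was evaluated over repeated runs. A possible way to compute and aggregate such metrics with scikit-learn is sketched here. The labels and predictions are hypothetical, and both the class-weighted averaging and the sample standard deviation are assumptions, since the table does not state either choice.

```python
import numpy as np
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)


def evaluate_run(y_true, y_pred):
    """Accuracy plus class-weighted precision, recall, and F1 for one run."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="weighted", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def summarize(runs):
    """Mean and sample standard deviation of each metric across runs."""
    return {
        name: (
            float(np.mean([r[name] for r in runs])),
            float(np.std([r[name] for r in runs], ddof=1)),
        )
        for name in runs[0]
    }


# Hypothetical three-class labels and predictions from three repeated runs.
y_true = np.array([0, 0, 1, 1, 2, 2, 0, 1, 2, 1])
predictions = [
    np.array([0, 0, 1, 1, 2, 2, 0, 1, 2, 1]),  # no errors
    np.array([0, 1, 1, 1, 2, 2, 0, 1, 2, 1]),  # one error
    np.array([0, 0, 1, 2, 2, 2, 0, 1, 1, 1]),  # two errors
]

runs = [evaluate_run(y_true, y_pred) for y_pred in predictions]
summary = summarize(runs)
for name, (mean, sd) in summary.items():
    print(f"{name}: {mean:.3f} ± {sd:.3f}")

# Confusion matrix for the last run (rows: true class, columns: predicted class).
print(confusion_matrix(y_true, predictions[-1]))
```

With class-weighted averaging, recall is mathematically equal to accuracy, which would be consistent with the near-identical accuracy and recall columns in the table.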
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Taparhudee, W.; Pokavanich, T.; Chansuparp, M.; Khaodon, K.; Rermdumri, S.; Intarachart, A.; Jongjaraunsuk, R. An Attention-Based Hybrid CNN–Bidirectional LSTM Model for Classifying Chlorophyll-a Concentration in Coastal Waters. Water 2026, 18, 33. https://doi.org/10.3390/w18010033

