1. Introduction
Ocean temperature profiles describe the vertical thermal structure of ocean waters and provide detailed information on variations across the mixed layer, thermocline, and bottom waters [1,2,3]. These profiles are crucial in fisheries and aquaculture because they directly influence subsurface habitat conditions, prey availability, and water column stability, all of which affect the growth and health of farmed species [4,5,6]. Regional rapid forecasting of temperature profiles is therefore increasingly important for adaptive aquaculture management [7,8]. Regions such as the Yellow Sea Cold Water Mass, characterized by persistent bottom–surface thermal differences, are sensitive habitats for species such as scallops and sea cucumbers, whose physiological stress correlates with bottom temperature fluctuations [9,10].
Forecasts of vertical temperature evolution at this regional scale help aquaculture practitioners adjust cage depths, optimize feeding, and mitigate risks from extreme events [11,12]. High-resolution, physically consistent, and timely models are increasingly important for smart aquaculture and ecological risk management.
However, ocean temperature profiles evolve under complex physical processes [13,14], influenced by surface factors like wind and heat flux as well as multi-scale interactions involving salinity, turbulence, and deep circulation [15]. In regions with intricate thermal structures, such as the Yellow Sea, nonlinear dynamics lead to abrupt structural shifts [16], further complicated by weather variability, small-scale turbulence, and ecological events [17]. Accurately capturing these rapidly varying, nonlinear vertical dynamics, especially at hourly scales, remains a key scientific and technical challenge.
While numerical models remain foundational in ocean prediction, they face limitations in real-time applications. High computational costs, sensitivity to initial conditions, and the difficulty of deploying them on edge devices limit their responsiveness in localized applications that require frequent updates. They also have limited capacity to learn from real-world data, particularly under sparse observations.
Recent advances in satellite remote sensing, reanalysis products, and deep learning provide promising alternatives. Satellite data offer key physical variables such as sea surface temperature (SST), wind fields, and sea surface height anomalies (ZOS), while deep learning excels at spatiotemporal modeling and is well-suited for edge deployment [18]. Several studies have applied data-driven methods. LSTM has been used to predict mixed-layer temperatures from meteorological inputs [19]; the effectiveness of LSTM in SST forecasting was demonstrated in [20]; a multi-layer ConvLSTM was proposed for 3D ocean temperature modeling [21]; a 4D CNN (SST-4D-CNN) was developed for thermocline prediction [22]; SST, ZOS, and wind fields have been fused to enhance subsurface forecasts [23]; and CNN–LSTM–attention architectures integrating temperature, salinity, ocean currents, and mixed-layer depth (MLD) have demonstrated the value of embedding physical drivers [24].
Despite these advances, several challenges persist in current ocean temperature modeling approaches. Traditional numerical models, while physically grounded, often require high computational resources and are difficult to adapt for localized applications requiring frequent updates. Many models use input data updated only daily or less frequently, hindering regional rapid forecasting. Most models do not explicitly incorporate physical constraints, which reduces their interpretability. Most existing studies also focus on 2D SST prediction without full-profile modeling, which limits practical applicability.
To overcome these challenges, this study leverages the synergy between large-scale satellite data and fine-scale, continuous in situ observations. We propose a deep learning model, namely PICA-Net, which combines 1D-CNN for vertical structure extraction, BiLSTM for temporal dynamics, and attention mechanisms for adaptive feature weighting [25,26]. Using hourly historical data, it predicts temperature profiles at 6 h intervals over the next 24 h. Inputs include satellite-derived variables and physical drivers.
To improve physical coherence, the model incorporates additional physical constraint terms into its loss function: temporal–spatial diffusion consistency, mixed-layer homogeneity, and surface heat flux consistency [27]. Trained on reanalysis data, the model shows strong accuracy and generalization, and incorporating physical constraints further improves shallow-layer coherence and physical realism. Together, these results demonstrate the complementary strengths of data-driven and physics-informed approaches.
The reanalysis data we used (e.g., CMEMS, ERA5) assimilate multiple in situ sources such as Argo float profiles, ship-based measurements, and other observational platforms, after rigorous quality control. Reanalysis products are adopted as ground truth for their temporal–spatial continuity and alignment with known ocean dynamics, serving as reliable substitutes for in situ data [28]. This lays the foundation for future integration of real-time observations, advancing responsive and accurate temperature forecasting systems.
Ultimately, PICA-Net supports edge deployment, enabling real-time, site-specific forecasting to guide smart aquaculture decisions. It offers robust support for scheduling, environmental alerts, and precision operations. The remainder of this paper is structured as follows:
Section 2 presents the data sources and preprocessing;
Section 3 details the PICA-Net architecture and physical constraints;
Section 4 describes the experimental setup and results, including ablation studies and edge deployment; and
Section 5 concludes with future directions for real-time integration and model generalization.
3. Experiments and Results
3.1. Experimental Setup and Forecasting Strategy
We adopt a sliding window approach based on reanalysis data to generate training samples, using the past 24 h of multi-source inputs (14 features across 16 depth layers) with an input shape of [14, 16, 24], and predicting temperature profiles at four 6 h intervals. All features are standardized using training set statistics. The model is trained with Adam (initial learning rate 1 × 10⁻³), using MSE as the primary loss and optional physical regularization terms (heat diffusion, MLD uniformity, and heat flux balance) to enhance physical consistency. Training runs up to 200 epochs with a batch size of 64 and early stopping enabled.
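The sliding-window sampling described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `make_windows` and `standardize` are hypothetical, the hourly arrays are assumed to be laid out as (time, feature, depth), and per-feature standardization is an assumption, since the text only states that training set statistics are used.

```python
import numpy as np


def make_windows(features, temperature, n_in=24, leads=(6, 12, 18, 24)):
    """Build (input, target) pairs from hourly series.

    features:    (T, F, D) hourly inputs, e.g. F = 14 features, D = 16 depths
    temperature: (T, D) hourly temperature profiles used as targets
    Returns X with shape (N, F, D, n_in) and Y with shape (N, len(leads), D).
    """
    n_time = features.shape[0]
    max_lead = max(leads)
    X, Y = [], []
    # t is the first hour after the input window; the last input hour is t - 1.
    for t in range(n_in, n_time - max_lead + 1):
        window = features[t - n_in:t]                      # (n_in, F, D)
        X.append(np.transpose(window, (1, 2, 0)))          # (F, D, n_in)
        # Targets are counted from the last input hour: +6 h, +12 h, ...
        Y.append(np.stack([temperature[t - 1 + h] for h in leads]))
    return np.asarray(X), np.asarray(Y)


def standardize(X_train, X_other, eps=1e-8):
    """Scale both sets with per-feature mean and std from the training set only."""
    mean = X_train.mean(axis=(0, 2, 3), keepdims=True)
    std = X_train.std(axis=(0, 2, 3), keepdims=True)
    return (X_train - mean) / (std + eps), (X_other - mean) / (std + eps)
```

With 100 hourly steps, 14 features, and 16 depths, `make_windows` yields 53 samples of shape [14, 16, 24], each paired with four target profiles. Computing the statistics on the training split alone avoids leaking information from the validation data into the inputs.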
3.2. Evaluation Metrics and Comparison Schemes
To evaluate the performance of PICA-Net in hourly temperature profile prediction, we use two standard metrics, namely mean absolute error (MAE) and root mean square error (RMSE), calculated over the full test set, individual forecast lead times (+6 h to +24 h), and depth layers (16 levels).
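The specific formulas for MAE and RMSE are defined as follows:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^{2}}$$
where $\hat{y}_i$ represents the model prediction value, $y_i$ represents the actual observation value, and $N$ is the total number of samples.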
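As an illustration of how these two metrics can be evaluated over the full set, per lead time, and per depth layer, the sketch below uses NumPy. The array layout (sample, lead time, depth) and the function names are assumptions made for the example.

```python
import numpy as np


def mae(pred, obs, axis=None):
    """Mean absolute error, optionally reduced over selected axes only."""
    return np.mean(np.abs(pred - obs), axis=axis)


def rmse(pred, obs, axis=None):
    """Root mean square error, optionally reduced over selected axes only."""
    return np.sqrt(np.mean((pred - obs) ** 2, axis=axis))


def summarize(pred, obs):
    """pred, obs: (N, L, D) arrays of N samples, L lead times, D depth layers."""
    return {
        "overall": (mae(pred, obs), rmse(pred, obs)),
        "per_lead": (mae(pred, obs, axis=(0, 2)), rmse(pred, obs, axis=(0, 2))),
        "per_depth": (mae(pred, obs, axis=(0, 1)), rmse(pred, obs, axis=(0, 1))),
    }
```

Reducing over the sample and depth axes gives one value per lead time (+6 h to +24 h), while reducing over the sample and lead axes gives one value per depth layer.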
Additionally, we conduct three comparative experiments: feature ablation to assess key remote sensing inputs, module ablation to evaluate the contribution of each network component, and model comparison against LSTM, TCN, Transformer, and Random Forest to verify accuracy and physical consistency.
3.3. Feature Ablation Study
To evaluate the contribution of satellite remote sensing features to PICA-Net’s performance, we conducted feature ablation experiments focused on key surface forcings such as SST, heat flux, and wind. Using the full-feature model (14 variables) as the baseline (MAE: 0.2876 °C, RMSE: 0.4073 °C), we systematically removed groups of inputs—SST and Q_net, ZOS, wind speed (u10, v10), wind stress (τx, τy), and all remote sensing features—and retrained the model. Performance changes on the validation set were analyzed to quantify the importance of each input group in predicting vertical temperature structures.
Table 2 summarizes the results of these ablation experiments, in which individual or grouped variables were removed, the model was retrained, and the resulting errors were evaluated on the validation set. Every tested group of remote sensing features contributes positively to prediction accuracy: removing any one of them produces a clear increase in both MAE and RMSE.
3.4. Module Ablation Experiment
To assess the effectiveness of PICA-Net’s hybrid architecture and the role of its core components—1D-CNN, Bi-LSTM, and Attention—we performed a module ablation study. Using the full model as baseline, we tested three variants by removing each module individually: (1) No-CNN (Bi-LSTM + Attention), evaluating CNN’s role in capturing vertical spatial features; (2) No-BiLSTM (1D-CNN + Attention), testing temporal modeling capacity; and (3) No-Attention (1D-CNN + Bi-LSTM), assessing the impact of attention. All models were trained under the same conditions, and changes in MAE and RMSE were used to quantify each module’s contribution to predictive accuracy and interpretability.
Table 3 summarizes the results; all variants were retrained under the same settings and assessed on the validation set. The complete hybrid architecture performs best, and removing any core module clearly reduces the model’s predictive ability, which supports the architectural design.
3.5. Comparison with Other Advanced Models
To evaluate the performance of PICA-Net, we compared it against four representative baseline models: Random Forest (traditional machine learning), LSTM (classic RNN), and two advanced deep learning architectures—Temporal Convolutional Network (TCN) and Transformer. All models were trained with identical datasets and input features for fair comparison. Evaluation was based on MAE and RMSE to assess both average accuracy and error sensitivity.
As shown in Table 4, PICA-Net achieved the lowest root mean square error (RMSE) among all compared models, indicating the best overall performance in ocean temperature profile prediction.
3.6. Physical Constraints
To improve the physical plausibility of ocean temperature profile predictions without sacrificing accuracy, PICA-Net incorporates three weak physical constraints, based on key oceanographic principles, as training regularization terms. These guide the model toward more physically consistent outputs. Their quantitative impact on prediction performance is summarized in Table 5.
Adding physical regularization increases the overall RMSE slightly, from 0.4073 °C to 0.4125 °C (a difference of 0.0052 °C, or about 1.3%), so the constraints have little effect on global accuracy. This statement is descriptive: the difference falls within the expected fluctuations between validation batches, and no formal statistical test was performed.
However, scalar metrics like RMSE may overlook structural improvements. To further evaluate model behavior, we performed case-by-case profile visualizations to examine whether physical constraints enhance the consistency and interpretability of predicted thermal structures under complex oceanic conditions.
Figure 3 and Figure 4 present the model’s prediction results over two consecutive days, 20–21 February 2025. The black dashed line denotes the observed temperature profile (ground truth), the red solid line represents the prediction of the baseline PICA-Net model without physical constraints, and the blue solid line shows the prediction from PICA-Net trained with physical constraints.
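The three weak constraints named in this section (heat diffusion, mixed-layer uniformity, and heat flux balance) could take a form like the one sketched below. The exact expressions used in PICA-Net are not given here, so everything in this block is an assumption for illustration: the function names, the diffusivity `kappa`, the constant `RHO_CP`, the penalty weights, and the simplification that the resolved water column absorbs all of the net surface heat flux.

```python
import numpy as np

# Assumed constant: seawater density (kg m^-3) times specific heat (J kg^-1 K^-1).
RHO_CP = 1025.0 * 3990.0


def diffusion_penalty(T, depth, dt_s, kappa=1e-4):
    """Mean squared residual of dT/dt = kappa * d2T/dz2.

    T: (L, D) predicted profiles at L lead times; depth: (D,) in metres;
    dt_s: spacing between lead times in seconds; kappa: assumed diffusivity.
    """
    dT_dt = np.diff(T, axis=0) / dt_s                # forward difference, (L-1, D)
    dT_dz = np.gradient(T, depth, axis=1)            # accepts uneven depth spacing
    d2T_dz2 = np.gradient(dT_dz, depth, axis=1)      # (L, D)
    residual = dT_dt - kappa * d2T_dz2[:-1]
    return float(np.mean(residual ** 2))


def mixed_layer_penalty(T, depth, mld):
    """Mean temperature variance inside the mixed layer; mld: (L,) in metres."""
    mask = depth[None, :] <= mld[:, None]            # (L, D) layers above the MLD
    count = np.maximum(mask.sum(axis=1), 1)          # guard against empty layers
    mean = (T * mask).sum(axis=1) / count
    var = (mask * (T - mean[:, None]) ** 2).sum(axis=1) / count
    return float(var.mean())


def heat_flux_penalty(T, depth, q_net, dt_s):
    """Squared mismatch (W^2 m^-4) between heat-content tendency and q_net.

    q_net: (L-1,) mean net surface heat flux between lead times, in W m^-2.
    """
    layers = 0.5 * (T[:, 1:] + T[:, :-1]) * np.diff(depth)   # trapezoid rule
    heat = RHO_CP * layers.sum(axis=1)                        # J m^-2
    tendency = np.diff(heat) / dt_s                           # W m^-2
    return float(np.mean((tendency - q_net) ** 2))


def total_loss(pred, target, depth, mld, q_net, dt_s=6 * 3600.0,
               weights=(0.1, 0.1, 1e-6)):
    """MSE plus weighted penalties; the weights are hypothetical placeholders."""
    w_diff, w_mld, w_flux = weights
    mse = float(np.mean((pred - target) ** 2))
    return (mse
            + w_diff * diffusion_penalty(pred, depth, dt_s)
            + w_mld * mixed_layer_penalty(pred, depth, mld)
            + w_flux * heat_flux_penalty(pred, depth, q_net, dt_s))
```

This NumPy version only evaluates the penalties for a single predicted sequence. For training, the same expressions would need to be written in an automatic-differentiation framework so that gradients can flow through them. Small weights keep the terms "weak", which matches the small change in RMSE reported above.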
3.7. Edge Deployment Experiment
To evaluate the real-world deployment potential of the proposed temperature profile prediction model, a series of edge computing experiments was conducted, including model optimization, accuracy verification, and performance benchmarking. We tested on two platforms: a mainstream PC (Intel i7-10750H CPU, Intel, Santa Clara, CA, USA; GTX 1650 Ti GPU, Nvidia, Santa Clara, CA, USA) as baseline, and the NVIDIA Jetson TX2 (Nvidia, Santa Clara, CA, USA) as the target edge device. The full hardware and software configurations are listed in Table 6.
The trained PyTorch model was first converted to ONNX format and then optimized using TensorRT to generate three inference engines: FP32 (baseline), FP16 (reduced precision for speed), and INT8 (quantized for maximum efficiency). To evaluate deployment effectiveness, we measured inference time, RMSE accuracy, power consumption, and model size across configurations. Results are summarized in Table 7.
4. Discussion
This section analyzes the experimental results in depth, going beyond metric comparisons to explore physical mechanisms, model behavior, and scientific significance. It provides a comprehensive evaluation of PICA-Net, emphasizing its accuracy, physical consistency, generalization, and applicability to real-time ocean temperature profile forecasting.
4.1. Discussion on the Importance of Remote Sensing Features
Feature ablation experiments (Table 2) confirm that all evaluated satellite remote sensing variables enhance PICA-Net’s prediction accuracy. Wind stress and wind speed had the greatest impact: removing them increased RMSE by 13.92% and 11.78%, respectively, highlighting the dominant role of wind-driven mixing in shaping short-term temperature structure. Similar conclusions on the influence of wind stress on surface temperature and mixing have been reached in studies leveraging satellite wind products and flux retrievals [29]. Removing sea surface height anomaly (ZOS) led to a 9.70% RMSE rise, underscoring its value in capturing subsurface dynamics like eddies and fronts. SST and net heat flux (Q_net) contribute to surface thermal processes; removing them caused a 7.02% performance drop [30,31]. When all remote sensing features were excluded, RMSE rose by 13.01%, showing that PICA-Net benefits from the synergy of diverse physical variables. Overall, the results validate the model’s multi-source input design and emphasize the importance of integrating dynamic and thermodynamic satellite data for accurate regional rapid forecasting of ocean temperature profiles.
4.2. Analysis of Synergistic Effects in Model Architecture
Module ablation experiments (Table 3) reveal that PICA-Net’s high accuracy stems from the collaborative function of its three core components (1D-CNN, Bi-LSTM, and attention) rather than reliance on any single module. Removing the 1D-CNN led to the largest performance degradation, with RMSE increasing by 28.70%, confirming its role in extracting spatial patterns like thermoclines and mixed layers. Eliminating Bi-LSTM resulted in a 13.48% increase in RMSE, highlighting its importance for modeling temporal dependencies in dynamic ocean processes. Removing the attention mechanism caused the smallest impact (RMSE increased by 4.81%), but the mechanism still improves the model’s adaptability by dynamically reweighting features under varying conditions such as storms or calm periods. These findings confirm the architectural synergy of PICA-Net, where each module contributes uniquely to building a compact, accurate, and physically consistent prediction system [32].
4.3. Performance Comparison of PICA-Net with Other Models
To validate the performance of PICA-Net, we compared it with several representative baseline models. Among deep learning approaches, the Temporal Convolutional Network (TCN) achieved the lowest MAE (0.2842 °C), slightly outperforming PICA-Net (0.2876 °C). However, PICA-Net achieved a lower RMSE (0.4073 °C vs. 0.4187 °C), indicating better robustness against large deviations, which is crucial for real-world applications [33]. LSTM and Transformer models performed worse, with RMSEs 6.31% and 8.30% higher than PICA-Net, respectively, underscoring the importance of combining temporal and spatial feature extraction [34]. In contrast, the Random Forest model exhibited the poorest performance (RMSE 0.7681 °C), revealing its limitations in modeling complex spatiotemporal ocean dynamics. Taken together, these results indicate that PICA-Net achieves the lowest RMSE and a competitive MAE among the compared models for temperature profile prediction.
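The reason a lower RMSE at a similar MAE points to fewer large deviations can be seen in a small synthetic example. The error values below are invented for illustration and are not taken from any model in this study.

```python
import numpy as np


def mae(errors):
    """Mean absolute error of a vector of prediction errors."""
    return float(np.mean(np.abs(errors)))


def rmse(errors):
    """Root mean square error of a vector of prediction errors."""
    return float(np.sqrt(np.mean(np.square(errors))))


# Synthetic errors in degrees Celsius with the same mean absolute value.
even_errors = np.array([0.3, 0.3, 0.3, 0.3])     # uniformly moderate errors
spiky_errors = np.array([0.1, 0.1, 0.1, 0.9])    # mostly small, one large miss

same_mae = np.isclose(mae(even_errors), mae(spiky_errors))   # both 0.3
rmse_gap = rmse(spiky_errors) - rmse(even_errors)            # about 0.158
```

Because the errors are squared before averaging, the single 0.9 °C miss raises RMSE to about 0.458 °C while MAE stays at 0.3 °C. A model with a slightly higher MAE but a lower RMSE is therefore making fewer or smaller large errors.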
4.4. Analysis of Model Prediction Error
4.4.1. Error Analysis for Different Forecast Lead Times
To assess short-term forecast stability, we evaluated PICA-Net’s performance at 6, 12, 18, and 24 h lead times. The results in Figure 5 show how the model’s error increases with the forecast lead time. Both MAE and RMSE increase steadily with longer horizons, with RMSE rising from 0.3512 °C at +6 h to 0.4600 °C at +24 h, an increase of about 31%. This reflects typical prediction error accumulation and reduced ocean predictability at longer scales due to random, high-frequency dynamics. Nevertheless, PICA-Net consistently maintained high accuracy (overall RMSE = 0.4073 °C), demonstrating robust short-term forecasting capability.
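The relative error growth across lead times follows from the two RMSE values quoted above. The helper name `percent_increase` is introduced here for illustration only.

```python
def percent_increase(baseline, value):
    """Relative change of value with respect to baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


# RMSE at +6 h and +24 h as reported in the text (degrees Celsius).
rmse_6h, rmse_24h = 0.3512, 0.4600
growth = percent_increase(rmse_6h, rmse_24h)   # roughly 31
```

The same helper applies to the ablation and model comparison results, where each percentage is the RMSE change relative to the full-model baseline of 0.4073 °C.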
4.4.2. Vertical Distribution Characteristics of Prediction Errors
To further investigate the ability of the PICA-Net model to reproduce the vertical structure of the ocean, we calculated the average prediction error (MAE and RMSE) of the model across the entire validation set and at each standard depth layer. The results are shown in Figure 6, where the red solid line represents MAE and the blue solid line represents RMSE.
Results show low errors in the upper ocean (0–18 m), where the model benefits from surface remote sensing inputs like SST, Q_net, and wind fields. However, errors increase sharply below the thermocline (~18 m), peaking at 34.4 m (MAE ≈ 0.5 °C, RMSE ≈ 0.7 °C). This trend reflects the model’s limited ability to infer deep-layer dynamics using only surface data. The continued error growth at depth highlights the challenge of predicting subsurface thermal structures without direct physical constraints, especially as internal oceanic processes dominate. Future improvements may require incorporating indirect subsurface indicators or assimilating sparse in situ profile data.
4.5. Further Discussion on the Regularization Effect of Physical Constraints
Although the introduction of physical constraints slightly increased RMSE, qualitative results (Figure 3 and Figure 4) show that these terms effectively regularize the model by suppressing nonphysical spikes and high-frequency jitters. While baseline predictions may align numerically with ground truth, they often violate fluid thermodynamic principles. Predictions from PICA-Net trained with physical constraints yield smoother, physically plausible profiles, improving interpretability and stability. This highlights the importance of integrating physical priors into neural networks, demonstrating the synergy between data-driven learning and physical laws in enhancing ocean prediction quality [35].
4.6. Discussion of Edge Deployment Experiments
4.6.1. Accuracy Validation Analysis
As shown in Table 7, the proposed model achieved a baseline RMSE of ~0.4 °C on the full validation set. To assess the impact of deployment and quantization, 100 samples were tested across platforms. On PC (GPU), the model yielded an RMSE of 0.250139 °C; the same value was obtained on Jetson TX2 using the FP32 engine, confirming that the PyTorch → ONNX → TensorRT pipeline is accurate and lossless. Quantized versions showed minimal to zero precision loss: the FP16 engine had only a 0.1% increase in RMSE, and the INT8 version maintained identical accuracy. These results validate the model’s suitability for efficient edge deployment without sacrificing prediction accuracy.
4.6.2. Real-Time Performance and Resource Consumption Analysis
Real-time inference is essential for operational forecasting. On Jetson TX2, the model achieves an inference time of 3.72 ms with FP32, which is further reduced to 2.98 ms with the FP16 engine, a speedup of about 25% (roughly 20% lower latency) at the cost of only a 0.1% increase in RMSE. Although the TensorRT engine file (~4 MB) is larger than the original PyTorch weights (~0.88 MB), this increase is justified by the engine’s precompiled optimizations and remains negligible relative to edge device storage capacity. Slightly higher instantaneous power consumption in FP16 and INT8 modes results from denser computations per unit time, but shorter inference durations keep total energy use low. Overall, the FP16-optimized model offers a balanced solution, combining fast inference, high accuracy, and efficient resource usage, making it well-suited for edge-based, regional rapid forecasting of ocean temperature profiles.
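The timing figures above, and the trade-off between instantaneous power and energy per inference, can be checked with a short calculation. The latencies are those quoted in the text; the power values are hypothetical placeholders chosen only to illustrate the argument, since the measured values are reported in Table 7 and not restated here.

```python
def speedup(t_ref_ms, t_new_ms):
    """Throughput ratio of the new engine relative to the reference."""
    return t_ref_ms / t_new_ms


def latency_reduction_pct(t_ref_ms, t_new_ms):
    """Percentage by which the per-inference latency drops."""
    return 100.0 * (t_ref_ms - t_new_ms) / t_ref_ms


def energy_per_inference_mj(power_w, latency_ms):
    """Energy per inference in millijoules (watts times milliseconds)."""
    return power_w * latency_ms


fp32_ms, fp16_ms = 3.72, 2.98                      # latencies from the text
ratio = speedup(fp32_ms, fp16_ms)                  # about 1.25
drop = latency_reduction_pct(fp32_ms, fp16_ms)     # about 19.9

# Hypothetical power draws in watts, for illustration only.
fp32_energy = energy_per_inference_mj(5.0, fp32_ms)
fp16_energy = energy_per_inference_mj(5.5, fp16_ms)
```

With these placeholder numbers, a 10% higher power draw in FP16 mode still gives lower energy per inference, because the latency falls by about 20%. Whether this holds on the actual device depends on the measured power values.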
4.7. Limitations
While PICA-Net demonstrates promising performance in short-term temperature profile forecasting, several limitations remain. First, the current model is trained entirely on reanalysis products, which, while already assimilating a wide range of in situ and satellite observations, still represent a single class of processed data. This limits the diversity of input data sources, and future work will explore the integration of additional real-time observations to further enhance model robustness. Second, although physically inspired loss terms are incorporated, they are relatively simple and do not fully represent three-dimensional dynamics, internal waves, or nonlinear oceanic processes. Third, the model’s predictive skill decreases with depth, reflecting the challenge of inferring deep-layer thermal structures primarily from surface and near-surface inputs. Lastly, this study focuses solely on temperature prediction; extending the framework to salinity, currents, or biogeochemical variables remains unexplored. These limitations will be addressed in future iterations of the system.
5. Conclusions
This paper presents and systematically validates a lightweight deep learning framework, PICA-Net, designed to provide hourly resolution temperature profile forecasts with edge deployment capabilities for practical applications such as smart fisheries, marine environmental monitoring and regional rapid forecasting. By integrating multi-source physical driving features—particularly key satellite remote sensing variables—the model effectively captures the dominant dynamic mechanisms governing vertical thermal evolution in the ocean.
Architecturally, PICA-Net aims to jointly model local vertical structures, temporal evolution, and feature importance. Experimental results demonstrate that, across a 24 h prediction horizon with 6 h intervals, PICA-Net achieves the lowest RMSE among the representative baseline models while producing physically consistent profiles and running efficiently on an edge device, highlighting its potential for real-world operational use.
Furthermore, this study incorporates additional physical constraints into the model’s loss function, including thermal diffusion smoothing, mixed-layer depth (MLD) consistency, and net heat flux consistency. Although adding these physical constraints slightly increases the overall RMSE, they significantly improve the stability of predicted thermocline structures and suppress nonphysical anomalies in the temperature field. This demonstrates the feasibility and value of using weak physical regularization to improve the physical plausibility of data-driven predictions.
Despite strong performance in the upper and mid-depth layers, depth-wise error analysis indicates that PICA-Net’s performance in deeper layers still has room for improvement. Future work will explore the inclusion of richer sub-surface features to enhance the model’s ability to capture deep ocean thermal dynamics. In addition, we plan to integrate real-time observational data from in situ dissolved oxygen chain sensors and develop corresponding data assimilation mechanisms to improve the model’s real-time adaptability and robustness. This transition from reanalysis-based training to real-time in situ data integration will also enable the model to evolve into a truly field-operational forecasting system. Ultimately, we aim to build a high-resolution, physically consistent, and flexible temperature profile prediction system, enabling regional rapid forecasting and offering reliable technical support for smart fisheries, coastal ecological monitoring, and marine hazard early warning applications.