Author Contributions
Conceptualization, methodology, software, validation, formal analysis, investigation, data curation, writing (original draft preparation, review and editing), and visualization: M.R.P.R.; methodology, validation, writing (review and editing), and supervision: M.J.S.B.; conceptualization, methodology, resources, writing (review and editing), supervision, and project administration: Ó.J.G.C. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Digital elevation model of the Boyacá study area (60–4654 m). CHIRPS grid: 61 × 65 = 3965 cells at 0.05° resolution; contour lines at 500 m intervals.
Figure 2.
Experimental framework: three Type (iii) component-combination hybrid families processing identical feature bundles (BASIC, KCE, and PAFC) derived from CHIRPS precipitation and SRTM topography. Evaluation uses RMSE, MAE, R², and bias at horizons H = 1 to H = 12.
Figure 3.
Convergence behavior by architecture and experiment for H = 12. Cooler colors indicate lower validation loss.
Figure 4.
Forecast horizon degradation: R² from H = 1 to H = 12 for the top models from each family. Lines show the mean with shaded ±1 standard deviation bands. ConvLSTM: Convolutional Long Short-Term Memory; FNO: Fourier Neural Operator; GNN-TAT: Graph Neural Network with Temporal Attention.
Figure 5.
Feature set performance heatmap: R² at H = 12 by model and feature-engineering strategy for all families. Rows represent architecture variants; columns represent feature bundles (BASIC, KCE, and PAFC).
Figure 6.
Multi-metric radar comparison across normalized metrics (higher is better) for ConvLSTM, FNO, and GNN-TAT. Axes represent R², 1 − NRMSE, 1 − |bias|, stability, and efficiency.
Figure 7.
Parameter efficiency frontier: R² versus parameter count (log scale) for all families. The Pareto frontier (dashed line) connects non-dominated configurations. GNN-TAT: Graph Neural Network with Temporal Attention; FNO: Fourier Neural Operator; ConvLSTM: Convolutional Long Short-Term Memory.
Figure 8.
Model ranking by R² at H = 12 with 95% bootstrap confidence intervals for the top 15 configurations. Architectures are color-coded by family. A dashed reference line is shown.
Figure 9.
Training dynamics: validation loss curves for representative models from each family. Solid lines show validation loss; vertical dotted lines mark early stopping epochs.
Figure 10.
GNN-TAT internal architecture: graph encoder (3965 nodes), temporal attention (four heads; 60-month history), and LSTM decoder (H = 1–12).
Figure 11.
GNN-TAT model comparison on the full Boyacá grid: (a) RMSE by model and feature set, (b) R² by model and feature set with the ConvLSTM baseline (dashed red), (c) RMSE degradation across horizons H = 1–12, and (d) bias distribution by GNN variant.
Figure 12.
Per-grid-cell R² (NSE) at H = 12 for (a) V2 ConvLSTM and (b) V4 GNN-TAT. Green indicates higher skill; red indicates lower or negative R².
Figure 13.
Density scatter plots of observed versus predicted monthly precipitation for (a) V2 ConvLSTM and (b) V4 GNN-TAT at H = 12 across all grid cells. Color indicates sample density on a logarithmic scale; the dashed line shows the 1:1 reference.
Figure 14.
Performance by elevation band at H = 12: (a) R² (NSE) and (b) RMSE (mm) for V2 ConvLSTM and V4 GNN-TAT. Cell counts per band are indicated below panel (a).
Figure 15.
Mean precipitation across forecast horizons (H = 1 to H = 12) at three representative grid cells: (a) valley (<1500 m), (b) mid-slope (1500–2500 m), and (c) highland (>2500 m). Black: observed (CHIRPS); blue: V2 ConvLSTM; and orange: V4 GNN-TAT.
Table 1.
Architecture components for each model family; all models map the input sequence to H = 12 output horizons.
| Model | Params | Key Layers | Hybrid Components |
|---|---|---|---|
| **Family 1: Convolutional Recurrent Hybrids** | | | |
| ConvLSTM_Enhanced | 78K | ConvLSTM2D(32) → BN → ConvLSTM2D(16) | Conv spatial + LSTM temporal |
| ConvLSTM_Bidirectional | 1.2M | Bidir(ConvLSTM2D(32)) → Concat | Conv + LSTM + bidirectional |
| ConvLSTM_Residual | 234K | ConvLSTM2D → Residual skip → Add | Conv + LSTM + residual |
| ConvLSTM_Attention | 178K | ConvLSTM2D → Multi-head Attention | Conv + LSTM + attention |
| Transformer_Baseline | 41.8M | TimeDistributed → Four-head Attention | Flatten + self-attention |
| **Family 2: Spectral–Temporal Hybrids** | | | |
| FNO_Pure | 85K | SpectralConv2d (12 modes) → MLP | Fourier spectral + MLP |
| FNO_ConvLSTM_Hybrid | 106K | SpectralConv2d → ConvLSTM | Fourier + ConvLSTM |
| **Family 3: Graph Attention LSTM Hybrids (GNN-TAT)** | | | |
| GNN_TAT_GAT | 98K | GAT(Four heads) → Temporal Attn → LSTM | Graph + Attention + LSTM |
| GNN_TAT_GCN | 98K | GCN(Two layers) → Temporal Attn → LSTM | Graph + Attention + LSTM |
| GNN_TAT_SAGE | 106K | GraphSAGE → Temporal Attn → LSTM | Graph + Attention + LSTM |
Table 2.
Master model comparison: all architectures, reporting R² at H = 1, 6, and 12 and RMSE/MAE (mm) at the H = 12 forecast horizon.
| Family | Model | Params | Features | H = 1 R² | H = 6 R² | H = 12 R² | RMSE (mm) | MAE (mm) |
|---|---|---|---|---|---|---|---|---|
| **ConvLSTM Family (Baselines)** | | | | | | | | |
| ConvLSTM | ConvLSTM | 78K | BASIC | 0.642 | 0.645 | 0.601 | 83.7 | 60.2 |
| ConvLSTM | ConvLSTM_Bidirectional | 1.2M | BASIC | 0.618 | 0.653 | 0.598 | 84.0 | 61.0 |
| ConvLSTM | ConvLSTM_Residual | 234K | BASIC | 0.653 | 0.651 | 0.589 | 84.9 | 61.6 |
| ConvLSTM | ConvLSTM_EfficientBidir | 312K | BASIC | 0.611 | 0.603 | 0.588 | 85.1 | 60.9 |
| ConvLSTM | ConvLSTM_Attention | 178K | BASIC | 0.482 | 0.527 | 0.480 | 95.6 | 70.1 |
| **Physics-Informed (FNO)** | | | | | | | | |
| FNO | FNO_ConvLSTM_Hybrid | 106K | BASIC | 0.630 | 0.609 | 0.582 | 85.6 | 60.6 |
| FNO | FNO_Pure | 85K | BASIC | 0.180 | 0.169 | 0.206 | 118.0 | 92.4 |
| FNO | FNO_Pure | 85K | PAFC | 0.126 | 0.125 | 0.179 | 120.1 | 93.3 |
| FNO | FNO_ConvLSTM_Hybrid | 106K | PAFC | 0.374 | 0.054 | −0.533 | 164.0 | 116.0 |
| **Hybrid GNN-TAT** | | | | | | | | |
| GNN-TAT | GNN_TAT_GCN | 98K | PAFC | 0.625 | 0.592 | 0.555 | 88.3 | 63.4 |
| GNN-TAT | GNN_TAT_GAT | 98K | BASIC | 0.613 | 0.612 | 0.554 | 88.5 | 62.8 |
| GNN-TAT | GNN_TAT_SAGE | 106K | KCE | 0.550 | 0.530 | 0.518 | 92.0 | 67.9 |
| GNN-TAT | GNN_TAT_GAT | 98K | KCE | 0.549 | 0.616 | 0.515 | 92.3 | 65.7 |
| GNN-TAT | GNN_TAT_GCN | 98K | BASIC | 0.555 | 0.495 | 0.484 | 95.2 | 70.4 |
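The Pareto frontier of Figure 7 (parameter count versus R² at H = 12) can be reproduced from the Table 2 values. A minimal sketch, using a subset of the BASIC-feature rows; the helper name `pareto_frontier` is ours, not the paper's:

```python
# Sketch: parameter-efficiency Pareto frontier from Table 2.
# A configuration is non-dominated if no other model has both
# fewer (or equal) parameters and higher (or equal) R^2 at H = 12,
# with at least one strict inequality.

def pareto_frontier(models):
    """models: list of (name, params, r2) -> names of non-dominated configs."""
    frontier = []
    for name, params, r2 in models:
        dominated = any(
            p2 <= params and r2_2 >= r2 and (p2 < params or r2_2 > r2)
            for n2, p2, r2_2 in models
            if n2 != name
        )
        if not dominated:
            frontier.append(name)
    return frontier

# (name, parameter count, R^2 at H = 12) from Table 2, BASIC features
models = [
    ("ConvLSTM", 78_000, 0.601),
    ("ConvLSTM_Bidirectional", 1_200_000, 0.598),
    ("ConvLSTM_Residual", 234_000, 0.589),
    ("FNO_ConvLSTM_Hybrid", 106_000, 0.582),
    ("FNO_Pure", 85_000, 0.206),
    ("GNN_TAT_GAT", 98_000, 0.554),
]
print(pareto_frontier(models))  # the 78K ConvLSTM dominates this subset
```

With these particular rows the frontier collapses to the single 78K ConvLSTM, which has both the fewest parameters and the highest R²; the full frontier in Figure 7 spans all evaluated configurations.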
Table 3.
Hyperparameter configuration for all model families.
| Category | Parameter | Value |
|---|---|---|
| **ConvLSTM Baselines** | | |
| Training | Epochs | 200 |
| Training | Batch Size | 4 |
| Training | Learning Rate | 0.001 |
| Training | Early Stop Patience | 20 |
| ConvLSTM | Filters | 32, 16 |
| ConvLSTM | Kernel Size | 3 × 3 |
| **Physics-Informed (FNO)** | | |
| FNO | Fourier Modes | 12 |
| FNO | Width | 32 |
| Training | Learning Rate | 0.001 |
| Training | Epochs | 80 |
| Training | Batch Size | 2 |
| Training | Early Stop Patience | 30 |
| **Hybrid GNN-TAT** | | |
| Training | Epochs | 150 |
| Training | Batch Size | 2 |
| Training | Learning Rate | 0.001 |
| Training | Weight Decay | |
| Training | Early Stop Patience | 50 |
| GNN | Hidden Dimension | 64 |
| GNN | Number of Layers | 2 |
| GNN | Attention Heads (GAT) | 4 |
| GNN | Dropout Rate | 0.1 |
| Temporal | Attention Heads | 4 |
| LSTM | Hidden Dimension | 64 |
| Graph | Edge Threshold | 0.3 |
| Graph | Max Neighbors | 8 |
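The two graph settings in Table 3 (edge threshold 0.3, max 8 neighbors) can be illustrated with a small sketch. We assume here that edges link grid cells whose precipitation series correlate above the threshold, keeping at most the eight strongest neighbors per node; the paper's exact similarity measure is not restated in this section, so treat the criterion and the `build_edges` helper as illustrative:

```python
import numpy as np

def build_edges(series, threshold=0.3, max_neighbors=8):
    """series: (n_cells, n_months) array -> list of directed (i, j) edges.

    Keeps at most `max_neighbors` strongest correlations per node,
    and only those above `threshold` (Table 3: 0.3 and 8).
    """
    corr = np.corrcoef(series)        # pairwise Pearson correlations
    np.fill_diagonal(corr, -np.inf)   # exclude self-loops
    edges = []
    for i in range(corr.shape[0]):
        # strongest candidates first, capped at max_neighbors
        order = np.argsort(corr[i])[::-1][:max_neighbors]
        edges += [(i, int(j)) for j in order if corr[i, j] > threshold]
    return edges

# Toy example: cell 1 is a noisy copy of cell 0, so they should be linked.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 60))          # 5 cells, 60-month history
x[1] = x[0] + 0.1 * rng.normal(size=60)
print(build_edges(x))
```

On the real grid the same routine would run over all 3965 cells, yielding at most 8 × 3965 directed edges.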
Table 4.
Forecast horizon degradation analysis: R² performance from H = 1 to H = 12 (all families).
| Family | Model | Features | H = 1 | H = 3 | H = 6 | H = 9 | H = 12 | Degrad. |
|---|---|---|---|---|---|---|---|---|
| ConvLSTM | ConvLSTM | BASIC | 0.642 | 0.646 | 0.645 | 0.624 | 0.601 | −6.4% |
| ConvLSTM | ConvLSTM_Bidir | BASIC | 0.618 | 0.642 | 0.653 | 0.629 | 0.598 | −3.3% |
| FNO | FNO_ConvLSTM | BASIC | 0.630 | 0.620 | 0.609 | 0.595 | 0.582 | −7.6% |
| FNO | FNO_Pure | BASIC | 0.180 | 0.175 | 0.169 | 0.188 | 0.206 | +14.4% |
| GNN-TAT | GNN_TAT_GAT | BASIC | 0.613 | 0.610 | 0.612 | 0.586 | 0.554 | −9.6% |
| GNN-TAT | GNN_TAT_GCN | PAFC | 0.625 | 0.617 | 0.592 | 0.531 | 0.555 | −11.1% |
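The "Degrad." column in Table 4 is the relative change in R² from H = 1 to H = 12. A minimal sketch (the column is from the table; the function name is ours):

```python
def degradation_pct(r2_h1, r2_h12):
    """Relative R^2 change from H = 1 to H = 12, in percent.

    Negative values mean skill is lost at longer horizons;
    FNO_Pure's positive value means it improves slightly.
    """
    return 100.0 * (r2_h12 - r2_h1) / r2_h1

# ConvLSTM / BASIC row of Table 4: 0.642 -> 0.601
print(round(degradation_pct(0.642, 0.601), 1))  # -6.4, matching the table
```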
Table 5.
Elevation-stratified performance at H = 12 for best ConvLSTM and GNN-TAT configurations.
| Elevation Band | Cells | V2 R² | V4 R² | V2 RMSE (mm) | V4 RMSE (mm) |
|---|---|---|---|---|---|
| <1000 m | 2093 | 0.587 | 0.570 | 91.9 | 93.7 |
| 1000–2000 m | 706 | 0.559 | 0.476 | 79.6 | 86.8 |
| 2000–3000 m | 771 | 0.586 | 0.508 | 60.2 | 65.6 |
| >3000 m | 395 | 0.530 | 0.485 | 53.2 | 55.7 |
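The stratification behind Table 5 amounts to assigning each grid cell to an elevation band and averaging a per-cell score within bands. A sketch under that assumption; the band edges come from the table, while the data and the `band_means` helper are illustrative:

```python
# Band edges from Table 5; the last band is open-ended (>3000 m).
BANDS = [(0, 1000), (1000, 2000), (2000, 3000), (3000, float("inf"))]

def band_means(elevations, scores):
    """Mean per-cell score within each elevation band (None if empty)."""
    out = {}
    for lo, hi in BANDS:
        vals = [s for e, s in zip(elevations, scores) if lo <= e < hi]
        out[f"{lo}-{hi} m"] = sum(vals) / len(vals) if vals else None
    return out

# Illustrative cells: elevation (m) and a per-cell R^2-like score.
elev = [250, 1500, 2500, 3500, 800]
r2 = [0.60, 0.55, 0.58, 0.53, 0.57]
print(band_means(elev, r2))
```

Applied to the full grid, the band populations would match the "Cells" column (2093 + 706 + 771 + 395 = 3965 cells).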
Table 6.
Statistical significance tests: pairwise family comparisons.
| Comparison | Test | Statistic | p-Value | Effect (d) | Significant? |
|---|---|---|---|---|---|
| ConvLSTM vs. GNN-TAT | Mann–Whitney U | 187.0 | 0.0322 | 0.73 | Yes |
| ConvLSTM vs. FNO | Mann–Whitney U | 47.0 | 0.1001 | 0.74 | No |
| GNN-TAT vs. FNO | Mann–Whitney U | 9.0 | 0.0360 | 1.82 | Yes |
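The two statistics in Table 6 can be sketched in pure Python: the Mann–Whitney U statistic (counting favorable pairs across the two samples) and Cohen's d with a pooled standard deviation. The per-family score samples used in the paper are not reproduced here, so the inputs below are illustrative; in practice `scipy.stats.mannwhitneyu` would also supply the p-value:

```python
from statistics import mean, stdev

def mann_whitney_u(a, b):
    """U statistic for sample a vs b: pairs with a_i > b_j count 1, ties 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            u += 1.0 if x > y else (0.5 if x == y else 0.0)
    return u

def cohens_d(a, b):
    """Cohen's d effect size using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = (((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
              / (na + nb - 2)) ** 0.5
    return (mean(a) - mean(b)) / pooled

# Illustrative R^2 samples for two families (not the paper's data).
a = [0.60, 0.59, 0.58, 0.61]
b = [0.55, 0.52, 0.56, 0.50]
print(mann_whitney_u(a, b), round(cohens_d(a, b), 2))
```

Here every value of `a` exceeds every value of `b`, so U reaches its maximum of len(a) × len(b) = 16.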