Hydrological calibration in data-scarce catchments is challenged by non-stationary regimes, fragmented data, and systematic measurement errors. Conventional calibration approaches often assume continuous records and rely on standard performance metrics, which can bias calibration toward high flows and exacerbate parameter equifinality, ultimately reducing robustness under data limitations. This study provides a systematic comparison of three calibration strategies: Kling–Gupta Efficiency (KGE), a non-parametric variant (R_NP), and Flow Duration Curve (FDC)-based calibration, together with their time-consistent counterparts (SKGE, SR_NP, and SRMSE). All schemes are implemented for the lumped HBV-type TUW model across nine catchments in southern Italy and evaluated using independent metrics targeting overall hydrograph agreement, high-flow behavior, and FDC quantile matching (Q5–Q95). The results reveal that the time-consistent KGE-based strategy excels in calibration (NSE = 0.56, RMSE = 4.65 m³/s) but declines notably in validation (NSE = 0.40, RMSE = 3.91 m³/s), indicating sensitivity to non-stationarity. The R_NP-based approach, leveraging its non-parametric structure, demonstrates enhanced validation robustness (NSE = 0.51, RMSE = 3.60 m³/s) and low-flow accuracy (NSE_lnQ = 0.30). The SR_NP variant further improves validation performance (NSE = 0.52, RMSE = 3.42 m³/s) and achieves superior low-flow performance (NSE_lnQ = 0.48). The FDC-based strategy effectively reproduces flow distributions during calibration (NSE = 0.41, minimal PBIAS = −0.03%) but exhibits limited temporal transferability (validation NSE = 0.25, RMSE = 4.50 m³/s). Time-consistent variants reduce parameter dispersion by approximately 2–8% (relative to full-period calibration) and improve validation metrics by 5–15% across all catchments. Overall, time-consistent calibration provides a practical pathway to increase robustness under non-stationary, data-scarce Mediterranean conditions, highlighting a systematic trade-off between calibration accuracy and validation reliability.
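
For readers less familiar with the metrics named in this abstract, the following is a minimal Python sketch of NSE, NSE on log-transformed flows, and the standard KGE formulation. It is not the study's implementation. The function names, the `eps` offset in `log_nse`, and in particular `split_kge` are illustrative assumptions: `split_kge` simply averages KGE over user-supplied sub-periods, which is one plausible reading of a "time-consistent" objective and may differ from the definition of SKGE used in the paper.

```python
import math
from statistics import mean, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mu = mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mu) ** 2 for o in obs)
    return 1.0 - num / den


def log_nse(obs, sim, eps=1e-6):
    """NSE on log flows, which weights low flows more heavily.

    eps is an assumed small offset that guards against log(0).
    """
    log_obs = [math.log(o + eps) for o in obs]
    log_sim = [math.log(s + eps) for s in sim]
    return nse(log_obs, log_sim)


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    r = pearson(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = mean(sim) / mean(obs)       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def split_kge(obs, sim, period_ids):
    """Hypothetical time-consistent variant: mean KGE over sub-periods.

    period_ids labels each time step (for example, by hydrological year).
    This is an assumption, not the paper's confirmed SKGE definition.
    """
    groups = {}
    for o, s, p in zip(obs, sim, period_ids):
        obs_p, sim_p = groups.setdefault(p, ([], []))
        obs_p.append(o)
        sim_p.append(s)
    return mean(kge(o, s) for o, s in groups.values())
```

Averaging a score over sub-periods, rather than computing it once over the full record, penalizes parameter sets that fit some years well and others poorly. Each sub-period needs non-constant observed and simulated flows, otherwise the correlation and variability terms are undefined.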