Abstract
Sensor networks generate high-dimensional, temporally dependent data across healthcare, environmental monitoring, and energy management, demanding robust machine learning for reliable forecasting. While gradient boosting methods have emerged as powerful tools for sensor-based regression, systematic evaluation under realistic deployment conditions remains limited. This work provides a comprehensive review and empirical benchmark of boosting algorithms spanning classical methods (AdaBoost and GBM), modern gradient boosting frameworks (XGBoost, LightGBM, and CatBoost), and adaptive extensions for streaming data and hybrid architectures. We conduct a rigorous cross-domain evaluation on continuous glucose monitoring, urban air-quality forecasting, and building-energy prediction, assessing not only predictive accuracy but also robustness under sensor degradation, temporal generalization through proper time-series validation, feature-importance stability, and computational efficiency. Our analysis reveals fundamental trade-offs that challenge conventional assumptions. Algorithmic sophistication yields diminishing returns when intrinsic predictability collapses due to exogenous forcing. Random cross-validation (CV) systematically overestimates performance through temporal leakage, with the magnitude of the bias varying substantially across domains. Calibration drift emerges as the dominant failure mode, causing catastrophic degradation across all static models regardless of sophistication. Importantly, feature-importance stability does not guarantee predictive reliability. We synthesize these findings into actionable guidelines for algorithm selection, hyperparameter configuration, and deployment strategies, and identify critical open challenges, including uncertainty quantification, physics-informed architectures, and privacy-preserving distributed learning.