Sensors
  • Review
  • Open Access

30 November 2025

Machine Learning for Sensor Analytics: A Comprehensive Review and Benchmark of Boosting Algorithms in Healthcare, Environmental, and Energy Applications

Department of Decision Sciences and Marketing, Adelphi University, 1 South Avenue, Garden City, NY 11530, USA
* Author to whom correspondence should be addressed.
Sensors 2025, 25(23), 7294; https://doi.org/10.3390/s25237294
This article belongs to the Special Issue Feature Review Papers in Intelligent Sensors

Abstract

Sensor networks generate high-dimensional, temporally dependent data across healthcare, environmental monitoring, and energy management, demanding robust machine learning for reliable forecasting. While gradient boosting methods have emerged as powerful tools for sensor-based regression, systematic evaluation under realistic deployment conditions remains limited. This work provides a comprehensive review and empirical benchmark of boosting algorithms spanning classical methods (AdaBoost and GBM), modern gradient boosting frameworks (XGBoost, LightGBM, and CatBoost), and adaptive extensions for streaming data and hybrid architectures. We conduct a rigorous cross-domain evaluation on continuous glucose monitoring, urban air-quality forecasting, and building-energy prediction, assessing not only predictive accuracy but also robustness under sensor degradation, temporal generalization through proper time-series validation, feature-importance stability, and computational efficiency. Our analysis reveals fundamental trade-offs that challenge conventional assumptions. Algorithmic sophistication yields diminishing returns when intrinsic predictability collapses due to exogenous forcing. Random cross-validation (CV) systematically overestimates performance through temporal leakage, with magnitudes varying substantially across domains. Calibration drift emerges as the dominant failure mode, causing catastrophic degradation across all static models regardless of sophistication. Importantly, feature-importance stability does not guarantee predictive reliability. We synthesize these findings into actionable guidelines for algorithm selection, hyperparameter configuration, and deployment strategies, and identify critical open challenges, including uncertainty quantification, physics-informed architectures, and privacy-preserving distributed learning.
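The abstract's claim that random CV overestimates performance through temporal leakage can be illustrated with a minimal sketch. This is our illustration, not code from the paper: it assumes scikit-learn's GradientBoostingRegressor as a stand-in for the benchmarked boosting frameworks and a synthetic autocorrelated series in place of the real sensor datasets. Shuffled K-fold places near-duplicate neighboring time steps in both train and test folds, while forward-chaining TimeSeriesSplit evaluates only on future data.

```python
# Illustrative sketch (assumed setup, not from the paper): contrast shuffled
# K-fold CV with forward-chaining time-series CV on an autocorrelated signal.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(0)

# Synthetic sensor-like series: slow drift plus AR(1) noise.
n = 2000
drift = np.linspace(0.0, 5.0, n)
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = 0.95 * noise[t - 1] + rng.normal(scale=0.3)
y = drift + noise

# Lag features: predict y[t] from the previous `lags` observations.
lags = 5
X = np.column_stack([y[i : n - lags + i] for i in range(lags)])
target = y[lags:]

model = GradientBoostingRegressor(n_estimators=200, max_depth=3)

# Shuffled K-fold mixes past and future samples across folds (leakage).
random_cv = cross_val_score(
    model, X, target, cv=KFold(n_splits=5, shuffle=True, random_state=0), scoring="r2"
)
# TimeSeriesSplit always trains on the past and tests on the future.
temporal_cv = cross_val_score(
    model, X, target, cv=TimeSeriesSplit(n_splits=5), scoring="r2"
)

print(f"shuffled K-fold R^2:     {random_cv.mean():.3f}")  # typically optimistic
print(f"time-series split R^2:   {temporal_cv.mean():.3f}")  # more realistic
```

On strongly autocorrelated data of this kind, the shuffled score is usually noticeably higher than the forward-chaining score, which is the leakage effect the benchmark quantifies across domains.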
