A Selective Overview of Quantile Regression for Large-Scale Data

Shanshan Wang; Wei Cao; Xiaoxue Hu; Hanyu Zhong; Weixi Sun

doi:10.3390/math13050837

¹ School of Economics and Management, Beihang University, Beijing 100191, China

² MOE Key Laboratory of Complex System Analysis and Management Decision, Beihang University, Beijing 100191, China

³ Sino-French Engineering School, Beihang University, Beijing 100191, China

* Author to whom correspondence should be addressed.

Mathematics 2025, 13(5), 837; https://doi.org/10.3390/math13050837

This article belongs to the Special Issue Computational Statistics, Data Analysis and Applications

Abstract

Large-scale data, characterized by heterogeneity due to heteroskedastic variance or inhomogeneous covariate effects, arises in diverse fields of scientific research and technological development. Quantile regression (QR) is a valuable tool for detecting heteroskedasticity, and numerous QR statistical methods for large-scale data have been rapidly developed. This paper provides a selective review of recent advances in QR theory, methods, and implementations, particularly in the context of massive and streaming data. We focus on three key strategies for large-scale QR analysis: (1) distributed computing, (2) subsampling methods, and (3) online updating. The main contribution of this paper is a comprehensive review of existing work and advancements in these areas, addressing challenges such as managing the non-smooth QR loss function, developing distributed and online updating formulations, and conducting statistical inference. Finally, we highlight several issues that require further study.

Keywords: large-scale data; quantile regression; distributed computing; subsampling methods; renewable estimation
