Statistical Analysis and Data Science for Complex Data, 2nd Edition

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "D1: Probability and Statistics".

Deadline for manuscript submissions: 31 December 2026

Special Issue Editor


Guest Editor
Department of Statistics, National Chengchi University, Taipei 116, Taiwan
Interests: graphical models; high-dimensional data analysis; machine learning; measurement error and error classification; survival analysis

Special Issue Information

Dear Colleagues,

Thanks to the rapid development of technology, datasets can now be collected easily in many fields, such as biology and manufacturing. Typically, a given dataset has (i) a large sample size or (ii) a large number of variables, yielding so-called big data or high-dimensional data, respectively. However, only a few samples or variables are usually informative for data analysis. In addition, datasets often contain complex structures caused by the collection procedure, such as censoring, measurement error, or missingness. With noisy data, it becomes more challenging to choose informative subdata, detect important variables, or conduct analyses. In light of these challenges, this Special Issue will provide a platform for publishing novel statistical methods and algorithms that handle these complex structures in various research fields. Topics of interest for this Special Issue include biostatistics, bioinformatics, causal inference, statistical process control, and survival analysis.

Dr. Li-pang Chen
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • algorithm
  • big data
  • high dimensionality
  • noisy data

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)

Research

14 pages, 7420 KB  
Article
Research on Spare Part Activation Strategy and Reliability Index Calculation of Cold Standby Voting Systems Under Weibull Distribution
by Ziwen Yang, Xiaochuan Ai, Longlong Liu and Jun Wu
Mathematics 2026, 14(9), 1533; https://doi.org/10.3390/math14091533 - 30 Apr 2026
Abstract
This study investigates the impact of standby activation strategies on system reliability. The results show that a delayed activation strategy effectively improves system reliability. Additionally, to tackle the difficulty of deriving analytical solutions for reliability metrics under the Weibull distribution, a non-homogeneous Markov model based on the delayed activation strategy is introduced. The system’s residual life is modeled computationally using the state transition method. The numerical results suggest that the proposed method aligns closely with Monte Carlo simulations. It significantly improves computational efficiency while maintaining high accuracy, thus confirming its effectiveness.
(This article belongs to the Special Issue Statistical Analysis and Data Science for Complex Data, 2nd Edition)

22 pages, 4293 KB  
Article
Modeling Comovement and Asynchrony Between Fossil and Clean Energy Markets Under Climate Risks Using Quantile-on-Quantile Connectedness
by Mengyue Liu, Rufei Zhang, Yuanyuan Jiang and Wang Gao
Mathematics 2026, 14(9), 1486; https://doi.org/10.3390/math14091486 - 28 Apr 2026
Abstract
This paper examines how climate risk shapes the connectedness between fossil and clean energy markets. To do so, we use the quantile-on-quantile connectedness framework and extend it to the frequency domain to capture return and volatility spillovers across different market conditions and horizons. We further employ an autoregressive distributed lag (ARDL) model to assess the effects of physical and transition climate risks on the estimated connectedness structure. The results show that fossil and clean energy markets exhibit both comovement and asynchrony, although comovement is dominant overall. After decomposing connectedness into short-term and long-term components, we find that return spillovers are mainly driven by short-term dynamics, whereas volatility spillovers are mainly driven by long-term dynamics. In addition, climate risks significantly affect the connectedness structure between the two markets, and policy-related risk and global warming risk are particularly important in strengthening asynchronous features. These findings suggest that climate risk influences not only the strength of spillovers, but also the way fossil and clean energy markets interact under different market states.
(This article belongs to the Special Issue Statistical Analysis and Data Science for Complex Data, 2nd Edition)

30 pages, 3285 KB  
Article
Causal Identification of Artificial Intelligence Effects on Enterprise Labor Structure via a Partially Linear Double Machine Learning Estimator: Evidence from High-Dimensional Panel Data
by Huali Liu, Wenjie Li, Yankai Lin and Zne-Jung Lee
Mathematics 2026, 14(8), 1312; https://doi.org/10.3390/math14081312 - 14 Apr 2026
Abstract
This study develops a semiparametric causal inference framework to quantify the effect of Artificial Intelligence (AI) adoption on enterprise labor structure under high-dimensional confounding. We employ the Double Machine Learning (DML) estimator, which combines Neyman orthogonality and cross-fitting to achieve reliable causal identification in settings where conventional regression methods are prone to bias from high-dimensional controls and nonlinear confounding. Nuisance functions are estimated using Lasso and Random Forests, enabling flexible modeling of complex relationships between control variables and outcomes. Using an unbalanced panel of Chinese A-share listed companies spanning 2006 to 2023, we identify a significant positive average treatment effect of AI adoption on the share of high-skilled labor (estimate: 0.118; 95% CI: [0.073, 0.163]), indicating that complementarity between AI and skilled workers dominates substitution at the firm level. Heterogeneity analysis reveals that the effect is stronger in manufacturing (0.183) than in services (0.071), and more pronounced in Eastern China (0.142) than in Central and Western regions (0.079). Quantile regression further shows that the complementarity effect intensifies at higher skill quantiles. A Panel Smooth Transition Regression (PSTR) model identifies a digitalization threshold beyond which AI–skill complementarity further strengthens. Mediation analysis confirms that productivity enhancement, digital transformation, and innovation activities together account for the majority of the total effect, with productivity improvement alone contributing approximately 34%. Placebo tests and propensity score weighting validate the robustness of our findings.
(This article belongs to the Special Issue Statistical Analysis and Data Science for Complex Data, 2nd Edition)

36 pages, 2186 KB  
Article
On a Beta-Gamma Discrete Distribution for Thunderstorm Count Modeling with Risk Analysis
by Tassaddaq Hussain, Enrique Villamor, Mohammad Shakil, Mohammad Ahsanullah and B. M. Golam Kibria
Mathematics 2025, 13(24), 3913; https://doi.org/10.3390/math13243913 - 7 Dec 2025
Abstract
Risk management is vital for financial institutions to evaluate and mitigate potential losses. Thunderstorm count modeling with risk analysis is used by various sectors, such as insurance and utility companies, to forecast storm recurrence, analyze risk, and estimate financial losses based on factors like wind speed, hail size, and tornado potential. This paper introduces a novel discrete distribution, the Beta-Gamma Discrete (BGD) distribution, designed for modeling count data that inherently excludes zero values. Developed through the compounding of a discrete gamma distribution with a beta distribution, the BGD offers significant flexibility in handling overdispersion and complex data characteristics. The study derives key statistical properties of the BGD, including its probability mass function, moments, hazard rate function, moment generating function, and mean residual life. A comprehensive characterization theorem is also established. The model’s practical utility is demonstrated through an application to thunderstorm event data from the Kennedy Space Center (KSC), where the frequency of thunderstorms per event is a critical operational concern. The performance of the BGD is thoroughly assessed against established zero-truncated models, namely the Zero-Truncated Generalized Poisson (ZTGP), Size-Biased Negative Binomial (SBNB), and Zero-Truncated Generalized Negative Binomial (ZTGNB), using evaluation criteria such as the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Chi-square goodness-of-fit, and the Vuong test. The results consistently show that the BGD provides a superior and more accurate fit for the thunderstorm data, establishing it as a robust and effective tool for modeling positive count data with risk analysis in meteorological and other applied contexts, and one that may help NASA and other space agencies.
(This article belongs to the Special Issue Statistical Analysis and Data Science for Complex Data, 2nd Edition)