Stats, Volume 4, Issue 3

September 2021 - 13 articles

Cover Story: High-dimensional classification studies have become widespread across various domains. In this paper, we propose a robust and sparse estimator for logistic regression models that simultaneously tackles the presence of outliers and/or irrelevant features. We rely on L0-constraints and mixed-integer conic programming techniques to solve the underlying double combinatorial problem in a framework that allows one to pursue optimality guarantees. Our proposal is used to investigate the main drivers of honeybee (Apis mellifera) loss in Pennsylvania through annual winter loss survey data, where it produces a more interpretable classification model and provides evidence for several outlying observations. In addition, numerical simulations show that our approach outperforms other methods across most performance measures in the considered settings.

Articles (13)

  • Article
  • Open Access
3,427 Views
15 Pages

Curve Registration of Functional Data for Approximate Bayesian Computation

  • Anthony Ebert,
  • Kerrie Mengersen,
  • Fabrizio Ruggeri and
  • Paul Wu

7 September 2021

Approximate Bayesian computation is a likelihood-free inference method which relies on comparing model realisations to observed data with informative distance measures. We obtain functional data that are not only subject to noise along their y axis b...

  • Article
  • Open Access
16 Citations
4,671 Views
17 Pages

6 September 2021

This paper presents new perspectives and methodological instruments for verifying the validity of Benford’s law for a large given dataset. To this aim, we first propose new general tests for checking the statistical conformity of a given dataset with...

  • Article
  • Open Access
1 Citation
3,218 Views
20 Pages

3 September 2021

The asymptotic distribution is presented for the linear instrumental variables model estimated with a ridge penalty and a prior where the tuning parameter is selected with a holdout sample. The structural parameters and the tuning parameter are estim...

  • Article
  • Open Access
13 Citations
4,342 Views
24 Pages

2 September 2021

Functional data analysis techniques, such as penalized splines, have become common tools used in a variety of applied research settings. Penalized spline estimators are frequently used in applied research to estimate unknown functions from noisy data...

  • Article
  • Open Access
3 Citations
4,941 Views
17 Pages

Robust Variable Selection with Optimality Guarantees for High-Dimensional Logistic Regression

  • Luca Insolia,
  • Ana Kenney,
  • Martina Calovi and
  • Francesca Chiaromonte

31 August 2021

High-dimensional classification studies have become widespread across various domains. The large dimensionality, coupled with the possible presence of data contamination, motivates the use of robust, sparse estimation methods to improve model interpr...

  • Article
  • Open Access
12 Citations
4,662 Views
19 Pages

31 August 2021

The development of a country involves directly investing in the education of its citizens. Learning analytics/educational data mining (LA/EDM) allows access to big observational structured/unstructured data captured from educational settings and reli...

  • Article
  • Open Access
2 Citations
3,600 Views
15 Pages

12 August 2021

Longitudinal data is encountered frequently in many healthcare research areas to include the critical care environment. Repeated measures from the same subject are expected to correlate with each other. Models with binary outcomes are commonly used i...

  • Article
  • Open Access
4 Citations
3,644 Views
16 Pages

Generalized Cardioid Distributions for Circular Data Analysis

  • Fernanda V. Paula,
  • Abraão D. C. Nascimento,
  • Getúlio J. A. Amaral and
  • Gauss M. Cordeiro

11 August 2021

The Cardioid (C) distribution is one of the most important models for modeling circular data. Although some of its structural properties have been derived, this distribution is not appropriate for asymmetry and multimodal phenomena in the circle, and...

  • Article
  • Open Access
4 Citations
4,219 Views
18 Pages

Smoothing in Ordinal Regression: An Application to Sensory Data

  • Ejike R. Ugba,
  • Daniel Mörlein and
  • Jan Gertheiss

21 July 2021

The so-called proportional odds assumption is popular in cumulative, ordinal regression. In practice, however, such an assumption is sometimes too restrictive. For instance, when modeling the perception of boar taint on an individual level, it turns...

  • Article
  • Open Access
6 Citations
3,141 Views
14 Pages

Parameter Choice, Stability and Validity for Robust Cluster Weighted Modeling

  • Andrea Cappozzo,
  • Luis Angel García Escudero,
  • Francesca Greselin and
  • Agustín Mayo-Iscar

6 July 2021

Statistical inference based on the cluster weighted model often requires some subjective judgment from the modeler. Many features influence the final solution, such as the number of mixture components, the shape of the clusters in the explanatory var...

Stats - ISSN 2571-905X