Statistics for High-Dimensional Data

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "D1: Probability and Statistics".

Deadline for manuscript submissions: 31 August 2025 | Viewed by 3098

Special Issue Editor


Prof. Dr. Andreas Artemiou
Guest Editor
Department of Information Technologies, University of Limassol, Nicosia, Cyprus
Interests: high-dimensional statistics; supervised and unsupervised dimension reduction; computational statistics; machine learning and text data analysis

Special Issue Information

Dear Colleagues,

We live in an era in which we have the tools, the computing power, and therefore the ability to collect data that are both massive and high-dimensional. Classical statistical methodology, however, was developed for settings in which the sample size n is much larger than the dimension p, and it is therefore often unsuitable for high-dimensional cases in which n is comparable to p, or even smaller than p. In recent years, there has been an explosion of research aimed at closing this methodological gap and developing algorithms suited to the low-sample, high-dimensional setting.

In this Special Issue, we invite contributions from specialists in the area of statistics for high-dimensional data. Their research could address estimation, hypothesis testing, regression, variable selection, dimension reduction, clustering, classification, or any other topic of modern multivariate statistics developed to address the issues that arise when data are high-dimensional. We welcome methodological papers, whether they have a computational flavor or provide a theoretical treatment of the methodology.

Prof. Dr. Andreas Artemiou
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • high dimension
  • large p, small n
  • low-sample, high-dimensional setting
  • dimension reduction
  • variable selection

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (2 papers)


Research

16 pages, 313 KiB  
Article
Consistent Estimators of the Population Covariance Matrix and Its Reparameterizations
by Chia-Hsuan Tsai and Ming-Tien Tsai
Mathematics 2025, 13(2), 191; https://doi.org/10.3390/math13020191 - 8 Jan 2025
Viewed by 689
Abstract
For the high-dimensional covariance estimation problem, when lim_{n→∞} p/n = c ∈ (0, 1), the orthogonally equivariant estimator of the population covariance matrix proposed by Tsai and Tsai exhibits certain optimal properties. Under some regularity conditions, the authors showed that their novel estimators of eigenvalues are consistent with the eigenvalues of the population covariance matrix. In this paper, under the multinormal setup, we show that they are consistent estimators of the population covariance matrix under a high-dimensional asymptotic setup. We also show that the novel estimator is the MLE of the population covariance matrix when c ∈ (0, 1). The novel estimator is used to establish that the optimal decomposite T²-test has been retained. A high-dimensional statistical hypothesis testing problem is used to carry out statistical inference for high-dimensional principal component analysis-related problems without the sparsity assumption. In the final section, we discuss the situation in which p > n, especially for high-dimensional low-sample-size categorical data models in which p ≫ n.
(This article belongs to the Special Issue Statistics for High-Dimensional Data)
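As a brief illustration of the asymptotic regime this paper studies, the following minimal Python sketch (an illustrative toy example, not the estimator proposed by Tsai and Tsai) shows how, when p/n → c ∈ (0, 1), the eigenvalues of the sample covariance matrix spread over the Marchenko–Pastur interval even when the population covariance is the identity, and how a generic orthogonally equivariant estimator keeps the sample eigenvectors while modifying only the eigenvalues; the shrinkage rule used here is a placeholder assumption, not the paper's construction.

```python
# Toy illustration: sample eigenvalues are inconsistent when p/n -> c in (0, 1),
# and an orthogonally equivariant estimator retains the sample eigenvectors
# while adjusting only the eigenvalues. The shrinkage weight is illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 200                      # c = p/n = 0.5
X = rng.standard_normal((n, p))      # mean-zero data, population covariance = identity
S = X.T @ X / n                      # sample covariance matrix (no centering needed)
eigvals = np.linalg.eigvalsh(S)

c = p / n
mp_lower, mp_upper = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
print(f"sample eigenvalue range:  [{eigvals.min():.3f}, {eigvals.max():.3f}]")
print(f"Marchenko-Pastur support: [{mp_lower:.3f}, {mp_upper:.3f}]")

# Generic orthogonally equivariant estimator: keep the eigenvectors of S and
# replace each eigenvalue by a shrunken value (simple linear shrinkage toward
# the grand mean, as a stand-in for a consistent eigenvalue estimator).
vals, vecs = np.linalg.eigh(S)
alpha = 0.5                          # illustrative shrinkage weight, not from the paper
shrunk = alpha * vals + (1 - alpha) * vals.mean()
Sigma_hat = vecs @ np.diag(shrunk) @ vecs.T
```

In the paper's setting, the point of a consistent orthogonally equivariant estimator is precisely to replace such ad hoc shrinkage with eigenvalue estimates that converge to the population eigenvalues.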
30 pages, 3813 KiB  
Article
Matrix Factorization and Prediction for High-Dimensional Co-Occurrence Count Data via Shared Parameter Alternating Zero Inflated Gamma Model
by Taejoon Kim and Haiyan Wang
Mathematics 2024, 12(21), 3365; https://doi.org/10.3390/math12213365 - 27 Oct 2024
Cited by 1 | Viewed by 1729
Abstract
High-dimensional sparse matrix data frequently arise in various applications. A notable example is the weighted word–word co-occurrence count data, which summarizes the weighted frequency of word pairs appearing within the same context window. This type of data typically contains highly skewed non-negative values with an abundance of zeros. Another example is the co-occurrence of item–item or user–item pairs in e-commerce, which also generates high-dimensional data. The objective is to utilize these data to predict the relevance between items or users. In this paper, we assume that items or users can be represented by unknown dense vectors. The model treats the co-occurrence counts as arising from zero-inflated Gamma random variables and employs cosine similarity between the unknown vectors to summarize item–item relevance. The unknown values are estimated using the shared parameter alternating zero-inflated Gamma regression models (SA-ZIG). Both canonical link and log link models are considered. Two parameter updating schemes are proposed, along with an algorithm to estimate the unknown parameters. Convergence analysis is presented analytically. Numerical studies demonstrate that the SA-ZIG using Fisher scoring without learning rate adjustment may fail to find the maximum likelihood estimate. However, the SA-ZIG with learning rate adjustment performs satisfactorily in our simulation studies.
(This article belongs to the Special Issue Statistics for High-Dimensional Data)
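As a rough sketch of the two ingredients described in this abstract (not the authors' SA-ZIG implementation), the Python snippet below combines cosine similarity between hypothetical latent item vectors with a zero-inflated Gamma log-likelihood for zero-heavy, skewed co-occurrence values; all names and parameter values are illustrative assumptions.

```python
# Toy sketch of the model's building blocks: (i) cosine similarity between
# unknown dense item vectors as the relevance summary, and (ii) a zero-inflated
# Gamma likelihood for skewed, zero-heavy co-occurrence data.
# Names (U, pi0, shape, rate) are illustrative, not from the paper.
import numpy as np
from scipy.stats import gamma

def cosine_similarity(U):
    """Pairwise cosine similarity between rows of the latent item matrix U."""
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    V = U / np.clip(norms, 1e-12, None)
    return V @ V.T

def zig_loglik(y, pi0, shape, rate):
    """Zero-inflated Gamma log-likelihood for a single observation y >= 0."""
    if y == 0:
        return np.log(pi0)                      # zero with probability pi0
    return np.log1p(-pi0) + gamma.logpdf(y, a=shape, scale=1.0 / rate)

# Toy usage: 5 items with 3-dimensional latent vectors.
rng = np.random.default_rng(1)
U = rng.standard_normal((5, 3))
R = cosine_similarity(U)                        # item-item relevance in [-1, 1]
print(zig_loglik(0.0, pi0=0.6, shape=2.0, rate=1.5))
print(zig_loglik(3.2, pi0=0.6, shape=2.0, rate=1.5))
```

In the full SA-ZIG model described in the abstract, the Gamma component is linked (via a canonical or log link) to the cosine similarity, and the latent vectors and regression parameters are updated alternately.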