
Identifying the Informational/Signal Dimension in Principal Component Analysis

Dipartimento di Matematica, Sapienza Università di Roma, 00185 Roma, Italy
Departamento de Ecologia, Universidade Federal do Rio Grande do Sul, 91501-970 Porto Alegre, Brazil
Author to whom correspondence should be addressed.
Mathematics 2018, 6(11), 269
Received: 23 October 2018 / Revised: 11 November 2018 / Accepted: 14 November 2018 / Published: 20 November 2018
(This article belongs to the Special Issue New Paradigms and Trends in Quantitative Ecology)
Identifying a reduced-dimensional representation of the data is among the main issues of exploratory multidimensional data analysis, and several solutions have been proposed in the literature, depending on the method. Principal Component Analysis (PCA) is the method that has received the most attention thus far: several identification methods, the so-called stopping rules, have been proposed, giving very different results in practice, and some comparative studies have been carried out. Some inconsistencies in the previous studies led us to clarify the distinction between signal and noise in PCA, and its limits, and to propose a new testing method. This consists of producing simulated data according to a predefined eigenvalue structure that includes zero eigenvalues. From random populations built according to several such structures, reduced-size samples were extracted, and different levels of random normal noise were added to them. This controlled introduction of noise allows a clear distinction between expected signal and noise, the latter being relegated to the non-zero eigenvalues in the samples that correspond to zero eigenvalues in the population. With this new method, we tested the performance of ten different stopping rules. For every method, every structure, and every noise level, both power (the ability to correctly identify the expected dimension) and type-I error (the detection of a dimension composed only of noise) were measured, by counting the relative frequencies with which the smallest non-zero eigenvalue in the population was recognized as signal in the samples and with which the largest zero eigenvalue was recognized as noise, respectively. In this way, the behaviour of the examined methods is clear and their comparison and evaluation are possible. The reported results show that both Rencher's generalization of Bartlett's test and Pillar's bootstrap method perform much better than all the others: both show reasonable power, decreasing with noise, and very good type-I error. Thus, more than the others, these methods deserve to be adopted.
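The testing scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the population and sample sizes, the number of replicates, and the noise level are all assumptions chosen for illustration. The stopping rule used here (retain components whose eigenvalue exceeds the mean eigenvalue, often called the Kaiser-Guttman rule) is only a stand-in and is not claimed to be one of the ten rules examined. Type-I error is taken here as the frequency with which the first noise-only dimension is retained as signal.

```python
import numpy as np


def make_population(eigenvalues, n_pop, rng):
    """Draw a population whose covariance has (approximately) the given
    eigenvalues; zero eigenvalues give exactly rank-deficient data."""
    p = len(eigenvalues)
    # Random orthogonal basis from the QR decomposition of a Gaussian matrix.
    q, _ = np.linalg.qr(rng.standard_normal((p, p)))
    # Independent scores scaled so that column j has variance eigenvalues[j].
    scores = rng.standard_normal((n_pop, p)) * np.sqrt(eigenvalues)
    return scores @ q.T


def mean_eigenvalue_rule(sample_eigs):
    """Stand-in stopping rule (assumption): number of eigenvalues above the mean."""
    return int(np.sum(sample_eigs > sample_eigs.mean()))


def evaluate_rule(rule, eigenvalues, noise_sd,
                  n_pop=10000, n_sample=50, n_rep=200, seed=0):
    """Estimate (power, type-I error) of `rule` for one eigenvalue structure.

    `eigenvalues` must be sorted in decreasing order with the zeros last.
    """
    rng = np.random.default_rng(seed)
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    p = len(eigenvalues)
    k = int(np.sum(eigenvalues > 0))  # expected signal dimension
    population = make_population(eigenvalues, n_pop, rng)

    hits = 0          # k-th component retained (smallest non-zero eigenvalue)
    false_alarms = 0  # (k+1)-th component retained (largest zero eigenvalue)
    for _ in range(n_rep):
        idx = rng.choice(n_pop, size=n_sample, replace=False)
        noise = rng.normal(0.0, noise_sd, size=(n_sample, p))
        sample = population[idx] + noise
        # Eigenvalues of the sample covariance matrix, largest first.
        eigs = np.linalg.eigvalsh(np.cov(sample, rowvar=False))[::-1]
        retained = rule(eigs)
        hits += retained >= k
        false_alarms += retained > k
    return hits / n_rep, false_alarms / n_rep


if __name__ == "__main__":
    power, type1 = evaluate_rule(mean_eigenvalue_rule,
                                 [4.0, 3.0, 2.0, 1.0, 0.0, 0.0],
                                 noise_sd=0.2)
    print(f"power={power:.2f}  type-I error={type1:.2f}")
```

Any other stopping rule can be plugged in as a function that maps the decreasing sample eigenvalues to a number of retained components, so the same loop compares rules across structures and noise levels.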
Keywords: Principal Component Analysis; stopping rules; simulated data; rules comparison
MDPI and ACS Style

Camiz, S.; Pillar, V.D. Identifying the Informational/Signal Dimension in Principal Component Analysis. Mathematics 2018, 6, 269.
