Article
Peer-Review Record

A Comparison of Methods for Determining the Number of Factors to Retain in Exploratory Factor Analysis for Categorical Indicator Variables

Psychol. Int. 2025, 7(1), 3; https://doi.org/10.3390/psycholint7010003
by Holmes Finch
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 26 November 2024 / Revised: 9 January 2025 / Accepted: 14 January 2025 / Published: 17 January 2025

Round 1

Reviewer 1 Report

This manuscript presents a comparative analysis of various methods for determining the number of factors in Exploratory Factor Analysis (EFA) for ordered categorical data. The study includes recently developed techniques for factor identification, and the findings are of interest to both empirical and methodological researchers in the field. While the paper makes a valuable contribution, I have several comments and suggestions that may help enhance the manuscript's clarity and rigor (see detailed comments below).

1. Please provide a rationale for selecting the proposed fit measures used to assess the number of factors to retain. Also, consider discussing other established fit measures, such as AIC and CFI, which are commonly employed for model comparison. Explaining why certain measures were chosen over others would strengthen the methodological foundation of the study.

2. Page 6: The statement “BEFA does not allow indicators to have cross-loadings” is confusing. From a BEFA perspective, all potential elements in the loading matrix, including cross-loadings, should be explored. Please clarify how BEFA was implemented in your study in terms of model specification (how loadings are estimated and any constraints assigned to the parameters?).

3. For replicability, it is essential to specify all prior settings used in the Bayesian models. Detailed information on prior distributions, hyperparameters, and any assumptions made should be included in the manuscript. In addition, for Bayesian analyses, please include information regarding convergence, burn-ins, samples, etc.

4. It is unclear how the number of factors was determined for each method. Some methods may automatically select the optimal number of factors, while others might require an iterative process involving model comparison. Please elaborate on the procedure used to select the number of factors for each method in the simulation, and detail the criteria or thresholds applied during this selection process. A table could be useful to summarize this information.

5. The two sets of figures (Figures 1-4 and Figures 5-7) present the main evaluation outcomes: the proportion of correctly identified factors and the mean number of factors. To enhance readability and consistency:

- Arrange the graphs using the same grid layout.

- Ensure that the symbols representing each method are consistent across all figures to facilitate easier comparison.

6. The results about the one-factor model are currently presented in two separate figures (Figures 8-9). Combining these into a single figure would improve readability of your findings.

7. It would be insightful to compare and discuss how varying the number of categories impacts the performance of each method. Currently, there is no comparison of results between 2 and 4 categories. I suggest adding figures or tables comparing results for 2 versus 4 categories.

8. The study focuses on ordered categorical variables with categories limited to 2 and 4. In practice, five-category variables, including a neutral category, are more common. Additionally, threshold asymmetry could influence the results. Please provide a rationale for selecting only 2 and 4 categories for the ordered categorical variables. Addressing these points will clarify the generalizability of findings.

 

9. Regularized factor analysis is an important approach for simultaneously selecting loadings and latent factors within both frequentist and Bayesian frameworks. It would be beneficial to discuss these methods in the discussion section.

Author Response

1. Please provide a rationale for selecting the proposed fit measures used to assess the number of factors to retain. Also, consider discussing other established fit measures, such as AIC and CFI, which are commonly employed for model comparison. Explaining why certain measures were chosen over others would strengthen the methodological foundation of the study.

Response

We agree with the reviewer that a rationale for selection of the proposed methods is needed.  To that end, we have included the following text on page 1 of the revision. 

The methods selected for inclusion in this study were chosen based upon their proven performance in prior research (e.g., parallel analysis, minimum average partial, and exploratory graph analysis) or because they are newer methods that have shown promise in prior work but have not been studied in the context of categorical indicator variables (e.g., the next eigenvalue sufficiency test, out-of-sample prediction error, and Bayesian EFA). Prior research and the relative merits of these methods are described in more detail below.

In addition, we have included references to the positive performance of specific methods in earlier research as a rationale for their inclusion in the study. These are listed below along with the associated page numbers. We thank the reviewer for bringing this issue to our attention and believe that by addressing it, we have improved the manuscript. We hope the reviewer and editor agree.

Page 3: Parallel analysis

Prior research has found that PA is effective at identifying the number of factors to retain (Auerswald & Moshagen, 2019; Fabrigar & Wegener, 2011; Preacher & MacCallum, 2003; Xia, 2021). For this reason, it was included in the current study.
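For readers who want to see the mechanics, below is a minimal sketch of Horn's parallel analysis logic. It is illustrative only: it assumes Pearson correlations and normal reference data, whereas the categorical indicators studied here would typically call for polychoric correlations.

```python
import numpy as np

def parallel_analysis(data, n_reps=100, quantile=0.95, seed=0):
    """Horn's parallel analysis on an (n_obs, n_vars) data matrix.

    Eigenvalues of the observed correlation matrix are compared with the
    chosen quantile of eigenvalues from random data of the same size;
    factors are retained while the observed eigenvalue exceeds the
    random benchmark.
    """
    rng = np.random.default_rng(seed)
    n_obs, n_vars = data.shape

    obs_eig = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

    rand_eig = np.empty((n_reps, n_vars))
    for r in range(n_reps):
        sim = rng.standard_normal((n_obs, n_vars))
        rand_eig[r] = np.sort(np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False)))[::-1]
    benchmark = np.quantile(rand_eig, quantile, axis=0)

    exceeds = obs_eig > benchmark
    return n_vars if exceeds.all() else int(np.argmin(exceeds))
```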

Page 3: MAP

MAP has been shown to be effective in correctly determining the number of factors to retain (Caron, 2018; Garrido, Abad, & Ponsoda, 2011; Ruscio & Roche, 2012; Zwick & Velicer, 1986).
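As background on how MAP reaches a decision, the sketch below implements Velicer's criterion on a Pearson correlation matrix; it is an illustrative implementation rather than the code used in the study.

```python
import numpy as np

def velicer_map(R):
    """Velicer's minimum average partial (MAP) test on a correlation matrix R.

    For each candidate number of components m, the first m principal
    components are partialled out of R and the average squared partial
    correlation is recorded; the m with the smallest average is retained.
    """
    p = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0, None))   # component loadings

    off_diag = ~np.eye(p, dtype=bool)
    avg_sq = [np.mean(R[off_diag] ** 2)]                      # m = 0 baseline
    for m in range(1, p):
        partial_cov = R - loadings[:, :m] @ loadings[:, :m].T
        d = np.sqrt(np.clip(np.diag(partial_cov), 1e-12, None))
        partial_corr = partial_cov / np.outer(d, d)
        avg_sq.append(np.mean(partial_corr[off_diag] ** 2))
    return int(np.argmin(avg_sq))                             # estimated number of factors
```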

Page 4: Model fit statistics

Prior research (e.g., Finch, 2020; Barendse, Oort, & Timmerman, 2015; Yang & Xia, 2015; Preacher, Zhang, Kim, & Mels, 2013) has found that this RMSEA difference approach can yield accurate results for both continuous and categorical indicator variables. 
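For reference, the RMSEA that underlies this difference approach is a function of the model chi-square and its degrees of freedom; one common form, together with the successive-difference comparison, is shown below (the specific cutoff used in the manuscript is not restated here).

```latex
\mathrm{RMSEA}_k = \sqrt{\max\!\left(\frac{\chi^2_k - df_k}{df_k\,(N-1)},\, 0\right)},
\qquad
\Delta_k = \mathrm{RMSEA}_k - \mathrm{RMSEA}_{k+1}
```

Under this scheme, factors are added until the improvement Δ_k becomes negligible, and the corresponding k is retained.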

Page 5: Exploratory graph analysis

Prior research investigating the performance of EGA in identifying the number of dimensions to retain has shown that it works well in a variety of conditions.  For example, Golino and Epskamp (2017) found that EGA performed comparably to PA, and better than MAP, in terms of identifying the number of factors across a range of sample sizes, numbers of indicators, and numbers of factors.  In addition, when the interfactor correlation was large (0.7) and there were 4 factors, EGA was more accurate than PA.  Similar simulation results were reported by Golino et al. (2020), who also found that EGA performed comparably to or better than PA across a variety of conditions and was generally more accurate than other techniques included in the study, such as Kaiser's eigenvalue-greater-than-1 rule, the optimal coordinate technique, and the acceleration factor technique.  Finch (2024) found that EGA performed comparably to PA when outliers were present in the data and that both methods were more accurate than MAP at determining the number of latent variables present in the population.  Taken together, these prior results suggest that EGA is a strong alternative to traditional approaches for identifying the number of factors to retain.
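To give a concrete sense of the EGA workflow, the sketch below estimates a graphical lasso network and counts the communities in it. It is only an approximation of EGA as used in the study: it takes Pearson-based input and uses greedy modularity community detection in place of the polychoric correlations and walktrap algorithm that EGA typically employs.

```python
import numpy as np
import networkx as nx
from sklearn.covariance import GraphicalLassoCV
from networkx.algorithms.community import greedy_modularity_communities

def ega_sketch(data):
    """Rough EGA-style estimate of the number of factors: fit a graphical
    lasso network, convert the precision matrix to partial correlations,
    and count the communities in the resulting weighted graph."""
    prec = GraphicalLassoCV().fit(data).precision_
    d = np.sqrt(np.diag(prec))
    pcor = -prec / np.outer(d, d)              # partial correlations
    np.fill_diagonal(pcor, 0.0)

    p = pcor.shape[0]
    graph = nx.Graph()
    graph.add_nodes_from(range(p))
    for i in range(p):
        for j in range(i + 1, p):
            if abs(pcor[i, j]) > 1e-8:         # edges surviving the lasso penalty
                graph.add_edge(i, j, weight=abs(pcor[i, j]))

    return len(greedy_modularity_communities(graph, weight="weight"))
```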

2. Page 6: The statement “BEFA does not allow indicators to have cross-loadings” is confusing. From a BEFA perspective, all potential elements in the loading matrix, including cross-loadings, should be explored. Please clarify how BEFA was implemented in your study in terms of model specification (how loadings are estimated and any constraints assigned to the parameters?).

Response

We appreciate the reviewer’s comment and apologize that our description of BEFA was unclear.  We have included the following text on page 7 of the revision to clarify this point.  We hope that the reviewer finds it more helpful.

Essentially, as described by Conti et al., the BEFA algorithm identifies the statistically optimal arrangement of loadings relating the indicators to the factors, under the constraint that each indicator can have only one non-zero loading.  All possible arrangements of the loading space are explored, and the set that best replicates the observed covariance matrix is selected as optimal.
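To illustrate the constraint, a hypothetical dedicated-structure loading matrix for six indicators and two factors has exactly one non-zero entry per row; the algorithm's search is over which factor each indicator is assigned to.

```latex
\Lambda =
\begin{pmatrix}
\lambda_{11} & 0 \\
\lambda_{21} & 0 \\
\lambda_{31} & 0 \\
0 & \lambda_{42} \\
0 & \lambda_{52} \\
0 & \lambda_{62}
\end{pmatrix}
```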

3. For replicability, it is essential to specify all prior settings used in the Bayesian models. Detailed information on prior distributions, hyperparameters, and any assumptions made should be included in the manuscript. In addition, for Bayesian analyses, please include information regarding convergence, burn-ins, samples, etc.

Response

We agree with the reviewer and have included the following text on page 10 of the revision.

For both BEFA approaches, a total of 11,000 iterations were used in the MCMC chain, with the first 1,000 serving as the burn-in period.  To address potential autocorrelation in the MCMC estimates, every 10th element of the chain was retained, yielding a posterior distribution of 1,000 values for each model parameter.
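The retention arithmetic implied by these settings is illustrated below with a stand-in chain (this is a generic sketch, not the actual BEFA sampler): 11,000 draws, a 1,000-draw burn-in, and thinning by 10 leave 1,000 retained values per parameter.

```python
import numpy as np

n_iter, burn_in, thin = 11_000, 1_000, 10

# Stand-in chain for a single parameter; the real draws come from the BEFA sampler.
chain = np.random.default_rng(0).standard_normal(n_iter)

posterior = chain[burn_in::thin]   # drop burn-in, keep every 10th draw
assert posterior.size == 1_000     # 1,000 posterior values per parameter
```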

4. It is unclear how the number of factors was determined for each method. Some methods may automatically select the optimal number of factors, while others might require an iterative process involving model comparison. Please elaborate on the procedure used to select the number of factors for each method in the simulation, and detail the criteria or thresholds applied during this selection process. A table could be useful to summarize this information.

Response

This is an excellent idea and we appreciate the reviewer suggesting it.  On page 7 of the revision, we have included a new table in the manuscript (Table 1) that outlines how each method included in the study determines the number of factors to retain.  We believe that including this table has greatly improved the paper and thank the reviewer for suggesting it.

5. The two sets of figures (Figures 1-4 and Figures 5-7) present the main evaluation outcomes: the proportion of correctly identified factors and the mean number of factors. To enhance readability and consistency:

- Arrange the graphs using the same grid layout.

Response

We apologize to the reviewer, but we did not fully understand this comment.  Within each set of figures (1-4 and 5-7), the y-axes were the same.  For Figures 1-4 the y-axis ranged from 0 to 1.00, representing the proportion scale and for Figures 5-7 the axis ranged from 0 to 6, reflecting the number of factors.  We also used the same shading patterns for the methods across all figures.  Again, we are very sorry not to understand the reviewer’s comment and will be happy to make changes given a little more direction.

- Ensure that the symbols representing each method are consistent across all figures to facilitate easier comparison.

Response

We agree with the reviewer and have ensured that the symbols and shading used for each method are consistent across the figures.  Per recommendations by Reviewer 2, we have also enlarged the figures so as to make them easier to read.  We hope that these efforts have improved the presentation of the results.

6. The results about the one-factor model are currently presented in two separate figures (Figures 8-9). Combining these into a single figure would improve readability of your findings.

Response

We have combined the two graphs into a single figure.  Thank you for the suggestion.

7. It would be insightful to compare and discuss how varying the number of categories impacts the performance of each method. Currently, there is no comparison of results between 2 and 4 categories. I suggest adding figures or tables comparing results for 2 versus 4 categories.

Response

The reviewer raises an important point regarding the relationship (or lack thereof) between the number of categories and the outcomes of interest.  The ANOVA models used to assess the impact of the manipulated factors on the outcomes of interest did not find the number of indicator categories (or any interactions involving this variable) to be significantly related to either the accuracy rates or the number of factors retained by the methods.  Therefore, we did not include any figures or tables related to it.  However, the reviewer correctly points out that we did not make this result clear in our manuscript.  Thus, we included the following statement on page 10 of the revision explicitly stating this result.  We apologize for not being more clear about this issue in the original manuscript.

There were no statistically significant results for the number of indicator categories, nor for any interactions involving this variable.  Therefore, it will not be discussed further in the manuscript.

8. The study focuses on ordered categorical variables with categories limited to 2 and 4. In practice, five-category variables, including a neutral category, are more common. Additionally, threshold asymmetry could influence the results. Please provide a rationale for selecting only 2 and 4 categories for the ordered categorical variables. Addressing these points will clarify the generalizability of findings.

 Response

The reviewer makes an excellent point.  We agree that 5 category indicators are quite common in practice.  We also agree that it is important that our manuscript indicate clearly why we selected 2 and 4 categories for our study.  We have included the following text on page 9 of the revision explaining why we selected the categories that we did.  We appreciate the reviewer’s suggestion and believe that providing a better rationale for our study conditions has improved the manuscript.  We hope that the reviewer and editor agree.

Two and four categories per indicator were selected for this study for two reasons.  First, these are relatively common numbers seen in practice, and thus the results of the study should be useful to researchers and practitioners.  Second, we can directly compare the impact of doubling the number of response categories (from 2 to 4), which may give instrument developers insights regarding the impact of having more versus fewer categories for their items.  We recognize that other numbers of item response categories are quite common in the literature, particularly 5.  We elected to include two numbers-of-categories conditions in the current study in order to keep the presentation of the results at a manageable level.  However, we acknowledge that other viable options exist for the number of indicator categories, and we therefore encourage future research examining these.
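As a generic illustration of what such indicators look like (the thresholds used in the manuscript's data generation are not reproduced here), a continuous response can be cut at chosen thresholds into 2 or 4 ordered categories:

```python
import numpy as np

def categorize(latent, n_categories):
    """Cut a continuous indicator into ordered categories at equally spaced
    quantiles (illustrative thresholds only)."""
    probs = np.linspace(0, 1, n_categories + 1)[1:-1]
    thresholds = np.quantile(latent, probs)
    return np.digitize(latent, thresholds)   # values 0 .. n_categories - 1

rng = np.random.default_rng(1)
y = rng.standard_normal(1_000)
two_cat = categorize(y, 2)    # binary indicator
four_cat = categorize(y, 4)   # four ordered categories
```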

9. Regularized factor analysis is an important approach for simultaneously selecting loadings and latent factors within both frequentist and Bayesian frameworks. It would be beneficial to discuss these methods in the discussion section.

Response

The reviewer raises an excellent point regarding regularized EFA.  Indeed, given our inclusion of GLASSO, which is itself a regularization-based method, we should discuss regularized EFA more broadly in the manuscript.  To that end, we have included the following new paragraph on page 18 of the revision.  We appreciate the reviewer's suggestion and believe that addressing it in the revision has strengthened the manuscript.  We hope that the reviewer and editor agree.

An alternative approach for fitting EFA models that was not considered here, but which shows promise, is regularized EFA.  Multiple algorithms exist to carry out this analysis, all of which share the goal of identifying only those loadings that are clearly different from zero.  One such approach, sparse estimation via nonconcave penalized likelihood in the factor analysis model (FANC), involves the use of the minimax concave penalty function (Hirose, et al., 2015).  This penalty is applied in conjunction with the standard maximum likelihood estimator and places a penalty on all factor model parameters, including the loadings.  The consequence of this penalty is to shrink the loadings toward zero, meaning that small loadings will essentially be set to 0.  Likewise, the FANC algorithm places a penalty on the number of factors to be retained.  Another regularization algorithm for determining the number of factors is principal orthogonal complement thresholding (POET), which rests on the assumption of conditional sparsity (Fan, et al., 2013).  This assumption asserts that, after conditioning on a small number of common components, the observed indicator variables will have small covariances with one another.  And of course, the GLASSO approach used in the current study is also a regularization-based approach for identifying latent structure in the data.  Comparing these approaches with the methods included in this study is an area that should be examined in future work.
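For orientation, the minimax concave penalty referenced above has the following standard piecewise form for a generic parameter θ and tuning constants λ ≥ 0 and γ > 1; this is background on the penalty family rather than a restatement of the manuscript's equations.

```latex
p_{\lambda,\gamma}(|\theta|) =
\begin{cases}
\lambda\,|\theta| - \dfrac{\theta^{2}}{2\gamma}, & |\theta| \le \gamma\lambda, \\[2mm]
\dfrac{\gamma\lambda^{2}}{2}, & |\theta| > \gamma\lambda.
\end{cases}
```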

 

Reviewer 2 Report

This study titled ‘Comparison of Methods for Determining the Number of Factors to Retain in Exploratory Factor Analysis for Categorical Indicator Variables’ addresses a unique topic with a well-designed methodological approach. The rationale behind the study is clearly and effectively presented. Furthermore, I found the proposal to compare two relatively new techniques for determining the number of factors to retain (next eigenvalue sufficiency test and out-of-sample prediction) with other approaches to be an intriguing one.

The statistical methods and tools used were appropriate and accurate, and the information provided on the data was sufficient. The conclusion and discussion were presented in a manner that considered the existing literature. The literature section was also satisfactory. The language and academic writing style were clear and understandable. 

 

The tables and figures in the study were employed in an appropriate and elucidating manner. However, to enhance readability, it would be beneficial for the author(s) to enlarge the figures slightly. This is the only minor revision that can be raised.

The comprehensive delineation of the study's limitations lends considerable strength to the research. In light of the aforementioned considerations, it is my assessment that the study will prove to be a valuable contribution to the field. 


Author Response

The tables and figures in the study were employed in an appropriate and elucidating manner. However, to enhance readability, it would be beneficial for the author(s) to enlarge the figures slightly. This is the only minor revision that can be raised.

 

Thank you so much for your kind words about the paper.  We very much appreciate them.  We agree that the figures should be larger and have thus enlarged each of them to improve the readability.

Round 2

Reviewer 1 Report

Thank you for providing clarifications that address all comments. The authors have done an excellent job responding to the feedback.


Figures 1–4 illustrate the proportion of replications in which the correct number of factors was identified, and Figures 5–7 present the number of factors retained. For consistency and ease of interpretation, for example, it would be beneficial if the first figure reporting the number of factors retained (Figure 5) adopted the same layout as Figure 1, displaying factor loading value by interfactor correlation. Applying this consistent layout to the remaining figures would further enhance clarity.

In addition, in the social sciences, scales are often developed with unintended cross-loadings. It would be valuable to discuss scenarios and future research directions in which cross-loadings are present in the data.

Author Response

  1. Figures 1–4 illustrate the proportion of replications in which the correct number of factors was identified, and Figures 5–7 present the number of factors retained. For consistency and ease of interpretation, for example, it would be beneficial if the first figure reporting the number of factors retained (Figure 5) adopted the same layout as Figure 1, displaying “factor loading value” by “interfactor correlation.” Applying this consistent layout to the remaining figures would further enhance clarity.

Response

We apologize, but we did not fully understand this comment.  It appears that Figures 4-7 reflect results based on the number of factors retained, whereas Figures 1-3 reflect the proportion of correct results.  Thus, we were not entirely clear on the change that the reviewer is requesting.  We apologize for this.

2. In addition, in the social sciences, scales are often developed with unintended cross-loadings. It would be valuable to discuss scenarios and future research directions in which cross-loadings are present in the data.

Response:  We agree with the reviewer and have included the following text on page 18 of the revision.

The current study simulated data with pure simple structure.  However, in many real-world situations, indicators are likely to have non-zero loadings associated with more than one factor.  Thus, future research should simulate cases in which some indicators have non-zero loadings on multiple factors.
