Analysis of the Correlation between Mass-Media Publication Activity and COVID-19 Epidemiological Situation in Early 2022

The paper presents the results of a correlation analysis between the information trends in the electronic media of Kazakhstan and indicators of the epidemiological situation of COVID-19 according to the World Health Organization (WHO). The developed method is based on topic modeling and some other methods of processing natural language texts. The method allows for calculating the correlations between media topics, moods, the results of full-text search queries, and objective WHO data. The analysis of the results shows how the attitudes of society towards the problems of COVID-19 changed from 2021 to 2022. Firstly, the results reflect a steady trend of decreasing interest of electronic media in the topic of the pandemic, although to an unequal extent for different thematic groups. Secondly, there has been a tendency to shift the focus of attention to more pragmatic issues


Introduction
The healthcare systems of almost all countries face numerous problems caused by increased demand for medical services and high expectations of the population in periods of pandemic, and these factors entail higher costs [1]. It should also be noted that social and medical efficiency, as well as economic efficiency, are important for the healthcare system since, as mentioned in [2], medical activities of a therapeutic and preventive nature may be economically unprofitable, but the medical and social effect requires their implementation. This statement is especially true in the context of a pandemic. On the other hand, the COVID-19 pandemic is an appropriate example of how rumors and incomplete knowledge affect society. People rely on mass media as a source of information and feel uncertainty when threats arise in the environment [3]. According to the authors in [4], the pandemic provoked a surge of rumors and misinformation, which hindered the rational behavior of the population and, to some extent, accelerated the spread of the virus. Media reports significantly affected people's emotions and psychological resilience in the course of the COVID-19 pandemic [5]. More than 51% of news headlines in English-language media were negative in this period, and only about 30% of them were

Introduction
Item response theory (IRT) models [1-4] are a popular statistical method for analyzing dichotomous and polytomous random variables. IRT models belong to the area of multivariate statistics and summarize a high-dimensional contingency table with a few latent factor variables of interest. Of particular interest is the application of IRT models in educational large-scale assessment (LSA; [5]), such as the Programme for International Student Assessment (PISA; [6]), which assesses the ability of students on test items in different cognitive domains, such as mathematics, reading, and science, across a wide range of countries all over the world.
In this article, we focus on unidimensional IRT models. These models are used for scaling cognitive test data to obtain a single unidimensional summary score [7]. Let $X = (X_1, \ldots, X_I)$ be the vector of $I$ polytomous random variables (i.e., items) $X_i \in \{0, 1, \ldots, K_i\}$ with $K_i \geq 1$. A unidimensional IRT model [4] is a statistical model for the multivariate probability distribution $P(X = x)$ for $x = (x_1, \ldots, x_I)$, where

$$P(X = x; \gamma) = \int \prod_{i=1}^{I} P_i(\theta, x_i; \gamma_i) \, \phi(\theta) \, d\theta . \tag{1}$$

The unidimensional latent variable $\theta$ follows a standard normal distribution with a density function $\phi$, although this assumption can be weakened [8,9]. Conditional item response probabilities are defined as $P(X_i = x | \theta) = P_i(\theta, x; \gamma_i)$, where $\gamma_i$ is a vector of the unknown item parameters of item $i$, and $\gamma$ collects the item parameters of all items. Note that a local independence assumption is imposed in (1), which means that item responses $X_i$ and $X_j$ are conditionally independent for all item pairs $i \neq j$ given the latent ability variable $\theta$. This property justifies the statement that the multivariate contingency table $P(X = x)$ is summarized by a unidimensional latent variable $\theta$.
The item parameters $\gamma_i$ of the unidimensional IRT model in Equation (1) can be estimated by (marginal) maximum likelihood (ML) using an expectation maximization (EM) algorithm [10,11]. The estimation can also involve a multi-matrix design in which only a subset of items is administered to each student [12,13]. In the likelihood formulation of (1), non-administered items are skipped in the product over items.
For dichotomous items, one often uses the abbreviated notation $P_i(\theta; \gamma_i) = P_i(\theta, 1; \gamma_i)$. The function $P_i$ is also referred to as the item response function (IRF). A popular choice of $P_i$ is the two-parameter logistic (2PL; [14]) model defined by $P_i(\theta) = \Psi(a_i(\theta - b_i))$, where $\Psi$ denotes the logistic link function, $a_i$ is the item discrimination parameter, and $b_i$ is the item difficulty parameter. A simplified version of the 2PL model is the Rasch model [15,16], which constrains the item discriminations to be equal across items, leading to the IRF $P_i(\theta) = \Psi(a(\theta - b_i))$. A further alternative is the two-parameter probit (2PP; [2]) model $P_i(\theta) = \Phi(a_i(\theta - b_i))$ that employs the standard normal distribution function $\Phi$ (i.e., the probit link function).
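The three IRFs described above can be written down in a few lines. The following is a minimal Python sketch using only the standard library; the function names (`logistic`, `probit`, `irf_2pl`, `irf_rasch`, `irf_2pp`) are illustrative choices and are not taken from the article or from any IRT package.

```python
import math


def logistic(x):
    """Logistic link Psi(x) = 1 / (1 + exp(-x)), evaluated in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def probit(x):
    """Standard normal distribution function Phi(x), expressed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def irf_2pl(theta, a, b):
    """2PL item response function Psi(a * (theta - b))."""
    return logistic(a * (theta - b))


def irf_rasch(theta, b, a=1.0):
    """Rasch-type IRF: the discrimination a is shared by all items."""
    return logistic(a * (theta - b))


def irf_2pp(theta, a, b):
    """2PP item response function Phi(a * (theta - b))."""
    return probit(a * (theta - b))
```

For both symmetric links, the probability of a correct response equals 0.5 when the ability equals the item difficulty, for example `irf_2pl(0.5, 1.2, 0.5)`.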
There is increasing interest among researchers to use more flexible IRFs. In particular, the 2PL and 2PP models employ symmetric link functions. A variety of IRFs with asymmetric link functions have been proposed [17-28]. These kinds of models might be desirable if items do not follow the simple 2PL or 2PP models. In this article, we focus on item response modeling based on the generalized logistic link function [29]. This link function has been previously applied in [30] utilizing ML estimation, while [31] proposed a Markov chain Monte Carlo (MCMC) estimation approach. In this article, we thoroughly study ML estimation for the generalized logistic IRT model for dichotomous and polytomous item responses. Moreover, we also propose a regularized ML estimation approach aiming to stabilize the item parameter estimates.
The rest of the article is structured as follows. In Section 2, we introduce the IRT model based on the generalized logistic link function. Moreover, we propose the regularized estimation approach and discuss the application of this link function to polytomous items. Section 3 includes two simulation studies investigating the performance of estimating the generalized logistic IRT model for dichotomous items. Section 4 contains two empirical examples of datasets with dichotomous and polytomous items, respectively. Finally, the paper closes with a discussion in Section 5.

Item Response Modeling Based on the Generalized Logistic Link Function
The generalized logistic IRT model relies on the generalized logistic link function $\Psi_{\alpha_1,\alpha_2}$ proposed by Stukel [29]. For the real-valued asymmetry parameters $\alpha_1$ and $\alpha_2$, the link function $\Psi_{\alpha_1,\alpha_2}$ is defined by

$$\Psi_{\alpha_1,\alpha_2}(x) = \Psi\big(S_{\alpha_1,\alpha_2}(x)\big) , \tag{2}$$

where $\Psi(x) = [1 + \exp(-x)]^{-1}$ is the logistic link function and $S_{\alpha_1,\alpha_2}$ is defined by

$$S_{\alpha_1,\alpha_2}(x) = \begin{cases} S_{\alpha_1}(x) & \text{if } x \geq 0 \\ -S_{\alpha_2}(-x) & \text{if } x < 0 \end{cases} \quad \text{with} \quad S_{\alpha}(x) = \begin{cases} \alpha^{-1}\left[\exp(\alpha x) - 1\right] & \text{if } \alpha > 0 \\ x & \text{if } \alpha = 0 \\ -\alpha^{-1}\log(1 - \alpha x) & \text{if } \alpha < 0 . \end{cases} \tag{3}$$

The logistic link function is obtained with $\alpha_1 = \alpha_2 = 0$. The probit link function is approximately obtained with $\alpha_1 = \alpha_2 = 0.12$. More generally, symmetric link functions are obtained for $\alpha_1 = \alpha_2$, while asymmetry is introduced by imposing $\alpha_1 \neq \alpha_2$. The cloglog and loglog link functions [32] can also be well approximated by particular parameter values of $\alpha_1$ and $\alpha_2$ [31].

Figure 1 displays the generalized logistic link function $\Psi_{\alpha_1,\alpha_2}$ for different values of $\alpha_1$ and $\alpha_2$. It can be seen that $\alpha_1$ governs the upper tail of the link function (i.e., $x > 0$), and $\alpha_1$ values different from 0 indicate deviations from the logistic link function. For positive values (i.e., $\alpha_1 > 0$), the link function $\Psi_{\alpha_1,\alpha_2}$ more quickly reaches the upper asymptote of one compared to the logistic link function $\Psi = \Psi_{0,0}$, while there is slower convergence to the upper asymptote for negative values of $\alpha_1$. Moreover, the $\alpha_2$ parameter models the deviations from the logistic link function in the lower tail of the link function (i.e., for $x < 0$).

The generalized logistic link function defined in (3) can be used to define an IRF for a dichotomous item $X_i$ by

$$P(X_i = 1 | \theta; \gamma_i) = \Psi_{\alpha_{1i},\alpha_{2i}}\big(a_i(\theta - b_i)\big) , \tag{4}$$

where $\gamma_i = (\alpha_{1i}, \alpha_{2i}, a_i, b_i)$ is the vector of item parameters for item $i$. In (4), it is assumed that the shape parameters $\alpha_1$ and $\alpha_2$ are item-specific, but it might be desirable for parsimony reasons to constrain them to be equal across items.

Zhang et al. [31] proposed an MCMC estimation approach. In this approach, the factor variable $\theta$ must also be sampled, and parameter estimation can sometimes become computationally tedious. In contrast, ML estimation is a viable and computationally efficient alternative for unidimensional IRT models, which is the reason for pursuing the ML estimation approach in this paper.
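To make the behavior of the asymmetry parameters concrete, the following minimal Python sketch implements a generalized logistic link in the piecewise form given by Stukel [29]. The function names (`stukel_s`, `gen_logistic`, `irf_gen_logistic`) are hypothetical, and the code is an illustration under that stated form, not the implementation used in the article.

```python
import math


def logistic(x):
    """Logistic link Psi(x), evaluated in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def _s_one_sided(u, alpha):
    """One-sided transformation S_alpha(u) for u >= 0."""
    if alpha > 0:
        return (math.exp(alpha * u) - 1.0) / alpha
    if alpha < 0:
        return -math.log(1.0 - alpha * u) / alpha
    return u


def stukel_s(x, alpha1, alpha2):
    """alpha1 shapes the upper tail (x >= 0), alpha2 the lower tail (x < 0)."""
    if x >= 0:
        return _s_one_sided(x, alpha1)
    return -_s_one_sided(-x, alpha2)


def gen_logistic(x, alpha1, alpha2):
    """Generalized logistic link: the logistic link applied to the transformed argument."""
    return logistic(stukel_s(x, alpha1, alpha2))


def irf_gen_logistic(theta, alpha1, alpha2, a, b):
    """IRF of a dichotomous item with item parameters (alpha1, alpha2, a, b)."""
    return gen_logistic(a * (theta - b), alpha1, alpha2)
```

With `alpha1 = alpha2 = 0`, the function reduces to the logistic link. A positive `alpha1` pushes the curve towards the upper asymptote faster than the logistic link, and a negative `alpha1` slows the approach, in line with the description of Figure 1.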
In [31], it was argued that a lower bound of $-1$ must be imposed for $\alpha_1$ and $\alpha_2$ in order to ensure a proper posterior distribution. Based on experience from previous research [30], we also bounded the $\alpha_1$ and $\alpha_2$ parameters from above by one to ensure a sufficiently stable estimation. To this end, we transformed the bounded asymmetry parameters $\alpha_h$ for $h = 1$ and $h = 2$, which lie in the interval $(-1, 1)$, into an unbounded parameter space using the Fisher transformation $F$ [33]

$$\alpha_h^{*} = F(\alpha_h) = \frac{1}{2} \log \frac{1 + \alpha_h}{1 - \alpha_h} , \tag{5}$$

where $\alpha_h^{*}$ denote the unbounded transformed parameters of the generalized logistic link function. The inverse Fisher transformation $F^{-1}$ maps unbounded parameters $\alpha_h^{*}$ to bounded parameters $\alpha_h$ by means of the transformation

$$\alpha_h = F^{-1}(\alpha_h^{*}) = \frac{\exp(2\alpha_h^{*}) - 1}{\exp(2\alpha_h^{*}) + 1} . \tag{6}$$

In ML estimation of the generalized logistic IRT model for dichotomous item responses, the vector of item parameters for item $i$ is defined as $\gamma_i = (\alpha_{i1}^{*}, \alpha_{i2}^{*}, a_i, b_i)$. For the item response data $\{x_{pi} \mid p = 1, \ldots, N; i = 1, \ldots, I\}$ for $N$ persons and $I$ items, we define the log-likelihood function $l$ based on (1) by

$$l(\gamma) = \sum_{p=1}^{N} \log \int \prod_{i=1}^{I} P_i(\theta, x_{pi}; \gamma_i) \, \phi(\theta) \, d\theta \tag{7}$$

for item responses $x_p = (x_{p1}, \ldots, x_{pI})$ of person $p$, where $\gamma$ is the vector that collects the item parameters $\gamma_i$ of all items $i = 1, \ldots, I$. The log-likelihood function can be numerically maximized to obtain the item parameter estimates $\hat{\gamma}$. In IRT software, the EM algorithm is frequently utilized [11,34].
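The Fisher transformation and the marginal log-likelihood can be illustrated with a short Python sketch. It assumes the Stukel form of the link, approximates the integral over $\theta$ by a fixed equally spaced grid (an assumption made here for simplicity; IRT software typically uses more refined quadrature within an EM algorithm), and codes non-administered items as `None`. All function names are hypothetical.

```python
import math


def fisher(alpha):
    """Map alpha in (-1, 1) to the real line."""
    return 0.5 * math.log((1.0 + alpha) / (1.0 - alpha))


def fisher_inv(alpha_star):
    """Inverse map; (exp(2z) - 1) / (exp(2z) + 1) equals tanh(z)."""
    return math.tanh(alpha_star)


def gen_logistic(x, alpha1, alpha2):
    """Generalized logistic link in Stukel's piecewise form."""
    u, alpha = (x, alpha1) if x >= 0 else (-x, alpha2)
    if alpha > 0:
        s = (math.exp(alpha * u) - 1.0) / alpha
    elif alpha < 0:
        s = -math.log(1.0 - alpha * u) / alpha
    else:
        s = u
    s = s if x >= 0 else -s
    return 1.0 / (1.0 + math.exp(-s))


def quadrature(n_nodes=61, lo=-6.0, hi=6.0):
    """Equally spaced nodes with normalized standard normal weights."""
    step = (hi - lo) / (n_nodes - 1)
    nodes = [lo + k * step for k in range(n_nodes)]
    dens = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(dens)
    return nodes, [d / total for d in dens]


def log_likelihood(data, items, nodes, weights):
    """data: one list per person with entries 0, 1, or None (not administered).
    items: one tuple (alpha1_star, alpha2_star, a, b) per item."""
    ll = 0.0
    for x_p in data:
        marginal = 0.0
        for theta, w in zip(nodes, weights):
            prob = 1.0
            for x, (a1s, a2s, a, b) in zip(x_p, items):
                if x is None:
                    continue  # non-administered items are skipped in the product
                p = gen_logistic(a * (theta - b), fisher_inv(a1s), fisher_inv(a2s))
                prob *= p if x == 1 else 1.0 - p
            marginal += w * prob
        ll += math.log(marginal)
    return ll
```

Optimizing over the transformed parameters keeps the asymmetry parameters inside $(-1, 1)$ without explicit box constraints, which is the motivation for the transformation given in the text.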

Regularized Estimation
Estimating the shape parameters $\alpha_1$ and $\alpha_2$ (or $\alpha_1^{*}$ and $\alpha_2^{*}$ in the transformed parameter space) item by item might require large sample sizes and can harm the precision of the estimated item parameters. On the other hand, constraining all shape parameters to be equal across items might be too restrictive, and this assumption might be violated by real-world item response data. As a compromise, the variability in shape parameters can be reduced by employing regularized ML estimation with fused ridge-type penalty functions [35].
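Battauz proposed such a regularized estimation approach for the three-parameter [36] and four-parameter [37] logistic IRT models. In this paper, we propose the same approach for regularizing the $\alpha_1^{*}$ and $\alpha_2^{*}$ parameter estimates. The fused ridge penalty function $P$ penalizes squared differences of the transformed shape parameters for all pairs of items and is defined, up to the scaling of $\lambda$, by

$$P(\gamma; \lambda) = \lambda \sum_{h=1}^{2} \sum_{i < j} \left( \alpha_{hi}^{*} - \alpha_{hj}^{*} \right)^2 . \tag{8}$$

In regularized ML estimation, one maximizes the penalized log-likelihood function $l_{\mathrm{pen}}$ defined by

$$l_{\mathrm{pen}}(\gamma; \lambda) = l(\gamma) - P(\gamma; \lambda) . \tag{9}$$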
Using the penalty function in (8) implies that normal priors for α * hi with a common mean ν h and a variance τ 2 are imposed for h = 1, 2 (see [37]). Importantly, by only considering the differences in pairs of item parameters, the means ν h are not explicitly estimated.
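A fused ridge penalty on pairwise differences of the transformed shape parameters might be sketched as follows in Python. The scaling of the regularization parameter (a plain multiplication of the sum of squared pairwise differences by `lam`) is an assumption of this sketch, and the function names are hypothetical.

```python
from itertools import combinations


def fused_ridge_penalty(alpha1_star, alpha2_star, lam):
    """Sum of squared pairwise differences of the transformed shape parameters,
    computed separately for h = 1 and h = 2 and multiplied by lam."""
    total = 0.0
    for params in (alpha1_star, alpha2_star):
        for u, v in combinations(params, 2):
            total += (u - v) ** 2
    return lam * total


def penalized_loglik(loglik, alpha1_star, alpha2_star, lam):
    """Penalized log-likelihood: log-likelihood minus the fused ridge penalty."""
    return loglik - fused_ridge_penalty(alpha1_star, alpha2_star, lam)
```

Because only differences enter the penalty, adding the same constant to all parameters of one group leaves it unchanged, which reflects the remark that the common means are not explicitly estimated. The penalty is zero when all items share the same shape parameters, so a large `lam` pulls the solution towards the model with joint shape parameters.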
It is evident that the optimization of l pen also involves the unknown regularization parameter λ. The k-fold cross-validation approach is used for obtaining the optimal regularization parameter λ opt . The dataset is divided into k groups, and the parameters of the model are estimated on k − 1 folds leaving one fold out to evaluate the cross-validation error. This is performed by leaving one fold out in turn for each value of the regularization parameter λ. In this article, the error was evaluated using the negative log-likelihood function [37]. The smallest cross-validation error determines the choice of λ opt . In practice, k = 5 or k = 10 is frequently chosen.
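The cross-validation procedure described above can be outlined generically. In this Python sketch, `fit` and `neg_loglik` are placeholder callables supplied by the user (for example, a routine that maximizes the penalized log-likelihood and one that evaluates the negative log-likelihood on held-out persons). Assigning persons to folds by index modulo `k` is a simplification; a random assignment is more common in practice.

```python
def k_fold_indices(n, k):
    """Assign person p to fold p mod k (deterministic, for illustration only)."""
    return [[p for p in range(n) if p % k == f] for f in range(k)]


def select_lambda(data, lambdas, fit, neg_loglik, k=5):
    """Return the lambda with the smallest cross-validation error and all errors.

    fit(train, lam) returns parameter estimates; neg_loglik(params, test)
    evaluates the error on the held-out fold."""
    folds = k_fold_indices(len(data), k)
    cv_error = {}
    for lam in lambdas:
        err = 0.0
        for held_out in folds:
            held = set(held_out)
            train = [x for p, x in enumerate(data) if p not in held]
            test = [data[p] for p in held_out]
            params = fit(train, lam)          # estimate on the other k - 1 folds
            err += neg_loglik(params, test)   # evaluate on the held-out fold
        cv_error[lam] = err
    best = min(cv_error, key=cv_error.get)
    return best, cv_error
```

The same fold assignment is reused for every candidate value of `lam`, so that differences in the cross-validation error reflect the regularization parameter rather than the data split.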

Polytomous Items
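The estimation approach based on the generalized logistic link function can also be applied to polytomous items with values $k = 0, 1, \ldots, K_i$ [38]. We model conditional item response probabilities for which a score of at least $k$ is obtained by

$$P(X_i \geq k | \theta; \gamma_i) = \Psi_{\alpha_{i1},\alpha_{i2}}\big(a_i \theta - b_{ik}\big) \quad \text{for } k = 1, \ldots, K_i . \tag{10}$$

The item response probabilities for a category $k$ are defined by

$$P(X_i = k | \theta; \gamma_i) = P(X_i \geq k | \theta; \gamma_i) - P(X_i \geq k + 1 | \theta; \gamma_i) \quad \text{for } k = 1, \ldots, K_i , \tag{11}$$

using the probabilities defined in (10), $P(X_i \geq K_i + 1 | \theta; \gamma_i) = 0$, and $P(X_i = 0 | \theta; \gamma_i) = 1 - P(X_i \geq 1 | \theta; \gamma_i)$. Note that (10) includes item-specific intercept parameters $b_{ik}$, while the item discrimination $a_i$ and the shape parameters $\alpha_{i1}$ and $\alpha_{i2}$ are constrained to be equal for all categories $k = 1, \ldots, K_i$ of item $i$ in (10). Additionally, note that (10) and (11) can be interpreted as a generalization of the graded response model [39].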
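The step from cumulative probabilities to category probabilities can be illustrated with a short Python sketch. It assumes that the cumulative probability of scoring at least $k$ is the generalized logistic link (in Stukel's form) evaluated at $a\theta - b_k$ with increasing intercepts $b_1 < \ldots < b_K$; this sign convention and the function names are assumptions of the sketch.

```python
import math


def gen_logistic(x, alpha1, alpha2):
    """Generalized logistic link in Stukel's piecewise form."""
    u, alpha = (x, alpha1) if x >= 0 else (-x, alpha2)
    if alpha > 0:
        s = (math.exp(alpha * u) - 1.0) / alpha
    elif alpha < 0:
        s = -math.log(1.0 - alpha * u) / alpha
    else:
        s = u
    s = s if x >= 0 else -s
    return 1.0 / (1.0 + math.exp(-s))


def category_probs(theta, a, intercepts, alpha1, alpha2):
    """Return [P(X = 0), ..., P(X = K)] for an item with K = len(intercepts).

    Cumulative probabilities P(X >= k) are padded with P(X >= 0) = 1 and
    P(X >= K + 1) = 0, and adjacent differences give the category probabilities."""
    cum = [1.0]
    cum += [gen_logistic(a * theta - b, alpha1, alpha2) for b in intercepts]
    cum.append(0.0)
    return [cum[k] - cum[k + 1] for k in range(len(intercepts) + 1)]
```

For example, `category_probs(0.3, 1.1, [-1.0, 0.2, 1.5], -0.2, 0.3)` returns four probabilities that sum to one. With a single intercept, the function reduces to the dichotomous case.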

Simulation Studies
3.1. Simulation Study 1: Estimation of Common α 1 and α 2 Asymmetry Parameters

3.1.1. Method
First, in Simulation Study 1, the performance of ML estimation of the generalized logistic IRT model for dichotomous items is investigated when the data-generating model (DGM) assumes common shape parameters α 1 and α 2 across the items. In the simulation, I = 20 items were chosen. The item discrimination parameters a i and item difficulty parameters b i can be found in Table A1 in Appendix A. For the shape parameters, four different DGMs of combinations of α 1 and α 2 were studied. In the first condition (DGM1), we assumed α 1 = α 2 = 0, which corresponds to the logistic link function. In this case, applying the generalized logistic IRT model in favor of the 2PL model would not be necessary. The second condition (DGM2) corresponded to α 1 = −0.13 and α 2 = 0.21, while the third condition (DGM3) resulted from choosing α 1 = −0.30 and α 2 = 0.21. Obviously, the deviation from the logistic link function was more severe in DGM3 than in DGM2. In the fourth DGM (DGM4), we chose α 1 = 0.21 and α 2 = −0.30 to accommodate guessing effects in IRFs.
Four different sample sizes, N, were chosen (i.e., 500, 1000, 2000, 4000) to represent the typical conditions in small-scale and large-scale studies that involve cognitive items. The latent variable θ was simulated using a standard normal distribution.
We estimated item parameters with two models. First, in Model M3 (we start with M3 for notational consistency with Simulation Study 2 and the empirical examples), we estimated the nonregularized generalized logistic IRT model with an equality constraint of α i1 and α i2 across all items i = 1, . . . , I; that is, α i1 = α 1 and α i2 = α 2 for all i = 1, . . . , I. In the second model (Model M4), we used the 2PL model, which employs the logistic IRF that can be obtained by setting α 1 = α 2 = 0 in the generalized logistic link function.
In total, 1500 replications were conducted in each simulation condition. We assessed the performance of parameter estimates in terms of bias and root mean square error (RMSE). To provide simple summary statistics across the parameter groups, we averaged the absolute biases and RMSE values across items for the same item parameter groups (i.e., the $\alpha_1$, $\alpha_2$, $a$, and $b$ parameters). For a fair comparison of the misspecified 2PL model (Model M4) in DGM2 and DGM3 with the more complex generalized logistic IRT model, we employed the root integrated squared error (RISE; [40,41]) between an estimated IRF $P_i(\theta; \hat{\gamma}_i)$ and a true data-generating IRF $P_i(\theta)$. The RISE statistic for item $i$ is defined by

$$\mathrm{RISE}_i = \sqrt{ \int \left[ P_i(\theta; \hat{\gamma}_i) - P_i(\theta) \right]^2 \phi(\theta) \, d\theta } . \tag{12}$$

The statistical software R [42] was employed for all parts of the simulation and analysis. The estimation of both IRT models was carried out using the sirt::xxirt() function in the R package sirt [43].

Table 1 displays the (average) absolute bias (Bias) and (average) RMSE of the estimated model parameters. Overall, biases in the parameter estimates were very small and practically vanished in large sample sizes, such as $N = 4000$. Moreover, the RMSE decreased with increasing sample size, which is empirical evidence for the consistency property of ML estimates. The results turned out to be similar across the four different data-generating models. Notably, the RMSE values were larger for more asymmetric IRFs in DGM3 compared to DGM2. DGM4 performed similarly to DGM3 when the roles of $\alpha_1$ and $\alpha_2$ were reversed.
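A RISE-type statistic between two IRFs can be approximated numerically as in the following Python sketch. It assumes that the squared difference is weighted by the standard normal density of $\theta$ and integrates over a fixed equally spaced grid; the function names and the grid settings are illustrative assumptions.

```python
import math


def quadrature(n_nodes=61, lo=-6.0, hi=6.0):
    """Equally spaced nodes with normalized standard normal weights."""
    step = (hi - lo) / (n_nodes - 1)
    nodes = [lo + k * step for k in range(n_nodes)]
    dens = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(dens)
    return nodes, [d / total for d in dens]


def rise(irf_est, irf_true, n_nodes=61, lo=-6.0, hi=6.0):
    """Root integrated squared error between two IRFs (functions of theta)."""
    nodes, weights = quadrature(n_nodes, lo, hi)
    ise = sum(w * (irf_est(t) - irf_true(t)) ** 2 for t, w in zip(nodes, weights))
    return math.sqrt(ise)


def irf_2pl(theta, a, b):
    """2PL item response function, used here only as an example input."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Example: distance between two 2PL curves that differ only in difficulty.
example = rise(lambda t: irf_2pl(t, 1.0, 0.0), lambda t: irf_2pl(t, 1.0, 0.3))
```

Because the statistic compares whole curves rather than individual parameters, it allows models with different parameterizations, such as the 2PL and the generalized logistic IRT model, to be compared on a common scale.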

Results
In Table 2, the average root integrated square error (RISE) between the estimated and true item response functions is displayed as a function of the sample size N for IRT models using the generalized logistic link (Model M3) and the logistic link (Model M4) functions, respectively. It turned out that there are minor efficiency losses in terms of the RISE when the logistic link function (Model M4) corresponds to the data-generating model DGM1, which did not involve asymmetric item response functions. In contrast, in the data-generating models DGM2, DGM3, and DGM4, the symmetric logistic link function is misspecified, and the RISE for estimates based on the generalized logistic link function (Model M3) was smaller across all sample sizes. From these results, it can be concluded that the efficiency loss incurred by applying the more complex generalized logistic IRT model is compensated for by less biased item response function estimates. For large sample sizes, the bias in the 2PL model outweighs the smaller variability in the estimated IRF.

In Simulation Study 2, the DGM assumes item-specific shape parameters α 1 and α 2 . As in Simulation Study 1, 20 items were employed in the simulation. The data-generating item parameters can be found in Table A1 in Appendix A.
In addition to Models M3 (i.e., joint α 1 and α 2 parameters) and M4 (i.e., the logistic link function), additional analysis models were specified. In Model M1, the generalized logistic IRT model was estimated without a regularization approach (i.e., nonregularized estimation). In Model M2, we employed regularized estimation with an optimal regularization parameter λ opt by using k-fold cross-validation utilizing the cross-validated log-likelihood value. In Model M6, we report the parameter estimates of the regularized estimation using a fixed regularization parameter λ = 1.
In this simulation, we consider the sample sizes N = 1000, 2000, and 4000. We did not simulate a sample size N = 500 because larger sample sizes are certainly required for item-specific estimation of the generalized logistic IRT model.
In total, 1500 replications were conducted. The absolute average bias and average RMSE are reported for the groups of item parameters. Moreover, the performance of the different models is also assessed with the RISE statistic (see (12)).
Again, the statistical software R [42] was employed for all parts of the simulation. The estimation of the nonregularized and regularized IRT models was carried out using the sirt::xxirt() function in the sirt package [43].

Table 3 presents the average absolute bias and average RMSE for different analysis models as a function of sample size N. It can be seen that biases only vanish for the nonregularized (Model M1) and optimally regularized (Model M2) models. However, the variability in terms of the RMSE was much lower in Model M3, which assumes joint shape parameters α 1 and α 2 , and in the regularized estimation with a relatively large regularization parameter λ = 1 (Model M6). Hence, it is up to the researcher whether the bias or RMSE matters for parameter estimates when choosing from among the different modeling alternatives.

Figure 2 displays the average RISE as a function of the regularization parameter λ. A regularization parameter λ of about 0.20 minimizes the RISE statistic. Notably, this value is much larger than the optimal regularization parameter selected by the cross-validated log-likelihood function. In the subsequent table, Table 2, we report a slightly larger regularization parameter λ = 1.

Table 4 displays the RISE for different analysis models as a function of the sample size N. As was also evident in Figure 2, an appropriate fixed regularization parameter can lead to smaller RISE values than an optimally selected regularization parameter based on the cross-validated log-likelihood. Nevertheless, it must be emphasized that all models that utilize the generalized logistic link function outperformed the misspecified logistic 2PL model (Model M4) for all sample sizes. This was also the case for Models M1 and M2, which resulted in highly variable item parameter estimates.

We now apply the generalized logistic IRT model to the Programme for International Student Assessment (PISA; [44]) study.
Ten countries were selected from the PISA 2006 study [44] in the reading domain. The ten countries were: Austria (AUT), Switzerland In this analysis, we only used those students who had a reading test in the PISA 2006 study. For each country, 27 or 28 items were valid and used in the subsequent analysis. A total of 10 items were multiple-choice, while 18 items were constructed response or short response items. Seven polytomous items were dichotomously rescored, while only the largest category was treated as correct.

Results
The sample sizes used per country in the analysis varied between N = 2374 and N = 4000 (M = 2896.8, SD = 484.0). The average number of students per item varied across countries between 1337.7 and 2261.3 (M = 1628.0, SD = 273.4). Sampling weights were not taken into account in the analysis because the two-stage stratified clustered sampling design would require a modified computation of the Akaike information criterion (AIC; [45,46]).
Five different analysis models were specified. In the first model, Model M1, the asymmetry parameters α 1 and α 2 were assumed to be item-specific and nonregularized. Model M2 estimated the item parameters by using the optimal regularization parameter λ opt via maximization of the cross-validated log-likelihood. In Model M3, the joint α 1 and α 2 parameters across items were assumed. Models M4 and M5 employed the logistic and probit link functions, respectively.
All models were separately estimated for each country because this example did not focus on country comparisons but rather on comparing different IRT modeling alternatives. All IRT models were estimated using the sirt::xxirt() function in the R package sirt [43].

Results
In Table 5, the AIC is presented for all countries for Models M1, M3, M4, and M5. For all countries except for FIN and SWE, the generalized logistic IRT model with item-specific α 1 and α 2 parameters fit the data better than the 2PL model (Model M5). However, only for Finland (FIN) was the constrained generalized logistic IRT model (Model M3) the best-fitting model among the competing IRT models. For six countries, the 3PLRH model (Model M8) was the best-fitting model, while for three countries, the 4PL model (Model M9) was the frontrunner among the models. Interestingly, in nine of the ten countries, the generalized logistic IRT model outperformed the 3PL model. Moreover, in all countries, the 4PL model outperformed the 3PL model. Additionally, the IRT model with the logistic link function fitted the datasets for all countries better than the IRT model with the probit link function. Hence, from a purely statistical perspective, the generalized IRT model or alternative IRT models should be preferred over the operationally used 2PL model [51] because of a better model fit.
In Table 6, the summary statistics of the estimated asymmetry parameters α 1 and α 2 are presented. The joint α 1 parameter from Model M3 ranged between −0.20 and 0.01 (M = −0.08, SD = 0.07) and was mostly negative. In contrast, the joint α 2 parameter from Model M3 was positive and ranged between 0.09 and 0.36 (M = 0.21, SD = 0.09). Overall, almost no differences in the summary statistics between the nonregularized and regularized estimations were observed.

Note. Par = parameter; SD = standard deviation; M1 = α 1 and α 2 item-specific, nonregularized; M2 = α 1 and α 2 item-specific, regularized with λ opt ; M3 = joint α 1 and α 2 ; λ opt = optimal regularization parameter obtained with the cross-validated log-likelihood function.

There is evidence of asymmetry in the IRF (i.e., for items R055Q02, R055Q03, and R067Q04) and guessing behavior (i.e., for items R055Q01 and R067Q01). Interestingly, the estimated IRF of the 3PL model also substantially differs from the generalized logistic IRT model. The item parameters for the generalized logistic IRT model (Model M1) of all 28 items for Germany can be found in Table A2 in Appendix B. In conclusion, the generalized logistic IRT model can more flexibly capture the functional form of the IRF.

Method
In this example, the nonregularized and the regularized generalized logistic item response models are applied to questionnaire data. The adult self-transcendence inventory (ASTI; [52,53]) is a self-report scale measuring the complex target construct of wisdom. The items can be assigned to five dimensions: non-attachment (NA; 4 items), presence in the here-and-now and growth (PG; 6 items), peace of mind (PM; 4 items), self-knowledge and integration (SI; 4 items), and self-transcendence (ST; 7 items). The items had three or four response categories.
A dataset with responses to the ASTI questionnaire has been made available in the MPsychoR package as the data object ASTI [54,55]. It contains polytomous item responses from 1215 respondents.
The polytomous generalized logistic IRT model described in Section 2.2 was applied. The same five analysis models as in the PISA 2006 reading example (see Section 4.1) were specified. In Model M1, the asymmetry parameters α 1 and α 2 were assumed to be item-specific and nonregularized. Model M2 estimated the item parameters by using the optimal regularization parameter λ opt via maximization of the cross-validated log-likelihood function. Model M3 assumed joint α 1 and α 2 parameters across the items. Models M4 and M5 utilized the logistic and probit link functions, respectively (see also [56]). Unidimensional IRT models were separately fitted to each of the five dimensions.

Results
In Table 7, the AIC values are displayed for the four different models M1, M2, M3, and M4. The most complex Model M1, in which the asymmetry parameters α 1 and α 2 were item-specific, was preferred for scales PG and ST. Model M3, which assumed joint shape parameters α 1 and α 2 , resulted in the best model fit for scales PM and SI. The graded response model with the logistic link function (Model M4) was selected by AIC for the NA scale. Interestingly, the logistic link function always resulted in a better model fit compared to the probit link function.

Figure 4 displays the cross-validated log-likelihood values for the five different ASTI scales. The maximum value of the cross-validated log-likelihood function is indicated by a red triangle. It can be seen that the optimal λ value was lowest for the NA and PG scales and largest for the SI scale.

In Table 8, the summary statistics for the α 1 and α 2 parameters are presented. Overall, the means of α 1 and α 2 were relatively similar in Models M1 and M2, which utilized nonregularized and regularized estimation, respectively. Substantial differences in standard deviations for the α 2 parameter were observed for scales SI and ST. These scales had the largest optimal regularization parameter λ opt (see Figure 4), which supports the plausibility of these differences. Note that, except for Model M3 for the PG scale, all of the estimated α 1 and α 2 parameters were (on average) negative.

Note. Par = parameter; SD = standard deviation; M1 = α 1 and α 2 item-specific, nonregularized; M2 = α 1 and α 2 item-specific, regularized with λ opt ; M3 = joint α 1 and α 2 ; λ opt = optimal regularization parameter obtained with the cross-validated log-likelihood function; NA = non-attachment; PG = presence in the here-and-now and growth; PM = peace of mind; SI = self-knowledge and integration; ST = self-transcendence.

Discussion
In this article, nonregularized and regularized maximum likelihood estimations of the generalized logistic IRT model for dichotomous and polytomous items were investigated. It was shown that parameter estimates were practically unbiased in large samples, and variability decreased with larger sample sizes. Moreover, the simulation and the empirical examples indicated that regularized estimation was able to stabilize parameter estimates.
It should be emphasized that the variability of the estimated item parameters in the generalized logistic IRT model can be noteworthy, even in very large sample sizes such as N = 4000. However, as in the three-parameter or four-parameter logistic IRT models, this is likely the case due to the large dependency among the four different item parameters. Nevertheless, estimated item response functions can still be relatively precise, which demonstrates the finding of stable item response functions despite the unstable estimation of item parameters [57]. Using complex IRT models might be preferable when the primary goal is deriving an optimal scoring rule that maximizes the extent of the extracted information from the observed item responses [58,59].
In applications, it is likely that item response functions typically differ for constructed response items and multiple-choice items. It might be interesting and parsimonious to separately estimate α 1 and α 2 for the two item formats but make them equal for items of the same item format. By estimating this, the guessing or slipping effects can be modeled by the generalized logistic IRT model.
As pointed out by an anonymous reviewer, it would also be vital to compare the generalized logistic IRT model to other IRT models, such as the three- or four-parameter logistic models, in the simulation studies. It might well be the case that despite quite different functional forms of the utilized IRT models, there would be only negligible differences in the fitted item response functions of different types of IRT models.
There is a recent discussion about whether distributional assumptions must be taken for granted in ordinal factor analysis for analyzing polytomous items [60]. Most often, ordinal factor analysis in structural equation modeling software relies on the limited-information estimation method that utilizes tetrachoric or polychoric correlations [61]. Using polychoric correlations implies that one assumes an underlying normally distributed variable for each item (i.e., a latent normality assumption; [62-64]). It is argued in [60] that the distributional assumption for the underlying latent variable must be known by the researcher and cannot be identified from data. It is important to emphasize that the issue of non-identification is coupled with the goal of using limited-information methods and computing a latent correlation matrix (i.e., polychoric correlations or correlations adapted to other pre-specified marginal distributions). In other words, those researchers base the ordinal factor analysis on a normal copula model. When applying the generalized logistic IRT model (i.e., the generalized logistic link function) for exploratory or confirmatory factor analysis, residual distributions different from the normal distribution can be identified. In this case, no substantive prior knowledge is required for factor-analyzing ordinal data if there is enough data available for empirical identification.
Appropriate linking methods should be applied that are relatively robust to model misspecifications (see [83]).
Funding: This research received no external funding.
Informed Consent Statement: Informed consent was obtained from all subjects involved in this study.

Data Availability Statement:
The PISA 2006 dataset is available at https://www.oecd.org/pisa/pisaproducts/database-pisa2006.htm (accessed on 23 April 2023). The ASTI dataset is included in the R package MPsychoR and can be accessed within R by data(ASTI, package='MPsychoR').

Conflicts of Interest:
The author declares no conflict of interest.


Appendix A. Item Parameters Used in the Simulation Studies
Table A1 displays the item parameters that were used in the two simulation studies. The table lists the complete set of item parameters used in Simulation Study 2. The asymmetry parameters α i1 for the upper tail of the item response functions ranged between −0.5 and 0.3 (M = −0.13, SD = 0.19). The asymmetry parameters α i2 for the lower tail of the item response functions ranged between −0.4 and 0.7 (M = 0.21, SD = 0.32). The item discrimination parameters a i ranged between 0.5 and 2.3 (M = 1.46, SD = 0.55), while the item difficulty parameters b i ranged between −1.9 and 2.5 (M = −0.16, SD = 1.2).
For Simulation Study 1 (see Section 3.1.1), only the item discrimination parameters a i and item difficulty parameters b i are displayed in Table A1.