Flexible Item Response Modeling in R with the flexmet Package

Feuerstahler, Leah

doi:10.3390/psych3030031

Open AccessTutorial

Flexible Item Response Modeling in R with the flexmet Package

by

Leah Feuerstahler

Department of Psychology, Fordham University, 441 E Fordham Road, Bronx, NY 10458, USA

Psych 2021, 3(3), 447-478; https://doi.org/10.3390/psych3030031

Submission received: 15 July 2021 / Revised: 10 August 2021 / Accepted: 11 August 2021 / Published: 16 August 2021

(This article belongs to the Special Issue Computational Aspects, Statistical Algorithms and Software in Psychometrics)

Download

Browse Figures

Versions Notes

Abstract

:

The filtered monotonic polynomial (FMP) model is a semi-parametric item response model that allows flexible response function shapes but also includes traditional item response models as special cases. The flexmet package for R facilitates the routine use of the FMP model in real data analysis and simulation studies. This tutorial provides several code examples illustrating how the flexmet package may be used to simulate FMP model parameters and data (both for dichotomous and polytomously scored items), estimate FMP model parameters, transform traditional item response models to different metrics, and more. This tutorial serves as both an introduction to the unique features of the FMP model and as a practical guide to its implementation in R via the flexmet package.

Keywords:

item response theory; psychometric scaling; software tutorial

1. Background

Although many applications of item response theory are in the context of parametric models such as the Rasch, two, and three-parameter logistic models [1], there is also a recognized need for models that allow for more flexible relationships between the latent trait and an item category response [2]. Models and techniques that allow for more flexible item response functions are variously known as non-parametric [2], quasi-parametric [3], or semi-parametric [4] models, and include methods such as Mokken scale analysis [5], kernel smoothing [6], and polynomial splines [4]. Historically, flexible item response models have been used to analyze data sets with small sample sizes, to check the assumptions of parametric item response models, and as an alternative to poorly fitting parametric models [2]. In recent years, flexible item response models have increasingly been used in confirmatory contexts. For example, flexible item response models have recently been applied to computerized adaptive testing [7,8], the creation of item banks for measuring health outcomes [9], and the development of optimal scoring procedures [10].

A compelling recent addition to the family of semi-parametric item response models is the filtered monotonic polynomial (FMP) model [3,11]. The general form of the FMP model [12] used in this paper is a generalization of Muraki’s generalized partial credit model [13] (GPCM) that replaces a linear function of the latent trait

θ

with a polynomial expansion of

θ

. As clarified below, constraints are placed on the item parameters so that this polynomial function is a monotonically increasing function of

θ

. For item i with

C_{i}

ordered response categories and person j, the item response function (IRF) of the general FMP model gives the probability of a response in category c,

c = 0, \dots C_{i} - 1

, as

P (X_{i j} = c | θ_{j}, b_{i}) = \frac{exp (\sum_{v = 0}^{c} (b_{0_{v} i} + m_{i}^{★} (θ_{j}, b_{1 i}, \dots b_{2 k_{i} + 1, i})))}{\sum_{u = 0}^{C_{i} - 1} exp (\sum_{v = 0}^{u} (b_{0_{v} i} + m_{i}^{★} (θ_{j}, b_{1 i}, \dots b_{2 k_{i} + 1, i})))},

(1)

where

m_{i}^{★} (θ_{j}, b_{1 i}, \dots b_{2 k_{i} + 1, i}) = b_{1 i} θ_{j} + b_{2 i} θ_{j}^{2} + \dots + b_{2 k_{i} - 1, i} θ_{j}^{2 k_{i} - 1},

(2)

\sum_{v = 0}^{0} (b_{0_{v} i} + m_{i}^{★} (θ_{j}, b_{1 i}, \dots b_{2 k_{i} + 1, i})) \equiv 0

, and

b_{i} = {(b_{0_{1} i}, \dots, b_{0_{C_{i} - 1} i}, b_{1 i}, b_{2 i}, \dots, b_{(2 k_{i} + 1) i})}^{'}

is a vector of item parameters in a polynomial coefficient parameterization.

In Equations (1) and (2), the

k_{i}

parameter is an item-specific non-negative integer that controls the maximum degree of the polynomial function of

θ

. Specifically, the highest-order polynomial equals

2 k_{i} + 1

, and so

k_{i} = 0

, 1, and 2 imply linear, cubic, and fifth-degree polynomial functions of

θ

. Note that this formulation forces an odd value for the highest-order polynomial of

θ

, which is a necessary (but not sufficient) condition for the polynomial to be a monotonic function of

θ

.

A key feature of the FMP model is that it reduces to familiar parametric item response models when

m^{★}

is set to be a linear function of

θ

(i.e., when

k_{i} = 0

). Specifically, if

k_{i} = 0

, Equation (1) reduces to the GPCM, and if both

k_{i} = 0

and

C_{i} = 2

(i.e., scored item responses are dichotomous), then Equation (1) reduces to the two-parameter logistic (2PL) model. Another key feature of this model not shared by many flexible item response models is that its model parameters are portable [3], meaning that the FMP model can be used to construct item banks, conduct adaptive testing, and score examinees not included in the original sample. To better acquaint the reader with the relationship between the FMP model with

k = 0

and other item response models used in popular IRT software packages, several examples of finding FMP parameters from the output of the R packages ltm [14], mirt [15], and TAM [16] are included in Appendix A.

The FMP model requires that

m^{★} (θ)

be a monotonically increasing function of

θ

. To enforce monotonicity, we may use a transformation of the polynomial coefficient parameters

b

. In general, consider the polynomial function

m (θ | b) = b_{0} + b_{1} θ + b_{2} θ^{2} + b_{3} θ^{3} + \dots b_{2 k + 1} θ^{2 k + 1} .

(3)

Equation (3) is a strictly monotonically increasing function of

θ

if and only if the first derivative of

m (θ)

,

\frac{\partial m (θ)}{\partial θ} = b_{1} + 2 b_{2} θ + 3 b_{3} θ^{2} + \dots + (2 k + 1) θ^{2 k},

(4)

is positive at all values of

θ

. One way to enforce positivity is through the following reparameterization of Equation (4) [3,12,17]:

\frac{\partial m (θ)}{\partial θ} = \{\begin{matrix} exp (ω) \prod_{h = 1}^{k} (1 - 2 α_{h} θ + (α_{h}^{2} + exp (τ_{h})) θ^{2} & if k > 0 \\ exp (ω) & if k = 0 . \end{matrix}

(5)

In this parameterization, no boundary constraints are required for

ω

,

α_{h}

, or

τ_{h}

,

h = 1, \dots k

. Because

b_{0_{c} i}

parameters do not affect the monotonicity of

m^{★}

, no transformation of these parameters is necessary, but we use the symbol

ξ_{c i} = b_{0_{c} i}

to use notation consistently across the different parameterizations. Therefore,

b_{0_{c} i} = ξ_{c i}

for

c = 1, \dots, C_{i} - 1

,

b_{1} = exp (ω_{i})

, and the

b_{2}, \dots, b_{2 k_{i} + 1}

parameters are functions of

ω_{i}

,

α_{i}

, and

τ_{i}

calculated using matrix operations described elsewhere [3,11,12,18,19]. Thus, the FMP model is equivalently represented by the polynomial coefficient parameters

b_{i} = {(b_{0_{1} i}, b_{0_{2} i}, \dots, b_{0_{C_{i} - 1} i}, b_{1 i}, b_{2 i}, \dots, b_{(2 k_{i} + 1) i})}^{'}

and the Greek-letter parameters

(ξ_{1 i}, ξ_{2 i}, \dots, ξ_{(C_{i} - 1) i},

ω, α_{1 i}, τ_{1 i}, \dots, α_{(2 k_{i} + 1) i}, τ_{(2 k_{i} + 1) i})^{'}

. In both parameterizations, an item with

C_{i}

response categories and item complexity

k_{i}

is described by

2 k_{i} + C_{i} - 1

item parameters. To better acquaint the reader with the Greek-letter and polynomial coefficient parameterizations, Appendix B includes formulas to calculate the polynomial coefficients from the Greek-letter parameters up to

k = 2

.

2. Specifying the FMP Model in `flexmet`

The R package flexmet provides broad functionality for specifying, fitting, and transforming the FMP model, and many of its features (relative to version 1.1) are illustrated in the remainder of this paper. The IRF for the FMP model is specified using the polynomial coefficient parameters

b_{i}

as described in the previous section. However, the Greek-letter parameterization is also needed when fitting and generating FMP items to ensure monotonicity. The flexmet package includes the greek2b and b2greek functions to navigate between the two different parameterizations. To illustrate these functions, consider a six-item test. Three items (items 1, 2, and 3) are defined for binary item responses, and three items are defined for four-category responses (items 4, 5, and 6). Additionally, items 1 and 4 have

k_{i} = 0

, items 2 and 5 have

k_{i} = 1

, and items 3 and 6 have

k_{i} = 2

. The parameters used for this illustration (both the Greek-letter and polynomial coefficient parameterizations) are printed in Table 1.

The greek2b function outputs the

b

vector associated with the inputted xi, omega, and (optionally) alpha and tau values. For example, the

b

vector for item 1 can be found as follows.

As another example, let us find the

b

vector for item 6. In this case, the greek2b function requires the user also to specify the alpha and tau arguments, as well as multiple xi parameters corresponding to the different response categories. Note that the xi parameters should be given in order from

ξ_{1 i}, \dots, ξ_{(C_{i} - 1) i}

, and the alpha and tau vectors should also be ordered

α_{1 i}, \dots, α_{k_{i} i}

and

τ_{1 i}, \dots, τ_{k_{i} i}

.

It is also possible to represent FMP item parameters with different k values and different numbers of items in the same matrix. To do this, the NA symbol may be used to represent higher-order

ξ

values for items with

C_{i} < max (C_{i})

. In addition, specifying

α_{h} = 0

and

τ_{h} = - \infty

will set the corresponding polynomial coefficients (i.e.,

b_{2 h, i}

and

b_{2 h + 1, i}

) to be equal to 0. For example, to find the matrix of polynomial–coefficient item parameters for the example six-item test, we can bind together calls to greek2b that have xi, alpha, and tau arguments of the same length for each item.

Following from Equation (1), these item parameters can be used to find the probability of responding in each response category as a function of the latent trait

θ

; that is, the IRF. The irf_fmp function calculates item response probabilities for the FMP model given a scalar or vector of latent trait value(s) theta and a matrix or vector of item parameters bmat in the polynomial coefficient parameterization. It may also be necessary to specify the maxncat function, which gives the maximum number of response categories represented in the matrix (such that the first maxncat-1 columns are interpreted as

ξ

parameters). The default value of maxncat = 2, so this argument should be specified if bmat includes at least one polytomous item. Calls to irf_fmp will result in a three-dimensional array with

θ

values in the first dimension, items in the second dimension, and response categories in the third dimension. For example,

In the above output (and elsewhere in this tutorial), informative dimension labels have been added to ease readability. This output shows, for example, that the probabilities of responding in categories 0, 1, 2, and 3 to item 6 for a person with

θ = - 1

equal 0.4118, 0.3800, 0.1849, and 0.0233. Notice that the output also includes some NA values. This is because items 1, 2, and 3 have only 2 response categories, and therefore these subjects can only respond in categories 0 and 1 and categories 2 and 3 are NA.

For dichotomous items, it is common to only find the probability of responding in the higher response category (because the sum of response category probabilities must sum to 1 for each

θ

, probabilities for category 0 are 1 minus those for category 1). In flexmet, specific category response probabilities can be found by adding the returncat argument. This is illustrated below for item 1:

Notice that the maxncat argument is set equal to 4 in this example, even though the item is dichotomous. This is because the item parameters are taken from the first row of bmat, and bmat includes columns corresponding to four-category items. In other words, maxncat should be set to one greater than than the number of bmat columns (or one greater than the number of bmat entries, if bmat is a vector) that correspond to

ξ

parameters, even if no item with maxncat categories is included in the call to bmat. Note that the returncat argument represents the value of the category to output, where categories are labeled starting at 0. Therefore, returning category “1” for dichotomous item 1 returns the probability of a positive item response to item 1. By default, if maxncat = 2, only probabilities for category “1” are returned, and if maxncat > 2, all response category probabilities are returned.

Calls to irf_fmp also can be used to plot IRFs. Figure 1 displays the IRFs for the six example items, and tge code to produce a version of this figure is included in Appendix C. In this figure, response probabilities for category 1 (a positive response) are shown for the 3 dichotomous items, and all response category probabilities are shown for the 3 polytomous items.

3. Fitting the FMP Model in `flexmet`

The flexmet package includes functionality for specifying, fitting, and manipulating the general FMP model. Notably, flexmet is not the only R package available for fitting the FMP model. For example, the mirt package [15] includes functionality to estimate FMP item parameters using marginal maximum likelihood estimation. However, if any items have

k = 0

, the mirt package currently (version 1.34 of mirt is current at the time of writing) requires the user to estimate either 2PL or GPCM parameters rather than FMP parameters. Because mirt parameterizes the 2PL and GPCM differently than the Greek-letter FMP parameterization, this may make the mirt package less than ideal for tests with a variety of

k_{i}

values and for advanced applications of the FMP model (such as scale transformations, as described in a later section). Therefore, the flexmet package includes several ways to estimate FMP model parameters, including built-in methods for fixed-effects and random-effects estimation, as well as a wrapper that uses the mirt package to estimate parameters and return parameter estimates in a standardized format. As illustrated below, flexmet facilitates item parameter estimation for items with any combination of

k_{i}

values and numbers of response categories, with or without the use of Bayesian priors.

Early applications of the FMP model treated the latent trait as a fixed effect to estimate item parameters [3,11]. In this fixed-effects approach, initial estimates of the

θ

parameters are treated as known quantities when calculating maximum likelihood estimates of the Greek-letter parameters for an individual item. These fixed

θ

values are called

θ

surrogates [3,11] and are calculated as the first principal components scores of the full data matrix. In the following code chunk, 1000 true

θ

values are simulated from a standard normal distribution. Then, the sim_data function in flexmet is used in conjunction with the six-item bmat matrix defined earlier to randomly generate item response data. Finally,

θ

surrogates are calculated by passing the simulated data to flexmet’s get_surrogates function.

The tsur object now includes 1000

θ

surrogate values calculated from the simulated data. To estimate item parameters for a single item using fixed-effects estimation with

θ

surrogates, we can use the fmp_1 function in flexmet. As illustrated below, this function requires the user to specify the data vector, the desired k value, and a vector of

θ

surrogates tsur. Below, this is illustrated for k values of 0, 1, and 2.

The best choice of k for a given item is typically unknown, and so authors have suggested comparing the item-level Akaike Information Criteria (AIC) value to select the optimal k value [3,12]. We can perform this comparison by extracting the AIC list element from each call to fmp_1.

In this example, k = 1 leads to the lowest AIC value. Incidentally,

k = 1

is also the data-generating k value for this item (item 2), though this will not always be the case. Note that it is not always desirable to seek the “correct”

k_{i}

value, but instead it may be preferable to approximate the population curve as closely as possible without overfitting the data. One measure of the similarity of item response functions is the root integrated mean squared error (RIMSE; [6,12]), which is defined here as

RIMSE = \sqrt{\int {(\sum_{c = 0}^{C_{i} - 1} x P_{1} (X_{i} = c | θ) - \sum_{c = 0}^{C_{i} - 1} x P_{2} (X_{i} = c | θ))}^{2} g (θ) d θ}

(6)

where

P_{1}

and

P_{2}

represent the two item response functions to compare (not necessarily from the FMP model), and

g (θ)

indicates a

θ

distribution to integrate over. Smaller values of RIMSE indicate greater similarity between the two curves. In flexmet, the rimse function can calculate the RIMSE for any combination of

b

-vectors that represent the same number of response categories (though not illustrated here, rimse can also be used with non-FMP item response functions). In flexmet,

g (θ)

is standard normal by default but can be modified using the int argument, which expects a matrix with two columns. The first column should include a sequence of quadrature points, and the second column should include the densities of each quadrature point, scaled so that the densities sum to 1. The int_mat function in flexmet facilitates the creation of this matrix. The int_mat function takes a distribution function distr (such as dnorm or dunif), a named list args of the parameters of that distribution, the lower and upper bounds of the quadrature points, lb and ub, and the number of quadrature points, npts. An example of modifying these arguments of the int_mat function is included later in this paper (Section 4.2).

The first two arguments to rimse should be two vectors of

b

-parameters, in either order. In the code below, the true

b

-parameters for item 2 are listed first. Note that because columns 2 and 3 of bmat represent category intercept parameters for polytomous items (and include NA values for item 2), we omit these from the

b

-vector when calling rimse. The estimated

b

-parameters are listed second and are found in the bmat list element of each call to fmp_1. If items are polytomous, the ncat argument should also be specified to indicate the number of response categories. Because the default value of ncat = 2, it is not necessary to include this argument for dichotomous items.

Comparing the RIMSE of the estimated curves versus the data-generating curves for 3 values of k, we see that k = 0 leads to the highest error in estimation, followed by k = 2, and k = 1 most closely traces the population item response function.

It is also possible to estimate fixed-effects item parameters for multiple items in one command using the fmp function with the em = FALSE argument. This method will automatically calculate

θ

surrogates based on the provided data matrix. In the fmp function, it is possible to specify different k values for different items. Namely, if a scalar is specified for the k argument, the same k value will be used for all items. Otherwise, a vector of k values should be specified, one per item.

In the example below, the fixed-effects FMP model is fit several times. First, all items are fit with

k = 0

,

k = 1

, and

k = 2

. Based on these results, a model with differing k values is specified based on the optimal k value, as indicated by comparing item-level AIC values.

The user may notice that the model with k = 2 produces an error that the (item parameter) information matrix cannot be inverted. This is a common problem with high k values. If this error occurs, standard errors are not available for the estimated item parameters; however, the item parameter estimates themselves may be used without concern.

Based on the above results, we see that the lowest AIC value is observed for

k =

0, 0, 2, 2, 0, and 1 for the 6 items. We may then choose to fit a new fixed-effects FMP model with these varying k values, as illustrated below.

In the item parameter matrix printed above, notice how items with different k values and numbers of response categories are represented. Specifically, NA’s are used as placeholders for items 1–3 that include less than the maximum number of response categories. In contrast, if

k_{i}

is less than the maximum

k_{i}

value represented in the parameter matrix (as is the case for items 1, 2, 5, and 6), then the higher-order

b

parameters are set to 0.

In addition to fixed-effects estimation, flexmet also provides functionality for random-effects estimation using marginal maximum likelihood estimation via the expectation-maximization (EM) algorithm [20]. The flexmet package includes both an inbuilt algorithm for estimating item parameters (using the fmp function with option em = TRUE) and a wrapper to the mirt [15] package (using the fmp function with option em = "mirt"). The mirt algorithm is currently faster and more reliable than the inbuilt algorithm, and so this option is used for illustration. Note that the mirt package must be installed in order to use this option. Using the same pattern of k values found with the fixed-effects model, we find the following results:

At a glance, comparing the bmat outputs of fe_mixed and re_mixed indicates that while some parameter estimates are similar across the two estimation methods, others are quite different. Notably, for highly parameterized models such as the FMP model, very different sets of parameters can trace similar curves. For this reason, it is useful to plot curves or to apply overall summary measures such as the RIMSE when comparing estimated curves to the true curve or to each other. These RIMSE values are calculated for fe_mixed and re_mixed after illustrating one final estimation method.

Another useful estimation option available for both the fmp_1 and fmp functions is the use of Bayesian priors. Particularly, for higher values of k, the FMP model may become computationally unstable, which can be somewhat alleviated through the use of priors [12]. Because the Greek-letter parameterization of the FMP is used for parameter estimation, priors are placed on the Greek-letter parameters rather than on the more readily interpretable

b

parameters. When choosing priors, it may be helpful to note that

(ξ_{1 i}, \dots ξ_{(C_{i} - 1) i}) = (b_{0_{1} i}, \dots b_{0_{C_{i} - 1} i})

and that

ω = ln b_{1}

. In addition, higher-order

b

parameters will equal zero if the corresponding

α = 0

and

τ = - \infty

. Prior predictive simulation may help the user to select appropriate priors. For example, the following code produces a prior predictive simulation of 20 item response curves generated from the following distributions:

ξ \sim N (0, 1)

,

ω \sim N (

−

0.5, 1)

,

α \sim N (0, 0.5)

, and

τ \sim N (- 3, 0.5)

. The plot produced by this code is shown in Figure 2. This choice of priors appears to allow for a variety of shapes and locations of the FMP curves, and so we continue to illustrate Bayesian estimation with these priors.

Priors can be added to models fit using flexmet by specifying the prior argument to the fmp or fmp_1 function. As of version 1.1 of flexmet, only normally distributed priors are available, and the same priors are applied to all model parameters of a given class (i.e., for all items in the data set and for all instances of that parameter type within an item). To specify priors, a list with named elements xi, omega, alpha, and tau should be passed to the prior argument. For example, to specify a standard normal prior for all

ξ

s, we may write prior = list(xi = c("norm", 0, 1)). The code below illustrates Bayesian estimation with the em = "mirt" option.

Finally, we may compare the accuracy of each of the three illustrated estimation methods using the rimse function. The code below does this for each of the six simulated items. For each item, random-effects estimation tends to lead to somewhat more accurate curves than fixed-effects estimation, and the use of priors improves parameter estimation accuracy in some, but not all, cases.

Several other estimation options are available with the fmp and fmp_1 functions that are not illustrated here. A vector of starting values may be specified with the start_vals argument, where parameters should be listed in the same order as the estimated parameters given in the parmat list element of the fitted model object. For fixed-effects estimation, the method argument indicates the optimization algorithm (passed to the optim function in R’s stats package). For random-effects estimation, the max_em option indicates the maximum number of EM cycles (default 500), and the n_quad function indicates the number of quadrature points (default 49). Additional named arguments may be passed to the optim function (if em = TRUE) or the mirt function (if em = "mirt").

Finally, flexmet includes the th_est_ml and th_est_eap functions for maximum likelihood (ML) and expected a posteriori (EAP, [21]) person parameter estimation. Both of these functions require a data matrix dat and a matrix of

b

-parameters bmat. If at least one item is polytomous, the maxncat argument should also be specified. By default, the th_est_eap function uses a standard normal prior with 33 quadrature points, and this may be modified using the int argument in conjunction with int_mat. A call to either trait estimation function will result in a matrix with two columns: one for the

θ

parameter estimates and one for the standard errors (for ML) or posterior standard deviations (for EAP). The code below illustrates this procedure for the example data and the re_priors estimated item parameters.

4. Transforming the FMP Model with `flexmet`

Aside from the increased flexibility in IRF shapes afforded by the FMP model, another potential use of the FMP model is to transform item response models linearly or nonlinearly [18,19]. Specifically, suppose that two scalings of the latent trait,

θ

and

θ^{★}

, are related by a monotonic polynomial function of degree

2 k_{θ} + 1

such that

θ = t_{0} + t_{1} θ^{★} + t_{2} θ^{★ 2} + \dots t_{2 k_{θ} + 1} θ^{★ 2 k_{θ} + 1},

(7)

where

k_{θ}

is a non-negative integer. Then, an FMP model defined on the scale of

θ

may be transformed to the scale of

θ^{★}

using matrix operations described elsewhere [18,19]. For a given

k_{θ}

and an item with a given

k_{i}

(i.e., item complexity defined on the scale of

θ

), the item complexity on the scale of

θ^{★}

,

k_{i}^{★}

, equals

k_{i}^{★} = 2 k_{i} k_{θ} + k_{i} + k_{θ} .

(8)

Therefore, item response functions on the scale of

θ^{★}

will necessarily have larger item complexities than response functions on the scale of

θ

.

Two applications of nonlinearly transforming the FMP model will be illustrated below. In the first application, item parameters are estimated separately for two groups with different latent trait distributions. Because the latent trait is often assumed to follow a standard normal distribution for model identification, the fitted item response models will be on different, nonlinearly related, scales. Linear and nonlinear linking transformations will be estimated and compared. In the second application, an item response model will be transformed such that the latent trait scale is on a more interpretable percentage-correct score metric.

4.1. Item Parameter Linking with the FMP Model

If item parameters for the same items are estimated from two different samples, item parameter linking may be used to put the parameters on the same scale. The need for item parameter linking arises from differences in model identification constraints [22]. For the most common parametric item response models, such as the 2PL and GPCM, the chosen functional form of the item response model identifies the scale of the latent trait up to a linear transformation. In addition, models may be identified by assuming that the latent trait is standardized, or that the latent trait follows a standard normal distribution (as is the case for the fixed-effects FMP estimation illustrated above). Therefore, if there are any differences in the location, variance, or shape of the latent trait distribution between groups, linking may be necessary to put the two sets of item parameters on more similar scales. To illustrate how item parameters on different scales may be linked with the FMP model, an illustrative simulation is presented next.

To simulate data, a population set of FMP item parameters was first generated for 20 items each with two response categories and

k = 0

using the sim_bmat function in flexmet. With the sim_bmat function, the user can generate items with varying k values and numbers of response categories by specifying vectors for the k and ncat functions. The population item parameters were then used to generate data for two groups of 5000 examinees. The population latent trait values for the first group were first drawn from a standard normal distribution, then transformed by taking −2 plus e to the power of the standard normal scores divided by 2. The population latent trait values for the second group were drawn from a standard normal distribution. From this design, because the second group’s

θ

values were transformed to the scale of the first group’s scores (which were identified by a combination of the shape of the fitted model and the assumption of a standard normal distribution), the population relationship between the two scales equals

θ = - 2 + exp (θ^{★} / 2) .

(9)

After generating population

θ

values, data were then generated separately for each group. To select the k values that best represent these data, each item for each group was first fit to a series of FMP models using the fmp_1 function based on theta surrogates. Specifically, each item was first fit with

k = 0

and

k = 1

. If the AIC for

k = 0

was smaller than for

k = 1

, then the final model was fit with

k = 0

. If not, models with sequentially higher k values were fit until the model with the smaller k value had a smaller AIC than the model with the higher k value. The code to implement this procedure is included in Appendix D and resulted in a mixture of

k = 0

and

k = 1

for both groups. Once the empirically selected k values for each item and group were chosen, item parameters were estimated again with fmp and the em = "mirt" option. For group 1, the argument technical = list(NCYCLES = 1500) was passed to mirt because more than 500 cycles of the EM algorithm (the default in mirt) were required for this model to converge.

The flexmet package includes two functions for linking item parameters: hb_link for linking based on the Haebara method (HB) [23] and sl_link for linking based on the Stocking–Lord method (SL) [24], and both functions take the same arguments. The bmat1 and bmat2 arguments give the matrices of item parameters in the polynomial coefficient parameterization. These functions currently assume that both matrices include the same set of common items in the same order. The bmat2 parameters are on the

θ^{★}

metric and are transformed to the

θ

metric of the bmat1 parameters. The polynomial degree of the estimated transformation is specified by the k_theta argument (monotonicity of the estimated linking transformation is ensured using the same strategy as is used for item parameters).

From the objects returned by calls to hb_link and sl_link, the value list element gives the function value from minimizing the HB or SL criterion (see [23,24] for details). Comparing these values resulting from the HB or SL method with different

k_{θ}

can help inform which

k_{θ}

yields the most appropriate latent trait transformations, where smaller values indicate smaller differences between the estimated item/test response functions on the two scales. Although criteria values from calls to hb_link should not be compared to values from calls to sl_link, comparing values with different

k_{θ}

suggests that

k_{θ} = 1

leads to the closest match between the two sets of item parameters under both the SL and HB methods.

Calls to hb_link and sl_link also output the estimated vector of polynomial coefficients in the tvec list element. As shown below, although these values are similar for the HB and SL methods, there are some differences, particularly for higher-order polynomial coefficients.

The tvec output is useful for understanding the estimated relationship between the

θ

and

θ^{★}

scales. Recall from Equation (7) that

θ

is a polynomial function of

θ^{★}

, and Equation (9) gives the population relationship between

θ

and

θ^{★}

. In Figure 3, the SL linking transformations are compared to the true linking transformation. The code to reproduce a version of this figure is included in Appendix E and makes use of flexmet’s inv_poly function to calculate the

θ

value that corresponds to any particular value of

θ^{★}

. Although in non-simulation applications, the true latent trait transformation will not be known, plotting may still be a useful tool to inspect the similarity of different transformations.

To illustrate the meaning of the linking transformation, consider a

θ

value of 0.5. In the code below, this

θ

value is transformed to a

θ^{★}

value using the inv_poly function, which takes a scalar value (here, each

θ

) and returns the

θ^{★}

value that combines with the polynomial coefficient vector coefs (see Equation (7)) to produce

θ

. As shown below for the SL transformation with

k_{θ} = 1

, the item response probabilities for theta combined with the originally estimated bmat are identical to the transformed thetastar values combined with the transformed item parameters (which are found in the bmat list element of the linking output). This equality is demonstrated for the first three items of the simulated test.

4.2. Transforming the FMP Model to a User-Defined Scale

Another potential application of FMP model transformations is to specify an item response model on a latent trait metric other than

θ

. In the following illustration, flexmet functions are used to transform a set of 3 parameter logistic (3PL) model item parameters from the typical

θ

metric to a percentage-correct (PC) metric where the lowest score is 0, the highest score is 100, and scores roughly correspond to the percentage of items answered correctly. Although this type of model transformation is illustrated using a PC metric, any monotonic transformation is possible, so long as the user can identify a suitable polynomial approximation to the desired transformation. Transforming the entire item response model to a more interpretable metric allows for trait estimates, item information, standard errors, and other quantities to be calculated directly on the reported score metric.

The item parameters used for the following illustration are taken from the tcals data set found in version 3.16 of the catR package [25] (and also included in Appendix F). Note that these parameters correspond to the following representation of the 3PL model:

P (X_{i j} = 1 | θ_{j}, a_{i}^{★}, b_{i}^{★}, c_{i}^{★}) = c_{i}^{★} + \frac{1 - c_{i}^{★}}{1 + exp (- a_{i}^{★} (θ_{j} - b_{i}^{★}))},

(10)

where

a_{i}^{★}

indicates the item discrimination parameter,

b_{i}^{★}

indicates the item difficulty parameter, and

c_{i}^{★}

indicates the guessing, or lower asymptote, parameter. The star notation is used for these item parameters only to avoid confusion with previously defined notation.

Note that the 3PL presented in Equation (10) is not a special case of the FMP model as presented in Equation (1). However, for dichotomous item responses, an FMP model may be specified that includes upper (

d_{i}^{★}

) and/or lower (

c_{i}^{★}

) asymptotes on the item response functions [26] as follows:

P (X_{i j} = 1 | θ_{j} b_{i}, c_{i}^{★}, d_{i}^{★}) = c_{i}^{★} + \frac{d_{i}^{★} - c_{i}^{★}}{1 + exp (- b_{0} - m^{★} (θ_{j}, b_{1 i}, \dots, b_{2 k_{i} + 1, i}))} .

(11)

If not otherwise specified, we may assume that all

d_{i}^{★} = 1

and all

c_{i}^{★} = 0

. In many flexmet functions, upper and lower asymptote parameters may be specified using the cvec and dvec arguments (note that flexmet currently does not allow these asymptote parameters to be estimated with the fmp or fmp_1 functions; however, most other flexmet functions accommodate asymptote parameters). These asymptote parameters are unaffected by transformations of the latent trait metric.

In the code below, the tcals parameters are read in first. Then, an object fmp_pars is defined that transforms the 3PL

a_{i}^{★}

and

b_{i}^{★}

parameters to the polynomial coefficients required by the FMP model. Then, a sequence of

θ

values is generated between −4 and 4, and item response probabilities are calculated for each item and

θ

value. The expected percentage-correct (PC) scores are then calculated by summing the item response probabilities for each

θ

value and multiplying them by

100 / 85

(where 85 is the number of test items). We then regress the

θ

values on the PC scores using monotonic polynomial regression [27] as implemented in version 0.3–10 of the Monopoly package [28] for R. Several regressions are fit using k values (which are used as

k_{θ}

values) from 0 to 5, where, as before,

2 k + 1

indicates the maximum polynomial degree. Finally, the average regression residual is printed for each k value.

The average regression residuals indicate that higher k values lead to better approximations of the desired curve. However, choosing a higher k value will lead to many higher-order coefficients for the transformed item parameters, so we may choose to use the smallest k value that yields an acceptable level of accuracy. For instance, if we decide that an average regression residual of less than 0.1 is acceptable, then we may choose to move forward with the

k = 1

transformation. The estimated regression coefficients then become the polynomial coefficients with which to transform fmp_pars (i.e., the t coefficients for Equation (7) with

k_{θ} = 1

). We can implement this transformation by applying the transform_b function in flexmet to each row of fmp_pars, specifying the

θ

transformation coefficients in the tvec argument, as illustrated below.

Note again that the

c_{i}^{★}

parameters remain unchanged during FMP parameter transformations, and so the argument cvec = tcals$c should be passed to any flexmet function that makes use of these parameters. For example, we may compare two methods of calculating trait estimates on the transformed

θ^{★}

/PC metric: first, estimating person parameters on the

θ

metric and then applying the polynomial transformation; and estimating person parameters directly on the

θ^{★}

/PC metric. In this example, we use a standard normal prior to estimate EAPs on the

θ

metric, and a

N (80, 10)

prior to estimate EAPs directly on the

θ^{★}

/PC metric. This prior is specified by feeding the output of a call to the int_mat function to the int argument. In the example code shown below, both priors are specified with 89 quadrature points, the prior on

θ

is bounded between 0 and 4, and the prior on

θ^{★}

is bounded between 0 and 100. Finally, the EAPs that were estimated on the

θ

metric are transformed to the

θ^{★}

/PC metric using the inv_poly function.

Comparing the distribution of trait estimates calculated with both approaches, we see that the scores are relatively similar (the small difference in estimated distributions may be due to the use of different prior distributions). However, a major advantage of estimating parameters directly on the

θ^{★}

/PC metric is that the estimated standard errors are on the appropriate metric. That is, nonlinear monotonic transformations of the latent trait metric can lead to unexpected distortions of the information matrices [19,29]. An illustration of the effects of this transformation is illustrated in Figure 4, and the code to reproduce a version of this figure is included in Appendix F. This code makes use of the iif_fmp function in flexmet, which has similar arguments to the irf_fmp function and yields item information for any FMP items and

θ

values. In the upper panels of Figure 4, three example IRFs (taken from the 85 tcals items) are displayed on the

θ

metric and after being transformed to the

θ^{★}

/PC metric. Notice that although the shape of each IRF changes, the relative relationships among the three curves do not change. In the bottom panels, the test information function (i.e., the sum of item-level information functions) is illustrated for both the

θ

and

θ^{★}

metrics. Whereas test information on the

θ

metric is unimodal and centered just below

θ = 0

, the

θ^{★}

metric is bimodal with peaks around

θ^{★} = 35

and

θ^{★} = 90

. This result implies that the items that the trait regions measured most accurately on the

θ

metric are not necessarily those measured most accurately on the

θ^{★}

metric. This result also highlights the importance of calculating information directly on the metric on which scores are reported.

5. Summary

This paper provides an introduction to the specification and use of the FMP model through the flexmet package for R. Version 1.1 of flexmet includes functionality to estimate FMP item parameters for both dichotomous and polytomous items, as well as methods to transform FMP item parameters, generate FMP item parameters and data, and several other methods that are particularly useful when working with the FMP model. The provided examples also illustrate some scenarios in which it may be desirable to use the FMP model, including fitting IRFs that can take on more flexible shapes than allowed by traditional models and transforming item response models to user-defined metrics (e.g., a sum score or proportion-correct metric). Moreover, it is hoped that broader use of the flexmet package will lead to further development of the flexmet package, including planned features such as more choices of prior distributions for Bayesian estimation and the estimation of upper and lower asymptote parameters. Although not every feature of the current flexmet package is illustrated in the examples presented above, we hope that this tutorial provides the reader with an accessible introduction to the FMP model and the tools to use the FMP model with the flexmet package.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All information required to fully reproduce the results in this paper is included in text.

Acknowledgments

Many thanks to Xing Chen for his assistance testing code and suggesting improvements and to three anonymous reviewers for their suggestions in improving the readability and accessibility of this tutorial.

Conflicts of Interest

The author declares no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

2PL	Two-parameter logistic
3PL	Three-parameter logistic
AIC	Akaike information criterion
EAP	Expected a posteriori
EM	Expectation-maximization
FMP	Filtered monotonic polynomial
GPCM	Generalized partial credit model
HB	Haebara
IRF	Item response function
ML	Maximum likelihood
PC	Percentage-correct
RIMSE	Root integrated mean squared error
SL	Stocking–Lord

Appendix A. Relationship of the FMP Model (k = 0) to Equivalent Models in Other Software Packages

The purpose of this appendix is to illustrate how to transform 2PL and GPCM item parameters from various software packages to the item parameters used by the FMP model with

k = 0

. These transformations are useful for fully understanding the FMP model and some of its applications (e.g., see Section 4.2). Although these examples are in the context of specific R packages, they are broadly representative of the most common parameterizations of the 2PM and GPCM, and well as variations of these models such as the 3PM (see also Section 4.2) and the partial credit model.

In the following examples, 2PL or GPCM data are generated using the flexmet package (item parameter and data generation are described in Section 3). The data are then fit to the 2PL or GPCM using the ltm version 1.1-1 [14], mirt version 1.34 [15], and TAM version 3.7-16 [16] packages, the estimated item parameters are transformed to FMP parameters. The equivalence of the different parameterizations is then verified by comparing category response probabilities using flexmet’s irf_fmp function (introduced in Section 2) and each package’s inbuilt function for calculating response probabilities. Throughout, packages are called explicitly using R’s :: syntax to avoid potential package conflicts and to clarify which commands belong to which packages. Note also that this appendix uses flexmet functions and conventions that are more fully described within the main tutorial.

Appendix A.1. Two-Parameter Model

In Equation (1), the portion of the numerator within the

exp ()

function when

k = 0

and

C_{i} = 2

(i.e., the conditions which make the FMP model equivalent to the 2PL) equals

b_{0 i} + b_{1 i} θ .

(A1)

The following code demonstrates how to find the FMP

b_{0 i}

and

b_{1 i}

parameters from other parameterizations of the 2PL. To begin, we first generate 2PL data.

The ltm package with option IRT.param = FALSE and the mirt package both parameterize the 2PL in the same way as the FMP model. This is illustrated first for ltm.

The first column of ltm_fit1_FMP is labeled (Intercept) and is equivalent to

b_{0 i}

. The second column of ltm_fit1_FMP is labeled z1 and is equivalent to

b_{1 i}

. This is verified by comparing predicted response probabilities.

When fitting the 2PL in mirt, the same parameterization is used as for the FMP model, but mirt will print the

b_{i 1}

parameters (labeled a in mirt) before the

b_{0 i}

parameters (labeled d inmirt), so the columns must be reordered.

A common way for the 2PL to be parameterized is in terms of the difficulty parameter

d i f f_{i}

and a discrimination parameter

d i s c_{i}

. In this parameterization, the expression inside the

exp ()

(see Equation (A1)) is

d i s c_{i} (θ - d i f f_{i}) .

(A2)

Therefore, the FMP

b_{01} = - d i f f_{i} d i s c_{i}

and the FMP

b_{1 i} = d i s c_{i}

. This difficulty-discrimination parameterization is used by the ltm package with option IRT.param = TRUE, as illustrated next.

Here, we see that ltm stores the difficulty parameters in the first column and the discrimination parameters in the second column. Next, we transform these parameters to

b_{0 i}

and

b_{1 i}

and verify that these parameters yield the same FMP response probabilities as ltm’s inbuilt function.

Finally, the TAM package uses a slightly different expression for the 2PL than the previously explored packages. Namely, the expression within the numerator

exp ()

(see Equation (A1)) is

B_{i} θ - ξ_{i} .

(A3)

Therefore, the TAM 2PL model output is related to the FMP model in that

b_{0 i} = - ξ_{i}

and

b_{1 i} = B_{i}

. Again, we can verify that this transformation results in the same predicted response probabilities (up to rounding error). Note that for the TAM package, the function that calculates response probabilities is located in the CDM package and appears to only be available for a certain grid of

θ

values. The 9th, 11th, and 13th of these

θ

values are used below for illustration.

Appendix A.2. Generalized Partial Credit Model

In Equation (1), the portion of the numerator within the

exp ()

function when

k = 0

and

C_{i} > 2

(i.e., the conditions that make the FMP model equivalent to the GPCM) equals

\sum_{v = 0}^{c} (b_{0_{v} i} + b_{1 i} θ) .

(A4)

The following code demonstrates how to find the FMP

b_{0_{c} i}

and

b_{1 i}

from other parameterizations of the GPCM. To begin, we first generate GPCM data with five response categories.

The ltm package with the IRT.param = FALSE option produces GPCM item parameter estimates that are already in the same format that the flexmet package uses for the FMP model.

Here, columns Catgr.1 through Catgr.4 include parameters

b_{0_{1} i}

through

b_{0_{4} i}

, and the Dscrmn column includes the

b_{1 i}

parameters. We can verify that these are equivalent by comparing the response probabilities calculated from each package. Here (and through the rest of this section), any minor differences in predicted probabilities can be attributed to rounding.

The ltm package with the IRT.param = TRUE argument uses a slightly different parameterization of the GPCM. Namely, the numerator expression within the

exp ()

(see Equation (A4)) is

\sum_{v = 0}^{c} D s c r m_{i} (θ - C a t g r_{c i})

so that

b_{0_{c} i} = - C a t g r_{c i} D s c r m_{i}

and

b_{1 i} = D s c r m_{i}

.

We can then transform the outputted ltm parameters to FMP parameters and verify that each leads to the same predicted probabilities.

Next, the mirt package with option itemtype = "gpcmIRT" is very similar to the ltm package with option IRT.param = TRUE. The expression within the numerator

exp ()

(see Equation (A4)) is

\sum_{v = 0}^{c} a_{i} (θ - b_{c i})

so that the FMP

b_{0_{c} i} = - b_{c i} a_{i}

and the FMP

b_{1 i} = a_{i}

.

Here, the

a_{i}

parameters are in the first column, and the

b_{1 i}

–

b_{4 i}

parameters are in columns 2 through 5. The following code implements the appropriate transformation and verifies its equivalence to the FMP model as implemented in flexmet.

The mirt package also includes the itemtype = "gpcm" option for fitting the GPCM. Here, the expression in the numerator

exp ()

(see Equation (A4)) is

c a_{i} θ + d_{i}

(note the lack of the summation operator), where c is the response category (e.g., 1, 2, 3, or 4). Because this expression is equivalent to the cumulative sum in the FMP model expression, some of the FMP-equivalent parameters must be found through subtraction. Namely, the FMP

b_{1 i} = a_{i}

,

b_{0_{1} i} = d_{1}

, and

b_{0_{c} i} = d_{c} - d_{c - 1}

for

c > 1

. Below, we fit the model and inspect the outputted model parameters.

Here, we only want to use the columns of estimated parameters: a1, d1, d2, d3, and d4. In the code below, the appropriate FMP parameters is found by manually subtracting the appropriate parameter estimates.

Finally, we illustrate the equivalence of the FMP model and the the GPCM as estimated in TAM. The parameterization used here is similar to that used by mirt with the itemtype = "gpcm" option. Here, the expression inside the

exp ()

of Equation (A4) is

B_{c i} θ + A_{c i}

.

Note in the above that the columns beginning with B.Cat are integer multiples of each other; that is, the values in B.Cat2.Dim1 are double those in B.Cat1.Dim1, those in B.Cat3.Dim1 are triple those in B.Cat1.Dim1, and so forth. Therefore, we only need the B.Cat1.Dim1 column, which is equivalent to the FMP

b_{1 i}

. The columns beginning with AXsi are equivalent to the negative

d_{i}

parameters in the previous mirt example (i.e., mirt with option itemtype = "gpcm"). As such, subtraction could be used to find the FMP parameters in the same manner as the previous mirt example. However, it is simpler to use the estimated xsi parameters in TAM, printed below.

Notice that for item 1, the xsi for category 1 is the same as AXsi_.Cat1, the xsi for category 2 equals AXsi_.Cat2 - AXsi_.Cat1, and so forth. Therefore, the negative xsi parameters are equal to the FMP

b_{0_{c} i}

parameters. The following code builds a matrix of FMP parameters from the TAM output and verifies that both methods produce the same category response probabilities.

Appendix B. Relationship between Greek-Letter and Polynomial Coefficient Parameterizations of the FMP Model

This appendix presents explicit formulas to transform Greek-letter FMP parameters to FMP polynomial coefficients for items with

k_{i} \in {0, 1, 2}

. Matrix equations to implement these equations more efficiently, as well as extensions to larger k values, are available in other sources [3,11,12,18,19]. These equations are provided for didactic purposes only; the practical implementation of these transformations can always be performed using the greek2b and b2greek functions in flexmet.

Appendix B.1

k = 0

\begin{matrix} b_{0_{c} i} & = ξ_{c i} \end{matrix}

(A5)

\begin{matrix} b_{1 i} & = exp (ω_{i}) \end{matrix}

(A6)

Appendix B.2

k = 1

\begin{matrix} b_{0_{c} i} & = ξ_{c i} \end{matrix}

(A7)

\begin{matrix} b_{1 i} & = exp (ω_{i}) \end{matrix}

(A8)

\begin{matrix} b_{2 i} & = - α_{1 i} exp (ω_{i}) \end{matrix}

(A9)

\begin{matrix} b_{3 i} & = (α_{1 i}^{2} + exp (τ_{1 i})) \frac{exp (ω_{i})}{3} \end{matrix}

(A10)

Appendix B.3

k = 2

\begin{matrix} b_{0_{c} i} & = ξ_{c i} \end{matrix}

(A11)

\begin{matrix} b_{1 i} & = exp (ω_{i}) \end{matrix}

(A12)

\begin{matrix} b_{2 i} & = - (α_{1 i} + α_{2 i}) exp (ω_{i}) \end{matrix}

(A13)

\begin{matrix} b_{3 i} & = (α_{1 i}^{2} + α_{2 i}^{2} + exp (τ_{1 i}) + exp (τ_{2 i}) + 4 α_{1 i} α_{2 i}) \frac{exp (ω_{i})}{3} \end{matrix}

(A14)

\begin{matrix} b_{4 i} & = - (α_{1 i} (α_{2 i}^{2} + exp (τ_{2 i})) + α_{2 i} (α_{1 i}^{2} + exp (τ_{1 i}))) \frac{exp (ω_{i})}{2} \end{matrix}

(A15)

\begin{matrix} b_{5 i} & = (α_{1 i}^{2} + exp (τ_{1 i})) (α_{2 i}^{2} + exp (τ_{2 i})) \frac{exp (ω_{i})}{5} \end{matrix}

(A16)

Appendix C

Code to Reproduce Figure 1.

Appendix D

Code to Find AIC-Based

k

Values for Simulated Data.

Appendix E

Code to Reproduce Figure 3.

Appendix F

Code to Reproduce Figure 4.

References

Birnbaum, A. Some latent trait models and their use in inferring an examinee’s ability. In Statistical Theories of Mental Test Scores; Lord, F., Novick, M., Eds.; Addison-Wesley: Reading, MA, USA, 1968; pp. 397–479. [Google Scholar]
Molenaar, I.W. Thirty years of nonparametric item response theory. Appl. Psychol. Meas. 2001, 25, 295–299. [Google Scholar] [CrossRef] [Green Version]
Liang, L.; Browne, M.W. A quasi-parametric method for fitting flexible item response functions. J. Educ. Behav. Stat. 2015, 40, 5–34. [Google Scholar] [CrossRef]
Ramsay, J.; Winsberg, S. Maximum marginal likelihood estimation for semiparametric item analysis. Psychometrika 1991, 56, 365–379. [Google Scholar] [CrossRef]
Mokken, R.J. A Theory and Procedure of Scale Analysis; De Gruyter Mouton: Berlin, Germany; New York, NY, USA, 2011. [Google Scholar]
Ramsay, J.O. Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika 1991, 56, 611–630. [Google Scholar] [CrossRef]
Falk, C.F.; Feuerstahler, L.M. On the Performance of Semi-and Nonparametric Item Response Functions in Computer Adaptive Tests. Educ. Psychol. Meas. 2021. [Google Scholar] [CrossRef]
Xu, X.; Douglas, J. Computerized adaptive testing under nonparametric IRT models. Psychometrika 2006, 71, 121–137. [Google Scholar] [CrossRef]
Falk, C.F.; Fischer, F. More flexible response functions for the PROMIS physical functioning item bank by application of a monotonic polynomial approach. Qual. Life Res. 2021, 1–11. [Google Scholar] [CrossRef]
Wiberg, M.; Ramsay, J.O.; Li, J. Optimal scores: An alternative to parametric item response theory and sum scores. Psychometrika 2019, 84, 310–322. [Google Scholar] [CrossRef] [PubMed]
Liang, L. A Semi-Parametric Approach to Estimating Item Response Functions. Ph.D. Thesis, The Ohio State University, Columbus, OH, USA, 2007. [Google Scholar]
Falk, C.F.; Cai, L. Maximum marginal likelihood estimation of a monotonic polynomial generalized partial credit model with applications to multiple group analysis. Psychometrika 2016, 81, 434–460. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Muraki, E. A Generalized Partial Credit Model: Application of an EM Algorithm. Appl. Psychol. Meas. 1992, 16, 159–176. [Google Scholar] [CrossRef]
Rizopoulos, D. ltm: An R package for Latent Variable Modelling and Item Response Theory Analyses. J. Stat. Softw. 2006, 17, 1–25. [Google Scholar] [CrossRef] [Green Version]
Chalmers, R.P. mirt: A multidimensional item response theory package for the R environment. J. Stat. Softw. 2012, 48, 1–29. [Google Scholar] [CrossRef] [Green Version]
Robitzsch, A.; Kiefer, T.; Wu, M. TAM: Test Analysis Modules; R Package Version 3.7-16. 2021. Available online: https://search.r-project.org/CRAN/refmans/TAM/html/TAM-package.html (accessed on 16 August 2021).
Elphinstone, C. A target distribution model for nonparametric density estimation. Commun. Stat.-Theory Methods 1983, 12, 161–198. [Google Scholar] [CrossRef]
Feuerstahler, L. Exploring Alternate Latent Trait Metrics with the Filtered Monotonic Polynomial IRT Model. Ph.D. Thesis, University of Minnesosta, Minneapolis, MN, USA, 2016. [Google Scholar]
Feuerstahler, L.M. Metric transformations and the filtered monotonic polynomial item response model. Psychometrika 2019, 84, 105–123. [Google Scholar] [CrossRef]
Bock, R.D.; Aitkin, M. Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika 1981, 46, 443–459. [Google Scholar] [CrossRef]
Bock, R.D.; Mislevy, R.J. Adaptive EAP estimation of ability in a microcomputer environment. Appl. Psychol. Meas. 1982, 6, 431–444. [Google Scholar] [CrossRef]
Van der Linden, W.J.; Barrett, M.D. Linking item response model parameters. Psychometrika 2016, 81, 650–673. [Google Scholar] [CrossRef] [Green Version]
Haebara, T. Equating logistic ability scales by a weighted least squares method. Jpn. Psychol. Res. 1980, 22, 144–149. [Google Scholar] [CrossRef] [Green Version]
Stocking, M.L.; Lord, F.M. Developing a common metric in item response theory. Appl. Psychol. Meas. 1983, 7, 201–210. [Google Scholar] [CrossRef] [Green Version]
Magis, D.; Barrada, J.R. Computerized adaptive testing with R: Recent updates of the package catR. J. Stat. Softw. 2017, 76, 1–19. [Google Scholar] [CrossRef]
Falk, C.F.; Cai, L. Semiparametric item response functions in the context of guessing. J. Educ. Meas. 2016, 53, 229–247. [Google Scholar] [CrossRef]
Murray, K.; Müller, S.; Turlach, B.A. Revisiting fitting monotone polynomials to data. Comput. Stat. 2013, 28, 1989–2005. [Google Scholar] [CrossRef]
Murray, K.; Müller, S.; Turlach, B. Fast and flexible methods for monotone polynomial fitting. J. Stat. Comput. Simul. 2016, 86, 2946–2966. [Google Scholar] [CrossRef]
Lord, F.M. Applications of Item Response Theory to Practical Testing Problems; Erlbaum: New York, NY, USA, 1980. [Google Scholar]

Figure 1. Item response functions for six example FMP items. Items 1, 2, and 3 are dichotomous, and items 4, 5, and 6 have four response categories. Items 1 and 4 have

k_{i} = 0

, items 2 and 5 have

k_{i} = 1

, and items 3 and 6 have

k_{i} = 2

.

Figure 1. Item response functions for six example FMP items. Items 1, 2, and 3 are dichotomous, and items 4, 5, and 6 have four response categories. Items 1 and 4 have

k_{i} = 0

, items 2 and 5 have

k_{i} = 1

, and items 3 and 6 have

k_{i} = 2

.

Figure 2. Prior predictive simulation of FMP item response curves with

k = 1

and

k = 2

and

ξ \sim N (0, 1)

,

ω \sim N (- 0.5, 1)

,

α \sim N (0, 0.5)

, and

τ \sim N (- 3, 0.5)

.

Figure 2. Prior predictive simulation of FMP item response curves with

k = 1

and

k = 2

and

ξ \sim N (0, 1)

,

ω \sim N (- 0.5, 1)

,

α \sim N (0, 0.5)

, and

τ \sim N (- 3, 0.5)

.

Figure 3. True linking transformation and estimated Stocking–Lord linking transformations with various

k_{θ}

values.

Figure 3. True linking transformation and estimated Stocking–Lord linking transformations with various

k_{θ}

values.

Figure 4. Item response functions (upper row) and test information functions (bottom row) on the original

θ

metric and on the transformed percentage-correct metric

θ^{★}

.

Figure 4. Item response functions (upper row) and test information functions (bottom row) on the original

θ

metric and on the transformed percentage-correct metric

θ^{★}

.

Table 1. Example FMP item parameters.

Greek-Letter Parameterization
Parameter	Item 1	Item 2	Item 3	Item 4	Item 5	Item 6
$ξ_{1}$	0.49	−0.87	−0.54	0.58	0.97	0.79
$ξ_{2}$				0.49	0.38	0.15
$ξ_{3}$				−0.05	−0.15	−1.20
$ω$	0.25	−0.57	−0.24	−0.41	0.58	0.57
$α_{1}$		−0.63	1.13		0.45	−0.22
$α_{2}$			−0.40			−0.59
$τ_{1}$		−0.12	−1.76		−1.42	−1.57
$τ_{2}$			−1.14			−2.45
Polynomial Coefficient Parameterization
Parameter	Item 1	Item 2	Item 3	Item 4	Item 5	Item 6
$b_{0_{1}}$	0.49	−0.87	−0.54	0.58	0.97	0.79
$b_{0_{2}}$				0.49	0.38	0.15
$b_{0_{3}}$				−0.05	−0.15	−1.20
$b_{1}$	1.28	0.57	0.79	0.66	1.79	1.77
$b_{2}$		0.36	−0.57		−0.80	1.43
$b_{3}$		0.24	0.03		0.26	0.71
$b_{4}$			0.01			0.22
$b_{5}$			0.11			0.04

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Feuerstahler, L. Flexible Item Response Modeling in R with the flexmet Package. Psych 2021, 3, 447-478. https://doi.org/10.3390/psych3030031

AMA Style

Feuerstahler L. Flexible Item Response Modeling in R with the flexmet Package. Psych. 2021; 3(3):447-478. https://doi.org/10.3390/psych3030031

Chicago/Turabian Style

Feuerstahler, Leah. 2021. "Flexible Item Response Modeling in R with the flexmet Package" Psych 3, no. 3: 447-478. https://doi.org/10.3390/psych3030031

APA Style

Feuerstahler, L. (2021). Flexible Item Response Modeling in R with the flexmet Package. Psych, 3(3), 447-478. https://doi.org/10.3390/psych3030031

Article Menu

Flexible Item Response Modeling in R with the flexmet Package

Abstract

1. Background

2. Specifying the FMP Model in `flexmet`

3. Fitting the FMP Model in `flexmet`

4. Transforming the FMP Model with `flexmet`

4.1. Item Parameter Linking with the FMP Model

4.2. Transforming the FMP Model to a User-Defined Scale

5. Summary

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

Appendix A. Relationship of the FMP Model (k = 0) to Equivalent Models in Other Software Packages

Appendix A.1. Two-Parameter Model

Appendix A.2. Generalized Partial Credit Model

Appendix B. Relationship between Greek-Letter and Polynomial Coefficient Parameterizations of the FMP Model

Appendix B.1

Appendix B.2

Appendix B.3

Appendix C

Appendix D

Appendix E

Appendix F

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

Flexible Item Response Modeling in R with the flexmet Package

Abstract

1. Background

2. Specifying the FMP Model in flexmet

3. Fitting the FMP Model in flexmet

4. Transforming the FMP Model with flexmet

4.1. Item Parameter Linking with the FMP Model

4.2. Transforming the FMP Model to a User-Defined Scale

5. Summary

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

Appendix A. Relationship of the FMP Model (k = 0) to Equivalent Models in Other Software Packages

Appendix A.1. Two-Parameter Model

Appendix A.2. Generalized Partial Credit Model

Appendix B. Relationship between Greek-Letter and Polynomial Coefficient Parameterizations of the FMP Model

Appendix B.1

Appendix B.2

Appendix B.3

Appendix C

Appendix D

Appendix E

Appendix F

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

2. Specifying the FMP Model in `flexmet`

3. Fitting the FMP Model in `flexmet`

4. Transforming the FMP Model with `flexmet`