Generalizability Theory in the Evaluation of Psychological Profile in Track and Field

Generalizability theory (GT) has been used throughout the scientific literature to ensure validity, reliability, and generalizability in different sport contexts. However, few studies have examined the measurement of psychological profiles in sport from this perspective. Therefore, the main goal of this study is to estimate the sources of variability and the optimal measurement designs for an adequate assessment of the psychological profile in track and field. The sample consisted of 470 participants (age: M = 32.1; SD = 13.5). Analysis of variance components and generalizability analysis were performed in order to test the reliability and generalizability of the sample. The profile included the following variables: flow, motivation (from Self-Determination Theory and Achievement Goals), self-confidence, and psychological skills. The results confirm that the sample has a high degree of reliability and generalizability in all the tested models. Therefore, a detailed study of the validity, reliability, and generalizability of samples and measures should be an inherent element in the practice of psychological counseling in sports.


Introduction
The progress of sport psychology has been associated with the development and publication of several instruments to measure mental characteristics [1]. In this sense, it is important to know the psychological profile in sport, which includes a set of mental characteristics and abilities that define the athlete [2], because a holistic understanding of the psychological processes underlying this profile makes it necessary to include in it the different constructs that may be related to sport performance [3]. Many variables have been included separately in research on the sport psychological profile (e.g., motivation, self-confidence), although the most productive studies are those that group constructs for measuring the psychological profile. Notable among these is the research conducted in Spanish using the Inventario Psicológico de Ejecución Deportiva (IPED; [4,5]), which allows us to distinguish between the strengths and weaknesses of an athlete's profile. This instrument is divided into the following factors: self-confidence (e.g., I see myself more as a loser than a winner during competitions), negative coping control (e.g., I get angry and frustrated during competition), attentional control (e.g., I become distracted and lose my concentration during competition), visu-imaginative control (e.g., Before the competition, I imagine myself as a winner; Before the competition, I imagine myself executing my actions and performing perfectly), motivational level (e.g., I am highly motivated to do my best in the competition), positive coping control (e.g., I can maintain positive emotions during the competition), and attitudinal control (e.g., During the competition, I think positively).
Flow [6], self-determined motivation [7], achievement goals [8], and self-confidence [9], in addition to psychological skills, have been considered important variables that correspond to the psychological profile shown in this research. However, no studies have been found analyzing these variables in the field of athletics as a whole.
From this perspective, it is clear that any measurement tool should be analyzed to check its validity. Several ways of testing the validity of a measure have been described in the scientific literature. Nevertheless, the concept of validity has begun to be described as the degree of appropriateness of the inferences and interpretations that can be drawn from scores on measurement instruments [10]. In this way, evidence related to the purpose and use of a given instrument, including content, predictive, and construct evidence, is gathered and considered as evidence of the acceptable or unacceptable degree of validity of a test, depending on its use with a particular population [11]. As a result of research and reflection on the subject, generalizability theory (GT; [12,13]) emerged. GT is defined as a way of reducing the influence of sources of error, optimizing measurement designs, and considering the reliability and generalizability of estimated measures. GT can be used to examine the sources of variation affecting a particular measurement. In addition, the use of GT makes it possible to estimate the fit of the measure to the mean of all possible observations [14]. In this way, all variance components contributing to the error of an estimate are identified and measured, and strategies are applied to reduce the influence of these sources of error on the measure [15,16]. The use of GT in sport psychology is widely justified, for example, by its a priori approach, as opposed to the a posteriori application of a single instrument [14]. In this sense, any of the presented works can be considered from an a priori point of view as an exploration of an insufficiently known or studied research domain and as a way to prepare for larger-scale research. It is also possible that sample sizes can be better adjusted to the social reality of research in sport and physical activity psychology.
In summary, GT unifies the concepts of reliability, validity, and precision of a given measure through four steps: (1) definition of the study facets; (2) analysis of variance of the scores obtained on the study facets; (3) calculation of the error components; and (4) optimization of the generalizability coefficients.
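As an orientation for steps (3) and (4), the error components and coefficients can be written out for the simplest case, a single-facet crossed design in which persons p are crossed with items i. This is a textbook illustration and not one of the multi-facet designs analyzed in this study; the symbols σ²_p, σ²_i, σ²_{pi,e}, and n'_i (the number of items in the decision study) are standard GT notation assumed here, not notation taken from the article.

```latex
% Relative and absolute error variance for a crossed p x i design
\sigma^2(\delta) = \frac{\sigma^2_{pi,e}}{n'_i},
\qquad
\sigma^2(\Delta) = \frac{\sigma^2_{i}}{n'_i} + \frac{\sigma^2_{pi,e}}{n'_i}

% Relative (E\rho^2) and absolute (\Phi) generalizability coefficients
E\rho^2 = \frac{\sigma^2_{p}}{\sigma^2_{p} + \sigma^2(\delta)},
\qquad
\Phi = \frac{\sigma^2_{p}}{\sigma^2_{p} + \sigma^2(\Delta)}
```

Because the absolute error variance includes every term of the relative error variance plus the facet main effect, the absolute coefficient can never exceed the relative one, which matches the pattern of relative G and absolute G values reported in the results.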
Despite the growth of GT in the scientific literature [11-16], no study has been found that tests the degree of accuracy in the generalization of the results of the questionnaires used, with the exception of IPED [4,5]. In view of the above, the main objective of this research is to estimate the sources of variability and to estimate the optimal measurement designs for an adequate evaluation of the psychological profile in athletics based on the measures of flow, motivation, self-confidence, and psychological skills.

Participants
The sample consisted of 470 participants (266 men and 204 women). Of these, 241 answered the questionnaire online and 229 answered the printed version. All of them practiced athletics in different specialties. Specifically, 148 were runners (not federated) and 322 were federated athletes, of whom 95 were sprinters and/or hurdlers, 143 were middle-distance and/or long-distance runners, 81 were event specialists, and 9 were decathletes or heptathletes. Their age ranged from 14 to 70 years (M = 32.1; SD = 13.5). The ages of the participants were grouped into three categories: U18 (N = 73) for those under the age of 18, Senior (N = 200) for those aged between 18 and 35, and Master (N = 184) for those over 35.

Procedure
Two questionnaire response formats were used for data collection: online and on paper. In both cases, anonymity and confidentiality of the information provided were ensured. First, an analysis of variance components was carried out, followed by an analysis of generalizability.

Data Analysis
Data analysis was carried out using two main techniques: analysis of variance components and generalizability analysis. The following programs were used: SAGT [21] and SAS System v.9.1 [22,23].
GT attempts to identify and measure the variance components contributing error to an estimate and, knowing them, to implement strategies that reduce their influence on the measurement [24]. Reliability refers to the degree to which the scale measures what it measures in a reproducible manner. The generalizability coefficient provides information on the stability and consistency of individual differences between people, as well as on other possible sources of variation. The fit of the models to the General Linear Model is considered by comparing the residual variance obtained with the least squares and maximum likelihood procedures. Different procedures can be used to perform the analysis of variance components; in this case, only two were used, namely, least squares through the VARCOMP procedure and maximum likelihood through the GLM (General Linear Model) procedure.

Analysis of Variance Components
A variance component analysis was performed using the VARCOMP (method = type1) and GLM procedures for a six-facet model. Because of the saturation produced by working with such a large number of facets, the model [y = p o a c f g] without interactions was used first (see Table 1). A similar error variance was obtained with both procedures (GLM = 83,850.60/VARCOMP = 83,851), and the model and the facets [p], [c], and [f] were significant (p < 0.001) and explained 95.30% of the variance. The rest of the facets collapsed because of the contribution of the [p] facet to the model. For the model without interactions [y = o a c f g] (Table 2), the model and all facets were significant, and the model explained 91.88% of the variance. The error variance with both procedures was practically equal (GLM = 144,694.80/VARCOMP = 144,695). From this analysis, the four facets that contributed the most variance to the model were considered, and two new analyses were performed with all interactions, using the models [y = a|c|f|g] and [y = a|c|f|o]. In the model [y = a|c|f|g], all facets with their interactions were significant, and 91.88% of the variance was explained, as can be seen in Table 3. The error variance with both procedures was very similar (GLM = 12,159/VARCOMP = 12,064). On the other hand, for the model [y = a|c|f|o], all facets with their interactions were significant, and 91.88% of the variance was explained. The error variance with both procedures was also very similar (GLM = 12,064/VARCOMP = 12,159). These results can be seen in Table 4.
Given that the least squares procedure and the maximum likelihood procedure yielded practically equal error variances, it can be assumed that the sample is linear, normal, and homoscedastic [25,26].
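To make the idea of a variance component analysis concrete, the following minimal sketch estimates variance components for a fully crossed two-facet design (persons by items, one score per cell) from ANOVA mean squares and expresses them as percentages of total variance. It is only an illustration under stated assumptions: the function names and the toy score matrix are hypothetical, the design is far simpler than the six-facet model above, and it does not reproduce the SAS VARCOMP or GLM procedures used in the study.

```python
import numpy as np

def variance_components(scores):
    """Estimate variance components for a crossed p x i design
    (rows = persons, columns = items, one observation per cell)
    from the expected mean squares of a two-way ANOVA."""
    scores = np.asarray(scores, dtype=float)
    n_p, n_i = scores.shape
    grand = scores.mean()
    p_means = scores.mean(axis=1)   # one mean per person
    i_means = scores.mean(axis=0)   # one mean per item

    # Sums of squares for persons, items, and the residual (pi,e)
    ss_p = n_i * np.sum((p_means - grand) ** 2)
    ss_i = n_p * np.sum((i_means - grand) ** 2)
    ss_res = np.sum((scores - grand) ** 2) - ss_p - ss_i

    ms_p = ss_p / (n_p - 1)
    ms_i = ss_i / (n_i - 1)
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    # Solve the expected-mean-square equations; negative estimates are set to 0
    var_res = max(ms_res, 0.0)
    var_p = max((ms_p - ms_res) / n_i, 0.0)
    var_i = max((ms_i - ms_res) / n_p, 0.0)
    return {"p": var_p, "i": var_i, "pi,e": var_res}

def percent_of_total(components):
    """Express each variance component as a percentage of their sum."""
    total = sum(components.values())
    return {name: 100.0 * value / total for name, value in components.items()}

# Hypothetical toy data: 4 persons x 3 items (not data from the study)
toy_scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
toy_components = variance_components(toy_scores)
toy_percentages = percent_of_total(toy_components)
```

In this toy matrix the scores are perfectly additive, so the residual component is essentially zero and all variance is split between persons and items; real questionnaire data would normally leave a nonzero residual.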

Generalizability Analysis Results
From the models estimated in the analysis of variance components, the generalizability analysis was performed using the SAGT statistical program [21]. Eleven models with different designs were obtained. In the [f][a][c]/[o] model (Table 5), the lowest percentage of variance was associated with the interaction [o][f] (Questionnaire format and Factor, respectively) with 0.000%, followed by the interaction [a][f] (Athlete Type and Factor) with 0.013%. In the model [o][a][c]/[f] (see Table 6), where (o): Response Format; (a): Athlete Type; (c): Questionnaire; and (f): Factor, generalizability coefficients greater than 0.98 were obtained (relative G = 0.998 and absolute G = 0.989). These results confirm that the factors estimated from the Response Format, the Type of Athlete, and the Questionnaire used present a high reliability and a high capacity for the generalization of the numerical structure with the sample studied.
Table 6. Generalizability analysis of the different analyzed models.
In the [o][a][f]/[c] model, generalizability coefficients above 0.87 (relative G = 0.966 and absolute G = 0.872) were obtained, which confirms that the numerical structure of the sample studied has adequate reliability and generalizability.
In the [o][f][c]/[a] model, generalizability coefficients above 0.71 (relative G = 0.952 and absolute G = 0.713) were obtained, confirming that the number of athletes in the sample estimated from the remaining facets is reliable and generalizable.
In the model [g][a][f]/[c], where (g): Gender; (a): Type of Athlete; (f): Factor; and (c): Questionnaire, generalizability coefficients higher than 0.87 were obtained (relative G = 0.967 and absolute G = 0.872). These results show that the questionnaires estimated on the basis of Gender, Type of Athlete, and Factor present a high reliability and an adequate generalization capacity of the numerical structure of the sample studied.
In the [g][a][f]/[c] model, the highest percentage of variance was associated with facet [a] (Type of Athlete) with 39.020%, followed by facet [c] (type of Questionnaire) with 32.275% (Table 7). The lowest percentages of variance were found in the interaction [g][f] (Gender and Factor), with 0.000% of variance associated, and in the interaction between all facets of the model, which obtained 0.007%. In the model [a][c][f]/[g] (see Table 6), where (a): Athlete Type; (c): Questionnaire; (f): Factor; and (g): Gender, optimal generalizability coefficients were obtained (relative G = 1.000 and absolute G = 1.000), confirming the reliability and optimal generalizability of the numerical structure of the studied sample. In the [g][c][f]/[a] model, generalizability coefficients higher than 0.71 (relative G = 0.950 and absolute G = 0.712) were obtained. These results confirm the reliability of the numerical structure of the sample studied and an adequate generalization capacity. In the model [c][f]/[p] (Table 8), the lowest associated variance percentages were found in facet [f] (Factor) and interaction [p][f] (Participant and Factor), with 0.000% and 0.361%, respectively.

Discussion
The general objectives of this research were to estimate the sources of variability and to calculate the optimal measurement designs for an adequate assessment of the psychological profile in athletics (consisting of flow, motivation, self-confidence, and psychological skills).
In order to study the facets that explain a greater percentage of the variance, variance component analysis and generalizability analysis were carried out. The aim was to ensure the generalizability of the data provided and the fit of the model, as well as to determine the variability of the facets involved [4]. With regard to the analysis of variance components, the results confirm that the main source of variance for the estimation of the psychological profile in athletics is the Participants facet when the model is tested without interactions. This means that the variability that occurs between participants is large enough to determine their psychological profile. When this facet is disregarded, all the other facets and their interactions are significant. This means that the psychological profile in athletics is explained by the variability of the measured factors. For that reason, different explanatory models of the variance of the profile were tested. All of them explain 91.88% of the variance and show a very similar error variance, which allows us to assume that the sample is linear, normal, and homoscedastic [25,26]. These results also indicate that Gender, Questionnaire used, Type of Athlete, Response mode, and Factor are determinants in the study of the psychological profile and produce differences among participants.
With regard to the generalizability analysis, each of the proposed measurement designs was analyzed independently, and reliability and generalizability indices were estimated in order to identify the most parsimonious explanatory models that maximize the explained variance and minimize the error variance. All of them were adequate in terms of reliability and validity indices. Among all the models tested, the model [a][c][f]/[g], where [a] corresponds to the Type of Athlete, [c] to the Questionnaire used, [f] to the Factor, and [g] to the Gender of the participant, obtained optimal generalizability coefficients.
The use of generalizability theory in the measurement of the psychological profile of athletes, perhaps because of its novelty, is not widespread, but it has obtained positive results (e.g., [27]). Among other benefits, the use of generalizability theory in the study of psychological profiles allows the validity and reliability of the numerical structure used to be checked in a very precise way [16,28,29]. In addition, it makes it possible to estimate the sample size, i.e., the minimum number of participants necessary to generalize the results [14]. In this sense, it can be affirmed that the approach of the present study has adequate reliability and generalizability indicators in the different models tested. The general analysis of the generalizability coefficients reveals that the reliability of the generalization accuracy of the results is optimal. Other studies have shown that GT is especially effective in the planning of investigations and interventions in sport [15,16]. These data support the view that there are significant differences between the different athletic specialties, as well as between different genders or age groups.
GT is effective for checking that a research design is adequate in terms of the number of participants and/or observations and that it retains the same precision of generalizability in subsequent analyses. It has also helped us to design a broader research design by introducing modifications to the current design through the use of nesting in the facets. This nesting can result in a higher generalization accuracy of an investigation. In fact, nesting always yields a smaller number of error components, which allows for such precision if the residuals are small. Indeed, in the spirit of Cronbach, a G-analysis is normally an a priori study, which serves to prepare a larger-scale research design [16]. The prior work of estimating the sources of variance should make it possible to fine-tune the measurement devices adapted to the decisions considered in the main investigation.
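The kind of design check described here can be sketched as a small decision-study calculation: given estimated variance components, one asks how many conditions of a facet are needed before the generalizability coefficients reach a chosen threshold. The sketch below assumes a single-facet crossed design (persons by items); the function names, the variance values, and the 0.79 target are hypothetical and are not taken from this study or from the SAGT program.

```python
def g_coefficients(var_p, var_i, var_res, n_i):
    """Relative and absolute G coefficients for a crossed p x i design
    in which each person's score is averaged over n_i items."""
    relative_error = var_res / n_i
    absolute_error = (var_i + var_res) / n_i
    relative_g = var_p / (var_p + relative_error)
    absolute_g = var_p / (var_p + absolute_error)
    return relative_g, absolute_g

def smallest_design(var_p, var_i, var_res, target, max_n=200):
    """Return the smallest number of items whose absolute G coefficient
    reaches the target, or None if no design up to max_n does."""
    for n_i in range(1, max_n + 1):
        _, absolute_g = g_coefficients(var_p, var_i, var_res, n_i)
        if absolute_g >= target:
            return n_i
    return None

# Hypothetical variance components (illustrative values only)
rel_g, abs_g = g_coefficients(var_p=4.0, var_i=1.0, var_res=3.0, n_i=4)
needed_items = smallest_design(var_p=4.0, var_i=1.0, var_res=3.0, target=0.79)
```

The same logic extends to multi-facet and nested designs, where each error term is divided by the number of conditions sampled for the facets it involves, although the bookkeeping becomes more involved than in this single-facet case.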
With this optimization plan in mind, future lines of research can be proposed to address the limitations of this study. Among these limitations, we can highlight that no prior analysis of the data was carried out regarding the role played by the questionnaire presentation format (online or on paper) or the interviewer's bias, which would have made it possible to analyze the profile with the influence of this variable already known. This study could also have collected other variables that would have been interesting for the analysis, such as sports level or experience in competitions.
However, these previous explorations, due to their rigor and precision, can serve as a basis for making decisions on a larger scale, which will help to confirm and extend the results obtained. A detailed study of the generalizability of the measure in the psychological profile within sport provides rigor to the studies that aim to explore it.
Then, another analysis without interactions was performed with the model [y = o a c f g] to know the contribution of each facet, disregarding facet [p], where (o): Response Format; (a): Athlete Type; (c): Questionnaire; (f): Factor; and (g): Gender.
[a][c]/[f]; and, finally, [c][f]/[p]. The results obtained for all of them are described below. When the generalizability analysis was performed with the different cross-facet designs, in the model [f][a][c]/[o], where (f): Factor; (a): Athlete Type; (c): Questionnaire; and (o): Response Format, generalizability coefficients greater than 0.99 were obtained (relative G = 0.998 and absolute G = 0.998), confirming the generalizability of the numerical structure of the sample. In the [f][a][c]/[o] model, the highest percentage of variance was associated with facet [a] (Athlete Type) with 39.020%, followed by facet [c] (Questionnaire type) with 32.275% (Table 5).
Note. F.V = sources of variation; S.C = sum of squares; G.L = degrees of freedom; C.M = mean square; Aleato. = random; Correg = corrected; S.E. = standard error.
In the model [g][a][c]/[f], generalizability coefficients above 0.98 (relative G = 0.998 and absolute G = 0.989) were obtained, showing the high reliability and generalizability of the numerical structure of the studied sample. And finally, in the model [c][f]/[p], where (c): Questionnaire; (f): Factor; and (p): Participant, generalizability coefficients above 0.99 were obtained (relative G = 1.000 and absolute G = 0.995). These results show the generalizability of the numerical structure of the sample studied and a high reliability. In the [c][f]/[p] model, the highest percentage of variance is associated with facet [p] (Participant) with 65.864%, followed by facet [c] (Questionnaire type) with 29.000% (Table 8).

Table 1. Analysis of variance of the model [y = p o a c f g].
DF: degrees of freedom; Pr > F: probability greater than the calculated F; SS: Sum of Squares.

Table 2. Analysis of variance of the model [y = o a c f g].
DF: degrees of freedom; Pr > F: probability greater than the calculated F; SS: Sum of Squares.

Table 3. Analysis of variance of the model [y = a c f g].
DF: degrees of freedom; Pr > F: probability greater than the calculated F; SS: Sum of Squares.

Table 4. Analysis of variance of the model [y = a c f o].
DF: degrees of freedom; Pr > F: probability greater than the calculated F; SS: Sum of Squares.

Table 8. Analysis of variance of the model [c][f]/[p].