Test–Retest Reliability of a Questionnaire on Motives for Physical Activity among Adolescents

The aim of this study was to investigate the test–retest reliability of the motives for undertaking physical activity (PA) items from the Health Behavior in School-Aged Children (HBSC) study questionnaire among Slovak and Czech adolescents and to determine whether this reliability differs by gender, age group and country. We obtained data from 580 students aged 11 and 15 years (51.2% boys) who participated in a test and retest study with a four-week interval in 2013 as part of the Health Behavior in School-Aged Children cross-sectional study in the Czech Republic and Slovakia. We estimated the test–retest reliability of all 13 motives using Intraclass Correlation Coefficients (ICC) for the continuous responses and Cohen’s Kappa statistics for the dichotomized responses. Test–retest reliability showed moderate agreement for nine motives (ICC from 0.41 to 0.60) and fair agreement for four motives (ICC from 0.33 to 0.40). Kappa statistics were similarly moderate to large (0.33 to 0.61), except for three motives with small or trivial correlations. The motives “To improve my health” and “To enjoy the feeling of using my body” had consistently low Kappas and correlations. Overall, the results of this study suggest that most questions on motives for PA in the HBSC questionnaire have acceptable test–retest characteristics for use among adolescents.


Introduction
Physical activity (PA) is associated with adolescent health, and a low amount of PA is conducive to poor health outcomes and obesity during adolescence [1][2][3][4][5]. As suggested by several studies, one of the potential pathways leading to an increase in levels of PA is through motives for PA [6][7][8][9]. Accordingly, evidence-based development of national-specific strategies for public health and December 2013; these regions represent the full range of adolescents in these two countries. Schools were chosen randomly from a list of schools in the Olomouc and Pardubice regions, Czech Republic, and the Kosice region, Slovakia. Inclusion criteria were attendance at a primary school in the Olomouc region (Czech Republic), Pardubice region (Czech Republic) or Kosice region (Slovakia), enrolment in the 5th or 9th grade and sufficient cognitive skills to complete the questionnaire. Exclusion criteria were attendance at a primary school educating adolescents with special needs, refusal of study participation by parents or adolescents and inability to complete the questionnaire owing to limited cognitive skills. All contacted schools agreed to participate. Questionnaires were administered in the 5th and 9th grades by trained research assistants in the absence of a teacher during regular class time. In the first part of the data collection (Test) we obtained data from 419 adolescents in the Czech Republic (response rate: 83.2%) and 259 adolescents in Slovakia (response rate: 74.1%). Non-response was primarily due to illness and parental disapproval of the participation of their children.
The second part of the data collection (Retest) was conducted four weeks after the first part. In general, an interval of 1-4 weeks is recommended for examining the test-retest reliability of items [25]. The period between test and retest has to be sufficiently long to avoid the retention of previously given answers and at the same time sufficiently short to limit changes in the lifestyle of respondents [23]. We therefore chose a four-week period, which is also widely used in other test-retest studies [23,26,27]. In the retest, we obtained data from 353 adolescents in the Czech Republic (66 dropped out, 15.7%) and 227 adolescents in Slovakia (32 dropped out, 12.3%) who had also participated in the first part of the data collection (Test). The final sample consisted of 353 Czech (51.9% boys) and 227 Slovak (52.9% boys) primary school pupils in grades five and nine.
Adolescents who participated only in the test study (the first part of the data collection) and then dropped out did not differ significantly from the other respondents in employment status, parental education, gender or grade.
All subjects gave their informed consent for inclusion before they participated in the study. The schools in the Czech Republic had general permission granted at the beginning of the school year by all parents. Parents in Slovakia were informed about the study via the school administration and could opt out if they disagreed with it. Participation in the study was fully voluntary and anonymous, with no explicit incentives provided for participation in either country.
The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved in the Czech Republic on 15 May 2012 by the Ethics Committee of the Faculty of Physical Culture, Palacky University in Olomouc under the project GACR-excellence. The protocol in Slovakia was approved on 18 June 2012 by the Ethics Committee of the Medical Faculty at the P.J. Safarik University in Kosice (No: 9/2012) under the project APVV 0032-11.

Variables and Data Measurement
Demographic data (age, gender) were collected using the single questions used and validated in the Health Behavior in School-Aged Children (HBSC) surveys [2,28].
The motives for PA were assessed using 13 items from the HBSC study examining why young people undertake leisure time PA. The question was first used as part of an optional PA package in the 1985/86 HBSC survey; the results for 11 sub-items on motivation for sports activity only, from adolescents in Finland, Norway and Sweden, were published by Wold and Kannas [29]. From the 2005/06 HBSC survey onward, the measure was broadened to assess motivations for all PA and two items were added: "to control my weight" and "it is exciting" [8]. This 13-item scale for measuring motives for PA has been used since that data collection. The question reads as follows: "Here is a list of reasons that some young people give for taking part in PA in their free time. For each motive please tick how important it is for you", with the response options (1) very important; (2) fairly important; (3) not important. Respondents rate 13 motives for PA (as can be seen in Figure 1). Further, we dichotomized all of the items by combining (1) very important and (2) fairly important vs. (3) not important.
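The dichotomization described above can be illustrated with a minimal sketch. The response codes 1-3 follow the questionnaire; the function name, the coding of the binary variable (1 = important, 0 = not important) and the example answers are assumptions made for illustration only and are not taken from the study data.

```python
# Response codes as printed on the questionnaire.
VERY_IMPORTANT, FAIRLY_IMPORTANT, NOT_IMPORTANT = 1, 2, 3


def dichotomize(response: int) -> int:
    """Collapse a 3-point answer into a binary variable.

    Returns 1 when the motive was rated very or fairly important and
    0 when it was rated not important (this 1/0 coding is an assumption).
    """
    if response not in (VERY_IMPORTANT, FAIRLY_IMPORTANT, NOT_IMPORTANT):
        raise ValueError(f"unexpected response code: {response}")
    return 1 if response in (VERY_IMPORTANT, FAIRLY_IMPORTANT) else 0


# Hypothetical answers of five respondents to one motive.
answers = [1, 2, 3, 2, 3]
binary = [dichotomize(a) for a in answers]
```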

Bias
We made every effort to prevent potential sources of bias in our study. The risk of selection bias was small, as shown by the analyses of loss to follow-up. As our study concerns the reliability of a self-report questionnaire, we cannot exclude effects of information bias, including an effect of social desirability. However, any such bias would affect the full questionnaire that we studied.

Statistical Analyses
In the first step we computed frequencies of the background characteristics. Next, we assessed the proportion of respondents who answered a question identically or shifted their response by one or two categories in the test and retest. Third, we used Intraclass Correlation Coefficients (ICC) to estimate the test-retest reliability of all selected items for the whole sample and stratified by gender, age group and country. In the final step, we computed Cohen's Kappa coefficients with dichotomized variables for the whole sample and stratified by gender, age group and country.
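The second step, the proportion of respondents who answered identically or shifted by one or two categories, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure run in SPSS: the function name and the example answers are hypothetical, and responses are assumed to be coded 1-3 as on the questionnaire.

```python
from collections import Counter
from typing import Dict, Sequence


def shift_distribution(test: Sequence[int], retest: Sequence[int]) -> Dict[int, float]:
    """Share of respondents whose answer moved by 0, 1 or 2 categories."""
    if not test or len(test) != len(retest):
        raise ValueError("test and retest must be paired and non-empty")
    # Absolute distance between the test and the retest answer per respondent.
    counts = Counter(abs(t - r) for t, r in zip(test, retest))
    n = len(test)
    return {shift: counts.get(shift, 0) / n for shift in (0, 1, 2)}


# Hypothetical answers of five respondents to one motive.
shares = shift_distribution([1, 2, 3, 1, 2], [1, 3, 1, 1, 2])
identical_share = shares[0]  # proportion answering identically
```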
According to Landis and Koch's subjective guidelines [30], an ICC greater than 0.81 is considered to indicate almost perfect test-retest agreement; 0.61 to 0.80 substantial agreement; 0.41 to 0.60 moderate agreement; 0.21 to 0.40 fair agreement; and an ICC below 0.20 poor agreement. Regarding Cohen's Kappa statistics, coefficients greater than 0.5 are considered to be large, 0.3-0.5 moderate, 0.1-0.3 small and less than 0.1 trivial [31]. All data were analyzed using IBM SPSS 20 for Windows (IBM Corp. Released 2011, Armonk, NY, USA). We used power analyses to verify that the group sizes provided enough power to detect among-group differences.
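For readers who wish to reproduce this type of analysis outside SPSS, the two coefficients and the interpretation thresholds can be sketched in plain Python. Several points are assumptions: the text does not state which ICC form was used, so a one-way random-effects, single-measure ICC is shown here as one possible choice; the handling of values that fall exactly on or between the published boundaries is also a choice made for this sketch; all function names and example data are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(test: Sequence[int], retest: Sequence[int]) -> float:
    """Cohen's kappa for paired categorical answers (test vs. retest)."""
    n = len(test)
    if n == 0 or n != len(retest):
        raise ValueError("test and retest must be paired and non-empty")
    observed = sum(t == r for t, r in zip(test, retest)) / n
    test_freq, retest_freq = Counter(test), Counter(retest)
    # Chance agreement: product of the marginal proportions per category.
    expected = sum(test_freq[c] * retest_freq[c] for c in test_freq) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when all answers share one category")
    return (observed - expected) / (1.0 - expected)


def icc_one_way(test: Sequence[float], retest: Sequence[float]) -> float:
    """One-way random-effects, single-measure ICC for two occasions.

    Assumption: this ICC form is chosen for illustration only.
    """
    n, k = len(test), 2
    if n < 2 or n != len(retest):
        raise ValueError("need at least two paired observations")
    subject_means = [(t + r) / k for t, r in zip(test, retest)]
    grand_mean = sum(subject_means) / n
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (t - m) ** 2 + (r - m) ** 2
        for t, r, m in zip(test, retest, subject_means)
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all answers are identical")
    return (ms_between - ms_within) / denominator


def landis_koch_label(icc: float) -> str:
    """Agreement label following the ICC thresholds quoted in the text."""
    if icc >= 0.81:
        return "almost perfect"
    if icc >= 0.61:
        return "substantial"
    if icc >= 0.41:
        return "moderate"
    if icc >= 0.21:
        return "fair"
    return "poor"


def kappa_label(kappa: float) -> str:
    """Magnitude label following the Kappa thresholds quoted in the text."""
    if kappa > 0.5:
        return "large"
    if kappa >= 0.3:
        return "moderate"
    if kappa >= 0.1:
        return "small"
    return "trivial"


# Hypothetical data: dichotomized answers of eight respondents,
# and 3-point answers of four respondents.
kappa = cohen_kappa([1, 1, 0, 0, 1, 0, 1, 1], [1, 0, 0, 0, 1, 0, 1, 1])
icc = icc_one_way([1, 2, 3, 3], [1, 3, 3, 2])
```

In practice a statistical package would also report confidence intervals, which this sketch omits.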

Participants and Descriptive Data
The background characteristics (prevalence rates) of the study sample in the test and retest data collection can be seen in Table 1.

Main Results
The proportion of respondents who answered a question identically varied from 62% to 73% in the Czech Republic and from 56% to 71% in Slovakia (Figure 1).

Table 2 shows the ICCs for the HBSC items regarding motives for PA by gender, age group and country. Across subgroups and motives, the ICC varied from 0.29 to 0.65, which indicates fair to moderate agreement. Test-retest reliability showed moderate agreement for nine motives (ICC from 0.41 to 0.60) and fair agreement for four motives (ICCs from 0.33 to 0.40 for "to have fun", "to improve my health", "to see my friends" and "to enjoy the feeling of using my body") in the whole sample. Motives for PA tended to have greater agreement in girls than in boys. Likewise, most motives for PA tended to have greater agreement in the 15-year-old adolescents than in the 11-year-old adolescents. Agreement tended to be better for adolescents in Slovakia than for those in the Czech Republic for most of the items.
We dichotomized all 13 motives for PA according to WHO recommendations and created binary variables from them for further analyses. Table 3 shows Cohen's Kappa for the HBSC items regarding motives for PA by gender, age group and country. We found strong or moderate correlations between test and retest for 10 out of 13 motives for PA in the whole sample. Moreover, we also observed strong or moderate correlations between test and retest for most of the motives per stratum of gender, age group and country. Weak correlations were observed for two motives ("to make new friends" and "to enjoy the feeling of using my body") and a trivial correlation for the motive "to improve my health" in the whole sample and also per gender, age group and country. Using a binary format resulted in findings similar to those obtained with a continuous format of the motive variables. The only exception was the motive "To improve my health", which showed different results and trivial agreement after dichotomization. The test-retest reliability of motives for PA tended to be better in boys than in girls. Six of the motives for PA had a somewhat better reliability in 15-year-old adolescents than in 11-year-old adolescents (a-c, g, i, l). Likewise, reliability tended to be better in Slovakia than in the Czech Republic for most motives for PA. The power analysis showed that the sample sizes were sufficient to detect among-group differences. A sample size of 150 is required to detect an Intraclass Correlation Coefficient of 0.2 between test and retest with a power of at least 80%. A sample size of 185 is required to detect a Cohen's Kappa coefficient of 0.2 when test and retest differ by 10% in marginal frequencies, with a power of at least 80%.

Discussion
The aim of the study was to investigate the test-retest reliability of the motives for PA items of the HBSC questionnaire in Czech and Slovak adolescents and to determine whether this reliability differs by gender, age group (11- and 15-year-olds) and country. The motives for PA items showed moderate agreement for most motives in the whole sample and also when stratified by gender, age group and country. After dichotomization, we observed a moderate correlation between the test and retest in almost all examined items, the exceptions being small correlations for the motives "to make new friends" and "to enjoy the feeling of using my body" and a trivial correlation for the motive "to improve my health".
The test-retest reliability was moderate for nine motives and fair for four motives in the whole sample, and was better in girls than in boys and in 15-year-old adolescents than in 11-year-old adolescents. To our knowledge, no previous study has assessed the test-retest reliability of all 13 sub-items on adolescents' motives for PA as used in the HBSC study. We can therefore only compare our findings with those on adjacent concepts. Ojala et al. [20] reported in a study on motives for exercise that test-retest reliability was acceptable for adolescents, using an instrument similar to that in the HBSC study. Wold et al. [9] assessed changes in motives for PA among adolescents from 1986 to 2006. They found that adolescents in 2006 tended to report higher importance of motives for PA than adolescents of the same age 20 years earlier in Finland, Norway and Wales. For similar constructs, e.g., motives for food choice [32] and motives for smoking [33,34], test-retest reliability was also found to be acceptable.
Following a pattern similar to that of the ICC, we observed strong or moderate correlations between test and retest for ten out of thirteen dichotomized motives, both in the whole sample and per stratum of gender, age group and country. Based on our results, the HBSC questionnaire on motives for PA is an acceptable instrument to measure motives for PA among adolescents, with the caution that three dichotomized motives had small or trivial correlations. It is important to identify motivation for PA in adolescence carefully and to determine which instruments for measuring motives for PA are suitable and effective for this age group.
Nevertheless, it is important to understand the possible determinants associated with motives for PA and with PA in adolescence, which might be crucial for future prevention and health promotion among adolescents. Previous studies found that motivation is a typical personal characteristic and that it may be crucial for explaining why some people are sufficiently physically active in their free time [35]. Moreover, the amount of PA and the motives for PA differ considerably by gender and age [8,10,35,36], and motives for PA thus vary by several factors. Our findings are in line with previous studies based on the self-determination theory [12]. In addition, previous studies showed that other extrinsic factors also play an important role, e.g., neighbourhood and environment [37], classmate and teacher support during physical education lessons [38,39] and support by family and peers [40,41].
The main strengths of this study are its large sample size, its representative dataset of adolescents from two countries and the collection of data according to a standardized protocol. Test-retest reliability, that is, the consistency of answers to the same questions given twice within a certain timeframe, can be influenced by many factors related to the interpretation or understanding of a question, e.g., familiarity with the content, the complexity and ambiguity of an item, the role of memory, and the number of response options [42,43]. Another strength and at the same time a possible limitation of our study is the four-week period between test and retest administration, which was sufficiently long to avoid the retention of previously given answers and at the same time sufficiently short to limit changes in the lifestyle of respondents. Nevertheless, some participants may have shifted their pattern of motives for PA during this period, so it would be helpful if similar test-retest studies used a harmonized methodology.
A limitation of our study is that test-retest reliability was analyzed using ICC and Cohen's Kappa per item only, and not for an overall scale of motives for PA. Use of a fully consistent scale for motives for PA might add knowledge on the reliability with which this can be measured. Further, proportional bias was not addressed in our study. In addition, this study focused on the test-retest reliability of the motives for PA questionnaire within the HBSC study but did not address its validity; this would be an important issue for future research.

Conclusions
Motives for PA showed mostly moderate agreement and likewise mostly strong or moderate correlation after dichotomization in both genders and in 11- and 15-year-old adolescents. We conclude that the HBSC questionnaire on motives for PA is, with some caution, an acceptable instrument to measure motives for PA among adolescents. Moreover, we recommend using the continuous-level responses (assessed with ICC) for this HBSC questionnaire on motives for PA instead of the dichotomized responses (assessed with Cohen's Kappa coefficients). Motives for PA thus not only align with actual levels of PA but can also be measured fairly reliably. The study offers unique and interesting insights into how adolescents perceive motives for PA in the Czech and Slovak Republics. This study focused on the test-retest reliability of the selected items on motives for PA, but not on their validity. Moreover, future research and practice should focus on developing instruments with better precision, with the goal of reducing or overcoming the methodological bias of the studied data.

Conflicts of Interest:
The authors declare no conflict of interest.