Article

Intra-Rater Reliability of 30 s Sit-To-Stand and Timed-Up-and-Go Tests in Older Adults with Post-COVID-19 Syndrome: A Pilot Study

by Marina Kloni 1,2, Alexandros Heraclides 2, Theognosia Panteli 1,2, Alexios Klonis 3, Panagiotis Rentzias 2 and Christos Karagiannis 2,*
1 Physiotherapy Department, Rehabilitation Centre EDEN, 7562 Larnaca, Cyprus
2 Department of Health Sciences, European University of Cyprus, 6 Diogenous Str, Engomi, 2404 Nicosia, Cyprus
3 Department of Anaesthesia, The Newcastle Upon Tyne Hospitals NHS Foundation Trust, Newcastle NE7 7DN, UK
* Author to whom correspondence should be addressed.
COVID 2026, 6(5), 77; https://doi.org/10.3390/covid6050077
Submission received: 21 March 2026 / Revised: 21 April 2026 / Accepted: 24 April 2026 / Published: 28 April 2026
(This article belongs to the Special Issue Post-COVID-19 Muscle Health and Exercise Rehabilitation)

Abstract

Background: Post-COVID-19 syndrome (PCS) is associated with impairments in mobility, balance, and physical function, which may reduce quality of life. The 30 s Sit-to-Stand (30STS) and Timed Up and Go (TUG) tests are widely used clinical measures; however, their intra-rater reliability in older adults with PCS has not been established. Reliable outcome measures are essential for clinical assessment and rehabilitation planning. Methods: In this single-center pilot study, nineteen older adults with PCS were recruited as a convenience sample. Participants completed three trials of the 30STS and TUG tests on day one, with the protocol repeated after three days. The 30STS evaluates lower-limb strength and functional performance, while the TUG assesses balance, gait, and fall risk. Intra-class correlation coefficient (ICC), standard error of measurement (SEM), and minimum detectable change (MDC) were calculated. Results: The TUG showed an ICC of 0.995 (95% CI: 0.991–0.998), SEM of 0.48 s, and MDC of 1.33 s. The 30STS showed an ICC of 0.986 (95% CI: 0.973–0.994), SEM of 0.26 repetitions, and MDC of 0.72 repetitions. Conclusions: The TUG and 30STS demonstrate excellent intra-rater reliability and appear to be feasible clinical tools for assessing functional performance in older adults with PCS. However, findings should be interpreted cautiously due to the small, single-center pilot design and single evaluator. Further research is needed to confirm generalizability across broader PCS populations and clinical settings.

1. Introduction

Post-COVID-19 syndrome (PCS), also referred to as “Long COVID” or “post-acute COVID-19 syndrome,” is defined by the World Health Organization as the persistence of symptoms for more than three months following acute SARS-CoV-2 infection, in the absence of an alternative diagnosis [1,2]. In this study, we adopted the term Post-COVID-19 syndrome (PCS) to maintain terminological consistency with this definition.
Prevalence estimates of PCS vary widely, ranging from 6% to over 50%, reflecting differences in study design, disease severity, geographic region and follow-up duration [2,3,4]. Individuals with PCS often experience multisystem impairments that can reduce functional performance. Common symptoms include fatigue, muscle weakness, exercise intolerance, impaired balance, and reduced mobility [5]. Prolonged hospitalization and physical inactivity may further exacerbate these impairments, increasing the risk of falls and difficulties with activities of daily living [1,5].
Reliable, practical, and easily administered outcome measures are essential for assessing muscle strength, balance, and overall functional capacity in this population. Such measures support the identification of functional limitations, monitoring of recovery, and guidance for rehabilitation interventions. Among recommended assessments of physical function [6], the 30 s Sit-to-Stand (30STS) and Timed Up and Go (TUG) tests were selected for this study due to their simplicity, minimal equipment requirements, and clinical relevance. The 30STS evaluates lower-limb strength and overall physical performance, while the TUG assesses balance, gait, and fall risk [7,8]. Both tests have been validated in various clinical populations and have also been applied in post-COVID-19 cohorts, where reduced performance and persistent functional limitations have been reported [9,10,11,12,13,14,15].
Despite their widespread use, the measurement properties of these tests have not been fully established in adults with PCS, and evidence on intra-rater reliability in this population remains limited. Establishing intra-rater reliability is a fundamental step in supporting the validity and clinical applicability of outcome measures. Therefore, the aim of this study was to evaluate the intra-rater reliability of the 30STS and TUG tests in older adults with PCS, providing preliminary evidence to support their use in clinical and research settings.

2. Materials and Methods

2.1. Study Design

This study followed the Guidelines for Reporting Reliability and Agreement Studies (GRRAS) [16]. A prospective, observational design was used to evaluate the intra-rater reliability of the TUG and 30STS tests in older adults with PCS [17,18]. Measurements were performed by a single rater on two separate occasions, with a three-day interval between sessions [17,18,19].
Administering both tests requires consistent professional judgment from the rater (for example, when timing a trial or deciding whether a movement was executed correctly), which justifies the focus on intra-rater reliability. Establishing intra-rater reliability is a necessary first step before evaluating other measurement properties, such as inter-rater reliability, validity, and responsiveness.
The rater was a physiotherapist with 10 years of clinical rehabilitation experience and was responsible for standardized test administration, including instructions, timing and supervision of correct movement execution according to established protocols. As no interventions were applied, the study was observational. Participants were blinded to their results to reduce potential expectation bias [20].

2.2. Participants

A non-probabilistic convenience sampling approach was used. Participants were adults with PCS referred for rehabilitation at the Eden Rehabilitation Centre.
Inclusion criteria were: (1) diagnosis of PCS; (2) clinically stable condition; (3) independent ambulation with or without a walking aid; and (4) written informed consent. Exclusion criteria were: (1) age under 18 years; (2) cognitive or psychiatric conditions limiting comprehension or consent; (3) refusal to participate; and (4) unstable clinical condition (e.g., unstable cardiac disease) [21,22,23].
All participants received verbal and written information regarding study procedures, confidentiality, and withdrawal rights. Written informed consent was obtained in accordance with the Declaration of Helsinki. The study protocol was approved by the Cyprus National Bioethics Committee (ΕΕΒΚ/ΕΠ/2022/67; 12 December 2022) and was registered at ClinicalTrials.gov (https://clinicaltrials.gov/; last accessed on 10 August 2025) under registration number NCT05886842.

2.3. Procedure

Data were collected between August and December 2023 in a quiet, temperature-controlled physiotherapy gym at the Eden Rehabilitation Centre. Testing was conducted between 8:00 and 10:00 a.m. to minimize diurnal variability.
Physiological measures (blood pressure, heart rate and oxygen saturation) were assessed before and after testing using a portable monitor and pulse oximeter. If values fell outside safe ranges [24], testing was paused or rescheduled, and medical staff were notified. Participants were monitored for symptoms such as dyspnea, chest pain, dizziness, palpitations or severe cough, with testing terminated if necessary [24].
Equipment included a standard-height chair (45 cm), a stopwatch, and a straight, level 3 m walkway marked with adhesive tape.
Participants followed standardized protocols for the 30STS [4] and TUG [5] tests. A fixed sequence was selected to ensure procedural consistency across participants in this pilot reliability study, with the TUG test administered first. Participants stood from a chair, walked 3 m at a comfortable pace, turned, returned and sat down. Timing began on the command “go” and ended when the participant returned to the seated position. Walking aids were permitted.
After a 15 min rest period, participants completed the 30STS test, performing as many sit-to-stand repetitions as possible within 30 s, with arms crossed over the chest.
Each participant performed three trials per test, with 5 min rest intervals between trials and a 15 min rest period between tests to minimize fatigue effects. No verbal encouragement was provided. The retest session was conducted three days later under identical conditions. The three-day interval represented the maximum period participants could remain without physiotherapy, while minimizing potential clinical changes and reducing immediate learning or fatigue effects.
The outcome for the 30STS test was the total number of repetitions; for the TUG test, the completion time was recorded in seconds to two decimal places. Fatigue-related variability was considered in the study design, as PCS symptoms may influence performance even within short testing sessions.

2.4. Statistical Analysis

Statistical analyses were performed using R software (version 4.5.1; https://posit.co/download/rstudio-desktop/, accessed on 10 August 2025; R Foundation for Statistical Computing, Vienna, Austria).
Relative reliability was assessed using the intraclass correlation coefficient (ICC) with 95% confidence intervals (95% CI) [25]. For each test, the mean of the three trials from day one and the mean of the three trials from day three were used. ICCs were calculated using a two-way mixed-effects model with absolute agreement based on mean ratings [26]. ICC values were interpreted as <0.5 poor, 0.5–0.75 moderate, 0.75–0.9 good, and >0.9 excellent reliability [16].
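The ICC model described above can be illustrated with a short sketch. The Python code below is not the authors' R code (which is not reported); it is a minimal implementation of the two-way, absolute-agreement, average-measures ICC formula as tabulated by Koo and Li [26], assuming one row per participant and one column per session mean. The function names `icc_a_k` and `interpret_icc`, the handling of values that fall exactly on a threshold, and the example scores are all hypothetical.

```python
from statistics import mean


def icc_a_k(scores):
    """Absolute-agreement ICC based on the mean of k measurements.

    `scores` holds one row per participant; each row holds k session values
    (assumed here: the day-one mean and the retest mean, so k = 2).
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of sessions per participant
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


def interpret_icc(value):
    """Label an ICC using the thresholds stated in the text.

    Assumption: a value exactly on a boundary takes the higher band,
    except 0.9 itself, which is treated as 'good'.
    """
    if value < 0.5:
        return "poor"
    if value < 0.75:
        return "moderate"
    if value <= 0.9:
        return "good"
    return "excellent"


# Hypothetical session means for three participants (not study data).
demo = [[1, 2], [2, 3], [3, 4]]
print(round(icc_a_k(demo), 3), interpret_icc(icc_a_k(demo)))  # 0.8 good
```

In the demo data every participant scores exactly one unit higher at retest. The ranking is perfectly preserved, yet the absolute-agreement ICC drops to 0.8 because the systematic shift between sessions counts as disagreement.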
Absolute reliability was assessed using the standard error of measurement (SEM) and the minimum detectable change (MDC) [25,27,28]. The SEM reflects measurement variability, while the MDC represents the smallest change that reflects a true change rather than random error [27,28]. They were calculated as follows:
SEM = SD × √(1 − ICC)
MDC = 1.96 × SEM × √(2).
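As a worked illustration of the two formulas, the following Python sketch applies them directly. The function names `sem` and `mdc95` are hypothetical, and the inputs in the usage lines are the rounded SEM values reported in the Results, so small differences from the published MDC values are to be expected.

```python
import math


def sem(sd, icc):
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)


def mdc95(sem_value):
    """Minimum detectable change at the 95% level: 1.96 x SEM x sqrt(2)."""
    return 1.96 * sem_value * math.sqrt(2)


# Rounded SEM values as reported in the Results (TUG in seconds, 30STS in repetitions).
print(round(mdc95(0.48), 2))    # 1.33
print(round(mdc95(0.258), 3))   # 0.715
```

The second value (0.715) is slightly below the reported 30STS MDC of 0.717 repetitions, which is presumably because the published figure was computed from an unrounded SEM.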
Agreement between test and retest sessions was evaluated using Bland-Altman plots with 95% limits of agreement (mean difference ± 1.96 SD of differences) [29,30,31,32].
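The limits of agreement can be computed as in the following minimal Python sketch. It assumes that differences are taken as test minus retest and that the sample standard deviation is used; the source does not state either choice. The function name `bland_altman` and the example times are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(test, retest):
    """Return (mean difference, lower limit, upper limit) for paired scores.

    Assumption: difference = test - retest, with the sample SD of differences.
    """
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical TUG session means in seconds for four participants.
day_one = [10.0, 12.0, 9.0, 11.0]
day_three = [10.5, 11.5, 9.5, 11.5]
bias, lower, upper = bland_altman(day_one, day_three)
print(round(bias, 2), round(lower, 2), round(upper, 2))  # -0.25 -1.23 0.73
```

A mean difference close to zero indicates little systematic shift between sessions, while the width of the interval between the two limits reflects random measurement error.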
Data distribution was assessed using the Shapiro-Wilk test. Descriptive statistics included means and standard deviations for continuous variables, and counts and percentages for categorical variables. The significance level was set at p < 0.05.
An a priori sample size calculation determined that at least 19 participants were required to detect a difference between ICC values of 0.70 and 0.90 with 80% power and α = 0.05 [31,32,33,34,35,36]. A reliability coefficient above 0.90 was considered acceptable for clinical measurement [37].

3. Results

Nineteen adults with PCS met the inclusion criteria, provided informed consent, and completed all testing sessions, resulting in a total of 228 measurements (19 participants × 2 tests × 3 trials × 2 sessions). No adverse events occurred during or after the administration of either test.
Participant characteristics are summarized in Table 1. The sample consisted predominantly of older, retired individuals. All participants had at least one comorbid condition, with hypertension (63%) being the most prevalent. Each participant reported a single episode of COVID-19 infection, and none reported post-exertional malaise (PEM). The most commonly reported symptoms were muscle weakness (100%), reduced endurance (100%) and balance impairment (89%). The Shapiro-Wilk test indicated that all variables were normally distributed (p > 0.05).
Descriptive statistics for each trial of the TUG and 30STS tests are presented in Table 2. For both tests, performance remained relatively stable across the three trials within each session. TUG times showed only minor fluctuations between trials, and a similar pattern was evident for 30STS repetitions. Some within-subject variability was nonetheless observed across trials.

3.1. Timed Up and Go

The TUG test demonstrated excellent intra-rater reliability, with an ICC of 0.995 (95% CI: 0.991–0.998, p = 1.41 × 10⁻⁵⁶). Absolute reliability was high, with an SEM of 0.48 s and an MDC of 1.33 s.
The Bland-Altman analysis (Figure 1) showed a mean difference of −0.108 s, with 95% limits of agreement ranging from −1.185 to 0.968 s.

3.2. 30 s Sit-to-Stand

The 30STS test demonstrated excellent intra-rater reliability, with an ICC of 0.986 (95% CI: 0.973–0.994, p = 5.74 × 10⁻⁴⁷). Absolute reliability was high, with an SEM of 0.258 repetitions and an MDC of 0.717 repetitions.
The Bland-Altman analysis (Figure 2) showed a mean difference of −0.140 repetitions, with 95% limits of agreement ranging from −0.905 to 0.625 repetitions.
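To illustrate how the MDC values reported in Sections 3.1 and 3.2 might be applied, the sketch below checks whether a change between two assessments exceeds the MDC. The function name `exceeds_mdc` and the patient scores are hypothetical. Exceeding the MDC indicates only that a change is unlikely to be measurement error alone; it does not by itself establish clinical importance, which this study did not assess.

```python
# MDC values as reported in the Results of this study.
MDC_TUG_S = 1.33         # seconds
MDC_30STS_REPS = 0.717   # repetitions


def exceeds_mdc(baseline, follow_up, mdc):
    """True if the absolute change between two scores is larger than the MDC."""
    return abs(follow_up - baseline) > mdc


# Hypothetical patient scores (not study data).
print(exceeds_mdc(12.0, 10.5, MDC_TUG_S))     # True: 1.5 s change
print(exceeds_mdc(12.0, 11.0, MDC_TUG_S))     # False: 1.0 s change
print(exceeds_mdc(10, 11, MDC_30STS_REPS))    # True: 1 repetition change
```

Because 30STS scores are whole repetitions and the reported MDC is below one, any change of at least one repetition exceeds the MDC under these estimates.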

4. Discussion

This pilot study evaluated the intra-rater reliability of the TUG and 30STS tests in older adults with PCS. Both tests demonstrated excellent intra-rater reliability, with ICC values exceeding 0.90, indicating a high level of consistency when administered by the same trained evaluator under standardized conditions. These findings suggest that both tests may provide stable and repeatable measures of functional mobility, balance, and lower-limb performance in this population.
The high reliability observed is likely attributable to the standardized testing protocol, consistent instructions and use of a single experienced evaluator, all of which reduced measurement variability. While this enhances internal consistency, it may also reflect controlled measurement conditions that are not fully representative of routine clinical practice, where differences in clinician experience, environment, and patient management may introduce additional variability.
From a functional perspective, both the TUG and 30STS tests capture key domains commonly affected in PCS, including lower-limb strength, balance, gait efficiency and exercise intolerance. These impairments are frequently influenced by persistent fatigue and post-viral deconditioning, underscoring the clinical relevance of these assessments. However, the present findings support measurement reliability rather than diagnostic or prognostic validity.
An important methodological consideration in repeated functional testing is the potential for learning effects across successive trials. In this study, trial-by-trial analyses of both the TUG and 30STS tests did not demonstrate a systematic pattern of improvement across trials, suggesting that any familiarization effects were minimal in this dataset. No inferential statistical tests were performed for trial-by-trial comparisons, as the primary aim of this study was intra-rater reliability assessment.
Some within-subject variability was observed, which may reflect normal performance fluctuations in individuals with PCS, consistent with the heterogeneity of symptoms reported in this population. From a clinical perspective, these findings suggest that a formal practice trial may not always be necessary under standardized conditions, although this requires confirmation in larger and more diverse samples.
Comparison with previous research indicates that the observed reliability is consistent with studies in older adult and clinical populations. Both the TUG and 30STS tests have previously demonstrated high reliability across diverse settings [38,39,40,41,42,43,44,45,46,47]. However, direct comparisons should be interpreted cautiously due to differences in sample characteristics, protocols, and rater experience.
Agreement analysis using Bland–Altman plots showed minimal systematic bias between test sessions for both measures, with narrow limits of agreement, supporting the ICC findings and indicating low measurement error under standardized conditions. Precision indices (SEM and MDC) were comparable to or lower than those reported in previous studies [38,39,40,41,42,43,44,45,46,47], suggesting acceptable measurement precision in this sample, although variability across methodologies limits direct comparability.
It is important to distinguish reliability from measurement accuracy. High intra-rater reliability reflects consistency of repeated measurements rather than validity or absence of systematic error. This distinction is particularly relevant for performance-based tests such as the 30STS test, where some degree of subjective judgment may be required to determine correct execution. Therefore, the present results should be interpreted as evidence of measurement consistency under controlled conditions rather than generalizable measurement robustness.
Overall, while the findings support the use of both tests in controlled rehabilitation settings, caution is warranted in extrapolating results to broader clinical environments.

4.1. Limitations

Several limitations should be considered when interpreting these findings. The study included a small, convenience sample of older and predominantly retired adults with PCS recruited from a single rehabilitation center. This may limit external validity and generalizability. The small sample size also reduces the precision and stability of ICC estimates, while the relatively homogeneous population restricts applicability to more diverse PCS populations.
In addition, only a single experienced evaluator was involved, allowing assessment of intra-rater reliability only. Although this ensured consistency in test administration, it does not capture variability between clinicians. The resulting estimates may therefore reflect an idealized measurement scenario with reduced measurement variability rather than real-world clinical conditions. The single-center design may also introduce contextual bias related to local procedures, patient selection, and environmental factors.
Furthermore, the three-day interval between testing sessions may have introduced minor learning or recall effects [48,49], potentially influencing performance consistency. Finally, the fixed order of test administration, with the TUG test consistently performed before the 30STS test, may have introduced order effects. In individuals with PCS, fatigue and exercise intolerance are common, and residual fatigue may have influenced performance on the second test despite the standardized 15 min rest period.
Although the sample size was determined a priori based on ICC differences, the study was considered a pilot due to its single-center design, small absolute sample size, and focus on intra-rater reliability in a specific clinical population.
Taken together, these limitations suggest that the findings should be considered preliminary and hypothesis-generating rather than definitive evidence of measurement reliability.

4.2. Clinical Implications

The findings suggest that the TUG and 30STS tests may be feasible and potentially reliable functional assessments for older adults with PCS when administered under standardized conditions. Their simplicity, low cost, and minimal equipment requirements make them practical alternatives to more complex functional assessments, such as the Six-Minute Walk test or the Short Physical Performance Battery, in rehabilitation settings.
However, their clinical application should be considered within the context of the study design. Reliability was established under controlled conditions with a single evaluator and a relatively homogeneous sample, which may not reflect variability in routine clinical practice. Therefore, broader implementation should be supported by further evidence from larger, multi-center studies.
Future research should include multiple raters with varying levels of clinical experience to assess inter-rater reliability, alongside more diverse populations to improve generalizability. The use of video-recorded sessions may facilitate independent scoring, while randomized test order could help reduce potential order effects. In addition, longitudinal studies are needed to determine responsiveness to clinical change over time in individuals with PCS.

5. Conclusions

The TUG and 30STS tests demonstrated excellent intra-rater reliability in this pilot study of older adults with PCS. These findings suggest that both measures provide consistent and repeatable assessments of functional performance when administered under standardized conditions.
However, given the pilot design, small sample size, single-center setting, and single evaluator, these results should be interpreted as preliminary. Further research involving larger and more diverse populations, multiple raters, and multi-center designs is required to confirm these findings and establish broader clinical generalizability.

Author Contributions

Conceptualization, M.K., A.H., T.P., A.K., P.R. and C.K.; methodology, M.K., A.H., T.P., A.K., P.R. and C.K.; software, M.K.; validation, M.K., A.H., P.R. and C.K.; formal analysis, M.K., A.H., A.K. and C.K.; investigation, M.K., T.P., A.K., P.R. and C.K.; resources, M.K., T.P., A.K. and C.K.; data curation, M.K., T.P., A.K. and C.K.; writing – original draft preparation, M.K. and T.P.; writing – review and editing, M.K., A.H., T.P., A.K., P.R. and C.K.; visualization, M.K. and C.K.; supervision, A.H., A.K., P.R. and C.K.; project administration, M.K., A.H., P.R. and C.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Cyprus National Bioethics Committee (protocol code ΕΕΒΚ/ΕΠ/2022/67; 12 December 2022).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
PCS      Post-COVID-19 syndrome
30STS    30 s sit-to-stand
TUG      Timed up and go
GRRAS    Guidelines for reporting reliability and agreement studies
ICC      Intraclass correlation coefficient
CI       Confidence interval
SEM      Standard error of measurement
MDC      Minimum detectable change
PEM      Post-exertional malaise

References

  1. Yong, S.J. Long COVID or post-COVID-19 syndrome: Putative pathophysiology, risk factors, and treatments. Infect. Dis. 2021, 10, 737–754.
  2. WHO. Clinical Management of COVID-19. Living Guideline, June 2025. 2025. Available online: https://iris.who.int/server/api/core/bitstreams/d1021eff-f570-4c22-b630-a44bf4267a6c/content (accessed on 27 February 2026).
  3. Sk Abd Razak, R.; Ismail, A.; Abdul Aziz, A.F.; Suddin, L.S.; Azzeri, A.; Sha’ari, N.I. Post-COVID syndrome prevalence: A systematic review and meta-analysis. BMC Public Health 2024, 24, 1785–1803.
  4. Natarajan, A.; Shetty, A.; Delanerolle, G.; Zeng, Y.; Zhang, Y.; Raymont, V.; Rathod, S.; Halabi, S.; Elliot, K.; Shi, J.Q.; et al. A systematic review and meta-analysis of long COVID symptoms. Syst. Rev. 2023, 12, 88–107.
  5. Nalbandian, A.; Sehgal, K.; Gupta, A.; Madhavan, M.V.; McGroder, C.; Stevens, J.S.; Cook, J.R.; Nordvig, A.S.; Shalev, D.; Sehrawat, T.S.; et al. Post-acute COVID-19 syndrome. Nat. Med. 2021, 27, 601–615.
  6. Malik, P.; Patel, K.; Pinto, C.; Jaiswal, R.; Tirupathi, R.; Pillai, S.; Patel, U. Post-acute COVID-19 syndrome (PCS) and rehabilitation: A systematic review. J. Med. Virol. 2022, 94, 154–168.
  7. Jones, C.J.; Rikli, R.E.; Beam, W.C. A 30-s chair-stand test as a measure of lower body strength in community-residing older adults. Res. Q. Exerc. Sport 1999, 70, 113–119.
  8. Kear, B.M.; Guck, T.P.; McGaha, A.L. Timed Up and Go (TUG) Test: Normative Reference Values for ages 20-59 years and relationships with physical and mental health risk factors. J. Prim. Care. Community Health 2017, 8, 9–13.
  9. Bellan, M.; Soddu, D.; Balbo, P.E.; Baricich, A.; Zeppegno, P.; Avanzi, G.C.; Baldon, G.; Bartolomei, G.; Battaglia, M.; Battistini, S.; et al. Respiratory and Psychophysical Sequelae Among Patients With COVID-19 Four Months After Hospital Discharge. JAMA Netw. Open 2021, 4, e2036142.
  10. Gloeckl, R.; Leitl, D.; Jarosch, I.; Schneeberger, T.; Nell, C.; Stenzel, N.; Vogelmeier, C.F.; Kenn, K.; Koczulla, A.R. Benefits of pulmonary rehabilitation in COVID-19: A prospective observational cohort study. ERJ Open Res. 2021, 7, 00108–00118.
  11. Rahmati, M.; Udeh, R.; Kang, J.; Dolja-Gore, X.; McEvoy, M.; Kazemi, A.; Soysal, P.; Smith, L.; Kenna, T.; Fond, G.; et al. Long-Term Sequelae of COVID-19: A Systematic Review and Meta-Analysis of Symptoms 3 Years Post-SARS-CoV-2 Infection. J. Med. Virol. 2025, 6, e70429.
  12. Kulik, G.L.; Zheng, T.; Jolley, S.E.; Ashktorab, H.; Brim, H.; Feuerriegel, E.M.; Hafner, J.W.; Hess, R.; Horne, B.D.; Hornig, M.; et al. Physical Function Differences by COVID-19 Status: A Cross-sectional Analysis From the RECOVER Adult Cohort. Phys. Ther. 2025, 105, pzaf063.
  13. Baricich, A.; Borg, M.B.; Cuneo, D.; Cadario, E.; Azzolina, D.; Balbo, P.E.; Bellan, M.; Zeppegno, P.; Pirisi, M.; Cisari, C.; et al. Midterm functional sequelae and implications in rehabilitation after COVID-19: A cross-sectional study. Eur. J. Phys. Rehabil. Med. 2021, 57, 199–207.
  14. Gill, S.; Hely, R.; Page, R.S.; Hely, A.; Harrison, B.; Landers, S. Thirty second chair stand test: Test-retest reliability, agreement and minimum detectable change in people with early-stage knee osteoarthritis. Physiother. Res. Int. 2022, 27, e1957.
  15. Reis, M.; Teixeira, M.; Carvão, C.; Martins, A.C. Validity and Reliability of the Self-Administered Timed Up and Go Test in Assessing Fall Risk in Community-Dwelling Older Adults. Geriatrics 2025, 10, 62.
  16. Kottner, J.; Audige, L.; Brorson, S.; Donner, A.; Gajewski, B.J.; Hróbjartsson, A.; Roberts, C.; Shoukri, M.; Streiner, D.L. Guidelines for reporting reliability and agreement studies (GRRAS) were proposed. Int. J. Nurs. Stud. 2011, 48, 661–671.
  17. Karanicolas, P.J.; Bhandari, M.; Kreder, H.; Moroni, A.; Richardson, M.; Walter, S.D.; Norman, G.R.; Guyatt, G.H.; Collaboration for Outcome Assessment on Surgical Trials (COAST) Musculoskeletal Group. Evaluating agreement: Conducting a reliability study. J. Bone Jt. Surg. Am. 2009, 91, 99–106.
  18. Sim, J.; Wright, C. Research in Health Care: Concepts, Designs and Methods; Stanley Thornes Ltd.: Cheltenham, UK, 2002; pp. 123–139.
  19. Streiner, D.L.; Norman, G.R.; Cairney, J. Health Measurement Scales: A Practical Guide to Their Development and Use; Oxford University Press: Oxford, UK, 2014.
  20. Cardarelli, R.; Seater, M.M. Evidence-based medicine, part 4. An introduction to critical appraisal of articles on harm. J. Am. Osteopath. Assoc. 2007, 107, 310–314.
  21. Spruit, M.A.; Singh, S.J.; Garvey, C.; ZuWallack, R.; Nici, L.; Rochester, C.; Hill, K.; Holland, A.E.; Lareau, S.C.; Man, W.D.; et al. An official American Thoracic Society/European Respiratory Society statement: Key concepts and advances in pulmonary rehabilitation. Am. J. Respir. Crit. Care Med. 2013, 188, 13–64.
  22. Jimeno-Almazán, A.; Franco-López, F.; Buendía-Romero, Á.; Martínez-Cava, A.; Sánchez-Agar, J.A.; Sánchez-Alcaraz Martínez, B.J.; Courel-Ibáñez, J.; Pallarés, J.G. Rehabilitation for post-COVID-19 condition through a supervised exercise intervention: A randomized controlled trial. Scand. J. Med. Sci. Sports 2022, 32, 1791–1801.
  23. van Haastregt, J.C.M.; Everink, I.H.J.; Schols, J.M.G.A.; Grund, S.; Gordon, A.L.; Poot, E.P.; Martin, F.C.; O’Neill, D.; Petrovic, M.; Bachmann, S.; et al. Management of post-acute COVID-19 patients in geriatric rehabilitation: EuGMS guidance. Eur. Geriatr. Med. 2022, 13, 291–304.
  24. Sakai, T.; Hoshino, C.; Hirao, M.; Nakano, M.; Takashina, Y.; Okawa, A. Rehabilitation of Patients with Post-COVID-19 Syndrome: A Narrative Review. Prog. Rehabil. Med. 2023, 8, 20230017.
  25. Mokkink, L.B.; Eekhout, I.; Boers, M.; van der Vleuten, C.P.M.; de Vet, H.C.W. Studies on reliability and measurement error of measurements in medicine- from design to statistics explained for medical researchers. Patient Relat. Outcome Meas. 2023, 14, 193–212.
  26. Koo, T.K.; Li, M.Y. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J. Chiropr. Med. 2016, 15, 155–163.
  27. Parker, R.A.; Scott, C.; Inácio, V.; Stevens, N.T. Using multiple agreement methods for continuous repeated measures data: A tutorial for practitioners. BMC Med. Res. Methodol. 2020, 20, 154.
  28. Haley, S.M.; Fragala-Pinkham, M.A. Interpreting change scores of tests and measures used in physical therapy. Phys. Ther. 2006, 86, 735–743.
  29. Bland, J.M.; Altman, D. Statistical Methods for Assessing Agreement Between Two Methods of Clinical Measurement. Lancet 1986, 327, 307–310.
  30. Cesana, B.M.; Antonelli, P. Bland and Altman agreement method: To plot differences against means or differences against standard? An endless tale? Clin. Chem. Lab. Med. 2023, 62, 262–269.
  31. Gerke, O. Reporting standards for a Bland-Altman agreement analysis: A review of methodological reviews. Diagnostics 2020, 10, 334.
  32. Borg, D.N.; Bach, A.J.E.; O’Brien, J.L.; Sainani, K.L. Calculating sample size for reliability studies. PM&R 2022, 14, 1018–1025.
  33. Donner, A.; Eliasziw, M. Sample size requirements for reliability studies. Stat. Med. 1987, 6, 441–448.
  34. Zou, G.Y. Sample size formulas for estimating intraclass correlation coefficients with precision and assurance. Stat. Med. 2012, 31, 3972–3981.
  35. Walter, S.D.; Eliasziw, M.; Donner, A. Sample size and optimal designs for reliability studies. Stat. Med. 1998, 17, 101–110.
  36. Lu, M.J.; Zhong, W.H.; Liu, Y.X.; Miao, H.Z.; Li, Y.C.; Ji, M.H. Sample Size for Assessing Agreement between Two Methods of Measurement by Bland-Altman Method. Int. J. Biostat. 2016, 12, 20150039.
  37. Portney, L.; Watkins, M. Foundations of Clinical Research: Applications to Practice, 2nd ed.; Pearson/Prentice Hall: Upper Saddle River, NJ, USA, 2000.
  38. Marques, A.; Cruz, J.; Quina, S.; Regêncio, M.; Jácome, C. Reliability, Agreement and Minimal Detectable Change of the Timed Up & Go and the 10-Meter Walk Tests in Older Patients with COPD. COPD J. Chronic Obstr. Pulm. Dis. 2016, 13, 279–287.
  39. Alghadir, A.H.; Al-Eisa, E.S.; Anwer, S.; Sarkar, B. Reliability, validity, and responsiveness of three scales for measuring balance in patients with chronic stroke. BMC Neurol. 2018, 18, 141.
  40. Hadjiioannou, I.; Wong, K.; Lindup, H.; Mayes, J.; Castle, E.; Greenwood, S. Test–Retest Reliability for Physical Function Measures in Patients with Chronic Kidney Disease. J. Ren. Care 2020, 46, 25–34.
  41. Aktar, B.; Balci, B.; Oztura, I.; Baklan, B. The test-retest reliability and minimal detectable change of the six-minute walk test, timed up and go test, and 30-second chair stand test in people with epilepsy. Physiother. Theory Pract. 2024, 40, 2298–2407.
  42. Ozcan Kahraman, B.; Ozsoy, I.; Akdeniz, B.; Ozpelit, E.; Sevinc, C.; Acar, S.; Savci, S. Test-retest reliability and validity of the timed up and go test and 30-second sit to stand test in patients with pulmonary hypertension. Int. J. Cardiol. 2020, 304, 159–163.
  43. Hansen, H.; Beyer, N.; Frølich, A.; Godtfredsen, N.; Bieler, T.T. Intra- and inter-rater reproducibility of the 6-minute walk test and the 30-second sit-to-stand test in patients with severe and very severe COPD. Int. J. Chronic Obstruct. Pulmon. Dis. 2018, 13, 3447–3457.
  44. Figueiredo, P.H.S.; Veloso, L.R.S.; Lima, M.M.O.; Vieira, C.F.D.; Alves, F.L.; Lacerda, A.C.R.; Lima, V.P.; Rodrigues, V.G.B.; Maciel, E.H.B.; Costa, H.S. The reliability and validity of the 30-seconds sit-to-stand test and its capacity for assessment of the functional status of hemodialysis patients. J. Bodyw. Mov. Ther. 2021, 27, 157–164.
  45. Wang, Z.; Yan, J.; Meng, S.; Li, J.; Yu, Y.; Zhang, T.; Tsang, R.C.C.; El-Ansary, D.; Han, J.; Jones, A.Y.M. Reliability and validity of sit-to-stand test protocols in patients with coronary artery disease. Front. Cardiovasc. Med. 2022, 9, 841453.
  46. Özkeskin, M.; Özden, F.; Ar, E.; Yüceyar, N. The reliability and validity of the 30-second chair stand test and modified four square step test in persons with multiple sclerosis. Physiother. Theory Pract. 2023, 39, 2189–2195. [Google Scholar] [CrossRef]
  47. Unver, B.; Kalkan, S.; Yuksel, E.; Kahraman, T.; Karatosun, V. Reliability of the 50-foot walk test and 30-sec chair stand test in total knee arthroplasty. Acta Ortop. Bras. 2015, 23, 184–187. [Google Scholar] [CrossRef]
  48. Marx, R.G.; Menezes, A.; Horovitz, L.; Jones, E.J.; Warren, R.F. A comparison of two time intervals for test-retest reliability of health status instruments. J. Clin. Epidemiol. 2003, 56, 730–735. [Google Scholar] [CrossRef] [PubMed]
  49. Ofem, U.J.; Owan, V.J.; Ibout, C.; Ovat, S.V. Paradigm shift in reliability estimates: Application of analysis of variance repeated measures (ANOVAM) in validation studies. Pedagog. Res. 2025, 10, em0239. [Google Scholar] [CrossRef]
Figure 1. Bland-Altman plot for TUG test. The solid line represents the mean difference and the dashed lines indicate the 95% limits of agreement.
Figure 2. Bland-Altman plot for 30STS test. The solid line represents the mean difference and the dashed lines indicate the 95% limits of agreement.
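The lines described in the captions of Figures 1 and 2 can be illustrated with a minimal sketch. It assumes the conventional Bland-Altman definition, in which the 95% limits of agreement are the mean of the paired differences ± 1.96 times their standard deviation. The function name `bland_altman_limits` and the sample values are hypothetical; they are not the authors' analysis code or the study data.

```python
import statistics


def bland_altman_limits(session_1, session_2, z=1.96):
    """Return (mean difference, lower limit, upper limit) for paired scores.

    Assumes the conventional limits of agreement: mean difference +/- z * SD
    of the paired differences, with z = 1.96 for 95% limits.
    """
    if len(session_1) != len(session_2):
        raise ValueError("paired measurements must have equal length")
    if len(session_1) < 2:
        raise ValueError("at least two pairs are needed to estimate the SD")
    # Difference between sessions for each participant
    diffs = [a - b for a, b in zip(session_1, session_2)]
    mean_diff = statistics.mean(diffs)   # solid line in the plot
    sd_diff = statistics.stdev(diffs)    # sample SD (n - 1 denominator)
    # Dashed lines in the plot
    return mean_diff, mean_diff - z * sd_diff, mean_diff + z * sd_diff


# Hypothetical TUG times in seconds (illustrative only, NOT the study data)
day_1 = [12.0, 15.5, 10.2, 18.4, 14.1]
day_2 = [12.4, 15.1, 10.6, 18.0, 14.5]

bias, lower, upper = bland_altman_limits(day_1, day_2)
print(f"mean difference = {bias:.2f} s, limits of agreement = ({lower:.2f}, {upper:.2f}) s")
```

In a plot of this kind, each participant's between-session difference is drawn against the mean of the two sessions, and points falling between the two limits indicate agreement within the expected range.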
Table 1. Participant characteristics.

| Characteristic | Value |
|---|---|
| Gender ᵃ | |
| Male | 11 (58%) |
| Female | 8 (42%) |
| Age, years ᵇ | 70.42 ± 12.88 |
| COVID-19 severity stage ᵃ | |
| Mild | 6 (32%) |
| Moderate | 6 (32%) |
| Severe | 2 (10%) |
| Critical | 5 (26%) |
| Occupation ᵃ | |
| Retired | 13 (68%) |
| Unemployed | 1 (5%) |
| Housewife | 2 (10%) |
| Others | 3 (15%) |
| Most common comorbidities ᵃ | |
| Hypertension | 12 (63%) |
| Type 2 diabetes mellitus | 9 (47%) |
| Atrial fibrillation | 7 (37%) |
| High cholesterol | 6 (32%) |
| COPD | 5 (26%) |
| Number of COVID-19 infections ᵃ | |
| 1 infection | 19 (100%) |
| Post-exertional malaise (PEM) ᵃ | |
| Yes | 0 (0%) |
| No | 19 (100%) |
| Symptoms ᵃ | |
| Fatigue | 15 (79%) |
| Dyspnea | 12 (63%) |
| Muscle weakness | 19 (100%) |
| Balance impairment | 17 (89%) |
| Reduced endurance | 19 (100%) |

ᵃ Data are presented as n (%). ᵇ Data are presented as mean ± SD.
Table 2. Descriptive statistics for each trial of the TUG and 30STS tests across sessions.

| Test | Session | Trial 1 | Trial 2 | Trial 3 | Overall |
|---|---|---|---|---|---|
| TUG (s) | Day 1 | 14.53 ± 4.13 | 14.62 ± 4.21 | 14.26 ± 4.13 | 14.47 ± 4.09 |
| TUG (s) | Day 2 | 15.05 ± 4.15 | 14.45 ± 4.12 | 14.24 ± 4.15 | 14.58 ± 4.08 |
| 30STS (repetitions) | Day 1 | 8.16 ± 2.06 | 8.00 ± 2.13 | 8.32 ± 2.40 | 8.16 ± 2.17 |
| 30STS (repetitions) | Day 2 | 8.26 ± 2.49 | 8.42 ± 2.27 | 8.21 ± 1.99 | 8.30 ± 2.22 |

Data are presented as mean ± standard deviation.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
