Functional Tests Predicting Return to Work of Workers with Non-Specific Low Back Pain: Are There Any Validated and Usable Functional Tests for Occupational Health Services in Everyday Practice? A Systematic Review

The literature predominantly advocates subjective perception of disability and pain as an outcome measure for the functional evaluation of patients with low back pain (LBP). Physical outcome measurements are almost completely ignored. In this systematic review, we focused on physical functional measurements that can contribute to the prediction of patients’ return to work (RTW) readiness after sick leave or rehabilitation. Searches were conducted in July 2022 without any time limit in the Cochrane Library, PEDro, PubMed and Scopus databases for functional and clinical tests reliable and applicable in clinical practice without demanding equipment. Two independent researchers extracted the data from the included articles in a standardised data collection form, and a third researcher validated the data extraction. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines in conducting the review. We found seven original articles, including six with an impact on predicting RTW. Of these, four were rated fair and three poor in quality. We found the Back Performance Scale (BPS) and back endurance test to be the most promising tests for occupational health service and the clinical practitioner. Radiation of back pain, with or without neurological deficiencies, also had some predictive value in terms of RTW. Working conditions vary considerably, which causes inconsistency in the studies and in their interpretation. Functional tests could complement the widely used work ability evaluation methods such as the Work Ability Index (WAI) and are worth considering for future research. Overall, more research is needed in this field. The question of when LBP patients can resume everyday activities and work cannot be answered with functional tests alone. Psychosocial aspects and work demands must be considered. PROSPERO: CRD42022353955.
The study was funded by the University of Helsinki.


Introduction
An estimated 7.5% of the global population are affected by LBP either acutely or chronically, making LBP a major burden both individually and socioeconomically [1][2][3]. Up to 90% of all LBP cases are considered "non-specific" [4], i.e., the cause of pain is not known [5], and pain can come from any part of the spine supplied by pain nerves. While most patients heal by themselves or with conservative treatment, 5-20% develop a chronic condition, with an increased prevalence among the older population [6]. Chronic low back pain (CLBP) is defined as pain lasting longer than 3 months [7]. The functional evaluation of patients with unspecific subacute or chronic low back pain (LBP) is a continuous challenge, particularly in occupational health service.
The prevailing literature advocates multifaceted follow-up evaluations for patients with LBP. According to a comprehensive review by Chapman et al. [8], the most important domains are pain, function and quality of life. For pain, they recommended the Visual Analogue Scale (VAS) and the Numeric Pain Rating Scale (NPRS) because of their ease of administration and responsiveness. For function, they recommended the Oswestry Disability Index (ODI) [9] and the Roland Morris Disability Questionnaire (RMDQ) [10]. Notably, functional tests were not mentioned. The same observation was made in the review by Froud et al. [11], namely, that functional measurements are neglected as outcome measures in studies of patients with LBP. This is contradictory to the principles of the International Classification of Functioning (ICF) [12], which combines bodily functions and the social consequences of diseases and advocates a holistic view when evaluating the outcome of diseases. In addition to the well-grounded conceptual theory of the ICF, functional tests for patients' back pain are used and required in occupational health service. This presents a dilemma: no functional test for patients with LBP has so far succeeded in gaining unanimous acceptance as the method of choice to predict return to work (RTW). A good test should be valid, reliable and responsive, and predict RTW or illustrate the patient's working capacity.
The demand for practical functional tests for patients with LBP is obvious. Modern occupational health service aims to contribute to the total management of working ability within the employing company. Modified work and alternative work or duties [13,14] are increasingly adopted as a tool to combat total working incapacity. Physical tests, i.e., measurements, provide a sounder basis for decision-making when different work measures are planned. The same argument applies to social insurance institutions, which prefer functional tests or other clinical evidence to mere subjective perception. This calls for the development of tools to predict the RTW of patients with LBP.
Work demands vary hugely, so it is almost impossible to find a functional test that could give a comprehensive view of any patient's working capacity to RTW after an episode of back pain. However, we aimed to search the literature to find the best available and suitable functional tests for patients with LBP (i.e., subacute or chronic unspecific low back pain) to predict their RTW, knowing that any functional test should be accompanied by psychometric and social evaluations. We focused on clinical and functional tests that are applicable in everyday practice without any special or expensive technology and that can be conducted within a 20 min visit. The physical test must be reliable enough to be usable, and it must also predict RTW to some extent. Accordingly, we omitted demanding functional tests planned for special situations, such as tests for fire-fighters that use isokinetic measurements [15], and tests whose reliability has not been tested. Definitions of terms are shown in Table 1.

Table 1. Definitions of terms (concept: definition of concept [reference]).
Working capacity: Work capacity is the ability to perform real physical work [16].
Working ability: Work ability is the result of the interaction of the worker with his or her work and indicates how good a worker is at present and in the near future, and how able he or she is to do his or her work with respect to work demands, health and mental resources [16].
Work Ability Index (WAI): The Work Ability Index (WAI) contains questions concerning your work, your work ability and your health. Your answers help to indicate whether measures for improving your health have to be taken and if your work ability must be improved [17].
Low back pain (LBP): It is defined by the location of pain, typically between the lower rib margins and the buttock creases. It is commonly accompanied by pain in one or both legs and some people with low back pain have associated neurological symptoms in the lower limbs [18].
Chronic low back pain: Chronic low back pain (CLBP) is defined as pain lasting longer than 3 months [7].
Unspecific low back pain: Up to 90% of all LBP instances are considered "unspecific" [4], i.e., the cause of pain is not known [5].
Return to work (RTW): Return to work (RTW) is a key pillar in a set of workplace processes designed to facilitate the workplace reintegration of persons concerned who experience a reduction in work capacity as a result of either occupational or non-occupational diseases or injuries. The return to work of workers who are on sick leave is part of a continuum of processes aimed at protecting and promoting the health, well-being and work ability of the workforce. Return to work is one important component of a tertiary prevention approach [19].
Physical function/performance: Defined as one's ability to carry out activities that require physical actions, ranging from self-care (activities of daily living) to more complex activities that require a combination of skills, often with a social component or within a social context [20].
Functional tests: Functional performance testing means using a set of tests to determine performance abilities or functional limitations. A functional limitation is the inability to perform a particular activity at a normal level [21].

Materials and Methods
This review was registered at the International Prospective Register of Systematic Reviews: PROSPERO (CRD42022353955). We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines in conducting the review and reporting our findings [22].

Data Sources and Search Strategy
A comprehensive literature search was conducted in July 2022 in the Cochrane Library, PEDro, PubMed and Scopus databases, combining the following keywords: ((low back pain AND clinical test AND return to work) OR (low back pain AND clinical test AND disability pension) OR (low back pain AND clinical test AND sick leave) OR (work related low back-pain AND clinical test) OR (work related low back-pain AND physical test)). No language restrictions were applied, nor was any date limitation. Searches were also conducted for previous systematic reviews and cross-references.
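The structure of the combined search string can be made explicit with a small sketch. The following Python snippet is only an illustration of how the five AND-groups are joined with OR; the helper names (`and_group`, `build_query`) are hypothetical, and the actual syntax accepted by each database (field tags, quoting, truncation) is not reproduced here.

```python
# Illustrative sketch only: assembles the boolean search string described above.
# Helper names are hypothetical; database-specific syntax is not modelled.

def and_group(*terms):
    """Join search terms with AND and wrap the group in parentheses."""
    return "(" + " AND ".join(terms) + ")"


def build_query():
    """Combine the five AND-groups from the search strategy with OR."""
    groups = [
        and_group("low back pain", "clinical test", "return to work"),
        and_group("low back pain", "clinical test", "disability pension"),
        and_group("low back pain", "clinical test", "sick leave"),
        and_group("work related low back-pain", "clinical test"),
        and_group("work related low back-pain", "physical test"),
    ]
    return "(" + " OR ".join(groups) + ")"


if __name__ == "__main__":
    print(build_query())
```

Keeping the groups in a list makes it straightforward to see that every group contains a pain term and a test term, and that only the first three add a work-outcome term.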

Inclusion and Exclusion Criteria
The inclusion criteria were as follows: (1) studies: original articles; (2) participants: working patients with chronic LBP (lasting more than 3 months); (3) interventions: objective tests to measure patients' physical performance; and (4) outcome measures: the primary outcome measure was work ability, and the outcome indicator was RTW after sick leave. The exclusion criteria were as follows: (1) studies in which the subject had an acute cause of LBP, including fracture, osteoporosis and malignancy; and (2) letters, conferences and commentaries.

Data Extraction
For all the included articles, the following data were extracted: (a) study characteristics and study design (author, year and sample size); (b) participants (sex, age, number of participants and inclusion and exclusion criteria); (c) intervention and comparison groups; (d) follow-up time; (e) functional tests; (f) outcome measures; (g) results; and (h) conclusions.
Due to the limited number of included studies, a quantitative analysis was deemed inapplicable to this study.

Methodological Quality Evaluation and Risk of Bias (RoB)
Two independent researchers (HH and LR) extracted the data from the included articles in a standardised data collection form, and a third researcher (TV) validated the data extraction. Any disagreements between the two researchers were resolved by the third researcher (TV) until consensus was attained. Study quality was assessed with the Study Quality Assessment Tool criteria developed by the National Heart, Lung and Blood Institute [23]. The quality of the studies was rated as good, fair or poor.
We assessed risk of bias (RoB) of the studies according to the principles presented by Furlan et al. [24]. The bias was assessed in a structured and fixed set of domains, focusing on design of the study, conduct, generalisability of the results and reporting. Two review authors (HH and LR) independently assessed RoB. We used a consensus method to resolve disagreements and consulted the third review author if disagreements persisted. We scored the criteria as "high risk" or "low risk". "High risk" for a study was achieved if at least one domain was of high RoB or there were some concerns for multiple domains that substantially lowered the confidence in the result. In cases of "high risk" evaluation of RoB, the overall quality evaluation was lowered from good to fair and from fair to poor.
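The downgrading rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the review's actual procedure: the function names, the domain labels and the rating vocabulary ("low", "some concerns", "high") are assumptions, and the threshold of two domains for "multiple" concerns is an assumed stand-in for what was a judgement by the reviewers. That a study already rated poor stays poor is likewise an assumption.

```python
# Minimal sketch of the RoB downgrading rule; names and thresholds are assumptions.

QUALITY_ORDER = ["good", "fair", "poor"]


def overall_rob(domain_ratings, multiple_threshold=2):
    """Return 'high risk' or 'low risk' from per-domain ratings.

    domain_ratings maps a domain name to 'low', 'some concerns' or 'high'.
    multiple_threshold is an assumed cut-off for "multiple domains".
    """
    ratings = list(domain_ratings.values())
    if "high" in ratings:
        return "high risk"
    if ratings.count("some concerns") >= multiple_threshold:
        return "high risk"
    return "low risk"


def adjusted_quality(initial_quality, rob):
    """Lower the quality rating by one step when RoB is 'high risk'."""
    index = QUALITY_ORDER.index(initial_quality)
    if rob == "high risk":
        # Assumption: a study already rated 'poor' cannot be lowered further.
        index = min(index + 1, len(QUALITY_ORDER) - 1)
    return QUALITY_ORDER[index]


if __name__ == "__main__":
    example = {"design": "low", "conduct": "high",
               "generalisability": "low", "reporting": "low"}
    rob = overall_rob(example)
    print(rob, adjusted_quality("good", rob))
```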

Literature Search Results
Our search yielded 1534 records according to the predefined search strategy, of which 410 records were duplicates. A total of 1067 studies were excluded after screening the abstract. The full text of 48 articles was retrieved for detailed evaluation. Finally, we identified 17 original articles to consider in our systematic review. After a detailed review, two of the authors (HH and LR) included seven articles in the final study quality assessment. Eight articles were excluded because there was no information about the association between functional tests and RTW [25][26][27][28][29][30][31][32], and two were excluded because they were review articles [8,33]. The final literature review yielded seven articles fulfilling the criteria. One article was found through a manual search, and the others were found through screening the databases. The flow chart is presented in Figure 1. Baseline details of the included articles are presented in Table 2.
The outcome measures, results and conclusions of the finally accepted articles are presented in Table 3. One article, by Christiansen et al. [34], fulfilled the research criteria but showed a negative result for a clinical phenomenon called centralisation as an indicator of RTW. The pain centralisation phenomenon is a term used in a form of physical therapy known as the McKenzie Method. Centralisation describes a phenomenon whereby pain in a leg or buttock suddenly shifts to a spot closer to the spine if the spine is moved or manipulated [35]. Christiansen et al. [34] studied this phenomenon in a sample of 351 patients on sick leave because of LBP with or without sciatica. The patients were classified into three groups according to their pain response: centralisation, peripheralisation or no response. At the one-year follow-up, 65% of the patients had returned to work. All the pain response groups showed significant and clinically important improvements in both pain and disability. No significant differences were found between the pain response groups in any outcome measure.

Table 3 (excerpt). Outcome measures, results and conclusions of the included articles.
Conclusions (BPS): The BPS, including 5 physical performance tests of daily activities, appears to be a useful instrument for reflecting key aspects of performance in patients with long-lasting back problems. Internal consistency of the BPS was high, and discriminative ability of the instrument and responsiveness to change were demonstrated. The BPS was shown to be more responsive to change than each of the 5 tests separately. As performance of the 5 tests primarily requires mobility of the trunk in the sagittal plane, the authors believe future research should examine whether tests using side bending and twisting also should be included or could replace other tests to have an even better measure of mobility-related activities in people with back pain. The BPS is a practical measure of performance, being easy and quick to perform, with no need for costly equipment. Reliability of the BPS sum score needs to be established.
Bendix AF et al., 1998 [39], Denmark
Outcome measures:
* Ability to work
* Disability pension obtained or application pending
* Completion versus withdrawal from treatment
* Change in back and leg pain
* Change in level of activities of daily living
* Subjective overall assessment of back problems
Results:
* Young age correlates positively to return to work (RTW)
* Women were almost twice as likely as men to return to work (gender)
* High severity of leg pain and low back muscle endurance correlated positively with pension obtained or application pending in the functional restoration (FR) group
Conclusions:
* Different factors can be identified as predictive of outcome in a functional restoration program, but most of these factors were also shown to predict success for shorter control outpatient programs or of no treatment.
* Of the physical findings, pretreatment back muscle endurance was the most important factor, and it was positively correlated with less dependency on disability pension and reduction of back pain level during one year

According to [36], the BPS sum scores discriminated between patients with different RTW statuses, and the BPS sum score was more predictive than the separate tests. In addition, the BPS was used in a randomised rehabilitation study by Magnussen et al. [38], but the number of participants completing the programme was low; only 29 of the participants (64%) in the intervention group completed the intervention. Twice as many in the intervention group (n = 10, 22%) had entered a RTW process compared with the controls (n = 5, 11%) [38]. Better physical performance was one of the factors related to RTW, along with positive expectancy and less pain. The small number of subjects hampers the conclusions of that study.
Back muscle endurance was connected to working capacity in a study by Bendix et al. [39]. In their study of a functional restoration programme, poor back muscle endurance was connected to receipt of a disability pension. RTW after the one-year follow-up was related to good pre-treatment back muscle endurance, and the relief of back pain in the functional restoration programme was related to pre-treatment back muscle endurance.
Kool et al. [40] conducted a prospective cohort study of 99 patients with chronic LBP. Upon entry to the study, physical workload, time off work, unemployment and nationality were recorded. The study investigated four tests with an anticipated prognostic value for non-RTW: the Numeric Pain Rating Scale (NRS, 9-10 out of a maximum of 10), the Step Test, the Pseudo Strength Test (precipitous cessation, described in the original article) and Behavioural Signs, originally described by Waddell et al. [42]. The best prediction of non-RTW was obtained when at least two out of the four tests were positive (positive predictive value of 0.97 and sensitivity of 0.45).
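To make the "at least two out of four" decision rule and the two reported statistics concrete, the following Python sketch shows how such a rule and its positive predictive value and sensitivity are computed. The function names are hypothetical, and the counts in the usage example are invented solely to illustrate the formulas; they are not the data of Kool et al. [40].

```python
# Sketch of a "k of 4 tests positive" rule and of PPV / sensitivity.
# Function names are hypothetical; example counts are NOT study data.

def count_positive_tests(nrs_score, step_test, pseudo_strength, behavioural_signs):
    """Count positive tests; the NRS counts as positive at 9-10 out of 10."""
    nrs_positive = nrs_score >= 9
    return sum([nrs_positive, step_test, pseudo_strength, behavioural_signs])


def predicts_non_rtw(nrs_score, step_test, pseudo_strength, behavioural_signs,
                     min_positive=2):
    """Predict non-RTW when at least min_positive of the four tests are positive."""
    n_positive = count_positive_tests(nrs_score, step_test,
                                      pseudo_strength, behavioural_signs)
    return n_positive >= min_positive


def positive_predictive_value(true_pos, false_pos):
    """Share of patients predicted as non-RTW who actually did not return."""
    return true_pos / (true_pos + false_pos)


def sensitivity(true_pos, false_neg):
    """Share of actual non-RTW patients that the rule identified."""
    return true_pos / (true_pos + false_neg)


if __name__ == "__main__":
    # Hypothetical patient: NRS of 9 and a positive Step Test.
    print(predicts_non_rtw(9, True, False, False))
    # Hypothetical counts chosen only to illustrate the two formulas.
    tp, fp, fn = 9, 1, 11
    print(positive_predictive_value(tp, fp), sensitivity(tp, fn))
```

A high positive predictive value combined with a low sensitivity means that a positive rule is rarely wrong about non-RTW, but that many patients who do not return to work are not flagged by it.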
Loisel et al. [41] studied the Quebec Task Force Classification (QTFC) [43], which classifies patients with LBP based on simple clinical criteria, including signs and symptoms (pain and neurologic examination data), imaging test results and response to treatment. It was designed for several purposes: making clinical decisions, determining prognoses and evaluating quality of care [43]. Subjects classified as having distal radiating pain at baseline (QTFC categories 3 and 4) were likelier to have lower functional status, higher pain level and no return to regular work at the one-year follow-up. The medical history and physical examination allowed the physician to classify the subjects according to the first four categories of the QTFC: QTFC 1 (pain without radiation), QTFC 2 (pain with proximal radiation, i.e., above the knee), QTFC 3 (pain with distal radiation, i.e., below the knee) and QTFC 4 (pain with distal radiation and neurologic signs). The patients with distal radiating pain were also likelier to accumulate more days of full compensation and to have higher treatment costs after a mean follow-up period of 6.5 years.
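The first four QTFC categories as described above can be expressed as a simple mapping from pain radiation and neurologic findings. The Python sketch below is illustrative only; the function names and the string labels for radiation ("none", "proximal", "distal") are assumptions, and the higher QTFC categories that depend on imaging or treatment response are not modelled.

```python
# Illustrative mapping to QTFC categories 1-4 as described in the text.
# Function names and radiation labels are assumptions for this sketch.

def qtfc_category(radiation, neurologic_signs=False):
    """Return QTFC category 1-4 from history and physical examination.

    radiation: 'none', 'proximal' (above the knee) or 'distal' (below the knee).
    neurologic_signs only changes the result for distal radiation, because
    the text defines category 4 as distal radiation with neurologic signs.
    """
    if radiation == "none":
        return 1
    if radiation == "proximal":
        return 2
    if radiation == "distal":
        return 4 if neurologic_signs else 3
    raise ValueError("radiation must be 'none', 'proximal' or 'distal'")


def has_distal_radiation(category):
    """True for QTFC 3 and 4, the group with the poorer one-year outcome."""
    return category in (3, 4)


if __name__ == "__main__":
    category = qtfc_category("distal", neurologic_signs=True)
    print(category, has_distal_radiation(category))
```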

Discussion
The literature search yielded seven original articles, all of which were at least 10 years old. This suggests that work ability, functional capacity and clinical signs are no longer a focus of research. In one systematic review [8], it was even recommended to avoid RTW as an outcome measure in studies about chronic LBP. RTW has been considered too complicated as an outcome measure, which may also reflect general attitudes in this research field. A more recent review by Froud et al. [11] found that the number of published LBP trials has increased by a factor of 4.5 per year from 5.4 (1980-1999) to 22.4 (2000-2012). The most common outcome measures were the VAS and the Roland-Morris Disability Questionnaire for functional disability. The authors did not discuss the role of functional measurements at all but focused on the forms and questionnaires. This tallies with our observations and raises the question of whether functional tests are outdated. However, functional tests should not be totally ignored. They can have an important impact on medico-legal issues. Functional tests of patients with LBP clearly have methodological problems. However, if they are totally replaced by patients' perceptions of pain and disability, the results may be misleading for medical personnel and patients. A concrete measurement illustrates important aspects of the illness.
When selecting the clinical tests, we omitted the methods requiring special techniques, such as isokinetic measurements [15] or tests developed for an exceptional group such as fire-fighters [44]. The Isernhagen work system was also ignored. Although it proved reliable in functional capacity evaluation in patients with chronic LBP [45,46], it is time-consuming to carry out and requires special arrangements.
We wanted to find tests that would be readily usable and easily available for the physician, physiotherapist or occupational therapist, so that the tests could be recommended for clinical practice. A good test must be reliable and responsive, and it must predict RTW. We understand that functional tests or clinical findings cannot solely be used in the evaluation process of working capacity. Psychosocial factors and work measures play a key role in the evaluation process, but this study focused on physical factors.
Hanke et al. [47] conducted a narrative review of function-based tests to determine the return-to-activity state with non-specific LBP. They identified 33 different tests for which positive statements regarding reliability, validity and relevance for the assessment of return-to-activity status in non-specific back pain could be made. The ability to walk, behaviour when lifting and carrying objects, motor control, muscle strength and mobility play a particular role, according to their study. In our study, we found only seven articles that considered physical performance measurements in RTW evaluation. Hanke et al. [47] included cross-sectional studies, which increased the number of tests. In our study, we required predictive evidence of the test for RTW as a criterion.
In our study, we found four fair and three poor original studies fulfilling our criteria. Three studies classified as fair overlap [36][37][38], because Strand et al. [36] presented the Pick-up test first in 2001 and a year later the BPS, including the Pick-up test as part of the whole BPS [37]. Magnussen et al. [38] used the BPS in their rehabilitation study. However, they stated that evaluation of physical functioning based on physical performance tests was not the focus of their study, and the number of participants completing the programme was small [38]. This is why the evaluation of the BPS is based on Strand et al.'s 2002 research [37]. The advantage of the BPS is that it does not require any special arrangements or techniques, and the reference values of the BPS for healthy persons have been published [48]. Overall, the practical approach of the BPS for functional capacity evaluation seems possible, but further research in different patient samples is needed to clarify the role of BPS in different subjects (different age groups, severity of symptoms, cohort studies and controlled trials of CLBP rehabilitation).
A simple test of back muscle endurance was connected to work capacity in a study by Bendix et al. [39]. Poor back muscle endurance was connected to receipt of a disability pension in their study of a functional restoration programme and the relief of back pain in the functional restoration programme was related to good pre-treatment back muscle endurance. The extent to which this test provokes further back pain has not been reported.
The back endurance test as described by Biering-Sørensen (1984) [49] is "measuring how many seconds the subject is able to keep the unsupported upper body (from the upper border of the iliac crest) horizontal, while placed prone with the buttocks and legs fixed to the couch by three wide canvas straps and the arms folded across the chest". This is the original test procedure reported by Biering-Sørensen, and the reference values reported by Alaranta et al. (1994) [50] are based on it. One author's experience (HH) has been that the test sometimes exacerbates pain. Therefore, careful guidance is needed if this test is used, and the patient should not be encouraged too strongly, so that muscle strain is avoided. The test has proved reliable, is easy to perform and has published reference values [50].
Kool et al. [40] investigated four tests with an anticipated prognostic value for non-RTW: the Numeric Pain Rating Scale (NRS, 9-10 of a maximum of 10), the Step Test, the Pseudo Strength Test and Behavioural Signs, originally described by Waddell et al. [42]. The predictive value of this test combination is particularly good for non-RTW. Only patients who were willing to return to full-time work were recruited into the study. This is not always the clinical situation when work capacity evaluations are made, which may partly explain the good predictive value of these four tests. However, the practical implications of these observations remain low for the clinician. These tests predict non-RTW, which has a negative connotation. Based on these tests, one cannot deny rehabilitation for a person whose work capacity is jeopardised. In addition, the phenomenon of non-organic signs is obscure. According to Main and Waddell [51], positive behavioural signs should be understood as a response to examination affected by fear in the context of recovery from injury and the development of chronic incapacity. Replication and further development of this test combination is warranted before it can be recommended.
Loisel et al. [41] studied the Quebec Task Force Classification (QTFC) [43], which classifies patients with LBP based on simple clinical criteria, including signs and symptoms (pain and neurologic examination data), imaging test results and response to treatment [43]. Subjects classified as having distal radiating pain at baseline were likelier to have a lower functional status, higher pain level and not to have returned to regular work at the one-year follow-up. For our review, it is interesting that pain radiation is an important indicator and that part of the QTFC can be performed without any demanding imaging technique. This piece of information can be easily gathered from patients as a part of patient history and a part of clinical examination of the patient.
Based on our review, we found the BPS and the back endurance test to be the most promising tests for occupational health service and the clinical practitioner. They are sufficiently simple and reliable, they have reference values and they do not require special equipment, thus keeping the cost of testing low. Most importantly, these tests have relevance in terms of RTW. It would be useful both to replicate the earlier studies and to study the tests in different working cultures. In the future, one would like to see studies with motor and movement control of LBP patients as an indicator of RTW. A new point of view for future research is provided by smartphone applications in the registration of body movement and activities [52]. To date, studies have concentrated on pain perception as an outcome measure [53,54]. In addition, functional tests could complement the widely used work ability evaluation methods such as the Work Ability Index (WAI) [55]. This calls for new research and development projects.
According to Nguyen et al. [56], as many as 90 percent of persons with occupational nonspecific low back pain are able to return to work in a relatively short period of time as long as no serious conditions relating to LBP, so-called "red flags", exist. The functional tests can provide further insight into the clinical examination and contribute to the treatment and rehabilitation plan for those 10% of back pain patients who have challenges in RTW.
A limitation of the present study is that we may have missed relevant articles in our search. In addition, we focused only on tests that are applicable by clinical practitioners, and more demanding evaluation protocols have been ignored. We purposely focused on clinical examination and easily available tests, because that is what much clinical decision-making and many medico-legal issues are based on. Special situations such as physically demanding work were outside the scope of this review. Another limiting factor is that the studies were heterogeneous in terms of the work situation, and the readiness of participants to RTW varied. Clearly, new prospective cohort studies and RCTs are needed in this field, paying special attention to possible sources of bias in the study design.

Conclusions
We found the BPS and back endurance test to be the most promising tests for patients with LBP, contributing to the evaluation of RTW, but further confirmatory studies are recommended. Radiation of back pain, with or without neurological deficiencies, had some predictive value in terms of RTW. However, evaluating when patients with LBP can resume everyday activities and work without risking recurrence or chronicity is not possible with functional tests alone, and psychosocial aspects and work demands must be considered.

Funding:
The authors declare that this study received funding from Istituto Nazionale per l'Assicurazione contro gli Infortuni sul Lavoro (INAIL) and the University of Helsinki under the following grants: INAIL, BRIC INAIL ACTIVE 2018: ID03-2018 and H3715 wbs 4721249. The funders were not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication.
Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.