Recommendations for Diagnosis and Treatment of Lumbosacral Radicular Pain: A Systematic Review of Clinical Practice Guidelines

The management of patients with lumbosacral radicular pain (LRP) is of primary importance to healthcare professionals. This study aimed to: identify international clinical practice guidelines on LRP, assess their methodological quality, and summarize their diagnostic and therapeutic recommendations. A systematic search was performed (August 2019) in MEDLINE, PEDro, National Guideline Clearinghouse, National Institute for Health and Clinical Excellence (NICE), New Zealand Guidelines Group (NZGG), International Guideline Library, Guideline central, and Google Scholar. Guidelines presenting recommendations on diagnosis and/or treatment of adult patients with LRP were included. Two independent reviewers selected eligible guidelines, evaluated quality with Appraisal of Guidelines Research & Evaluation (AGREE) II, and extracted recommendations. Recommendations were classified into ‘should do’, ‘could do’, ‘do not do’, or ‘uncertain’; their consistency was labelled as ‘consistent’, ‘common’, or ‘inconsistent’. Twenty-three guidelines of varying quality (AGREE II overall assessment ranging from 17% to 92%) were included. Consistent recommendations regarding diagnosis are (‘should do’): Straight leg raise (SLR) test, crossed SLR test, mapping pain distribution, gait assessment, congruence of signs and symptoms. Routine use of imaging is consistently not recommended. The following therapeutic options are consistently recommended (‘should do’): educational care, physical activity, discectomy under specific circumstances (e.g., failure of conservative treatment). Referral to a specialist is recommended when conservative therapy fails or when steppage gait is present. These recommendations provide a clear overview of the management options in patients with LRP.


Introduction
Low back pain (LBP) is globally not only a major medical problem but also a major economic problem [1]. Despite intensified research efforts on LBP management, the population burden and disability related to this disorder is increasing [2][3][4]. According to The Global Burden of Disease 2017 study, Years Lived with Disability due to LBP have globally increased by 54% between 1990 and 2015 [5]. LBP affects many people, especially female individuals and those aged 40-80 years, with a mean point prevalence of 11.9%, and 1-month prevalence of 23.2% [6]. Among patients with LBP seeking care in primary care, approximately 36% also report low back-related leg pain below the knee [7].
Low back-related leg pain is either radicular or referred (non-specific) pain. The former is described as radiating pain where a spinal nerve root is involved causing leg pain along the spinal nerve accompanied by numbness and tingling, muscle weakness and loss of reflexes. The latter is described as pain spreading down the legs arising from structures such as disc, joints or ligaments [8]. In the literature, multiple terms are used for lumbosacral radicular pain (LRP), with lumbar disc herniation being the most common diagnosis for this condition [9]. A large majority of patients with LRP tend to have a favourable prognosis in terms of pain and disability, but the time to recovery is usually longer than that of patients with LBP without concomitant radicular pain [7,10,11,12]. Therefore, an accurate assessment of these patients is needed to provide adequate management and treatment at an early stage of presentation.
Clinical practice guidelines are developed for implementing strong evidence into clinical care, to improve quality of care and reduce variation in decision making of healthcare practitioners. Over the last decades, an increasing number of guidelines have been developed in different countries for patients with LBP [13]. In most of these guidelines, diagnostic triage is recommended (i.e., classification into specific or non-specific LBP) [13], and some guidelines include diagnostic and therapeutic recommendations for different types of LBP. Additionally, an increasing number of guidelines containing specific recommendations for LRP have been issued over the past years.
Since 2001, overviews of clinical guidelines for the management of patients with LBP have been conducted and updated [13][14][15][16][17]. However, these overviews have focused on clinical recommendations for patients with acute or chronic non-specific LBP. No systematic review has been conducted on clinical practice guidelines for patients with LRP. Moreover, while the methodological quality of guidelines on non-specific LBP has been reviewed up to 2009 for acute LBP [14] and up to 2018 for chronic LBP [18], a quality assessment of guidelines focusing only on LRP has never been performed. Since 2009, the Appraisal of Guidelines Research & Evaluation (AGREE) II instrument can be used to assess the methodological quality of clinical practice guidelines [19].
The aim of this study was to retrieve all existing guidelines formulating recommendations on the clinical management of patients with LRP, to assess their methodological quality using the AGREE II tool, and to summarize their diagnostic and therapeutic recommendations.

Review Registration and Reporting
The protocol for the review was registered on the International Prospective Register of Systematic Reviews (PROSPERO) with ID number CRD42020138738. This review was reported in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [20].

Literature Search
Literature searches were performed in the following electronic databases from their inception up to August 2019: MEDLINE via OVID, PEDro, National Guideline Clearinghouse, National Institute for Health and Clinical Excellence (NICE), New Zealand Guidelines Group (NZGG), International Guideline Library, and Guideline central. No language restrictions were applied. We also conducted web searches on Google Scholar using recommendations described elsewhere [21], but only the first 10 pages were screened because they retrieve the most relevant search results [21]. Furthermore, backward citation tracking on the reference lists of previous relevant reviews on the topic was performed [13,18]. Appendix A provides the search strategy used in each database.

Guidelines Selection
Clinical practice guidelines providing recommendations regarding diagnosis and treatment of LRP were included. Only guidelines formulating recommendations based on evidence were included. A guideline was included regardless of the type of professional association (e.g., physiotherapy, chiropractic, multidisciplinary), geographical location, and date of publication. The following terms were considered as LRP synonyms: sciatica, radiculopathy, nerve root compromise, nerve root compression, lumbar radicular syndrome, disc herniation, radiculitis, nerve root pain, and nerve root entrapment. Documents briefly mentioning LRP for diagnostic triage without providing further details on its management were excluded. When multiple versions of a guideline issued by a similar professional association were available, only the most recent version was selected. Clinical practice guidelines available in English, German, Portuguese, Spanish, Italian or Dutch were included because the author team could understand these languages. If guidelines in other languages were found, translators for extracting the required information from the guidelines were sought. If this method failed, documents for which no translators could be identified were excluded. Title/abstracts and full-texts were screened by two independent reviewers (AKK and CBO) and disagreements were discussed in online consensus meetings. If disagreements could not be solved, a third reviewer (AC) arbitrated.

Quality Assessment
The methodological quality of clinical practice guidelines was assessed using the AGREE II tool [22]. AGREE II is an update of the previous AGREE instrument to improve its measurement properties (i.e., reliability and validity), to refine its items and to improve the supporting documentation (i.e., original training manual and user's guide) [19]. AGREE II consists of 23 items categorized in six domains: scope and purpose; stakeholder involvement; rigor of development; clarity of presentation; applicability; editorial independence. Each item is scored on a 7-point Likert scale from one (i.e., strongly disagree) to seven (i.e., strongly agree). AGREE II also includes two global items. The first global item is scored similarly to the 23 items and evaluates a guideline's overall quality. The second global item evaluates the recommendation for use, which is rated using a three-point scale (i.e., yes, yes with modification, or no). A previous study [23] using four appraisers found good to excellent inter-rater reliability (intraclass correlation coefficient [ICC 2,1] ranging from 0.66 to 0.93) for the 23 items and the first global item. The AGREE II manual recommends the appraisal of the guidelines by at least two reviewers [19]. Thus, two independent authors (AKK and CBO) performed the online AGREE II training [24] and followed the AGREE II manual to assess each included guideline [19]. To calculate the score for each domain, all the scores of the individual items in a domain from both appraisers were summed up and scaled as a percentage of the maximum possible score. The overall assessment score is the mean score for the six different domains. To investigate the reliability among the assessors, we calculated inter-rater reliability between the two appraisers for the scores obtained for each domain and the first global item using the ICC 2,1 and 95% confidence interval. Inter-rater agreement was classified as follows: 0.75-1.00, excellent; 0.60-0.74, good; 0.40-0.59, fair; and <0.40, poor [25].
Reliability analyses were performed using IBM SPSS Statistics 25 with a two-way random effects model for the domain scores and the first global item scores.
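The scoring steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually run for this review: the function names are hypothetical, and the scaling subtracts the minimum possible score before dividing, which is the scaled domain score formula given in the AGREE II user's manual (an assumption here, since the text only mentions scaling against the maximum possible score). The cut-offs for the ICC labels follow the bands listed above, with values falling between two bands assigned to the lower one.

```python
def scaled_domain_score(scores_by_appraiser):
    """Scale the item ratings of one AGREE II domain to a percentage.

    scores_by_appraiser: one list of 1-7 item ratings per appraiser,
    all lists covering the same items of the domain.
    """
    n_appraisers = len(scores_by_appraiser)
    n_items = len(scores_by_appraiser[0])
    obtained = sum(sum(ratings) for ratings in scores_by_appraiser)
    max_possible = 7 * n_items * n_appraisers  # every item rated 7
    min_possible = 1 * n_items * n_appraisers  # every item rated 1
    # Assumed AGREE II manual scaling: (obtained - min) / (max - min)
    return 100 * (obtained - min_possible) / (max_possible - min_possible)


def overall_assessment(domain_scores):
    """Mean of the six scaled domain scores (in percent)."""
    return sum(domain_scores) / len(domain_scores)


def icc_label(icc):
    """Map an ICC value to the agreement categories used in this review."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "fair"
    return "poor"


# Hypothetical ratings for a three-item domain scored by two appraisers
example = scaled_domain_score([[5, 6, 7], [4, 6, 6]])
print(round(example, 1), icc_label(0.66))
```

Under this scaling a domain in which every item is rated 1 by both appraisers scores 0%, which is in line with the 0% to 100% range reported for some domains.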
There is no consensus on categorizing guidelines as high, average or low quality based on AGREE II scores. However, several methods are provided in the AGREE II manual. A previously described threshold of ≥60% for an acceptably high score on a domain [26] was adopted in this review. For the overall quality of the included guidelines, a guideline was rated as high quality when ≥5 domains scored ≥60%, as average quality when 3 or 4 domains scored ≥60%, and as low quality when ≤2 domains scored ≥60%.
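As an illustration only, the quality rule above can be written as a short Python function. The function name and the example scores are hypothetical; the 60% threshold and the domain counts are those stated in the text.

```python
def classify_guideline_quality(domain_scores, threshold=60.0):
    """Rate a guideline from its six scaled AGREE II domain scores (percent)."""
    # Count the domains that reach the acceptably high threshold
    n_high = sum(1 for score in domain_scores if score >= threshold)
    if n_high >= 5:
        return "high"
    if n_high >= 3:
        return "average"
    return "low"


# Hypothetical domain scores for one guideline: four domains reach 60%
print(classify_guideline_quality([85, 55, 62, 90, 36, 75]))
```

Because the rule counts domains rather than averaging them, a guideline with a high mean score can still be rated average or low if its strong scores are concentrated in a few domains.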

Data Extraction and Synthesis
Two authors (AKK and CBO) independently extracted data from the included guidelines. In cases of no consensus, a third reviewer (AC) was consulted. The following information was extracted using a standardized form: recommendations regarding diagnosis (e.g., history, physical examination) and treatment (e.g., patient education, pharmacological intervention). Regarding surgical treatment options, only recommendations for discectomy were extracted, as this is the procedure most commonly applied for disc herniation. As radicular pain is also covered in most generic LBP guidelines, recommendations were extracted when it was clearly specified that they concerned radicular pain, or if recommendations for LBP in general also applied to radicular pain. If available, the level of evidence considered to formulate each recommendation was also extracted.
In order to assess the type and direction of the recommendations, the extracted recommendations were first classified into one of the following categories: 'Should do', 'Could do', 'Do not do' and 'Uncertain'. This classification was dependent on the terminology used for a recommendation in the guideline (see Appendix B, Table A1). Second, to determine the consistency of a recommendation, the following categories were identified using a modified version of the approach previously adopted by Lin et al. [27]:
a. Consistent recommendations: of the guidelines including a recommendation for a specific approach, the majority (≥80%) indicate 'should do', 'could do', 'do not do', or 'uncertain', without conflicting recommendations across guidelines. Conflicting recommendations are present when at least one 'should do' or 'could do' and at least one 'do not do' are applied to the same recommendation in different guidelines.
b. Common recommendations: of the guidelines including a recommendation for a specific approach, most (between 50% and 80%) indicate 'should do', 'could do', 'do not do', or 'uncertain', with no conflicting recommendations across guidelines.
c. Inconsistent recommendations: a recommendation for one approach indicates 'should do' or 'could do', and another recommendation for the same approach indicates 'do not do' or 'uncertain', both recommendations issued by different guidelines; the same applies if a recommendation for an approach is 'uncertain', and another recommendation for the same approach is 'do not do'.
To determine the consistency across recommendations, only options included in at least two different guidelines were considered.
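The consistency categories above can be sketched as a small decision rule. The Python below is one possible reading of the definitions, not the authors' actual procedure. It assumes that the inconsistency rule (category c) is checked first, that the percentage refers to the share of guidelines giving the single most frequent label, and that a share below 80% without inconsistency counts as 'common'. The function name and the example labels are hypothetical.

```python
from collections import Counter

POSITIVE = {"should do", "could do"}


def classify_consistency(labels):
    """Classify one management option from the labels given by each guideline.

    labels: one of 'should do', 'could do', 'do not do', 'uncertain'
    per guideline that addresses the option.
    """
    if len(labels) < 2:
        return None  # options covered by fewer than two guidelines are skipped
    present = set(labels)
    positive = bool(present & POSITIVE)
    negative = "do not do" in present
    uncertain = "uncertain" in present
    # Category c: positive vs. negative/uncertain, or uncertain vs. negative
    if (positive and (negative or uncertain)) or (uncertain and negative):
        return "inconsistent"
    # Assumed reading: share of guidelines giving the most frequent label
    _, top_count = Counter(labels).most_common(1)[0]
    share = top_count / len(labels)
    return "consistent" if share >= 0.8 else "common"


# Hypothetical label sets for three options
print(classify_consistency(["should do"] * 5))
print(classify_consistency(["should do", "should do", "could do"]))
print(classify_consistency(["could do", "do not do"]))
```

With this ordering, the only mixed label set that is not inconsistent is a combination of 'should do' and 'could do', so the 80% cut-off effectively separates options with near-unanimous wording from those with a split between the two positive labels.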

Guidelines Selection
Our literature searches identified 3032 records. After screening of title/abstract and full texts, 23 eligible guidelines were included (Figure 1). The characteristics of these guidelines are presented in Table 1. The 23 included guidelines were developed in 10 different countries from 3 continents (i.e., North America, Europe and Asia): United States (n = 12, 52%), Canada (n = 2), Belgium (n = 1), Denmark (n = 1), Korea (n = 1), Philippines (n = 1), Italy (n = 1), the Netherlands (n = 1) and Norway (n = 1). One guideline was a joint European guideline. One guideline was written in Dutch [45], one updated guideline in Norwegian [44] and the other 21 in English. The professional entities involved in developing the guidelines vary in different countries (Table 1). Most guidelines (n = 14, 61%) are from a specific (medical) professional association (e.g., general practitioners, pain physicians, radiologists, chiropractors, physiotherapists).
The domains with the highest scores were "Clarity of presentation" and "Scope and purpose" with mean scores of 85% and 77%, respectively (Table 2). The domains that were less well addressed by guideline developers were "Applicability" and "Stakeholder involvement" with mean scores of 36% and 55%, respectively. "Editorial independence" is the domain with the highest variability among the guidelines, ranging from 0% to 100% (Table 2). Table 3 indicates inter-rater agreement for AGREE II domains and overall scores. The ICC 2,1 was 'excellent' for the domains "Scope and purpose", "Stakeholder involvement", "Rigor of development", "Applicability", "Editorial independence" and "Overall rating", and 'fair' for the domain "Clarity of presentation". Due to the small sample of assessed guidelines (n = 23), ICC 95% confidence intervals were broad, especially for the domain with the lowest inter-rater agreement (Table 3). Table 4 and supplementary Table S4 describe the recommendations for physical examination and other diagnostic procedures in each clinical practice guideline.

Discussion
This systematic review provides a summary of the diagnostic and therapeutic recommendations from 23 international clinical practice guidelines for LRP. The consistent and common recommendations for 'should do' for physical examination are performing the SLR test, the crossed SLR test, mapping pain distribution, steppage gait assessment, and evaluating congruence of signs and symptoms. Regarding imaging, guidelines recommend CT scan or MRI under specific circumstances (e.g., physical examination findings are consistent with disc herniation, after 4-6 weeks of pain, surgery or epidural injections are considered, severe or progressive neurologic signs and symptoms present), and do not recommend the routine use of any form of imaging. The consistent and common therapeutic recommendations for 'should do' are: providing educational care and physical activity, referral to a specialist when conservative therapy fails or when steppage gait is present.
This systematic review provides a methodological quality assessment of the 23 selected guidelines using the AGREE II tool. The overall quality of the guidelines ranged from low to high. High quality guidelines (5 out of 9) are from (national) health care institutes [38,40,43,46,48], two from a pain society [32,34], one from a department of veterans affairs [50], one from an interdisciplinary back pain network [44] and one from a spine society [42]. In this study, "Clarity of presentation" and "Scope and purpose" were the AGREE II domains with the highest scores, and "Applicability" and "Stakeholder involvement" those with the lowest scores. These findings are in accordance with another critical appraisal of the quality of LBP guidelines [51]. This previous study assessed five guidelines [33,38,46,48,49] which were also included in this review (although we included the latest version of the ICSI [38] guideline from 2018). Compared to the previous review, four out of five of these guidelines were classified as the same quality (i.e., high). One guideline [49] was classified as low quality in this review but as average quality in the other. This discrepancy could be due to a difference in the number of AGREE II appraisers, which could lead to a higher ICC ratio (two appraisers in this study vs. four appraisers in Doniselli et al. [51]). Four guidelines [33,46,48,49] included in this review were also included in another review of LBP guidelines [18], where the domain scores were generally similar for most of the domains. If we apply the same threshold for the domain scores, these four guidelines would be categorized with the same quality as in our study. Based on the AGREE II scores of this and earlier studies, it is important that guideline developers take potential barriers to the implementation of recommendations into account and provide criteria for monitoring and auditing (i.e., AGREE II applicability).
Additionally, it is important to include individuals from all relevant professions and take the view of the target population into account (i.e., AGREE II stakeholder involvement).
Although many organizations tend to develop a new guideline, studies have been undertaken on adopting and adapting existing good quality guidelines to save time and other resources. Schünemann H. et al. [52] have developed the "GRADE-ADOLOPMENT" approach for adopting, adapting and de novo development of recommendations for guideline production. This approach allows guideline developers to quickly and efficiently create recommendations appropriate for their context where the evidence is taken into account. The Belgian guideline KCE is a good example where the UK comprehensive guideline (NICE) has been adopted and adapted for the Belgian population. Harstall C. et al. [53] have described a multidisciplinary adaptation process for creating a single overarching evidence-based clinical practice guideline for patients with LBP. The adolopment approach could help other national and international guideline developers to save time and other important resources, and we suggest this approach for LBP when resources are limited and good quality guidelines already exist.
The guidelines commonly recommend physical examination such as performing an SLR test, crossed SLR test, muscle and sensory testing, and reflex test for lumbar radicular pain (Table 4). However, the sensitivity and specificity of these tests have been questioned. A systematic review [54] showed poor diagnostic performance of most physical tests to identify disc herniation when used in isolation, while better performance could be obtained when tests are combined. Two stretch tests that have shown high diagnostic accuracy in patients with LRP are the SLR test and the slump test [55], where the latter test was found to be more sensitive. However, according to our findings, the recommendation regarding the slump test is inconsistent, whereas six guidelines suggested the SLR test (Table 4). Professionals involved in developing or updating guidelines should also consider the slump test for recommendation. The guidelines commonly/consistently recommend physical activity and exercises/physical therapy. These recommendations were also taken up by a recent narrative review [56], which concluded that both physical activity and structured exercises might be beneficial elements in the conservative management of patients with lumbar radicular pain.
Besides the consistent and common recommendations for diagnostic and therapeutic options for LRP, we have also identified inconsistent recommendations among the guidelines. Regarding non-invasive treatments, there are inconsistent recommendations on advising bed rest, acupuncture, traction, manipulations/mobilisations/soft tissue techniques, massage, therapeutic ultrasound and heat/cold therapy. Three guidelines ([28,39,47], two low quality and one average quality) do not recommend bed rest with the exception of 2-4 days in severe cases, while two other guidelines ([44,45], one high and one low quality) recommend bed rest for a few days to relieve pain. However, other high quality guidelines [32,34,38,40,42,43,46,48,50] do not include any recommendations on bed rest. Therefore, the question could be raised how strong the recommendation for bed rest is as a non-invasive therapeutic option. For pharmacological interventions, there are inconsistent recommendations for paracetamol, NSAIDs, opioids, anticonvulsants, muscle relaxants, antidepressants, corticosteroids and antibiotics (Table 5). These inconsistencies in guideline recommendations are not surprising considering that the evidence regarding their effectiveness for patients with LRP is still uncertain [57]. The inconsistent recommendations for the use of paracetamol are likewise not surprising. In fact, the publication in 2014 of the placebo-controlled trial of paracetamol (the PACE study) probably explains these differences [58], as no effect favoring paracetamol was detected on pain and speed of recovery in patients with acute LBP with or without leg pain. It can be noticed that guidelines published earlier than 2014 recommend paracetamol, whereas recently published guidelines are careful about making this recommendation. Only the NHG guideline from 2015 [45] still suggests paracetamol as the first analgesic choice.
This could lead to the conclusion that most guidelines might have followed the PACE study results, considering that a randomized controlled study in patients with LRP is missing. Nevertheless, such a study should be conducted, as it would provide clearer indications on the efficacy of this drug for future LRP guidelines. A Cochrane review from 2016 [59] showed no significant efficacy of NSAIDs for pain reduction in treating patients with LRP. However, two guidelines [38,40] issued after 2016 suggest that NSAIDs be considered. More research is needed to evaluate the effects of NSAIDs on this condition to reach firmer conclusions. In the category of invasive treatments, the recommendation for epidural injections is inconsistent. A recently published Cochrane review [60] concluded that there are small effects of epidural injections in the treatment of lumbar radicular pain, which are mainly evident at short-term follow-up.
To the best of our knowledge, this is the first systematic review of clinical practice guidelines focusing on diagnosis and treatment of LRP. A strength of this study is the use of the AGREE II tool to assess the methodological quality of the included guidelines, and the consistency of our AGREE II findings with two other recent reviews [18,51]. We have transparently reported all recommendations in every single guideline in the Tables (Supplementary Tables S4 and S5). A limitation of this review is that it was not possible to assess the eligibility of three guidelines because the review team was not able to read and understand the language of these guidelines (one Hebrew and two Croatian). Another limitation is that the AGREE II appraisal of the NHG and the Norwegian Back Pain Network (NBPN) guidelines was done by only one appraiser because one of the two reviewers could not understand the Dutch or Norwegian language. In assessing the consistency and clinical inference of the recommendations, we have not taken the publication date of a guideline into account. This approach could lead to potentially biased recommendations, as an older guideline is weighted equally to a recently published guideline based on more recent evidence. Nevertheless, we performed a sensitivity analysis in a sample of five recommendations from Tables 4 and 5 (i.e., straight leg raise test, performing CT scan, bed rest, physical activity and paracetamol) in which guidelines published before 2010 were excluded. The consistency of recommendations and clinical inference remained the same. Therefore, the publication date of a guideline is unlikely to have substantially influenced the results. We acknowledge the limitation of the AGREE II consortium not setting specific cut-off scores for domains or guidelines of high versus low quality.
The cut-off scores used in this review were taken from a small study [26], but they were also used in other recent reviews of clinical practice guidelines using the AGREE II [61][62][63]. More research on the optimal cut-off scores for the domain and total scores is needed.
This review has highlighted the lack of homogeneity in the manner in which clinical practice guidelines formulate the strength of their recommendations, and the related level of evidence per recommendation. This makes it difficult, at times, to compare and contrast the strength of each recommendation from different guidelines. In our study, we have defined our own terminology by grouping different terms used by the different guidelines (Appendix B). In our view, this point could be addressed by issuing a standardized terminology for the strength of a recommendation when developing a clinical practice guideline. Moreover, to date, no standardized threshold value has been suggested to classify high- and low-quality guidelines using the AGREE II appraisal. Introducing a standardized quality classification system based on the AGREE II domain scores is a point of consideration for the future.

Conclusions
Twenty-three clinical practice guidelines for patients with LRP were retrieved and their overall quality ranged from low to high according to the AGREE II tool. For physical examination, these guidelines recommend performing the SLR test and the crossed SLR test, mapping pain distribution, assessing steppage gait, and evaluating congruence of signs and symptoms. Imaging is only recommended under specific circumstances, and its routine use is consistently not recommended. For treatment, the recommendations are: providing educational care, prescribing physical activity, and referring to a specialist when conservative therapy fails, or when steppage gait is present. These consistent recommendations should be adopted by healthcare professionals and healthcare systems worldwide to implement the most effective care.

PEDro
We searched the following terms applying the filter "practice guidelines": low back pain; sciatic*; radicul*.

National Institute for Health and Clinical Excellence (NICE), New Zealand Guidelines Group (NZGG), International Guideline Library
We performed separate searches in these electronic databases using the following combination of terms: low back pain OR sciatic* OR radicul*.

Guideline central
We performed separate searches entering each of the following terms: low back pain; sciatic*; radicular; radiculopathy.

Scholar.google.com
"Guideline" OR "practice guideline" AND "low back pain" OR "sciatic*" OR "radicul*".