Is the Combination of Plain X-ray and Probe-to-Bone Test Useful for Diagnosing Diabetic Foot Osteomyelitis? A Systematic Review and Meta-Analysis

A systematic review and meta-analysis was conducted to assess the diagnostic accuracy of the combination of plain X-ray and probe-to-bone (PTB) test for diagnosing diabetic foot osteomyelitis (DFO). This systematic review has been registered in PROSPERO (a prospective international register of systematic reviews; identification code CRD42023436757). A literature search was conducted for each test separately along with a third search for their combination. A total of 18 articles were found and divided into three groups for separate analysis and comparison. All selected studies were evaluated using STROBE guidelines to assess the quality of reporting for observational studies. Meta-DiSc software was used to analyze the collected data. Concerning the diagnostic accuracy variables for each case, the pooled sensitivity (SEN) was higher for the combination of PTB and plain X-ray [0.94 (PTB + X-ray) vs. 0.91 (PTB) vs. 0.76 (X-ray)], as was the diagnostic odds ratio (DOR) (82.212 (PTB + X-ray) vs. 57.444 (PTB) vs. 4.897 (X-ray)). The specificity (SPE) and positive likelihood ratio (LR+) were equally satisfactory for the diagnostic combination but somewhat lower than for PTB alone (SPE: 0.83 (PTB + X-ray) vs. 0.86 (PTB) vs. 0.76 (X-ray); LR+: 5.684 (PTB + X-ray) vs. 6.344 (PTB) vs. 1.969 (X-ray)). The combination of PTB and plain X-ray showed high diagnostic accuracy comparable to that of MRI and histopathology diagnosis (the gold standard), so it could be considered useful for the diagnosis of DFO. In addition, this diagnostic combination is accessible and inexpensive but requires training and experience to correctly interpret the results. Therefore, recommendations for this technique should be included in the context of specialized units with a high prevalence of DFO.


Introduction
Diabetic foot ulcers (DFUs) have been described as one of the most prevalent complications related to diabetes mellitus (DM) [1]. Approximately 50% of diabetic foot disease cases are at risk of developing a foot infection [2]. Diabetic foot infection (DFI) is the cause of almost 85% of foot amputations in people with DM and has been linked to an increase in morbidity, increased costs, and decreased quality of life [3,4]. DFIs can lead to osteomyelitis and spread contiguously to deeper tissues, including the bones [5]. Diabetic foot osteomyelitis (DFO) is a severe complication of diabetic foot disease and can affect 50-60% of severe DFI cases and approximately 20% of moderate DFIs [6,7].
Although bone histopathology and culture provide the standard criteria for diagnosing DFO [5], resources or expertise to perform bone biopsy are unavailable in many settings. The International Working Group on Diabetic Foot (IWGDF) recommends detecting DFO as early as possible to prevent further complications such as foot amputation and death [8]. Plain X-ray is the first imaging modality used for the diagnosis of DFO [8]. However, the classic radiological triad comprising osteolysis, periosteal reaction, and bone destruction is generally not evident until a later stage occurring at least 10 to 20 days after onset of symptoms. Imaging studies for diagnosing DFO have reported low sensitivity (SEN) of 43-75% and specificity (SPE) of 75-83% using conventional X-ray [9,10], especially in early infection [11].
The probe-to-bone (PTB) test is widely used for clinical outpatients to assess DFO and is performed with sterile metal forceps, such as Halsted mosquito forceps. The result is considered positive when the investigator can feel a sandy or hard surface [12]. The combination of PTB and X-ray tests has a SEN and SPE similar to those of other more expensive diagnostic tests such as magnetic resonance imaging (MRI) for the diagnosis of DFO (SEN 97% and SPE 92%) [13].
Studies have evaluated the performance characteristics of the PTB test in the diagnosis of DFO [12,14-17]. Nevertheless, new analyses are needed due to the recent publication of new studies [13,18]. To the best of our knowledge, no systematic reviews and meta-analyses have evaluated the diagnostic performance of the combination of PTB and plain X-ray tests in the diagnosis of DFO thus far. The primary aim of this systematic review and meta-analysis was to evaluate and estimate the performance characteristics of the PTB test together with conventional X-ray and to determine the pretest probability at which this combination is useful for diagnosing osteomyelitis.

Literature Search
This study was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [19] and has been registered in PROSPERO (a prospective international register of systematic reviews; identification code CRD42023436757). Two reviewers (M.M.C.W. and F.J.Á.A.) independently searched three electronic databases (PubMed, Medline, and Cochrane) for relevant studies on the diagnosis of osteomyelitis using the PTB test or plain X-ray, spanning from inception until May 15, 2023. An independent search was carried out for each test by the two reviewers. The words "Osteomyelitis", "Probe-to-bone", "Diagnosis", and "Diabetic Foot" were used as search terms. These keywords were directly combined using the Boolean operator "AND" to form the following search strategies: probe-to-bone AND osteomyelitis AND diabetic foot and probe-to-bone AND diagnosis AND osteomyelitis AND diabetic foot.
For the second search, the keywords used were "Osteomyelitis", "Plain X-ray", "Diagnosis", and "Diabetic Foot". These terms were combined using the Boolean operator "AND" to form the following search strategies: plain X-ray AND osteomyelitis AND diabetic foot and plain X-ray AND diagnosis AND osteomyelitis AND diabetic foot.

Selection Requirements
The inclusion criteria were (a) studies published in English, (b) patients with suspected DFO and a positive PTB test (for the first search), and (c) studies using prospective/retrospective case series, case-control, cross-sectional, cohort, or randomized clinical trial designs. The exclusion criteria were (a) animal trials, (b) articles including only diagnostic tests other than PTB or plain X-ray for DFO, (c) articles unrelated to DFO, and (d) articles from which it was not possible to extract the data required for the meta-analysis.

Literature Screening
Following deduplication of search results, potential articles were reviewed based on the title and abstract. Articles were independently screened by two authors (M.M.C.W. and F.J.Á.A.), and the results were compared. Any disparity between the authors was resolved by a third reviewer (J.L.L.M.). The articles included in the systematic review were divided into three groups. The first was used for the validation of the PTB test, the second was used for the validation of plain X-ray, and the third was used for the validation of the combination of both tests.

Data Extraction
A customized Microsoft Excel spreadsheet was used to extract the data from the studies. A total of three spreadsheets were made, one for each group (PTB validation, plain X-ray validation, and combination test validation). The extracted data included the first author's name, year of publication, study design, number of patients, evaluated test, comparative diagnostic test, and outcome measures (SEN, SPE, positive and negative predictive values (PPV and NPV), LR+ and negative likelihood ratio (LR−), and osteomyelitis prevalence).
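The outcome measures listed above all derive from a 2×2 table that crosses the index test result with the reference standard. As a minimal sketch of those definitions (the function name, the class, and the counts are hypothetical and are not taken from any included study or from the authors' spreadsheets), they could be computed as follows:

```python
from dataclasses import dataclass


@dataclass
class DiagnosticAccuracy:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    lr_pos: float
    lr_neg: float
    prevalence: float


def accuracy_from_counts(tp: int, fp: int, fn: int, tn: int) -> DiagnosticAccuracy:
    """Derive standard accuracy measures from a 2x2 table.

    tp / fn: patients WITH osteomyelitis (per the reference standard)
             whose index test was positive / negative.
    fp / tn: patients WITHOUT osteomyelitis whose index test was
             positive / negative.
    Note: LR+ is undefined when specificity is exactly 1, and LR- when
    specificity is 0; this sketch does not handle those edge cases.
    """
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    return DiagnosticAccuracy(
        sensitivity=sen,
        specificity=spe,
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
        lr_pos=sen / (1 - spe),
        lr_neg=(1 - sen) / spe,
        prevalence=(tp + fn) / (tp + fp + fn + tn),
    )


# Hypothetical counts, for illustration only.
example = accuracy_from_counts(tp=90, fp=10, fn=10, tn=40)
```

Unlike SEN, SPE, and the likelihood ratios, PPV and NPV change with the prevalence of osteomyelitis in the sample, which is why prevalence is extracted alongside them.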

Quality Evaluation of Included Studies (STROBE Guidelines)
Three independent researchers analyzed the data collected from all the articles. Since the included articles were prospective, retrospective, and cross-sectional studies, the quality evaluation was based on the standard STROBE guidelines to help guarantee high-quality presentation of observational studies [20]. Reviewers evaluated the adequacy of reported items using the STROBE checklist. This checklist provides a framework to ensure completeness and transparency.
The STROBE checklist has 22 items: item 1, title and abstract; items 2 and 3, introduction; items 4-12, methods; items 13-17, results; items 18-21, discussion; and item 22, funding and sponsorship. Two reviewers (M.M.C.W. and F.J.Á.A.) independently assessed each study using the STROBE guidelines. A third reviewer (J.L.L.M.) helped to achieve a consensus in cases of disagreement.

Statistical Analyses
The meta-analysis was carried out using Meta-DiSc version 2.2, a web application for meta-analysis of diagnostic test accuracy data (https://ciberisciii.shinyapps.io/MetaDiSc2/, accessed on 2 July 2023) [21], with a bivariate or univariate random-effects model. Pooled SEN and SPE were calculated for PTB, plain radiography, and their combination. The heterogeneity (I²), correlation index, LR+, and diagnostic odds ratio (DOR) were extracted for each test separately. The receiver operating characteristic (ROC) curve was also obtained, and the 95% confidence interval of the area under the curve (AUC 95%) was calculated. A bivariate random-effects model was used for these analyses. A second statistical analysis was done for each test by extracting the studies that used histopathology as a reference, which is considered the gold standard for the diagnosis of osteomyelitis [5]. For the analysis of the combination of PTB + plain X-ray, LR+ and I² for SEN and SPE were extracted. All the studies included in this analysis used histopathology as a comparative diagnostic test.
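For readers unfamiliar with the heterogeneity statistic, the following is a generic sketch of how I² is commonly derived from Cochran's Q. It is not Meta-DiSc's implementation, which may differ in detail; the function name and input values are hypothetical, and in diagnostic accuracy work the per-study estimates would typically be on a transformed scale such as the logit of SEN or SPE.

```python
def i_squared(estimates: list[float], variances: list[float]) -> float:
    """Return I^2 (as a percentage) from per-study estimates and variances.

    Uses inverse-variance weights to compute Cochran's Q, then
    I^2 = max(0, (Q - df) / Q) * 100, with df = number of studies - 1.
    """
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0  # identical estimates: no observed heterogeneity
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical two-study input, for illustration only.
heterogeneity = i_squared([1.0, 3.0], [0.25, 0.25])
```

With only two studies, as in the combination analysis, I² rests on a single degree of freedom and is therefore a very imprecise estimate.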

Literature Retrieval
In a first search with the application of the inclusion criteria, 47 articles for PTB and 297 articles for plain X-ray were identified. After eliminating duplicates and reading the titles and abstracts, 18 articles for PTB and 27 articles for plain X-ray were selected for full-text evaluation. Finally, after eliminating the coinciding articles from the two searches, 18 studies were included in the analysis. Figure 1 shows the literature screening process.

Quality of the Reporting
The items most poorly completed by the included studies were 9 (bias), 10 (study size), 13 (participants), and 21 (generalizability). Table 1 shows the overall rating for the STROBE checklist (red indicates that a criterion was not met and green that it was; 1 (a), title; 1 (b), abstract).
These same variables were extracted in a sub-analysis of the articles comparing the PTB test with histopathology (gold standard) [12-14,16]. The results are shown in Figures 2 and 3. Figure 2 shows the pooled SEN and SPE of the PTB test compared to histopathology. Figure 3 shows the ROC curve analysis and estimates of LR+, I², correlation, DOR, and AUC 95% of the studies selected.
Regarding sample size, there were a total of 1156 participants in the articles included in this group [12-17,22,23], of which 550 were considered true positives for DFO diagnosed by PTB. Considering only the articles that compared the results of PTB to the gold standard, the sample was 667, of which 460 were true positives. These data are shown broken down by article in Figure 2.
When comparing the accuracy of plain X-ray with histopathology as the gold standard [13,16,25,27,30,31,33], changes were found in the parameters evaluated, which are shown in Figures 4 and 5. Figure 4 shows the pooled SEN and SPE of plain X-ray compared to histopathology (gold standard). Figure 5 shows the ROC curve analysis and the estimates of LR+, I², correlation, DOR, and AUC 95% of the studies selected. There was a total sample of 963 in the articles included in this group [24-33], of which 496 were true positives for DFO diagnosed by plain X-ray.
The sample size for the subgroup of studies that compared X-ray to histopathology was 734, of which 422 were true positives. These data are shown broken down by article in Figure 4.

Combination of PTB and Plain X-ray
Two studies analyzed the diagnostic combination of PTB + plain X-ray compared to the gold standard [13,16]. In this case, a univariate analysis was carried out in order to determine the diagnostic accuracy of the combination of these tests. From the statistical analysis, pooled SEN was 0.94, the heterogeneity was 88.5%, the SPE was 0.83, and the relative heterogeneity was 89.9%. LR+ was 5.874, and DOR was 82.212. These data are shown in Figures 6 and 7. Figure 6 shows the estimates of LR+, I² SEN, I² SPE, and DOR. Figure 7 shows the pooled SEN and SPE of the combination of PTB and plain X-ray.
The total sample of patients included in the two articles assessing the diagnostic combination [13,16] was 488, of which 343 were true positives. These data are shown broken down by article in Figure 7.

Discussion
Based on the data obtained in this systematic review and meta-analysis, it can be determined that the combination of PTB and plain X-ray demonstrates high diagnostic accuracy for DFO. This diagnostic combination shows a SEN of 0.94, which means that out of every 100 patients with a diagnosis of DFO (established by histopathology as the gold standard), 94 are correctly diagnosed (true positives). This high SEN makes it highly unlikely for a patient to have DFO if the test results are negative. When the tests were analyzed separately, the mean SEN of the studies was lower at 0.91 for PTB and 0.76 for plain X-ray. These values were even lower when the analysis included studies that did not use the gold standard as a reference test for the diagnosis of DFO.
SPE represents the ability of a test to return a negative result in a patient without DFO (true negative). The pooled value was 0.83 for the diagnostic combination of PTB + plain X-ray. This value is higher than that obtained in the plain radiographic analysis (0.76) and slightly lower than that shown by the PTB test alone (0.86). It is well known that as a test becomes more sensitive, it becomes somewhat less specific, so we can determine that the values obtained show good diagnostic accuracy for the test combination.
DOR represents the effectiveness of a diagnostic test and was 82.212 for the diagnostic combination of PTB and plain X-ray. For each test separately, lower values were obtained at 57.444 for PTB and 4.897 for plain X-ray. For this variable, values above 1 indicate discriminatory capacity, which is greater when the DOR is higher. Thus, the obtained DOR value can be considered high for the diagnostic combination of PTB + plain X-ray, indicating that it is effective for the diagnosis of DFO.
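To make the relationship between these measures concrete, the sketch below uses the standard definition of the DOR as the ratio LR+/LR−, written in terms of SEN and SPE. The function name is hypothetical. Applying it to the rounded pooled point estimates is purely illustrative: it is not expected to reproduce the pooled DOR of 82.212, which comes from the meta-analytic model rather than from the pooled SEN and SPE.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """DOR = LR+ / LR-, i.e. (SEN / (1 - SEN)) / ((1 - SPE) / SPE).

    A value of 1 means the test does not discriminate; larger values
    mean stronger discrimination. Undefined if SEN or SPE equals 0 or 1.
    """
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos / lr_neg


# Rounded pooled point estimates quoted in the text, for illustration only.
naive_dor = diagnostic_odds_ratio(0.94, 0.83)
```

The naive figure comes out near 76, in the same range as the reported pooled value but not identical to it, as expected for a calculation based on rounded summary estimates.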
LR+ represents how much more likely it is that a patient would have a disease (DFO in this case) after obtaining a positive test result and is independent of prevalence. For plain X-ray, the LR+ was 1.969, which is considered poor. For analyses of PTB alone and the combination of PTB + plain X-ray, the LR+ values were 6.344 and 5.684, respectively, which are considered good and can be extrapolated to populations with other prevalence rates of osteomyelitis.
The most recent systematic review and meta-analysis on the diagnostic accuracy of imaging tests for DFO [34] that included X-ray performance was published in 2020. That study showed that among all the imaging tests evaluated, MRI was the most accurate with SEN 96.4% and SPE 83.8%. These values are much higher than those obtained for plain X-ray but are similar to those obtained from the diagnostic combination of PTB + plain X-ray (SEN 0.94, SPE 0.83) and can be compared to histopathological analysis (the gold standard). Both MRI and histopathological analysis are costly tests and may not be readily available. Therefore, the diagnostic combination of PTB + plain X-ray could result in a more accessible and cost-effective option for the diagnosis of DFO. However, cost-effectiveness studies of each of the diagnostic tests are still needed.
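Because a likelihood ratio acts on the odds of disease, it can be carried over to settings with a different pretest probability. The sketch below applies the standard odds form of Bayes' theorem; the function name is hypothetical, and the two pretest probabilities (20% and 70%) are illustrative assumptions rather than results of this review. The LR+ of 5.684 is the value quoted above for the combination.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability into a post-test probability.

    Steps: probability -> odds, multiply by the likelihood ratio,
    then odds -> probability.
    """
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


LR_POS_COMBINATION = 5.684  # value quoted in the text for PTB + plain X-ray

# Illustrative pretest probabilities (assumptions, not study results).
low_prevalence_result = post_test_probability(0.20, LR_POS_COMBINATION)
high_prevalence_result = post_test_probability(0.70, LR_POS_COMBINATION)
```

Under these assumptions, a positive result raises the probability of DFO to roughly 59% from a pretest probability of 20% and to roughly 93% from a pretest probability of 70%, which illustrates why the same LR+ is more decisive in high-prevalence settings.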
The reproducibility of PTB, plain X-ray, and the combination of both diagnostic tests has been assessed in several studies. García-Morales et al. [35] showed that the interobserver variability of PTB in the diagnosis of DFO was statistically significant depending on the experience of the clinician. The PTB test demonstrated moderate to fair concordance with an experienced examiner, but the degree of concordance was not significant between a very experienced professional, a medium-experienced professional, and a healthcare professional without experience in diabetic foot.
Álvaro-Afonso et al. [36] performed a study to assess the influence of the location of the ulcer on the interpretation of the PTB test. They observed a stronger association between the results from clinicians with different levels of experience for ulcers located in the hallux and in the central metatarsals. There was poorer agreement for ulcers located in the lesser toes.
Another study analyzed the inter-observer and intra-observer variability in plain radiography in the diagnosis of DFO [37]. It was found that when using only plain radiography, low concordance rates were observed for clinicians with a similar level of experience. Intra-observer agreement was highest among experienced clinicians, followed by moderately experienced clinicians and inexperienced clinicians. This shows that using plain radiography for the diagnosis of DFO is dependent on the operator and shows low association strength, even among experienced clinicians, when interpreted in isolation without knowing the clinical characteristics of the lesion.
Álvaro-Afonso et al. [38] later analyzed the inter-observer reproducibility of a sequential combination of the PTB test and X-ray in the diagnosis of DFO among experienced clinicians. They observed very good agreement in the interpretation of the PTB test and good agreement in the interpretation of radiographs for the diagnosis of DFO. Based on these results, the authors consider that the interpretation of radiography will be easier if the clinician explores the ulcer beforehand or at least receives clinical information about it. This will make the final diagnosis more reliable. This also demonstrates the importance of jointly considering clinical information (PTB test) and diagnostic tests (simple radiography) to increase agreement among clinicians in the diagnosis of DFO. All these reproducibility studies show that a lack of agreement among professionals with similar or different levels of experience can lead to different diagnostic approaches and therapies that may sometimes be inadequate. Thus, there is a need to implement training programs for these diagnostic tests when establishing specialist diabetic foot units.
Our review has several limitations. First, we would like to point out that the literature is scarce, so no exclusion criteria have been applied with respect to the year of publication, which has meant the inclusion of numerous articles published more than 20 years ago. The two studies on the diagnostic combination of PTB and plain X-ray are more recent, but further studies on this subject are needed to provide more reliable results.
With regard to statistical analysis, an ROC curve analysis of the diagnostic combination could not be performed due to the number of articles included. There were only two articles that assessed the accuracy of this combination [13,16], so the results should be interpreted with caution. The results may be extrapolated to centers with similar prevalence of osteomyelitis (>70%) that include professionals who are trained in this field.
It should be noted that the limitations of this review are largely a consequence of the limitations in the identified studies. There were numerous concerns about the potential for bias in the included studies. As shown in Table 1, methods to reduce the risk of bias and select patients were not completed in most of the studies included in this review. The items most poorly completed by the included studies were 9 (bias), 10 (study size), 13 (participants), and 21 (generalizability) based on the STROBE checklist.
The differences in the prevalence of DFO and the type of lesions included in the different studies could explain the high heterogeneity obtained in the meta-analyses. Studies conducted in specialized units within a hospital setting [13] showed a higher prevalence of DFO and lesions with more severe or acute infectious conditions than those conducted in an outpatient setting [16,33]. We found studies with DFO prevalence >75% [13,16,23], studies with prevalence of 49-75% [12,17,22,24,27,30,31,33], and studies with lower prevalence of 12-34% [14,15,26,29,33]. Previous systematic reviews and meta-analyses [9,34,39] have analyzed the performance of each test separately but not the combination of both tests; addressing this combination is one of the strengths of our study. Another strength of this study is the analysis of studies that used histopathology as a reference diagnostic test [12,13,16,23,25,27,30,31,33], which provides more reliable and accurate results, as well as a more homogeneous analysis.

Conclusions
The combination of PTB and plain X-ray could be considered useful for the diagnosis of DFO as it shows high diagnostic accuracy comparable to that of MRI and histopathology diagnosis (the gold standard). This diagnostic combination is accessible and inexpensive but requires training and experience to correctly interpret the results. Therefore, recommendations for this combination should be included in the context of specialized units with a high prevalence of DFO. Diabetic foot healthcare professionals should be trained in the performance and interpretation of these diagnostic tests so that they can be included in the day-to-day clinical practice and promote early diagnosis to prevent consequences of DFO. However, it should be noted that the literature is sparse, and more studies are needed to support these findings with more evidence.

Figure 5. ROC curve and extracted statistical variables of plain X-ray and histopathology studies. LR+, positive likelihood ratio; I², heterogeneity; AUC 95%, 95% confidence interval of the area under the curve; DOR, diagnostic odds ratio.

Table 1. Overall rating for Strengthening the Reporting of Observational studies in Epidemiology (STROBE).