Sex Bias in Diagnostic Delay: Are Axial Spondyloarthritis and Ankylosing Spondylitis Still Phantom Diseases in Women? A Systematic Review and Meta-Analysis

Diagnostic delay (DD) is associated with poor radiological and quality of life outcomes in axial spondyloarthritis (ax-SpA) and ankylosing spondylitis (AS). The female (F) population is often misdiagnosed, as classification criteria were previously studied mostly in males (M). We conducted a systematic review to investigate (i) the difference in DD between the sexes and the impact of HLA*B27 and clinical and social factors (work and education) on this gap, and (ii) the possible influence of the year of publication (before and after the 2009 ASAS classification criteria), geographical region (Europe and Israel vs. extra-European countries), sample sources (mono-center vs. multi-center studies), and World Bank (WB) economic class on DD in both sexes. We searched PubMed and Embase for studies that reported the mean or median DD or the statistical difference in DD between sexes, and added a manual search. Starting from 399 publications, we selected 26 studies (17 from PubMed and Embase, 9 from manual search), which were subsequently evaluated with the modified Newcastle–Ottawa Scale (m-NOS). The mean DD of 16 high-quality (m-NOS > 4/8) studies, pooled with random-effects meta-analysis, was higher in F (mean difference 1.48 years, 95% CI 0.83–2.14, p < 0.0001); in the subgroup analysis, however, the difference was significant only in articles published before the 2009 ASAS classification criteria (3.16, 95% CI 2.11–4.22, p < 0.0001) and in extra-European countries (0.95, 95% CI 0.05–1.85, p < 0.05). With limited evidence, some studies suggest that DD in F might be influenced by HLA*B27 positivity, peripheral involvement, and social factors.


Introduction
In many rheumatic diseases, the women-to-men ratio is generally considered to be high, and hormones have been shown to have a deep influence on disease activity [1,2].
On the other hand, ankylosing spondylitis (AS) is a chronic inflammatory disease characterized by progressive, disabling axial ankylosis and peripheral involvement, with a higher incidence in the male (M) population and an M-to-F ratio ranging from 2:1 to 3:1 [3,4].
Indeed, we know that estrogens play a role in both enhancing and inhibiting immune reactions, whereas testosterone and progesterone predominantly exert suppressive effects on inflammation [5], though more recently, an implication of sex chromosomes was postulated to explain the links between sex and rheumatic diseases [6].
In recent years, there has been an increasing recognition that AS may have a similar impact on the quality of life in both sexes. Nevertheless, women without a family history of AS were generally underdiagnosed [7].
The disease pattern and severity of AS in the first 10 years of the disease seemed to influence and predict the subsequent natural history [8], and for this reason, an early diagnosis was deemed fundamental [9,10].
Radiological damage is often compounded by diagnostic delay (DD) [11,12], defined as the time between symptoms onset and diagnosis [13], which was found to be much longer in AS [11] than in other rheumatic diseases, such as psoriatic arthritis [14], over the past decades.
Furthermore, the strong association between the human leukocyte antigen (HLA*)-B27 and spondylitis and its recognition as the main gene implicated in AS susceptibility [22,23] led to its inclusion in the 2009 ASAS criteria for ax-SpA [23,24]. However, there is currently no clear consensus on its distribution between the female (F) and M sexes, or on its impact on DD [14].
Unfortunately, the misdiagnosis of inflammatory back pain [22], the variability of HLA*B27 world distribution, the genetic polymorphism of HLA*B27-negative patients, and cultural prejudices still represent a great obstacle to a fast approach to the diagnosis of AS in women.
This systematic sex bias [23] has unfortunately disadvantaged women, because the natural history of the disease in women has been considered equivalent to that in men [22].
While the severity of well-known aspects of spondylitis, such as sacroiliitis [25,26] and lumbar syndesmophytosis [27], seems more manifest in the M population, the involvement of the cervical spine, which is more common in women, is still not included in the classification criteria [27].
In addition, the higher prevalence of peripheral onset in the F population [2,28] might be one of the possible causes of misdiagnosis when SpA classification criteria are applied to women, with an initial correct identification in only 11% of women vs. 30.2% of men [22].
Based on this background, the aims of the present systematic review were to evaluate (i) the difference in DD between sexes, the impact of HLA*B27 and clinical and social factors (work and education) on this gap, and (ii) the possible influence of the year of publication (before and after the 2009 ASAS classification criteria), region (Europe and Israel vs. extra-European countries), sample sources (mono-center vs. multi-center studies), and world bank (WB) economic class on DD in both sexes.

Search Strategy
We performed a systematic review according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [29]. This study has not been registered in PROSPERO or any other relevant database.
We searched PubMed and Embase for titles published up to 15 September 2023, with no lower limit on the date of publication, using the following PubMed MeSH terms: (axial spondyloarthritis OR spondylitis) AND (delay) AND (sex). Additionally, the references of relevant articles were hand-searched to identify other potentially suitable papers.
This search strategy was designed to include full papers (excluding published conference abstracts, reviews, commentaries, and editorials) that reported differences in DD between M and F sexes. To achieve this goal, exclusion criteria for each item were prespecified. Letters reporting DD data for the different sexes were considered, but only if they had a sufficiently detailed methodology and results.

Inclusion and Exclusion Criteria
Studies enrolling patients with confirmed ax-SpA and AS, diagnosed with ASAS [15], New York [30], Amor [31], International Statistical Classification of Diseases, Clinical Modification code = 720.0, as listed by the WHO (ICD-9-CM) [32], and ESSG criteria [33], were included; mixed populations of SpA were also included. DD and therapeutic delay were considered as the index test/intervention; F and M sexes were comparators, while the outcomes were HLA*B27, clinical presentation, social parameters, fibromyalgia, and classification criteria.
We excluded studies based only on psoriatic arthritis or those utilizing the same patient cohort of studies already selected as eligible for systematic review.
Considering the possible limits of studies not included in the searchable PubMed and Embase fields, displaying DD in ax-SpA and AS only in the text and not in titles, we performed a forward snowballing of complementary papers with a manual search of cross-references in the citation track [34] for the selected studies and reviews, particularly in previous reports that performed a meta-analysis [14,35].
Studies were included for systematic review if they reported for both sexes the mean or median DD, defined as the difference between the age when the first symptoms of spondylitis were felt (inflammatory back pain using Calin [13,36] or Berlin criteria [15]) or since initial ax-SpA indicators were identified (at least one among other inflammatory axial pain (e.g., cervical spine), peripheral or enthesis symptoms, and uveitis [11]) and the age at AS diagnosis; we also included for the analysis studies that reported both the mean age at onset and at diagnosis [13].
Papers that reported only a statistical analysis of differences in DD were also considered eligible for review, but they were deemed only representative and, therefore, were not included in the final meta-analysis.

Selection Process
Two reviewers (F.B. and B.M.-C.) independently screened titles and abstracts, assessed full texts for eligibility, and extracted relevant data from qualifying studies. This process adhered to the protocol schedule established from 15 September to 15 November 2023. We followed the PRISMA checklist (see Supplementary Materials); the final PRISMA flow chart is depicted in Figure 1. Any discrepancy at each stage of selection was resolved through discussion moderated by a third reviewer (M.S.V.).
The protocol, including PICO (Patient: AS and ax-SpA; Intervention: DD; Comparator: F vs. M; Outcome: sex) to exclude ineligible articles, was agreed upon by all authors and was not modified in the course of the systematic review selection and meta-analysis.
The data of the initial eligible studies for systematic review (Figure 1) are shown in Table 1, and information from the included studies was extracted into predefined tabulated summaries as follows: the first author, year of publication, geographic region (Europe and Israel, which have similar healthcare systems, vs. extra-European countries), classification criteria for AS and ax-SpA, sample size and sources, F/M ratio, World Bank (WB) economic class [14], mean age at onset and at diagnosis (F vs. M), mean DD (F vs. M), as well as the significant differences between sexes. We considered Europe and Israel to have similar healthcare systems because of the universal right of citizens to access tertiary centers.
Subsequently, papers were assessed for risk of bias using an adapted version of the modified Newcastle-Ottawa Scale (m-NOS) for case-control studies (see Supplementary Table S1) based on their selection (score 0-4) (disease definition and representativeness), comparability (0-1) of F vs. M, and ascertainment of DD (records 0-1, same method between sexes 0-1, non-response rate 0-1; total 0-3).
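The scoring scheme above can be illustrated with a minimal sketch. The class name `MNosScore` and its fields are hypothetical and are not taken from the study protocol; only the item ranges (0-4, 0-1, 0-3) and the m-NOS > 4/8 threshold come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MNosScore:
    """Hypothetical container for one study's m-NOS item scores."""
    selection: int      # 0-4: disease definition and representativeness
    comparability: int  # 0-1: comparability of F vs. M
    ascertainment: int  # 0-3: records, same method in both sexes, non-response

    def __post_init__(self):
        # Reject scores outside the ranges stated in the text.
        limits = {"selection": 4, "comparability": 1, "ascertainment": 3}
        for name, upper in limits.items():
            value = getattr(self, name)
            if not 0 <= value <= upper:
                raise ValueError(f"{name} must be between 0 and {upper}, got {value}")

    @property
    def total(self) -> int:
        # Sum of the three domains, on a 0-8 scale.
        return self.selection + self.comparability + self.ascertainment

    @property
    def eligible_for_meta_analysis(self) -> bool:
        # Threshold taken from the text: m-NOS > 4/8.
        return self.total > 4
```

For example, a study scoring 3, 1, and 2 on the three domains would total 6/8 and pass the threshold, whereas one scoring 2, 1, and 1 would total 4/8 and not pass.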

Statistical Analysis
Finally, we performed a meta-analysis of the pooled mean DD of ax-SpA, AS, and other mixed members of the SpA family, using random-effects models and selecting only medium-high quality papers (m-NOS > 4/8). When the mean DD was not reported, it was imputed as the difference between the mean age at diagnosis and the mean age at symptom onset. When the standard deviation (SD) of DD was missing, we imputed it using the methods recommended by Cochrane; in essence, this was based on the SD of age at onset, the SD of age at diagnosis, and their correlation in all studies, or on the SD of a study reporting the most similar mean DD duration [61].
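As an illustration of the imputation described above, the following minimal sketch uses the standard formula for the SD of a difference between two correlated measurements, SD_diff = sqrt(SD_onset² + SD_dx² − 2·r·SD_onset·SD_dx). The function name, the example values, and the correlation coefficient are assumptions made for illustration and are not taken from the included studies.

```python
import math


def impute_dd(mean_onset, sd_onset, mean_dx, sd_dx, corr):
    """Impute mean and SD of diagnostic delay from age at onset and at diagnosis.

    corr is the assumed correlation between age at onset and age at diagnosis;
    in practice it would be estimated from studies that report all three SDs.
    """
    if not -1.0 <= corr <= 1.0:
        raise ValueError("corr must lie in [-1, 1]")
    mean_dd = mean_dx - mean_onset
    # Variance of a difference of two correlated variables.
    var_dd = sd_onset ** 2 + sd_dx ** 2 - 2.0 * corr * sd_onset * sd_dx
    return mean_dd, math.sqrt(max(var_dd, 0.0))


# Hypothetical inputs: onset at 25 (SD 8) years, diagnosis at 33 (SD 10) years.
mean_dd, sd_dd = impute_dd(25.0, 8.0, 33.0, 10.0, corr=0.8)
```

A higher assumed correlation yields a smaller imputed SD, so the choice of correlation directly affects the weight a study receives in the pooled estimate.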
Because DD is a continuous variable, the mean difference, its SD, and 95% confidence interval (CI) were used to calculate the global effect of DD by sex; results expressed in the median and interquartile range were excluded.
The meta-analysis was performed using the software ProMeta 3. Heterogeneity among studies was evaluated using the I² statistic (high heterogeneity if >60% and p < 0.1) [62,63]. The effect size was estimated using the unstandardized mean difference, reported with its 95% CI. Values of p < 0.05 were considered statistically significant. To calculate the pooled effect, a random-effects model was applied in view of the heterogeneity found. Lastly, publication bias was assessed with Egger's linear regression test and by visual evaluation of the funnel plot.
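For readers who wish to reproduce this kind of pooling, the sketch below implements a DerSimonian-Laird random-effects model together with Cochran's Q and I². The choice of the DerSimonian-Laird estimator is an assumption: the text does not state which between-study variance estimator ProMeta 3 applied, and the function name is hypothetical.

```python
import math


def dersimonian_laird(effects, ses):
    """Pool mean differences with a DerSimonian-Laird random-effects model.

    effects: per-study mean differences (e.g., DD in F minus DD in M, years)
    ses:     per-study standard errors of those differences
    """
    w = [1.0 / se ** 2 for se in ses]                 # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]     # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }
```

When all studies report the same effect, Q is zero, the between-study variance collapses to zero, and the random-effects result coincides with the fixed-effect one.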
Following Higgins and Green's guidelines, which were also subsequently applied in a Cochrane review [64], if the heterogeneity of the meta-analysis was attributable to clearly identifiable reasons, the responsible study could be removed to increase confidence in the results [61].
We added the funnel plot to examine the effect sizes estimated from individual studies against a measure of their precision. Furthermore, sensitivity analyses without imputed values were performed to check the stability of the study findings. Specifically, the influence of each effect size was assessed by deleting one paper at a time to identify heterogeneous data that could affect the overall results.
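The one-study-removed procedure can be sketched as follows. For brevity, this illustration recomputes an inverse-variance (fixed-effect) pooled mean and I² after each deletion rather than a full random-effects model; the function names and any labels passed in are hypothetical.

```python
def pooled_and_i2(effects, ses):
    """Inverse-variance pooled mean difference and I² (in percent)."""
    w = [1.0 / s ** 2 for s in ses]
    mean = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - mean) ** 2 for wi, y in zip(w, effects))
    df = len(effects) - 1
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return mean, i2


def leave_one_out(labels, effects, ses):
    """Recompute the pooled estimate and I² with each study omitted in turn."""
    rows = []
    for k in range(len(effects)):
        rest_effects = effects[:k] + effects[k + 1:]
        rest_ses = ses[:k] + ses[k + 1:]
        mean, i2 = pooled_and_i2(rest_effects, rest_ses)
        rows.append((labels[k], mean, i2))  # label of the omitted study first
    return rows
```

A study whose omission produces a sharp drop in I² is a candidate source of heterogeneity and would be examined further for identifiable differences in design or population.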
We also used random-effects meta-regression to evaluate the influence of the following study characteristics: the year of publication (pre-2009 vs. post-2009), regions with a similar healthcare system (Europe and Israel vs. other extra-European countries), WB (lower-middle vs. upper-middle/high classes) [14], and sample sources (single-center vs. multicenter studies). Meta-analysis was not performed for HLA*B27, clinical involvement, and classification criteria, as these were not consistently reported in the studies examined, nor for any comparison with fewer than 3 studies.

Results
Starting from a total of 399 publications, we finally selected 26 studies (Figure 1) (15 from PubMed, 2 from Embase, and 9 from manual search) on ax-SpA and AS that reported DD data for both sexes or presented statistical analysis for the difference in DD between M and F populations.
Table 1 shows the age at onset, age at diagnosis, and DD difference between sexes, as well as the main characteristics of the papers (i.e., year of publication, region, sample sources, and WB) and population examined (i.e., classification criteria, sample size, and F/M ratio) in the 26 papers included initially in the systematic review (Figure 1). Collectively, this analysis included a total of 21,704 ax-SpA and AS patients, with an overall women-to-men ratio of approximately 1:2 (8411 F (38.8%) vs. 13,293 M (61.2%)). The included studies were mostly of case-control and cross-sectional design, except for one cohort study [45].
Because of the paucity of studies available to verify the possible correlation of DD with HLA*B27, clinical aspects, and social parameters (education and work) in AS and ax-SpA, we were not able to conduct any meta-analysis on the studies shown in Table 2; these are described only as a systematic review, indicating also the prevalence of each parameter examined in both sexes.
Abbreviations: ax-SpA, axial spondyloarthritis; AS, ankylosing spondylitis; CI, confidence interval; DD, diagnostic delay; F, females; HLA*, human leukocyte antigen; IQR, interquartile range; M, males; nr-ax-SpA, non-radiographic ax-SpA; SD, standard deviation.
Two other studies [37,44] reported no differences in the HLA*B27 distribution between M and F, with a DD that was not significantly different between the two sexes (Table 2).
Unfortunately, only one paper [42] analyzed the DD related to clinical onset in the two sexes and showed that when axial involvement was the first symptom of ax-SpA and AS, DD might be even higher in men if compared to women with axial or peripheral onset (p = 0.0003).
Finally, in six papers, widespread fibromyalgia pain was higher in women (4/6 significantly different) with a prevalence between 13.1% and 50% of cases compared to 0-13.1% in men [35,38,39,42,53,59], but none of them evaluated its role in DD.

Social Parameters
Only six papers investigated the level of education and work of patients (Table 2). The proportion of patients with a high level of education varied between studies (in F: 15.9-49.5%, in M: 13.18-54.1%), with conflicting results for the comparison of sexes when analyzed in two European papers: Garrido-Cumbrera et al. [47] showed that university education is more common in women, whereas Neuenschwander et al. [53] did not find any difference at any school level.
Finally, only Bandinelli et al. [42] showed that DD was higher in M with a low level of education (<8 years of education) vs. medium level education (<13 years of education) in F patients (Table 2).
Garrido-Cumbrera et al. [47] and Zink et al. [60] showed that women are employed equally to men, and Neuenschwander et al. [53] reported that they also have similar work absenteeism but with a higher percentage of precarious work (Garrido-Cumbrera et al. [47] and Ogdie et al. [54]).
Blasco-Blasco et al. [43] reported that DD's effect on work was not significantly different between sexes, whereas another study (Bandinelli et al. [42]) showed that DD in M manual workers was higher than in F non-manual workers.

Meta-Analysis of DD Difference between Sexes
From the initial articles included for systematic review, 8 were excluded from the meta-analysis because of statistical incoherence, as follows: 2 reported only one value of DD for both M and F [41,45], 3 showed only median and interquartile values [22,50,60], and 3 additional papers were excluded due to statistical heterogeneity, as they included subgroups not comparable in the overall analysis of DD differences. Indeed, Blasco-Blasco et al. reported two different values of DD in each sex, one influenced by the effect of the disease on patients' work life and one regarding its effect on family and partner relationships [43]; Ogdie et al. differentiated DD in three periods (<1, 2-9, >10 years) [54]; and Ma et al. analyzed data separately from the North and South of China [51].
The final m-NOS scores, based on the sum of the single scores for patient selection, comparability, and DD ascertainment of the 18 articles eligible for meta-analysis, are shown in Supplementary Table S1.
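The mean value of DD varied from 5.3 to 14.4 years in women and from 4.1 to 10.3 years in men, with age at onset (reported in only 9 studies) ranging from 23 to 30.4 years in F and from 22.2 to 28.3 years in M, and age at diagnosis (reported in only 4 papers) ranging from 33 to 42.2 years in F and from 30.2 to 41.4 years in M. The authors described patients diagnosed with the following criteria: 1/18 New York; 6/18 ASAS; 14/18 modified New York; 3/18 both ASAS and modified New York; 1/18 ESSG criteria in one of the studies that also included ASAS and modified New York (Table 1).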
Eighteen studies were included in the meta-analysis, with high heterogeneity in the random-effects model (I²: 95.88%); in the sensitivity analysis, two recent papers [42,49] presented data dissimilar from those of the other studies (Figure 2A,B). The first study [42] observed a shorter DD in women than in men, related to higher education, work levels, and peripheral involvement in the F Italian population from two third-level rheumatology centers for early diagnosis. The second [49] was also a recent paper, with data from a tertiary hospital and mixed European (Portugal) and South American populations, with ethnic and geographic bias.
The possible publication bias of the 18 studies initially included was confirmed using a funnel plot (Figure 3). The slight asymmetry at the lower left indicates a lack of articles with small samples reporting estimates of lower DD in F compared to M.
Then, a second meta-analysis was carried out on the remaining 16 papers following the Cochrane guidelines [61], which led to the observation of a lower DD in men (mean difference 1.48 years, 95% CI 0.83-2.14, p < 0.0001), with lower heterogeneity (I²: 56.76%) (Figure 2B).
Furthermore, we performed a more detailed meta-analysis based on study characteristics, namely the year of publication, geographic region (Europe and Israel, which have similar healthcare systems, vs. extra-European countries), sample sources, and WB economic class.
The studies published before 2009 showed higher DD in F (mean difference 3.16 years, 95% CI 2.11-4.22, p < 0.0001, I²: 0%), as did those from non-European countries (mean difference 0.95 years, 95% CI 0.05-1.85, p = 0.04, I²: 57.96%, p = 0.008), while after 2009 and in studies from the similar healthcare systems of Europe and Israel, DD was similar (Figures 4 and 5).
Sample sources (mono-centric vs. multi-centric papers) and WB class seemed not to influence DD in the meta-analysis, as shown in Supplementary Figures S1 and S2.
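As a consistency check on reported results of this kind, a standard error, z statistic, and two-sided p-value can be back-calculated from a mean difference and its 95% CI. The sketch below assumes a normal approximation (z-based interval), which may differ slightly from the method used by the meta-analysis software; the function name is hypothetical.

```python
import math


def z_and_p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Back-calculate SE, z, and two-sided p from an estimate and its 95% CI."""
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values quoted in the text: overall, pre-2009, and extra-European estimates.
overall = z_and_p_from_ci(1.48, 0.83, 2.14)
pre_2009 = z_and_p_from_ci(3.16, 2.11, 4.22)
extra_eu = z_and_p_from_ci(0.95, 0.05, 1.85)
```

Under this approximation, the overall and pre-2009 intervals correspond to p < 0.0001, while the interval 0.05-1.85 around 0.95 corresponds to a p-value close to 0.04.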
Because of the non-univocal presence of classification criteria in these studies, a subanalysis of ASAS classification criteria was not possible.

Discussion
Our systematic review showed that DD in ax-SpA and AS was higher in women. Although the final meta-analysis could include only a limited number of descriptive cross-sectional studies, few of which reported large samples, our analysis did not show significant differences in sex-related DD between mono-centric and more representative multi-centric studies.
Furthermore, to reduce the initial heterogeneity and publication bias of the meta-analysis, we had to exclude, in the final sensitivity analysis, two recent papers from third-level centers. In particular, the first one [42] observed a lower DD in F than in M, related to higher education, work levels, and peripheral involvement; the second one [49] also had geographic and ethnic bias, focusing on patients from Europe (Portugal) and South America.
We know that the disease is often not suspected in the F population, leading to an incorrect or nonspecific diagnosis in healthcare records: Jovani et al. [22] reported that only 11% of F received a correct diagnosis vs. 30% of M, and also that F had a number of specialist visits that was significantly higher than M.
We also know that DD might have important consequences in terms of radiological progression [11], and the difference in spinal progression observed in patients is also depicted in relation to symptom duration.
Commonly, women are considered less prone to develop a disability, and the difference in DD between sexes has been attributed to the poorer spinal mobility and greater radiological damage in men, which allow an earlier diagnosis [35,48]. This means that women might live with a difficult condition for a long time, continuously looking for the right diagnosis and treatment without finding a solution [22].
Otherwise, in F, the self-ratings of handicap scores were generally worse than in M, and pain intensity increased with age [60].
The lower prevalence of HLA*B27 in F [22,40,47,51,53,58], reported in most of the papers analyzed, might also impact the diagnosis, as demonstrated only by Bandinelli et al. [42], who found a lower DD in HLA*B27-positive women.
We can assume that, as proposed by Hajialilo [24], since HLA*B27 was introduced as a criterion for the diagnosis of AS patients in the ASAS classification criteria, physicians visit HLA*B27-positive patients earlier, facilitating the diagnosis.
The difference in the sex distribution of HLA*B27 in the studies examined is difficult to interpret: HLA*B27 belongs to a family of closely related cell-surface proteins encoded in the HLA*B locus, located on chromosome 6, apparently without any known sex-related inheritance [65].
The over-estimation of HLA*B27-negative women in published studies might be a consequence of an incorrect initial referral lacking genetic and familial investigation for SpA and the inclusion of F in the study population only after unsuccessful treatment under a different diagnosis [58].
In addition, even if substantial evidence exists that HLA*B27 might have a direct role in genetic susceptibility to AS/SpA, HLA*B27 is only responsible for ~20% of the overall genetic risk [66]; the underlying molecular basis has yet to be identified, and it might not be a prerequisite for the occurrence of AS since this disease also affects individuals who lack this gene [65,67].
Interestingly, in some geographic areas, newly generated *B27 subtypes might include protective haplotypes, such as *B27:09 in Sardinia [68] and *B27:06 in Southeast Asia [69], offering a novel perspective to dissect disease pathogenesis and identify additional genetic factors of susceptibility [70].
Furthermore, numerous new genetic loci are known to be associated with increased risk for AS [71,72], and evidence that AS susceptibility could be related to other MHC genes is increasing [73]. For instance, in the Portuguese population, the genotype HLA*-F01 showed a possible susceptibility effect, while other F and G haplotypes seemed to be protective [74].
Thus, in future studies, other haplotypes might be more deeply studied and included in the clinical classification of the F AS and ax-SpA populations.
Secondly, another possible explanation for the DD bias between sexes, previously hypothesized by Jovani et al. [35], might be the different prevalence of clinical symptoms at onset or during the disease, which might delay the diagnosis. This was confirmed by our systematic review, which found a higher prevalence of joint, enthesis, and cervical spine involvement in women [35,38,40,49,58], as well as of fibromyalgia overlap [38,39,53,54,59]. However, only one study analyzed the correlation of clinical presentation at onset with DD [42], showing that when axial involvement is the first symptom of the disease, DD might be even higher in M than in F.
In the F population, some symptoms of SpA, like morning stiffness and sleep disorders associated with diffuse enthesis pain, are frequently not interpreted in the same way as in the M population. These symptoms might be mistakenly attributed to fibromyalgia, or patients might be incorrectly referred to surgeons, internal medicine specialists [59], or even psychiatrists [75,76].
Finally, our meta-analysis showed that DD was more pronounced in F in papers published in extra-European countries and before 2009.
The high significance and low heterogeneity of the analysis of studies published before 2009 may be due to the widespread use of the New York criteria in the past [37,39,56], which require the presence of radiographic sacroiliitis and were validated predominantly in the M population [30]. In fact, the new ASAS classification criteria were validated in 2009, and pelvic MRI and the evaluation of peripheral involvement were subsequently introduced into clinical practice for AS and ax-SpA diagnosis [15].
Nevertheless, the ASAS classification criteria might also have some blind spots. In particular, the involvement of the cervical spine seemed higher in women [38], as was also confirmed in a recently published article [27] that showed a slightly higher number of cervical syndesmophytes [27,77] compared to lumbar ankylosis in F, a phenomenon that was not observed in men.
However, we should consider the potential limits of our current analysis. Firstly, the quality of the research on SpA has increased over the last twenty years, and old studies might have methodological limits due to limited sample size and possible selection bias. Indeed, the predominant source of information in the past decade has often been hospitals, which might be less accessible through primary healthcare channels. In addition, the year of publication was the only available proxy for change in the diagnostic approach, because a precise comparison between classification criteria was not possible: different criteria are often used concurrently within the same study. That said, as shown by Zhao et al. [14], the intervals between recruitment and publication are generally homogeneous across studies.
The higher DD in women in the extra-European countries could be explained by differences in healthcare systems compared to Israel and Europe. Indeed, in Israel and Europe, citizens have the right to universal healthcare and access to tertiary centers for early diagnosis through MRI and HLA testing.
Another possible explanation for such a different DD distribution between sexes might be the similar genetic distribution, given that most of the Israeli population migrated from Europe after the Second World War. In fact, it was observed that the strength of the HLA*B27 association varies among ethnic and racial groups in the world [65,69,78]. Furthermore, the geographic distribution of HLA*B27 shows a latitude-related gradient inverse to that of endemic malaria, with an apparent exception in New Guinea, a region where malaria is present but where HLA*B27 frequency shows an orographic gradient antithetic to malaria incidence [79,80].
Otherwise, in our study, genetic distribution was necessarily approximated to great geographic areas, and a more specific ethnic analysis might better clarify the different distribution between genders in the future, together with an analysis of gene polymorphisms.

Conclusions
This article highlights the topical theme of gender bias in ax-SpA and AS, showing a longer DD in the F population through a systematic review and meta-analysis of the literature from the last 40 years, from 1983 to 2023.
This research aimed to evaluate the difference in DD between sexes and the impact of genetic (HLA*B27), clinical conditions, education, and work factors on this gap, and the possible influence of the year and sample source of publications, geographic region, and WB economic class on DD in both sexes.
The different prevalence of HLA*B27 positivity, clinical presentation (peripheral, enthesis, and cervical spine involvement), education, and manual work between sexes might influence DD, although this is supported only by limited evidence, not sufficient for meta-analysis. For these reasons, future research is deemed essential, especially on genetic aspects and on more modern and tailored ax-SpA and AS criteria, to avoid the serious consequences of the sex bias still present in the clinical management of women's health.
Moreover, in the second part of our meta-analysis, we found that the year of publication and the geographic provenance might have a deep impact on DD. In fact, studies performed before 2009 and from extra-European countries reported higher DD in F.
We might conclude that the increase in the quality of the research on ax-SpA and AS in recent years and the use of the ASAS classification criteria have improved the operative definition of disease in both sexes. At the same time, the research approach still risks decontextualizing women's health and underestimating the sex bias, as a consequence of a skein of responsibilities still difficult to unravel.

Figure 2. Forest plot of the differences between the mean diagnostic delay of axial spondyloarthritis and spondylitis in women vs. men according to published papers. (A) Analysis including 18 papers. (B) Analysis including 16 papers.

Figure 3. Funnel plot showing publication bias of the published papers on diagnostic delay in women vs. men (18 studies included in the meta-analysis).

Figure 4. Forest plot of the differences between the mean diagnostic delay of axial spondyloarthritis and spondylitis in women vs. men according to different countries. (A) Extra-European countries. (B) Europe and Israel. Multi-centric studies were from: Argentina, Brazil, Costa Rica, Chile, Ecuador, Mexico, Peru, Uruguay, and Portugal (A); Austria, Belgium, France, Germany, Italy, Netherlands, Norway, Russia, Slovenia, Spain, Sweden, Switzerland, and the United Kingdom (B).

Figure 5. Forest plot of the differences between the mean diagnostic delay of axial spondyloarthritis and spondylitis in women versus men according to the year of publication. (A) Papers published before 2009. (B) Papers published after 2009.

Table 1. Summary of the main characteristics of the studies on DD difference between sexes, with data expressed as the mean (SD) or median (IQR, interquartile range) (*), included in the systematic review (n = 26) and in the meta-analysis (n = 18). Articles in bold were included in the meta-analysis.

Table 2. Studies analyzed only in the systematic review that included HLA*B27, peripheral, enthesis, axial, and fibromyalgia clinical presentation, social factors (work and education) prevalence, and its possible influence on DD, with data expressed as the percentage (%), mean ± standard deviation (SD), or median ± interquartile range (IQR).