Heterogeneity in Measures of Illness among Patients with Myalgic Encephalomyelitis/Chronic Fatigue Syndrome Is Not Explained by Clinical Practice: A Study in Seven U.S. Specialty Clinics

Background: One of the goals of the Multi-site Clinical Assessment of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (MCAM) study was to evaluate whether clinicians experienced in diagnosing and caring for patients with myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) recognized the same clinical entity. Methods: We enrolled participants from seven specialty clinics in the United States. We used baseline data (n = 465) on standardized questions measuring general clinical characteristics, functional impairment, post-exertional malaise, fatigue, sleep, neurocognitive/autonomic symptoms, pain, and other symptoms to evaluate whether patient characteristics differed by clinic. Results: We found few statistically significant and no clinically significant differences between clinics in their patients’ standardized measures of ME/CFS symptoms and function. Strikingly, patients in each clinic sample and overall showed a wide distribution in all scores and measures. Conclusions: Illness heterogeneity may be an inherent feature of ME/CFS. Presenting research data in scatter plots or histograms will help clarify the challenge. Relying on case–control study designs without subgrouping or stratification of ME/CFS illness characteristics may limit the reproducibility of research findings and could obscure underlying mechanisms.


Background/Introduction
Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is recognized to be a significant illness that impairs the lives of those affected and their families. The economic burden of the illness in the U.S. is estimated to be USD 9-14 billion annually in direct medical costs and an additional USD 9-37 billion annually in lost productivity [1,2]. Patients with ME/CFS experience a wide range of symptoms, most characteristically significantly decreased function associated with severe fatigue, post-exertional malaise, unrefreshing sleep, cognitive impairment, and orthostatic intolerance. However, chronic widespread pain, allergies, sensitivity to light and sound, chemical and food sensitivities, headaches, and other symptoms are common. The pathogenesis of this biologic illness remains a mystery despite decades of research. While a wide variety of objective differences between patients with ME/CFS and healthy controls have been reported, none are specific and sensitive enough to be used in diagnosis [3].
In the absence of a diagnostic test, recognizing ME/CFS requires reliance on patient-reported characteristic symptoms and clinical acumen to identify and treat any other illnesses that may contribute to the problems experienced by the patient. A variety of research and clinical case definitions have been in use, and clinicians worldwide recognize the clinical entity [4]. While the definitions have consensus on core features, differences in the numbers of required symptoms and in the use of exclusionary conditions contribute to variations in diagnosis. Further, the case definitions do not include guidance on how symptom-based criteria are fulfilled, and different approaches to applying the same case definition can affect diagnosis [5]. The 2015 Institute of Medicine report on ME/CFS recommended a streamlined clinical case definition to aid healthcare providers in recognizing the illness [6]. Nonetheless, case ascertainment for research purposes remains controversial, and differences in study findings may arise from the lack of defined reproducible measures and criteria for each element of a case definition. Currently, many studies rely on a combination of clinical expertise and one or more case definitions. Further, as many studies rely on patients recruited from a single specialty clinic, differences in referral patterns and clinical practices could affect patient characteristics.
One of the goals of the Multi-site Clinical Assessment of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (MCAM) study was to evaluate whether clinicians experienced in diagnosing and caring for patients with ME/CFS recognized the same clinical entity. This question is important for both research and clinical practice. Are there clinic-based differences resulting in the identification of patients that differ in important ways between clinics? Such differences could result in challenges replicating study findings and contribute to variation in clinical responses to therapies. We used standardized data collected on key illness dimensions of patients with ME/CFS enrolled from seven specialty clinics in the U.S. to investigate whether these patient characteristics differ between clinics.

Methods
We have published details of the study design and methods of the source study, MCAM [7]. Briefly, expert clinicians from seven specialty clinics in the U.S. enrolled patients who, in their clinical opinion, met criteria for ME/CFS and were between the ages of 18 and 70 years at the time of enrollment. Clinical sites were designated by letters (A-G). Each clinical site completed a form to provide information describing their clinical practice (form included in Supplemental Material S1).
Clinic personnel systematically abstracted socio-demographic data and presentation of illness from medical records. Patients completed a battery of standardized questionnaires/forms to measure ME/CFS symptoms and illness domains (Table 1). Although the MCAM study was initiated prior to the publication of the first set of Common Data Elements (CDEs) for ME/CFS, the study collected many of these data elements via their recommended assessment tools [8].
We have published the overall socio-demographic characteristics of the initial study sample (n = 471) from baseline enrollment between January 2012 and May 2013 [7]. Six of those enrolled were later confirmed to be age-ineligible and removed from further analysis; thus, the current analysis includes data from 465 ME/CFS patients. Patient characteristics are examined by site (A through G) as well as overall. In addition to examining differences in patient characteristics by site of enrollment, we sought to estimate whether patients met the most frequently used case definitions of ME/CFS or CFS. We operationalized the application of each case definition by using standardized measures and scoring thresholds for each required element of the definition. The 1994 research case definition of CFS [9] was evaluated using items from the CDC Symptom Inventory, SF-36, and MFI-20 [5]. The 2003 Canadian case definition for ME/CFS [10] was evaluated using items from the DePaul Symptom Questionnaire, as previously reported [11]. The 2015 IOM clinical case definition [6] was evaluated using items from the CDC Symptom Inventory and the DePaul Symptom Questionnaire. Details of the algorithms are included in Supplemental Materials (S2).
Descriptive statistics were calculated by site. Means and standard deviations were calculated for continuous variables, and proportions were calculated for categorical variables. Variation in patient characteristics, overall and by site, was examined using a combination of graphical (e.g., bar graphs, boxplots) and numerical (e.g., general linear models for continuous variables and chi-square tests for categorical variables) methods.
Bonferroni-corrected p-values were calculated for the multiple group comparisons on outcome measures of interest.
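As an illustration of the Bonferroni correction mentioned above, the following minimal Python sketch multiplies each raw p-value by the number of comparisons and caps the result at 1. The function name, the p-values, and the choice of three comparisons are hypothetical and are not taken from the study; the number of comparisons actually used for each family of tests is not stated in the text.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values.

    Each raw p-value is multiplied by the number of comparisons
    in the family and capped at 1.0.
    """
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for three outcome measures (not study data).
raw_p = [0.004, 0.03, 0.5]
adjusted_p = bonferroni_adjust(raw_p)

# Flag which comparisons remain significant at the 0.05 level.
significant = [p < 0.05 for p in adjusted_p]
```

With these hypothetical inputs, only the first comparison remains below 0.05 after adjustment, which shows how the correction makes site-level differences harder to declare significant as the number of comparisons grows.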

Clinical Practice Characteristics of Study Sites
Each of the seven expert clinicians established clinics to provide care to patients with ME/CFS. Most of them also cared for patients with fibromyalgia, fatigue, and sleep and/or pain disorders. The primary physicians at each clinical site had between 15 and 45 years' experience with diagnosing and caring for patients with ME/CFS, and their clinical practices had been active for 5-40 years. The board certifications of study physicians varied and included internal medicine, environmental medicine, infectious disease, immunology, hematology, neurology, and pediatrics. About half of the clinical sites required payment from their patients rather than relying on insurance. While over 75% of clinics indicated use of an electronic health record (EHR) system, this was limited to billing and payment for the majority. Most clinics did not use an EHR system until 2012, the first enrollment year of the MCAM study. Three sites were solo practices, two were small group, one was academic, and one was hospital-based. Clinic staff included nurses and/or nurse practitioners. One clinic had a physician assistant. The number of patients seen at each clinical site ranged from 5 to 70 per week. Six clinics accepted new patients, and waiting lists at four clinics ranged between 40 and 700 patients. Clinics varied in their location (four metropolitan, two urban, and one rural), ease of accessibility (public transportation or on-site parking available at all sites), and other important characteristics (time spent with patients and charges for the initial and follow-up visits).

Socio-Demographic Characteristics
As shown in Table 2, other than employment, each socio-demographic characteristic differed by site according to statistical measures; however, there was also significant variation for each measure within individual clinics. For example, overall, the mean age at enrollment was 47.9 years (standard deviation (SD) = 12.6), and the range in the mean age by site was 43.3 to 52.5 years. As shown in Figure 1, there was extensive overlap in the age distributions between sites (overall range <20 to 70 years). Females predominated at each site, with an overall female-to-male ratio of 2.9 (range of 1.6 to 5.6 across sites). Patients were predominantly white, comprising 94.4% of the study sample (range of 84.4% to 100% across sites). More than half were married or in a committed relationship, but the proportion ranged from 40.8% to 73.3% across sites. Overall, about 75% were unemployed (range of 64.3% to 89.1% across sites). The majority were insured (94.3%; range of 87% to 100% across sites) and at least college graduates (76.8%; range of 58.9% to 85.9% across sites).

General Clinical Characteristics
Table 3 presents the general characteristics of the patients overall and by site, including age at diagnosis, mode of onset, duration of fatigue, Body Mass Index (BMI), and number of medications. Each of these characteristics differed by site according to statistical measures.
Overall, the mean age at diagnosis was 38.3 years (SD = 12.6), whereas by site, the mean age at diagnosis ranged from 33.5 to 40.9 years. As shown in Figure 2, the distribution of ages at diagnosis by clinic and overall is broad and shows substantial overlap. Sudden illness onset was reported by 65.4% of patients overall, and this varied significantly by clinic, ranging from 49.3% to 75.4%. In all but one clinic, sudden onset was more common than gradual. The mean duration of fatigue ranged from 9.4 to 17.9 years across sites. We used the duration of fatigue as a surrogate for the duration of ME/CFS, as not all symptoms may appear at the same time. On a clinic basis, the mean age at enrollment and the duration of fatigue showed a modest correlation (R² = 0.74). The mean BMI was in the overweight range for all clinics (range of 24.3-29.2); overall, it was 26.6 (SD = 6.3). The mean number of medications was 5.9 overall, with variation by site (range of means of 4.1-8.2).

Functional Impairment
Table 4 shows the patients' measures of functional impairment by site and overall. These measures show no statistically significant differences among the clinics. As presented previously for the overall group [7], the mean scores for Role Physical and Vitality were lowest (worst functioning) for patients at each site, and the mean scores for Mental Health and Role Emotional were highest (least impaired) among SF-36 subscales. Hours of vertical activity is the patient-reported average number of hours per day spent with their feet on the floor. This encompasses sitting, standing, and walking activities. The median of patients' vertical hours by site ranged from 6.3 to 8.0, with an overall mean of 7.5 h. To help put this into context, this is about half of that reported by healthy controls (14 h) [12].
Exercise was evaluated using the response to the level of physical activity questions (medical history form) and the question about how many times in a typical week participants would exercise for more than 15 min in their leisure time. Strenuous exercise was described as "heart beating rapidly" (examples: soccer, jogging). Moderate exercise was described as "not exhausting" (examples: lifting weights, fast walking, folk dancing). Mild exercise was described as "minimal effort" (examples: yoga, golf, bowling). The highest level of exercise engaged in at least once per week for more than 15 min was captured. Overall, nearly three-quarters (73.6%) of all patients reported engaging in mild exercise, with a range across sites of 65.3% to 84.1%. Fewer patients (29.5% overall, 23.3% to 36.4% across sites) reported engaging in moderate exercise, while even fewer reported strenuous exercise at least once a week (7.8% overall, 4.1% to 11.6% across sites).
The CDC Health-related Quality of Life (HRQoL) asks patients to record the number of days during the past 30 days when their physical health (includes physical illness and injury) was not good. Overall, the mean number of physically unhealthy days experienced by patients in the past 30 days was 23.2 days (SD = 9.5), and by site, the means ranged from 20.5 to 25.1 days.

Measures of Post-Exertional Malaise (PEM)
We evaluated PEM using medical record review (Illness Abstraction Form [IAF]) as well as portions of two different questionnaires, the CDC SI (PEM question) and the DePaul Symptom Questionnaire (DSQ; items 14-18). The CDC SI and DSQ provided item scores and a determination of PEM presence. The results are shown in Table 5 by site and overall. None of these measures differed significantly by site.
Notes to Table 5: All values are mean (standard deviation) unless otherwise indicated. No comparisons were statistically significant at *** p < 0.001, ** p < 0.01, or * p < 0.05. (1) Insufficient information was available to determine PEM status using all three approaches for four subjects. (2) PEM present for CDC SI defined as experienced ≥ 6 months and present during the past month. (3) PEM present for DSQ if frequency ≥ 2 ("About Half the Time") and severity ≥ 2 ("Moderate") for at least one of DSQ items 14-18.
The IAF was completed by clinic staff based on the review of all available medical records for an indication that PEM, with illness relapse of at least 24 h duration, was experienced by the patient for at least 6 months at any point in their illness. PEM was recorded as present based on the IAF if these criteria were met. Overall, PEM could be identified by the IAF in 87.5% of patients, with a range by clinic of 83.8% to 90.7%.
Based on the CDC SI, PEM was recorded as present if the symptom lasted at least 6 months and was reported during the past month. Overall, the CDC SI identified the presence of PEM in 94.7% of patients with a mean score of 11.7 (SD = 4.8). The mean score by site ranged from 10.8 (SD = 5.1) to 12.6 (SD = 3.3), and PEM presence varied from 90.3% to 100%.
Based on the DSQ, PEM was determined to be present if any of the five DSQ items had values for both frequency and severity ≥ 2. Overall, the mean for each of the five PEM items ranged from a high of 9.1 (SD = 5.4) for item 17 (tired after minimal exercise) to a low of 6.5 (SD = 5.1) for item 16 (mentally tired after slightest effort). The mean PEM item scores varied by site but did not differ significantly. Overall, the DSQ identified PEM in 94.0% of patients, and the range by site was 89.1% to 97.4%.
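The DSQ rule described above can be expressed as a short check. The following is a minimal Python sketch, not the study's scoring code: the function name and the input layout (a mapping from DSQ item number to a frequency/severity pair) are assumptions made for illustration, and unanswered items are simply skipped, which is also an assumption.

```python
# DSQ items used to assess PEM, as stated in the text.
PEM_ITEMS = (14, 15, 16, 17, 18)


def dsq_pem_present(responses, threshold=2):
    """Return True if any PEM item has frequency AND severity >= threshold.

    `responses` maps a DSQ item number to a (frequency, severity) tuple.
    Both ratings must meet the threshold on the same item.
    Items missing from `responses` are skipped (illustrative assumption).
    """
    return any(
        responses[item][0] >= threshold and responses[item][1] >= threshold
        for item in PEM_ITEMS
        if item in responses
    )


# Hypothetical responses (not study data).
meets_rule = dsq_pem_present({14: (1, 3), 15: (2, 2), 16: (0, 0)})
# High frequency on one item and high severity on another does not qualify.
split_ratings = dsq_pem_present({14: (3, 1), 15: (1, 3)})
```

The second example highlights a detail of the rule: frequency and severity must both reach the threshold on the same item, so elevated ratings spread across different items do not count.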
The three approaches to evaluating PEM were compared in the 461 participants with enough information. Overall, PEM was identified by all three methods in 81% (371/461) of participants, and only 6 (1.3%) failed to have PEM documented with at least one method. Concordance was highest between the DSQ and CDC SI (92.4%, 426/461) and lowest between the IAF and DSQ (83.5%, 385/461).
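A comparison of this kind can be sketched in a few lines of Python. The sketch assumes that concordance means simple percent agreement (the share of participants given the same PEM status by two methods); the text does not define the statistic, so this is an illustrative assumption. The flags below are hypothetical and are not study data.

```python
from itertools import combinations


def percent_agreement(a, b):
    """Percentage of participants assigned the same status by two methods."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical PEM flags for five participants (not study data).
flags = {
    "IAF": [True, True, False, True, False],
    "CDC SI": [True, True, True, True, False],
    "DSQ": [True, False, True, True, False],
}

# Agreement for each pair of methods.
pairwise = {
    (m1, m2): percent_agreement(flags[m1], flags[m2])
    for m1, m2 in combinations(flags, 2)
}

# Participants flagged by all three methods, and by none of them.
per_participant = list(zip(*flags.values()))
all_three = sum(all(row) for row in per_participant)
no_method = sum(not any(row) for row in per_participant)
```

Percent agreement counts matching negatives as well as matching positives, so it can be high when nearly everyone is positive, as is the case for PEM in this sample.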

Measures of Fatigue
Table 6 shows the mean scores on the MFI-20 subscales and the mean T-score of the PROMIS Fatigue Short Form 7a (PROMIS F-SF) by site and overall. Most of these measures show no statistically significant difference by site. The exceptions are two of the MFI-20 subscales, Physical Fatigue and Reduced Activity. However, the mean difference in these subscale scores across sites is less than the two-point minimal clinically important difference for these measures.

Measures of Sleep
Table 7 shows the mean scores of sleep measures overall and by site. The measures include unrefreshing sleep and sleep problems (from CDC SI), sleep disturbance and sleep-related impairment (from PROMIS Sleep Short Forms), and a variety of measures from DSQ items 19-24 (unrefreshed, need to nap, problems falling asleep, problems staying asleep, awaking too early, and sleep all day/awake all night). The mean scores by site for nearly all measures show no statistically significant difference. The two exceptions are PROMIS sleep-related impairment and DSQ unrefreshed. The T-scores for PROMIS sleep-related impairment means by site ranged from 60.5 (SD = 7.1) to 64.3 (SD = 7.6), a difference of 3.8 points that would be considered clinically meaningful [13]. However, the SD of the measures is nearly twice the difference in the site means. Both the CDC SI and DSQ measure unrefreshing sleep using a similar Likert scale. Although the CDC SI measure of unrefreshing sleep was not significantly different across sites, it showed the same trends by site as DSQ unrefreshing sleep.

Measures of Neurocognitive/Autonomic Symptoms
Table 8 shows the mean scores by site and overall for the four symptoms measured in the CDC SI (memory, concentration, sensitive to light, and short of breath) and the twenty symptoms measured in the DSQ (items 32-51). The mean symptom scores by site did not differ statistically for nearly all these measures. The two exceptions are DSQ items 34 (noise sensitive, range of mean scores of 4.5 to 8.4) and 48 (unsteady on feet, range of mean scores of 2.7 to 5.2). The sites all showed similarities in the symptoms with the highest and lowest mean scores. The four DSQ symptoms with the highest mean scores were the same for all sites and overall, i.e., concentration, memory, only focus on one thing, and word-finding difficulty. Memory and concentration were the CDC SI symptoms with the highest mean scores in all sites and overall. The three DSQ symptoms with the lowest mean scores were the same for all sites and overall, i.e., irregular heartbeats, loss of depth perception, and nausea. In addition, for all but one site, muscle twitch was among the four DSQ symptoms with the lowest mean scores.

Measures of Pain
Table 9 shows the mean scores by site and overall for the pain-related measures, those in the Brief Pain Inventory (BPI), PROMIS Pain Interference and Pain Behavior, three CDC SI symptoms, and DSQ items 25-31. Once again, there were only a few items with statistically significant differences in means across sites (severity of pain in BPI, joint pain in CDC SI, muscle pain or ache and pain/stiffness without swelling in >1 joint in DSQ). The PROMIS T-scores for pain interference were slightly higher than pain behavior in each site. In both the CDC SI and DSQ, muscle symptoms have higher mean scores in each site than joint symptoms.
Table notes: All values are mean (standard deviation). *** p < 0.001; ** p < 0.01; * p < 0.05.

Measures of Other Symptoms
Table 10 shows the mean scores by site and overall for symptoms classified as immunologic/inflammation, gastrointestinal, and emotional or behavioral. Immunologic/inflammation symptoms include five measures from the CDC SI (sore throat, sinus problems, tender lymph nodes, fever, chills) and five from the DSQ (items 62-66). The mean scores by site show significant differences only for tender lymph nodes in the CDC SI (mean scores range from 2.4 to 4.9) and fever in the DSQ (mean scores range from 0.6 to 1.7). The pattern of symptom severity (as measured by mean scores) was nearly identical among the sites and overall. The top two CDC SI immunologic/inflammation symptoms were sinus problems and tender lymph nodes (all but one site had higher mean scores for sinus problems), whereas fever was the symptom with the lowest mean scores. The top two DSQ symptoms in the immunologic/inflammatory group, for all sites and overall, were being sickened by smells, food, medications, or chemicals and having flu-like symptoms, whereas fever was the symptom with the lowest mean scores. Measures of gastrointestinal symptoms did not differ between sites. The mean score for stomach pain was higher than for diarrhea for all sites and overall.

Histograms of CDC SI Scores
As noted above, the standard deviations of the measures of illness indicate the heterogeneity of these measures. Histograms of the CDC Symptom Inventory scores (Figure 4) illustrate how score distributions differ by symptom, with each clinic's scores following the same pattern. Symptoms included in most case definitions (post-exertional malaise and sleep) have greater frequency in the higher symptom score groups. By comparison, the score distributions of memory and concentration, muscle aches/pain, joint pain, sensitivity to light, and sinus/nasal problems are more evenly spread over the score groups. Other symptoms, such as fever, chills, and depression, are heavily represented in the lower score groups. The DSQ scores follow the same pattern. None of the measures of depression, anxiety, or mentally unhealthy days differed between clinics and overall. Overall, the mean scores for Zung SDS (45.0, SD = 9.0), GAD-7 Anxiety (5.1, SD = 5.3), CDC HRQoL mentally unhealthy days (9.2, SD = 10.5), and PHQ-8 (10.0, SD = 5.1) indicate that depression and anxiety are clinically significant issues for some study participants.
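The score groups used for the CDC SI histograms (Figure 4: 0, 1-4, 5-8, 9-12, and 13-16) can be tallied as in the minimal Python sketch below. The function names and example scores are hypothetical. Assigning a score to a group by its upper bound (for example, any score above 0 and up to 4 falls in "1-4") is an assumption made for illustration and not a documented scoring rule.

```python
from collections import Counter

# Score groups as labeled in Figure 4.
GROUPS = ["0", "1-4", "5-8", "9-12", "13-16"]
UPPER_BOUNDS = (4, 8, 12, 16)


def score_group(score):
    """Map a symptom score (0 to 16) to its histogram group label."""
    if not 0 <= score <= 16:
        raise ValueError(f"score out of range: {score}")
    if score == 0:
        return "0"  # symptom not present
    for upper, label in zip(UPPER_BOUNDS, GROUPS[1:]):
        if score <= upper:
            return label


def histogram(scores):
    """Count scores per group, keeping empty groups so clinics line up."""
    counts = Counter(score_group(s) for s in scores)
    return {group: counts.get(group, 0) for group in GROUPS}


# Hypothetical scores for one symptom in one clinic (not study data).
example = histogram([0, 3, 4, 8, 9, 16, 16])
```

Keeping every group in the result, including those with zero counts, makes it straightforward to compare the shape of the distribution across clinics and symptoms.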

Proportion Meeting Case Definitions
Table 11 shows the proportion of patients meeting each of the three case definition algorithms by site and overall. For each site, the 1994 research algorithm was met by the highest proportion of patients (83.4% overall, range of 77.0-90.0%), and the 2003 Canadian algorithm was met by the lowest proportion of patients (50.1% overall, range of 45.9-57.4%). Figure 3 illustrates the agreement between case definition algorithms for the 449 patients with data for all three calculations. Of note, 352 (70.2%) met two or more algorithms, and 41 patients (9.1%) did not meet any of the algorithms. Two-way concordance between algorithms was similar: 1994 and Canadian, 0.624; 1994 and IOM, 0.670; and Canadian and IOM, 0.628.
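The overlap summarized in Figure 3 amounts to counting patients in each region of a three-set Venn diagram. The minimal Python sketch below shows one way to tally those regions from per-patient flags. The function name, the flag layout, and the example patients are hypothetical, and the sketch does not reproduce the case definition algorithms themselves, which are described in Supplemental Materials (S2).

```python
from collections import Counter

DEFINITIONS = ("1994", "Canadian", "IOM")


def venn_regions(patients):
    """Count patients per Venn region.

    Each patient is a tuple of three booleans, one per definition in
    DEFINITIONS. A region is keyed by the frozenset of definitions met.
    """
    return Counter(
        frozenset(name for name, met in zip(DEFINITIONS, flags) if met)
        for flags in patients
    )


# Hypothetical classification flags for five patients (not study data).
patients = [
    (True, True, True),
    (True, False, True),
    (True, False, False),
    (False, False, False),
    (True, True, True),
]

regions = venn_regions(patients)
met_two_or_more = sum(n for region, n in regions.items() if len(region) >= 2)
met_none = regions[frozenset()]
```

Keying each region by the set of definitions met keeps the tally independent of how many definitions are compared, so the same function would work if a fourth algorithm were added to `DEFINITIONS`.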

Discussion
We initiated this study to characterize patients with ME/CFS based on the clinical opinion of clinicians with recognized expertise in diagnosing and caring for these patients. The goal of this analysis was to determine if differences in clinical practice resulted in different subgroups of patients. We used standardized instruments to measure the major domains of illness experienced by patients with ME/CFS and evaluated these characteristics for patients seen at each clinic and overall. The specialty clinics differed in size (solo practice, small group, academic, hospital-based), location (metropolitan, urban, and rural), and practice characteristics (time spent with patients, charges for initial visit, follow-up visits). The board certifications of the physicians included internal medicine, environmental medicine, infectious diseases, immunology, hematology, neurology, and pediatrics. Certifications could influence referral patterns and approaches to laboratory testing, diagnosis, and management. There were statistical differences between sites in the general characteristics of their patients. Socio-demographic variables (Table 2) showed statistical differences by clinic site, but the distributions of the variables overlapped (such as age at enrollment, Figure 1) and were in the same direction (predominantly insured white women with a high level of education). Similarly, there were statistical differences by site in the age at diagnosis, mode of onset, duration of fatigue, and BMI. Again, distributions were overlapping (such as age at diagnosis, Figure 2) and in the same direction (majority of patients had sudden onset and were overweight). Interestingly, the largest variation was in the number of medications prescribed for patients, with a mean ranging from 4.1 to 8.2 (p = 0.001).
In the face of these practice differences, it is striking that we found few statistically significant and no clinically meaningful differences between clinics in their patients' standardized measures of ME/CFS symptoms and function (Tables 4-10). This suggests that expert clinicians are recognizing the same clinical entity, albeit one that is far from homogeneous. Importantly, patients in each clinic sample and overall showed a wide distribution in all scores and measures. The data tables are provided as a reference of these illness measures in clinically well-characterized patients.
Heterogeneity in ME/CFS illness characteristics has been recognized, and researchers have been advised to use more restrictive or overlapping case definitions to improve case ascertainment [14]. However, symptom-based case definitions have limitations, including a lack of guidance in how each criterion of the definition should be met. Differences in how the same case definition is applied or operationalized can impact case ascertainment [5]. The use of common data elements for ME/CFS [8] will improve reproducibility, but work remains in establishing the optimal thresholds for each measure. However, current ME/CFS case definitions do not eliminate heterogeneity. For example, the Chronic Fatigue Initiative [15] required that the 203 enrolled patients with ME/CFS met the 1994 research case definition [9], the 2003 Canadian definition [10], or both and reported SF-36 scores with standard deviations similar to those observed in our study [15].
The medical complexity of ME/CFS and the lack of objective diagnostic tests present challenges for case ascertainment in research and clinical care. Heterogeneity is masked when studies report means or medians. Presenting research data in a format that shows heterogeneity, such as scatter plots or histograms, will help clarify the challenge. Variations in patient demographics, co-morbid conditions, medications, and duration of illness can all contribute to heterogeneity [16].
ME/CFS shares many characteristics with post-acute infection syndromes (PAISs) and is often considered to have an infectious trigger, although other triggers are not ruled out. The COVID-19 pandemic, resulting in a chronic PAIS termed Long COVID, has increased clinical and research studies seeking to understand the mechanisms and options for treatments. Despite a single defined trigger (SARS-CoV-2), patients with Long COVID have striking heterogeneity in their illness profiles. An early report from the Patient-Led Research Collaborative [17] collected data on 203 symptoms in 10 organ systems. Illness heterogeneity may be an inherent feature of these syndromes. Relying on case-control study designs without the subgrouping or stratification of ME/CFS illness characteristics may limit the reproducibility of research findings and could obscure underlying mechanisms. Study designs that compare similarities and differences in ME/CFS, Long COVID, and other PAISs and begin to link biomarkers to symptom measures and subgroups may be needed.
While post-exertional malaise (PEM) is considered to be characteristic of ME/CFS and is a required symptom in the 2015 IOM clinical case definition [6], methods to identify PEM are not standardized. We found that the five DSQ items recently recommended as a first step in identifying PEM [18] correlated well with results based on a single question in the CDC SI (92.4%, 426/461). Cotler et al. found that supplementary questions on the timing and duration of PEM were necessary to distinguish between ME/CFS and other illnesses [18], and this could not be evaluated in our study. Given the burden of symptom

Figure 1. Distribution of age at enrollment by site (A through G) and overall mean. The boxplots display the five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. The central rectangle spans from the first quartile to the third quartile (the interquartile range), a green segment inside the rectangle shows the median, the red diamond shows the mean, and the vertical lines (sometimes referred to as whiskers) are extended to the extrema of the distribution in the data set.

Figure 2. Distribution of age at diagnosis by site (A through G). The boxplots display the five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum. The central rectangle spans from the first quartile to the third quartile (the interquartile range), a green segment inside the rectangle shows the median, the red diamond shows the mean, and the vertical lines (sometimes referred to as whiskers) are extended to the extrema of the distribution in the data set.

Figure 3. Agreement in classification by case definition algorithm. Venn diagram showing the overlap in classification by case definition algorithm. The number that did not fulfil any of the algorithms is shown in the background. Note: Data exclude 16 participants with insufficient information to determine all classifications.

Figure 4. Histograms of CDC Symptom Inventory scores by clinic. Frequency of CDC Symptom Inventory scores (frequency × severity) is shown by score groups 0 (not present), 1-4, 5-8, 9-12, and 13-16 (highest scores) for each clinic (A-G, shown by colors noted at the bottom of the figure).

Table 2. Socio-demographic characteristics of study population by site (A through G) and overall.

Table 3. General clinical characteristics of patients by site (A through G) and overall.

Table 4. Measures of functional impairment by site (A through G) and overall.

Table 5. Measures of post-exertional malaise (PEM) by site (A through G) and overall.

Table 6. Measures of fatigue by site (A through G) and overall.

Table 7. Measures of sleep by site (A through G) and overall.

Table 8. Measures of neurocognitive/autonomic symptoms by site (A through G) and overall.

Table 9. Measures of pain by site (A through G) and overall.

Table 10. Measures of other symptoms by site (A through G) and overall.

Table 11. Patients meeting case definition algorithms by site (A through G) and overall.