Planimetric and Volumetric Brainstem MRI Markers in Progressive Supranuclear Palsy, Multiple System Atrophy, and Corticobasal Syndrome. A Systematic Review and Meta-Analysis

Background: Various MRI markers, including midbrain and pons areas (Marea, Parea) and volumes (Mvol, Pvol), ratios (M/Parea, M/Pvol), and composite markers (magnetic resonance Parkinsonism indices 1 and 2; MRPI 1, 2), have been proposed as imaging markers of Richardson’s syndrome (RS) and multiple system atrophy–Parkinsonism (MSA-P). A systematic review/meta-analysis of relevant studies aiming to compare the diagnostic accuracy of these imaging markers is lacking. Methods: PubMed and Scopus were searched for studies with >10 patients (RS, MSA-P, or CBS) and >10 controls with data on Marea, Parea, Mvol, Pvol, M/Parea, M/Pvol, MRPI 1, and MRPI 2. Cohen’s d, as a measure of effect size, was calculated for all markers in RS, MSA-P, and CBS. Results: Twenty-five studies on RS, five studies on MSA-P, and four studies on CBS were included. Midbrain area provided the greatest effect size for differentiating RS from controls (Cohen’s d = −3.10; p < 0.001), followed by M/Parea and MRPI 1. MSA-P had decreased midbrain and pontine areas. Included studies exhibited high heterogeneity, whereas publication bias was low. Conclusions: Midbrain area is the optimal MRI marker for RS, and pons area is optimal for MSA-P. M/Parea and MRPIs produce smaller effect sizes for differentiating RS from controls.


Introduction
Atypical Parkinsonian disorders (APD) is a term used to describe three rare neurodegenerative Parkinsonian disorders, namely progressive supranuclear palsy (PSP), multiple system atrophy (MSA), and corticobasal syndrome (CBS) [1][2][3]. PSP exhibits significant phenomenological heterogeneity, with Richardson's syndrome (RS) being the most common syndrome, characterized by supranuclear gaze palsy and early postural instability. MSA presents with two distinct syndromes, with predominant parkinsonian (MSA-P) or cerebellar (MSA-C) symptomatology. Despite the presence of distinct clinical features in RS, MSA-P, and CBS, misdiagnosis is common, particularly at early disease stages and in oligosymptomatic cases [4,5].
In an effort to support a timely and accurate diagnosis, multiple imaging markers have been implemented. These morphometric MRI markers focus on the relatively selective midbrain and superior cerebellar peduncle (SCP) atrophy in RS, as evidenced by neuropathological studies [6]. Likewise, MSA (predominantly MSA-C and to a lesser extent MSA-P) is characterized by relatively selective pontine and middle cerebellar peduncle (MCP) atrophy [7]. Multiple studies have examined the diagnostic accuracy of diverse morphometric brainstem MRI markers, including linear distances, surfaces, and volumes of the midbrain, the pons, SCPs, and MCPs [8][9][10][11]. Additionally, composite MRI markers, such as the magnetic resonance Parkinsonism indices 1 and 2 (MRPI 1 and 2), have been introduced in an effort to increase discriminative power by combining multiple morphometric measurements [12,13].
Neurol. Int. 2024, 16
Despite the abundance of relevant studies, significant differences between studies in study designs, cohort characteristics, and imaging markers implemented have resulted in discrepant results regarding the diagnostic accuracy of these MRI markers in different APDs.
The present systematic review and meta-analysis aims to present data regarding MRI brainstem imaging markers in RS, MSA-P, and CBS in a systematic and comprehensive manner. The primary aim of this study was to compare the diagnostic accuracy of the most commonly applied MRI markers in cohorts of RS, MSA-P, and CBS, with a particular focus on planimetric and volumetric markers of midbrain and pons as well as composite MRI markers (MRPI 1 and 2).

Materials and Methods
The present study was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The study protocol was registered in the International Prospective Register of Systematic Reviews (PROSPERO; ID: CRD42023475739) [14]. No institutional review board approval was obtained since only previously published data were utilized.

Literature Search Strategy
PubMed and Scopus were searched from database inception to 15 October 2023 by three researchers independently (M.-E.B., I.K., and N.G.). In cases of disagreement regarding the eligibility of a study, the issue was discussed by all researchers, and the study was included only after a consensus was reached. An additional manual search was performed on all included studies regarding: (a) all references of included studies; (b) all citations of these studies; (c) relevant studies (from PubMed). In cases of full text unavailability, the corresponding authors of papers were contacted in an effort to retrieve full texts.
The search strategy applied was: (MRI OR magnetic resonance OR brainstem OR midbrain OR pons OR cerebellar peduncle OR volume OR volumetry OR surface OR area OR planimetry OR distance OR diameter OR width) AND (CBD OR CBS OR corticobasal OR extrapyramidal OR gait apraxia OR MSA or multiple system atrophy OR parkinsonian OR parkinsonism OR PGAF or PSP or Richardson or supranuclear). In cases of publications with possible partial overlap of cohorts, an algorithm including evaluation of authorship, study characteristics, sample characteristics, constructs' and measures' definitions, and study effects was applied to reach a decision regarding the eligibility of studies [15].

Data Extraction
Data extraction was performed by two authors independently (V.C.C. and N.G.). In cases of disagreement, a consensus was reached after joint assessment of the data from the original study.
Information extracted from studies included the following: first author; year of publication; study title; study design (i.e., retrospective, prospective, cross-sectional, unspecified); period of recruitment; study center.
Additionally, for each of the four groups included in this meta-analysis (RS, MSA-P, CBS, control group), the following information was extracted where available: male/female ratio; mean age; mean disease duration (applicable only in the patient groups). Also extracted were the subject count (n), mean value, and standard deviation (SD) of: (a) Marea; (b) Parea; (c) M/Parea; (d) Mvol; (e) Pvol; (f) M/Pvol; (g) MRPI 1; and (h) MRPI 2 per study group.
In cases of missing data on the n, mean, or SD of MRI markers, supplementary files of relevant papers were reviewed. Additionally, data were extracted from scatterplots, boxplots, or error bar plots where applicable through the use of WebPlotDigitizer version 4.6 (https://apps.automeris.io/wpd; access date: 15 November 2023).

Summary Measures
Standardized mean difference (SMD), as expressed by Cohen's d, was calculated to measure the effect size of the difference in planimetric, volumetric, and composite MRI markers between APD patients and control subjects. Effect size based on Cohen's d was interpreted as very small (d ≈ 0.01), small (d ≈ 0.2), medium (d ≈ 0.5), large (d ≈ 0.8), very large (d ≈ 1.2), or huge (d ≈ 2.0), based on published recommendations [16].
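As an illustration of this summary measure, the following minimal Python sketch computes Cohen's d from the per-group n, mean, and SD values that were extracted. It assumes the pooled-SD form of Cohen's d and a common large-sample approximation of its standard error; the function names, the example numbers, and the treatment of the verbal anchors as lower bounds are illustrative assumptions, not the SPSS procedure used in this study.

```python
import math


def cohens_d(n_pat, mean_pat, sd_pat, n_ctrl, mean_ctrl, sd_ctrl):
    """Cohen's d (patients minus controls) using the pooled standard deviation."""
    pooled_var = ((n_pat - 1) * sd_pat ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / (
        n_pat + n_ctrl - 2
    )
    return (mean_pat - mean_ctrl) / math.sqrt(pooled_var)


def d_standard_error(d, n_pat, n_ctrl):
    """Large-sample approximation of the standard error of d (assumed formula)."""
    total = n_pat + n_ctrl
    return math.sqrt(total / (n_pat * n_ctrl) + d ** 2 / (2 * total))


def describe_effect(d):
    """Map |d| to a verbal label, treating each anchor as a lower bound (assumption)."""
    anchors = [
        (2.0, "huge"),
        (1.2, "very large"),
        (0.8, "large"),
        (0.5, "medium"),
        (0.2, "small"),
        (0.01, "very small"),
    ]
    for threshold, label in anchors:
        if abs(d) >= threshold:
            return label
    return "negligible"


# Hypothetical midbrain areas in mm^2 (not data from any included study).
d = cohens_d(20, 90.0, 15.0, 20, 130.0, 15.0)
print(round(d, 2), describe_effect(d), round(d_standard_error(d, 20, 20), 3))
```

A negative d here means the marker is smaller in patients than in controls, which matches the sign convention of the reported results.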

Quality Evaluation
Quality evaluation was performed by three authors independently (M.-E.B., I.K., N.G.) through use of the QUADAS-2 tool [17]. It consists of four key domains (patient selection, index test, reference standard, and flow/timing), which are assessed in terms of risk of bias and concerns regarding applicability. For the present meta-analysis, the signaling questions regarding the presence or absence of "pre-specified cut-offs" from the "reference standard" domain and the "appropriate interval between index test and reference standard" from the "flow and timing" domain were not implemented due to non-applicability. Additionally, the signaling question "Was a case-control design avoided?" from the "patient selection" domain was omitted since the primary aim of this meta-analysis was the comparison of MRI markers between APD patients and control subjects. Thus, by definition, all included studies had a case-control design. In cases of disagreement, a consensus was reached after discussions between the authors.

Statistical Analysis
The Q statistic was used to assess the presence or absence of heterogeneity, and the I² statistic was applied to quantify between-study heterogeneity. Heterogeneity was classified as low, moderate, or high with I² values of <25%, 25-50%, and >50%, respectively.
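For readers who want to reproduce these heterogeneity measures, a minimal Python sketch is given below. It assumes the standard inverse-variance definition of Cochran's Q and the usual relation I² = max(0, (Q − df)/Q) × 100; the function names are hypothetical, and the classification simply applies the thresholds stated above.

```python
def heterogeneity(effects, std_errors):
    """Return Cochran's Q and I^2 (in percent) for per-study effects and SEs."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2


def classify_i2(i2):
    """Apply the low / moderate / high thresholds stated in the text."""
    if i2 < 25.0:
        return "low"
    if i2 <= 50.0:
        return "moderate"
    return "high"


# Hypothetical per-study Cohen's d values and standard errors.
q, i2 = heterogeneity([-1.0, -3.0], [0.5, 0.5])
print(q, i2, classify_i2(i2))
```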
To control for between-study heterogeneity, a random effects model was applied for meta-analysis. Cohen's d was calculated as a measure of the effect size of distinction between MRI markers in APD patients and control groups. Analyses were performed for each patient group (RS, MSA-P, and CBS) against the control group. No direct comparison between patient groups was performed. Forest plots were produced and displayed effect sizes, standard errors, confidence interval limits, p-values, and weights.
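The pooling step of a random effects model can be sketched as follows. This is a minimal illustration that assumes the DerSimonian-Laird estimator of the between-study variance (tau²) and a normal-approximation 95% confidence interval; the text does not state which estimator was selected in SPSS, so the choice of estimator and the function name are assumptions.

```python
import math


def random_effects_pool(effects, std_errors):
    """DerSimonian-Laird random-effects pooling of per-study effect sizes."""
    k = len(effects)
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * d for wi, d in zip(w, effects)) / sum_w
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(ws * d for ws, d in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, se_pooled, ci, tau2


# Hypothetical per-study Cohen's d values and standard errors.
print(random_effects_pool([-1.0, -3.0], [0.5, 0.5]))
```

When tau² is estimated as zero, the weights reduce to the inverse-variance weights of a fixed effect model, so the two models give the same pooled estimate.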
To test for publication bias, funnel plots were constructed with Cohen's d on the x-axis and standard error on the y-axis in order to visualize any outlying studies. Additionally, the Egger linear regression test was performed in order to quantify bias.
SPSS version 28 (IBM Corp. Released 2021; IBM SPSS Statistics for Windows, Version 28.0. Armonk, NY, USA: IBM Corp) was used by one author (V.C.C.) for all statistical analyses. A two-tailed p value < 0.05 was considered statistically significant.
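The idea behind the Egger test can be sketched in a few lines of Python. The example assumes the classical formulation, in which the standardized effect (d / SE) is regressed on precision (1 / SE) by ordinary least squares and an intercept far from zero suggests funnel plot asymmetry. The function name and input values are hypothetical, and the p-value lookup against a t distribution with k − 2 degrees of freedom is left out.

```python
import math


def egger_test(effects, std_errors):
    """Egger's regression: returns intercept, slope, SE of intercept, t statistic.

    Requires at least three studies whose standard errors are not all equal.
    """
    if len(effects) < 3:
        raise ValueError("at least three studies are needed")
    x = [1.0 / se for se in std_errors]                   # precision
    y = [d / se for d, se in zip(effects, std_errors)]    # standardized effect
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(residual_ss / (n - 2) * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, se_intercept, t_stat


# Hypothetical studies in which less precise studies report larger effects.
intercept, slope, se_intercept, t_stat = egger_test(
    [-1.2, -1.4, -2.0], [0.1, 0.2, 0.5]
)
print(intercept, slope)
```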

Basic Features Included in the Study
The basic characteristics of included studies are summarized in Table 1. Twenty-five studies included data on RS patients, five studies on MSA-P, and four studies on CBS. Eleven studies were retrospective, two studies were prospective, and the remaining studies had undefined study designs. Mean age in MSA-P cohorts varied from 59.7 to 70 years, mean disease duration from 24.7 to 93.8 months, and male-to-female ratio from 0.3 to 2.5. Four studies included data on midbrain and pons area, and three studies included data on M/Parea (Table 1, Supplementary Table S2).
None of the MRI markers had data from >2 studies in CBS cohorts. Mean age in CBS cohorts varied from 61.3 to 72.8 years, mean disease duration from 20.8 to 55.2 months, and male-to-female ratio from 0.1 to 1.7 (Table 1, Supplementary Table S3).

Quality Evaluation of Included Studies
Based on the QUADAS-2 tool, the risk of bias regarding patient selection was unclear for 19 of the 27 studies included due to unspecified consecutive/random selection of patients in these studies. The risk of bias for the index test was unclear for 8 of the 27 studies and low for the remainder. The risk of bias for reference standard and flow/timing, as well as concerns regarding applicability, was low for all included studies (Figure 2, Supplementary Table S1).
Seventeen studies included data on pons area (Parea) in RS cohorts. A total of 1348 subjects (577 RS patients and 771 control subjects) were included in these studies. Mean Parea ranged from 417 mm² to 526 mm². M/F ratio ranged from 0.6 to 4.4. Mean age ranged from 62.5 to 74 years, and mean disease duration ranged from 12.8 to 79.2 months. All included studies reported reduced Parea, with Cohen's d ranging from −0.11 to −1.24. Overall Cohen's d for Parea was −0.80 (−0.97 to −0.63; p < 0.001) (Table 1, Figure 3, Supplementary Table S2).
Eleven studies included data on midbrain-to-pons-area ratio (M/Parea) in RS cohorts. A total of 542 subjects (213 RS patients and 329 control subjects) were included in these studies. Mean M/Parea ranged from 0.12 to 0.19. M/F ratio ranged from 0.6 to 4.4. Mean age ranged from 62.5 to 74 years, and mean disease duration ranged from 12.8 to 50.2 months. All included studies reported significantly reduced M/Parea, with Cohen's d ranging from −1.86 to −4.51. Overall Cohen's d for M/Parea was −3.02 (−3.45 to −2.58; p < 0.001) (Table 1, Figure 3, Supplementary Table S2).
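Thirteen studies included data on MRPI 1 in RS cohorts. A total of 1154 subjects (493 RS patients and 662 control subjects) were included in these studies. Mean MRPI 1 ranged from 17.6 to 27. M/F ratio ranged from 0.6 to 2.8. Mean age ranged from 62.5 to 74 years, and mean disease duration ranged from 30.7 to 50.2 months. All included studies reported significantly increased MRPI 1, with Cohen's d ranging from 1.35 to 6.94. Overall Cohen's d for MRPI 1 was 2.78 (2.05 to 3.52; p < 0.001) (Table 1, Figure 4, Supplementary Table S2).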

MSA-P
Four studies included data on midbrain area (Marea) in MSA-P cohorts. A total of 275 subjects (126 MSA-P patients and 149 control subjects) were included in these studies. Mean Marea ranged from 97.2 mm² to 153.8 mm². M/F ratio ranged from 0.3 to 1.7. Mean age ranged from 62.6 to 67 years, and mean disease duration ranged from 24.7 to 93.8 months. All included studies reported significantly decreased Marea, with Cohen's d ranging from −0.55 to −1.30. Overall Cohen's d for Marea was −0.97 (−1.34 to −0.59; p < 0.001) (Table 1, Figure 5, Supplementary Table S3).
Four studies included data on pons area (Parea) in MSA-P cohorts. A total of 275 subjects (126 MSA-P patients and 149 control subjects) were included in these studies. Mean Parea ranged from 381.6 mm² to 459 mm². M/F ratio ranged from 0.3 to 1.7. Mean age ranged from 62.6 to 67 years, and mean disease duration ranged from 24.7 to 93.8 months. All included studies reported significantly decreased Parea, with Cohen's d ranging from −1.68 to −0.54. Overall Cohen's d for Parea was −1.15 (−1.57 to −0.72; p < 0.001) (Table 1, Figure 5, Supplementary Table S3).
Three studies included data on midbrain-to-pons-area ratio (M/Parea) in MSA-P cohorts. A total of 142 subjects (66 MSA-P patients and 76 control subjects) were included in these studies. Mean M/Parea ranged from 0.24 to 0.27. M/F ratio ranged from 0.1 to 1.2. Mean age ranged from 62.6 to 67 years, and mean disease duration ranged from 24.7 to 93.8 months. Two of the included studies reported significantly increased M/Parea, whereas a single study reported a decreased M/Parea, with Cohen's d ranging from −0.13 to 0.85. Overall Cohen's d for M/Parea was 0.45 (−0.10 to 1.00; p = 0.11) (Table 1, Figure 5, Supplementary Table S3).

CBS
Meta-analysis could not be performed because none of the MRI markers had available data in >2 studies (Supplementary Table S4).

Heterogeneity
The Q statistic was used to assess the presence or absence of heterogeneity qualitatively, and the I² statistic was applied to quantify between-study heterogeneity.

Discussion
Progressive supranuclear palsy, multiple system atrophy, and corticobasal degeneration are rare neurodegenerative Parkinsonian disorders with characteristic neuropathological features, which present with multiple diverse phenotypes. Richardson's syndrome (RS) is the prototypical manifestation of PSP and is characterized by early postural instability and supranuclear gaze palsy [1]. MSA-Parkinsonism (MSA-P) manifests as predominant Parkinsonism combined with dysautonomia, cerebellar, and pyramidal signs [2]. Corticobasal syndrome (CBS) manifests with symptoms and signs of cortical (apraxia, cortical sensory deficits, alien limb phenomena) and basal ganglionic dysfunction (parkinsonism, myoclonus, dystonia) [3]. Despite the presence of these unique clinical features, patients with APD are commonly misdiagnosed, particularly early in their disease course as well as in oligosymptomatic or atypical/mixed presentations.
Neuropathological studies have supported that PSP is characterized by preferential midbrain and SCP atrophy, whereas MSA (particularly MSA-C) is characterized by pontine and MCP atrophy. To this end, most morphometric MRI markers in APD have focused on the midbrain and pons, as measured through their areas and volumes. Additionally, composite markers such as MRPI 1 and MRPI 2 have been introduced, which incorporate measurements of midbrain and pons surfaces as well as SCP and MCP widths.
Multiple MRI studies have focused on the planimetric and volumetric midbrain/pons characteristics of RS, MSA-P, and CBS. However, these studies exhibit differences in design, diagnostic criteria, patient characteristics/groups, and imaging markers applied. In order to systematically present data on these markers, we performed a systematic review of all studies on RS, MSA-P, and CBS which included at least one of the following imaging markers: midbrain area and/or volume, pons area and/or volume, midbrain-to-pons-area and/or volume ratio, and MRPI 1 and 2.
An initial conclusion of the present systematic review is that few studies have applied these MRI markers in MSA-P (n = 5) or CBS (n = 4). Meta-analysis could not be performed for any of the MRI markers in CBS because none of these markers had available data in >2 studies. For MSA-P, only three MRI markers (midbrain area, pons area, and M/Parea) had available data in >2 studies and were thus available for meta-analysis. Based on Cohen's d for pons area, MSA-P patients present with predominant pontine atrophy (Cohen's d = −1.15; p < 0.001). However, these patients also exhibit comparable midbrain atrophy (Cohen's d = −0.97; p < 0.001), which renders M/Parea an ineffective surrogate marker for MSA-P. Thus, pontine atrophy, as measured by pons surface in the midsagittal plane, is the most potent MRI marker for MSA-P.
Twenty-five studies included data on MRI markers in RS. Meta-analysis was performed for all MRI markers, except for pons volume and M/Pvol, due to lack of >2 studies with data. Midbrain area provided the greatest Cohen's d value among MRI markers (Cohen's d = −3.10; p < 0.001), followed by M/Parea (Cohen's d = −3.02; p < 0.001), MRPI 1 (Cohen's d = 2.78; p < 0.001), and MRPI 2 (Cohen's d = 2.48; p < 0.001). The greater effect size of midbrain area compared to M/Parea, MRPI 1, and MRPI 2 could be attributed to the concomitant pontine atrophy in RS (as evidenced by pons area Cohen's d = −0.80; p < 0.001). These data indicate that despite the introduction of composite MRI markers such as the MRPI, measurement of the midbrain surface remains the most effective MRI marker for PSP.
The level of evidence, based on the number of subjects included per analysis, varied between RS studies, with midbrain area (n = 1590), pons area (n = 1348), and MRPI 1 (n = 1154) analyzed in the largest samples. Meta-analysis for MSA-P studies included considerably smaller samples (n = 275) for midbrain and pons areas. Publication bias was present for midbrain area studies in RS, and heterogeneity among studies was high for multiple MRI markers.
There are certain limitations to this systematic review and meta-analysis. First, most studies included had positive results. We did not perform a systematic search of the grey literature to identify relevant unpublished negative studies. However, publication bias based on funnel plots and the Egger regression test was minimal. Additionally, negative studies on pontine area in RS and on midbrain area, pons area, and M/Parea were published and included. Lastly, the effect size for most MRI markers in RS in the included studies was consistently high, rendering the possibility of negative relevant studies unlikely.
Another limitation of this study was the inclusion of all relevant MRI studies with planimetric/volumetric brainstem data irrespective of the methodology used (i.e., planimetry methodology based on Oba et al. vs. Cosottini et al. [10,11]; inclusion or exclusion of midbrain tectum [59]; manual vs. automatic measurements [63]; 1.5 T vs. 3 T MRI [60]). The variability in these factors may have contributed to the heterogeneity of studies. However, relevant studies have supported excellent agreement between automated and manual measurements and between MRI scanners of different field strengths. Only two of the included studies had a prospective design, with most studies being either retrospective or undefined. Thus, the temporal pattern of atrophy based on MRI markers of this meta-analysis cannot be deduced. The studies included in this meta-analysis applied different MRI acquisition protocols with varied TR, TE, FOV, and slice thickness values (Supplementary Table S5). Differences in resulting planimetric or volumetric measurements due to these MRI acquisition protocol differences could not be tested due to the great variability between studies.
Rarely, midline structural lesions, such as vascular malformations, tumors, or traumatic lesions, may result in phenotypes mimicking atypical Parkinsonism. All studies included in this meta-analysis excluded patients with structural lesions. Application of the MRI markers discussed in this review implies the absence of such lesions. Lastly, all included studies rely on the classification of patients based on established clinical diagnostic criteria, in the absence of neuropathological confirmation, rendering misdiagnosis of patients possible.

Figure 1. Flow chart of study selection according to preferred reporting items for systematic reviews and meta-analyses (PRISMA) criteria.


Figure 2. (a) Risk of bias and (b) concerns regarding the applicability of studies for patient selection, index test, reference standard, and flow/timing based on the QUADAS-2 tool.


Table 1. Data regarding study design (Pr: prospective; Ret: retrospective; NC: not clarified), period of recruitment, male to female ratio, mean age (years), mean disease duration (months), and main findings of all studies included in the meta-analysis. NA: not available; dur: duration; ctrls: control subjects.

Three studies included data on MRPI 2 in RS cohorts. A total of 229 subjects (127 RS patients and 102 control subjects) were included in these studies. Mean MRPI 2 ranged from 17.6 to 27. M/F ratio ranged from 1.0 to 1.7. Mean age ranged from 70.4 to 74 years, and mean disease duration was 46.8 months. All included studies reported significantly increased MRPI 2, with Cohen's d ranging from 1.87 to 3.11. Overall Cohen's d for MRPI 2 was 2.48 (1.80 to 3.172; p < 0.001) (Table 1, Figure 4, Supplementary Table S2).
Four studies included data on midbrain volume (Mvol) in RS cohorts. A total of 304 subjects (164 RS patients and 140 control subjects) were included in these studies. Mean Mvol ranged from 17.6 to 27. M/F ratio ranged from 1.1 to 1.3. Mean age ranged from 65.1 to 72 years, and mean disease duration ranged from 38.4 to 55.2 months. All included studies reported significantly decreased Mvol, with Cohen's d ranging from −1.51 to −2.74. Overall Cohen's d for Mvol was −1.99 (−2.27 to −1.71; p < 0.001) (Figure