Middle Cerebral Artery Doppler Velocimetry for the Diagnosis of Twin Anemia Polycythemia Sequence: A Systematic Review

Twin anemia polycythemia sequence (TAPS) is a rare complication of monochorionic diamniotic (MCDA) twins. Middle cerebral artery peak systolic velocity (MCA-PSV) measurements are used to screen for TAPS, while fetal or neonatal hemoglobin levels are required for definitive diagnosis. We sought to perform a systematic review of the efficacy of MCA-PSV in diagnosing TAPS. Search criteria were developed using relevant terms to query the PubMed, Embase, and SCOPUS electronic databases. Publications reporting diagnostic characteristics of MCA-PSV measurements (i.e., sensitivity, specificity, or receiver operating characteristic curves) were included. Each article was assessed for bias using the Quality Assessment of Diagnostic Accuracy Studies II (QUADAS II) tool. Results were assessed for uniformity to determine whether meta-analysis was feasible. Data were presented in tabular form. Five publications met the inclusion criteria. QUADAS II analysis revealed that four of the publications were highly likely to have bias in multiple areas. Meta-analysis was precluded by non-uniformity between definitions of TAPS by MCA-PSV and neonatal or fetal hemoglobin levels. High-quality prospective studies with consistent definitions and ultrasound surveillance protocols are still required to determine the efficacy of MCA-PSV in diagnosing TAPS. Other ultrasound findings (e.g., placental echogenicity discordance) may augment Doppler studies.

Routine use of MCA Doppler velocimetry to detect TAPS is currently recommended by the International Society of Ultrasound in Obstetrics and Gynecology (ISUOG) [13]. However, in 2013 the Society for Maternal-Fetal Medicine (SMFM) recommended against this practice, stating that there is "no evidence that monitoring for TAPS with MCA PSV Doppler at any time, including >26 weeks, improves outcomes" [14]. Since 2013, TAPS has received considerable attention in the literature and the first randomized clinical trial for treatment of TAPS is currently underway [15]. Following these developments, there has been increased enthusiasm for screening protocols utilizing serial MCA Doppler velocimetry [16].
MCA Doppler velocimetry is also the principal screening test for fetal anemia in the setting of maternal alloimmunization to fetal red blood cell antigens [17]. The use of MCA-PSV values to screen for fetal anemia from alloimmunization has been established following extensive research, although a false positive rate of 10% persists [18]. The method, which may be affected by a range of physiologic parameters, is less well studied in TAPS, which is a unique physiologic entity. Furthermore, MCA Doppler velocimetry may be less efficacious for detecting polycythemia than for detecting anemia [19]. Finally, the definitive diagnosis of TAPS and verification of MCA Doppler velocimetry as a screening tool are confounded at early GAs by the inability to sample fetal Hb due to technical limitations and concern regarding perinatal risks. In this study, we sought to exhaustively review the literature to determine the diagnostic efficacy of MCA Doppler velocimetry in screening for TAPS.

Materials and Methods
We performed a systematic review following PRISMA guidelines [20]. The research question of interest was to determine the sensitivity, specificity, positive predictive value, and negative predictive value of MCA Doppler velocimetry for antenatal diagnosis of TAPS. Eligible studies included those where MCA-PSV values were collected to screen for TAPS and fetal or neonatal serum Hb concentrations were also collected as the "gold standard" diagnostic test. Studies written in English during or after 2006 (i.e., when TAPS was first described in the literature) were considered [1]. Case reports, case series and literature that did not undergo peer review (i.e., abstracts, clinical commentary) were excluded.
To locate eligible studies, a search strategy was developed and applied to the PubMed, Embase and SCOPUS electronic databases. The following terms were queried in April of 2020: "twin anemia polycythemia sequence," "TAPS" and "feto-fetal transfusion." In the SCOPUS search, only publications in the "Medicine" and "Health Professions" subject areas were included. Abstracts for all publications retrieved by this search were reviewed by two of the authors (C.B. and E.B.). Publications that, based on abstract review, appeared likely to include antenatal screening for TAPS were obtained for complete review. A secondary search (i.e., "by hand") was performed by reviewing the bibliographies of these publications. The aim of this process was to generate a comprehensive list of primary studies suitable to answer the research question.
Documents retrieved by the primary and secondary searches were reviewed in their entirety. Two independent reviewers (C.B. and E.B.) carefully examined each document and extracted relevant data including the author, PMID number, year of publication, journal and study type (i.e., prospective vs. retrospective). Studies that compared antenatal screening for TAPS by MCA-PSV values to fetal or neonatal serum Hb levels, including either sensitivity/specificity calculations or derivation of a receiver operating characteristic (ROC) curve, were considered for analysis. For these studies, we extracted study size, number of TAPS cases, number of plTAPS and sTAPS, stage and GA at TAPS diagnosis and the date range over which women were clinically evaluated. Outcome measures including the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and area under the ROC curve were also collected. Uniformity of the studies was assessed for the possibility of performing meta-analysis. Because determination of fetal Hb levels may not be possible for pregnancies at earlier GAs, it was also noted whether calculations were stratified by GA. Any discrepancies in extracted data or study inclusion were arbitrated and resolved by a third author (R.P.).
The Quality Assessment of Diagnostic Accuracy Studies II (QUADAS II) methodology was applied to the selected studies to determine their likelihood of bias and applicability in answering the research question [21]. QUADAS II includes a structured list of 11 "signaling questions" evaluating for bias in the following four domains: (1) patient selection, (2) collection and interpretation of an index test (i.e., MCA-PSV to screen for TAPS), (3) collection and interpretation of a reference standard (i.e., Hb levels to definitively diagnose TAPS), (4) flow/timing of the study. Questions may be answered "yes," "no," or "unclear." Additional questions may be added at the users' discretion to address possible biases unique to the research question. We added two questions: (1) "Was a clearly defined schedule described for timing of initial and follow-up application of the index test?" (2) "Were reference test values from repeat intrauterine transfusions or reference test values following birth in neonates who had a previous intrauterine transfusion used to calculate sensitivity/specificity?" MCA-PSV is usually measured serially as recommended by ISUOG; the first question was added because a missed diagnosis following inadequate antenatal screening may lead to underestimation of the test's sensitivity. The second question was added for two reasons: (1) Hb values at the time of intervention, not postnatal Hb values, should be compared with MCA-PSV values at the time of TAPS diagnosis for fetuses that underwent transfusion. (2) The efficacy of MCA-PSV values in predicting fetal anemia decreases following intrauterine transfusion [22]. Based on answers to these 13 questions (see Table 1), the likelihood of bias in the four domains was assessed as "HIGH," "LOW," or "UNCLEAR." Similarly, but without use of signaling questions, concerns about applicability of the studies in the first three domains (patient selection, index test, and reference standard) were assessed as "HIGH," "LOW," or "UNCLEAR."
The aim of this study was to collate and summarize the results of primary studies and, if possible, perform meta-analysis to estimate the pooled sensitivity and specificity of MCA-PSV values for diagnosing TAPS. Data were summarized in tabular form. Microsoft Excel (Redmond, WA) was used to tabulate search results, extracted data, and QUADAS II results.

Results
The primary search of the PubMed, Embase, and SCOPUS databases produced 432 publications. Based on abstract review, 54 of these publications were selected to be reviewed in their entirety [1,2,4,5,7,9-12,16,19,. The secondary search of bibliographies produced three additional publications, which were also reviewed in their entirety, for a total of 57 publications [6,66,67]. The publications are detailed in Table 1 and reasons for exclusion are summarized in Figure 1. Five of the 57 publications, all from the primary search, described a study of MCDA twins where serial ultrasounds with MCA-PSV values were collected to detect TAPS with direct comparison to Hb values [5,9,10,19,47].
After applying QUADAS II methodology to these five studies, four were determined highly likely to have bias in patient selection (Figure 2) [5,9,10,47]. Three were found highly likely to have bias in the conduct or interpretation of the index test (i.e., MCA-PSV Doppler values) as well as the conduct or interpretation of the reference standard (i.e., Hb levels) [5,10,47]. All of the studies were determined to have low likelihood of bias in flow and timing. The answers to the QUADAS II signaling questions provide rationale for these assessments of bias and are detailed in Table 2.
Regarding patient selection, studies by Slaghekke et al. and Veujoz et al. only included women with a diagnosis of TAPS, without any uncomplicated MCDA twins [9,47]. Hence, the numbers of true negatives and false positives (i.e., women without TAPS) appear to have been derived from women who were ultimately diagnosed with TAPS, using MCA-PSV values and Hb levels collected either prior to diagnosis or after treatment and resolution of the disease. Studies by De Sousa et al. and Tollenaar et al. excluded twins that were missing either MCA-PSV values or Hb levels [5,10]. As protocols for serial MCA-PSV analysis (or compliance with such a protocol) over the study periods were not available, the reasons these values were missing, and women excluded, could not be determined. One study appeared to include postnatal Hb levels to calculate sensitivity, specificity, PPV, and NPV in twins that had a prior IUT, and another used MCA-PSV/Hb pairs from the time of repeat IUT for the same purpose [10,47]. None of the studies were found to have concerns regarding the applicability of patient selection, the index test, or the reference standard.
Figure 2 Legend: Results of QUADAS II application. QUADAS II consists of four assessments of bias and three assessments of applicability to the study question. A judgement of "HIGH," "LOW," or "UNCLEAR" is made for each assessment of each study considered. A "HIGH" judgement means that the study methodology is likely to introduce bias (for bias assessments) or likely lacks applicability to the study question (for applicability assessments). A "LOW" judgement means there is less likelihood of bias or lack of applicability. Answers to the "signaling questions" that support the bias judgements are detailed in Table 3.
Demographic characteristics of women from each of the selected studies are detailed in Table 3. The studies included 779 women and 122 cases of TAPS. There may be overlap in TAPS cases between studies by Tollenaar et al. and Slaghekke et al. as they are from the same institution (Leiden University, Netherlands) with overlapping study periods [10,47]. All but one study (Slaghekke et al.) reported the incidence of TAPS, which ranged from 4.6% to 35.3% (sTAPS and plTAPS together). Tollenaar et al. did not report the incidence of plTAPS and sTAPS separately for all cases (see footnote, Table 3) [10]. Among the remaining studies, the incidences of plTAPS (total n = 44) and sTAPS (total n = 43) were reported separately, but both types were included together in calculations of sensitivity, specificity, PPV and NPV. Among studies where plTAPS and sTAPS were delineated and incidence was reported, the combined incidence of sTAPS was 4.8% (31/643). Most cases of TAPS were diagnosed after 24 weeks, with delivery (mostly by Cesarean section) in the late third trimester (i.e., >32 weeks). Only Veujoz et al. described the disease severity by stages [9].
Table 4 shows sensitivity, specificity, PPV, and NPV of MCA-PSV values in predicting the diagnosis of TAPS by differences in intertwin Hb levels. All studies limited analysis to patients where an MCA-PSV value was collected within one week of determining Hb levels by cordocentesis [5,9,47]. Collectively, four different criteria for a positive MCA-PSV screening test were used between the five studies. None of the studies stratified calculations of the diagnostic characteristics of MCA-PSV values by GA or treatment received.
* A retrospective study design was applied without explanation of a standardized protocol for the timing of either the index test (MCA-PSV) or reference standard (Hb) during the retrospective period. Patients were not included when MCA-PSV (73/256, 29%) or Hb data (3/256, 1%) were not available. Patients were also not included if the interval between MCA-PSV and Hb data was greater than one week.
† These studies were limited to patients with TAPS (i.e., not consecutive or random MCDA twins). Sensitivity and specificity were calculated using Hb and MCA-PSV values at times when patients met, versus did not meet, the criteria for diagnosis (i.e., multiple MCA-PSV, Hb pairs per patient). Slaghekke et al. did not include patients with missing data for Hb or MCA-PSV.
‡ A retrospective study design was applied without explanation of a standardized protocol for the timing of either the index test (MCA-PSV) or reference standard (Hb) during the retrospective period. Patients were excluded when MCA-PSV (221/351, 63%) or Hb (50/351, 14%) data were not available. Patients were also excluded if the interval between MCA-PSV and Hb data was greater than one week.
§ These studies did not describe a schedule for application of MCA-PSV measurement (i.e., patients considered, starting gestational age and frequency of follow up) or report ultrasound findings that might have prompted more frequent Doppler studies.
¶ Sensitivity and specificity calculations were performed using "postnatal intertwin Hb difference"; however, 10 cases of intrauterine transfusion were reported.
# Sensitivity and specificity calculations were performed using MCA-PSV/Hb pairs at the time of repeat IUT.
Abbreviations: AUC = area under the curve, Hb = hemoglobin concentration, MCA-PSV = middle cerebral artery peak systolic velocity, ∆MCA-PSV = difference in MCA-PSV between twins, MoM = multiples of the median, TAPS = twin anemia polycythemia sequence, wk = week.
The criteria for definitive diagnosis of TAPS were similarly heterogeneous. Each study included numerical cutoffs for diagnosis; however, three evaluated the difference in Hb levels between twins for diagnosis of TAPS, while the remaining two used individual Hb levels to diagnose anemia and polycythemia separately. Two of the studies also required the reticulocyte count ratio of the polycythemic twin to the anemic twin to be greater than 1.7 [9,10]. Collectively, the five studies had four different definitions of definitive TAPS. The sensitivities, specificities, PPVs, and NPVs were reported for four of the five studies, while Bartal et al. reported ROC characteristics [19]. Finally, none of the studies stratified any of these calculations by GA or intrauterine treatment.
Heterogeneity between the studies precluded meta-analysis for pooled estimates of PPV, NPV, sensitivity, or specificity.
The above findings suggest significant limitations in the diagnosis and management of TAPS. First, false positive and false negative diagnoses are of major concern. Currently, fetal intervention may be undertaken based on MCA Doppler velocimetry, which appears to have PPVs ranging from 70% to 100%; however, these PPVs are estimated in retrospective studies with a high likelihood of bias, particularly in patient selection [5,10,47]. This is not surprising as each study includes data from before 2016, when ISUOG first recommended universal serial MCA Doppler velocimetry in MCDA twins [13]. Prospective data by Fishel-Bartal et al. suggest that absolute MCA-PSV values perform modestly in diagnosing TAPS (AUC = 0.687, 95% CI (0.547-0.827) for anemia and AUC = 0.617, 95% CI (0.505-0.728) for polycythemia) [19]. While the same study shows better performance for relative MCA-PSV values (i.e., ∆MCA-PSV, AUC = 0.871, 95% CI (0.757-0.985)), these data are yet to be corroborated in another high-quality prospective study. Furthermore, the incidence of sTAPS in this study (10.1%) is more than twice as high as reported elsewhere (1.2-4.9%, except for De Sousa et al. with an incidence of 9.1%), increasing the likelihood that PPV is overestimated [2-7]. Hence, the risk of intervention for TAPS when disease is not truly present may be substantial. As the stage of TAPS is only reported in one of the studies, the degree to which MCA-PSV values may over- or underestimate severe outcomes, which provide greater impetus for intervention, is also unclear.
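The dependence of PPV on disease incidence noted above can be illustrated with a minimal sketch based on Bayes' theorem. The sensitivity and specificity used here (0.80 and 0.95) are hypothetical values chosen only for illustration and are not taken from any of the reviewed studies; the three incidences are those quoted in the text (1.2%, 4.9% and 10.1%). The function name `ppv_from_prevalence` is likewise an assumption of this example.

```python
def ppv_from_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from test characteristics and disease prevalence."""
    true_pos = sensitivity * prevalence                    # P(test+, disease+)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # P(test+, disease-)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics (assumed for illustration only).
SENSITIVITY = 0.80
SPECIFICITY = 0.95

# sTAPS incidences quoted in the text: lowest and highest reported elsewhere,
# and the incidence in the prospective study.
for prevalence in (0.012, 0.049, 0.101):
    ppv = ppv_from_prevalence(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"incidence {prevalence:.1%}: PPV {ppv:.1%}")
```

With sensitivity and specificity held fixed, the PPV rises with incidence, so a PPV estimated in a cohort with an unusually high incidence may not transfer to a population in which TAPS is rarer.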
Of note, the work by Fishel-Bartal et al. highlights an emerging trend of favoring ∆MCA-PSV over absolute MCA-PSV values because of the former's reported higher sensitivity [19]. Tollenaar et al. retrospectively observed that twins meeting ∆MCA-PSV criteria, but not absolute criteria, have similar postnatal outcomes to twins meeting absolute criteria. This led that group to propose a new staging system for TAPS based on ∆MCA-PSV [10]. Fishel-Bartal et al. suggest that the superiority of ∆MCA-PSV values may be related to the poor predictive ability of MCA-PSV < 1.0 MoM in diagnosing polycythemia; however, Slaghekke et al. report high sensitivity and specificity using this method (Table 4) [19,47]. Further prospective data are required to validate (or invalidate) MCA-PSV < 1.0 MoM and ∆MCA-PSV in diagnosing polycythemia and TAPS, respectively.
The evaluation of diagnostic performance is further complicated by the inability to measure Hb levels when TAPS is diagnosed at an early GA or following FLS. The treatment algorithm published in Leiden recommends FLS prior to 28 weeks, preterm delivery after 32 weeks, and IUT with PET at intermediate GAs for Stage II or greater TAPS [42]. At GAs greater than 32 weeks, MCA-PSV values and Hb levels may be measured in close temporal proximity because delivery is likely to occur soon regardless of whether TAPS is diagnosed (Figure 3A). Thus, the incidence of all four possible outcomes relating MCA-PSV to Hb (true and false positives, true and false negatives) may be determined. These values are required to determine sensitivity, specificity, PPV, and NPV.
For GAs of 28 to 32 weeks, Hb would only be measured when IUT/PET is performed in the setting of Stage II or greater TAPS. As such, true and false negatives cannot be determined (Figure 3B). It is possible that undiagnosed TAPS (i.e., false negatives) accounts for some portion of the fetal demise observed in otherwise normal appearing MCDA twins [68-70]. For GAs less than 28 weeks, none of the four possible outcomes may be determined because Hb levels are not typically measured in the setting of FLS (Figure 3C). The diagnostic efficacy of MCA-PSV values cannot be determined if the gold standard diagnostic test (i.e., Hb levels) is unavailable. Using one or more surrogate markers (a so-called "gold alloy") for diagnosis, such as the "starry liver sign", discordant placental echogenicity, high-output cardiac failure or polycythemic thrombosis (i.e., fetal limb necrosis, bowel infarction with perforation or intracranial hemorrhage), is one method to address such diagnostic quandaries; however, use of this approach was not encountered in our search of the literature [29,71,72].
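As a minimal sketch of how the four outcomes translate into the reported measures, the following computes sensitivity, specificity, PPV and NPV from a 2x2 table. The class and function names are assumptions of this example, and the counts are hypothetical and not drawn from any reviewed study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts relating the index test (MCA-PSV) to the reference standard (Hb)."""
    tp: int  # screen positive, TAPS confirmed by Hb
    fp: int  # screen positive, TAPS not confirmed by Hb
    fn: int  # screen negative, TAPS confirmed by Hb
    tn: int  # screen negative, TAPS not confirmed by Hb


def diagnostic_metrics(table: TwoByTwo) -> dict:
    """Return sensitivity, specificity, PPV and NPV as proportions."""
    def ratio(numerator: int, denominator: int) -> float:
        # An empty denominator means the measure is undefined, not zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(table.tp, table.tp + table.fn),
        "specificity": ratio(table.tn, table.tn + table.fp),
        "ppv": ratio(table.tp, table.tp + table.fp),
        "npv": ratio(table.tn, table.tn + table.fn),
    }


# Hypothetical counts for illustration only.
example = TwoByTwo(tp=8, fp=2, fn=2, tn=88)
metrics = diagnostic_metrics(example)
```

Each measure draws on a different pair of cells, which is why a missing cell (for example, unverified screen-negative twins) leaves some measures undefined rather than merely imprecise.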
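The consequence of this GA-dependent verification can be sketched as follows: given which cells of the 2x2 table can be confirmed by Hb in each GA window, the code lists the measures that remain estimable. The mapping of windows to observable cells is a simplified reading of the text above (it treats every screen-positive pregnancy at 28 to 32 weeks as verified by Hb at IUT/PET), and all names are assumptions of this example.

```python
# Cells of the 2x2 table that each measure requires.
REQUIRED_CELLS = {
    "sensitivity": {"TP", "FN"},
    "specificity": {"TN", "FP"},
    "ppv": {"TP", "FP"},
    "npv": {"TN", "FN"},
}

# Cells verifiable by Hb in each GA window (simplified from the text):
# no Hb with FLS before 28 wk, Hb only in screen positives at 28-32 wk,
# Hb in all twins delivered after 32 wk.
OBSERVABLE_CELLS = {
    "<28 wk": set(),
    "28-32 wk": {"TP", "FP"},
    ">32 wk": {"TP", "FP", "TN", "FN"},
}


def estimable_metrics(observable: set) -> list:
    """Measures whose required cells are all observable, in alphabetical order."""
    return sorted(
        name for name, cells in REQUIRED_CELLS.items() if cells <= observable
    )


estimable_by_window = {
    window: estimable_metrics(cells) for window, cells in OBSERVABLE_CELLS.items()
}
```

Under these assumptions only PPV would be estimable at 28 to 32 weeks, and none of the four measures before 28 weeks, which is the gap that surrogate markers would be needed to fill.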
Following diagnosis, our understanding of the role for various intrauterine treatments is evolving. A recent multicenter study (n = 370) reports a high level of variance in the treatments used (i.e., FLS, IUT/PET, expectant management, early delivery and selective feticide) and the GAs at which they are employed. An advantage in pregnancy prolongation is reported for FLS, but comparative treatments were performed at later GAs [73]. Several smaller studies also address treatment (total n = 185); however, each combined plTAPS with sTAPS, had unsystematic methods of patient inclusion, and directly compared treatments performed across the full range of gestational ages [11,26,52].
Further work is required to provide a valid basis for the diagnosis and treatment of TAPS. sTAPS and plTAPS should be considered separately, as few data exist to indicate whether MCA-PSV values behave similarly in these two physiologically distinct scenarios. We are unaware of any data that separate the two entities as part of an evaluation of MCA-PSV as a screening test. Reliable false positive rates for MCA-PSV may help determine whether the risk of intervention is appropriate. At early GAs, surrogate markers for the disease will be required to estimate the incidence of false positive diagnosis. Further work is also required to elucidate the benefits of intrauterine treatment. While grave outcomes such as neonatal death, ischemic loss of fetal limbs, and cerebellar disruption have been described, it is unclear whether a treatment protocol based on MCA-PSV values can prevent these outcomes [1,74-76].
In summary, this systematic review of MCA-PSV values for diagnosis of TAPS reveals that studies supporting this methodology are dissimilar and at risk for bias. The resulting estimates of the PPV of MCA-PSV measurements may be unreliable, and thus lead to attempted treatment of normal twins, particularly in the setting of Stage II disease. Further work is required to reliably estimate the diagnostic performance of MCA-PSV values and determine the risks and benefits of intrauterine treatment of TAPS. Therefore, given the limited strength of the supporting data in current practice, it is pragmatic to continue surveillance with antenatal monitoring of MCA-PSV for MCDA twins, but to interpret the results with caution. Intervention should be more strongly considered when additional abnormalities are noted, such as cardiac failure, abnormal Doppler flow in the umbilical artery or ductus venosus, discordant placental echogenicity or hydrops. Even in the presence of normal MCA-PSV in MCDA twins, it is prudent to consider evaluating for other signs of TAPS during ultrasound surveillance. Until more rigorous, large, prospective and well-controlled studies are conducted, many questions related to screening for TAPS will remain unanswered.

Conflicts of Interest:
The authors report no conflict of interest.