Prognostic Value of Measurable Residual Disease in Patients with AML Undergoing HSCT: A Multicenter Study

Simple Summary: In patients diagnosed with acute myeloid leukemia (AML), relapse remains the main cause of mortality after allogeneic hematopoietic stem cell transplantation (HSCT). The detection of measurable residual disease (MRD) by multiparameter flow cytometry (MFC) in AML patients undergoing HSCT is a powerful predictor of outcome. The aim of this retrospective multicenter study was to evaluate the prognostic value of MRD by second-generation MFC among patients undergoing HSCT, using recommendations from the Euroflow consortium. MRD levels prior to transplantation significantly influenced outcomes irrespective of the conditioning regimen. Positive MRD on day +100 after transplantation was associated with an extremely poor prognosis. Detection of positive MRD prior to and after transplantation, performed under standardized technical conditions, has prognostic value in real life.

Abstract: Allogeneic hematopoietic stem cell transplantation (HSCT) represents the best therapeutic option for many patients with acute myeloid leukemia (AML). However, relapse remains the main cause of mortality after transplantation. The detection of measurable residual disease (MRD) by multiparameter flow cytometry (MFC) in AML, before and after HSCT, has been described as a powerful predictor of outcome. Nevertheless, multicenter and standardized studies are lacking. A retrospective analysis was performed, including 295 AML patients undergoing HSCT in 4 centers that worked according to recommendations from the Euroflow consortium. Among patients in complete remission (CR), MRD levels prior to transplantation significantly influenced outcomes, with overall survival (OS) and leukemia-free survival (LFS) at 2 years of 76.7% and 67.6% for MRD-negative patients, 68.5% and 49.7% for MRD-low patients (MRD < 0.1), and 50.5% and 36.6% for MRD-high patients (MRD ≥ 0.1), respectively (p < 0.001). MRD level influenced the outcome irrespective of the conditioning regimen. In our patient cohort, positive MRD on day +100 after transplantation was associated with an extremely poor prognosis, with a cumulative incidence of relapse of 93.3%. In conclusion, our multicenter study confirms the prognostic value of MRD performed in accordance with standardized recommendations.


Introduction
Acute myeloid leukemia (AML) is a heterogeneous disease with different molecular and prognostic characteristics [1]. Allogeneic hematopoietic stem cell transplantation (HSCT) represents the best therapeutic option for many patients with intermediate- or high-risk AML [2]. However, relapse remains the main cause of mortality after transplantation.
The prognostic value of measurable residual disease (MRD) in AML patients is well recognized. Accordingly, the 2017 European LeukemiaNet recommendations (ELN2017) introduced a new response category, complete remission (CR) without minimal residual disease [1]. The techniques used for MRD monitoring must be applicable, sensitive, specific, and reproducible. In this regard, quantitative polymerase chain reaction (PCR) and, more recently, next-generation sequencing and digital PCR have a high sensitivity and applicability. However, they are limited to patients with certain genetic alterations; therefore, their applicability in routine clinical practice can be reduced. Multiparametric flow cytometry (MFC) is almost universally applicable but displays a lower sensitivity; the development of next-generation flow (NGF) makes it possible to achieve a sensitivity similar to that of molecular techniques [3]. Nevertheless, NGF is only validated in multiple myeloma and B acute lymphoblastic leukemia [4,5]. A high level of expertise is needed to perform MFC-MRD, and harmonization of the technical aspects of MRD measurement is necessary. According to the ELN consensus, MRD by MFC must integrate the "leukemia-associated phenotype (LAP)" assessment with the "different from normal (DfN)" approach [1,6]. Other recommendations include using a cut-off point of 0.1% to define MRD as positive, the use of prospectively validated panels, such as those of the Euroflow consortium [7], and the use of standardized flow cytometers [8]. However, there are no studies reporting routine clinical practice based on these standards.
In 2022, the update of the ELN classification emphasized the relevance of the early MRD evaluation that can modify the individual risk classification [9]. In addition, ELN 2017 and 2022 imply a broad genetic characterization, which is not considered in this study, since it includes patients transplanted from 2012 to 2020, who are therefore classified according to ELN 2011.
We performed a retrospective multicenter study to evaluate the prognostic value of MRD by second-generation MFC among patients undergoing HSCT, using recommendations from the Euroflow consortium.

Patients
A retrospective analysis was performed, including 295 AML patients treated according to PETHEMA protocols and undergoing HSCT from 2012 to 2020 in 4 transplant centers. CR was defined, based on ELN2011, as less than 5% blasts in bone marrow (BM) and no evidence of extramedullary leukemia [17]. All patients who had a flow cytometric evaluation prior to transplantation were included in the analysis. All patients provided written informed consent in accordance with the Declaration of Helsinki. This study was approved by a formally constituted review board: C.P. S2200072, C.I. 0466-N-22.

Detection of Measurable Residual Disease by Second Generation Flow Cytometry
BM aspirate was obtained before starting the conditioning regimen and at day +100 after HSCT. MRD assessment was carried out using 8-color panels based on Euroflow protocols [7]. The Euroflow consortium periodically organizes an Internal Quality Assurance (QA) program to verify that the laboratories work in a standardized way and according to the published recommendations. Two centers included in the study have participated in the QA program since 2015, and another one since 2016. Supplementary Table S1 shows the information from the QA program. All the centers carry out daily calibration and stability controls of the cytometers with CS&T (Becton Dickinson, San Jose, CA, USA) and SPHERO™ Rainbow Calibration Particles, EuroFlow™. If a control fails, and the failure is not due to a fluidics problem, the technical service is notified. Normal cells were used as an internal control, case by case. Fifty µL of BM per tube were stained with monoclonal antibodies. After 30 min of incubation at room temperature in the dark, erythrocytes were lysed, and the sample was washed. The samples were acquired in 8-color digital cytometers FACSCanto II (Becton Dickinson, San Jose, CA, USA). Cytometers were calibrated and compensated according to Euroflow protocols using the Diva software (Becton Dickinson) [8]. More than five hundred thousand viable cells per tube were acquired. For the design of the MRD panels, the different laboratories used tubes from the myelodysplastic syndrome (MDS) panel of Euroflow (T1, T2, and T3 in more than 80% of cases, ± T4 when lymphoid markers were expressed or when the phenotype at diagnosis was not available), along with additional tubes, depending on the patient's specific "leukemia-associated phenotype" (LAP). The tubes covering the LAP had to include the 4 backbone markers, HLADR, CD45, CD34, and CD117, together with another 4 antibodies at the discretion of each laboratory, in accordance with the ELN recommendations for MRD [6].
In 240 cases, tubes were added to the maturation tubes of the Euroflow MDS diagnostic panel. The combinations for LAP analysis are shown in Supplementary Table S2. When the phenotype of the blasts at diagnosis corresponded to monocytes, more than 1 tube associated with LAP was designated.
Analysis was performed with Infinicyt software 1.6 to 2.0 (Cytognos). Both the LAP assessment and the "different from normal (DfN)" approach were used to analyze MRD according to the criteria of each laboratory. Any measurable MRD level was considered positive. A representative MRD image is shown in Supplementary Figure S1. The abnormal population was quantified as a percentage of the total viable cells (including the erythroblast population). There was no reference laboratory; each laboratory independently carried out the analysis of its own samples. All centers followed the same initial population selection strategy based on the 4 common markers. The 4 centers confirmed the MRD in more than 1 tube. MRD was considered assessable when the peripheral blood cell counts had recovered. Maturation patterns were analyzed. MRD assays were performed by 2 experts, and most included the lead analyst. A comparative study of 6 MRD samples was carried out between the 4 centers (Supplementary Table S3). All MRD assays included in this study achieved a sensitivity of 0.1%, and more than 90% reached 0.01%. The level of sensitivity was determined based on the following conditions: number of acquired events, viable cells, patient's LAP if available, and bone marrow status (representation of bone marrow cells: mast cells, plasma cells, B precursors, etc.). We considered the marrow evaluable for analysis if mast cells, red series, and less than 80% mature neutrophils were found. A cluster of 20 cells with phenotypic abnormalities was required for the detection of MRD.
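The relationship between the minimum cluster size and the number of acquired events can be illustrated with a short calculation. The sketch below is only a simplified illustration, not the laboratories' procedure: it divides the 20-cell cluster by the 500,000 viable cells quoted above to obtain a theoretical lower limit of detection. The function name is hypothetical, and the sensitivity actually assigned to each sample also depended on the LAP and the bone marrow status, which this arithmetic ignores.

```python
def detection_limit_percent(min_cluster_events: int, total_viable_events: int) -> float:
    """Theoretical lowest detectable abnormal-cell frequency, as a percentage
    of total viable cells (hypothetical helper, for illustration only)."""
    if total_viable_events <= 0:
        raise ValueError("total_viable_events must be positive")
    return 100.0 * min_cluster_events / total_viable_events


# Values taken from the text: a cluster of 20 abnormal cells and
# 500,000 viable cells acquired per tube (the stated minimum).
limit = detection_limit_percent(20, 500_000)
print(f"{limit:.3f}%")  # 0.004%
```

Under these assumptions, the theoretical limit lies below 0.01%, which is compatible with the statement that more than 90% of the assays reached that sensitivity.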

Statistical Analysis
The objective of our study was to evaluate the impact of MRD on the outcome of patients with AML undergoing HSCT. The following endpoints were analyzed: cumulative incidence of relapse (CIR), acute and chronic graft versus host disease (aGvHD and cGvHD), non-relapse mortality (NRM), overall survival (OS), and leukemia-free survival (LFS). Probabilities of OS and LFS were calculated using the Kaplan-Meier method. Cumulative incidence functions were used to estimate CIR, NRM, aGvHD, and cGvHD rates in the setting of competing risks. OS was calculated from the time of transplantation to death from any cause, and those who survived were censored at last follow-up. LFS was calculated from the time of transplantation to relapse or death, and those patients who did not obtain a CR were considered events. Neutrophil recovery was defined as more than 0.5 × 10⁹/L neutrophils for at least 2 consecutive days, and platelet engraftment as more than 20 × 10⁹/L platelets for at least 2 consecutive days. Comparisons were performed using the log-rank test for LFS and OS, and Gray's test for CIR and NRM. The impact of age, type of conditioning, GvHD prophylaxis, ELN classification, and MRD (according to ELN criteria) was evaluated both in univariate and multivariate analysis. Cox proportional hazards regression models were constructed for OS and LFS. GvHD was diagnosed according to NIH criteria. The occurrence of chronic GvHD was treated as a time-dependent covariate. Patients who lived more than 100 days after transplantation were evaluated for chronic GvHD. Cumulative incidences were calculated with the cmprsk package for R version 2.14.0, and other analyses were performed using SPSS 20.0 and Stata 14.2.
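To make the two estimators named above concrete, the following minimal sketch implements the Kaplan-Meier product-limit estimate and a nonparametric cumulative incidence function for competing risks in plain Python. It is not the code used in the study, which relied on the cmprsk package for R, SPSS, and Stata; the function names, the cause coding (0 = censored, 1 = relapse, 2 = non-relapse death), and the five-patient toy data set are assumptions made purely for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if the event occurred, 0 if the patient was censored
    """
    exits = Counter(times)  # patients leaving the risk set at each time
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    at_risk, surv, curve = len(times), 1.0, []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


def cumulative_incidence(times, causes, cause_of_interest):
    """Return [(t, CIF(t))] for one cause in the presence of competing risks.

    causes -- 0 if censored, otherwise an integer code for the event type
    """
    exits = Counter(times)
    by_cause = Counter((t, c) for t, c in zip(times, causes) if c != 0)
    at_risk, event_free, cif, curve = len(times), 1.0, 0.0, []
    for t in sorted(exits):
        d_interest = by_cause.get((t, cause_of_interest), 0)
        d_any = sum(n for (tt, _), n in by_cause.items() if tt == t)
        if d_interest:
            # Probability of being event-free just before t, times the hazard at t.
            cif += event_free * d_interest / at_risk
            curve.append((t, cif))
        event_free *= 1.0 - d_any / at_risk
        at_risk -= exits[t]
    return curve


# Hypothetical toy data (months of follow-up), not taken from the study.
times = [2, 3, 3, 5, 8]
causes = [1, 0, 2, 1, 0]  # 0 = censored, 1 = relapse, 2 = non-relapse death
any_event = [1 if c else 0 for c in causes]

lfs = kaplan_meier(times, any_event)
cir = cumulative_incidence(times, causes, 1)
nrm = cumulative_incidence(times, causes, 2)
print([(t, round(s, 3)) for t, s in lfs])
print([(t, round(p, 3)) for t, p in cir])
print([(t, round(p, 3)) for t, p in nrm])
```

In this toy example, the final CIR and NRM estimates add up to one minus the final LFS estimate, which is the property that distinguishes a competing-risks cumulative incidence from the complement of a cause-specific Kaplan-Meier curve.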

Patient Characteristics
A total of 295 patients were included in the analysis; Table 1 shows the characteristics of the patients. Of them, 285 (96.6%) were in CR at the time of HSCT: 207 had negative MRD (MRD neg), in 21 patients MRD was positive but below 0.1% (namely MRD-low), and in 57 patients MRD was positive and greater than or equal to 0.1% (MRD-high or MRD ≥ 0.1). Ten patients had active disease. A total of 47.1% received HSCT from a matched sibling donor, while 39.7% had unrelated donors. A total of 59.7% of patients received myeloablative conditioning. Considering only patients who were in CR at the time of transplant, no significant differences were observed between patients with positive MRD (MRD pos) and those with negative MRD, except for age, the former being slightly older (54.5 vs. 51 years, p = 0.02), and GvHD prophylaxis, with MRD pos patients receiving more T-cell depletion-based approaches (Table 1).
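The three pre-transplant MRD groups can be expressed as a simple rule. The sketch below is a hypothetical illustration of the grouping described above, not the laboratories' software: it assumes that an undetectable MRD is recorded as 0.0 and that any measurable value counts as positive, with 0.1% separating MRD-low from MRD-high. The function name and the count dictionary are introduced here only for the example; the counts themselves are those reported in the text.

```python
def classify_mrd(mrd_percent: float, cutoff: float = 0.1) -> str:
    """Assign a pre-transplant MRD group from the MRD level (% of viable cells).

    Assumption: an undetectable MRD is encoded as 0.0; any measurable
    level is positive, and `cutoff` separates low from high.
    """
    if mrd_percent < 0:
        raise ValueError("MRD level cannot be negative")
    if mrd_percent == 0:
        return "MRD-negative"
    return "MRD-low" if mrd_percent < cutoff else "MRD-high"


# Group sizes reported in the text for patients in CR at transplantation.
counts = {"MRD-negative": 207, "MRD-low": 21, "MRD-high": 57}
in_cr = sum(counts.values())
print(in_cr)  # 285

print(classify_mrd(0.0), classify_mrd(0.03), classify_mrd(0.1))
# MRD-negative MRD-low MRD-high
```

Merging the first two groups reproduces the MRD < 0.1 category used in the outcome analyses.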

Transplant Toxicities and GvHD
Regarding the toxicity of the procedure, no patient in the MRD pos group, and three in the MRD neg group, had neutrophil graft failure. Regarding neutrophil engraftment, no differences were observed between MRD neg and MRD pos patients: median of 16 days (range 8-385) among MRD pos patients, versus 16 days (range 9-181) for MRD neg patients. Regarding platelet engraftment, two patients in each of the MRD pos and MRD neg groups did not engraft. There were no differences in the speed of platelet engraftment between MRD pos (15, range 5-171) and MRD neg patients (12, range 3-1096) (Table 2). One hundred and eighty-four (62.4%) patients developed acute GvHD: 56 presented grade one, 100 grade two, 15 grade three, and 12 grade four. Chronic GvHD was observed in 121 patients (41%); in 60 (20%) of them it was mild, and in 61 (20.7%) it was moderate/severe. There were no differences in the incidence of acute or chronic GvHD between MRD pos and MRD neg groups (Table 2).
On the other hand, 76 MRD-positive patients were evaluable for acute GvHD, and 32 of them relapsed. Of the patients who did not relapse, 63.5% (28 of 44) developed acute GvHD, versus 40.6% (13 of 32) of patients who relapsed (p = 0.063). In other words, we observed a trend towards a higher risk of aGvHD in MRD-positive patients who did not relapse, as compared to those who did. No significant differences in chronic GvHD were observed: 61.3% (27 of 44) of patients who did not relapse developed chronic GvHD, versus 71.9% (23 of 32) of patients who did (p = 0.5).

Impact of MRD Prior to Transplantation on Outcomes after HSCT
[...] (Figure S2). Next, we attempted to confirm the prognostic value of the ELN2017 proposed cut-off (0.1%) to identify patients at a higher risk of relapse and, therefore, we combined the MRD neg and MRD-low groups into MRD < 0.1: at 2 and 5 years, the respective values for these MRD < 0.[...] In Figure 1, the OS and LFS of the three groups MRD < 0.1, MRD ≥ 0.1, and active disease are shown (p < 0.001). [...] (Table S5).
Next, we attempted to identify whether these differences, in terms of OS and LFS between MRD ≥ 0.1 and MRD < 0.1, were due to a higher CIR and/or NRM.
Considering only patients in CR, CIR at 2 and 5 years were significantly lower among MRD < 0.[...] (Table 3).
In Supplementary Table S6, we describe 43 patients who had MRD pos and did not relapse. Of these 43 patients, 30 (70%) developed acute GvHD: 15 grade two, 4 grade four, and the rest grade one. Nineteen (44.2%) developed chronic GvHD. Fourteen patients had MRD < 0.1, and three of them died because of an infection within one year after transplantation. The remaining 30 had MRD ≥ 0.1 and 11 (37%) of them have died, 8 before one year post-transplant because of infections (2 bilateral pneumonia), 2 due to GvHD, 1 due to veno-occlusive disease, 1 due to ureteral carcinoma, and another of unknown cause. Four of these thirty patients are still alive with a follow-up of less than 1 year.
[...] 58% and 41.1% in MRD < 0.1 vs. 39.3% and 20% for MRD ≥ 0.1 patients, p = 0.16 and p = 0.216, respectively; for intermediate risk: 77.6% and 70.2% for MRD < 0.1 vs. 60.6% and 43.5% among MRD ≥ 0.1 patients, p = 0.0.018 and p = 0.0037, respectively; and for favorable ELN2011 risk: 88.6% and 82.9% among MRD < 0.1 vs. 48.5% and 49.1% for patients with MRD ≥ 0.1, p = 0.0098 and p = 0.0315, respectively (Figure 3).
Finally, MRD level did influence the outcome, irrespective of the conditioning regimen: patients with MRD < 0.1 before transplantation had a better OS and LFS at 2 years (82% and 71.4% among those who received myeloablative conditioning, and 65% and 57.6% among those who received reduced intensity conditioning, respectively) than those who had MRD ≥ 0.1 prior to transplantation (56% and 44.4% for myeloablative, and 43% and 25.5% for those who received reduced intensity conditioning) (p < 0.001) (Figure 4).

Impact of MRD at 100 Days after HSCT
Next, we analyzed the prognostic value of MRD at 100 days post transplantation; at 2 years, OS and LFS were significantly higher in patients with MRD < 0.1 on day +100 (83% and 76% in MRD < 0.1 vs. 76% and 7% in MRD ≥ 0.1, respectively, p < 0.001) (Figure 5). Similarly, CIR at 2 years was significantly lower in patients with MRD < 0.1 ([...]). However, the main limitation for the analysis of the MRD at 100 days after transplantation is that only 16 patients with positive MRD remained in CR at this time-point.

Discussion
Numerous studies have described the prognostic value of measurable residual disease after induction and/or consolidation [18][19][20][21][22][23][24][25][26][27] and, based on these results, it is currently used to tailor the intensification strategy in patients with AML [21]. Terwijn et al. showed the independent prognostic value of MRD ≥ 0.1% by MFC after induction and consolidation therapy, identifying a subgroup of patients with a higher risk of relapse [22]. Similarly, a second multicenter prospective study found that patients with positive MRD had a poor outcome [23]. On the other hand, Balsat et al. showed the prognostic value of the detection of mutated NPM1 in peripheral blood after induction, thus identifying a subgroup of patients who would be candidates for transplantation [24]. Likewise, RUNX1-RUNX1T1 MRD monitoring also identifies a subgroup of patients at high risk of relapse [25]. Moreover, Jongen-Lavrencic et al. showed that the combined detection of MRD by next-generation sequencing and flow cytometry has additional prognostic value in terms of relapse and overall survival [26]. Based on these findings, Cornelissen et al. proposed an algorithm that considers not only the risk subgroup at diagnosis and the performance status of the patients, but also the levels of MRD [27].
Unfortunately, although HSCT is considered the best option for patients with persistent MRD after first-line treatment, tumor load, even at the MRD level, is one of the variables with the highest impact on outcome after allogeneic stem cell transplantation, as described in different studies using either molecular or flow cytometry techniques. Regarding the latter, available studies have not used internationally standardized protocols; different cut-off values have been used, and the number of combinations of fluorochrome-conjugated antibodies and the analysis strategy have also varied greatly. Likewise, the need to maintain calibration and compensation protocols has not been considered. Araki et al. showed that patients in CR with positive MRD at the time of transplantation had an outcome similar to that of patients with active disease, with OS at 3 years of 26% and 23%, respectively [14]. In that study, 1 million events were acquired and a 10-color panel with three antibody combinations was used, with any level of detection defining positive MRD. However, it was a single-center study conducted by the same group of hematopathologists. Additionally, a different-from-normal approach was used, considering a population showing a deviation from the antigen expression patterns of reactive or regenerating hematopoietic cells. However, this approach might not display a uniform sensitivity in all cases. Moreover, no international standardization of cytometers, which would allow adequate reproducibility across laboratories, was considered. Likewise, the same group showed that, in the context of myeloablative transplantation, the presence of MRD before or after transplantation identified patients with poor outcome [13]. As in our study, they observed that 16 patients with MRD pos after transplantation had shorter OS and RFS. However, the MRD measurement was performed at an earlier post-transplant period, between days +28 and +35.
Other studies showing the prognostic value of positive MRD used fewer than six fluorochromes [17,28,29], or combined the detection of disease with low-sensitivity techniques, such as karyotype or fluorescence in situ hybridization (FISH) [28]. In a recent retrospective study, patients with monosomal karyotype had a higher likelihood of MRD positivity by MFC prior to transplant, with worse OS and higher relapse risk than those without it [12]. However, in multivariate analysis, only MRD positivity was associated with a shorter survival.
The current study was conducted in routine clinical practice, in real life, outside of a specific trial, in four centers that had to meet certain requirements. Three laboratories participated in the quality control organized within the Euroflow QA program. Two different approaches were used to determine MRD: "leukemia-associated phenotype (LAP)" and "different from normal (DfN)". The latter allowed us to identify MRD even if the diagnostic phenotype was not available, or in case of changes in the phenotype during the disease [30][31][32]. Both strategies were used, as recommended by ELN [1,6]. All laboratories worked with the same quality criteria, and in each one the analyses were carried out by experts in clinical cytometry.
In the current study, we confirmed the prognostic value of MRD monitoring, both pre- and post-allogeneic stem cell transplantation. In fact, the cumulative incidence of relapse was significantly higher both when comparing MRD pos with MRD neg patients and when using the cut-off value suggested by ELN, and this difference in relapse incidence significantly influenced OS and LFS.
Considering ELN2011 subgroups, MRD also allowed the identification of subgroups of patients with different prognoses among favorable and intermediate ELN2011, while in the adverse risk group, no clear difference was observed. The number of patients included in each subgroup, when the analysis is separately performed, might explain the lack of statistically significant differences in this subgroup, in terms of overall survival. Alternatively, the poor outcome of these patients might not be influenced by the MRD levels. Further studies would be required to elucidate this point.
An area of current interest, in the context of allogeneic transplantation, is whether the therapeutic strategy should be modified in those patients with positive MRD before transplantation. In fact, several studies have suggested that patients with positive MRD prior to transplant might benefit from receiving myeloablative conditioning [17,33,34]. However, in a recent retrospective study of 746 patients, Morsink et al. observed that, regardless of the type of conditioning, the risk of relapse was higher, and relapse-free survival and OS were worse, in patients with positive MRD as compared to those with negative MRD [11]. Similarly, no effect of the conditioning regimen intensity on OS was observed in NPM1-mutated AML patients who remained MRD positive [35]. In the same way, in the current analysis, regardless of the type of conditioning, the presence of MRD by flow cytometry implied a poor prognosis. Recently, Paras et al. conducted a retrospective study of 810 patients, showing that the conditioning regimen can influence the ability to eliminate MRD; however, the elimination of MRD in non-myeloablative regimens had a greater impact on the outcome [36]. With these contradictory results, it remains unclear whether MRD should determine the choice of conditioning regimen. An alternative approach would be to identify these patients for additional therapy after HSCT, such as withdrawal of immunosuppression, hypomethylating agents [37], infusion of donor lymphocytes [38], or venetoclax [39], among other approaches. For example, the RELAZA2 prospective study demonstrated that treatment with azacytidine [37] in patients with MRD pos could avoid or delay hematologic relapse of AML or high-risk myelodysplastic syndromes (MDS). In this regard, MRD monitoring after transplantation would further allow the identification of patients at a high risk of relapse and the establishment of therapeutic maneuvers.
In our patient cohort, positive MRD on day +100 after transplantation was associated with an extremely poor prognosis, with a CIR of 93.3%. It is possible that the identification of MRD should be conducted in an earlier post-transplant period, to assess any preventive strategy to avoid relapse. Moreover, in our study, those patients who reach day +100 post-transplant with persistent or newly positive MRD have an unfavorable short-term prognosis, regardless of the status of the pre-transplant MRD. In this regard, a recent study by Paras et al. showed that the dynamics of MRD before and early after transplantation (+20 or 40 days after transplantation) improve the accuracy of risk assessment; in that study, patients with MRDpos/MRDpos and MRDneg/MRDpos had high relapse rates and poor survival estimates [36].
In relation to the type of donor, it is relevant to point out that Yu et al. have shown that a haplo-identical donor might increase the graft versus leukemia effect [40]. Therefore, the positivity of MRD before transplant could modify the selection of the donor.
Other authors have identified the CD34+/CD38- leukemic stem cell (LSC) phenotype, and have shown its prognostic value, both at diagnosis and during follow-up [41][42][43][44]. Furthermore, this phenotype might be used in combination with the MRD assessment in a single tube combining six markers [43]. Therefore, future studies might allow the further improvement of MRD analysis, for example, by incorporating the LSC phenotype into the standardized MRD strategy currently used.
The main limitation of this study is its retrospective nature over a long period of time, during which changes were incorporated into the treatment according to the molecular results. Likewise, conditioning and GvHD prophylaxis varied during this period. On the other hand, each center carried out its analysis independently. However, this further reinforces the fact that a positive MRD performed under standardized technical conditions has prognostic value in real life.

Conclusions
Our multicenter study confirms the prognostic value of MRD performed in accordance with standardized recommendations. Regardless of the conditioning regimen, positive MRD maintains its poor prognostic value, and might allow the identification of patients who are candidates to receive early post-transplant therapeutic procedures.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cancers15051609/s1, Figure S1: Representative image of MRD analysis by flow cytometry; Figure S2: Impact of MRD levels in OS and LFS; Table S1: Information per fluorochrome from the QA program; Table S2: Designed panels for LAIP analysis; Table S3: Comparative analysis of MRD samples in the 4 laboratories (the table represents the MRD value determined by each laboratory); Table S4: Univariate analysis of CIR, NRM, OS and LFS; Table S5: OS and LFS according to multivariate time-dependent analysis (time-dependent variable GvHD); Table S6: Evolution of patients who did not relapse with positive MRD prior to transplantation.

Data Availability Statement:
The original data will be available by request to the corresponding author.