Clinical Performance and Future Potential of Magnetic Resonance Thermometry in Hyperthermia

Simple Summary
Hyperthermia is a treatment for cancer patients, which consists of heating the body to 43 °C. The temperature during treatment is usually measured by placing temperature probes intraluminally or invasively. The only clinically used option to measure temperature distributions non-invasively and in 3D is MR thermometry (MRT). However, in order to be able to replace conventional temperature probes, MRT needs to become more reliable. In this review paper, we propose standardized performance thresholds for MRT, based on our experience of treating nearly 4000 patients. We then review the literature to assess to what extent these requirements are already being met in the clinic today and identify common problems. Lastly, using pre-clinical results in the literature, we assess where the biggest potential is to solve the problems identified. We hope that by standardizing MRT parameters as well as highlighting current and promising developments, progress in the field will be accelerated.

Abstract
Hyperthermia treatments in the clinic rely on accurate temperature measurements to guide treatments and evaluate clinical outcome. Currently, magnetic resonance thermometry (MRT) is the only clinical option to non-invasively measure 3D temperature distributions. In this review, we evaluate the status quo and emerging approaches in this evolving technology for replacing conventional dosimetry based on intraluminal or invasively placed probes. First, we define standardized MRT performance thresholds, aiming at facilitating transparency in this field when comparing MR temperature mapping performance for the various scenarios in which hyperthermia is currently applied in the clinic. This is based upon our clinical experience of treating nearly 4000 patients with superficial and deep hyperthermia. Second, we perform a systematic literature review, assessing MRT performance in (I) clinical and (II) pre-clinical papers.
From (I) we identify the current clinical status of MRT, including the problems faced, and from (II) we extract promising new techniques with the potential to accelerate progress. From (I) we found that the basic requirements for MRT during hyperthermia in the clinic are largely met for regions without motion, for example extremities. In more challenging regions (abdomen and thorax), progress has been stagnating after the clinical introduction of MRT-guided hyperthermia over 20 years ago. One clear difficulty for advancement is that performance is either not reported or not reported uniformly, and that studies often omit important details regarding their approach. Motion was found to be the main common issue hindering accurate MRT. Based on (II), we report and highlight promising developments to tackle the issues resulting from motion (directly or indirectly), including new developments as well as optimization of already existing strategies. Combined, these may have the potential to make MRT measurements more stable, accurate and reliable.


Introduction
Hyperthermia (39-43 °C) has been successful as a cancer treatment due to several beneficial effects on tissue, such as enhancing the efficacy of radiotherapy and chemotherapy [1,2]. The hallmarks of hyperthermia have been identified and are comprehensively presented by Issels et al. [3]. Due to these benefits, paired with no major side effects, hyperthermia has established itself in the clinic for many tumor sites [4][5][6][7]. Dose-effect studies show a positive association between thermal dose parameters and clinical outcome, which implies that real-time temperature dosimetry is essential [8][9][10][11][12][13][14]. Temperature during treatment is traditionally monitored by probes inside catheters that are placed inside lumina or pierced into tissue. These provide information at a limited number of points and may be difficult or unfeasible to place, or associated with complications [15,16]. Magnetic resonance thermometry (MRT) can provide a real-time 3D temperature map in a non-invasive way (Figure 1) and hence has the potential to make hyperthermia safer for the patient. Visualizing what is heated and to what extent is a necessary first step to be able to not only control hot spots in normal tissue and adapt to cold spots in tumor tissue, but also provide the means to perform a repeatable measurement, as well as to investigate the true optimum temperature for maximizing clinical outcome. MRT has been shown to correlate with pathological response in soft-tissue sarcomas of prospectively registered patients [17]. Despite this potential, MRT thus far has failed to establish itself as the standard temperature measurement method in hyperthermia treatments. Given the continued reported progress in the pre-clinical setting, we hypothesize that a major cause of this stagnation is the unclear validation status, as well as the non-standardized way of reporting pre-clinical performance.
There is currently no overview of the clinical status quo of MRT in hyperthermia, and promising technologies are difficult to spot in the jungle of performance indicators. Further, the substantial financial investment will be overcome once the full contribution of MRT to hyperthermia quality is convincingly shown.
There have been many successful attempts to review the field. Rieke et al. [18] give an overview of the different magnetic properties that can be exploited to obtain MRT. The importance of accuracy and stability of thermometry measurements is stressed, and acquisition and reconstruction methods that reduce motion artefacts are highlighted. Winter et al. [19] expand on the challenges faced and also supply possible solutions. In addition to the hurdles, the implicit nature of the requirements for adequate MR temperature mapping during hyperthermia treatments complicates this quest. Different MRT techniques have different drawbacks and are thus suitable for different applications. One example is the proton resonance frequency shift (PRFS), which is most frequently used to measure temperature due to its linear variability with temperature and, with the exception of fat, tissue independence. As the shifts investigated are very small, the method is sensitive to physiological changes; accurate temperature measurements are hampered by changes in the microenvironment of the tissue, for example in flow, oxygen levels, perfusion and magnetic properties of the blood. Lüdemann et al. [20] compared MRT techniques and their achievable accuracies. Despite these excellent reviews in the field, there has not been a comprehensive analysis of the validation status and a ranking of the pre-clinical work based on a clear set of performance indicators.
For patients to benefit from MRT in hyperthermia treatments, it needs to become reliable so that the invasive probes are no longer needed. Our objective is to identify how MRT can be improved to a point where the added value is appreciated in the clinic, leading to a more widespread use. In order to aid this development, we will firstly define minimum requirements for a successful treatment, creating a benchmark for more uniform reporting and clear comparisons across studies. Secondly, after a systematic literature search, clinical data will be used to assess to what extent the MRT performance metrics obtained satisfy these requirements. This will be used to identify areas of insufficiency, but also areas of overlap and common concerns. Finally, we will use pre-clinical data from the literature search to identify new techniques, which address those common concerns. By highlighting these 'most promising to advance the field' publications, we hope to emphasize the direction for future research and thus accelerate progress further.

Figure 1. Example of anatomy with thermal mapping catheters (red arrows) for two patients with corresponding MR temperature distributions for three different slices. The images were acquired with a T1-weighted gradient-echo sequence. (The arrows were superimposed on the original image for clarity.) Source: [21], re-printed with permission from John Wiley and Sons.

Minimum Recommended Clinical MRT Performance
There are a lot of performance measures that can be evaluated and reported on, which in turn depend on many different acquisition settings. To clarify the situation, we introduce the most important acquisition parameters and state which MRT performance measures are vital to report on and define what minimum values we consider acceptable, based on the group's expertise in nearly 4000 clinical (superficial and deep) hyperthermia treatments [6,9,22]. Our aim is to create a clear list of requirements of what is needed from a clinical MR guided hyperthermia treatment perspective. The focus is on MRT for mild and moderate hyperthermia (39-43 °C) only, hence excluding ablative temperatures. The latter has been the aim for most techniques, since MR guided thermal ablation has a much wider use. Compared to ablation, temperature changes in mild and moderate hyperthermia are slow (approximately 10-30 min to reach the target temperature), target regions are generally large, and the temperature changes from baseline are low (2-8 °C). Consequently, the desired temperature mapping performances are also different: although spatial resolution may be lower, measurement accuracy and stability (temporal temperature precision) must be high and robustness against confounders much better.
The minimum acquisition parameters we recommend for successful MRT are reported in Table 1 and the minimum MRT performances are shown in Table 2.

Table 1. Minimum acquisition parameters, such that successful MRT in hyperthermia can be achieved.

Parameter | Definition | Minimum
Spatial resolution | In-plane resolution times slice width (2D) or through-plane resolution (3D) | 125 mm³
Temporal resolution | Time needed to acquire one MRT slice | 20 s

Table 2. Minimum performance metrics for successful MRT in hyperthermia treatments.

Metric | Definition | Minimum
Bias | Mean error (ME) | |ME| ≤ 0.5 °C
Spatial temperature precision | Spatial temperature standard deviation (SD) | ≤0.5 °C
Temporal temperature precision | Temporal temperature standard deviation (SD) | ≤0.5 °C
Accuracy | Mean absolute error (MAE) | ≤1 °C

Considering the large areas of heating in hyperthermia and consequently low thermal gradients, we consider a reasonable minimum spatial resolution to be 125 mm³ (for instance 5 × 5 × 5 mm³). A higher spatial resolution may be required to achieve acceptable accuracy, by avoiding partial volume effects in regions with many small and contrasting tissues.
For this recommendation, we also considered the current spatial resolution that is achieved with invasive thermometry. In general, the distance between measuring points along a thermometry catheter track is 1-2 cm. The distance between thermometry catheters is much larger still, in the range of 5-10 cm. Additionally, the MRT resolution should be considered with respect to the resolution of our ability to steer the energy distribution. At this moment, the focus of the 100 MHz RF-deep heating has a diameter of 7-14 cm. For the Hypercollar3d operating at 434 MHz, this is 3 cm. Finally, when utilizing hyperthermia treatment planning for deep as well as head and neck treatments, the CT images used for planning are acquired with a slice width of 5 mm and a resolution of 0.98 mm in both x and y [23]. It is also worth considering that a higher resolution in hyperthermia treatment planning comes at the cost of increased intricacy and treatment time [24].
For deep heating, the clinical objective is to achieve a temperature increase between 0.5 and 2 °C per 5 min. If it is lower (<0.5 °C), the power is increased in order to speed it up; if it is higher (>2 °C), the power is reduced to slow it down. Because of these relatively slow heating times and the resulting high time constant of thermal washout, the minimum temporal resolution should be 20 s. This recommended minimum of the temporal resolution concerns the minimum acceptable time from a clinical perspective, and faster scanning may be required in regions of motion to achieve acceptable accuracy. Another reason to speed up the acquisition may be when the averaging of temperature data is required to achieve the minimum MRT performance, as stated in Table 2.
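The acquisition minima and the heating-rate rule described above can be written down as two small checks. The following is only a minimal sketch under stated assumptions: the function names and the returned labels ("increase", "reduce", "hold") are hypothetical and not part of any hyperthermia system, and the thresholds are simply the values quoted in the text and in Table 1.

```python
from math import prod

MAX_VOXEL_VOLUME_MM3 = 125.0   # Table 1: coarsest acceptable voxel volume
MAX_SECONDS_PER_SLICE = 20.0   # Table 1: slowest acceptable time per MRT slice


def meets_acquisition_minimum(voxel_dims_mm, seconds_per_slice):
    """Return True if voxel volume and time per slice satisfy Table 1."""
    volume_mm3 = prod(voxel_dims_mm)
    return (volume_mm3 <= MAX_VOXEL_VOLUME_MM3
            and seconds_per_slice <= MAX_SECONDS_PER_SLICE)


def power_action(temp_rise_c, interval_min=5.0):
    """Suggest a power change from the temperature rise over an interval.

    The rise is rescaled to a 5 min window and compared with the
    0.5-2 °C per 5 min objective stated for deep heating.
    The returned labels are hypothetical.
    """
    rise_per_5min = temp_rise_c * 5.0 / interval_min
    if rise_per_5min < 0.5:
        return "increase"
    if rise_per_5min > 2.0:
        return "reduce"
    return "hold"
```

For instance, a 5 × 5 × 5 mm voxel acquired in 20 s just meets the minimum, whereas a 5 × 5 × 6 mm voxel does not, regardless of how fast it is acquired.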
Regarding the important performance measures, the first mentioned in Table 2 is bias, measured as the mean error (ME), which is defined by Walther et al. [25] as:

$$\mathrm{ME} = \frac{1}{n}\sum_{j=1}^{n}\left(E_j - A\right)$$

This is the difference between the MRT measurement ($E_j$) and another temperature measurement that is considered true ($A$), averaged over all measured time points ($n$). This reference $A$, i.e., the gold standard, can be a set of invasive temperature probes, or another MRT map originating from a well-established sequence. It is important to have a reliable and repeatable MRT readout, without a systematic over- or underestimation of temperature, translating to a low bias in measurements. Curto et al. made a comparison of the five RF-MR hybrid systems currently installed worldwide in anthropomorphic phantoms, showing that a mean error as low as 0.13 °C can be achieved with current systems in 'ideal condition' pre-clinical settings [26]. In light of the best resolution available, we consider a ME of ≤|0.5 °C| to be appropriate.
The following two measures, defined in Table 2, are spatial and temporal temperature precision. The spatial temperature standard deviation (SD) reflects the variability in the region of temperature evaluation, consisting of a ROI. Spatial temperature SD of the ROI evaluated should be ≤0.5 °C in order to guarantee that the noise present is not too large and there are no large temperature gradients within the heated region; in other words, the heated region is sufficiently uniform. Temporal temperature SD assesses the variability of the spatial mean temperature in a ROI across all time points and indicates the repeatability and stability of the measurement. Considering treatment times are long, but keeping in mind the importance of staying in the target temperature zone, the temporal temperature precision should not exceed 0.5 °C (after drift correction) for a 90 min thermometry measurement. Both the spatial and temporal temperature SD are influenced by the size and location of the ROI chosen. This, in turn, is highly dependent on the MRT region imaged, as areas with poor uniformity (for example near tissue/air boundaries) need to be avoided for sufficient accuracy of the measurement. Due to this needed flexibility, no recommendation on size and location of the ROI will be stated. In order to fulfil the minimum requirement of the temperature precision defined above, the ROI should be chosen with care in a region as uniform as possible. The measures of temperature precision are only valuable when the ROI is kept constant throughout the measured time points.
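The two precision measures can be illustrated with a short calculation. This is a minimal sketch, assuming the ROI is fixed across time points, the series is already drift corrected, and the population standard deviation is used (the text does not state whether the population or sample form is meant); the function names are hypothetical.

```python
from statistics import mean, pstdev

PRECISION_LIMIT_C = 0.5  # Table 2 minimum for both spatial and temporal SD


def spatial_sd(roi_temps_c):
    """SD of the temperatures of all voxels inside one ROI at one time point."""
    return pstdev(roi_temps_c)


def temporal_sd(roi_series_c):
    """SD across time of the spatial mean temperature of a fixed ROI.

    roi_series_c holds one list of voxel temperatures per time point,
    always taken from the same ROI.
    """
    spatial_means = [mean(frame) for frame in roi_series_c]
    return pstdev(spatial_means)


def precision_ok(roi_series_c):
    """True if every frame and the whole series stay within 0.5 °C."""
    frames_ok = all(spatial_sd(f) <= PRECISION_LIMIT_C for f in roi_series_c)
    return frames_ok and temporal_sd(roi_series_c) <= PRECISION_LIMIT_C
```

Because both values depend on which voxels are included, changing the ROI between time points would make the temporal SD reflect the ROI choice rather than the stability of the measurement.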
The final performance measure that is vital to report on is the accuracy of the MRT measurement. Accuracy, as stated by Walther et al. [25], can either be presented as the mean squared error (MSE), the root mean squared error (RMSE) or the mean absolute error (MAE). We consider the MAE the best one for our application since it is less sensitive to outliers and easy to interpret:

$$\mathrm{MAE} = \frac{1}{n}\sum_{j=1}^{n}\left|E_j - A\right|$$

where $E_j$ is the MRT measurement, $A$ is another temperature measurement that is considered true and $n$ is the number of all measured time points. Given the importance of keeping to the right heating range for the desired physiological changes in the tumor tissue, we think it should be ≤1 °C.
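Bias and accuracy as defined above can be computed from paired MRT and reference readings. The sketch below is illustrative only: it assumes one reference value per time point (for example a probe reading taken at the same moment), which generalizes the single reference A in the definitions, and the function names are hypothetical.

```python
def mean_error(mrt_c, reference_c):
    """Bias: mean of (MRT - reference) over all time points."""
    if len(mrt_c) != len(reference_c) or not mrt_c:
        raise ValueError("need equally long, non-empty series")
    return sum(e - a for e, a in zip(mrt_c, reference_c)) / len(mrt_c)


def mean_absolute_error(mrt_c, reference_c):
    """Accuracy: mean of |MRT - reference| over all time points."""
    if len(mrt_c) != len(reference_c) or not mrt_c:
        raise ValueError("need equally long, non-empty series")
    return sum(abs(e - a) for e, a in zip(mrt_c, reference_c)) / len(mrt_c)


def meets_bias_and_accuracy(mrt_c, reference_c):
    """Check the Table 2 minima: |ME| <= 0.5 °C and MAE <= 1 °C."""
    return (abs(mean_error(mrt_c, reference_c)) <= 0.5
            and mean_absolute_error(mrt_c, reference_c) <= 1.0)
```

Positive and negative errors cancel in the ME but not in the MAE, which is why a measurement can show a small bias and still fail the accuracy requirement.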

Literature Search
In order to ensure that all papers published using MRT in hyperthermia treatments will be included, a logical search string was defined including a hyperthermia term, a magnetic resonance thermometry term, and excluding ablation in a major term. The search strings used for the different databases are provided in Appendix A. We searched the databases for papers published from inception of the databases until 24 November 2020. Details on the number of results obtained from the respective data bases are presented in Table A1.
Using the method from Wichor et al. [27], all papers were screened by title and abstracts for relevancy to our topic. At this stage, papers were excluded if they were not published in English, if they were not research articles or if the topic was not related to MRT in hyperthermia. Our definition of the combination of mild and moderate hyperthermia includes treatments with the heating goal between 39 and 45 °C. We acknowledge that in some cases tumor temperatures can be higher than the target temperature, thus papers up to 47 °C were considered relevant.
The resulting 218 relevant papers were then assessed for eligibility, using the following exclusion criteria: (#1) ex-vivo results, (#2) not original data, and (#3) small animals. The ex vivo criterion excluded studies on simulation or phantoms, which we considered too far from the final intended use in the clinic to be included in this review. The original data criterion excluded reviews and studies using already published data as reference. Small animals were considered to be anything smaller than a dog. These studies were excluded because we deem these data not predictive for humans due to the different motion profile (e.g., faster heart rate) and their smaller size. Additionally, the equipment used is specially made and non-clinical, lowering the ease of translation into the clinic. Large animal studies without heating were also not included. After this eligibility assessment, 43 papers remained to be included in the systematic analysis. A PRISMA flow chart of the exclusion process is shown in Figure 2.

Categories and Classification
The studies included were then categorized into patients with treatment intent and pre-clinical groups (with no hyperthermia treatment intent). Peller et al. [28] included treated and non-treated patients and thus was allocated to both groups. Clinical studies included 10 studies. Pre-clinical studies consisted of 35 studies: 26 papers with human subjects and 12 studies including large animals. In early volunteer studies, heating and cooling were sometimes applied to the volunteers without therapeutic intent. These studies were also considered pre-clinical.
The relevant papers were read in detail and relevant data were extracted into Microsoft Excel tables. Bibliographic information, such as first author and year of publication, was taken from the EndNote library. Other study data considered relevant were: hyperthermia treatment approach, imaging setup, MRT performance and the exclusion of data. Pre-clinical papers were also grouped and ranked based on their main aim and achieved improvement to identify promising techniques. Large animals, volunteers and non-heated patients are easier to image than treated patients, making it more likely for their MRT data to be artefact free. Large animals are typically sedated and mechanically ventilated during treatment, which reduces their breathing and makes it more predictable, and also lowers their blood perfusion. Muscle relaxants and bowel movement suppressants are also administered, minimizing any other avoidable motion. Volunteers have the advantage of no initial stress from illness and, when there is no heat applied, no additional stress during the treatment. Non-treated patients also lack the additional stress of treatment. Except for these differences, both large animals and volunteers have similar confounders, such as size and motion profile, and they generally use the same equipment for heating as well as imaging. Thus we consider these pre-clinical studies predictive for the reproducibility in patients during treatment.

Status
MRT in hyperthermia is predominantly used for extremities (67%) and some studies investigated it in the pelvis (33%). This trend can be explained by the absence of motion and resulting artefacts in extremities. Data of ongoing research in our group show that achieving successful MRT in the pelvic area is much harder than in more static regions of the body. The average maximum temperature achieved during the hyperthermia treatments was 43.8 °C, which is well within the target treatment temperature range, and the treatment time varied from 30 to 90 min. All studies applied hyperthermia using radiofrequency (RF) electromagnetic waves. The most popular system is the BSD2000/3D/MR, which incorporates the twelve channel Sigma Eye applicator.
The imaging setup for the 10 clinical studies is presented in Table 3. The published clinical experience with MRT in hyperthermia is limited to very few centers (Duke, Tübingen, Berlin, Munich). Hence it is a challenge to translate their high degree of specific experience to other centers. Additionally, it is difficult to define a benchmark due to the limited amount of data published.
The imaging coil used by most was the body coil, so when this information was absent, the body coil was assumed. MRT was based on the proton resonance frequency shift (PRFS), except in Peller et al. [28], who used T1. This is not surprising, as PRFS varies linearly with temperature over an adequate range and is nearly independent of tissue type [29]. Gradient Recalled Echo (GRE) sequences were generally used (Table 3), and all sequences acquired 2D MRT maps. Peller et al. [28] was the only study which used a 0.2 Tesla MRI instead of 1.5 T. The frequency of MRT acquisition varied from continuous to every 20 min. Studies that reported values for spatial and temporal resolutions within our recommended minimum of 125 mm³ and 20 s are shaded in green in Table 3. Most studies manage to satisfy the minimum requirements, as defined in Tables 1 and 2.
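To make the linear PRFS dependence concrete, the commonly used phase-difference relation ΔT = Δφ / (2π · γ · α · B0 · TE) can be sketched as follows. This relation and the constants are not taken from this review: the PRFS coefficient α of about −0.01 ppm/°C and the echo time of 20 ms are assumed, illustrative values, the function name is hypothetical, and the sign of the phase change depends on the scanner's convention.

```python
import math

GAMMA_HZ_PER_T = 42.58e6   # proton gyromagnetic ratio / (2*pi), approximate
ALPHA_PER_C = -0.01e-6     # assumed PRFS coefficient, about -0.01 ppm/°C


def prfs_temperature_change(delta_phase_rad, b0_tesla, te_seconds):
    """Convert a GRE phase difference (rad) into a temperature change (°C).

    delta_phase_rad is the phase of the heated image minus the phase of
    the baseline image for one voxel.
    """
    rad_per_degree_c = (2.0 * math.pi * GAMMA_HZ_PER_T * ALPHA_PER_C
                        * b0_tesla * te_seconds)
    return delta_phase_rad / rad_per_degree_c


# At 1.5 T and an assumed TE of 20 ms, a phase change of about -0.08 rad
# corresponds to roughly +1 °C.
delta_t_c = prfs_temperature_change(-0.0803, 1.5, 0.020)
```

The small phase change per degree in this example illustrates why the method is sensitive to any other source of phase variation, such as motion or field drift.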
Methods used to improve the thermometry quality were: increasing the number of excitations (NEX) > 1 (number of times each k-space line is read) [30,31]; applying flow compensation [32]; including modelling of blood perfusion [33]; using background field removal algorithms to correct for motion-induced susceptibility artifacts [34]; and selecting only evaluable volumes or treatments, as done in all but one study (discussed in the "Exclusion of Data" section below).
Table 3 also shows the MRT performance reported in clinical papers. Values that meet our minimum requirements are shaded in green. Of the metrics that are reported, 6/9 studies (67%) satisfy one or more of our minimum requirements.
Unsoeld et al. [17] show the correlation of measured temperature with clinical outcome. Whilst this study investigates the true goal of the treatment, it could have contributed more to the field if it had also reported bias, temperature SD and accuracy. This would have helped to understand the required treatment quality and the relationship between thermal dose and treatment outcome. A similar line of thought applies to the study of Wu et al. [34], which gives accounts of TNR improvement from their investigated correction method, but neglects to quantify these. Table 3 demonstrates that few performance metrics are reported, which makes it difficult to compare the status of MRT between different studies. Additionally, definitions of parameters are often lacking, leading to the need for educated guesses.

Exclusion of Data
Comparing the indicative performance metrics listed above comes with limitations. Often the ROIs considered at different time points are not constant, even within the same study. Additionally, a number of time points were often excluded from the evaluation, usually due to image artefacts that produce noisy thermometry maps. This decrease in the number of thermometry maps adds selection bias to the performances reported. In Table 4, we present what data were excluded post-acquisition and the reason why the authors excluded the data. If exclusion was not explicitly mentioned, we assumed that all MRT data acquired were also included in the analysis.
As is shown in Table 4, only one study included all of the acquired MRT data. This apparent need to exclude data underlines the need for MRT to become more reliable in regions of motion before it can replace invasive temperature probes. Information on the total study sizes also provides objective information on the practicality of using MRT. The limited number of publications on clinical use of MRT is highlighted and confirms that experience is very local (and presumably the conclusion on the feasibility of MRT is biased by the positive attitude of the researchers). All of the above clearly demonstrates that MRT is still in a developing phase and there exists a substantial need to make major improvements to expand to broader use of the technology.

Table 4 (fragment). Reasons given for excluding data included: bulk motion due to discomfort during treatment, ROI contained too much gas.

Pre-Clinical Status-How Does It Compare?
Comparing their imaging setup, pre-clinical studies are very similar to clinical ones. The sequences and MRT methods used were more varied, but just like the patient studies investigated, the spatial resolution requirement was met in all studies and the temporal resolution requirement was met in 29/35 studies. Regarding MRT performance metrics, about half of the pre-clinical studies achieved our minimum requirements. This is illustrated and contrasted to the clinical performance in Figure 3. Considering exclusion of data post-acquisition, five pre-clinical studies (two with large animal and three with human subjects) excluded some, which is significantly less than the clinical studies investigated. This most likely can be linked back to the subjects making measurement conditions less challenging, as mentioned above. Table 5 presents the main techniques and methods investigated in the pre-clinical studies and the improvement found over standard methods. From this information, we have identified common main aims (last column).

Table 5. Pre-clinical studies by the year of publication; including technique/method investigated, improvement found (where applicable) and main aim of the study. Feasibility and comparison studies are presented in grey. Studies satisfying our performance criterion are highlighted in green.

When looking at the improvements mentioned over the benchmark methods (column 4 of Table 5), there was only one study that did not find an improvement in their investigated techniques (Mei et al. [52]). This demonstrates the importance and success of pre-clinical work.
The most promising techniques from Table 5 are the 13 studies satisfying our MRT performance criterion, which are highlighted in green. It can be seen that in recent years more studies have satisfied these minimum requirements. A total of 8 out of those 13 studies are feasibility or comparison studies (marked in grey in Table 5), which can be grouped into having investigated:

It needs to be highlighted that with the exception of Bing et al. [38], these studies have investigated only one subject, so their potential needs to be validated on a larger scale. Despite great potential, the possible limitations of the techniques mentioned above when transferred to patients in a clinical environment must be mentioned. Breath hold may not always be a viable option for the clinic. Some patients (for example young children) may not be able to hold their breath effectively, or may be sedated during the treatment. Additionally, the total treatment time will lengthen, as the patient needs periods of normal breathing to recover. Navigators to correct for B0 changes induced by breathing may only be valuable when there is no motion in the treated area and breathing patterns are very regular. Similarly, B0 drift corrections may only be valuable in areas with no motion or motion-induced changes present.
It can be seen from the investigated studies that groups working on MRT advances are generally different groups than those working on patients with a treatment objective. The consequence is that solutions are presented with very limited ability to be transferred to clinical practice in RF-hyperthermia MRT. This is highlighted by the fact that no progress has been made in MRT for pelvic RF-hyperthermia in the past two decades: the body coil is still used for imaging and PRFS is used with no real solution for correction of external and internal movement, for instance by passing air.
However, the technological advances in recent years are promising in many ways. Multi-coil integrated hyperthermia systems are becoming available and provide much better SNR than the commonly used body coil [55,72]. At the same time, multi-coil acquisition also offers acceleration of MRT by techniques such as parallel imaging or compressed sensing [73]. This faster acquisition enables better temperature monitoring, especially in regions with motion. Additionally, the computational power available now is much greater and cheaper than only a few years ago, which makes more complicated reconstruction methods and correction strategies feasible [34]. Last but not least, new sequences and approaches are being developed to increase MRT performance and explore the possibility to perform MRT in more difficult regions. In early years, researchers broadly investigated different methods of MRT, but PRFS quickly crystallized as the method of choice for hyperthermia treatments for the reasons mentioned above. Only now are other methods being considered again. Hybrid approaches of PRFS/T1 MRT are just one method on the horizon that enables temperature mapping in fatty regions [74], where PRFS alone would be unable to detect temperature changes.
Considering all these innovations, the present conditions are favorable to push MRT to the next level and hopefully, in the near future, to elevate it from a research modality to clinical practice.

Conclusions
Standardized reporting of parameters used and performances obtained in MRT is important. Hence we defined a minimum benchmark of important performance metrics including bias, spatial and temporal temperature precision as well as accuracy; these should be within ≤|0.5 °C|, ≤0.5 °C, ≤0.5 °C and ≤1 °C, respectively.
When systematically assessing the literature, we can conclude that MRT performance in hyperthermia is already achieving most of these requirements for extremities but not yet in regions with more motion present. Motion as well as the B0 changes, as a direct or indirect consequence, emerged as the main problem of accurate and reliable MR temperature measurement.
Various techniques satisfying our performance requirements and addressing these problems are already available at the pre-clinical stage. The most promising proposed solutions can be divided into either new approaches or optimizations. New approaches include hardware or software being developed; promising optimizations include correction and stabilization strategies, navigator echoes, breath hold and various reconstruction methods. We anticipate that highlighting these promising pre-clinical advancements will accelerate the progress of MRT.

Cochrane CENTRAL register of trials (1992-) | 21 | 15
Google Scholar | 200 | 147
Total | 2929 | 1675
Original search strings of the respective databases introduced in Figure 2.