Registration of Magnetic Resonance Tomography (MRT) Data with a Low Frequency Adaption of Fourier-Mellin-SOFT (LF-FMS)

Fourier-Mellin-SOFT (FMS) is a rigid 3D registration method, which allows the robust registration of 3 degrees-of-freedom (dof) rotation, 1-dof scale, and 3-dof translation between scans on discrete grids. FMS is based on a spectral decomposition of these 7-dof. This complete spectral representation of the input data enables an adaption to certain frequency ranges. This special property is used here to focus on relevant mutual 3D information between bone structures with a Low Frequency adaptation of FMS (LF-FMS), that is, it is utilized for matching and concurrently determining corresponding transformation parameters. This process is applied on a set of Magnetic Resonance Tomography (MRT) data representing the hand region, in particular the carpal bone area, in a sequence of different hand positions. This data set is available for different probands, which allows a comparison of resulting parameter plots and furthermore matching in between bone structures.


Introduction
Three-dimensional (3D) registration, rigid and non-rigid, is used in many medical applications in the context of different medical imaging technologies [1]. However, registration is challenging in the medical domain: there are, for example, high variations across different individuals, limitations in the temporal and spatial resolution of the imaging methods, and significant interfering structures, that is, data that does not belong to the object(s) of interest.
Many registration methods are based on features, that is, distinct local information that can be matched across two scans [2]. Examples include geometric features describing rigidness in the context of CT-MR brain image registration [3], edges or ridges for matching MRT brain images [4], or B-splines for PET-CT image registration [5]. Selected anatomical parts, for example, the aortic valve, can be used as anchors [6]. Similarly, objects like stents can serve that purpose [7]. Note that for inter-modal registration, the similarity metric between potentially corresponding voxels can have a strong influence [8]. Furthermore, some medical applications require non-rigid registration approaches. For example, lungs are highly deformable organs [9]. Other examples of non-rigid registration in the context of medical applications can, for example, be found in [10][11][12][13][14][15][16][17][18]. While feature based methods are well established and have proven their usefulness, they also have known limitations with respect to robustness, especially when the quality of the data is not optimal, for example, due to sensor noise, limitations in the resolution, or interfering structures [19].
Similar considerations apply to the Iterative Closest Point (ICP) algorithm [20] and its variants, which are also often used in medical applications [21]. It is a well-established method, but it is known to only perform well in a restricted transformation space, as it is by design prone to local optima. A recent ICP version (Go-ICP) [22] is supposed to mitigate this. But as experiments show, Go-ICP does not converge when there are interfering structures in the scans, which makes it not well suited for medical applications. Spectral methods, for example, Fourier-Mellin-SOFT (FMS) [19] that is used in the research presented here, are in contrast global methods, which mitigate the aforementioned limitations. One approach along those lines is the detection of a main rotation axis [23][24][25], which causes a minimum in a spherical projection of the difference of corresponding spectra. This principal axis method is theoretically elegant, but it lacks robustness when the two scans are affected by interfering structures, that is, it is also not well suited for medical data where two scans never look perfectly alike. Also, unlike FMS, the principal axis method cannot cope with different scales. Spectral Registration with Multi-Layer Resampling (SRMR) [26] is more robust than the principal axis approach, but it uses heuristic resampling and it also lacks the option to determine scale.
Scale can be a transformation parameter of interest for medical applications [27]. In [27], scale is determined as the size of local structures under a pre-specified region homogeneity criterion. This is in contrast to the spectral approach used here. For the Mellin transformation [19,28], scale always refers to the entity or the major part of the data segment to which the transformation and the subsequent registration are applied.
The application case considered here stems from the determination of transformation parameters under body movements [29,30]. More precisely, the clinical application is an orthopedic diagnosis utilizing sequences of MRT scans representing a complete movement of extremities, that is, from one feasible extreme point of a posture to another extreme point. The parameters of these bone movements are represented by rigid transformations. The goal is to detect pathological behavior by comparing patient movements and by determining reproducible and comparable reference parameters of 3D kinematics for orthopedic diagnosis. Advances in MRT scanning technologies [31] allow recording of moving sequences, which provide a comparable basis of 3D kinematics of joints.
This type of analysis was also done using markers implanted in cadaveric specimens [32]. But these parameters are less significant than an analysis using living probands. In [30], a method is presented using the principal axes transformation described by using only the inertia properties of an object [33], which allows only a coarse analysis of 3D transformation parameters. This method requires furthermore a complete and accurate segmentation of all scans at their corresponding body positions.
Here, Fourier-Mellin-SOFT (FMS) is extended for use with MRT data sequences. FMS determines 7 degrees of freedom (dof) in subsequent dependent steps [19]. It is based on decoupling all single dof by processing spectral structural information. The magnitude of the spectral information contains all necessary information for a registration of scan data even for significant changes of transformation parameters.
Anatomical structures are represented differently in different imaging modalities. Hence, it is reasonable to focus registration on the basic shape of the object of interest instead of relying on local, small features of the object(s). In addition, the use-case presented here involves changing imaging positions and interpolation from only a few MRT slices. In a sequence of different scans, that is, scans of different body positions of the extremities and especially of bones, details of bone structures are extremely unlikely to coincide. Since detailed structures are represented by middle and high frequencies, all FMS registration steps are reduced here to low frequencies. This requires a few non-trivial adaptations of FMS as described below.
This article hence presents an adaption of the FMS algorithm to a restricted range of 3D low frequencies (LF), dubbed hence LF-FMS, which achieves a robust registration of objects represented with low-detail shape information, for example, across different imaging modalities, under small motions of local parts, or for different persons. In addition to the underlying theory for the method, a use case with small bone structures in different positions and across different persons is presented.
The rest of this article is structured as follows. In Section 2, the core elements of the Fourier-Mellin-SOFT (FMS) algorithm are presented, which are followed by a description of the FMS transformation sequence in Section 3. The low frequency adaptation of FMS (LF-FMS) is introduced in Section 4. A description of the experiment data is provided in Section 5. The related experiments and results are presented in Section 6. Section 7 concludes the article.

The FMS Algorithm
The resampling methodology in (1)-(6) is the non-trivial 3D extension of the well-known 2D Fourier-Mellin Invariant (FMI) [28], which is briefly motivated in the following. An important property is that it decouples the transformation parameters of rotation and scale from translation.
The 3D spectra of a voxel range (1) between two scans contain all degrees of freedom (dof): the 3-dof translational shift t_s = [x_s y_s z_s]^T, the 3-dof rotation g(α, β, γ) ∈ SO(3), and the 1-dof scale σ. The phase information is a conglomerate of all 7 involved parameters and can hence not be used for a registration. By taking the magnitude of the spectrum, rotation g(α, β, γ) and scale σ remain in the structural information for the first registration step. S(k) and R(k) are the corresponding 3D transforms of the discrete 3D data, and the corresponding frequency coordinates are k = [u v w]^T.
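The decoupling of translation by the spectral magnitude can be checked numerically. The following is a minimal sketch, not the FMS implementation: it assumes NumPy, a random test volume, and a circular shift (so that the DFT shift theorem holds exactly). The names `spectral_magnitude`, `scan1`, and `scan2` are illustrative only.

```python
import numpy as np

def spectral_magnitude(volume):
    """Magnitude of the 3D DFT with the zero frequency moved to the grid center."""
    return np.abs(np.fft.fftshift(np.fft.fftn(volume)))

rng = np.random.default_rng(0)
scan1 = rng.random((16, 16, 16))

# Hypothetical test shift; np.roll shifts circularly along all three axes.
scan2 = np.roll(scan1, shift=(3, -2, 5), axis=(0, 1, 2))

# The complex spectra differ (the shift lives in the phase) ...
phase_differs = not np.allclose(np.fft.fftn(scan1), np.fft.fftn(scan2))
# ... while the magnitudes coincide, so translation is removed.
magnitude_equal = np.allclose(spectral_magnitude(scan1), spectral_magnitude(scan2))
```

For real scans the shift is not circular, so the magnitudes agree only approximately; this is one reason window functions matter later in the pipeline.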
Phase correlation of the resulting structures yields unique Dirac peaks indicating all parameters of the underlying transformation. For more details, see [19]. Once this step is successful, which can be verified by a unique signal/noise ratio of the Dirac maximum, the registration is precise within the rendered pixel/voxel resolution. The difference to the 2D case is that rotation does not act on straightforward circular structures, but as an SO(3) rotation of spherical structures [19], which is much more complex to match. This problem is solved by the SO(3) Fourier Transform (SOFT) [34,35], as briefly summarized below.
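As an illustration of how phase correlation produces a Dirac peak, the sketch below applies it to a pure (circular) translation between two volumes. It is a generic textbook formulation under assumed names (`phase_correlation`, `eps`), not the code used in [19].

```python
import numpy as np

def phase_correlation(a, b, eps=1e-12):
    """Inverse DFT of the normalized cross-power spectrum of volumes a and b."""
    A = np.fft.fftn(a)
    B = np.fft.fftn(b)
    cross = B * np.conj(A)
    cross /= np.abs(cross) + eps  # keep only the phase difference
    return np.real(np.fft.ifftn(cross))

rng = np.random.default_rng(1)
ref = rng.random((16, 16, 16))
true_shift = (2, 5, 1)  # hypothetical ground-truth shift in voxels
moved = np.roll(ref, shift=true_shift, axis=(0, 1, 2))

q = phase_correlation(ref, moved)
# The location of the maximum is the translational shift.
estimated_shift = np.unravel_index(np.argmax(q), q.shape)
```

With identical content and a circular shift, the peak is a single voxel of height close to 1; interfering structures lower the peak and raise the surrounding noise floor.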
According to (3), two functions on a sphere are related by a 3D rotation; they are resampled from the 3D spectral data of the rotated volume data. ω ∈ S^2 is defined as ω(θ, φ) = (cos(φ) sin(θ), sin(φ) sin(θ), cos(θ)). The pair of functions in (3) is resampled from the 3D spectral magnitude (2) according to (4) using (5). N_VS is the edge length of the cubic 3D data f(x) and thus also the edge length of F(k). The necessary rotational match according to the integral in (8) can be determined by the SOFT algorithm [34,35].
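The spherical resampling can be sketched as follows. The angular grid (θ_j = π(2j+1)/(4B), φ_k = πk/B on a 2B × 2B grid) and the nearest-neighbor lookup are assumptions made for this example; the exact sampling in (4) and (5) may differ. The function names are hypothetical.

```python
import numpy as np

def sphere_directions(B):
    """Unit vectors omega(theta, phi) on an assumed 2B x 2B angular grid."""
    theta = np.pi * (2 * np.arange(2 * B) + 1) / (4 * B)
    phi = np.pi * np.arange(2 * B) / B
    th, ph = np.meshgrid(theta, phi, indexing="ij")
    return np.stack(
        [np.cos(ph) * np.sin(th), np.sin(ph) * np.sin(th), np.cos(th)], axis=-1
    )

def sample_layer(magnitude, radius, B):
    """Nearest-neighbor samples of a centered spectrum on a sphere of given radius."""
    n = magnitude.shape[0]
    center = n // 2  # zero frequency after fftshift (0-based index)
    idx = np.rint(center + radius * sphere_directions(B)).astype(int)
    idx = np.clip(idx, 0, n - 1)
    return magnitude[idx[..., 0], idx[..., 1], idx[..., 2]]

rng = np.random.default_rng(2)
volume = rng.random((32, 32, 32))
magnitude = np.abs(np.fft.fftshift(np.fft.fftn(volume)))
directions = sphere_directions(B=8)
layer = sample_layer(magnitude, radius=5, B=8)
```

Summing such layers over a range of radii gives one 2D function on the sphere per scan, which is the kind of input a SOFT-based rotational correlation expects.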
The adaption of the FMS algorithm presented here uses the fact that precise transformation parameters can be determined even from a restricted range of information. Furthermore, detailed structures, which cannot be reliably tracked, for example, across different scans due to imaging modalities and/or local motions, are automatically removed from the registration process. The same holds for noise.

The FMS Transformation Sequence
For the sake of completeness, the steps and related notations of the transformation sequence within FMS are summarized in this section. In the following, each 7-dof transformation is represented by a translation matrix T_trans, the rotation matrices R_x(α), R_y(β), R_z(γ), and the scale matrix M_scale. Scale is determined as a scalar value and is hence the same parameter for all axes. The matrix T_voxcen defines the usual shift of a discrete grid to the center of a coordinate system with a translational shift −(N_VS/2 + 0.5) along all three axes. After the bundle of registrations is carried out, the inverse T_voxcen^{-1} shifts back to the center of a discrete grid.
The transformation sequence between rotation and translation is inherent to the FMS algorithm, since it decouples translation from rotation by the spectral magnitude. In a first step, rotation is determined. Here, the sequence chosen is R_x(α) R_y(β) R_z(γ) to re-rotate the second scan in alignment with the first scan. The next step is the determination of scale, to adjust the size of scan 2 to scan 1. The sequence of M_scale and rotation is permutable within the transformation sequence. Consequently, the last step of the FMS registration sequence is translation. It is important to note that the first step of FMS, that is, the rotational registration, is the last step of the definition of a generated counterpart scan 2 to a reference scan 1. Hence, the FMS registration sequence is the inverse R_scan2^{-1} to the definition in (9).
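The matrices of this section can be written down as 4×4 homogeneous transforms. The sketch below is illustrative: the helper names are hypothetical, and the composition order inside `seven_dof` is an assumption, since the defining equation (9) is not reproduced here. It does verify two statements from the text: a uniform scale commutes with rotation, and the grid center is a fixed point of rotation and scale.

```python
import numpy as np

def translation(t):
    """4x4 homogeneous translation by the 3-vector t."""
    T = np.eye(4)
    T[:3, 3] = t
    return T

def uniform_scale(s):
    """4x4 homogeneous scale with the same factor on all axes."""
    M = np.eye(4)
    M[:3, :3] *= s
    return M

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    R = np.eye(4)
    R[1:3, 1:3] = [[c, -s], [s, c]]
    return R

def rot_y(b):
    c, s = np.cos(b), np.sin(b)
    R = np.eye(4)
    R[0, 0], R[0, 2], R[2, 0], R[2, 2] = c, s, -s, c
    return R

def rot_z(g):
    c, s = np.cos(g), np.sin(g)
    R = np.eye(4)
    R[0:2, 0:2] = [[c, -s], [s, c]]
    return R

def seven_dof(n_vs, t, alpha, beta, gamma, s):
    """Assumed order: center grid, rotate, scale, translate, shift back."""
    T_voxcen = translation([-(n_vs / 2 + 0.5)] * 3)
    R = rot_x(alpha) @ rot_y(beta) @ rot_z(gamma)
    return np.linalg.inv(T_voxcen) @ translation(t) @ uniform_scale(s) @ R @ T_voxcen

R = rot_x(0.1) @ rot_y(-0.2) @ rot_z(0.3)
M = uniform_scale(1.2)
scale_commutes = np.allclose(M @ R, R @ M)

# Without translation, the grid center maps onto itself.
A = seven_dof(100, [0, 0, 0], 0.1, -0.2, 0.3, 1.2)
center = np.array([50.5, 50.5, 50.5, 1.0])
center_fixed = np.allclose(A @ center, center)

# The registration result is the inverse of the generating transformation.
full = seven_dof(100, [3, -2, 1], 0.1, -0.2, 0.3, 1.2)
round_trip = np.allclose(np.linalg.inv(full) @ full, np.eye(4))
```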

FMS Low Frequency Adaptation
For the low-frequency adaptation of FMS (LF-FMS), the to-be-registered 3D shapes are restricted to a basic form, which can then be matched precisely within scans. This allows, for example, registration under completely different body positions inducing local changes, across imaging modalities, or across different individuals. The shapes are described by reducing the object to only the low frequencies of a 3D spectral representation (see Figure 1). This restricted spectral composition of objects is compelling, since the FMS registration itself is solely based on spectral information. The strategy is therefore to split the available information into low-frequency information, which is expected to remain constant across different scans, and the potentially varying information at middle/higher frequencies. The information at middle/higher frequencies, which rather disturbs the registration process, is automatically removed, while the relevant information is used for LF-FMS registration. While the basic idea is very simple, this adaptation involves some non-trivial aspects as described below.
In the following, all parameters concerning angle range and resolutions of the corresponding descriptor functions for SO(3) rotation and scale are the same as in the original FMS implementation [19].

SO(3) Fourier Transform (SOFT) Registration at Low Frequency Layers
In order to obtain radially accumulated information for the 2D SOFT descriptor function (7), the spectral data is summed from r_s to r_e. The zero frequency of F(k) is supposed to be in the center N_VS/2 + 1. The limited radial resampling also reduces computation time. Radial processing starts at the first frequency r_s = 2 and ends at the chosen cutoff frequency f_c^{u,v,w} = 0.2π for all three frequency dimensions u_so, v_so, w_so using (5). This corresponds to r_e = Round(0.1 N_VS). The corresponding resolution for B in (4) is chosen to be 30% of the voxel grid resolution N_VS, which is a sufficient coverage of the low frequency layers. Figure 1a shows the sectional view of the 3D spectrum of an MRT scan without a prior window function. The intensity distribution shows that approximately the first 20% of the frequencies contain most of the energy; they shape the relevant and most essential visible structures of an object. Figure 1b shows an example of the outer layer used in (7). The extreme spectral amplitudes building the cross-shaped structures spreading out in 3D are typically an effect of discontinuities between the edges of a 3D MRT segment, as finite data segments are considered as a single period of a periodic signal. These fixed structures usually lead to misregistrations. Such spectral artefacts are suppressed by standard window functions [36]. The necessary 3D functions are easily generated by multidimensional convolution of the definition of the 1D function. For a complete MRT scan, the standard Hanning function achieves good results, and it is used throughout the experiments on the MRT subframes presented here. The application of window functions involves a trade-off between removing spectral artefacts and removing too much scan content. For sparse scans like a segmented bone, window functions with less attenuation can be used.
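The windowing and the low-frequency restriction can be sketched as follows. This is a simplified illustration with hypothetical helper names. The 3D window is built here as a separable outer product of the 1D Hann window, which is an assumption of this sketch, and radii are measured in bins from the centered zero frequency, so the index convention may be off by one relative to r_s and r_e in the text.

```python
import numpy as np

def hann_window_3d(n):
    """Separable 3D Hann window built from the 1D definition (an assumed construction)."""
    w = np.hanning(n)
    return w[:, None, None] * w[None, :, None] * w[None, None, :]

def low_frequency_radii(n_vs, fraction=0.1, r_start=2):
    """Radial range (r_s, r_e) with r_e = Round(fraction * N_VS)."""
    return r_start, int(round(fraction * n_vs))

def low_frequency_energy_fraction(volume, r_end):
    """Share of the windowed spectral energy within r_end bins of the zero frequency."""
    n = volume.shape[0]
    spec = np.fft.fftshift(np.fft.fftn(volume * hann_window_3d(n)))
    grid = np.arange(n) - n // 2
    u, v, w = np.meshgrid(grid, grid, grid, indexing="ij")
    radius = np.sqrt(u**2 + v**2 + w**2)
    energy = np.abs(spec) ** 2
    return energy[radius <= r_end].sum() / energy.sum()

rng = np.random.default_rng(4)
volume = rng.random((32, 32, 32))  # stand-in for an MRT subframe
window = hann_window_3d(32)
r_s, r_e = low_frequency_radii(100)
fraction = low_frequency_energy_fraction(volume, low_frequency_radii(32)[1])
```

On real scans, comparing `fraction` for different cut-offs is a quick way to see how much of the spectral energy a given low-frequency range retains.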
The Fourier transform of a function of a certain bandwidth BW is the collection of its Fourier coefficients from spherical harmonics, which provide an orthonormal basis for L^2(S^2) (see [37,38]). Hence, it specifies a resolution, which represents the discretization of the SO(3) rotation. In [19], several tests demonstrated that a bandwidth of BW = 64 is sufficient for an alignment on voxel discretizations of sizes around N_VS = 128, even for scans using the full resolution of the voxel representation. Note that the bandwidth affects only the precision of the SOFT registration and not the robustness of a correct rotational match. Figure 2 shows an example of SOFT descriptor functions according to (7) and registration peaks of successfully determined transformation parameters.

Scale Registration with a Restricted Mellin Transform
After the SO(3) rotation is determined, the scan data is re-rotated and both scans are rotationally aligned. The next registration step is scale, using a logarithmically deformed time axis for signals known as the Mellin transformation [39,40]. Using the definition of the Mellin transform, it can be shown that a signal v(z) and its counterpart v(αz) differ under scale changes only by a complex factor (see (15) with τ = αz). A detailed derivation is given in [19]. In summary, after resampling both signals to a logarithmic axis v(e^{−τ}), the scale factor is converted to a shift difference between the signals. The shift is actually visible in the 3D descriptor function shown in Figure 3a,b. This in turn explains the phase factor in (15). Note that after applying an FT to shifted signals, the only difference is a phase factor, which again can be determined using phase correlation (10). But when adapting the FMS method to only 20% of the frequency range of its signal representation, only a few discrete frequency bins can be used for the Mellin transformation.
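The conversion of scale into shift can be illustrated in 1D. The sketch below is a toy example under stated assumptions: the test signal `v`, the grid, and the scale factor are hypothetical; the logarithmic axis uses z = e^τ (the sign convention in the text is e^{−τ}); and the scale is chosen so that the resulting shift is a whole number of samples.

```python
import numpy as np

def v(z):
    """Hypothetical test signal: a smooth bump centered at z = 3."""
    return np.exp(-((z - 3.0) / 0.5) ** 2)

d_tau = 0.01                         # step of the logarithmic axis
tau = -2.0 + d_tau * np.arange(600)  # tau = ln(z)
true_scale = np.exp(25 * d_tau)      # scale that equals a 25-sample shift

sig1 = v(np.exp(tau))                # v(z) on the logarithmic axis
sig2 = v(true_scale * np.exp(tau))   # v(alpha z) on the same axis

# Circular cross-correlation via the DFT; the lag of the maximum is the shift.
corr = np.real(np.fft.ifft(np.fft.fft(sig1) * np.conj(np.fft.fft(sig2))))
shift_bins = int(np.argmax(corr))
estimated_scale = np.exp(shift_bins * d_tau)
```

With only an integer lag available, the recoverable scale values are quantized to multiples of `d_tau` in the exponent, which is the resolution problem that arises when only a few low-frequency bins are available.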
(x_s, y_s, z_s) = argmax_x q(x)   (11)
In [19], it was shown for very different types of sensor data that when using the full spectral resolution, a precise parameter determination is possible, even when the scans have only small overlap. If one just follows the LF-approach sketched above, the extremely limited signal cannot yield such precise scale results. In the following, we discuss methods to remedy this in LF-FMS, that is, to achieve an optimal scale parameter estimation from the limited signal representation.
The 3D descriptor function (16) is generated by resampling the aligned spectral magnitude according to (6). An example for a resulting pair is shown in Figure 3. For the cubic resolution of N_VS = 100 used in these experiments, the effective number of bins is theoretically 10, namely from bin 2 (the lowest frequency) to 11. The actual number of frequency bins used in our implementation is 16 in order to increase the resolution of the logarithmic grid. For a subpixel shift, the derivation in [41] showed that the POMF transfer function of resampled data consists of polyphase components of a filtered unit pulse. Figure 4b shows an exemplary Dirac peak for scale registration. In this example, both objects have a significant scale difference of 35%. The resulting scale parameter is 1.25 using a subpixel interpolation [41]. The integer index of the maximum itself indicates a scale of 1.18, an unacceptably large deviation from the true parameter.
Another option is to resample the spectral information at a higher rate, which leads to a finer grid. In combination with a lowpass designed according to a corresponding oversampling representation, the Dirac pulse from phase matching (10) turns into an SI function. This has an integrating effect over the entire signal when finding the maximum, which makes it more precise. It leads to a representation (13), which extends the frequency axis according to the oversampling factor. The range is defined as u, v, w = −π, ..., +π. F(u_me, v_me, w_me) denotes the 3D DFT of the Mellin descriptor f(θ_j, φ_k, m) defined in (16). Figure 3 shows an example of the Mellin descriptor pair of an ideal transformation. For the radial range of the 16 frequency bins at the low frequency range, the oversampling factor is set to L = 4, which leads to a descriptor resolution M = 64 in (6). In the discrete case, the SI function is a Dirichlet function (see [36,41]). A phase filter for translation is directly multiplied with the phase difference (10), where the filter is defined as a zero-phase lowpass filter (14) with an ideal cut-off frequency at π/L. Since this signal is finite, a zero-phase system [36] can be applied, which leaves all content at its original position. The result is that the Dirac pulse (Figure 4b) is converted to a Dirichlet function (Figure 4c). The interpolating and integrating effect of a lowpass filter processes the short signal sequence as a whole, which results in a more precise parameter detection.
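The effect of oversampling combined with an ideal zero-phase lowpass can be illustrated in 1D. The sketch below is a simplified stand-in, not filter (14) itself: embedding the phase spectrum in an L-times longer spectrum with zeros at the high frequencies acts as an ideal lowpass with cut-off π/L on the oversampled grid. The odd length n = 15 (instead of the 16 bins mentioned in the text) is an assumption that avoids special handling of the Nyquist bin; all names are illustrative.

```python
import numpy as np

n, L = 15, 4        # short signal length and oversampling factor
true_shift = 2.25   # hypothetical sub-bin shift

rng = np.random.default_rng(3)
x1 = rng.random(n)
k = np.rint(np.fft.fftfreq(n) * n).astype(int)  # integer frequency indices
# Shift x1 by a fractional amount using the DFT shift theorem.
x2 = np.real(np.fft.ifft(np.fft.fft(x1) * np.exp(-2j * np.pi * k * true_shift / n)))

# Phase difference (normalized cross-power spectrum).
cross = np.fft.fft(x2) * np.conj(np.fft.fft(x1))
phase = cross / (np.abs(cross) + 1e-12)

# Integer-index maximum: limited to whole bins.
coarse = np.real(np.fft.ifft(phase))
coarse_shift = int(np.argmax(coarse))

# Oversampled representation: the zero-filled high frequencies act as an
# ideal zero-phase lowpass, so the peak becomes a finely sampled Dirichlet function.
padded = np.zeros(L * n, dtype=complex)
padded[k] = phase   # negative indices wrap to the end of the array
fine = L * np.real(np.fft.ifft(padded))
fine_shift = int(np.argmax(fine)) / L
```

In this toy case the integer maximum reports 2 bins, while the oversampled peak lands on 2.25; for a scale axis, this difference translates directly into the scale error discussed above.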
The remaining two dimensions of F(u_me, v_me, w_me), which cover the angular range, are processed without oversampling but using a factor Q = 2, that is, a cut-off at π/2, for the 3D lowpass filter. High frequencies in the dimensions covered by the angular range θ_j, φ_k, corresponding to u_me, v_me, may contain spurious phase structures, which do not contribute to the correct positional information. The phase difference in (10) has the effect that high frequencies are considered with the same weight as low frequencies. The 1D signal ζ(z) displayed in Figure 4 is extracted from the inverse FT of matching both Mellin descriptor functions according to (12). Here, q_f is defined as the inverse FT of phase matching (10) from F_1(u_me, v_me, w_me) and F_2(u_me, v_me, w_me). In all multidimensional signals (size N), the zero frequency is supposed to be in the center N/2 + 1. Table 1 shows the results comparing a simple integer index from the maximum, subpixel interpolation, and the maximum of the oversampling representation with a subsequent filter (14). A simple integer index from the maximum leads to coarse steps with correspondingly large deviations from ground truth. A subpixel interpolation achieves better results but still with significant deviations. One reason is that the underlying theory of downsampled signals [41] does not completely fit the logarithmically resampled signal. The integrating effect of the oversampled structure results in more precise parameters, which are close to ground truth, especially when considering the limited signal information.
Table 1. Accuracy comparison of different methods for scale parameter determination. Although the signal information is very limited, the oversampled and filtered parameter determination achieves sufficiently precise results for an object registration and subsequent identification.

Scale | Integer Index | Subpixel Interpolation | Oversampling + Filter

Translation and True-False Detection
After a correct determination of all other transformation parameters, translation is finally determined according to (10) and (11). For a distinction between a successful and an erroneous registration, a signal-to-noise ratio is calculated from a certain area around the detected maximum from phase correlation (10). Using the maximum from (11), a range N_ps in all three dimensions is cut from q(x), denoted as S_peak(x). A parameter of N_ps = 5 is chosen as it covers the main lobe of the SI function according to the resolution N_VS and the processed LF frequency range. Note that the Dirac peak is widened due to the low frequency processing. A zero-phase lowpass filter with cut-off π/4 for all 3 dimensions is used in this implementation. SN_threshold is then calculated as the overall sum divided by the corresponding area (17). In case of strong correspondences between two scans, SN_threshold values between 1000 and 2000 are easily achieved.

Experiment Data
The data of the presented use-case consists of MRT scans of the human hand region of living probands [29]. The data is recorded at several positions while moving the hand from the outer left to the outer right position. The progress of the 3D rotation during the body movement is of interest. Figure 5 shows two scans of a human hand in the corresponding outer positions of the movement sequence. For the experiments, MRT scans of three persons are available. Two different bones of the human carpal bones are used in the experiments. The bone Capitatum is denoted as B1 and the bone Scaphoideum as B2. Figure 6 shows a representation of B1 of all persons generated from all available positions using the LF-FMS registration for the alignment. Figure 6d shows the average of the three bones from Figure 6a-c, again using LF-FMS registration. The same average representations are generated for the bone B2, which is more complex in its shape, as also shown in Section 6.2. The averaged representations already indicate that LF-FMS can reasonably register scans of the two bones for different individuals and for different hand positions. But more importantly, these averaged registration results will serve as templates in more challenging experiments presented in the following.

Division into Subframes
LF-FMS can cope with significant scale changes. Nevertheless, it makes sense to roughly partition the data in which a template is to be found such that there is a rough correspondence between the size of each partition and that of the template. Figure 5 shows the full voxel images of MRT scans, which will be searched with LF-FMS registration for certain structures that may vary by 7-dof (3D rotation, scale and translation). Hence, this region is divided into segments of roughly the size of the corresponding template. Figure 7a shows the template bone B1 (Figure 6). The overlay of tissue and the template after a successful registration is shown in Figure 7c. Figure 7d shows the overlay within the complete MRT scan.
An S/N ratio from phase correlation at the final step of determining the 3D translation (10) is calculated according to (17) and then saved in an array for a maximum search. Figure 7b shows the corresponding division of the MRT scan in terms of the S/N ratio. At the point of the maximum, the overlap between the reference template and the corresponding subframe is optimal. The decreasing amplitude around the maximum is due to less overlap at the step of translational correlation or even imprecise transformation parameters (rotation, scale), which in turn lead to a less precise translational phase correlation. The maximum corresponds to the correct frame and its corresponding transformation parameters as shown in Figure 7. The example in Figure 8 demonstrates a difficult case where the corresponding bone lies partially outside of the scan region (Figure 8a). Figure 8b shows the corresponding 3D registration peak, which is nevertheless still a decent peak. But it can be observed that the S/N is lower than for a perfect peak of objects that nicely coincide after registration (compare Figure 2c). This also motivates the fading out of S/N values that can be observed in the grid of the search region (compare Figure 7b).
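The subframe search can be sketched as a grid of cubic cut-outs that are scored one by one, followed by a maximum search over the scores. In the sketch below, the score is a plain callable; in LF-FMS it would be the S/N ratio (17) obtained from registering the template against the subframe, which is not reproduced here. The function names, the step size, and the toy score are assumptions for illustration.

```python
import numpy as np

def iter_subframes(volume, size, step):
    """Yield (corner, cube) for cubic subframes of edge `size`, moved by `step` voxels."""
    nx, ny, nz = volume.shape
    for x in range(0, nx - size + 1, step):
        for y in range(0, ny - size + 1, step):
            for z in range(0, nz - size + 1, step):
                yield (x, y, z), volume[x:x + size, y:y + size, z:z + size]

def best_subframe(volume, size, step, score):
    """Score every subframe and return the corner and score of the maximum."""
    corners, scores = [], []
    for corner, cube in iter_subframes(volume, size, step):
        corners.append(corner)
        scores.append(score(cube))
    best = int(np.argmax(scores))
    return corners[best], scores[best]

# Toy volume: one bright block standing in for the bone of interest.
volume = np.zeros((32, 32, 32))
volume[16:24, 8:16, 0:8] = 1.0

# Toy score (sum of intensities) standing in for the registration S/N ratio.
corner, value = best_subframe(volume, size=8, step=8, score=lambda cube: cube.sum())
```

A step smaller than the subframe size gives overlapping subframes, which would produce the gradual fading of scores around the maximum described above.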

Experiments and Results
The following experiments compare the movements of two carpal bones of three probands. First, experiments are made using one bone as reference, matching it within the body tissue, that is, with the complete scan containing all neighboring bones. The experiments are then divided into two groups, using a reference bone from the same person and by using an extrinsic bone, that is, by matching a segmented template bone from one person in the scans of different other persons.
In addition to the calculation of the corresponding parameters of the hand movement in the real-world data, simulated transformations are generated in order to compare the registration results with ground truth parameters. Furthermore, the simulated transformations allow covering a larger range for all parameters (rotation, scale).
Last but not least, we also present experiments that illustrate limitations of LF-FMS.

Bone B1: Capitatum
The example in Figure 9 illustrates the necessity as well as the feasibility of detecting scale information, due to the different sizes of the same bone structures across different test persons. Two different registrations are carried out. The first is between the segmented bone B1 of person 1 and the segmented bone of person 2. The second registration is with the complete tissue of person 2 containing this bone. The scale registrations shown in Figure 9c,d show similar results, although in one case the bone is segmented and in the other it is surrounded by the complete body tissue. The scale peak of the Mellin transform is smeared since the shape and the size of this bone do not exactly conform between the different test persons. Figure 9e shows the 3D peak after complete alignment between the different persons. In a further experiment, the scale is set to one. The smeared and noisy 3D peak (Figure 9f) is a clear indication of the necessity of scale detection, even when only considering a 6-dof rigid registration.
In the following, the term original rotation denotes the result of the FMS registration using segmented data as described in Section 5. The correctness of the determined transformation parameters can be assessed according to the overlay representation of the corresponding bones in their separate positions of the hand movement. Figure 10 shows the movement of bone B1 in all test persons. The plots show only the yaw rotation, which is the only significant B1 angle within the hand movement. Roll and pitch undergo only minor changes within ±2°.
In Figure 10a, results are shown where the registration is done with segmented bones of all positions as a reference. The bone structures are segmented in this case at all hand positions. Then, LF-FMS is applied on the single segmented bones. The results are used for the generation of an average bone structure template from the three test persons. Figure 10b shows plots of a registration between the average bone B1 (Figure 6d) within the complete body tissue of the hand movement. The progress of the curves is nearly identical to the reference.
In the next experiment, a registration is carried out with a segmented bone from one person used as template and the whole body tissue of a different person. Figure 11 shows the results for persons 2 and 3 using the center position from person 1 as reference. The resulting range and shape of the motion curves are very close to the previous results (Figure 10). Figure 12 shows results using artificial transformations. As mentioned, this allows an exact ground truth analysis. The segmented bone from one person (center position of hand movement) is used as reference, which is then rotated/scaled and matched within the tissue of a different person. When ignoring scale, that is, setting it to 1, the results even indicate reasonable 6-dof parameters. Only the sequence with 90% scale shows a noticeable deviation for pitch when ignoring the scale parameter.
Figure 9. Example for scale registration. The top of (a,b) shows a recognizable difference in bone sizes between two test persons. The peaks shown in (c,d) indicate a scale difference of 6.5% and 8%, respectively. The clear 3D peak (e) after an alignment of both scans precisely determines the correct transformation. This is in contrast to the imprecise 3D registration peak (f) when the scale is disregarded (scale assumed to be 1).
Although the bone B1 has a relatively uniform shape across different human individuals, its two main regions differ in size. Despite these differences, the registration results are quite precise and in particular stable, that is, there is always a clear maximum indicating a successful registration.

Bone B2: Scaphoideum
Experiments with the second bone B2 are more challenging due to two reasons. The shape of the bone is more oblong with more differences across the human individuals (see Figure 13). The second reason is a tilt of this bone during the hand movement in contrast to a simple yaw rotation in the previous experiment. Figure 13 shows the comparison of two persons with different hand positions. The shape of the bone shows slight modifications in the MRT reconstruction of each corresponding hand position. The following experiments show that very similar motion trajectories can be determined under different conditions. First, the original rotation is again determined from the segmented bone sequence. A second sequence is determined by a registration within the full tissue using the segmented bone B2 from the center hand position as reference.
The results are shown in Figure 14. The rotation parameters coincide in their principal movement. All parameters show a yaw rotation moving the hand from left to right and a concurrent tilt of this bone with a changing roll angle. The pitch angle remains constant at small values. Finally, two different artificial transformations using bone B2 are generated. For the transformation, the same parameters are used as in Section 6.1. Figure 15a shows the results using the segmented bone of the center from the hand movement of person 1 as reference. The corresponding MRT scan as search region is chosen from a different hand position of the same person. The resulting parameters are very close to the ground truth parameters. Figure 15b shows the results of a transformation where the segmented bone from person 1 as reference is matched within the full MRT scan of person 2. For the yaw angle, higher deviations of up to 8° occur at rotations of 20° and 30°; the same holds for the scale deviations at these artificially generated angles. Roll and pitch results are always close to ground truth. Considering the shape differences of the involved bones B2 across different human individuals, the results are very reasonable and they are based on clear maxima indicating registration success.

Strengths and Limitations of LF-FMS
Alternative registration methods like Go-ICP [22] and spectral methods, for example, the principal axis approach [23][24][25], fail for the experiments described above, that is, they either produce alignments that clearly do not correspond or generate random distributions of errors in the ground truth experiments. To be fair, it has to be stressed that these methods, like almost all state-of-the-art registration methods, are 6-dof approaches, that is, they do not take scale into account. But even the original version of the 7-dof FMS algorithm fails on this data. Nevertheless, LF-FMS also has its limitations, which are illustrated in the following experiments in comparison to FMS.
The RIRE dataset [42] contains different scans recorded with different imaging methods plus MRT scans with different weighting functions. A comparison using the standard FMS (all frequency layers) and the LF-FMS introduced here reveals significant differences including limitations of LF-FMS. Figure 16a,b show that the same scan with weightings T1 and T2 contains the same structures, but with completely different intensities. These larger areas of different intensities are represented by lower frequencies, which is the reason that LF-FMS fails for this data. In turn, the standard FMS works due to the common structures (skull, tissue) in both scans, which contribute to higher frequencies. The resulting overlay of both scans is shown for T1/MPRAGE in Figure 16d and T1/T2 in Figure 16e. The overlays (Figure 16b,c) show successful registration from the sequence shown in Figure 18a. Figure 16f shows an exemplary peak result with a clear maximum at one voxel, while peak results from LF-FMS are in general distributed over a broader range (see Figure 2c).
In contrast, Figure 17 shows an example emphasizing again the strengths of LF-FMS. The PET scan is a fuzzy representation of mainly the brain, which contains many interferences (Figure 17b). Since the T1 MRT scan is similar in its intensity distribution, the registration is successful over a range of up to yaw = 20°, roll = 20°, and pitch = 20°. Figure 18b shows the results (deviation from ground truth in degrees) of a sequence of artificial transformations, which yield reasonable results considering the fuzzy shape of the PET scan, which in addition has only partial overlap with the T1 scan. The registration results shown in Figure 18 demonstrate the different strengths of both FMS variants. The sequence of the T1/T2 registration demonstrates that precise rotational registrations with less than 1° error are achieved with FMS on detailed data, even under high variances in intensities. LF-FMS performs very well on very coarse, fuzzy data with interferences like the PET scan, which can be successfully registered with a T1 MRT scan.
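The difference between a single-voxel maximum and a broader peak can be illustrated with a one-dimensional phase correlation. The following minimal sketch is not the FMS pipeline; it only assumes that restricting the spectrum to low frequencies acts like a band limit on the correlation result. The signal, the shift, and the cut-off `k_max` are hypothetical values chosen for illustration.

```python
import cmath
import random

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, adequate for a short demo."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def phase_correlation(a, b, k_max=None):
    """Correlation result of b against a.

    k_max=None uses all frequencies; otherwise only frequencies with
    |k| <= k_max contribute (low-frequency restriction).
    """
    n = len(a)
    A, B = dft(a), dft(b)
    R = []
    for k in range(n):
        cross = B[k] * A[k].conjugate()
        keep = k_max is None or min(k, n - k) <= k_max
        # Normalizing the magnitude keeps only the phase information.
        R.append(cross / (abs(cross) + 1e-12) if keep else 0.0)
    return [v.real for v in dft(R, inverse=True)]

def peak_and_width(r):
    """Index of the maximum and number of samples above half of it."""
    peak = max(range(len(r)), key=lambda i: r[i])
    width = sum(1 for v in r if v >= 0.5 * r[peak])
    return peak, width

rng = random.Random(0)
N, SHIFT = 64, 5                                # hypothetical demo values
a = [rng.random() for _ in range(N)]
b = [a[(t - SHIFT) % N] for t in range(N)]      # circularly shifted copy of a

full_peak, full_width = peak_and_width(phase_correlation(a, b))
low_peak, low_width = peak_and_width(phase_correlation(a, b, k_max=4))
print("all frequencies: peak at", full_peak, "width", full_width)
print("low frequencies: peak at", low_peak, "width", low_width)
```

In this toy setting both variants locate the same shift, but the band-limited result spreads over several samples. This is consistent with the observation that LF-FMS peaks are broader and hence less precise, while being less sensitive to fine detail.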

Conclusions
In this article, registration with Fourier-Mellin-SOFT (FMS) is adapted to a restricted range of 3D low frequencies (LF-FMS) to be able to register the low-detail, basic shape of anatomical structures with 7 degrees of freedom, that is, 3D rigid motion plus scale. It is hence a possible solution in cases where feature-based methods or variants of the Iterative Closest Point (ICP) algorithm fail, for example, due to substantial variations across different individuals, limitations in the imaging resolution, or large interfering structures, that is, data other than the object(s) of interest themselves. A further possible application scenario is the use of LF-FMS for registration of data from different imaging modalities, as long as there are no high variances in intensity. The presented use case is the recognition and tracking of carpal bones of the human hand. It is shown that an extrinsic reference bone, that is, a segmented template, can, among other things, be registered within full MRT scans of the hand region of different individuals.

Institutional Review Board Statement:
The study was conducted using 3rd party data and approved by the Ethics Committee of Jacobs University Bremen (approval number 2021_04).

Informed Consent Statement:
The study was conducted using 3rd party data for which informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
Contact information for access to the data presented in this study is available on request from the corresponding author. The data is not publicly available to ensure proper handling under data protection regulations.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.