2.1. Projection and Optimization Parameters
The algorithm uses an intrinsic coordinate system bound to the detector plane. The three vectors that define the position and orientation of the acquired image are shown in Figure 1. The first vector points from the middle of the detector to the source, and its length is the distance between the detector and the source. The other two vectors describe the direction and spacing of the detector elements: they point from the center of one detector element to the center of its left and top neighbor, respectively, and the length of each vector is the spacing between elements.
The three unit vectors x, y, and z that make up the coordinate system are parallel to these three vectors, respectively, and the origin is the isocenter, which is the center of the CT image. These vectors are also used for all movements and rotations: there are three rotations, one around each of the vectors x, y, and z, and three translations along these same vectors.
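For illustration, the unit axes of this coordinate system can be obtained by normalizing the three geometry vectors. This is only a sketch; the function name and the assignment of each geometry vector to an axis are assumptions, not taken from the original implementation:

```python
import numpy as np

def unit_axes(detector_to_source, col_spacing_vec, row_spacing_vec):
    """Normalize the three detector-bound geometry vectors into unit axes.

    Assumed mapping: detector-element direction vectors -> x and y,
    detector-to-source vector -> z.
    """
    vs = (col_spacing_vec, row_spacing_vec, detector_to_source)
    return tuple(np.asarray(v, dtype=float) / np.linalg.norm(v) for v in vs)
```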
2.2. Feature Point Matching
The algorithm depends on feature points, which are found in each image using the AKAZE [22] algorithm. The AKAZE parameters were as follows: threshold 0.0005, four octaves, and five octave layers. AKAZE also generates a descriptor vector for each feature point, and two points can be compared via the Hamming distance of their descriptors. To find the matching features between two images, the Hamming distances between all feature descriptors of one image and all descriptors of the other image are calculated. Then, for every feature in the calibration image, the feature with the lowest Hamming distance d1 and the one with the second lowest distance d2 are selected. Lowe's ratio test [23] is applied to these two distances: it checks, with a ratio r, that the best match is substantially better than the second best (d1 < r · d2). When the test succeeds, the feature point with the smaller distance forms a matching pair of feature points; otherwise, no matching pair is found for this feature.
Afterward, all found pairs are filtered: pairs that match to the same feature point are removed. The last step discards pairs for which the Euclidean distance between the matched points deviates by more than one standard deviation from the mean distance of all pairs.
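The matching and filtering steps above can be sketched with plain numpy. In practice, the binary descriptors would come from OpenCV's AKAZE detector; the function name and the default ratio value here are illustrative assumptions:

```python
import numpy as np

def match_features(desc_cal, desc_ref, pts_cal, pts_ref, ratio=0.7):
    """Match binary descriptors by Hamming distance with Lowe's ratio test,
    then drop duplicated targets and Euclidean-distance outliers.

    desc_* are uint8 arrays of packed binary descriptors; pts_* are the
    corresponding 2-D feature point coordinates.
    """
    # Hamming distance between every calibration/reference descriptor pair
    xor = desc_cal[:, None, :] ^ desc_ref[None, :, :]
    ham = np.unpackbits(xor, axis=2).sum(axis=2)
    pairs = []
    for i in range(ham.shape[0]):
        order = np.argsort(ham[i])
        d1, d2 = ham[i, order[0]], ham[i, order[1]]
        if d1 < ratio * d2:  # Lowe's ratio test
            pairs.append((i, int(order[0])))
    # remove pairs that match to the same reference feature point
    targets = [j for _, j in pairs]
    pairs = [(i, j) for i, j in pairs if targets.count(j) == 1]
    # discard pairs more than one std. dev. from the mean point distance
    dists = np.array([np.linalg.norm(pts_cal[i] - pts_ref[j])
                      for i, j in pairs])
    keep = np.abs(dists - dists.mean()) <= dists.std()
    return [p for p, k in zip(pairs, keep) if k]
```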
2.3. Algorithm
An overview of the estimation process is shown in Figure 2 and is explained in more detail in the following sections.
The algorithm is initialized with the acquired images, a prior CT, and geometry information about the size of the detector, as well as the distance to the source and to the isocenter. The first step is to generate simulated projections on a regular grid around the center of the prior CT image. The grid has 95 points each for the rotations around the x- and y-axis, which results in a grid spacing of 3.8°. Further, three different detector rotations are used: 0°, 120°, and 240°. On each of these images, the AKAZE algorithm detects features and extracts the feature descriptors. The simulated projections are then deleted, and only the feature points, descriptors, and projection rotations (around the x-axis, around the y-axis, and of the detector) are saved. They are calculated once and then reused for all further calibrations.
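As a rough sketch, the grid of simulated projection parameters can be enumerated as follows. The centering of the 95 steps around zero and the tuple layout are assumptions made for illustration:

```python
import numpy as np

def build_projection_grid(n_steps=95, spacing_deg=3.8,
                          detector_rotations=(0.0, 120.0, 240.0)):
    """Enumerate the rotation parameters of the simulated projection grid."""
    # rotation offsets around the x- and y-axis, centered on the prior CT
    offsets = (np.arange(n_steps) - n_steps // 2) * spacing_deg
    # one entry per (x-rotation, y-rotation, detector rotation) combination
    return [(rx, ry, rd)
            for rx in offsets
            for ry in offsets
            for rd in detector_rotations]
```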
To find the approximate position of an acquired image, the algorithm first detects its features. These are then matched with each set of features from the simulated grid projections, and the matched feature points are counted. To save time, the algorithm first scans this grid with a step size of four and selects the five grid points with the most matched feature points. Next, the grid points surrounding these five points are compared to the acquired image in the same way. For the grid point with the most matched pairs, the projection rotations are returned. These are the current approximations of the rotations around the x-axis, around the y-axis, and of the detector (Figure 3a).
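The two-stage search can be sketched as follows, assuming the per-grid-point match counts are stored in a 2-D array (one axis per rotation). The exact neighbourhood definition around each coarse candidate is an assumption:

```python
import numpy as np

def coarse_to_fine_search(match_counts, step=4, top_k=5):
    """Two-stage search over a 2-D grid of matched-feature counts.

    First scan every `step`-th grid point, keep the `top_k` best candidates,
    then examine the grid points surrounding them and return the index of
    the grid point with the most matches.
    """
    h, w = match_counts.shape
    coarse = [(match_counts[i, j], (i, j))
              for i in range(0, h, step) for j in range(0, w, step)]
    coarse.sort(reverse=True)
    best = None
    for _, (ci, cj) in coarse[:top_k]:
        # examine the neighbourhood of each coarse candidate
        for i in range(max(0, ci - step + 1), min(h, ci + step)):
            for j in range(max(0, cj - step + 1), min(w, cj + step)):
                cand = (match_counts[i, j], (i, j))
                if best is None or cand > best:
                    best = cand
    return best[1]
```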
This selected grid position is most likely the closest to the target position, but it can also be a projection from the opposite side. These projections, simulated from the wrong side, will be corrected later.
Before that, the detector rotation is approximated. The average of the feature point coordinates is used as the center point; this is computed separately for the real and the simulated image. The coordinates of the feature points are then converted to polar coordinates, using the averaged center point as the origin. The angle of each feature point in the simulated image is subtracted from the angle of the matching feature point in the real image. The median of these differences is the new approximate detector rotation. Listing 1 shows this in pseudocode.
Listing 1. Approximation function for the detector rotation.

def approximate_detector_rotation(current_parameters):
    # simulate projection and track features
    simulated_projection = ForwardProjection(current_parameters)
    simulated_feature_points = trackFeatures(simulated_projection,
                                             real_projection)
    # calculate the center point of each point set
    simulated_mid = mean(simulated_feature_points, axis=0)
    real_mid = mean(real_feature_points, axis=0)
    sim_points = simulated_feature_points - simulated_mid
    real_points = real_feature_points - real_mid
    # calculate the angle difference per matched point pair
    angles = (arctan2(sim_points[:, 0], sim_points[:, 1])
              - arctan2(real_points[:, 0], real_points[:, 1])) * 180.0 / PI
    angles[angles < -180] += 360
    angles[angles > 180] -= 360
    detector_angle = median(angles)
    # test in which direction to rotate
    proj = ForwardProjection(applyRotation(current_parameters,
                                           0, 0, -detector_angle))
    points = trackFeatures(proj, real_projection)
    diffn = points - real_feature_points
    proj = ForwardProjection(applyRotation(current_parameters,
                                           0, 0, +detector_angle))
    points = trackFeatures(proj, real_projection)
    diffp = points - real_feature_points
    if sum(abs(diffn)) < sum(abs(diffp)):
        return -detector_angle
    else:
        return detector_angle
A similar approach is taken to correct projections approximated from the wrong side. Four projections are simulated with different rotation parameters: besides the current approximation, the approximate rotations with a detector rotation offset of 180°, and projections simulated from the opposite side by mirroring the rotation around the x-axis and around the y-axis, respectively. For these four projections, features are detected and matched, and the projection with the lowest mean Euclidean distance between the matched points is used.
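The selection among candidate parameter sets reduces to picking the one with the lowest mean matched-point distance. In this minimal sketch, `match_fn` is a hypothetical helper standing in for the simulate-and-match step:

```python
import numpy as np

def pick_best_side(candidates, match_fn):
    """Choose the candidate rotation parameter set whose simulated projection
    matches the real image with the lowest mean point distance.

    `match_fn(params)` returns two equally long arrays of matched 2-D points
    (simulated, real).
    """
    best_params, best_err = None, np.inf
    for params in candidates:
        sim_pts, real_pts = match_fn(params)
        # mean Euclidean distance between matched point pairs
        err = np.mean(np.linalg.norm(sim_pts - real_pts, axis=1))
        if err < best_err:
            best_params, best_err = params, err
    return best_params
```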
The result of this first step is a rough calibration, which is further refined in the next step. The translational misalignment is corrected using the method described by Tönnes et al. [10]. The median Euclidean distance between the matching points is used to move the projection in the x and y directions. The z-translation is corrected by calculating the distances between the feature points within each image; dividing the distances of one image by those of the other yields a zoom factor. This ratio is multiplied by the distance between the source and the isocenter to give the new distance.
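A sketch of the translational corrections, assuming matched point arrays of equal length. The per-axis median offset and the median of the pairwise-distance ratios are illustrative estimator choices; the paper does not specify them exactly:

```python
import numpy as np

def correct_xy_translation(real_pts, sim_pts):
    """Median 2-D offset between matched points gives the x/y shift."""
    return np.median(real_pts - sim_pts, axis=0)

def correct_z_translation(real_pts, sim_pts, source_isocenter_dist):
    """Estimate the new source-isocenter distance from the apparent zoom."""
    def pairwise(pts):
        # all pairwise point distances within one image (upper triangle)
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
        return d[np.triu_indices(len(pts), k=1)]
    # ratio of distances in one image to those in the other = zoom factor
    zoom = np.median(pairwise(sim_pts) / pairwise(real_pts))
    # scaling the source-isocenter distance by the zoom corrects z
    return zoom * source_isocenter_dist
```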
After correcting the translations along the x-, y-, and z-axis, the previously described procedure for correcting the detector rotation is applied once more (Figure 3b).
The resulting parameters can then be used to run a state-of-the-art calibration algorithm and fully calibrate the trajectory (
Figure 3c).
2.4. Image Data
In this paper, the data from Tönnes et al. [10] were used, which were obtained from a CT scan of a lumbar spine phantom with an inserted metal object. The reconstructions can be seen in Figure 4.
Furthermore, a sinusoidal trajectory, acquired shortly after the abovementioned CT scan, is used; the phantom was not moved in between. The sinusoidal trajectory was acquired in a step-and-shoot mode: the C-Arm is moved to each of the 161 positions on the trajectory, and a single X-ray image is acquired with the standard protocol "P16_DR_L" at 70 keV, with the mAs controlled by the Artis Zeego System.
The third trajectory was acquired in the continuous acquisition mode, moving the C-Arm during acquisition. This trajectory has the problem mentioned in the introduction: it does not contain positional information for the individual frames. It is an arc around the object tilted by 28°, acquired at 70 keV and 30 frames per second. The exposure time and tube current are managed by the Artis Zeego System; the average pulse width is 3.5 ms, with an average current of 35 mA. The trajectory consists of 666 individual projections.
All three sinograms are shown in Figure 5. Since the sinograms are three-dimensional, only two slices are shown for each, cut horizontally and vertically through the center of the individual projections.
2.5. Evaluation
To evaluate the quality of the estimator, the estimated parameters were used as inputs for the FORCASTER [
10] algorithm and the algorithm by Oudah et al. [
17] (which uses a CMA-ES minimizer with the normalized gradient information (NGI) as the objective function).
The calibrated trajectories are then reconstructed with the FDK algorithm, which is part of the astra toolbox [
24]. The images are cropped to the field of view, and no further post-processing is performed. The calibrated parameters are also used to generate a forward projection from the prior image; this simulated sinogram is compared to the simulated forward projections of the state-of-the-art calibration algorithm. The continuous acquisition sinogram is compared to the acquired data, since there are no correctly calibrated parameters for it.
2.5.1. Metrics
The structural similarity index (SSIM) [25] (Equation (1)), the normalized gradient information (NGI) [26], and the normalized root mean squared error (NRMSE) (Equation (2)) are evaluated on the projections simulated with the parameters of the calibrated trajectory, in comparison to the forward projections of the reference calibration.
In these equations, μx is the mean value of x; σx is the standard deviation; c1 and c2 are constants; N is the number of voxels; and ‖x‖F is the Frobenius norm of x.
The reconstructions obtained after calibration and cropping to the field of view are also compared using the same metrics. Additionally, the vertical part of the large metal object in the center of the phantom is segmented, and the Dice coefficient of the segmentations is calculated. All metrics are applied to each 2D slice of the images and then averaged.
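Two of these metrics are simple enough to state directly. A minimal numpy version follows; note that the normalization of the NRMSE by the reference intensity range is an assumption, since conventions vary:

```python
import numpy as np

def nrmse(ref, est):
    """Root-mean-squared error normalized by the reference intensity range."""
    rmse = np.sqrt(np.mean((ref - est) ** 2))
    return rmse / (ref.max() - ref.min())

def dice(seg_a, seg_b):
    """Dice coefficient between two binary segmentations."""
    inter = np.logical_and(seg_a, seg_b).sum()
    return 2.0 * inter / (seg_a.sum() + seg_b.sum())
```

Per the evaluation protocol above, such metrics would be evaluated slice by slice and then averaged over the volume.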
2.5.2. System Specifications
The algorithms were run on a system with an AMD Ryzen 9 7900X CPU, 128 GB RAM, and an NVIDIA GeForce RTX 2070 SUPER GPU. Since each projection is independent of the others, the algorithm can be easily parallelized; in this paper, ten parallel processes were used. Python version 3.9.9 was used with the following packages: astra-toolbox 2.0 [24], scipy 1.7.3, skimage 0.19, numpy 1.21.4, and opencv 4.5.4.