Towards a Guidance System to Aid in the Dosimetry Calculation of Intraoperative Electron Radiation Therapy

In Intraoperative Electron Radiation Therapy (IOERT), the lack of specific planning tools limits its applicability, the most critical need being accurate dosimetry estimation and application during the therapy. Recently, some works have tried to overcome some of the limitations for establishing planning tools, though an accurate guidance system that tracks the applicator and the patient in real-time is still needed. In these surgical environments, the acquisition of an accurate 3D shape of the patient’s tumor bed in real-time is of high interest, as current systems make use of a 3D model acquired before the treatment. In this paper, an optical-based system is presented that is able to register, in real-time, the different rigid objects taking part in such a treatment. The presented guidance system and the related methodology are highly interactive, and a usability design is also provided for non-expert users. Additionally, four different approaches are explored and evaluated to acquire the 3D model of the patient (a non-rigid object) in real-time, where accuracies in the range of 1 mm can be achieved without the need for expensive devices.


Introduction
Intraoperative electron radiation therapy (IOERT) refers to the delivery of radiation to the postresected tumor bed, or to an unresected tumor, during a surgical procedure [1]. Running an IOERT program involves several aspects from the institutional point of view because it is necessary to organize structural and human resources. Conventional or mobile linear accelerators are used for IOERT. A multidisciplinary group of surgeons, anesthetists, medical physicists, radiation oncologists (ROs), and technical and nursing staff has to be involved. However, treatment planning has not been available in IOERT up to now [2]. In current clinical practice, all necessary parameters, such as the applicator diameter, bevel angle, position, and electron beam energy, are decided by the RO in real-time, with high dependence on accumulated expertise [3]. Dosimetric calculations are performed in real-time, immediately before the administration of the treatment, from the prescribed total dose and its distribution to a certain depth, depending on the findings during the surgical procedure. This means that a reliable estimate of dosimetry cannot be made, nor can the results be rigorously evaluated (the full extent of the tumor bed, percentage of healthy tissue irradiated, etc.). Thus, local effects (tumor control and/or toxicity) cannot be explained or fully controlled with the planning data and depend to a large extent on the specialist's learning curve.
Different techniques have been proposed in recent years to position the devices that form part of intraoperative or radiology scenarios, such as mechanical, ultrasound, electromagnetic, or optical approaches. Mechanical systems are usually based on an articulated arm in which the articulation angles can be measured to calculate the instrument position [4,5]. The main advantage of these devices is that there is no need for a direct line of sight between the instrument and the signal transducer. Nevertheless, these systems are usually voluminous, can only position a single device, and present sterilization problems. Other techniques make use of ultrasound transducers [6][7][8]. However, due to their inherent dependency on humidity and ambient temperature, they do not offer the required accuracies. These limitations can be overcome by the use of electromagnetic systems, some of which are used and/or tested in [9][10][11]. However, these systems still have the disadvantage that the results are affected by the electromagnetic fields of surrounding metallic objects, although recent improvements in the technology can significantly reduce this effect.
On the other hand, optical-based tracking systems do not suffer from the aforementioned limitations. One of the world's leading optical tracking systems for image-guided surgical applications is the Polaris system from Northern Digital Inc., which is based on near-infrared (NIR) imaging and is capable of simultaneously tracking both active and passive markers. Examples of this and other related optical tracking systems used in image-based surgery can be found in [12][13][14][15][16], and a comparative study on different optical tracking systems is presented in [17]. In [18], an integrated system is used that combines an optical system with an articulated robotic arm. The main drawback of optical systems is that they require a direct line of sight between the sensing devices (cameras) and the sensed objects (usually artificial or natural marks), but they have the advantages that they can be more accurate than other technologies and do not suffer from more critical problems (e.g., humidity, temperature, or magnetic field dependency).
The purpose of the present study is to provide the necessary methodology to spatially position the different objects that are present in an IOERT scenario, namely the applicator and the patient, the latter being a non-rigid object. In the presented work, the direct line of sight between cameras and tracked objects is critical, as a collaborative space such as that of an IOERT will bring many camera-to-object occlusions caused by the natural movements of the medical staff taking part in the surgery. Nevertheless, this problem can be minimized, or even avoided, by studying the optimal locations for the cameras and by increasing the number of sensors, in such a way that there is always a minimum set of cameras (usually two) with a direct view of the object. Additionally, a 3D model of the area of the patient affected by the tumor is required. We aim to investigate the appropriate sensing devices that need to be installed in the operating room in order to register the applicator with six degrees of freedom (DoF) and to acquire the 3D shape of the patient's tumor bed. To that end, four different approaches are evaluated and compared.

Overall Requirements
As explained before, an accurate guidance system to aid in the calculation of the dosimetry is still needed. Furthermore, the 3D shape of the patient's affected area needs to be registered in real-time in order to be used instead of that provided by CAT, which is usually acquired some weeks before the treatment and certainly with the patient in a different position from the one kept during the treatment. In order to design both a methodology and a guidance system, some specific requirements need to be fulfilled. First of all, the system has to be based on optical technology, so as to avoid the limitations of other technologies. Secondly, the system resolution has to be such that objects greater than 3.0 mm can be identified, and the overall accuracy of the system must be kept below 2.5 times the system's resolution (i.e., 7.5 mm). These values have to be kept for a working range (object-to-sensor distance) between 1.0 and 2.0 m.
Such a system has to operate in real-time and be adaptable to the needs of the treatment, i.e., some parameters would need to be recalculated in real-time according to the needs arising during the IOERT. For instance, the acquisition of the 3D shape of the patient's postresected tumor bed and/or of the unresected tumor will be an input to recalculate the dose to be delivered at specific locations. At the same time, the system has to account for the special characteristics and restrictions of the operating room (which will be described in the next section), while being compact and robust in order to avoid any possible interference with the medical staff and/or the treatment.
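The numeric requirements above can be written down as a small check. The following is only a sketch: the function and constant names are hypothetical, and the two measurements passed in the example are illustrative values, not results from this work.

```python
# Requirement values taken from the text; all names are hypothetical.
RESOLUTION_MM = 3.0            # smallest object size that must be identifiable
ACCURACY_FACTOR = 2.5          # accuracy bound as a multiple of the resolution
WORKING_RANGE_M = (1.0, 2.0)   # allowed object-to-sensor distance

def accuracy_bound_mm(resolution_mm=RESOLUTION_MM, factor=ACCURACY_FACTOR):
    """Upper bound on the overall system error, in millimeters."""
    return factor * resolution_mm

def meets_requirements(error_mm, distance_m):
    """True if a measurement lies in the working range and below the bound."""
    near, far = WORKING_RANGE_M
    return near <= distance_m <= far and error_mm < accuracy_bound_mm()

bound = accuracy_bound_mm()                 # 2.5 * 3.0 mm
ok_case = meets_requirements(5.0, 1.5)      # illustrative error and distance
bad_case = meets_requirements(12.0, 1.5)    # illustrative error above the bound
```

Keeping the bound as a function of the resolution, rather than a hard-coded 7.5 mm, makes it explicit that both figures change together if the resolution requirement is revised.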

The Operating Room: Characteristics and Restrictions
The operating room is a controlled environment with specific characteristics and/or restrictions, which will affect, among others, both the type and location of sensors and the methodology to reach our goals. In the following paragraphs, a summary of these issues is presented and discussed:
Sensor locations: There is a strong restriction on where the sensors used to acquire 3D data can be placed, as these devices have to be located in positions where they can in no way interfere with the processes of the operating room. As a requirement, the sensors have to be located around the operating couch, at a height of approx. 1.5 m measured from the top of the couch and forming a rectangle of 2.4 × 3.0 m² (see schematic in Figure 1). Additionally, because of the complex shapes of the objects (e.g., the resected area of the human body), some inner local concave parts may not be fully seen from a single position of an optical device and/or may cause inter-reflections. To mitigate this, different sensors and/or different positions of the sensors have to be considered.

Figure 1. Schematic of the NIR camera locations around the treatment couch.
Occlusions: There may be up to five different medical staff members around the operating couch at different stages of the operation. This can cause several occlusions, which can be minimized by increasing the number of sensors in such a way that there is always a minimum number of sensors (at least one or two, depending on the applied technique) that are able to view the couch at the same time.
Controlled illumination: The ambient light is crucial in optical-based systems and has to be controlled. In the case of the operating room, the available ambient light must be optimal and uniform, so that the operating couch can be perfectly seen by the naked eye (and, thus, by standard cameras), and the lack of windows avoids interference from sunlight (thus, working with NIR cameras and projectors is possible).
Non-rigid object: Though the patient is anesthetized, he/she can make involuntary movements due to breathing. Additionally, the internal organs may also undergo small movements. This fact highly affects the decision on which methodology to use for the 3D reconstruction. We are interested in using an approach that can acquire all the information required to derive the 3D shape in an instant.
Non-expert users: The system will be used by medical personnel who are not necessarily experts in optical tracking systems. Thus, it is necessary to design a usable protocol for both calibration and normal usage of the system (see next section).

Usability Design
As explained before, a multidisciplinary group of surgeons, anesthetists, medical physicists, radiation oncologists (ROs), and technical and nursing staff is involved in the IOERT process, the main actors being the surgeon and the RO. These personnel are not necessarily experts in optical tracking systems. Therefore, a usability protocol must be designed in order to allow our interactive system to be easily used, preventing the risk of malfunction and avoiding any possible distraction. The protocol is built on two essential steps, calibration and acquisition, which are explained below.
Calibration: System calibration is only required if sensors have been moved, though it is highly recommended that it be done before and after the acquisition process. The procedure will start when indicated by the medical staff. The system will graphically indicate the necessary steps to be followed and when the procedure finishes. After the calculations are done (a step that should not be time consuming), the system will indicate success or failure. In case of failure, the medical staff will be asked to repeat the calibration. In case of success, the computed calibration parameters will be automatically incorporated into the system.
Acquisition: Once the system is calibrated, acquisition can start. As in the calibration procedure, the medical staff will indicate when the acquisition starts. This procedure should be designed in such a way that the medical staff needs no additional indications while it is running, as these could interfere with the treatment process. When the medical staff finishes the treatment, they will need to indicate to the application that the procedure has finished.

Sensing to Register Rigid Objects
To register the applicator (a rigid object), a commercial optical-based solution has been considered. Our solution makes use of a set of OptiTrack cameras from NaturalPoint Inc. in the operating room, which are NIR cameras with automatic registration of reflective markers (usually small spheres). These cameras have a resolution of 640 × 480 pixels and acquire images mainly in the NIR range; images in the visible range are also captured, but they are poorly contrasted and only useful for visualization purposes. This system was selected because it provides optimal accuracy for the tasks involved in this project regarding the registration of rigid objects; additionally, it is inexpensive, compact, quite robust, and easy to install in an operating room. In order to ensure that the objects to be registered are always seen by a minimum of two cameras, a total of eight cameras are used, which also deals with the possible occlusions caused by the medical staff during the treatment. Cameras are located at a height of 1.50 m above the operating couch and at a certain distance from it, always ensuring that they do not interfere with the normal process of the treatment, while still capturing the objects with enough resolution to provide the required accuracies. A schematic of the locations of the cameras around the treatment couch is depicted in Figure 1. As can be seen, the cameras are oriented in such a way that each of them is able to register the whole operating couch, and they are placed far enough from it that they do not pose sterilization problems.
In this case, before registering objects in real-time, the system needs to be calibrated, a process in which both the intrinsic parameters (location of the principal point, focal length, and lens distortion) and the extrinsic parameters (rotations and translations in the three axes with respect to an external reference frame) of the eight cameras are computed. The calibration of the cameras is relatively easy following the protocol of Tracking Tools, an application developed by the cameras' manufacturer. The user needs to carry a T-shaped wand that has three reflective spheres lying on a single line. The wand needs to be moved inside the volume to be calibrated, so that it can be registered by all the cameras at any time. After a few minutes (around 2-3), the calibration procedure finishes and the cameras' internal and external parameters are computed iteratively. The calibration gives as outputs the accuracy values of the measuring process. In this case, the mean error of the wand was below 0.5 mm, while the general system accuracy was 5.8 mm, as given by the application after calibrating a volume of 2.0 × 1.0 × 1.0 m³ (the volume above the treatment couch) with the eight cameras. Once the system is calibrated, objects having reflective markers can be registered.
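To illustrate how calibrated intrinsic and extrinsic parameters allow a marker seen by at least two cameras to be located in 3D, the following minimal sketch projects a point with an ideal pinhole model and recovers it by linear (DLT) triangulation. It is not the algorithm used by the commercial software, which is not described here; lens distortion is ignored, and the focal length, principal point, and camera poses are hypothetical values chosen only for the example.

```python
import numpy as np

def projection_matrix(f_px, cx, cy, R, t):
    """Build P = K [R | t] for an ideal pinhole camera (no lens distortion)."""
    K = np.array([[f_px, 0.0, cx],
                  [0.0, f_px, cy],
                  [0.0, 0.0, 1.0]])
    Rt = np.hstack([np.asarray(R, float), np.asarray(t, float).reshape(3, 1)])
    return K @ Rt

def project(P, X):
    """Project a 3D point X (external reference frame) to pixel coordinates."""
    x = P @ np.append(np.asarray(X, float), 1.0)
    return x[:2] / x[2]

def triangulate(Ps, pixels):
    """Linear (DLT) triangulation of one point seen by two or more cameras."""
    rows = []
    for P, (u, v) in zip(Ps, pixels):
        rows.append(u * P[2] - P[0])   # u * (p3 . X) - (p1 . X) = 0
        rows.append(v * P[2] - P[1])   # v * (p3 . X) - (p2 . X) = 0
    _, _, Vt = np.linalg.svd(np.array(rows))
    X = Vt[-1]                         # null vector of the stacked system
    return X[:3] / X[3]

# Two hypothetical 640 x 480 cameras, 1 m apart, both looking along +Z.
P1 = projection_matrix(500.0, 320.0, 240.0, np.eye(3), [0.0, 0.0, 0.0])
P2 = projection_matrix(500.0, 320.0, 240.0, np.eye(3), [-1.0, 0.0, 0.0])

X_true = np.array([0.2, -0.1, 1.5])    # marker position in meters
pixels = [project(P1, X_true), project(P2, X_true)]
X_est = triangulate([P1, P2], pixels)
```

With more than two cameras the same function applies unchanged, since each additional view simply adds two rows to the linear system.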

Sensing to Register Non-Rigid Objects
In this section we address the sensing technology to register non-rigid objects, i.e., the human body. Four different approaches were evaluated and compared, all of them based on optical, non-contact sensors and image processing.

Approach 1: Point-Based Registration
With the aforementioned system designed to register rigid objects, i.e., a set of eight calibrated low-resolution NIR cameras, individual point locations can be spatially registered by using a body equipped with reflective spheres. Such a body is called a pointer. To that end, a special body was designed to which up to six different reflective spheres can be attached.
Therefore, the spatial registration of points, i.e., the calculation of the six orientation parameters, is straightforward, and it can be performed once the system is calibrated. The main advantage of this approach is that the sensing devices used for registration are the same as those used for the rigid-body registration and, thus, the acquired points are guaranteed to remain in the same coordinate system without extra effort. Additionally, no extra sensors need to be placed in the operating room.

Approach 2: Scanning with a Kinect Sensor
In a second approach, the Kinect sensor from Microsoft was used, which constitutes an economical solution that allows fast acquisition of 3D point clouds. This sensor is composed of a NIR projector and a pair of low-resolution cameras, one working in the visible (VIS) range and the other in the NIR range. The VIS camera is mainly used for visualization purposes, whereas the depth image is obtained by triangulation from corresponding points emitted and captured by the NIR projector-camera pair, where the projected pattern consists of an irregular grid of points.
This sensor outputs video at a frame rate of 30 Hz. The RGB video stream uses 8-bit VGA resolution (640 × 480 pixels), while the monochrome depth-sensing video stream also works in VGA resolution but with 11-bit depth, which provides 2048 levels of sensitivity. The Kinect sensor has a practical ranging limit of 1.2-3.5 m when used with the Xbox software, although the sensor can maintain tracking through an extended range of approximately 0.7-6 m. The horizontal field of view of the Kinect sensor at the minimum viewing distance is about 87 cm and the vertical field about 63 cm, resulting in a maximum resolution of just over 1.3 mm per pixel [19].
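The quoted per-pixel resolution follows directly from the field size and the VGA pixel count. A short sketch of the arithmetic, using only the figures given above (the function name is hypothetical):

```python
def footprint_mm_per_pixel(field_cm, pixels):
    """Size of the scene area covered by one pixel, in millimeters."""
    return field_cm * 10.0 / pixels

horizontal = footprint_mm_per_pixel(87, 640)   # 870 mm spread over 640 columns
vertical = footprint_mm_per_pixel(63, 480)     # 630 mm spread over 480 rows
depth_levels = 2 ** 11                         # 11-bit depth stream
```

Both directions give a footprint slightly above 1.3 mm per pixel at the minimum viewing distance; the footprint grows with distance, so the lateral resolution at the working range of this study is coarser.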

Approach 3: Structured Light with Fringe Patterns
In this approach, instead of a commercial system, a custom implementation was developed based on structured light and making use of a camera-projector system. The camera used for the first tests is a Canon PowerShot G12 with a maximum resolution of 3648 × 2736 pixels (approximately 10 MP), while the projector has a resolution of 1024 × 768 pixels.
Coded structured light systems are based on the projection of one pattern or a sequence of patterns onto the object's surface. Due to the differences in the object height, the projected pattern appears distorted when viewed obliquely. Therefore, the fringe pattern is modulated according to the object's 3D height and the angle formed between the illumination and the viewing axes. The height profile of the object appears encoded as a function of the spatial phase of the fringe pattern, i.e., the object's height modulates the intensity distribution of the fringe pattern as expressed in the next equation:
$$I(x,y) = a(x,y) + b(x,y)\cos\left[2\pi f_0 x + \phi(x,y)\right]$$
where I(x,y) is the recorded intensity, a(x,y) represents the background illumination, b(x,y) is the amplitude modulation of fringes, f0 is the spatial carrier frequency, and ϕ(x,y) is the phase modulation of fringes.
Pattern projection techniques differ in the way in which every point in the pattern is coded and decoded. A review of the different techniques can be found in [20,21], where the related methods are classified according to the nature of the projected patterns, giving also the minimum number of images to be acquired for the reconstruction, number of cameras, pixel depth, subpixel accuracy, etc. Of all the different techniques, we found that Takeda's method [22,23], which is based on the Fourier transform, would fit our requirements, as it requires the projection of only one pattern, registered with a single camera. Additionally, it allows subpixel accuracy.
In this procedure, the unwanted background variation a(x,y), which represents the zero order, has to be filtered out, and the first order translated by f0 on the frequency axis towards the origin. The demodulation of fringe patterns results in a so-called wrapped phase ψ instead of the required phase ϕ. Therefore, phase unwrapping is required to recover ϕ. A directional warping can appear due to the perspective distortion introduced by the projector-camera pair; for instance, if the central projection axis of the projector is not orthogonal to the registered object, fringes that are closer to the projector will appear narrower than those located further away; in the same way, the location of the camera can introduce perspective distortion. In order to correct this effect, a reference plane with projected fringe patterns can be processed and reconstructed in the same way, and then the depth variations departing from coplanarity can be subtracted from the unwrapped phase of the object. Finally, the computed phase is extracted and related to the actual object height.
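The filtering, frequency shift, and unwrapping steps can be sketched on a single synthetic image row. This is a simplified one-dimensional illustration of Fourier-transform demodulation, not the implementation evaluated in this work: the rectangular band-pass filter, its width, the background and amplitude values, and the Gaussian test phase are all assumptions, and the reference-plane correction and phase-to-height conversion are omitted. The carrier is expressed as an integer number of cycles per row, i.e., f0 = carrier_bin / n cycles per pixel.

```python
import numpy as np

def demodulate_fringe_row(intensity, carrier_bin, half_width):
    """Fourier-transform demodulation of one fringe row.

    Returns the wrapped phase (psi) and the unwrapped phase (phi).
    """
    n = intensity.size
    spectrum = np.fft.fft(intensity)
    # Keep only the lobe around +f0; this removes the zero order a(x)
    # and the conjugate lobe around -f0.
    band = np.zeros(n)
    band[carrier_bin - half_width:carrier_bin + half_width + 1] = 1.0
    # Translate the selected first order by f0 towards the origin.
    first_order = np.roll(spectrum * band, -carrier_bin)
    analytic = np.fft.ifft(first_order)   # approx. 0.5 * b(x) * exp(i * phi(x))
    wrapped = np.angle(analytic)          # psi, limited to (-pi, pi]
    return wrapped, np.unwrap(wrapped)    # phi after 1D phase unwrapping

# Synthetic row: fringe period of 10 pixels over a 640-pixel row.
n = 640
x = np.arange(n)
carrier_bin = 64                                        # 640 / 64 = 10 px period
phi_true = 4.0 * np.exp(-((x - n / 2) / 60.0) ** 2)     # assumed smooth test phase
row = 0.5 + 0.4 * np.cos(2 * np.pi * carrier_bin * x / n + phi_true)

wrapped, phi_est = demodulate_fringe_row(row, carrier_bin, half_width=50)
max_error = np.max(np.abs(phi_est - phi_true))
```

The test phase exceeds π on purpose, so the wrapped output differs from the true phase and the unwrapping step is actually exercised. On real images the band-pass window has to be chosen so that it contains the first-order lobe without reaching the zero order, which is why the choice of fringe period matters.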

Approach 4: Scanning with a Sub-Millimeter Accuracy Scanning Device
In the fourth approach, a professional 3D scanning system from NUB3D [24] was used, which is based on structured light and captures dense point clouds. The device is based on a single camera, to reduce the number of overlapping views, and on multi-wavelength technology, to ensure a strong and adjustable light source for optimized digitizing results, among others. This technology is intended for metrology purposes in fields such as sheet metal forming, foundry, automotive, or arts, reaching accuracies of up to 50 micrometers.

Results and Discussion
The different approaches introduced in Section 3 were evaluated and compared in the case of non-rigid objects, where a mannequin body was used as the target shape. In the following paragraphs, the outcomes are presented.
For tracking rigid objects and for Approach 1 of tracking non-rigid objects, the same technology was used and, thus, the evaluation applies to both.
Some tests were done with the pointer (recall Section 3.2.1) in order to calculate the accuracy of the system when measuring single points. These tests were done by using the pointer in two different positions: diagonal (i.e., with an inclination angle with respect to the horizontal plane) and vertical. Points arranged in a grid covering an area of 2.0 × 1.0 m² were measured. The resulting errors are depicted in Figure 2. Average errors are between 4 and 5 mm, while the maximum errors, which arise at some of the corners, are below 12 mm. Although the obtained average errors are within our requirements, this approach has several limitations that will prevent and/or reduce its use in a real treatment. In the first place, this method takes tens of seconds to complete, which can be critical when dealing with non-rigid objects, as the body may move between the registration of one point and the next. Secondly, it is difficult to locate the exact position where the measurements should be taken, while it is in many cases impractical to put physical markers on the patient's body to indicate these locations. This brings another consequence, namely the high dependency on the human factor when acquiring point locations, particularly where a lack of experience could translate into incorrect point locations. Additionally, with this method the 3D model of the tumor bed cannot be obtained and, thus, the measured points are used to reference the patient to the model previously acquired via CAT. Another problem that arose during the trials in the real operating room concerned the sterilization of the pointer. It was found that the final pointer to be used in the treatments should be made of a material that can be sterilized; likewise, reflective spheres could not be reused.
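Referencing pointer measurements to a previously acquired model amounts to estimating a rigid transformation between corresponding point sets. The text does not state which method was used for this step; the sketch below shows one common choice, a least-squares fit based on the singular value decomposition (often called the Kabsch method). The point coordinates, rotation, and translation are made-up test values.

```python
import numpy as np

def rigid_fit(source, target):
    """Least-squares rotation R and translation t with target ~ R @ source + t."""
    src = np.asarray(source, float)
    tgt = np.asarray(target, float)
    src_c, tgt_c = src.mean(axis=0), tgt.mean(axis=0)
    H = (src - src_c).T @ (tgt - tgt_c)           # 3 x 3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ src_c
    return R, t

# Hypothetical landmarks in the model frame (meters), not all in one plane.
model_pts = np.array([[0.0, 0.0, 0.0],
                      [0.3, 0.0, 0.0],
                      [0.0, 0.2, 0.0],
                      [0.1, 0.1, 0.05]])

# Simulated pointer measurements: the same landmarks rotated 30 degrees
# about the vertical axis and shifted, as seen in the tracking frame.
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
measured_pts = model_pts @ R_true.T + t_true

R_est, t_est = rigid_fit(model_pts, measured_pts)
residual = np.linalg.norm(model_pts @ R_est.T + t_est - measured_pts, axis=1)
```

A rigid fit of this kind assumes that the body has not deformed between the two acquisitions, which is precisely the limitation of point-based referencing for a non-rigid patient discussed above.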
In the second approach of non-rigid objects, a Kinect sensor from Microsoft was used.
From the depth image, the 3D cloud of points can be calculated with a formula that relates the registered depth values to spatial depth distances. We have implemented a procedure that automatically performs this task and additionally computes a textured mesh with the image given by the visible camera (Figure 3a). In Figure 3a-c, the mannequin body used for simulations is depicted as viewed by the different cameras of the Kinect, while in Figure 3d the reconstructed 3D shape is depicted. The main advantage of this system is its capability to work in real-time, while its major drawbacks for our case are the low spatial resolution and the fixed optics and working ranges.
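As an illustration of how a depth image becomes a cloud of points, the sketch below back-projects a metric depth image with an ideal pinhole model. The conversion from raw 11-bit Kinect values to metric distances is not given in the text and is therefore not reproduced; the depth image is assumed to be already in meters, and the focal lengths and principal point are hypothetical values, not calibration results.

```python
import numpy as np

def depth_to_points(depth_m, fx, fy, cx, cy):
    """Back-project a metric depth image into an (N, 3) array of X, Y, Z."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel column and row
    valid = depth_m > 0            # zero marks pixels without a depth reading
    z = depth_m[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.column_stack([x, y, z])

# Synthetic VGA depth image: a flat surface 1 m away with one missing pixel.
depth = np.full((480, 640), 1.0)
depth[0, 0] = 0.0
points = depth_to_points(depth, fx=580.0, fy=580.0, cx=319.5, cy=239.5)
```

Each valid pixel yields one 3D point, so a full VGA frame gives up to 307,200 points; the color of the corresponding VIS pixel can then be attached to each point to texture the mesh, provided both cameras are registered to each other.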
In order to evaluate the third approach of non-rigid objects, different experiments were done by applying different periods to the projected fringe pattern, of which the best results were obtained with a period of 10 pixels. The computation of the X and Y values of the cloud of points derived from the range image is straightforward, and the Z values are obtained after applying a phase-to-height conversion (Figure 4). To assess the overall accuracy of our implementation in this case, a 3D calibration body was used, which consisted of different objects with known height values attached to a planar surface (Figure 5). As a result, we obtained a mean error in height of 0.89 mm.
Finally, the body was registered with the NUB3D technology of the fourth approach, and the obtained cloud of points was evaluated. With this approach, the resulting 3D reconstruction is highly accurate (Figure 6), as the device has sub-millimeter accuracy, and of high density. However, the acquisition time increases (to around 30 s), as different patterns are projected onto the body. Additionally, the total cost of the system is around 10 times that of the scanner built for the third approach.
As a summary, the pros and cons of the four tested approaches are given in Table 1. According to our experiments, we conclude that Approach 3 is the most appropriate to be installed in an IOERT surgical environment, as it combines speed, efficiency, a dense point cloud, and high accuracy. A further consideration would be to allow the technology to be placed in different positions, so as to avoid possible occlusions and to achieve a better reconstruction of complex objects, such as those presenting local concave parts. To that end, the devices in Approach 3 have to be mounted on a movable platform that is easy to manipulate.

Conclusions
IOERT has specific surgical restrictions that make it difficult to incorporate treatment planning. In these surgical environments, the 3D shape of patients is acquired by a CAT system, usually some weeks before the treatment and certainly with the patient in a different position from the one kept during the treatment. Therefore, the implementation of a method that allows the acquisition of an accurate 3D shape of the patient's tumor bed in real-time is of high interest.
The work presented here constitutes a step forward in the spatial registration of the different objects that are relevant in such a surgical treatment. For that purpose, the operating room needs to be equipped with sensors, a non-trivial task that has to account for different restrictions regarding the operating room itself, the medical staff involved in the treatment, and the technological limitations.
Here, we have shown a solution to register rigid objects in real-time by the use of an optical tracking system composed of eight NIR cameras and registration bodies with reflective spheres. Additionally, some preliminary results for acquiring the 3D shape of the patient, which is a non-rigid object, have been presented by means of four different approaches, the first being a point-based registration and the others being based on 3D scanning techniques. After our analysis, we conclude that the 3D registration of patients in real-time is possible with high accuracy and relatively low cost, provided that the sensing devices and their placement meet the restrictions of surgical environments.

Figure 2. Obtained errors when measuring with the pointer in two different positions: (a) pointer in the diagonal; (b) pointer in the vertical; and (c) error scale in mm.

Figure 3. Calculation of a 3D model from the Kinect sensor, where: (a) physical model as seen by the VIS camera of the Kinect; (b) projected grid pattern as seen by a NIR camera; (c) depth image as acquired by the Kinect sensor; and (d) calculated 3D cloud of points with textured mesh.

Figure 4. Generation of a dense cloud of points after a structured light procedure, where: (a) projected fringe pattern on the body; (b) wrapped phase map; (c) unwrapped phase map; and (d) dense cloud of points.

Figure 6. 3D cloud of points obtained with a professional 3D scanner with sub-millimeter accuracy.

Table 1. Pros and cons of the four studied approaches to register the patient.