1. Introduction
In the era of artificial intelligence (AI)-based solutions, access to data is a major issue, because these systems require substantial quantities of data for their learning processes [1,2]. Traditionally, the accumulation of large-scale data has required the establishment of data warehouses or data lakes, and has driven the development of numerous technologies [3], such as Hadoop for the storage of unstructured data, or NoSQL databases developed to accelerate database access. However, experience has shown that data quality is a paramount concern. Data collected in big data can be categorized into three types [4]: unstructured data, semi-structured data (unstructured data supplemented with tags that offer additional insights about the data under consideration), and structured data. Transforming unstructured data into structured data is feasible, but it demands substantial resources. These factors have prompted a shift in data collection strategies over the years towards targeted, business-oriented data collection [5].
Radiotherapy generates a wide range of data types, encompassing imaging (Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), Cone Beam Computed Tomography (CBCT), MegaVoltage Computed Tomography (MVCT), etc.), clinical records, and treatment plans (RTDose, RTStruct, RTPlan). These data are distributed across diverse systems: CT scanner, MR scanner, Treatment Planning System (TPS), Oncology Information System (OIS), treatment unit, etc. Accessing, consolidating and organizing these data represents a significant challenge for the future [6,7], particularly in the context of implementing predictive AI-driven solutions for personalized radiotherapy treatments.
With this idea in mind, we focused on the treatment information contained in the OIS. These data are structured and present in large quantities, but they are not trivially accessible. For these reasons, we developed a data model, named Oncology Data Management (ODM), which makes the structured data extracted from the OIS available.
Thus, in this study, we present the proof-of-concept of the developed ODM tool. Through different clinical examples, metrics are proposed to evaluate the practices of a radiotherapy department and to improve the quality of patient care.
2. Materials and Methods
2.1. Oncology Information System (OIS)
The OIS Mosaiq, developed by Elekta (Crawley, UK), which has been in production in our radiation oncology department since 2010, was used in this study. The ODM tool was implemented in 2019 with version 2.64 of Mosaiq, and was updated during OIS upgrades. Version 2.83 of Mosaiq was used in this study for data extraction.
The Mosaiq database is in SQL format, and the information contained in the OIS is distributed across 764 tables in version 2.83, the architecture of which has been established by Elekta.
2.2. Oncology Data Management (ODM) Tool
A sophisticated SQL query, which constitutes the core of ODM, was developed in this study to generate a new structured database for the exploitation of radiotherapy data. This query selectively retrieves data of particular relevance to end-users from the Mosaiq database and aggregates it within a distinct SQL database. Because the ODM database is separate, an error in manipulating it does not alter the operation of Mosaiq. An automated agent runs during the night to incrementally add the new data generated during the day.
The development of ODM required reflection on the ontology of radiotherapy in order to define the objects necessary for describing radiotherapy data [8]. Indeed, Mosaiq, which was developed for radiotherapy and oncology, does not define these data in a straightforward manner. For instance, the delivered dose of a radiotherapy treatment is not defined in a single table in Mosaiq, but corresponds to multiple fields scattered across different tables. This complexity makes querying this information very challenging. The development of ODM involved creating a data model in which radiotherapy information was logically grouped together, so that the resulting database supports quick and efficient queries. The structure of ODM was designed by people working in radiotherapy, for whom data accessibility and querying were crucial.
The structure of ODM has been enriched over the years and through various ideas. The ODM 4.C version was utilized in this study. Its database contains 10 tables, but this number of tables is subject to change depending on our needs.
In this study, as illustrated in Figure 1, the information contained in five tables was used:
Activity contained information related to patient file activities: agenda, radiotherapy treatment coding, Mosaiq EVAL form entries, Escribe documents, and Mosaiq tasks. This included information such as activity start and end dates and times, as well as the names of professionals who created or validated an activity.
Diagnosis contained information on the classification of treatment pathologies (International Classification of Diseases (ICD)-10 code and custom groupings).
Patients contained demographic and civil information about patients, as well as their identifier (IPP) within the radiotherapy center.
TreatmentDate contained patient appointment dates as planned in the Mosaiq agenda. We note that this information was not representative of reality, as the patient may have been treated at a different time than planned.
TreatedSite contained the name of the prescriptions entered in Mosaiq, the treatment technique, the dose and number of prescribed sessions, the treatment beams that are attached to a prescription, the date and time of delivery of the treatment beams, and the treatment machine on which the beams were delivered. The temporal data related to a treatment was therefore reliable.
Basic queries on radiotherapy data in ODM were initially conducted by generating customized reports in Mosaiq using Crystal Reports® (SAP, Levallois-Perret, France). Since the reports were customizable, each query, such as the number of patients treated over a period, the number of patients treated on a specific treatment machine, etc., required a different report to be created. Reports were created once, then users only needed to select the relevant report and enter the date range to launch the query.
While this method was useful for simple queries, its limitations became apparent when accessing complex information, especially when information had to be cross-referenced, such as for extracting patient cohorts; customized reports were then no longer sufficient. A specific extraction of data from ODM was implemented in such cases. This extraction was performed through an SQL query in the ODM database, generating multiple CSV files, each corresponding to a table. A Python script (version 3.8.11) was then used to concatenate the data into a single database file in XLSX format and to calculate some additional parameters.
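The CSV-to-single-file step can be sketched as follows. This is a minimal illustration using only the Python standard library, not the script used in the study: the column names (`patient_id`, `birth_year`, `first_beam_date`, etc.) and the sample rows are hypothetical, the age is approximated from years only, and the final XLSX export is omitted because it would require a third-party library.

```python
import csv
import io

# Hypothetical CSV exports, one per ODM table; names and values are illustrative.
PATIENTS_CSV = """patient_id,sex,birth_year
P1,F,1950
P2,M,1975
"""
TREATED_SITE_CSV = """patient_id,prescription,first_beam_date,prescribed_dose_gy,delivered_dose_gy
P1,Breast 15x2.67,2021-03-01,40.05,40.05
P2,Larynx 35x2,2022-05-09,70.0,66.0
"""


def read_table(text):
    """Parse one CSV export into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_tables(patients, treated_sites):
    """Join each treatment row with its patient row and add derived columns."""
    by_id = {p["patient_id"]: p for p in patients}
    merged = []
    for site in treated_sites:
        patient = by_id[site["patient_id"]]
        row = {**patient, **site}
        # Approximate age at treatment (year difference only, an assumption).
        treatment_year = int(site["first_beam_date"][:4])
        row["age_at_treatment"] = treatment_year - int(patient["birth_year"])
        # Missing dose = prescribed dose minus delivered dose.
        row["missing_dose_gy"] = round(
            float(site["prescribed_dose_gy"]) - float(site["delivered_dose_gy"]), 2
        )
        merged.append(row)
    return merged


rows = merge_tables(read_table(PATIENTS_CSV), read_table(TREATED_SITE_CSV))
```

Each element of `rows` then corresponds to one treatment event, with patient and prescription fields side by side.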
2.3. Database
In the present study, data were retrieved with an ODM query encompassing treatment records from 1 January 2016 to 31 December 2022. The resultant XLSX file obtained with the Python script was structured such that each row corresponded to a single treatment event and the columns detailed the requisite information for the oversight of radiotherapy procedures, namely:
- Patient identifier
- Sex
- Age of the patient at the time of irradiation
- ICD-10 code of the prescription
- Custom location group: these were 13 groups grouping several ICD codes and were defined as: Cerebral, Cutaneous, Dcodes, Upper-GastroIntestinal, Gyneco., Hemato., Mammaries, Metastases, H&N, Bones, Trachea.-Broncho.-Pulmonary (Tr. Br. Pu), Uro. and Others
- Prescription name
- Treatment technique requested in the prescription: 3D-RTC, IMRT/VMAT, TBI, SRT or Protons
- Treatment machine
- Date of the simulation scanner
- Date of the first treatment beam
- Date of the last treatment beam
- Delay (in working days and calendar days) between the simulation scanner and the first treatment beam
- Prescribed dose (Gy), delivered dose (Gy), missing dose (Gy)
- Number of prescribed sessions, number of performed sessions, number of missed sessions
- Start date and end date of the various Mosaiq tasks related to a treatment
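The delay columns listed above can be computed along the following lines. This is a sketch under stated assumptions, not the study's implementation: working days are taken as Monday to Friday, public holidays are ignored, and the function names are hypothetical.

```python
from datetime import date, timedelta


def calendar_days(simulation_date, first_beam_date):
    """Calendar days elapsed between the simulation scanner and the first beam."""
    return (first_beam_date - simulation_date).days


def working_days(simulation_date, first_beam_date):
    """Monday-to-Friday days after the simulation date, up to and including
    the first beam date (public holidays are not handled in this sketch)."""
    count = 0
    day = simulation_date
    while day < first_beam_date:
        day += timedelta(days=1)
        if day.weekday() < 5:  # 0 = Monday ... 4 = Friday
            count += 1
    return count


# Example: simulation on Monday 3 January 2022, first beam the following Monday.
sim = date(2022, 1, 3)
first_beam = date(2022, 1, 10)
delay_calendar = calendar_days(sim, first_beam)
delay_working = working_days(sim, first_beam)
```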
Patients included in this study provided their consent for using their anonymized data for research purposes.
2.4. Monitoring Tools for a Treatment Unit
The analysis of the activity of a treatment machine was performed using three metrics obtained with ODM:
Quantitative activity of a machine: corresponded to the number of patients treated per day on a treatment machine. These data were extracted from the TreatedSite table as the recorded beams contained information about the treatment machine.
Treatment technique: extracted from the TreatedSite table. The technique could be: 3D-RTC, IMRT/VMAT, TBI, SRT, and Protons.
Distribution of pathologies on a machine: for each recorded beam, the ICD-10 code was extracted from the Diagnosis table and mapped to the custom pathology group defined in the previous section.
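The first of these metrics can be illustrated with a short sketch. The beam records below are hypothetical stand-ins for rows of the TreatedSite table; the sketch counts each patient once per machine and per day, however many beams were delivered, and then averages over the days on which the machine was used.

```python
from collections import defaultdict

# Hypothetical beam records: (delivery date, treatment machine, patient identifier).
beams = [
    ("2022-01-03", "Synergy", "P1"),
    ("2022-01-03", "Synergy", "P1"),  # second beam, same patient, same day
    ("2022-01-03", "Synergy", "P2"),
    ("2022-01-03", "VersaHD", "P3"),
    ("2022-01-04", "Synergy", "P1"),
]


def patients_per_day(records):
    """Number of distinct patients treated per (machine, day)."""
    seen = defaultdict(set)
    for day, machine, patient in records:
        seen[(machine, day)].add(patient)
    return {key: len(patients) for key, patients in seen.items()}


def mean_daily_patients(records):
    """Average number of patients per treatment day, for each machine."""
    counts = defaultdict(list)
    for (machine, _day), n_patients in patients_per_day(records).items():
        counts[machine].append(n_patients)
    return {machine: sum(v) / len(v) for machine, v in counts.items()}


daily = patients_per_day(beams)
averages = mean_daily_patients(beams)
```

The same grouping, keyed on treatment technique or pathology group instead of day, would give the other two metrics.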
In this study, data were collected from two of the eight treatment machines available at our facility: a Synergy (Elekta, Crawley, UK), installed in 2010, and a VersaHD (Elekta, Crawley, UK), installed in 2016. Both were equipped with an Agility multileaf collimator (MLC) and with a CBCT system. The VersaHD was configured for stereotactic treatments and incorporated an iGuide table (Elekta, Crawley, UK) and an Exactrac stereoscopic imaging system (Brainlab, Munich, Germany). The Synergy was equipped with an AlignRT surface recognition system (VisionRT Ltd., London, UK).
2.5. Quality Indicators of Patient Care
Three indicators reflecting the quality of patient care were extracted from the ODM database:
Time (Calendar days) between the simulation scanner and the first treatment session: the date of the first treatment session was derived from the first delivered treatment beam, as recorded in the TreatedSite table, while the date of the simulation scanner was obtained from the TreatmentDate table in the patient’s schedule, taking the most recent CT exam before the first delivered beam. This measure is noteworthy, as it reflects the efficiency of the treatment planning stages (contouring, ballistics, optimization and dose calculation, treatment plan quality assurance) and the availability of the treatment machines. Clearly, if the treatment units are at full capacity and unable to accommodate additional patients, this delay is extended.
Fractionation: this corresponded to the prescription made in the OIS and for which the treatment beams were delivered. This information was extracted from the TreatedSite table.
This criterion made it possible to evaluate the homogeneity of the prescriptions made in the radiotherapy department for the treatment of the same pathology.
Duration (Calendar days) of radiotherapy treatment: this was calculated as the duration, in calendar days, between the date of the first beam delivered for a medical prescription and the date of the last beam delivered.
This criterion was representative of the quality of patient care, as it ensured that patients were treated according to the scheme prescribed by the radiation oncologist, and that there was no unwanted treatment interruption during radiotherapy.
In this study, the delay between the simulation scanner and the first treatment session was investigated for metastasis treatments and, more particularly, for palliative treatments. Three time periods were analyzed: 2016 to 2019, the year 2020, and 2021 to 2022. An upper threshold of 30 days was set and data exceeding this value were removed, as they did not make sense for this indicator.
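The filtering and summary applied to this indicator can be sketched as follows. The function name and the sample delays are hypothetical, and the sketch assumes that "within 8 days" means a delay of 8 calendar days or fewer.

```python
def share_within(delays_days, limit=8, upper_threshold=30):
    """Percentage of treatments started within `limit` days of the simulation
    scanner, after removing delays above `upper_threshold` days."""
    kept = [d for d in delays_days if d <= upper_threshold]
    if not kept:
        return 0.0
    return 100.0 * sum(d <= limit for d in kept) / len(kept)


# Hypothetical delays in calendar days; 45 exceeds the 30-day threshold and is dropped.
example_delays = [2, 8, 9, 15, 45]
percentage = share_within(example_delays)
```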
The fractionation indicator was investigated for breast pathologies, for treatments of a single breast without nodal areas or boost. The relationship between fractionation and patient age at the time of radiotherapy was examined to further interpret the results.
The distribution of the treatment duration was studied through larynx treatments for head and neck (H&N) pathologies.
4. Discussion
Managing radiotherapy operations and continuously improving conditions for both patients and medical staff require close monitoring of how procedures are carried out. The ODM system was devised in recognition of the wealth and quality of the data held in the OIS. This study addressed the proof-of-concept of the ODM tool and outlined several prospective applications.
Firstly, ODM facilitates the quantification of a radiotherapy department’s throughput by assessing the number of treatments administered within specified timeframes (weekly, monthly, annually, etc.), and categorizing them according to the types of pathologies treated (Figure 2). Using this metric, we were able to quantify the reduction in departmental activity in 2020 attributable to the COVID-19 pandemic and to identify the pathologies that were most significantly impacted. For example, the decrease in surgical care during the pandemic in 2020 led to a decrease of nearly 15% in radiotherapy treatments for breast pathology between 2019 and 2020. After 2020, the number of breast treatments increased and, in 2021, it reached a value close to its pre-pandemic level. For the same reasons, the number of metastasis treatments decreased by 11% between 2019 and 2020 but, unlike breast treatments, it did not increase again: compared to 2019, it had decreased by 26.5% in 2022. This very singular behavior was quantified thanks to ODM and, in view of the questions raised, an investigation will be planned to understand the causes.
The development of ODM and its application during the pre-production phase in 2019 enabled the measurement of various metrics, which prompted numerous changes within the radiotherapy department’s organization. The results discussed in this study reflect these changes.
Among the indicators accessible with ODM, monitoring the activity of treatment machines is highly valuable for effective department management. A combination of several indicators is essential for optimizing machine utilization. In this study, we showed that the average number of patients treated per day on a treatment machine should be considered alongside the treatment techniques and pathologies being treated to enable a comprehensive analysis of machine activity. The two treatment machines presented in this study were open for the same amount of time each day, and Figure 3 alone revealed a higher daily throughput for the Synergy than for the VersaHD. Nevertheless, a detailed examination of the treatment techniques and the range of pathologies treated indicates that the VersaHD was predominantly used for more complex treatments, such as Intensity-Modulated Radiation Therapy (IMRT), Volumetric Modulated Arc Therapy (VMAT), and Stereotactic Radiotherapy (SRT), across a diverse range of pathologies. Conversely, the Synergy delivered a narrower range of treatment types, largely excluding SRT, which allowed a greater number of daily treatments. On the Synergy, the surface recognition system was installed in 2019 and led to a specialization of this machine in breast and metastasis treatments, a departmental decision clearly illustrated in Figure 4. Prior to 2019, the Synergy was used to treat a wide range of pathologies. From 2020 onwards, however, there was a noticeable shift towards predominantly treating breast and metastasis cases. This example shows how ODM can be used to verify strategies implemented within the department and to evaluate their effectiveness. It demonstrates the capability of ODM to track and analyze treatment patterns, supporting informed decision-making and resource allocation in the department.
In addition to these benefits, ODM is an essential tool for quantifying the quality of patient care in radiotherapy. In this study, we focused on three indicators: the delay between the simulation scan and the first treatment session, the consistency of prescriptions for a particular pathology, and the overall treatment duration. The results obtained from the first indicator led to a significant change in departmental strategy in 2019. Prior to that, the date of the first treatment session was assigned once the entire planning process (including delineation, ballistics, optimization, and file verification) was completed. This allowed each actor in the planning process (dosimetrist, radiation oncologist, physicist, radiation therapist) to be a driving force and to ensure rapid treatment preparation for the patients. However, the first results provided by ODM showed that, before 2019, this strategy was not effective; this is illustrated in this study for palliative metastasis treatments in Figure 5, of which only 29.1% were treated within 8 days following the simulation scan. To address this issue, a new strategy was implemented in 2019, which established specific timeframes between the simulation scan and the first treatment session. The delay was set to 1 working day for symptomatic treatments, to 7 calendar days for analgesic, simple metastasis and epiduritis treatments, and to 14 calendar days for other types of metastases. This revised strategy successfully reduced the delay in initiating care. By 2021 and 2022, 45.3% of palliative treatments had their first session within 8 days of the simulation scan.
The study of prescription variability with ODM is of particular interest in breast pathology due to the evolution of practices and reference frameworks linked to patient age. In our institution, the prescription scheme of the START B study (15 × 2.67 Gy) [9] was applied for breast treatment in 2019, typically for patients over 40 years of age, while the historical protocol (25 × 2.0 Gy) was prescribed for patients under 40 years of age. For patients over 65 years of age, the 15 × 3.0 Gy protocol was used until 2021, when it was replaced by the prescription of 5 × 5.2 Gy. Figure 6 shows the results of the audit carried out with ODM, indicating that the 25 × 2.0 Gy prescription was still being used for patients over 40 years of age. These data suggest the need for a medical discussion to harmonize prescriptions according to the agreed schemes.
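An audit of this kind can be sketched as a comparison between each recorded prescription and the scheme expected for the patient's age. The sketch below is only an illustration: the record fields are hypothetical, and the handling of the boundaries (patients aged exactly 40 or 65, and the year 2021 itself being treated as already using 5 × 5.2 Gy) is an assumption that the text does not specify.

```python
def expected_scheme(age, year):
    """Expected (number of fractions, dose per fraction in Gy) for a breast
    treatment, following the age-based schemes described above.
    Boundary handling is an assumption of this sketch."""
    if age > 65:
        return (5, 5.2) if year >= 2021 else (15, 3.0)
    if age > 40:
        return (15, 2.67)
    return (25, 2.0)


def audit(records):
    """Return the records whose prescription differs from the expected scheme."""
    return [
        r for r in records
        if (r["fractions"], r["dose_per_fraction_gy"])
        != expected_scheme(r["age"], r["year"])
    ]


# Hypothetical prescription records.
prescriptions = [
    {"patient_id": "P1", "age": 52, "year": 2020, "fractions": 15, "dose_per_fraction_gy": 2.67},
    {"patient_id": "P2", "age": 52, "year": 2020, "fractions": 25, "dose_per_fraction_gy": 2.0},
    {"patient_id": "P3", "age": 70, "year": 2022, "fractions": 5, "dose_per_fraction_gy": 5.2},
]
deviations = audit(prescriptions)
```

A flagged record is not necessarily an error, since a clinician may have had a reason to deviate; the list is a starting point for the medical discussion.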
The overall duration of radiotherapy treatments is another significant criterion for assessing quality, as it influences tumor cell repopulation [10]. In the case of larynx H&N pathology, this duration was quantified using data obtained from the ODM tool. The prescription of 35 × 2 Gy corresponds to a theoretical duration of 47 to 49 calendar days, depending on the weekday of the first session. In 2020, the department implemented a bi-fractionation strategy for this pathology, scheduling two sessions on one day per week to minimize the overall duration, as treatment interruptions can occur due to medical reasons or machine maintenance or breakdowns. Prior to this measure, the median duration of treatment was 51 days (Figure 7) for the period from 2016 to 2020, and it decreased to 48 calendar days in 2022, approaching the desired theoretical duration. Regular monitoring of this quality indicator using the ODM tool allows any deviations to be evaluated and the necessary adjustments to be made.
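The theoretical duration quoted above can be reproduced with a short calculation. The sketch assumes one fraction per day from Monday to Friday, no public holidays or interruptions, and a duration counted inclusively from the day of the first beam to the day of the last beam; the function name is hypothetical.

```python
from datetime import date, timedelta


def theoretical_duration(start, n_fractions):
    """Calendar days from the first to the last fraction (both included),
    with one fraction per working day (Monday to Friday)."""
    day = start
    delivered = 0
    while True:
        if day.weekday() < 5:  # 0 = Monday ... 4 = Friday
            delivered += 1
            if delivered == n_fractions:
                return (day - start).days + 1
        day += timedelta(days=1)


# 3 January 2022 is a Monday; the next four dates are Tuesday to Friday.
monday = date(2022, 1, 3)
durations = {
    (monday + timedelta(days=i)).strftime("%A"): theoretical_duration(
        monday + timedelta(days=i), 35
    )
    for i in range(5)
}
```

Under these assumptions, a 35-fraction course starting on a Monday spans 47 calendar days, and a course starting on any other weekday spans 49.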
Overall, these findings demonstrate how ODM can identify inefficiencies and drive improvements in patient care through evidence-based decision-making and strategic adjustments. Consequently, it is a highly promising instrument that is anticipated to be pivotal in the forthcoming years for leveraging radiotherapy data.
Most of the data analyzed in this study, such as the date and time of treatment, the recorded dose, the name of the treatment unit associated with the delivered dose, etc., were generated by the Mosaiq OIS and are robust. However, some entries, such as the diagnostic code (ICD), were entered by humans and were therefore prone to errors. This is a significant limitation since, in the case of ICD codes, a data entry error could result in the patient being classified in an inappropriate pathology group, thus generating inaccurate statistics. To address this issue, two main actions were implemented in the radiotherapy department. The first, which yielded the most benefits, involved educating individuals entering information into Mosaiq about the importance of their data entry for subsequent radiotherapy data extraction. The second action involved verifying data robustness during data analysis by cross-referencing information. For instance, the pathology group (derived from the ICD code), the prescribed dose, and the prescription name were cross-referenced to identify patients whose ICD codes may have been inaccurately assigned.
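Such a cross-reference can be sketched as a simple consistency rule. The keyword-to-group mapping and the records below are hypothetical and cover only the comparison between prescription name and pathology group; a similar rule on the prescribed dose could be added in the same way.

```python
# Hypothetical mapping from a keyword in the prescription name to the
# pathology group expected from the ICD code (illustration only).
KEYWORD_TO_GROUP = {
    "breast": "Mammaries",
    "larynx": "H&N",
}


def flag_suspect_icd(records):
    """Return identifiers of patients whose prescription name suggests a
    pathology group different from the one derived from the ICD code."""
    suspects = []
    for record in records:
        name = record["prescription_name"].lower()
        for keyword, group in KEYWORD_TO_GROUP.items():
            if keyword in name and record["pathology_group"] != group:
                suspects.append(record["patient_id"])
                break
    return suspects


# Hypothetical records: P2 has a larynx prescription but a breast pathology group.
records = [
    {"patient_id": "P1", "prescription_name": "Left breast", "pathology_group": "Mammaries"},
    {"patient_id": "P2", "prescription_name": "Larynx 35x2", "pathology_group": "Mammaries"},
    {"patient_id": "P3", "prescription_name": "Pelvis", "pathology_group": "Uro."},
]
suspects = flag_suspect_icd(records)
```

Flagged patients would then be reviewed manually rather than reclassified automatically.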
Since the initial implementation of ODM in our radiotherapy department, alternative software systems have been developed by radiotherapy vendors. The Aria OIS from Varian Medical Systems (Palo Alto, CA, USA) allows for straightforward data extractions, but presents limitations when cross-queries need to be conducted. Additionally, it lacks a user interface for data analysis. To date, the RayCare OIS from RaySearch Laboratories AB (Stockholm, Sweden) also does not support complex treatment data extractions. However, RaySearch Laboratories AB has developed RayIntelligence to exploit radiotherapy data from the TPS. It consists of two components: RayData, which enables automatic data extraction from the TPS to a cloud server, and RayAnalytics, hosted on the cloud server, which allows for data analysis and visualization. Considering these developments, the integration of treatment data from the RayCare OIS appears to be a natural progression. Accuray’s Integrated Data Management System (iDMS) currently does not offer treatment data analysis tools for machines such as CyberKnife, TomoTherapy, or Radixact, but provides data interfacing with third-party OIS such as Mosaiq, Aria, or RayCare. Queries on treatment data delivered with these machines therefore need to be performed via the OIS. Furthermore, Elekta has developed Mosaiq Oncology Analytics (MOA), which enables in-depth extraction of treatment data and features a graphical interface that lets users extract data and run queries without specific computer skills. However, two significant limitations are noted. The first, which MOA shares with RayIntelligence, concerns cybersecurity: both systems rely on a cloud-based architecture, meaning that health data are aggregated and made available on secured cloud servers. With the rise in cyberattacks targeting medical data [11,12,13,14] and their impact on clinical activities, radiotherapy centers are opting to store medical data on physical servers within their premises, making a cloud platform less attractive until stronger security measures are in place [15,16]. The second limitation relates to the MOA structure: while ODM gives access to all radiotherapy data stored in Mosaiq, enabling almost any type of query, MOA was designed around a model allowing queries on predefined indicators. If users need additional queries, Elekta has to undertake additional developments, which leads to a certain rigidity in usage. Nonetheless, radiotherapy centers may have limited human resources for extracting treatment data themselves, and turnkey software solutions offered by vendors can be an attractive alternative for managing a radiotherapy department.
In our center, the implementation of ODM has represented a significant advancement in the analysis and understanding of radiotherapy data, enriching our medical databases with structured treatment information. Firstly, at the local level, ODM facilitated the extraction of essential data for the validation of the radiotherapy component of the OSIRIS [17] clinical model. Several potential applications of this tool are currently under consideration. For instance, Guihard et al. [18] showed the value of combining information from the OIS with information from treatment planning, through an application focused on breast treatments. Nowadays, data related to treatment planning, such as CT, MRI, RTDose, RTStruct, RTPlan, RTImage, REG, CBCT, etc., can be stored in the OIS. Future developments of ODM will enable linking treatment data with planning data. Furthermore, a French national project led by the Unitrad group aims to generate structured clinical data directly in the OIS [19,20]. Three crucial categories of information for radiotherapy will be stored in the OIS: treatment planning data, patient clinical data (including initial consultation, follow-up during treatment, and post-treatment follow-up), and data related to the course of radiotherapy. The extraction of these data using ODM would create a comprehensive structured database.
The subsequent phase involves the integration of radiotherapy datasets with hospital information system records to elucidate the correlations among diverse medical datasets, including surgery, anatomopathological data, diagnostic assessments, chemotherapy treatments, radiotherapy treatments, and many others. A single database resulting from the concatenation of all structured medical data generated from patients during their cancer care can thus be obtained. These databases, created at the level of a care center or even at a national level [17] and exploitable by artificial intelligence-based tools [21], present significant potential for improving patient care. As in various other medical domains, they could make it possible to predict treatment toxicity [22,23], propose personalized treatments [24], provide decision support tools [25,26,27], establish new dose reference frameworks [28], and much more.
This integration is a pivotal goal of large-scale initiatives [29]. In this context, the ODM tool is a critical component among the many developments required, and it contributes to advancing radiotherapy practices in oncology.