Machine Learning Techniques to Predict Timeliness of Care among Lung Cancer Patients

Delays in the assessment, management, and treatment of lung cancer patients may adversely impact prognosis and survival. This study is the first to use machine learning techniques to predict the quality and timeliness of care among lung cancer patients, utilising data from the Victorian Lung Cancer Registry (VLCR) between 2011 and 2022 in Victoria, Australia. Predictor variables included demographic, clinical, hospital, and geographical socio-economic indices. Machine learning methods such as random forests, k-nearest neighbour, neural networks, and support vector machines were implemented and evaluated on a 20% out-of-sample test set using the area under the receiver operating characteristic curve (AUC). Optimal model parameters were selected based on 10-fold cross validation. There were 11,602 patients included in the analysis. The primary quality indicators evaluated were the overall proportion achieving "time from referral date to diagnosis date ≤ 28 days" and the proportion achieving "time from diagnosis date to first treatment date (any intent) ≤ 14 days". Results showed that the support vector machine performed best, followed by nearest neighbour, with out-of-sample AUCs of 0.89 (in-sample = 0.99) and 0.85 (in-sample = 0.99), respectively, for the first indicator. These models can be implemented in registry databases to help healthcare workers prospectively identify patients who may not meet these indicators and enable timely interventions.


Introduction
Lung cancer is the fifth most commonly diagnosed cancer in Australia, accounting for 9% of all cancer diagnoses [1], and is the leading cause of cancer mortality, contributing 18% of all cancer deaths [2]. It is the largest contributor to cancer burden and cancer mortality, at 8695 deaths per year, which is more than double the colorectal cancer mortality and three times the prostate cancer mortality [3]. The 5-year survival rate for lung cancer is just 27% [4]. Low survival rates and high treatment costs see lung cancer imposing the highest cancer burden in Australia [5].
The timeliness of healthcare delivery represents a critical facet of healthcare excellence. In the context of lung cancer, initiating the first treatment without delay not only holds the potential for curbing disease advancement at a biological level but also brings about patient-centred advantages by alleviating the anxiety and distress linked to treatment postponement [6]. The time from diagnosis to treatment initiation is often seen as an important care management interval, with reported gaps from diagnosis to treatment initiation ranging from 6 to 45 days, and significant variation observed in how access-to-care delays are reported in the literature [7,8].
Studies of the effects of timeliness of treatment on survival in NSCLC have reported mixed results [9][10][11][12][13][14][15]. A systematic analysis of 37 studies indicated a tendency toward decreased survival in individuals with advanced-stage disease, while those primarily undergoing surgical treatment showed more favourable outcomes [16]. In stage 1 patients, a delay in treatment (intervals exceeding 7 days) reduced the 5-year survival rate by 9.07% [17]. Conversely, for stage 2 patients who initiated treatment earlier (within 7 days), the 5-year survival rate saw a significant increase of 9.01%. When comparing groups based on the interval between cancer diagnosis and treatment, with the reference group being those treated within 7 days, the adjusted hazard ratio (HR) for mortality increased significantly in the other groups (8-14 days, 15-60 days, and ≥61 days) as the interval increased (HR 1.04-1.08, p < 0.05).
The relationship between the timeliness of care among lung cancer patients and mortality has been demonstrated in a previous study [18]. After accounting for significant individual and area-level risk factors, timely first definitive treatment and multi-disciplinary team meetings (MDM) were independently associated with a reduced likelihood of 2-year all-cause mortality in patients with non-small cell lung cancer (NSCLC) (odds ratio (OR) = 0.73, 95% credible interval (CrI) = 0.56-0.94).
A recent report from Victoria, Australia [19] identified notable unwarranted clinical variation across hospitals in several lung cancer quality indicators, including the interval between diagnosis and surgical resection, which targets a maximum interval of 14 days. Disparities were observed between metropolitan and regional areas, as well as variations based on the geographical index of relative socio-economic advantage and disadvantage. The report found that while timeliness of care (proportion diagnosed within 28 days from referral) improved from 2019 to 2020, significant variations were observed between metropolitan private and regional sites, as well as in time from diagnosis to resection, which varied between levels of socio-economic disadvantage.
A related previous study examined risk factors associated with the timeliness of care [11]. In the multivariate analysis, factors such as place of birth (Australia versus others), disease stage at diagnosis, notifying hospital type (private versus public), first treatment intent (curative versus non-curative), and palliative care were associated with delay in time from referral to diagnosis. For instance, patients notified in a public hospital experienced a median delay of 31 days compared to 15 days in private hospitals and had a lower mortality hazard ratio of 0.50 (95% CI: 0.41-0.60) compared to those in private hospitals (p < 0.001). Regarding the time from diagnosis to initial treatment, independent factors included Eastern Cooperative Oncology Group (ECOG) performance status, disease stage at diagnosis, notifying hospital type (private versus public), and the performance of surgery. For the time from referral to treatment, significant factors included palliative care and treating hospital type (private versus public). The timeliness of lung cancer care is strongly related to the accessibility and availability of healthcare services [20][21][22]. Seeking timely diagnosis and treatment improves survival and treatment outcomes [23,24]. Delay in lung cancer care is strongly linked with increased risk of mortality, distant metastasis, and poor treatment outcomes, including sustained anxiety and distress [25][26][27].
Studies are available on predicting the timeliness of lung cancer care using conventional statistical methods such as survival and logistic regression models [11,21,28]. Unlike traditional statistical models, machine learning (ML) algorithms can address regression and classification problems simultaneously [29]. Machine learning techniques have been successfully implemented to analyse large electronic health records, including among patients with type 1 diabetes, to identify risk factors for complications (diabetic ketoacidosis) using decision trees and cross-validation techniques [30], although in that study the performance of machine learning techniques did not offer significant improvements over the usual logistic regression model when evaluated against the testing dataset. Similarly, machine learning techniques have been used to predict short- and long-term HbA1c response among patients with type 2 diabetes who had started insulin treatment [31], with reasonable performance, through models such as the elastic net regularised generalised linear model, support vector machines, and random forests.
In the field of lung cancer, machine learning techniques have been used to predict early lung cancer using metabolic biomarkers and clinical information with good performance (AUC = 0.81) [32], derive a clinical prediction model for pulmonary metastasis [33], predict radiation-induced toxicities among lung cancer patients undergoing radiotherapy [34], estimate lung disease mortality from chest X-rays and other clinical and demographic information [35], and estimate lung cancer survival time intervals [36]. In the current era of personalised precision medicine in lung cancer, machine learning has shown encouraging capability in the integration of disparate data elements from complex datasets, including clinical characteristics and patient demographics, diagnostic medical imaging, and molecular and genomic status, to enhance histopathological characterisation, prognostic accuracy, prediction of metastasis, and clinical decision making [37][38][39].
To date, there is scant information on the capability of machine learning techniques in the evaluation and prediction of the quality of care in lung cancer. This project is the first to employ such novel techniques in this field.

Setting
This study obtained deidentified lung cancer data from the Victorian Lung Cancer Registry (VLCR) between 2011 and 2022 [19]. The registry collects data from 19 Victorian health services covering 40 hospitals, encompassing around 85% of Victorian lung cancer notifications. The aim of the VLCR is to collect real-world observational data from patients and provide risk-adjusted benchmarked quality indicator reports to healthcare practitioners, with a view to improving healthcare delivery and the health and well-being of lung cancer patients. Specifically, the VLCR report aims to provide institutional benchmarking to identify multiple opportunities for quality improvements in lung cancer detection, management, and treatment at hospitals. Patients were included if they were at least 18 years old and presented with incident non-small cell lung cancer (NSCLC) or small cell lung cancer (SCLC) based on either a clinical or pathological diagnosis. Exclusion criteria included the presence of secondary lung cancer, thymoma, or mesothelioma. Ethics approval for this study was obtained from the Monash University Human Research Ethics Committee (Project number: 39010).

Quality Indicators
The four timeliness-of-care quality indicators (QIs) evaluated were as follows:
1. The interval between referral and diagnosis: Target ≤ 28 days (QI 1).
2. The interval between diagnosis and initial surgery, chemotherapy, radiotherapy, or referral to palliative care ("diagnosis to initial definitive management"): Target ≤ 14 days (QI 2).
3. The interval between diagnosis and surgical resection: Target ≤ 14 days (QI 3).
4. The interval between referral and initial definitive management: Target ≤ 42 days (QI 4).
The cut-offs for these timelines are based on official Australian guidelines [40]. The referral date is the date recorded on the referral letter to the hospital or diagnosing clinician. The date of diagnosis is the date a pathological test confirms a primary lung cancer. If a patient does not have a pathological test to confirm a primary lung cancer, the date of diagnosis is the date a clinical test confirmed a primary lung cancer, supported by medical correspondence confirming a primary lung cancer. See Appendix A for definitions.

Predictive Features
This analysis encompassed various patient phenotypic predictive risk features, including sex (male versus female), age at diagnosis, country of birth (Australia versus others), preferred language (English versus others), smoking status, tumour, node, metastasis (TNM) clinical stage of primary tumour at diagnosis based on the International Association for the Study of Lung Cancer classification [41], Eastern Cooperative Oncology Group (ECOG) performance status (0: good to 4: poor) [42], lung cancer type (small cell lung cancer versus NSCLC), notifying hospital, diagnosing hospital, and type of hospital (private versus public). These data are collected from participating hospitals' medical records, following a 2-week opt-out period for patients.
At the patient level, the only available residential information was the postal area (POA) derived from the VLCR. POAs are smaller spatial units than statistical areas (SA2) and can be linked to census population data directly from the Australian Bureau of Statistics (ABS). The POA has been identified as an appropriate scale for evidence-based tools for effectively targeting policy interventions. From the ABS 2016 census [43,44], data for the analysis were obtained, including the index of relative socio-economic disadvantage (IRSD) and the remoteness index. The IRSD serves as a surrogate indicator of area-level socio-economic status (SES). A low IRSD score in a given POA indicates a higher proportion of relatively disadvantaged individuals. The IRSD was categorised into quartiles, while the remoteness index was categorised into metropolitan (major cities) and regional areas (inner regional, outer regional, and remote/very remote Australia). Other measures of SES, such as the IRSAD, IER, and IEO, were also included in the prediction model. To improve predictive capabilities, continuous variables were standardised by subtracting their means and dividing by their standard deviations. All the predictive features listed above were included in the model, as they were deemed, a priori, to be significant predictors of timeliness of care.
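The standardisation step described above can be sketched as follows; the age values are hypothetical, not registry data.

```python
import numpy as np

# Hypothetical ages at diagnosis (not registry data): standardise by
# subtracting the mean and dividing by the standard deviation.
age = np.array([55.0, 62.0, 69.0, 74.0, 80.0])
age_std = (age - age.mean()) / age.std()
```

After this transformation the feature has mean 0 and standard deviation 1, placing all continuous predictors on a common scale.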

Statistical Methods
Supervised machine learning techniques were implemented to classify patients according to the four quality indicators listed above. The models were set up and run through a user-written command in Stata, "c_ml_stata_cv", which implements the Python Scikit-learn tools via a Stata/Python integration function [45]. The following machine learning models were studied: decision tree, boosting, random forests, regularised multinomial, neural network, naive Bayes, nearest neighbour, and support vector machine. The hyper-parameters for each machine learning technique were optimised via grid search using 10-fold cross validation (described in detail below).
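As a sketch (not the authors' actual Stata wrapper), the eight model families map onto scikit-learn estimators roughly as follows; all settings shown are library defaults, not the tuned values from this study.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# The eight supervised classifiers studied, keyed by the names used in the text.
models = {
    "decision_tree": DecisionTreeClassifier(),
    "boosting": GradientBoostingClassifier(),
    "random_forest": RandomForestClassifier(),
    "regularised_multinomial": LogisticRegression(max_iter=1000),
    "neural_network": MLPClassifier(max_iter=500),
    "naive_bayes": GaussianNB(),
    "nearest_neighbour": KNeighborsClassifier(),
    "support_vector_machine": SVC(probability=True),
}
```

Each estimator exposes the same `fit`/`predict` interface, which is what allows a single cross-validation harness to tune and compare all eight.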

1. Decision Trees [46,47]
Decision trees are supervised machine learning tools that split data based on parameters into nodes and leaves to construct a tree-like model. Information gain is used for node division. Key parameters include the following:
- Maximum depth: controls tree depth for capturing complex relationships while preventing overfitting.
- Minimum samples to split: requires a certain sample count before node splitting, preventing premature splits.
- Minimum samples in a leaf: sets the minimum samples needed in a leaf node, curbing overfitting.
- Maximum leaf nodes: sets the upper limit for leaf nodes, aiding in avoiding overfitting.
- Splitting criterion: chooses metrics like "Gini" or "entropy" to assess split quality based on impurity or uncertainty.
- Class weight: adjusts class influence to manage imbalanced data by assigning weights, with "balanced" being an automated option.
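A minimal scikit-learn illustration of these parameters on synthetic data; the values are hypothetical, not the tuned ones from this study.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the registry data.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
tree = DecisionTreeClassifier(
    max_depth=4,              # maximum depth: limits tree growth
    min_samples_split=20,     # minimum samples to split a node
    min_samples_leaf=10,      # minimum samples in a leaf
    max_leaf_nodes=15,        # upper limit on leaf nodes
    criterion="entropy",      # splitting criterion ("gini" or "entropy")
    class_weight="balanced",  # reweight classes for imbalanced outcomes
    random_state=0,
).fit(X, y)
```

The fitted tree respects the caps set above (depth at most 4, at most 15 leaves), which is how these parameters curb overfitting.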

2. Boosting [46,47]
Boosting is an ensemble machine learning method that sequentially fits trees to residuals from the current model, combining them into a strong learning model. In gradient boosting, only variables enhancing prediction accuracy beyond a threshold are used. Key parameters include the following:
- Number of estimators: the number of weak learning models combined. More trees can boost performance but risk overfitting.
- Learning rate: lower rates prevent overfitting, but more estimators might be needed.
- Maximum depth: limits boosting tree depth. Higher depth can overfit.
There is usually a trade-off between the learning rate and the number of trees.
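These parameters can be sketched with scikit-learn's gradient boosting classifier on synthetic data; the settings are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data; the learning-rate/estimator settings are hypothetical.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
gbm = GradientBoostingClassifier(
    n_estimators=100,   # number of sequentially fitted trees
    learning_rate=0.1,  # shrinkage; lower values need more estimators
    max_depth=3,        # depth of each boosting tree
    random_state=0,
).fit(X, y)
```

Halving the learning rate while doubling `n_estimators` often yields a similar fit, which is the trade-off noted above.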

3. Random Forests [46,47]
Random forests employ multiple decision trees and average their predictions. Cross validation selects variables and cut-offs. Each tree considers a random subset of predictors for decorrelation. Tunable parameters for model optimisation include the number of trees, maximum depth, minimum samples to split, and splitting criterion.
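A brief scikit-learn sketch of these tunable parameters, again on synthetic data with hypothetical values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; values are illustrative, not the tuned ones.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
rf = RandomForestClassifier(
    n_estimators=200,      # number of trees to average
    max_depth=6,           # maximum depth of each tree
    min_samples_split=10,  # minimum samples to split a node
    criterion="gini",      # splitting criterion
    max_features="sqrt",   # random predictor subset per split (decorrelation)
    random_state=0,
).fit(X, y)
```

`max_features="sqrt"` is what forces each split to consider only a random subset of predictors, decorrelating the individual trees before their predictions are averaged.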

4. Regularised Multinomial
Regularised multinomial models, such as regularised logistic regression or regularised multinomial naive Bayes, can be optimised by adjusting key parameters:
- Regularisation strength (C): controls the regularisation level. Smaller C enhances regularisation, preventing overfitting; larger C weakens regularisation for a better training data fit, but risks overfitting.
- Penalty type: chooses "l1" or "l2" regularisation. L1 adds an absolute coefficient penalty, encouraging sparse models; L2 adds a squared coefficient penalty, yielding smoother models.
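The sparsity contrast between the two penalty types can be demonstrated with regularised logistic regression in scikit-learn; the data and the value C = 0.05 are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
# A strong L1 penalty (small C) zeroes out weak coefficients (sparse model);
# a milder L2 penalty shrinks all coefficients but keeps them non-zero.
l1_fit = LogisticRegression(penalty="l1", C=0.05, solver="liblinear").fit(X, y)
l2_fit = LogisticRegression(penalty="l2", C=1.0, solver="liblinear").fit(X, y)
n_zero_l1 = int(np.sum(l1_fit.coef_ == 0))
n_zero_l2 = int(np.sum(l2_fit.coef_ == 0))
```

Counting zeroed coefficients after fitting makes the "sparse versus smooth" distinction concrete: the L1 model discards uninformative features outright, while the L2 model retains them with small weights.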

5. Neural Network [48,49]
Neural networks consist of hidden layers linking predictors to outcomes, each with nodes passing signals using activation functions. Learning adjusts weights to minimise loss. Key parameters for optimisation include the following:
- Number of hidden layers: more layers capture complexity but risk overfitting.
- Number of neurons per layer: more neurons capture complexity but risk overfitting.
- Activation function: (e.g., sigmoid, ReLU, tanh) affects neuron output, influencing pattern capture.
- Learning rate: larger rates speed convergence but risk instability.
- Dropout rate: the probability of neuron dropout in training; reduces overfitting by limiting individual neuron impact.
- Regularisation strength (alpha): controls the extent of regularisation, preventing overfitting by penalising large weights.
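Most of these parameters appear directly in scikit-learn's `MLPClassifier`, as sketched below on synthetic data with hypothetical settings (note that `MLPClassifier` does not expose a dropout rate; only the alpha penalty is available there).

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
mlp = MLPClassifier(
    hidden_layer_sizes=(16, 8),  # two hidden layers, 16 and 8 neurons
    activation="relu",           # activation function (relu, tanh, logistic)
    learning_rate_init=0.01,     # initial step size for weight updates
    alpha=0.001,                 # L2 regularisation strength
    max_iter=1000,
    random_state=0,
).fit(X, y)
```

The fitted network has four layers in total: the input layer, the two hidden layers specified, and the output layer.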

6. Naive Bayes
Naive Bayes models can be optimised by adjusting key parameters:
- Smoothing parameter (alpha): manages probability smoothing. Lower alpha means less smoothing, risking overfitting; higher alpha increases smoothing, helping prevent overfitting and dampening the impact of rare features.
- Prior probabilities: sets initial class probabilities, defaulting to training data frequencies; these can be adjusted to reflect prior knowledge of the class distribution.
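Both parameters are visible in scikit-learn's multinomial naive Bayes, sketched here on a tiny hypothetical count matrix (not registry data).

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Tiny count-style feature matrix (hypothetical); alpha is the smoothing
# parameter and class_prior overrides the training-frequency default.
X = np.array([[2, 1, 0], [1, 0, 3], [0, 2, 1], [3, 0, 0]])
y = np.array([0, 1, 1, 0])
nb = MultinomialNB(alpha=1.0, class_prior=[0.5, 0.5]).fit(X, y)
preds = nb.predict(X)
```

With `alpha=1.0` (Laplace smoothing), a feature never seen in a class still receives a small non-zero probability rather than zeroing out the whole class likelihood.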

7. K-Nearest Neighbour
The main parameter for the k-nearest neighbour (k-NN) algorithm is k, the count of nearest neighbours considered for prediction. Its choice governs the bias-variance trade-off: smaller k leads to flexible, low-bias but high-variance models, while larger k yields rigid, high-bias but low-variance models. The value of k is tuned through cross validation or other methods.
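Tuning k by cross validation can be sketched as follows; the data and the candidate values of k are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Compare candidate k values by 5-fold cross-validated accuracy: small k is
# flexible (low bias, high variance); large k is rigid (high bias, low variance).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
scores = {
    k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    for k in (1, 5, 25, 100)
}
best_k = max(scores, key=scores.get)
```

Selecting the k with the best cross-validated score is exactly the mechanism by which the bias-variance trade-off is resolved empirically rather than by assumption.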

8. Support Vector Machine [47]
Support vector machine (SVM) models create hyperplanes to separate observations, linearly or non-linearly via kernels. The parameters to optimise an SVM's performance include the following:
- Kernel type: chooses a function to transform data for hyperplane separation. Options include linear, polynomial, and radial basis function (RBF); the choice depends on the nature of the data and the problem.
- Regularisation parameter (C): balances margin maximisation and classification error minimisation. Smaller C gives a wider margin with more misclassifications; larger C gives a narrower margin with fewer misclassifications.
- Kernel-specific parameters: gamma controls the width of the RBF kernel's Gaussian. Smaller gamma gives wider, smoother decision boundaries; larger gamma gives narrower, more complex boundaries.
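An RBF-kernel SVM with these parameters can be sketched as follows; C = 1 and gamma = 1 echo the best configuration reported later in the Results, though fitted here to toy data rather than the registry cohort.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic data; kernel, C, and gamma are the three knobs described above.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
svm = SVC(kernel="rbf", C=1.0, gamma=1.0).fit(X, y)
train_accuracy = svm.score(X, y)
```

Because gamma is a kernel-specific parameter, it only matters for the RBF (and polynomial) kernels; a linear-kernel SVM is tuned through C alone.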

Model Evaluation
Prediction is improved via 10-fold cross validation (re-sampling), which is used to optimally tune the various parameters of the individual machine learning methods by minimising the test (or out-of-sample) classification errors. Each of the machine learning models requires at least one tuning parameter to be specified. The aim is to determine a model which is parsimonious and least complex. Machine learning models were assessed based on their accuracy. Records with missing data for the outcome variable were excluded. For the predictive features, a category for missing data was created and included in the model to preserve sample size.

Parameter Tuning
Data were randomly split into 2 sets: 80% for a training dataset, where the model was tuned and developed, and 20% for a test dataset on which the final model was evaluated. Throughout the training process, each model underwent 10-fold cross validation. This involved splitting the training set into training and validation subsets with a ratio of 10:1 to fine-tune the hyperparameters by minimising the out-of-sample classification errors. The parameters were selected through grid search over a list of values from exploratory analysis; the values for each machine learning method and parameter are shown in Appendix B. Following the optimisation of the tuning parameters, the model was evaluated using a 10-fold cross-validation approach applied repeatedly ten times. This rigorous process involved dividing the data into ten subsets or folds, with the model being trained and assessed iteratively, rotating through each fold as a test set while the remaining nine folds were utilised for training. The ultimate accuracy metric was established by amalgamating the outcomes from the ten cross-validated models using the Area Under the Receiver Operating Characteristic Curve (AUC-ROC). This curve graphically illustrates the relationship between the true positive rate and the false positive rate and ranges between zero and one, with a higher value signifying better predictive performance. The decision to prioritise accuracy as the optimising metric was driven by the desire to ascertain the optimal threshold value for prediction determination, where the AUC provides an overall snapshot of classification performance for our study. The clinical significance of these AUCs was interpreted using these cut-offs (<0.20 = poor, 0.21-0.40 = fair, 0.41-0.60 = moderate, 0.61-0.80 = good, ≥0.80 = very good) [50]. A schematic conceptual model of the design, data wrangling, analysis, and reporting is provided in Figure 1.
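The 80/20 split with 10-fold grid search scored by AUC can be sketched in scikit-learn as follows, using synthetic data and a hypothetical grid far smaller than the one in Appendix B.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the registry cohort.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# 80% training / 20% held-out test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Grid search with 10-fold cross validation on the training split, scored by AUC.
grid = GridSearchCV(
    SVC(kernel="rbf", probability=True),
    param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},
    cv=10,
    scoring="roc_auc",
).fit(X_tr, y_tr)

# Out-of-sample AUC on the 20% test set.
test_auc = roc_auc_score(y_te, grid.predict_proba(X_te)[:, 1])
```

Keeping the 20% test set untouched until after the grid search is what makes `test_auc` an honest out-of-sample estimate rather than an optimistic in-sample one.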

Results
Out of the initial 14,720 lung cancer patients registered with the VLCR, after excluding those patients diagnosed interstate and those missing lung cancer type and date of referral, a total of 11,602 patients were included in our study (Figure 2). The demographic and clinical characteristics of the cohort in its entirety (11,602) and stratified by adherence to QI 1 are shown in Table 1 below. Slightly more than half (56%) were male, with a mean age of 69 years (SD = 11). In terms of ECOG status at diagnosis, most had relatively good performance status, with 24% in category 1 and 30% in category 2. The vast majority spoke English as a first language (90%). In terms of smoking status, 51% were ex-smokers and 35% current smokers. In terms of lung cancer type, most (88%) were of the NSCLC type, with 46% presenting at clinical stage 4 at diagnosis. In terms of adherence to QI 1 (i.e., referral to diagnosis within 28 days), 8008 patients (69%) met the criteria. There were significant differences in demographics and clinical and contextual factors between those who did and did not meet QI 1 (Table 1). A higher proportion of patients who met QI 1 were male, were younger, had lower ECOG scores, were smokers, lived in outer regional areas, had stage 4 disease, and were Australian born. SES indices such as the IRSD, IEO, and IRSAD were also significantly higher among those who met QI 1. For instance, 62% of Australian-born patients met QI 1, compared to 58% of those born elsewhere, and this difference was statistically significant (p < 0.001, chi-squared test).
The figures in Appendix C highlight the optimal tuning of parameters for the eight different machine learning methods utilised in the study. The optimal combination of parameters for each machine learning model and their performance for both the training and testing datasets are shown in Table 2. The support vector machine (margin parameter C = 1, gamma = 1) performed best among all machine learning models, with a testing AUC of 0.89 and a training AUC of 0.99. The classification error rates (CER) for the training and validation datasets were 0.2% and 0.1%, respectively. The nearest neighbour model (number of neighbours = 100, kernel = distance) also performed well, with a testing AUC = 0.85 and testing CER = 0.1%. The boosting model performed third best (tree depth = 15, number of trees = 150, learning rate = 0.3), with an AUC = 0.83. These machine learning models performed much better than the traditional logistic regression model (training and testing AUC = 0.73). Appendix D shows the relationship between the various demographic, clinical, and socio-economic characteristics and meeting the individual quality indicators 2, 3, and 4. For QI 2, significant associations were found across all variables, except for smoking status and Australian born. A significantly higher proportion of those who met QI 2 had SCLC (23%) compared to NSCLC (6%), p < 0.001. For QI 3, ECOG status at diagnosis, smoking status, remoteness of location of residence, clinical stage, and socio-economic location were all significant predictors. A higher proportion of patients who met QI 3 resided in major cities (72%) compared to those who did not meet the quality indicator (64%), p < 0.001. Finally, for QI 4, all features were significant, except for sex, smoking status, and Australian born. Those who met QI 4 resided in a higher IRSD location (1004) compared to those who did not meet this indicator (982), p < 0.001.
This study shows that, in terms of meeting the targeted timeliness quality indicators (QI 1: overall proportion with "time from referral date to diagnosis date ≤ 28 days", QI 2: proportion with "time from diagnosis date to first treatment date (any intent) ≤ 14 days", QI 3: proportion with "time from diagnosis date to surgical resection date ≤ 14 days", and QI 4: proportion with "time from referral date to first treatment (any intent) ≤ 42 days"), the overall proportions across the whole cohort were 69%, 41%, 60%, and 49%, respectively. Figure 3 summarises the out-of-sample AUCs for the various machine learning models for quality indicators 1 to 4. The support vector machine, nearest neighbour, and boosting trees performed consistently well compared to the logistic regression model for these quality indicators.


Discussion
This study has found that machine learning techniques can provide better classification of lung cancer patients in terms of meeting key quality indicators for timeliness of care. Specifically, machine learning models such as support vector machines, k-nearest neighbour, and boosting trees fared much better than the logistic regression model. In terms of clinical significance, the performance of these models can be rated as having very good discriminatory properties (AUC > 0.80) compared to the traditional logistic regression model (rated good). The AUC of 0.89 was also higher than that of another study involving a large electronic health record dataset, which found AUCs ranging from 0.64 to 0.73 [51] when trying to identify patients with delays in starting cancer treatment using patient demographic, clinical, and neighbourhood socio-economic indices. Timeliness of care is important, as it has been shown to be associated with survival, and interventions to improve the timeliness and completeness of cancer investigations and treatment, such as the customised "OnkoNetwork" patient navigation program, have been demonstrated to provide a large survival benefit (HR = 0.63, p = 0.039) [52]; however, the first important step is to better classify which groups of patients would benefit from such interventions.
The incidence of timely diagnosis of lung cancer was 69%. This is lower than in a study reported from Jordan [53], which could be due to differences in the cut-off point used to define timely diagnosis of lung cancer and in the period of study. The majority of the patients had delayed treatment after diagnosis and referral. A possible reason might be that diagnostic investigations and treatment are very expensive [54][55][56], meaning the majority of patients could not afford them. The timeliness of lung cancer care is influenced by patient-related, physician-related, and system-related factors [57]. In the majority of lung cancer cases, during the initial phases, patients commonly have non-specific symptoms, and this could have delayed their seeking of healthcare. Another explanation could be the impact of the coronavirus (COVID-19) pandemic, as during the pandemic, medical services were shifted to COVID-19 patients [58]. This could contribute to delays in lung cancer presentation, diagnosis, and treatment [59,60].
Demographic and clinical characteristics, and area-level determinants such as SEIFA and remoteness, have a significant impact on the timeliness of lung cancer care. Age and the IRSAD were found to be important predictors of the timeliness of lung cancer care. These findings are in line with previous studies [57,61-64], and given that the costs of diagnosis and treatment are rising unexpectedly [65,66], lung cancer patients with the lowest socio-economic status may not be able to afford the high costs of cancer diagnosis and treatment [67,68]. Lung cancer patients with better incomes have a higher chance of seeking early diagnosis and treatment services [69,70]. Younger age groups usually perceive that they are capable of coping with medical conditions and identify themselves as not being susceptible; this could have made them visit hospitals at a later stage than older age groups did. Additionally, older people usually have comorbidities and manifest more clinical symptoms of lung cancer than young people do [71].
This study has the following strengths. Firstly, before training and testing the machine learning algorithms, appropriate pre-processing of the dataset was performed to avoid errors. Secondly, the predictors used in the model were biologically plausible and screened by an expert oncologist. Moreover, to the best of our knowledge, this is the first study applying these novel machine learning algorithms to the prediction of the timeliness of lung cancer care. The analysis has some limitations. Even though eight different machine learning methods were utilised, it is acknowledged that these are not comprehensive and that other techniques are available (e.g., deep learning, lasso, etc.). The aim was to provide researchers with familiarity with the Stata (Version 17.0, Stata Corp, College Station, TX, USA) software's access to such machine learning tools, and hence we were unable to compare against the other techniques. Garavand and colleagues [47] provide a brief comparison of the various machine learning methods: neural networks are among the most widely used machine learning algorithms; support vector machines, developed by Vladimir Vapnik, have been successfully applied in many classification and forecasting studies; random forests, introduced in 2001, is a highly recommended classifier when dealing with overfitting and underfitting; and k-NN is one of the simplest and is preferred over other classifiers due to its simplicity and high convergence speed. In this study, neural networks performed similarly to random forests; one possible reason could be that neural networks work better with larger datasets (i.e., >100,000 observations) [72]. A formal comparison of the methods is beyond the scope of this research.
For a similar reason, this study was only able to optimise selected parameters within each machine learning model, based on their availability in the user-written Stata code. We plan to expand the Stata code to enable fine-tuning of other parameters in future work. The 95% confidence intervals for the AUCs of some of the models were also relatively wide. This could potentially be due to the smaller sample size of the testing dataset and the narrow grid search used to select the parameters for these models. In future work, the list of machine learning methods can be expanded and the grid search for the parameters widened. It is also noted that there is an imbalance in numbers for quality indicator 1, namely 8008 patients with a referral-to-diagnosis interval within 28 days and 3594 patients with an interval > 28 days, and this may affect the performance of some of the machine learning techniques utilised in this study. This is acknowledged as an area for further work, although a recent study [72], which over-sampled the target class (or outcome) to address this potential limitation, found that this approach did not improve the performance of the compared models.
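The parameter selection described above (choosing optimal values via 10-fold cross validation over a grid) can be sketched as below. Again, the study used user-written Stata code; this scikit-learn version on synthetic data, with an assumed grid for the KNN neighbour count, only illustrates the mechanism.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the registry data.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# A deliberately narrow grid, as in the study; widening it is flagged
# above as future work.
grid = {"n_neighbors": [3, 5, 7, 9]}

# 10-fold cross validation with AUC as the selection criterion.
search = GridSearchCV(KNeighborsClassifier(), grid, cv=10, scoring="roc_auc")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 2))
```

A wider grid (more values, more parameters such as distance weighting) trades longer run times for a better chance of finding well-calibrated settings, which is the trade-off noted above.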
In terms of the dataset, a real-world multicentre registry dataset from Victoria, Australia was analysed. While the data are comprehensive across regions, missing data are inherent in large observational studies such as ours. Records with missing outcome data (e.g., referral date) were excluded from the analysis, as the models utilised could not accommodate missing data, but the impact on the results should be minimal given the relatively small amount of missing data (e.g., referral date for n = 1381/12,983). For predictive features, a category for missing data was created, so missingness was not an issue there. Future work will evaluate the most appropriate way to deal with missing data (e.g., imputation). The VLCR also does not achieve 100% population coverage, but the completeness and accuracy of recruitment of the eligible population have been assessed on a scheduled basis by comparing data from the clinical registry with other data sources such as the Victorian Cancer Registry, the Victorian Admitted Episodes Dataset, and hospital clinical record data [19]. Most of the predictive features utilised in this study, such as the clinical variables, may not be subject to bias, but some, such as smoking status, could be poorly reported and cause misclassification; this is acknowledged as a limitation.
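The handling of missing predictive features described above (recoding missing values as their own explicit category rather than dropping rows) can be sketched as follows; the variable name and values are hypothetical examples, not the registry's coding scheme.

```python
import pandas as pd

# Toy frame: smoking status with missing values, as might occur in a registry.
df = pd.DataFrame(
    {"smoking_status": ["current", None, "former", None, "never"]}
)

# Recode missing values as an explicit "missing" category, so no rows
# are dropped and the model can learn from missingness itself.
df["smoking_status"] = df["smoking_status"].fillna("missing")

print(df["smoking_status"].value_counts())
```

This keeps every record in the training data, at the cost of treating "missing" as informative; imputation, flagged above as future work, is the main alternative.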
The provision of timely care in lung cancer management is a critical measure of quality of care [73]. Delay in care carries the biological risk of disease progression and the loss of the opportunity to provide curative-intent treatment as the cancer stage progresses. The time of a new lung cancer diagnosis is also a period of high anxiety, distress, and social disruption, which is likely to be beneficially impacted by timely care and the absence of delay.

Future Directions
This study's results have important clinical implications. It may be possible to develop an app or integrate these models into hospital clinical information systems, so that when a few key pieces of patient demographic and clinical information are keyed into the database, the models can help to "predict" patients who may experience a delay from referral to diagnosis or treatment, thus prompting case management, triage, or escalation of follow-up. The models can also be fine-tuned or validated for vulnerable populations. Future research could also examine other tools such as deep learning, optimise more of the machine learning methods' parameters, and address the issue of missing data. The structure of the design, analysis, and reporting of the data in this research could lead to different results when configurations and tools other than those described above are used. For practical implementation of such tools in clinical practice, a cost-effectiveness analysis would be beneficial to determine whether the additional costs of integrating such models into hospitals' clinical data collection and management systems would be outweighed by improvements in patients' health outcomes through the provision of timely lung cancer care; this would be an interesting follow-up work.
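The flagging workflow envisaged above (a fitted model scoring a newly entered patient record and prompting follow-up) could, in outline, look like the sketch below. The model, data, and the 0.5 probability threshold are all assumptions for illustration; in practice the threshold would be chosen clinically and the model validated per site.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic training data standing in for historical registry records,
# where class 1 denotes patients who experienced a delay.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = SVC(probability=True, random_state=0).fit(X, y)

def flag_for_follow_up(patient_features, threshold=0.5):
    """Return True when the predicted probability of delay exceeds
    the threshold, prompting case management or triage."""
    prob_delay = model.predict_proba([patient_features])[0, 1]
    return bool(prob_delay > threshold)

# Score a newly keyed-in patient record (here, a row of the toy data).
print(flag_for_follow_up(X[0]))
```

In a deployed system the same call would sit behind the clinical information system's data-entry form, returning the flag as soon as the key fields are entered.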

Conclusions
This research has demonstrated that machine learning models can be successfully implemented to classify the timeliness of care among lung cancer patients. The findings have implications for patient care, as clinicians can utilise such tools to evaluate which patients should be followed up closely or case managed to pre-empt delays in diagnosis and treatment and ultimately improve their clinical outcomes. Plans are in place for a test model to be showcased to sites via the steering committee of the VLCR.

Informed Consent Statement: The original registry data collection was based on an opt-out consent method.
Data Availability Statement: Restrictions apply to the availability of the data. Data were obtained from the Victorian Lung Cancer Registry and are available to researchers upon formal application to, and approval by, the registry: https://vlcr.org.au/.

Figure 1. Conceptual framework of data preparation, splitting, and analysis applied.

Figure 2. Flowchart of patient inclusion/exclusion criteria and final cohort for study.

Figure 3. Out-of-sample area under the curve comparisons of machine learning methods for quality indicators 1 to 4.

Table 1. Demographic and clinical characteristics of lung cancer cohort, split by adherence to quality indicator 1 (referral to diagnosis within 28 days).

Table 2. Optimal parameters for selected machine learning models based on 10-fold cross validation for quality indicator 1. Note: CER: classification error rate; AUC: area under the curve; NA denotes not applicable.

Machine Learning Models and the Parameters Specified in the Grid Search

Conflicts of Interest: The authors declare no conflict of interest.

Quality indicator definition: number of patients where time from referral date to first treatment (any intent) is ≤ 42 days (numerator); number of patients in the Registry undergoing anti-cancer treatment with referral and treatment dates available (denominator).