PET/CT Radiomics in Lung Cancer: An Overview

Abstract: Quantitative extraction of imaging features from medical scans ('radiomics') has attracted considerable research attention in the last few years. The literature has consistently emphasized the potential use of radiomics for computer-assisted diagnosis, as well as for predicting survival and response to treatment. Radiomics is appealing in that it enables full-field analysis of the lesion, provides nearly real-time results, and is non-invasive. Still, many studies suffer from drawbacks such as lack of standardization and repeatability. Such limitations, along with the unmet demand for sufficiently large image datasets for training the algorithms, are major hurdles that still limit the application of radiomics on a large scale. In this paper, we review the current developments, potential applications, limitations, and perspectives of PET/CT radiomics, with specific focus on the management of patients with lung cancer.


Introduction
Lung cancer is the second most common type of cancer in men and women worldwide, with an estimated lifetime prevalence of about 1/15 and 1/17, respectively, for the two genders [1]. In Italy, there were ≈42,500 new cases in 2019, accounting for ≈11% of all the newly diagnosed cancers in the same year [2]. Five-year survival rates of patients with lung cancer vary considerably depending on the type and stage of the disease, ranging from a dismal 3% for distant small-cell lung cancer (SCLC) to 60% for localized non-small-cell lung cancer (NSCLC) [3]. Timely detection and correct management are therefore essential to improve the clinical outcome of patients affected by lung cancer.
In recent years, computerized analysis of 3D scans from Computed Tomography (CT), Positron Emission Tomography (PET), and Magnetic Resonance Imaging (MRI) has received a great deal of attention as a means to improve the clinical management of a number of disorders. It is believed that radiomics has the potential to improve on traditional, manual interpretation by detecting features and patterns that would otherwise go unnoticed by the human eye [4,5]. By leveraging large datasets (hence the suffix '-omics') and artificial intelligence techniques, radiomics could help predict the type of disease, survival, and response to therapy [6,7]. There are also a number of logistical advantages to this approach, such as providing nearly real-time results and not requiring any invasive procedure for the patient [8]. Furthermore, compared with standard biopsy, radiomics can offer not only a full-field

Methodology
The overall objective of radiomics is to build classification and/or regression models based on quantitative features extracted from the imaging data. The typical workflow in radiomics (Figure 1) is largely independent of the underlying disease and consists of six sequential steps [17] that are described in the following subsections.

Acquisition
This is the procedure whereby the scans are obtained, and includes both the examination itself and the patient preparation protocol. The output is a three-dimensional matrix of intensity values (voxel model), which in the remainder we shall refer to as the raw data. A wide range of parameters affect the acquisition process, among them tube current and voltage (for CT), as well as spatial resolution (voxel size), reconstruction algorithm, and related settings (for both CT and PET). All these variables may have a significant impact on the radiomics features computed [6,18], with certain features being affected more than others [19].

Pre-Processing
Pre-processing may involve spatial filtering, windowing, and/or resampling. The objective of spatial filtering can be either to reduce noise or to emphasize features at different scales. Common tools for this task are Butterworth smoothing [20], Gaussian filters [21], and Laplacian of Gaussian filters [22]. Windowing consists of applying lower and upper thresholds to the intensity values of the raw data, thereby defining a range of acceptable values. Resampling amounts to changing the number of bits used for encoding the raw data, which is commonly 12 or 16; this is usually reduced to eight, six, or four before feature extraction [20,23,24].
Pre-processing is a crucial step in the workflow and may significantly affect the overall outcome, as numerous experiments have demonstrated [19,20].
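To make the windowing and bit-depth reduction steps concrete, the following is a minimal sketch in plain Python. The function names, the window limits, and the uniform binning rule are illustrative assumptions and do not reproduce the settings of any of the cited studies.

```python
def apply_window(values, lower, upper):
    """Clip each intensity to the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def requantize(values, lower, upper, n_bits):
    """Map intensities in [lower, upper] onto 2**n_bits uniform grey levels."""
    n_levels = 2 ** n_bits
    width = (upper - lower) / n_levels  # width of one grey-level bin
    levels = []
    for v in values:
        level = int((v - lower) / width)
        # The upper bound itself would land in bin n_levels; keep it in the last bin.
        levels.append(min(level, n_levels - 1))
    return levels


# Hypothetical CT intensities in Hounsfield units (assumed values).
raw = [-1200, -1000, -500, 0, 40, 399, 400, 1500]
windowed = apply_window(raw, lower=-1000, upper=400)
quantized = requantize(windowed, lower=-1000, upper=400, n_bits=4)
```

With four bits the output can take at most 16 distinct grey levels (0 to 15), which is the kind of reduction referred to above; a different binning rule (for instance, fixed bin width) would give different levels and, potentially, different feature values.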

Segmentation
Segmentation (delineation) is the process whereby the part of the scan that is relevant for the analysis (Region of Interest, ROI) is separated from the background. Note that the literature sometimes makes a distinction between a two-dimensional ROI and a three-dimensional one, the latter being in some cases indicated as Volume of Interest (VOI). For the sake of simplicity, we use the acronym ROI to indicate both two- and three-dimensional regions. The output of this step is a binary (boolean) matrix of the same size as the raw data, where 'true' (1) indicates that the voxel belongs to the ROI and 'false' (0) that it does not. Segmentation is a time-consuming and far from trivial step, since many lesions show unclear and ill-defined borders. The process is also complicated by the presence of areas such as necrosis, atelectasis, and/or inflammation, whose role in the radiomics workflow is not yet fully understood. Although a number of automated (e.g., adaptive thresholding [25], convolutional networks) and semi-automated (e.g., level-set [26], region growing [27]) methods have been proposed, manual delineation is still regarded by many as the ground truth [6].
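As an illustration of what the output of this step looks like, the sketch below builds a boolean mask by keeping the voxels whose intensity is at least a fixed fraction of the maximum. This fixed-fraction rule is a deliberately simple stand-in chosen for the example; it is not the adaptive thresholding method of [25], and the function names, the 40% fraction, and the sample values are assumptions.

```python
def threshold_segmentation(volume, fraction=0.4):
    """Return a boolean mask marking voxels at or above fraction * max intensity.

    `volume` is a nested list indexed as volume[z][y][x].
    """
    peak = max(v for plane in volume for row in plane for v in row)
    cutoff = fraction * peak
    return [[[v >= cutoff for v in row] for row in plane] for plane in volume]


def roi_voxel_count(mask):
    """Count the voxels labelled as belonging to the ROI."""
    return sum(v for plane in mask for row in plane for v in row)


# Hypothetical 2 x 2 x 3 uptake volume (assumed values).
volume = [
    [[0.5, 1.0, 0.8], [4.5, 9.0, 3.0]],
    [[0.7, 5.0, 10.0], [0.2, 3.9, 4.1]],
]
mask = threshold_segmentation(volume, fraction=0.4)
```

The mask has the same shape as the input volume, with `True` for voxels inside the ROI and `False` elsewhere, matching the binary matrix described above.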

Feature Extraction
Feature extraction is a pivotal step in the whole procedure and involves computing a set of quantitative parameters (image features or, simply, features) from the region of interest. The features should obviously correlate with the outcome of the clinical investigation involved (more on this in Section 3). At present, there are two main classes of features: the 'hand-designed' (or 'hand-crafted') ones and those based on Deep Learning (see Figure 2 for a possible taxonomy). Hand-crafted features are obtained via some suitable mathematical functions that are essentially designed by hand (hence the name). Most common among them are shape and texture features. By contrast, Deep Learning features are obtained implicitly by training on large datasets of images.

Shape Features
Shape features aim at characterizing the geometry of the region of interest. They can be computed either from each slice separately (2D) or from the whole ROI (3D). In most cases, their objective is to differentiate round, smooth, and regular lesions from spiculated, elongated, and irregular ones. Apart from volume, common shape features are compactness, elongation, rectangular fit, spherical disproportion, sphericity, surface area, and surface-to-volume ratio [24,28,29]. Clearly, shape features are more easily assessed at CT than at PET due to the higher image resolution of the former.
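A few of the shape features listed above can be sketched directly from a binary mask. The example below estimates volume by voxel counting and surface area by summing the exposed voxel faces, then derives the surface-to-volume ratio and the sphericity, here taken as π^(1/3)·(6V)^(2/3)/A. Face counting is an assumption made for brevity: it overestimates the area of curved surfaces, and mesh-based estimates are generally preferred in practice. The function name and the test object are hypothetical, and the mask is assumed to be non-empty.

```python
import math


def shape_features(mask, spacing=(1.0, 1.0, 1.0)):
    """Volume, surface area, and derived shape features of a voxel mask.

    `mask` is a nested list indexed as mask[z][y][x]; `spacing` gives the
    voxel size along (z, y, x) in millimetres.
    """
    dz, dy, dx = spacing
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])

    def inside(z, y, x):
        return 0 <= z < nz and 0 <= y < ny and 0 <= x < nx and bool(mask[z][y][x])

    # Each neighbour offset is paired with the area of the face shared with it.
    neighbours = [
        ((1, 0, 0), dy * dx), ((-1, 0, 0), dy * dx),
        ((0, 1, 0), dz * dx), ((0, -1, 0), dz * dx),
        ((0, 0, 1), dz * dy), ((0, 0, -1), dz * dy),
    ]
    volume = 0.0
    area = 0.0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x]:
                    continue
                volume += dz * dy * dx
                for (oz, oy, ox), face in neighbours:
                    if not inside(z + oz, y + oy, x + ox):
                        area += face  # face not shared with another ROI voxel
    sphericity = math.pi ** (1 / 3) * (6 * volume) ** (2 / 3) / area
    return {"volume": volume, "surface_area": area,
            "surface_to_volume": area / volume, "sphericity": sphericity}


# A 2 x 2 x 2 cube of unit voxels (assumed test object).
cube = [[[True, True], [True, True]], [[True, True], [True, True]]]
features = shape_features(cube)
```

For this cube the sphericity is about 0.806, the value expected for any cube; a perfect sphere would give 1, and more irregular shapes give lower values.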

Texture Features
The objective of texture features is to quantify the variability of the grey-scale levels in the region of interest. They are therefore able to assess intra-lesion heterogeneity, which is considered a strong indicator of malignancy and poor prognosis [30,31].
Texture features commonly used in radiomics are first- and second-order statistics. First-order statistics describe the overall variation of the signal in the region of interest regardless of the relative position of the voxels. As a consequence, these features are fairly invariant to geometric transforms and therefore robust to image reconstruction and filtering. They include basic statistics such as mean, median, range, standard deviation, skewness, and kurtosis [32]. Second-order statistics model the joint signal variation between pairs of voxels lying at a predefined relative displacement from each other; among them are Grey Level Co-occurrence Matrices (GLCM [33]), Grey Level Run-Length Matrices (GLRLM [34]), and Neighborhood Grey-Tone Difference Matrices (NGTDM [35]).
Other texture features have also been investigated for characterizing PET/CT images in lung cancer, for instance Gabor filters [36], Laws' masks [37], Local Binary Patterns [38], and wavelets [32].
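The difference between first- and second-order statistics can be illustrated with a short sketch. The first function uses only the list of grey levels, ignoring voxel positions; the second builds a normalised, symmetric co-occurrence matrix for one displacement and derives the contrast from it. The function names, the 2D toy image, and the single displacement are assumptions made for the example, and the ROI is assumed not to be constant (non-zero standard deviation).

```python
from collections import Counter
import statistics


def first_order(values):
    """Basic first-order statistics of the grey levels inside the ROI."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    n = len(values)
    # Standardised third and fourth central moments (population form).
    skewness = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
    kurtosis = sum((v - mean) ** 4 for v in values) / (n * std ** 4)
    return {"mean": mean, "median": statistics.median(values),
            "range": max(values) - min(values), "std": std,
            "skewness": skewness, "kurtosis": kurtosis}


def glcm(image, offset=(0, 1)):
    """Normalised, symmetric grey-level co-occurrence matrix of a 2D image.

    Returns a dict mapping (level_i, level_j) to a joint probability.
    """
    dr, dc = offset
    counts = Counter()
    for r, row in enumerate(image):
        for c, level in enumerate(row):
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(image) and 0 <= cc < len(image[rr]):
                other = image[rr][cc]
                counts[(level, other)] += 1
                counts[(other, level)] += 1  # count both directions
    total = sum(counts.values())
    return {pair: k / total for pair, k in counts.items()}


def glcm_contrast(p):
    """Contrast: expected squared grey-level difference between voxel pairs."""
    return sum((i - j) ** 2 * prob for (i, j), prob in p.items())


# Hypothetical 3 x 3 slice already quantised to four grey levels (0 to 3).
image = [[0, 0, 1],
         [1, 2, 3],
         [3, 3, 3]]
roi_values = [v for row in image for v in row]
stats = first_order(roi_values)
contrast = glcm_contrast(glcm(image, offset=(0, 1)))
```

Shuffling the pixels of `image` would leave `stats` unchanged but would generally alter `contrast`, which is the sense in which second-order features capture spatial arrangement.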

Deep Learning
Deep Learning is a relatively new data-driven paradigm for image analysis [39]. In this model, the feature computation is no longer defined a priori and hard-coded as in the hand-crafted methods but is learned from the data. This scheme is typically implemented by combining suitable computational blocks (layers) to form more complex structures (convolutional neural networks, CNNs). Certain types of such blocks contain a number of free parameters, the values of which are determined via a training procedure. This consists of presenting the network with large sets of pre-classified (labelled) ROIs, through which the network adjusts ('learns') the values of the free parameters.
Deep Learning has recently shown great potential for computer-assisted diagnosis [38,40] and for prediction of response to therapy [41] in patients with lung cancer. A potential drawback of Deep Learning, however, is that the resulting features are not as easy to interpret as the hand-designed ones, nor readily linkable to clinically relevant image findings [42]. The paradox, then, is that, even if the results are good, we do not know why; as a consequence, it is hard to investigate the methods for possible sources of bias and/or mistakes (for a discussion on the perils of excessively complex algorithms, see also ([43], Ch. 6)).

Post Processing
The imaging features computed by any of the methods discussed in Section 2.4 can undergo further processing with the aim of reducing redundancy and/or increasing their discrimination capability. The most common approaches to this end are feature selection and feature generation [17].
Feature selection consists of retaining a subset of the original features by selecting the most discriminative ones. This is crucial in radiomics, for some image features tend to be strongly correlated with one another [44]. Approaches to feature selection come in different varieties, such as correlation-based selection, reduction based on mutual information gain, recursive elimination, and Lasso regularization (see [45] for a recent review on this subject).
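A minimal sketch of correlation-based selection, one of the approaches mentioned above, is given below. It greedily keeps a feature only if its absolute Pearson correlation with every feature already kept does not exceed a threshold. The function names, the 0.9 threshold, the visiting order, and the feature values are assumptions for illustration; the sketch also assumes that no feature is constant across patients.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def drop_correlated(features, threshold=0.9):
    """Greedily keep features whose |r| with every kept feature is <= threshold.

    `features` maps a feature name to its values across patients; features
    are visited in insertion order, so earlier ones take precedence.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature table for five lesions (assumed values).
features = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "surface_area": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly proportional to volume
    "glcm_contrast": [0.9, 0.2, 0.7, 0.1, 0.6],
}
selected = drop_correlated(features, threshold=0.9)
```

Here `surface_area` is discarded because it is almost perfectly correlated with `volume`. Note that this filter looks only at redundancy among features; methods such as recursive elimination or Lasso also take the outcome label into account.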
Feature generation involves obtaining new features by combining the original ones through suitable transformations, such as Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), and Multi-Dimensional Scaling (MDS) [46,47].

Data Analysis
Data analysis comprises two separate steps: the first (model building), in which a classification and/or regression model is generated; the second, where the model is used to make predictions about the case or cohort of patients under evaluation. Model building involves (a) establishing the type of classifier or regressor to be used, and (b) feeding the model with a set of pre-classified cases, i.e., feature-array/label pairs where the label indicates the clinical condition of the corresponding subject. This process of presenting the model with pre-classified cases is usually referred to as training. Crucial to this step, of course, is the availability of sufficiently large datasets of pre-classified cases (ground truth).
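The two steps can be sketched with a nearest-centroid classifier, chosen here only because it is short enough to write out in full; it is an assumption of this example and not the model used in any of the cited studies. The feature names, values, and labels are hypothetical.

```python
def train_nearest_centroid(samples, labels):
    """Model building: compute one mean feature vector (centroid) per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(model, x):
    """Prediction: return the label of the closest centroid (squared Euclidean)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


# Hypothetical training set: [sphericity, glcm_contrast] per lesion.
train_x = [[0.90, 0.10], [0.85, 0.20], [0.40, 0.80], [0.50, 0.90]]
train_y = ["benign", "benign", "malignant", "malignant"]
model = train_nearest_centroid(train_x, train_y)
prediction = predict(model, [0.45, 0.70])
```

In a real study the features would normally be standardised before computing distances, and the model would be evaluated on cases that were not used for training.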

Applications
Applications of PET/CT radiomics in lung cancer can be either cross-sectional, where the interest is in determining specific characteristics of a lesion at some point in time (we may want, for instance, to discriminate benign vs. malignant lesions or identify the histological subtype), or longitudinal, in which case the concern is predicting the likely evolution of the disease over time, i.e., overall survival, disease-free survival, and/or response to treatment. In the remainder of this section, we discuss four potential applications that can have important implications in clinical practice.

Discrimination between Benign and Malignant Pulmonary Nodules
Solitary pulmonary nodules (SPN) are relatively common findings, although the available data about the estimated prevalence at CT examination vary significantly [57,58]. Clinical management of SPN poses significant challenges to the clinician, for a non-negligible fraction of them (estimated between 3.7% and 5.5% [59]) may actually be malignant. Traditionally, the evaluation involved manual assessment of some key image characteristics at CT that are considered strong indicators of benignity or malignancy [60]. In this scenario, recent studies have shown that prediction models based on quantitative imaging features can help differentiate between benign, malignant, and inflammatory pulmonary nodules [37,50,61–64].
Positron emission tomography has also proved effective in the assessment of suspicious SPN. The standard strategy in this case consists of comparing the uptake value against some absolute or relative threshold: uptake in the nodule higher than background mediastinal activity [65] and/or SUVmax > 2.5 are typically considered indicative of malignancy [66]. This approach has demonstrated good sensitivity but rather low specificity (pooled values of 89% and 78%, respectively, in a recent meta-analysis [67]). Quantitative assessment of radiotracer uptake by texture analysis has been shown to improve the diagnostic accuracy, in particular the specificity, compared with SUVmax alone [68,69].
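For readers less familiar with these metrics, the sketch below shows how sensitivity and specificity follow from applying the SUVmax > 2.5 rule to a set of nodules with known status. The data are invented for the example and bear no relation to the pooled values reported in [67]; the function name is hypothetical.

```python
def sensitivity_specificity(suv_max, malignant, threshold=2.5):
    """Sensitivity and specificity of the rule 'SUVmax > threshold => malignant'."""
    tp = fp = tn = fn = 0
    for suv, truth in zip(suv_max, malignant):
        called_malignant = suv > threshold
        if called_malignant and truth:
            tp += 1  # malignant nodule correctly flagged
        elif called_malignant and not truth:
            fp += 1  # benign nodule wrongly flagged
        elif truth:
            fn += 1  # malignant nodule missed
        else:
            tn += 1  # benign nodule correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical nodules: SUVmax and histology-confirmed status (assumed values).
suv_max = [1.2, 3.4, 2.6, 0.8, 5.1, 2.2, 4.0, 1.9]
malignant = [False, True, False, False, True, True, True, False]
sens, spec = sensitivity_specificity(suv_max, malignant)
```

In this toy sample one benign nodule with SUVmax 2.6 is a false positive and one malignant nodule with SUVmax 2.2 is missed, so both metrics equal 0.75.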

Classification between Primary and Metastatic Lesions; Histological Subtyping
Detailed lesion characterisation has important implications for the management of patients with lung cancer. Differential diagnosis between primary and metastatic lesions, for instance, is crucial for stratification as well as for establishing the optimal treatment strategy [70]. Likewise, correct identification of histological subtype has a strong influence on the outcome and determination of the most appropriate therapy [71,72].
In a retrospective study on a fairly large cohort of patients (n = 545), PET texture features were able to differentiate between primary and metastatic lung lesions, whereas CT features were not [48]. PET radiomic features were also found to correlate with histological subtype (specifically adenocarcinoma vs. squamous cell carcinoma) in [21,44], whereas, in [49,52], radiomics signatures based on CT texture features were significantly associated with tumor histology.

Prediction of Survival
Prediction of survival plays an important role for triaging and, consequently, for determining the suitability of subjects for different treatment options. Survival studies, however, are notoriously difficult due to the long follow-up required and the presence of many confounding factors. It is not surprising then that results in this topic are less clear-cut than in the other potential applications.
In [53], tumor heterogeneity evaluated by texture features at CT was a significant predictor of overall survival (OS) in NSCLC, but radiotracer uptake was not. CT- and PET-derived heterogeneity were both significant predictors of OS in [73], while Hatt et al. [28] found an association between OS and PET sphericity, although their results also showed dependency on lesion volume and on the method used for segmentation. Positive associations between different survival metrics and PET/CT texture features [37,54,73,74] or CT features alone [32,36,53] were also reported in other studies. Krarup et al. [55], however, determined that PET/CT texture features were not significant in predicting progression-free survival beyond tumor volume and clinical stage in lung cancer patients. Similarly, Sacconi et al. [56] did not observe any significant correlation between CT texture features and OS in lung adenocarcinoma.

Prediction of Response to Treatment
Predicting response to treatment, and in particular to chemotherapy, radiotherapy, and immunotherapy, is crucial to maximize the outcome and, at the same time, to minimize side effects by avoiding the administration of ineffective treatments.
Several studies have investigated the potential of radiomics to predict response to treatment in lung cancer. In [10,23], textural features from baseline PET/CT scans were found to correlate with local recurrence and disease-specific survival in patients treated with radiotherapy. Radiomic signatures from baseline PET/CT were also predictive of disease-free survival in NSCLC patients undergoing surgery [8], whereas, in [75], baseline CT features predicted response to chemotherapy in lung adenocarcinoma. Image biomarkers from PET/CT also showed the ability to predict immunotherapy response in advanced NSCLC [76].

Discussion
Many recent studies have consistently emphasized the potential advantages of PET/CT radiomics in lung cancer and other oncological disorders. Chief among such advantages are the ability to capture information beyond the capabilities of the human eye, non-invasiveness, virtually real-time response, and full-field analysis of the lesion.
The results available in the literature are undoubtedly promising, but they also need to be considered with care. Lack of reproducibility, for instance, is a well-known problem in radiomics, and is mostly a consequence of the absence of standardized methods and settings in all the steps of the workflow [14,17]. A number of studies also suffer from serious limitations at the validation level: among them improper statistical analysis (e.g., lack of adjustment of the p-value for multiple tests) and/or absence of an independent validation dataset to confirm the results. This may easily lead to biased discovery rates and inflation of type-I errors, as correctly pointed out in [77]. Last but not least, publication bias naturally tends to overstate positive results against negative ones (for a discussion on this, see also [78,79]).
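The effect of adjusting p-values for multiple tests can be seen in a short sketch. It implements two standard corrections, Bonferroni and Benjamini-Hochberg; the choice of these two methods, the function names, and the p-values are assumptions made for illustration and are not taken from [77].

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from testing five radiomic features against an outcome.
p_values = [0.004, 0.045, 0.02, 0.30, 0.001]
bonf = bonferroni(p_values)
bh = benjamini_hochberg(p_values)
significant = [p <= 0.05 for p in bh]
```

At the 0.05 level, four of the five unadjusted p-values would be declared significant, against three after Benjamini-Hochberg and two after Bonferroni. With the hundreds of features typical of radiomics studies, the gap between unadjusted and adjusted findings can be much larger.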

Conclusions and Perspectives
Quantitative image analysis of PET/CT scans has attracted increasing research interest in the last few years. Clearly, the field is still at an early stage and further work needs to be done to confirm the benefits and the potential advantages in the clinical practice. As of now, the evidence of a superiority of radiomics beyond standard imaging analysis tools including SUV, kinetic data, etc. is yet to be confirmed for the tasks discussed in this work.
In [17], the authors identified three major obstacles to overcome before radiomics can effectively translate into clinical practice: standardization, automatization, and data availability. The first involves the definition of guidelines and/or standards for all the steps described in Section 2. The Image Biomarker Standardization Initiative (IBSI) is an interesting attempt toward this end [80]. The second requires the automatization of those steps of the procedure that still rely too heavily on human intervention, for instance segmentation. Finally, the availability of large, possibly multi-center image datasets is crucial for the development and validation of radiomics methods.
Funding: This work was partially supported by the Department of Engineering, Università degli Studi di Perugia, Italy, under the project 'Shape, colour and texture features for the analysis of two- and three-dimensional images: methods and applications' (Fundamental Research Grants 2019), and by the Università degli Studi di Sassari, Italy, within the framework 'Fondi d'Ateneo per la Ricerca' (University Research Funds) 2019.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: