Lung Segmentation and Characterization in COVID-19 Patients for Assessing Pulmonary Thromboembolism: An Approach Based on Deep Learning and Radiomics

Abstract: The COVID-19 pandemic is changing the world in a dramatic way, and computed tomography (CT) scans can play a pivotal role in the prognosis of COVID-19 patients. Since the start of the pandemic, great attention has been devoted to the relationship between the interstitial pneumonia caused by the infection and the onset of thromboembolic phenomena. In this preliminary study, we collected n = 20 CT scans from the Polyclinic of Bari, all from patients positive for COVID-19, nine of whom developed pulmonary thromboembolism (PTE). For eight CT scans, we obtained masks of the lesions caused by the infection, annotated by expert radiologists; for the other four CT scans, we obtained masks of the lungs (including both healthy parenchyma and lesions). We developed a deep learning-based segmentation model that exploits convolutional neural networks (CNNs) to accurately segment the lungs and lesions. By adding images from publicly available datasets, we assembled a training set of 32 CT scans and a validation set of 10 CT scans. The results obtained from the segmentation task are promising, reaching a Dice coefficient higher than 97% and laying the basis for analyses concerning the assessment of PTE onset. We then characterized the segmented region in order to identify radiomic features that can be useful for the prognosis of PTE. Out of 919 extracted radiomic features, 109 showed different distributions between the two groups according to the Mann-Whitney U test with corrected p-values less than 0.01. Lastly, nine uncorrelated features were retained that can be exploited to realize a prognostic signature.


Introduction
Since the beginning of the COVID-19 pandemic, researchers have focused on the association between the interstitial pneumonia related to the infection and the onset of thromboembolic phenomena. In fact, several researchers have found that the clinical worsening of some patients correlates with an increase in certain laboratory indices, such as D-dimer and fibrinogen, and with the finding of pulmonary embolism (PE) or pulmonary thromboembolism (PTE) detectable with a dedicated computed tomography angiography (CTA) study [1][2][3]. Therefore, the onset of PTE in COVID-19 patients is considered a prognostically unfavorable event that requires prompt treatment with low-molecular-weight heparin or other anticoagulants [4]. Other researchers have proposed analyses based on neural architectures in case series of COVID-19 patients in an attempt to provide alternative diagnostic screening tools to the nasopharyngeal swab. The latter still represents the diagnostic gold standard, even though it is expensive and entails long sample processing times [5].
Whereas a segmentation of healthy lung parenchyma can be obtained with thresholding, the segmentation of the lesions caused by interstitial pneumonia is not an easy task, since the lesions may be confused with surrounding tissues and vessels. We therefore trained both a healthy lung parenchyma model and a lesion model in order to correctly segment the whole lungs of COVID-19 patients. The first part of the study lays the basis for further analysis on the segmented region, which may be used to establish a relationship between lesions and the onset of PTE. In the second part of the study, we selected different slices from each CT scan and extracted features in these regions. These radiomic features were used to characterize the lungs of PTE and non-PTE patients. To the best of our knowledge, no other authors have considered the problem of performing statistical analysis of radiomic features from lung and lesion regions with respect to the onset of PTE in COVID-19 patients.
The rest of the paper is structured as follows: in Section 2, we present both the publicly available datasets and the local cohort of the Polyclinic of Bari, concerning CT scans with related annotations for lungs and lesions. In Section 3, the methods employed to perform the study are described, which involve both deep learning-based semantic segmentation architectures, detailed in Section 3.1, and the radiomics pipeline, introduced in Section 3.2. Mathematical details of the radiomic features considered are available in Appendix A. In Section 4, the obtained results are shown, whereas their discussion is presented in Section 5. Lastly, Section 6 presents conclusions and future works.

Materials
The study considers 20 CT scans of patients positive for COVID-19 from the Polyclinic of Bari. Nine of these patients developed PTE. CT acquisition parameters were the following: slice thickness 0.6 mm, tube voltage 120 kVp, rotation time 0.33 s, pitch 1.2, and acquisition time 2.94 s. Images were reconstructed with a slice thickness of 1 mm in mediastinal and lung settings. Expert radiologists provided annotations of COVID-19 lung lesions (8 CT scans) and of lung parenchyma (4 CT scans).
Publicly available datasets have also been considered for this work. Jun et al. provided a collection of 20 CT scans from COVID-19 patients with annotations of the left lung, right lung, and lesions [24]. Two datasets are available at MedSeg (http://medicalsegmentation.com/covid19/, accessed on 8 October 2021): the first consists of 100 axial CT slices, coming from openly accessible JPG images of more than 40 patients, whereas the second is composed of 9 volumetric CTs taken from Radiopaedia (https://radiopaedia.org/, accessed on 8 October 2021). It should be noted that the Radiopaedia images are already included in the dataset by Jun et al. [24]. Zaffino et al. described a procedure for performing segmentation of CT images of COVID-19 patients by employing unsupervised learning techniques [25]; the authors released a publicly available dataset comprising 50 CT scans with related annotations for healthy lungs, ground-glass opacity (GGO), lung consolidation (LC), and denser tissues. We also considered a dataset of patients with other lung ailments, VESSEL12 (https://vessel12.grand-challenge.org/, accessed on 8 October 2021), a challenge organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI 2012), held in Barcelona, Spain, from 2 to 5 May 2012.
The summary of considered datasets is reported in Table 1, whereas example images are illustrated in Figure 1. The materials herein presented have been exploited for the analysis described in the following section of this paper.

Semantic Segmentation
Deep learning methodologies are increasingly used for medical image analysis tasks due to the accuracy they offer. Deep learning denotes the exploitation of computational models composed of multiple layers of abstraction, which allow hierarchical feature extraction and processing from raw data. These methodologies have achieved state-of-the-art performance in areas where it is infeasible or challenging to design handcrafted features to feed classifiers, such as computer vision problems [26]. Within deep learning approaches for medical imaging applications, a special focus is devoted to architectures for semantic segmentation. This task consists of labeling each pixel of an input image without recognizing different instances of objects, as opposed to what is done by instance segmentation architectures, such as Mask R-CNN [23,27]. Examples of applications include robotics, medical image processing, and human-computer interaction [28][29][30][31].
Semantic segmentation can be viewed as an image-to-image conversion process, in which the original image is transformed into a map of values for each class of interest. Encoder-decoder CNNs offer several benefits for this conversion. The encoder part of the network performs feature extraction, reducing the spatial dimensions whilst increasing the depth of the feature maps; the decoder then serves to recover spatial information from the output of the encoder. The most common encoder-decoder architectures employed in the medical domain, known as U-shaped architectures [32], include U-Net [33], 3D U-Net [34], and V-Net [35].
In this work, in order to realize the segmentation of lung parenchyma and lesions, we trained two CNN models based on the V-Net architecture, considering the 2.5D and 2D variants with the same Dice loss formulation provided by Altini et al. [36,37]. For the task of lesion segmentation, two classes were considered, GGO and LC, as also detailed in Section 2. For that case, a multi-Dice loss formulation was adopted, obtained by averaging the Dice loss across the two classes. The binary soft Dice coefficient is defined as in Equation (1):

$$DSC = \frac{2 \sum_{i=1}^{N} p_i g_i + \epsilon}{\sum_{i=1}^{N} p_i + \sum_{i=1}^{N} g_i + \epsilon} \quad (1)$$

Then, we can define the binary Dice loss as in Equation (2):

$$DL = 1 - DSC \quad (2)$$

The multi-class Dice loss can be defined as in Equation (3):

$$MDL_S = \frac{1}{K} \sum_{j=1}^{K} \left( 1 - \frac{2 \sum_{i=1}^{N} p_{ij} g_{ij} + \epsilon}{\sum_{i=1}^{N} p_{ij} + \sum_{i=1}^{N} g_{ij} + \epsilon} \right) \quad (3)$$

where N represents the number of voxels in the batch, p and g denote the predicted and ground truth values, ε is a constant for avoiding numeric errors, and K is the number of classes. We should note that the background class is not considered in the MDL_S calculation (j = 0 is not in the sum).

In order to train a model capable of segmenting both lung parenchyma and lesions, we followed the workflow depicted in Figure 2. For the lung segmentation model, images were pre-processed by clamping Hounsfield Unit (HU) values to the range [−1250, 250] and then rescaling into [0, 1] to ease the training procedure. We built a training set of 32 CT scans from the three datasets D1, D2, and D3 introduced in Table 1. In particular, we randomly selected 8 CT scans from D1, 16 CT scans from D2, and 8 CT scans from our local dataset D3; we denote these training sets as T1, T2, and T3, respectively. As validation set, we randomly selected 2 CT scans from D1, 4 CT scans from D2, and 4 CT scans from D3, denoted as V1, V2, and V3, for an overall validation set of 10 CT scans. The lung segmentation model was trained with the 2.5D V-Net on the T1 and T2 training sets, whilst the lung lesion segmentation model was trained with the 2D V-Net on the T3 training set.
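As an illustration, the Dice loss formulation can be sketched in plain NumPy. This is a minimal sketch, not the authors' actual training code: real training operates on soft network outputs (probabilities), whereas here we build hard per-class masks from label maps for simplicity.

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """Binary Dice loss: 1 - (2*sum(p*g) + eps) / (sum(p) + sum(g) + eps)."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def multi_dice_loss(pred_labels, target_labels, num_classes, eps=1e-6):
    """Average the binary Dice loss over foreground classes only
    (the background class j = 0 is excluded from the sum, as in the paper)."""
    losses = [
        soft_dice_loss((pred_labels == j).astype(float),
                       (target_labels == j).astype(float), eps)
        for j in range(1, num_classes)
    ]
    return float(np.mean(losses))
```

A perfect prediction yields a loss close to 0, while a completely missed class contributes a loss close to 1 for that class.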
The models were validated in a combined fashion on the final validation set: the masks obtained from the two models were merged with a logical OR operation, resulting in a final lung mask. Lastly, a post-processing stage composed of three operations was performed. First, a binary morphological opening with a ball kernel of radius 5 voxels was carried out on the predicted mask in order to remove noisy points. Then, only the two largest connected components were retained, given that two lungs appear in the human body. Finally, a binary morphological closing with a ball kernel of radius 10 voxels was carried out in order to fill holes in the mask.
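The three post-processing operations can be sketched with SciPy's morphology routines. This is an illustrative re-implementation under the paper's stated parameters (opening radius 5, closing radius 10); the function and kernel-builder names are our own.

```python
import numpy as np
from scipy import ndimage

def ball(radius):
    """Binary ball-shaped structuring element with the given voxel radius."""
    z, y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1, -radius:radius + 1]
    return (x * x + y * y + z * z) <= radius * radius

def postprocess_lung_mask(mask, open_radius=5, close_radius=10):
    """Opening -> keep the two largest components -> closing."""
    m = ndimage.binary_opening(mask, structure=ball(open_radius))
    labels, n = ndimage.label(m)
    if n > 2:
        sizes = ndimage.sum(m, labels, index=np.arange(1, n + 1))
        keep = np.argsort(sizes)[-2:] + 1  # label ids of the two largest components
        m = np.isin(labels, keep)
    return ndimage.binary_closing(m, structure=ball(close_radius))
```

On full-resolution CT masks the paper's radii (5 and 10) apply; the smaller radii in the test below are only to keep the toy volume manageable.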

Radiomic Features
Radiomics refers to the high-throughput extraction of quantitative features from radiographic images, and it is most commonly used for characterizing solid cancers [38]. Important applications include non-small-cell lung cancer [39], head-and-neck cancer [40], glioblastoma [41], hepatocellular carcinoma [42], and breast cancer [43], among many others.
To perform the feature extraction pipeline, we exploited the PyRadiomics library [44], which defines most of the image features in compliance with those prescribed by the Imaging Biomarker Standardization Initiative (IBSI) [45]; any differences are explicitly stated in the PyRadiomics documentation. From a radiological point of view, every fundamental CT alteration (ground-glass opacity, consolidation, linear opacity, nodules) can be found in COVID-19 interstitial pneumonia as well as in neoplastic lesions. However, tumors are usually discrete focal lesions, while in inflammatory conditions, such as COVID-19 pneumonia, the same findings may be widely and bilaterally distributed. Therefore, the distribution of the lesions is decisive for a correct interpretation.
According to Gillies et al., radiomic features can be distinguished into "semantic" and "agnostic" features [46]. The former are those generally adopted by radiologists to describe regions of interest (ROIs), whereas the latter are those devoted to capturing lesion heterogeneity via quantitative descriptors. In this work, we exploited agnostic features, which include first-, second-, and higher-order statistical indicators.
First-order statistical indicators refer to histogram-based features, which do not take into account the spatial relationships of voxels but only consider the distribution of individual voxel intensities. Starting from an ROI, it is possible to extract histogram descriptors such as mean, median, energy, entropy, minimum, maximum, percentiles, interquartile range, skewness (an indicator of asymmetry), kurtosis (a descriptor of flatness), and so on.
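A few of these histogram-based descriptors can be computed with NumPy, as a simplified sketch; PyRadiomics applies additional discretization and normalization steps before extracting its first-order features.

```python
import numpy as np

def first_order_features(roi):
    """A few histogram-based descriptors of the voxel intensities in an ROI."""
    x = np.asarray(roi, dtype=float).ravel()
    mu, sigma = x.mean(), x.std()
    # probability of each discrete intensity value, used for the entropy
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return {
        "mean": float(mu),
        "median": float(np.median(x)),
        "energy": float(np.sum(x ** 2)),
        "entropy": float(-np.sum(p * np.log2(p))),
        "iqr": float(np.percentile(x, 75) - np.percentile(x, 25)),
        "skewness": float(np.mean((x - mu) ** 3) / sigma ** 3),
        "kurtosis": float(np.mean((x - mu) ** 4) / sigma ** 4),
    }
```

For a symmetric intensity distribution the skewness is 0, and a two-valued uniform distribution yields an entropy of exactly 1 bit.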
Second-order statistical indicators are those usually referred to as "texture" features; they can capture the spatial interrelationships between voxel intensity values. They can be derived, for instance, from the gray-level co-occurrence matrix (GLCM), which counts how often pairs of voxel intensities co-occur at a given offset along a given direction, or from the gray-level run-length matrix (GLRLM), which quantifies runs of consecutive voxels with the same intensity along given directions. Starting from the GLCM, it is possible to extract features such as correlation, contrast, auto-correlation, dissimilarity, energy, cluster tendency and shade, difference entropy, and so on. From the GLRLM, it is possible to extract features that describe short-run and long-run emphasis, run-length and gray-level non-uniformity, low and high gray-level run emphasis, run percentage, and so on [47].
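A minimal GLCM computation for a single offset, with the contrast feature derived from it, can be sketched as follows. This is only an illustration: PyRadiomics aggregates multiple directions and distances and applies its own discretization beforehand.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Count co-occurrences of gray levels separated by the offset (dy, dx)."""
    img = np.asarray(image)
    P = np.zeros((levels, levels), dtype=float)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                P[img[y, x], img[y2, x2]] += 1
    if symmetric:  # PyRadiomics computes the symmetric GLCM by default
        P = P + P.T
    return P

def glcm_contrast(P):
    """Contrast = sum over (i, j) of (i - j)^2 * p(i, j) on the normalized GLCM."""
    p = P / P.sum()
    i, j = np.indices(p.shape)
    return float(np.sum((i - j) ** 2 * p))
```

A constant image has zero contrast, while a checkerboard of adjacent gray levels maximizes it for the horizontal offset.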
Higher-order statistical indicators can be attained by applying statistical methods after filtering or mathematically transforming the images. Examples of such transformations are the wavelet transform or the Laplacian of Gaussian (LoG) filter, which permits the extraction of zones with increasingly coarse texture patterns [48].
Lastly, shape features describe the geometric properties of the delineated ROI, such as surface, volume, maximum diameter along diverse orthogonal directions, sphericity, and compactness. Shape features can be extracted from both 2D and 3D ROIs. It is worth noting that this kind of analysis can lead to the generation of feature vectors with hundreds of elements, thus exposing the analysis to the risk of redundant information. Moreover, the number of extracted features can be higher than the sample size, reducing the power and the generalization capability of the study. Therefore, the exploitation of feature selection or dimensionality reduction techniques is strongly recommended in order to generate highly informative, reproducible, and non-redundant feature vectors [47].

Segmentation Evaluation Metrics
The common procedure to assess the performance of a classification model is to evaluate the confusion matrix, from which different metrics can be calculated, including Precision (Equation (4)), Recall (Equation (5)), and Dice (Equation (6)):

$$Precision = \frac{TP}{TP + FP} \quad (4)$$

$$Recall = \frac{TP}{TP + FN} \quad (5)$$

$$Dice = \frac{2\,TP}{2\,TP + FP + FN} \quad (6)$$

where TP, TN, FP, and FN are the number of true positives, true negatives, false positives, and false negatives, respectively. For a semantic segmentation task, such as the one proposed in this paper, it is useful to adopt evaluation metrics based on volumetric overlap, such as the aforementioned Dice coefficient, defined for binary volumes as in Equation (7):

$$Dice(G, P) = \frac{2\,|G \cap P|}{|G| + |P|} \quad (7)$$

where G is the ground truth volume and P is the binary prediction volume. A high value of Dice means that the prediction is superimposable on the ground truth. This definition is equivalent to the one provided in Equation (6). The relative volume difference (RVD) is useful especially when combined with other measures, such as the Dice coefficient, to understand whether the model is over- or under-segmenting. It is defined as in Equation (8):

$$RVD(G, P) = \frac{|P| - |G|}{|G|} \quad (8)$$

When it is important to assess the volumetric shape of the segmented region, it is useful to define metrics based on surface distances. The most common ones are the maximum symmetric surface distance (MSSD, or symmetric Hausdorff distance) and the average symmetric surface distance (ASSD). Interested readers may refer to [36,49,50,51] for further details.
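The overlap-based metrics above can be computed directly from binary masks; a minimal NumPy sketch (the function name is ours):

```python
import numpy as np

def segmentation_metrics(G, P):
    """Precision, Recall, Dice, and RVD for binary ground truth G and prediction P."""
    G, P = np.asarray(G, bool), np.asarray(P, bool)
    tp = np.sum(G & P)
    fp = np.sum(~G & P)
    fn = np.sum(G & ~P)
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "rvd": (P.sum() - G.sum()) / G.sum(),  # > 0 indicates over-segmentation
    }
```

Note that RVD is signed, which is exactly why it complements Dice: a Dice of 2/3 alone does not reveal whether the model over- or under-segments.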

Segmentation Experimental Results
From the experiments conducted, we obtained the metrics reported in Table 2. Example CT scans, ground truth, and predictions can be seen in Figure 3. It can be clearly observed that the lung model (LM) has some trouble predicting the whole lung region when it is highly lesioned. To overcome this difficulty, we combined it with a lesion model, obtaining the lung model plus lesion model (LLM), which is more likely to correctly identify lungs with lesioned regions. Since the human body contains only two lungs, which form closed structures from a morphological point of view, applying post-processing based on morphological operators and connected component analysis offers several benefits: MSSD is reduced from an average of more than 90 mm to an average of 37 mm with the lung model plus lesion model plus post-processing (LLMPP).

Radiomics Lung Characterization
In this phase, we investigated the features extracted from the 20 slices containing the largest portion of lung in each CT scan, in order to understand whether the feature distributions differ between patients who developed PTE and those who did not. To obtain the lung masks, we exploited the semantic segmentation methodology described in Section 3.1 of this paper.
To the best of our knowledge, the possibility to characterize lung parenchyma and lesions with radiomics methodologies from COVID-19 patients, with respect to the onset of PTE, has never been explored before; therefore, this can be considered one of the main contributions of this paper.
CT scans were considered in a 2D slice-wise fashion on the axial plane, resampling the x and y axes to a resolution of 1 mm with the sitkBSpline interpolator. According to the PyRadiomics documentation, CT intensity values reflect absolute world values (HU) and should be comparable across scanners. In studies involving multiple scanners, researchers have to check whether the extracted features are correlated with the scanner adopted; however, given that our cohort comes from a single hospital, this issue does not arise. Image discretization was performed with a bin width of 25, often considered a default in PyRadiomics usage (e.g., https://pyradiomics.readthedocs.io/en/latest/customization.html, accessed on 8 October 2021). This process resulted in the extraction of 919 features for 400 samples (20 slices for each of the 20 CT scans).
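The fixed-bin-width discretization step can be sketched as follows, mirroring the formula given in the PyRadiomics documentation (gray levels start at 1 and bin edges are aligned to multiples of the bin width):

```python
import numpy as np

def discretize_fixed_bin_width(img, bin_width=25.0):
    """Map intensities to discrete gray levels with a fixed bin width:
    X_d = floor(X / W) - floor(min(X) / W) + 1."""
    img = np.asarray(img, dtype=float)
    lowest = int(np.floor(img.min() / bin_width))
    return np.floor(img / bin_width).astype(int) - lowest + 1
```

With a bin width of 25 HU, the intensities −50 and −26 fall in the same bin, while −25 starts a new one.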
We opted to visualize these features in a low-dimensional space, considering both principal component analysis (PCA) [60] and t-distributed stochastic neighbor embedding (t-SNE) [61]. The resulting 2D embeddings with PCA and t-SNE are portrayed in Figure 4. No clear pattern emerged from either the PCA or the t-SNE embedding of the original feature space, but the representation improves after selecting a subset of uncorrelated and significant features.
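The 2D PCA projection used for this kind of visualization can be obtained with a short SVD-based sketch (an illustration without scikit-learn; t-SNE, being iterative, needs a dedicated implementation):

```python
import numpy as np

def pca_2d(X):
    """Project samples (rows of X) onto the first two principal components."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T                         # scores on the top-2 components
```

By construction, the first component carries at least as much variance as the second.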
We then performed a feature reduction procedure based on statistical Mann-Whitney U tests on our radiomic feature set. In particular, for each feature we tested whether its distributions in the PTE and non-PTE samples differ, and we kept the features whose corrected p-value was less than 0.01, resulting in a set of 109 features. The Holm-Šidák correction was exploited to counteract the multiple comparisons problem [62].
Starting from these 109 features, we investigated the correlation matrix with a cluster map (https://seaborn.pydata.org/generated/seaborn.clustermap.html, accessed on 8 October 2021) in order to assess the correlations between features and possibly reduce the feature set. The cluster map performs hierarchical clustering; in our case, we used the unweighted pair group method with arithmetic mean (UPGMA) [63] clustering algorithm, with the Euclidean distance as metric. The cluster map is shown in Figure 5; a 5000 × 5000 version of the same image, with all the feature names on both axes, is available in the supplementary materials. It can be seen that there are indeed clusters of correlated features; in particular, many features of the glcm, glrlm, and glszm families show high correlations within the same family. Consequently, we proceeded to remove features with a correlation higher than 0.5, resulting in nine uncorrelated radiomic features that can be considered for the realization of a prognostic signature related to PTE assessment in COVID-19 patients. The procedure exploited for removing correlated features is reported in Algorithm 1. The nine radiomic features we propose as prognostic signature are:
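A removal procedure of this kind can be sketched as a greedy scan over the correlation matrix. The exact removal order is our assumption (the paper specifies only the 0.5 threshold); here, the later feature of each correlated pair is dropped.

```python
import numpy as np

def remove_correlated(X, names, threshold=0.5):
    """Greedily drop the later feature of any pair whose absolute Pearson
    correlation exceeds the threshold; return the reduced matrix and names."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    n = X.shape[1]
    to_remove = set()
    for i in range(n):
        if i in to_remove:
            continue
        for j in range(i + 1, n):
            if corr[i, j] > threshold:
                to_remove.add(j)
    keep = [k for k in range(n) if k not in to_remove]
    return X[:, keep], [names[k] for k in keep]
```

Because already-removed features are skipped as pivots, a feature is only ever discarded in favor of one that survives.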
wavelet-HH_glcm_ClusterShade, defined in Equation (A5). These feature names, derived from PyRadiomics, are composed of three parts, separated by underscores.

Algorithm 1: Correlated Features Removal.
input : X, the original dataset, an N × P matrix; the correlation threshold // set to 0.5
output : LC, the list of correlated features to remove

The first part refers to the image from which the features were extracted. It can be the original image (original), the image after applying LoG with a given sigma (log-sigma-), or the image after applying the wavelet transform (wavelet-). In particular, four decompositions per level were obtained with Coiflets [64,65] from 2D slices; that is, all possible combinations of applying either a high-pass or a low-pass filter in each of the two dimensions. The parameter N of the Coiflets was set to 1 (e.g., see https://it.mathworks.com/help/wavelet/ref/coifwavf.html, accessed on 8 October 2021).
The second part refers to the kind of feature, such as first-order statistical indicators (firstorder) or second-order statistical indicators (e.g., glcm or glrlm).
The last part refers to the specific feature name inside its family, as defined by the PyRadiomics documentation [44], which can be consulted for further information. Appendix A summarizes mathematical details and definitions for the features exploited in this study.
Violin plots showing the difference between distributions for PTE and non-PTE cases are reported in Figure 6. Statistics concerning the distributions of these features for PTE and non-PTE cases are reported in Table 3.

Discussion
Several authors have already dealt with the problem of COVID-19 detection from CT scans [12,17,18,21,22] or focused on the task of lung and lesion segmentation [13,[15][16][17][20][21][22]. In our cohort, all of the patients were positive for COVID-19, since the objective was not to realize yet another algorithm for COVID-19 diagnosis. The methods, materials, and results of the related works are summarized in Table 4. It is possible to see that COVID-19 classification algorithms achieve high performance. Regarding segmentation, it is worth noting that some authors did not provide quantitatively comparable metrics [14,16]. Other authors provided metrics for lesion segmentation, but not for the joint segmentation of lung parenchyma and lesions. Nonetheless, our experiments show that our model is better at segmenting lungs than models trained only on non-COVID-19 patients. To the best of our knowledge, no other work in the literature has considered the possibility of exploiting radiomics methodologies to characterize the lungs of PTE versus non-PTE patients. As can be seen from Figure 6 and Table 3, the selected features show different characteristics, from a statistical point of view, between the two conditions. This can aid in laying the basis for the development of a reliable predictive model, which will need to be tested on external and larger validation sets; in this work, we focused on the feature selection and statistical analysis stage.

Table 4. Literature overview for various COVID-19 image tasks, such as segmentation, detection, and classification.
CT stands for computed tomography, ROC stands for receiver operating characteristic, IoU stands for intersection over union, AUC stands for area under the curve, AP stands for average precision, NPV stands for negative predictive value, PR stands for precision-recall, MCC stands for Matthews correlation coefficient, CNN stands for convolutional neural network, FCN stands for fully convolutional network, R-CNN stands for region-based convolutional neural network, DPN stands for dual path network, FRAGL stands for fuzzy region-based active contours driven by weighting global and local fitting energy.


Conclusions and Future Works
The COVID-19 pandemic has had a drastic impact on our world. To address the deadliest complications of COVID-19, CT scans are routinely adopted in diagnostic and prognostic procedures. The task of manually segmenting lung parenchyma and lesions is tedious and error-prone, and it can suffer from intra- and inter-operator variability. Therefore, we proposed an automated workflow for segmenting lung lesions and parenchyma, aiding the work of radiologists and laying the basis for further analysis on the segmented region. The segmentation achieved accurate results, with a mean Dice coefficient of 97.48% and a mean ASSD of 0.95 mm for the chosen model. We also addressed the problem of finding a prognostic signature that can be used to characterize the risk of the onset of PTE, deriving nine uncorrelated features that show significantly different statistical distributions according to the Mann-Whitney U test. However, more data need to be acquired in order to better understand the generalizability of the discovered feature set, which is considered as part of future work.

Institutional Review Board Statement:
The study was conducted according to the guidelines of the Declaration of Helsinki. Ethical review and approval were waived for this study since this was a retrospective observational study with anonymized data.
Informed Consent Statement: Patient consent was waived due to the fact that this was a retrospective observational study with anonymized data, already acquired for medical diagnostic purposes.

Data Availability Statement:
The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Radiomic Features Definitions
In this appendix, we introduce definitions for the radiomic features considered throughout this study. The interested reader may consult PyRadiomics documentation [44] or the IBSI reference manual [45] for further information.
The GLCM describes the second-order joint probability function of an image ROI. The (i, j)-th element of this matrix represents the number of occurrences of levels i and j separated by δ voxels along the θ direction in the image, that is, P_GLCM(i, j | δ, θ). PyRadiomics by default computes the symmetric GLCM, resulting in a symmetric P matrix. The normalized version can be obtained as in Equation (A1):

$$p_{GLCM}(i,j) = \frac{P_{GLCM}(i,j)}{\sum_{i}\sum_{j} P_{GLCM}(i,j)} \quad (A1)$$

Then, the GLCM maximum probability (MP_GLCM), which is our feature 1, can be defined as in Equation (A2):

$$MP_{GLCM} = \max_{i,j} \, p_{GLCM}(i,j) \quad (A2)$$

In order to define the GLCM inverse variance (IV_GLCM), which is our feature 4, we need to introduce the diagonal probabilities, as in Equation (A3):

$$p_{x-y}(k) = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} p_{GLCM}(i,j), \quad |i-j| = k; \; k = 0, 1, \ldots, N_g - 1 \quad (A3)$$

where N_g represents the number of discrete intensity levels in the image. Then, IV_GLCM can be defined as in Equation (A4):

$$IV_{GLCM} = \sum_{k=1}^{N_g - 1} \frac{p_{x-y}(k)}{k^2} \quad (A4)$$

The GLCM cluster shade (CS_GLCM), which is our feature 9, measures the skewness and uniformity of the GLCM and can be defined as in Equation (A5):

$$CS_{GLCM} = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} \left( i + j - \mu_x - \mu_y \right)^3 p_{GLCM}(i,j) \quad (A5)$$

where:

$$\mu_x = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} i \, p_{GLCM}(i,j), \qquad \mu_y = \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} j \, p_{GLCM}(i,j)$$

The gray-level size zone matrix (GLSZM) counts the zones of connected voxels that share the same gray-level intensity. The (i, j)-th element represents how many zones with gray-level i and size j appear in the image. We denote the GLSZM as P_GLSZM(i, j). The normalized GLSZM, p_GLSZM(i, j), can be obtained as in Equation (A10):

$$p_{GLSZM}(i,j) = \frac{P_{GLSZM}(i,j)}{N_z} \quad (A10)$$

where N_z is the number of zones in the ROI. The GLSZM gray-level variance (GLV_GLSZM), which is our feature 2, measures the variance in gray-level intensities for the zones, as shown in Equation (A11):

$$GLV_{GLSZM} = \sum_{i=1}^{N_g} \sum_{j=1}^{N_s} p_{GLSZM}(i,j) \left( i - \mu_{GLSZM} \right)^2 \quad (A11)$$

where N_s is the number of discrete zone sizes in the image and μ_GLSZM can be defined as in Equation (A12):

$$\mu_{GLSZM} = \sum_{i=1}^{N_g} \sum_{j=1}^{N_s} i \, p_{GLSZM}(i,j) \quad (A12)$$

The gray-level dependence matrix (GLDM) portrays the gray-level dependencies in an image, defined as the number of connected voxels within a distance δ from the center voxel. If |i − j| ≤ α, where α is a customizable cutoff parameter, the neighboring voxel with gray-level j is regarded as dependent on the center voxel having gray-level i.
The (i, j)-th element describes the number of times a voxel with gray-level i and j dependent voxels in its neighborhood appears in the image. We denote the GLDM as P_GLDM(i, j). The normalized GLDM, p_GLDM(i, j), can be obtained as in Equation (A13):

$$p_{GLDM}(i,j) = \frac{P_{GLDM}(i,j)}{N_{dz}} \quad (A13)$$

where N_dz is the number of dependency zones in the ROI. We can define the GLDM dependence entropy (DE_GLDM), which is our feature 3, as in Equation (A14):

$$DE_{GLDM} = -\sum_{i=1}^{N_g} \sum_{j=1}^{N_d} p_{GLDM}(i,j) \log_2 \left( p_{GLDM}(i,j) + \epsilon \right) \quad (A14)$$

where N_d is the number of discrete dependency sizes in the image and ε is a small constant for avoiding numeric errors.

The gray-level run length matrix (GLRLM) analyzes gray-level runs, namely the number of consecutive voxels having the same gray-level value. The (i, j)-th element of a GLRLM represents, along a given direction θ, the number of runs with gray-level i and length j that occur in the image. We denote the GLRLM as P_GLRLM(i, j | θ). The normalized GLRLM, p_GLRLM(i, j | θ), can be obtained as in Equation (A15):

$$p_{GLRLM}(i,j\,|\,\theta) = \frac{P_{GLRLM}(i,j\,|\,\theta)}{N_r(\theta)} \quad (A15)$$

where N_r(θ) is the number of runs in the image along the direction θ. The GLRLM gray-level variance (GLV_GLRLM), which is our feature 5, assesses the variance in gray-level intensities for the runs, as defined in Equation (A16):

$$GLV_{GLRLM} = \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} p_{GLRLM}(i,j\,|\,\theta) \left( i - \mu_{GLRLM} \right)^2 \quad (A16)$$

where N_r is the number of discrete run lengths in the image and μ_GLRLM is defined as in Equation (A17):

$$\mu_{GLRLM} = \sum_{i=1}^{N_g} \sum_{j=1}^{N_r} i \, p_{GLRLM}(i,j\,|\,\theta) \quad (A17)$$
The long-run low gray-level emphasis (LRLGLE), which is our feature 6, measures the joint distribution of long run lengths with low gray-level values, as defined in Equation (A18):

$$LRLGLE = \frac{\sum_{i=1}^{N_g} \sum_{j=1}^{N_r} P_{GLRLM}(i,j\,|\,\theta) \, \frac{j^2}{i^2}}{N_r(\theta)} \quad (A18)$$

The neighboring gray tone difference matrix (NGTDM) evaluates the difference between a gray value and the average gray value of the neighboring voxels within a distance δ. The matrix stores the sum of absolute differences for gray-level i. Let X_gl be the set of delineated voxels and x_gl(j_x, j_y) ∈ X_gl be the gray-level of a voxel at location (j_x, j_y); then the average gray-level of the neighborhood, A_i, can be defined as in Equation (A19):

$$A_i = \frac{1}{W} \sum_{k_x = -\delta}^{\delta} \sum_{k_y = -\delta}^{\delta} x_{gl}(j_x + k_x, j_y + k_y) \quad (A19)$$

where (k_x, k_y) ≠ (0, 0); x_gl(j_x + k_x, j_y + k_y) ∈ X_gl; and W represents the number of voxels in the neighborhood.
The NGTDM contrast (C_NGTDM), which is our feature 7, is a measure of spatial intensity variability and can be defined as in Equation (A20):

$$C_{NGTDM} = \left( \frac{1}{N_{g,p}\left(N_{g,p} - 1\right)} \sum_{i=1}^{N_g} \sum_{j=1}^{N_g} p_i \, p_j \left( i - j \right)^2 \right) \left( \frac{1}{N_{v,p}} \sum_{i=1}^{N_g} s_i \right) \quad (A20)$$

where p_i ≠ 0 and p_j ≠ 0; N_g,p is the number of gray-levels where p_i ≠ 0; N_v,p is the number of voxels in X_gl; p_i, i = 1, ..., N_g, is the relative frequency of the i-th gray-level; and s_i is the sum of absolute differences for gray-level i stored in the NGTDM.

Within the realm of first-order statistics, the skewness (g_1), which is our feature 8, measures the asymmetry of the distribution of values about the mean and can be defined as in Equation (A21):

$$g_1 = \frac{\frac{1}{N_p} \sum_{i=1}^{N_p} \left( X(i) - \bar{X} \right)^3}{\left( \sqrt{\frac{1}{N_p} \sum_{i=1}^{N_p} \left( X(i) - \bar{X} \right)^2} \right)^3} \quad (A21)$$

where N_p is the number of voxels in the image and X̄ is the mean intensity.