Collinearity and Dimensionality Reduction in Radiomics: Effect of Preprocessing Parameters in Hypertrophic Cardiomyopathy Magnetic Resonance T1 and T2 Mapping

Radiomics and artificial intelligence have the potential to become a valuable tool in clinical applications. Frequently, radiomic analyses through machine learning methods present issues caused by high dimensionality and multicollinearity, and redundant radiomic features are usually removed based on correlation analysis. We assessed the effect of preprocessing—in terms of voxel size resampling, discretization, and filtering—on correlation-based dimensionality reduction in radiomic features from cardiac T1 and T2 maps of patients with hypertrophic cardiomyopathy. For different combinations of preprocessing parameters, we performed a dimensionality reduction of radiomic features based on either Pearson’s or Spearman’s correlation coefficient, followed by the computation of the stability index. With varying resampling voxel size and discretization bin width, for both T1 and T2 maps, Pearson’s and Spearman’s dimensionality reduction produced a slightly different percentage of remaining radiomic features, with a relatively high stability index. For different filters, the remaining features’ stability was instead relatively low. Overall, the percentage of eliminated radiomic features through correlation-based dimensionality reduction was more dependent on resampling voxel size and discretization bin width for textural features than for shape or first-order features. Notably, correlation-based dimensionality reduction was less sensitive to preprocessing when considering radiomic features from T2 compared with T1 maps.


Introduction
Radiomics is a novel tool allowing the extraction of many quantitative morphological, histogram-based, and textural characteristics (i.e., radiomic features) from digital medical images [1]. The underlying idea is that medical images are actually data containing objective and quantitative information, which is not obtainable from qualitative visual inspection as usually performed in routine clinical practice. Artificial intelligence (AI) methods applied to radiomic data have the potential to become a useful tool in a clinical setting, supporting clinical practice and, at the same time, medical understanding of diseases [2]. On the other hand, given the extraction of a large number of features, AI methods applied to radiomic analyses face the problems of high-dimensional data [2]. In addition, while hundreds or thousands of radiomic features can be extracted, the sample sizes of datasets usually available in clinical studies are much smaller. Therefore, when the number of dimensions is larger than the number of samples, multicollinearity must be taken into account as another confounding effect in the analysis [3]. In this case, at least one of the variables can be expressed as a linear combination of the others. This kind of correlation among variables could affect the subsequent analyses and the interpretation of the results. Indeed, from a computational point of view, a group of highly correlated features will bring little or no additional information but will increase the complexity of the algorithm. According to the principle of parsimony (Occam's razor), when there is no significant difference in performance, simpler models should be preferred [4]. In addition, in the field of medicine, interpretation of results is of paramount importance. Although some conventional statistical models and machine learning algorithms can appear simple and interpretable (e.g., decision trees), their results can be biased in the presence of correlated input features.
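To make the linear-dependence statement concrete, the following minimal sketch uses synthetic random numbers (not study data) to show that a feature matrix with fewer samples than features cannot have full column rank. The 26 × 98 shape mirrors the patient and feature counts used in this study; the array name `X` and the random seed are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 26, 98

# Synthetic stand-in for a (patients x radiomic features) table
X = rng.standard_normal((n_samples, n_features))

# The rank can never exceed the number of samples, so at least
# n_features - rank columns are linear combinations of the others.
rank = np.linalg.matrix_rank(X)
n_dependent = n_features - rank
```

Even with completely random inputs, most columns are therefore linearly dependent on the rest, which is why some form of dimensionality reduction is needed before modeling.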
For these reasons, redundant radiomic features are usually removed based on the results of a correlation analysis. This step is critical because the right choice of which features should be eliminated could either improve the performance and explainability of the AI model or affect its performance by removing interesting correlations between features and the study objective [5]. In the previous literature, extensive examples of dimensionality reduction in radiomic features based on correlation analysis have been provided [2,.
Image resampling and discretization, as well as filtering, are some steps of the radiomic workflow that may be performed as preprocessing before radiomic features extraction from the acquired image data [33,34]. In particular, image interpolation at the same voxel size is a common and recommended practice (especially in retrospective studies) to reduce any heterogeneity in acquisition voxel size. In contrast, image discretization is required to ensure that textural features estimation is computationally less burdensome [33,35]. Moreover, applying filters before radiomic features estimation could allow uncovering further tissue characteristics. Previous studies have investigated the dependence of radiomic features extraction on different image preprocessings for various applications, showing an appreciable (low or high) sensitivity of radiomic features estimate to preprocessing [36][37][38][39][40]. For instance, Marfisi et al. [37], investigating the effect of image preprocessing (in terms of image resampling, discretization, and filtering) on radiomic features from cardiac T1 and T2 mapping, have found a remarkable dependence of feature estimates on image filters, while the sensitivity of many radiomic features to image resampling and discretization was limited. On the other hand, in computed tomography imaging, the effect of image resampling can be clearly appreciable for first-order features and relevant for textural features [40]. Traverso et al. have shown that textural features derived from apparent diffusion coefficient maps appear to be highly or moderately sensitive to image preprocessing [41]. However, to the best of our knowledge, no work has assessed the dependence of dimensionality reduction based on collinearity analysis on image preprocessing.
Cardiac magnetic resonance (CMR) has a crucial role in diagnosis, risk stratification, and treatment planning in hypertrophic cardiomyopathy (HCM) [42], a genetically determined disease that affects about 1 in 500 people in the general adult population [43]. Traditionally, CMR evaluation of HCM patients relied on cine steady-state free precession images, used to define myocardial thickness and to calculate ventricular function, as well as on late gadolinium enhancement sequences for the identification of focal myocardial fibrosis, which has been demonstrated to have a negative prognostic value. Moreover, classic T2 STIR (short tau inversion recovery) sequences can be employed to detect myocardial edema, which has a role in arrhythmic risk stratification.
In recent years, in addition to the traditional CMR sequences, T1 and T2 mapping sequences have been developed, allowing a more quantitative evaluation of myocardial changes, namely fibrosis (T1 maps) and edema (T1 and T2 maps). T1 and T2 mapping have been shown to have a potentially important diagnostic and prognostic role in HCM patients (see [44] for a review). Radiomic analysis of T1 and T2 maps is hence particularly attractive, as it could help overcome some limitations of T1 and T2 values, which have been shown to overlap between different myocardial diseases, as well as between patients and healthy controls [45][46][47][48][49]. For a proper clinical application of radiomics, the influence of each step of the radiomic workflow on feature estimation should be considered. Nevertheless, the effect of preprocessing on radiomic feature selection is not usually considered in clinical studies.
Therefore, in the present study, we aimed to comprehensively assess the effect of preprocessing (in terms of voxel size resampling, discretization, and filtering) on correlation-based dimensionality reduction in radiomic features from quantitative cardiac T1 and T2 maps in a group of patients with HCM.

Dataset
Between November 2013 and July 2020, twenty-six patients with known or suspected HCM were referred for clinical cardiac magnetic resonance imaging (MRI). A complete MRI scan with both T1 and T2 mapping sequences was performed for these patients. The HCM diagnoses were made following the most recent recommendations of the European Society of Cardiology. They were based on the detection of a left ventricular wall thickness ≥15 mm in one or more myocardial segments that was not explained solely by loading conditions [42]. Table 1 provides details about the patient group's clinical and cardiac MRI-derived characteristics. Cine scans were obtained using a TrueFISP sequence (TR = 2.5 ms, TE = 1.2 ms, slice thickness = 8 mm) in the 2- and 4-chamber view planes (3 slices each), as well as in the short-axis view (8-14 slices comprising the entire left ventricle).

T1 and T2 Maps Preprocessing
A cardiac MRI specialist with 15 years of expertise manually segmented the whole myocardium of each subject using 3D Slicer software (version 4.11.2) [52,53]. In order to avoid partial volume effects, the myocardium area was independently defined on T1 and T2 maps. A myocardial segmentation for a representative HCM patient is shown in Figure 1. In this study, we independently applied three preprocessing steps on T1 and T2 maps: (1) voxel size resampling, (2) discretization, and (3) filtering.
Given that T1 and T2 mapping only enabled the acquisition of a single slice, we performed voxel size resampling by 2D interpolation using the B-spline interpolation algorithm (with the origins of interpolation and original image grids aligned together [33]). Calculated T1 and T2 maps, which had an in-plane spatial resolution varying across subjects from 1.77 mm × 1.77 mm to 2.34 mm × 2.34 mm, were resampled to achieve in-plane isotropic spatial resolutions of 1.8 mm, 1.9 mm, 2.0 mm, 2.1 mm, 2.2 mm, 2.3 mm, and 2.4 mm.
The Image Biomarker Standardisation Initiative (IBSI) has suggested carrying out image discretization with fixed bin width when dealing with quantitative data, such as T1 and T2 maps [33]. In this study, for each resampling voxel size, the discretization bin width values used for T1 maps were 3.60 ms, 3.95 ms, 4.30 ms, 4.65 ms, 5.00 ms, 5.35 ms, 5.70 ms, 6.05 ms, and 6.40 ms, while bin width values used for T2 maps were 0.49 ms, 0.50 ms, 0.51 ms, 0.52 ms, 0.53 ms, 0.54 ms, 0.55 ms, 0.56 ms, and 0.57 ms. Specifically, bin width values were chosen so that the number of quantization levels for T1 and T2 maps was within the range of 30-130 for each HCM patient. This approach, previously used in other technical investigations [41,54,55,56], may reduce the variability in estimating radiomic features [57][58][59].
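The relationship between bin width and the number of quantization levels can be illustrated with a small sketch. It assumes one common fixed-bin-width convention in which bin edges sit at integer multiples of the bin width; the function name `n_quantization_levels` and the T1 values are hypothetical and are not taken from the study data.

```python
import math

def n_quantization_levels(values, bin_width):
    """Number of gray levels spanned by `values` under fixed-bin-width
    discretization, assuming bin edges at integer multiples of `bin_width`."""
    lowest_bin = math.floor(min(values) / bin_width)
    highest_bin = math.floor(max(values) / bin_width)
    return highest_bin - lowest_bin + 1

# Hypothetical myocardial T1 values (ms) inside one region of interest
t1_roi = [900.0, 1012.5, 1104.0, 1300.0]

levels = n_quantization_levels(t1_roi, bin_width=5.0)
in_target_range = 30 <= levels <= 130  # the 30-130 range mentioned above
```

With these illustrative values, a 400 ms intensity range and a 5 ms bin width give 81 levels, which falls inside the 30-130 target range.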
Different filters were applied to T1 and T2 maps, including the gradient magnitude of the map (i.e., gradient filter), the square of the map values (i.e., square filter), the square root of the absolute map value (i.e., square-root filter), and 2D wavelets (Daubechies 3). The last one yielded four filtered maps obtained through different combinations of 2D wavelets (i.e., wavelet-LH, -HL, -HH, and -LL), where L/H refers to the combination of low-/high-pass filters applied in the horizontal and vertical directions. Specifically, filtering was carried out at a fixed isotropic in-plane resampling voxel size of 2.1 mm and at a fixed discretization bin width of 6 ms and 0.56 ms for T1 and T2 maps, respectively. These bin width values ensured a median (across subjects) number of quantization levels between 30 and 130 [57][58][59].
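As a rough illustration of the three non-wavelet filters described above, the sketch below applies them to a 2D array with NumPy. It follows only the verbal definitions given in the text; library implementations may additionally rescale the filtered intensities, and the wavelet decomposition is omitted. The function names are assumptions made for this example.

```python
import numpy as np

def gradient_magnitude(img, spacing=1.0):
    """Magnitude of the in-plane intensity gradient (finite differences)."""
    grad_rows, grad_cols = np.gradient(img.astype(float), spacing)
    return np.hypot(grad_rows, grad_cols)

def square(img):
    """Square of the map values."""
    return img.astype(float) ** 2

def square_root(img):
    """Square root of the absolute map values."""
    return np.sqrt(np.abs(img.astype(float)))

# Synthetic 4 x 5 ramp whose intensity grows by 2 per pixel along columns
ramp = np.tile(2.0 * np.arange(5), (4, 1))
grad = gradient_magnitude(ramp)
```

On the synthetic ramp the gradient magnitude is constant, which shows how a gradient filter discards the absolute T1 or T2 level and keeps only local intensity changes.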
All preprocessing steps and subsequent radiomic features estimations were carried out by using the open-source PyRadiomics library [60] (version 3.0.1) and Python (version 3.7.3) running on a MacBook Air (macOS version 10.14) with a 1.8 GHz Intel Core i5 CPU.
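For readers who want to reproduce a comparable setup, a PyRadiomics configuration for one preprocessing combination might look like the sketch below. This is an assumption about how such a configuration could be written, not the authors' actual settings file; the helper name `make_settings` is hypothetical, and the commented-out lines show how the dictionary would typically be passed to the extractor.

```python
def make_settings(voxel_mm, bin_width_ms):
    """Build a settings dictionary for one (voxel size, bin width) combination."""
    return {
        # In-plane resampling only; 0 leaves the out-of-plane spacing unchanged
        "resampledPixelSpacing": [voxel_mm, voxel_mm, 0],
        "interpolator": "sitkBSpline",  # B-spline interpolation
        "binWidth": bin_width_ms,       # fixed-bin-width discretization
        "force2D": True,                # single-slice maps: 2D texture matrices
        "distances": [1],               # neighbor distance of 1 pixel
        "gldm_a": 0,                    # GLDM coarseness parameter
    }

# Combination used for the filtering experiments on T1 maps
settings = make_settings(voxel_mm=2.1, bin_width_ms=6.0)

# Typical usage (requires PyRadiomics; file paths are placeholders):
# from radiomics import featureextractor
# extractor = featureextractor.RadiomicsFeatureExtractor(**settings)
# features = extractor.execute("t1_map.nrrd", "myocardium_mask.nrrd")
```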

Radiomic Features Estimation
Given that the acquisition sequences used yielded T1 and T2 maps on a single slice, the 2D versions of radiomic features were considered. For each preprocessing combination, in terms of resampling voxel size and discretization bin width, a total of 98 features were obtained from both T1 and T2 maps: nine 2D shape features, 16 first-order features (14 intensity-based statistical features and two intensity histogram features, namely Entropy and Uniformity), and 73 second-order features (i.e., textural features) from the gray-level co-occurrence matrix (GLCM, 22 features), gray-level run length matrix (GLRLM, 16 features), gray-level size zone matrix (GLSZM, 16 features), gray-level dependence matrix (GLDM, 14 features, with coarseness parameter α = 0), and neighborhood gray-tone difference matrix (NGTDM, 5 features). Second-order feature estimation was performed according to the Chebyshev norm with a distance of 1 pixel. GLCM and GLRLM features were computed from each 2D directional matrix (i.e., at 0°, 45°, 90°, and 135°) and averaged over 2D directions.
For each filter applied on T1 and T2 maps, 89 features were estimated for both T1 and T2 maps, i.e., all the above except the shape features. Indeed, given that shape features are usually estimated regardless of the applied image filter, they were not included in our analysis.
All radiomic features were computed following the definitions provided by the IBSI [33]. It is worth noting that the first-order feature of Kurtosis calculated by PyRadiomics was in accordance with the IBSI, except for an offset value (i.e., 3).
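The offset mentioned above can be checked with a few lines of plain Python. The sketch computes kurtosis from the second and fourth central moments, first without the offset and then as excess kurtosis (offset of 3 subtracted); the function name and the sample values are illustrative assumptions.

```python
def kurtosis_pair(values):
    """Return (kurtosis without offset, excess kurtosis) of `values`."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    plain = m4 / m2 ** 2   # no offset
    excess = plain - 3.0   # offset of 3 subtracted
    return plain, excess

plain, excess = kurtosis_pair([1.0, 2.0, 3.0, 4.0, 5.0])
```

Because the two conventions differ only by a constant, the choice does not change any correlation coefficient between Kurtosis and the other features.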

Collinearity Analysis and Dimensionality Reduction
For T1 and T2 maps, three different effects on radiomic feature collinearity and dimensionality reduction were assessed: Effect A, i.e., for each discretization bin width, the effect of using different resampling voxel sizes; Effect B, i.e., for each resampling voxel size, the effect of using different discretization bin widths; and Effect C, i.e., at fixed resampling voxel size and discretization bin width, the effect of using different filters.
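The grouping behind effects A and B can be sketched as follows for T1 maps, using the voxel sizes and bin widths listed in the preprocessing section. The variable names are assumptions for illustration; each group collects the preprocessing combinations whose remaining feature subsets are compared with one another.

```python
from itertools import product

voxel_sizes_mm = [1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4]
t1_bin_widths_ms = [3.60, 3.95, 4.30, 4.65, 5.00, 5.35, 5.70, 6.05, 6.40]

# Every (voxel size, bin width) combination analyzed for T1 maps
combos = list(product(voxel_sizes_mm, t1_bin_widths_ms))

# Effect A: one group per bin width, voxel size varies inside the group
effect_a = {bw: [(vs, bw) for vs in voxel_sizes_mm] for bw in t1_bin_widths_ms}

# Effect B: one group per voxel size, bin width varies inside the group
effect_b = {vs: [(vs, bw) for bw in t1_bin_widths_ms] for vs in voxel_sizes_mm}
```

This gives 63 combinations, nine groups of seven for effect A, and seven groups of nine for effect B.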
For all combinations of preprocessing (in terms of resampling voxel size, discretization bin width, and filter), we performed a collinearity analysis by computing the pair-wise Pearson's correlation coefficient (PCC) [61] and the Spearman's correlation coefficient (SCC) [62] for each pair of radiomic features' values across subjects. All significant correlation coefficients (p-value < 0.05) with absolute value above a predefined threshold were counted and represented in a correlation heatmap. In particular, according to previous studies, we considered cut-off thresholds of 0.8 and 0.9 for the |PCC| [2,7,18,19,63] and the |SCC| [15,22,26,30], respectively.
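A minimal sketch of this counting step, assuming SciPy is available, is given below. It is not the in-house code used in the study; the function name `count_high_correlations`, its arguments, and the toy data are assumptions. Rows of `X` are subjects and columns are features.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def count_high_correlations(X, method="pearson", threshold=0.8, alpha=0.05):
    """Return the absolute correlation matrix and the number of feature pairs
    that are significant (p < alpha) with |coefficient| >= threshold."""
    corr_fn = pearsonr if method == "pearson" else spearmanr
    n_features = X.shape[1]
    abs_corr = np.eye(n_features)
    count = 0
    for i in range(n_features):
        for j in range(i + 1, n_features):
            r, p = corr_fn(X[:, i], X[:, j])
            abs_corr[i, j] = abs_corr[j, i] = abs(r)
            if p < alpha and abs(r) >= threshold:
                count += 1
    return abs_corr, count

# Toy data: feature 1 is a linear function of feature 0, feature 2 alternates
x = np.arange(10.0)
X = np.column_stack([x, 2.0 * x + 1.0, np.tile([1.0, -1.0], 5)])

_, n_pcc = count_high_correlations(X, "pearson", threshold=0.8)
_, n_scc = count_high_correlations(X, "spearman", threshold=0.9)
```

In the toy data only the first two features are strongly correlated, so one pair is counted for both coefficients.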
Subsequently, we executed an iterative dimensionality reduction in radiomic features based on either the PCC or the SCC. For the pair of features with the highest absolute correlation coefficient, we computed the mean absolute correlation coefficient of each of the two features with all the others, removing the feature with the highest mean absolute correlation coefficient. We iteratively repeated each step until the pair-wise correlation coefficients among radiomic features became less than the predefined threshold value.
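The iterative procedure described above can be sketched as follows. This is one possible reading of the description, not the authors' code: it assumes that the mean absolute correlation of a feature is taken over the features still remaining at that iteration, and that ties are resolved by removing the first feature of the pair. The function name `reduce_by_correlation` is hypothetical.

```python
import numpy as np

def reduce_by_correlation(abs_corr, threshold):
    """Return the indices of features kept after iteratively removing, from the
    most correlated pair, the feature with the higher mean absolute correlation."""
    abs_corr = np.asarray(abs_corr, dtype=float)
    remaining = list(range(abs_corr.shape[0]))
    while len(remaining) > 1:
        sub = abs_corr[np.ix_(remaining, remaining)]
        np.fill_diagonal(sub, 0.0)  # ignore self-correlations
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        if sub[i, j] < threshold:
            break  # every remaining pair is below the threshold
        n_others = len(remaining) - 1
        mean_i = sub[i].sum() / n_others
        mean_j = sub[j].sum() / n_others
        remaining.pop(int(i) if mean_i >= mean_j else int(j))
    return remaining

abs_corr = [[1.00, 0.95, 0.10],
            [0.95, 1.00, 0.50],
            [0.10, 0.50, 1.00]]
kept = reduce_by_correlation(abs_corr, threshold=0.8)
```

In this toy matrix, features 0 and 1 form the most correlated pair; feature 1 has the higher mean absolute correlation (0.725 versus 0.525) and is removed, leaving features 0 and 2.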
Using this procedure, from the original radiomic features data, we obtained a specific set of radiomic features with lower dimensionality and redundancy for each combination of preprocessing. Therefore, we (1) compared the number of significant correlations with an absolute correlation coefficient greater than or equal to the threshold value, (2) evaluated the percentage of remaining features after the correlation-based dimensionality reduction, and (3) analyzed the differences in the remaining feature subsets by measuring a stability index [64,65].
The stability index has been proposed by several authors for the study of feature selection, showing how even slight variations in the data can lead to different sets of selected features, in terms of both cardinality and type [64][65][66][67][68][69]. In this work, we computed a measure that belongs to stability by Index/Subset category [70,71]. Briefly, a subset of remaining features is represented as a binary vector, where 0 represents absence and 1 represents the presence of the specific feature. The stability is calculated by the amount of overlap between the overall subsets of remaining features. Specifically, we used the stability index defined by Nogueira et al., which complies with the properties of a stability measure [66]. This stability index takes continuous values between 0 (lowest stability) and 1 (highest stability). In accordance with the work by Kuncheva et al., stability is considered good if it is greater than or equal to 0.5 [64]. This index can be actually used to analyze correlation-based dimensionality reduction, helping us to understand whether preprocessing can introduce changes in the data such as to yield different sets of selected features in terms of both cardinality and type.
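For illustration, a compact re-implementation of the Nogueira et al. stability estimator is sketched below. It is written from the published formula rather than copied from the package used in this study, and it assumes that the average number of selected features is neither 0 nor the total number of features (otherwise the denominator vanishes). Rows of `Z` are the binary vectors described above, one per preprocessing setting.

```python
import numpy as np

def nogueira_stability(Z):
    """Stability of M feature subsets encoded as an (M x d) binary matrix."""
    Z = np.asarray(Z, dtype=float)
    M, d = Z.shape
    p = Z.mean(axis=0)                    # selection frequency of each feature
    var = M / (M - 1) * p * (1.0 - p)     # unbiased variance of each selection
    k_bar = Z.sum(axis=1).mean()          # average subset size
    denom = (k_bar / d) * (1.0 - k_bar / d)
    return 1.0 - var.mean() / denom

identical = [[1, 0, 1, 0],
             [1, 0, 1, 0],
             [1, 0, 1, 0]]
partial = [[1, 1, 0, 0],
           [1, 0, 1, 0]]

s_identical = nogueira_stability(identical)
s_partial = nogueira_stability(partial)
```

Identical subsets give a stability of 1, while the two partially overlapping subsets in the second example give 0.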
The collinearity analysis, dimensionality reduction, and stability analysis were carried out using in-house written Python code (Version 3.10.4) running on an M1 MacBook Air (macOS Monterey version 12.3.1). In particular, we computed the stability index using the Python package freely available at https://github.com/nogueirs/JMLR2018 (accessed on 15 January 2021) [66].

T1 Mapping
When varying resampling voxel size and discretization bin width (i.e., effect A and B, respectively), both the PCC- and SCC-based correlation analysis showed different numbers of significant pair-wise correlation coefficients with an absolute value greater than or equal to the defined threshold values. For T1 mapping, for instance, when considering the resampling voxel sizes of [...] Table 2 for effect A and B and in Supplementary Figures S1-S4.
Table 2. Collinearity analysis and correlation-based dimensionality reduction for T1 maps. In the column "# of CC", the number of pair-wise correlation coefficients that were significant and, in absolute value, greater than the predefined threshold is reported (see Section 2.5 for details). In the column "% of remaining features", the percentage of remaining features after the correlation-based dimensionality reduction is reported.
Different numbers of significant pair-wise correlations between features with absolute correlation coefficient value greater than the defined threshold may yield different percentages of features remaining downstream of dimensionality reduction. With reference to the representative abovementioned example, as the resampling voxel size changed, the percentages of features remaining after the PCC-based dimensionality reduction were [21,24,24,22,23,23,23]% and [21,24,23,23,22,22,24]% for bin width = 3.60 ms and bin width = 3.95 ms, respectively. On the other hand, for the SCC-based dimensionality reduction, the percentages of remaining features were [28,28,29,27,27,32,31]% and [29,29,26,29,29,31,29]%. As shown in Table 2, for both PCC- and SCC-based dimensionality reduction, at fixed discretization bin width (effect A) and resampling voxel size (effect B), the percentage of remaining features changes only slightly when varying resampling voxel size and discretization bin width, respectively.
Nonetheless, it is also important to assess whether and to what extent the type of remaining features is dependent on a specific preprocessing element.
In Table 3, the results of the stability analysis are reported in detail. For both effects A and B, the stability of the features remaining after the dimensionality reduction can be considered relatively high, given that all the stability indices were greater than 0.5 [64].
On the other hand, as the type of filtering varies, the number of significant pairwise correlations with absolute correlation coefficient values greater than the predefined threshold can differ greatly ( Supplementary Figures S5 and S6). In particular, for both PCC and SCC analysis, applying a gradient filter on the original T1 maps leads to a relevant increase in the number of significant correlation coefficients with absolute value greater than the predefined threshold (see Table 2). While the percentage of features remaining downstream of dimensionality reduction seems to vary slightly with filtering, the stability of the remaining feature subsets is relatively low (less than 0.5 [64]), confirming the sensitivity of correlation-based dimensionality reduction to the application of different filters (Table 3).
Heatmaps in Figures 2-5 show in greater detail the relationship between each preprocessing and the specific radiomic features that are retained or eliminated by correlation-based dimensionality reduction. In particular, by fixing one resampling voxel size or discretization bin width while varying the other, these heatmaps represent the ratio between the number of times each radiomic feature was selected and the total number of times the feature could be selected (i.e., the number of considered preprocessing combinations) through the dimensionality reduction process. For T1 mapping, the features belonging to the GLRLM class were almost always removed, indicating high collinearity with the other features (Figures 2e, 3e, [...] (Figures 2f, 3f, 4f and 5f).
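The ratio displayed in these heatmaps can be computed as in the short sketch below, where each set contains the features retained for one preprocessing combination. The function name and the three example subsets are assumptions for illustration and do not reproduce the study results.

```python
from collections import Counter

def selection_frequency(retained_sets, all_features):
    """Fraction of preprocessing combinations in which each feature was retained."""
    counts = Counter(f for subset in retained_sets for f in subset)
    n_combinations = len(retained_sets)
    return {f: counts[f] / n_combinations for f in all_features}

all_features = ["Mean", "Skewness", "Contrast", "RunEntropy"]
retained_sets = [
    {"Skewness", "Contrast"},
    {"Skewness"},
    {"Skewness", "Contrast", "RunEntropy"},
]
freq = selection_frequency(retained_sets, all_features)
```

A value of 1 corresponds to a feature that always remained, 0 to a feature that was always eliminated, and intermediate values to features whose retention depended on preprocessing.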

[Table: stability indices for Effect C (varying filtering, with fixed resampling voxel size of 2.1 mm and discretization bin width of 6 ms), for Pearson-correlation-based and Spearman-correlation-based dimensionality reduction.]

T2 Mapping
Overall, the results for T2 mapping are similar to those for T1 maps. Specifically, when varying resampling voxel size and discretization bin width (i.e., effect A and B, respectively), both the PCC- and SCC-based correlation analysis showed different numbers of pair-wise significant correlation coefficients with absolute value greater than or equal to the defined threshold value. For instance, when considering the discretization bin width values of [0.49 [...] Table 4 and in the correlation heatmaps in Supplementary Figures S7-S10, a similar result can be observed for all preprocessing combinations of effects A and B. Following the abovementioned example, as the discretization bin width changed, the percentages of features remaining after the PCC-based dimensionality reduction were [24,24,24,23,24,23,23,24,24]% and [23,21,22,22,26,22,22,22,21]% for resampling voxel sizes of 1.8 mm and 1.9 mm, respectively. On the other hand, for the SCC-based dimensionality reduction, the percentages of remaining features were [32,33,34,34,31,33,32,31,32]% and [32,32,33,32,32,35,30,31,34]% for resampling voxel sizes of 1.8 mm and 1.9 mm, respectively. As observed for T1 mapping, and reported in detail in Table 4, for both PCC- and SCC-based dimensionality reduction, at fixed discretization bin width (i.e., effect A) and resampling voxel size (i.e., effect B), the percentage of remaining features changes only slightly with varying resampling voxel size and discretization bin width, respectively.
Moreover, the results of the stability analysis, reported in Table 5, indicate that, regardless of the effects A and B, the stability of the features remaining after the dimensionality reduction can be considered relatively high (all the stability indices were greater than 0.5) [64].
The number of significant pair-wise correlations between features with absolute correlation coefficient value greater than the predefined threshold was greatly dependent on the applied filter (Supplementary Figures S11 and S12). In particular, for both PCC and SCC analysis, applying a gradient filter on the original maps yielded a clear increase in the number of significant correlations between features with absolute correlation coefficient value greater than the predefined threshold (see Table 4). As indicated in Table 5, the stability index of the remaining feature subsets is relatively low (less than 0.5), confirming also for T2 maps the sensitivity of correlation-based dimensionality reduction to the application of different filters. Figures 6-9 show the heatmaps of remaining features downstream of the dimensionality reduction process. The features belonging to the GLRLM category were almost always removed, indicating high collinearity with the other features (Figures 6e, 7e, 8e and 9e). For the other classes of features, only a few specific ones were always eliminated: Mean, Variance, MeanAbsoluteDeviation, RobustMeanAbsoluteDeviation, Energy, RootMeanSquared, Entropy, Uniformity (first-order; Figures 6b, 7b, 8b and 9b), SumSquares, Autocorrelation (GLCM; Figures 6c, 7c, 8c and 9c), GrayLevelNonuniformity, GrayLevelVariance, HighGrayLevelEmphasis, LargeDependenceEmphasis (GLDM; Figures 6d, 7d, 8d and 9d), GrayLevelNonUniformityNormalized, and LowGrayLevelZoneEmphasis (GLSZM; Figures 6f, 7f, 8f and 9f).
Table 4. Collinearity analysis and correlation-based dimensionality reduction for T2 maps. In the column "# of CC", the number of pair-wise correlation coefficients that were significant and, in absolute value, greater than the predefined threshold is reported (see Section 2.5 for details). In the column "% of remaining features", the percentage of remaining features after the correlation-based dimensionality reduction is reported.

Discussion
The effect of preprocessing on dimensionality reduction of radiomic features is not usually considered in clinical studies. Therefore, we assessed in detail, for the first time, whether and how voxel size resampling and discretization, as well as filtering, can affect radiomic feature selection, considering the specific case of myocardial T1 and T2 mapping-derived features of a homogeneous group of patients with HCM.
For both T1 and T2 maps, the dependence of radiomic feature collinearity and correlation-based dimensionality reduction on preprocessing, in terms of map resampling and discretization, is relatively moderate. In fact, for all considered cases, the percentage of features remaining downstream of correlation-based dimensionality reduction varies only slightly with resampling voxel size and discretization bin width, within the interval 19-35%. In addition, the stability index is always greater than 0.5, indicating that the feature sets remaining after dimensionality reduction not only have similar cardinality when applying different resampling voxel sizes or discretization bin widths but also contain almost the same radiomic features. Overall, the number of radiomic features eliminated through correlation-based dimensionality reduction is more dependent on resampling voxel size and discretization bin width for textural features than for shape or first-order features. Figures 2-9 clearly show that the textural features remaining after correlation-based dimensionality reduction (panels c-g) vary with preprocessing changes more than the shape (panel a) or first-order features (panel b). Indeed, textural features present intermediate color hues between light yellow and dark blue, indicating that whether a specific textural feature was removed depended on the preprocessing parameter. On the other hand, shape and first-order features mainly present color hues equal to light yellow, i.e., the feature was always eliminated regardless of the preprocessing performed, or dark blue, i.e., the feature always remained regardless of preprocessing.
This result is in line with a previous study on myocardial radiomic features derived from T1 and T2 mapping [37], showing that textural features estimate has a greater sensitivity to resampling voxel size and discretization bin width than shape or first-order features estimate.
For both effects A (i.e., varying resampling voxel sizes, with fixed discretization bin width) and B (i.e., varying discretization bin widths, with fixed resampling voxel size), our results suggest that resampling voxel size has a greater influence on correlation-based dimensionality reduction than discretization bin width (Tables 3 and 5). In fact, the stability indices are, as a whole, greater for effect B than for effect A, indicating that discretization bin width has less influence on Pearson's as well as Spearman's correlation analysis and the subsequent dimensionality reduction. As expected, we confirmed that features belonging to the shape and first-order classes are not sensitive to the change in discretization bin width, given that these radiomic features (except Entropy and Uniformity) were estimated before discretization under the IBSI recommendations [33] (panels (a) and (b) of Figures 4, 5, 8, and 9).
Digital image filters can be applied before radiomic feature extraction to detect and emphasize tissue characteristics different from those usually obtained from original images. In this regard, the IBSI has proposed a new reference manual to define and standardize the implementation of image filters in radiomics software [34]. Given that filtering can actually modify (even in a substantial way) T1 and T2 maps, we observed a relevant sensitivity of both PCC- and SCC-based dimensionality reduction to filtering. Although the percentage of remaining features is similar when using different filters, the stability indices were less than 0.5, indicating that the subsets of selected features were composed of different features.
A remarkable difference between T1 and T2 maps in the sensitivity of collinearity analysis and dimensionality reduction to preprocessing was found. Correlation-based dimensionality reduction on radiomic features from T2 maps was less sensitive to voxel size resampling and discretization than on radiomic features from T1 maps. Regardless of whether PCC or SCC is used, the percentage of T2-derived features eliminated by the dimensionality reduction procedure is less than the percentage of removed T1-derived radiomic features. In addition, the remaining subsets of T2-derived features showed greater stability than the corresponding subsets of T1-derived features. These results, along with the previous findings by Marfisi et al. [37], also support the use of T2 mapping as a potentially useful tool to describe myocardial structural anomalies in patients with HCM. So far, only T1 mapping has been used in previous HCM radiomic research, primarily due to its capacity to detect myocardial fibrosis [47][48][49]72]. T2 maps, however, are regarded as the gold standard for assessing myocardial edema, a well-known adverse prognostic feature in HCM [73,74]. While T2 mapping cannot evaluate myocardial fibrosis per se, texture analyses have the potential to circumvent this constraint by revealing myocardial structural heterogeneity caused by myofibrillar disarray and fibrosis [46,75].
We acknowledge the following issues as potential study limitations. First, we focused only on T1 and T2 maps, although they are particularly suitable for radiomic analysis, given their quantitative nature. Additional studies will be necessary to investigate the effect of preprocessing on correlation-based dimensionality reduction in radiomic features for classical cine, LGE, and STIR sequences. In HCM patients, T1 maps are usually obtained before and after contrast administration to calculate extracellular volume, a surrogate marker of interstitial remodeling and interstitial fibrosis [47,48,72]. However, in the present study, we focused only on native T1 values, mainly considering the possibility of obtaining, from radiomic analysis of non-contrast images, information equivalent to that of gadolinium-enhanced images. Indeed, avoiding contrast administration is a highly desirable prospect currently under investigation.
Second, even though we only looked at single-slice T1 and T2 mapping acquisitions, myocardial alterations in HCM patients may also affect visibly non-hypertrophied cardiac parts. Consequently, a whole-heart coverage might offer a more thorough assessment of disease load and improve the diagnostic effectiveness of cardiac MRI. However, evaluating a single ROI on the mid-cavity short-axis map for a global/diffuse illness is considered sufficient [76]. Our technical study's primary objective was to assess how voxel size resampling and discretization affected radiomic characteristics calculated from standard cardiac T1 and T2 mapping. As a result, we concentrated on a single short-axis slice at a location where myocardial changes were thought to be more severe, and the thickness of the myocardium was at its maximum. This allowed us to obtain minimum partial volume effects, which can significantly affect regions of myocardial segments that are thinner. Furthermore, given that this is a retrospective study with participants who had been referred for clinical or routine cardiac MRI, it is crucial to avoid having excessively lengthy acquisition times, especially for patients who are not cooperative.
Third, we included only twenty-six patients with HCM, representing a homogeneous group with the same pathology. Nonetheless, the findings of this technical investigation may be helpful and prodromic for future clinical studies, which should enroll a larger number of participants and include control subjects to specifically assess the clinical potential of radiomic analysis of T1 and T2 maps in HCM patients. This could lead to a better understanding of the role of CMR in differentiating HCM from hypertensive heart disease and other cardiomyopathies and in discriminating different genotypes of HCM, as well as in assessing arrhythmic risk in these patients [44,47,48].
Finally, our results depend on the cut-off thresholds defined for PCC and SCC. We decided to use the most chosen thresholds from previous studies that have performed correlation-based dimensionality reduction in a machine learning workflow (i.e., 0.8 for |PCC| and 0.9 for |SCC|) to understand this procedure's sensitivity to preprocessing in a real scenario. Studying the changes in dimensionality reduction as the cut-off threshold changes was beyond the scope of this work.

Conclusions
In this HCM study, using radiomic features extracted from T1 and T2 maps, we observed a moderate sensitivity of collinearity analysis and correlation-based dimensionality reduction to some conventional image preprocessing procedures. While, as a whole, this effect is relatively moderate for voxel size resampling and discretization, it is remarkable when considering filtering. Moreover, correlation-based dimensionality reduction is less sensitive to preprocessing when considering radiomic features from T2 compared with T1 maps. Our findings further confirm the effect of preprocessing in radiomic analyses, with the consequent need to consider it when working toward a standardization of methods and when comparing data/results from different clinical studies.