Predicting Progression of Kidney Injury Based on Elastography Ultrasound and Radiomics Signatures

Background: Shear wave elastography ultrasound (SWE) is an emerging non-invasive candidate for assessing kidney stiffness. However, its prognostic value regarding kidney injury is unclear. Methods: A prospective cohort was created from kidney biopsy patients in our hospital from May 2019 to June 2020. The primary outcome was the initiation of renal replacement therapy or death, while the secondary outcome was eGFR < 60 mL/min/1.73 m². Ultrasound, biochemical, and biopsy examinations were performed on the same day. Radiomics signatures were extracted from the SWE images. Results: In total, 187 patients were included and followed up for 24.57 ± 5.52 months. The median SWE value of the left kidney cortex (L_C_median) was an independent risk factor for a kidney prognosis of CKD stage 3 or over (HR 0.890 (0.796–0.994), p < 0.05). The inclusion of 9 out of 2511 extracted radiomics signatures improved the prognostic performance of the Cox regression models containing the SWE and the traditional index (chi-square test, p < 0.001). The traditional Cox regression model had a C-index of 0.9051 (0.8460–0.9196), which was no worse than those of the machine learning models: support vector machine (SVM), SurvivalTree, random survival forest (RSF), CoxBoost, and DeepSurv. Conclusions: SWE can predict kidney injury progression, and its performance is improved by radiomics and Cox regression modeling.


Introduction
Chronic kidney disease (CKD) has become a global health burden, with a prevalence of around 10% [1]. Progression to CKD at over stage 3 was estimated to cost USD 5367 to USD 53,186 per patient per year, a 1.3 to 2.4 fold increase compared with CKD stages 1-2, whereas the costs associated with end-stage renal disease were the highest, ranging from USD 20,110 to USD 100,593 [2]. However, the rate of CKD progression differs between individuals. Acute kidney injury is one of the major causes of and accelerating factors in CKD [3]. Thus, the determination of early predictors of the progression of kidney injury is important. However, current monitoring methods for the progression of kidney disease are not ideal. These include biopsy (which is too invasive to repeat), eGFR (which only declines after most kidney cells lose regenerative capacity and is insensitive to CKD progression), and proteinuria (which is largely affected by etiology and is insensitive and non-specific to CKD progression) [3][4][5]. Studies on imaging techniques and urinary biomarkers are emerging as part of the search for promising non-invasive monitoring methods.
Diagnostics 2022, 12, 2678
Ultrasound remains the preferred non-invasive radiographic method for diagnosing CKD due to its economic and portable properties [6]. Two-dimensional shear wave elastography (SWE) is an emerging technique of elastography ultrasound for evaluating kidney stiffness [7,8]. Based on the physical theory that shear wave propagation velocity is higher in stiffer tissues, the stiffness of kidneys can be estimated by the linear formula of Young's modulus using shear wave velocity obtained through SWE [9,10]. Due to the anisotropy of the kidneys, the SWE parameters usually include Young's modulus value in the cortex and Young's modulus in the medulla [11]. However, a contradictory relationship between SWE and eGFR or histological fibrosis was found among different cohorts with different confounding factors [12]. Furthermore, few studies have reported the predictive value of SWE for the prognosis of CKD.
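The relation between shear wave velocity and stiffness mentioned above is commonly written as E = 3ρc², where E is Young's modulus, ρ is tissue density, and c is the shear wave velocity. The following is a minimal Python sketch of that conversion, assuming an incompressible, isotropic, purely elastic tissue. The density of 1000 kg/m³ and the function names are illustrative assumptions and are not taken from the study.

```python
import math

# Assumed soft-tissue density in kg/m^3 (illustrative, not from the study).
TISSUE_DENSITY = 1000.0


def youngs_modulus_kpa(shear_velocity, density=TISSUE_DENSITY):
    """Young's modulus in kPa from shear wave velocity in m/s (E = 3*rho*c^2)."""
    return 3.0 * density * shear_velocity ** 2 / 1000.0


def shear_velocity_m_s(youngs_kpa, density=TISSUE_DENSITY):
    """Inverse relation: shear wave velocity in m/s from Young's modulus in kPa."""
    return math.sqrt(youngs_kpa * 1000.0 / (3.0 * density))


# Velocity that corresponds to a 9.5 kPa reading under the assumptions above.
velocity = shear_velocity_m_s(9.5)
print(f"{velocity:.2f} m/s")  # prints "1.78 m/s"
```

Under these assumptions, a stiffness reading of 9.5 kPa corresponds to a shear wave velocity of about 1.78 m/s.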
Recent advances in the radiomics analysis of ultrasound images of fibrosis [13,14] and artificial intelligence in clinical diagnostic and prognostic models [15] may provide a new approach to clarifying the relationship between SWE and CKD, as well as alternative CKD progression predictors to kidney biopsy. PyRadiomics is an open-source platform based on Python that has been widely used in radiology, including CT and MRI images, and shows no difference from ultrasomics in ultrasonography [16,17]. PyRadiomics can extract high-throughput quantitative features from the region of interest (ROI) in medical images, including ultrasound images [16][17][18]. The signatures extracted by PyRadiomics include first-order statistics (19 features), shape descriptors (2D and 3D, not often used in ultrasound), and the texture classes of gray level co-occurrence matrix (glcm, 24 features), gray level run length matrix (glrlm, 16 features), gray level size zone matrix (glszm, 16 features), gray level dependence matrix (gldm, 14 features), and neighboring gray tone difference matrix (ngtdm, 5 features), based on original images or images preprocessed with built-in filters. Support vector machine (SVM), SurvivalTree, and random survival forest (RSF) are machine learning models developed for clinical survival analysis based on binary classification [19][20][21]. CoxBoost and DeepSurv are machine learning or deep learning models developed based on the traditional Cox regression method [22,23].
Hence, we observed the predictive value of SWE for CKD progression in our kidney biopsy cohort. We used the clinical index and pathological changes as references. We also applied a PyRadiomics analysis of SWE ultrasound images, traditional Cox regression models, and machine learning models in our study. We hypothesized that SWE would predict CKD progression with or without the help of radiomics and machine learning.

Study Design and Population
The study featured a prospective cohort of kidney biopsy patients in our hospital from May 2019 to June 2020. The inclusion criteria were patients aged 18-70 with unexplained abnormal kidney function, proteinuria over 1 g/day, rapidly progressive glomerulonephritis, or persistent hematuria with proteinuria. The exclusion criteria were inability to cooperate with breath-holding for SWE, contraindication for kidney biopsy (solitary or horseshoe kidney, bilateral kidney atrophy, bleeding tendency, severe hypertension, or acute pyelonephritis), pregnancy, comorbidities of cysts, urological stones or tumors, unilateral kidney atrophy, or pathological diagnosis of acute kidney injury. B-mode, SWE, and collection of serum and urinary samples from patients were performed on the day of the kidney biopsy. The included patients were then re-examined every three months for the first year and every six months thereafter. The primary outcome was the initiation of renal replacement therapy or death. The secondary outcome was CKD stage 3 or over (eGFR < 60 mL/min/1.73 m²). CKD stage was evaluated based on eGFR (MDRD, in mL/min/1.73 m²) as CKD 1, ≥90; CKD 2, 60-89; CKD 3, 30-59; CKD 4, 15-29; and CKD 5, <15 at the time of inclusion and follow-up [24]. CKD progression was defined, according to the 2012 KDIGO Guideline [25], as a sustained decrease (measured at least twice, with >3 months in between) of eGFR over 25% from baseline accompanied by a drop in the CKD stage. Follow-up continued until 31 March 2022. Details of the study flow are shown in Figure S1.
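The staging thresholds and the progression definition above can be sketched in a few lines of Python. This is an illustrative reading, not the authors' code: the function names are hypothetical, and "sustained" is interpreted here as at least two qualifying measurements taken more than three months apart.

```python
def ckd_stage(egfr):
    """Map eGFR (mL/min/1.73 m^2) to CKD stage 1-5 using the thresholds in the text."""
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


def is_progression(baseline_egfr, followups):
    """Check the progression definition for one patient.

    followups is a list of (month, egfr) pairs. A measurement qualifies when
    eGFR has fallen by more than 25% from baseline AND the CKD stage is worse
    than at baseline. Progression requires two qualifying measurements more
    than 3 months apart (an assumed reading of "sustained").
    """
    baseline_stage = ckd_stage(baseline_egfr)
    qualifying_months = [
        month
        for month, egfr in followups
        if egfr < 0.75 * baseline_egfr and ckd_stage(egfr) > baseline_stage
    ]
    return (
        len(qualifying_months) >= 2
        and max(qualifying_months) - min(qualifying_months) > 3
    )


# Hypothetical patient: baseline eGFR 70 (stage 2), later 50 and 48 (stage 3).
print(ckd_stage(70), is_progression(70, [(3, 50), (9, 48)]))
```

Note that this sketch does not require every intermediate measurement to qualify; a stricter reading of "sustained" would add that check.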

Clinical, Pathological, and Ultrasound Index
Serum creatinine (Scr) and urinary creatinine (UCr) were measured by the sarcosine oxidase method. Urinary albumin was measured by the immunoturbidimetric method. The urinary albumin-creatinine ratio (ACR) was subsequently calculated. The eGFR was calculated using the MDRD equation [26].
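For orientation, the calculations described above can be sketched as follows. The text does not state which variant of the MDRD equation reference [26] uses, so the four-variable IDMS-traceable form (coefficient 175) is assumed here purely for illustration. The function names, the input units, and the unit conversions are likewise assumptions.

```python
def mdrd_egfr(scr_umol_l, age_years, female, black=False):
    """eGFR in mL/min/1.73 m^2 from the 4-variable IDMS-traceable MDRD equation.

    Assumed variant: 175 * Scr^-1.154 * age^-0.203 * 0.742 (if female)
    * 1.212 (if Black), with Scr in mg/dL.
    """
    scr_mg_dl = scr_umol_l / 88.4  # convert µmol/L to mg/dL
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    if black:
        egfr *= 1.212
    return egfr


def acr_mg_per_g(urinary_albumin_mg_l, urinary_creatinine_umol_l):
    """Urinary albumin-creatinine ratio in mg/g.

    Creatinine is converted from µmol/L to g/L with a molar mass of 113.12 g/mol.
    """
    creatinine_g_l = urinary_creatinine_umol_l * 113.12e-6
    return urinary_albumin_mg_l / creatinine_g_l


# Hypothetical inputs: Scr 102 µmol/L, a 50-year-old woman.
print(round(mdrd_egfr(102, 50, female=True), 1))
print(round(acr_mg_per_g(300, 10000), 1))
```

Other MDRD variants use different coefficients, so absolute values from this sketch should not be compared directly with those reported in the study.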
B-mode and SWE ultrasound examinations were conducted by one ultrasound radiologist who had received specialized training in kidney ultrasound and SWE for more than five years. The radiologist was blinded to patients' clinical information. B-mode was performed before SWE as a reference for kidney imaging [27]. SWE was then carried out with patients in a right lateral decubitus position using a Supersonic Imagine Aixplorer (convex transducer SC6-1, frequency 1-6 MHz). All patients were required to perform deep inspiration and breath-holding during the examination to ensure stable image generation and acquisition. The penetration depth of the radiofrequency data was 12 cm without the effect of BMI (Figure S2). In total, four images were acquired at the inferior pole of the left kidney of each patient. Regions of interest (ROI) were manually drawn at the kidney cortex, medulla, and sinus with fixed diameters of 6, 6, and 4 mm, respectively. Mean and median SWE values of the kidney cortex, medulla, and sinus were calculated.
A kidney biopsy was performed under the guidance of B-mode ultrasound at the inferior pole of the left kidney. Biopsy specimens were formalin-fixed before routine hematoxylin and eosin, periodic acid-Schiff, periodic Schiff-methenamine silver, and immunofluorescence staining, or fixed with 4% PFA and 2.5% glutaraldehyde before observation under electron microscopy. Pathological changes were divided into 13 categories, scored, and diagnosed by two renal pathologists with over 20 years of experience. Grades of chronic changes were then determined according to the total renal chronicity score as minimal, 0-1; mild, 2-4; moderate, 5-7; and severe, ≥8, with reference to [28,29].

Radiomics Signature Extraction from Ultrasound
Considering the heterogeneity of kidney histological changes during CKD and the aim of adding value to SWE, ROIs for radiomics analysis were manually drawn using 3D Slicer based on the ROIs of the SWE ultrasound images (Figure 1). Radiomics signatures of each ROI were then extracted by PyRadiomics (v3.0.1) [16,17]. Eight hundred and thirty-seven radiomics signatures were acquired from each ROI, including first-order statistics (18 signatures), gray level co-occurrence matrix (glcm, 24 signatures), gray level dependence matrix (gldm, 14 signatures), gray level run length matrix (glrlm, 16 signatures), gray level size zone matrix (glszm, 16 signatures), and neighboring gray-tone difference matrix (ngtdm, 5 signatures) calculated in original or wavelet-transformed (HHH, HHL, HLH, HLL, LHH, LHL, LLH, LLL) images. Median values of the radiomics signatures at the kidney cortex, medulla, and sinus from the 4 SWE images were used for final analysis.
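The signature counts quoted above can be checked with simple arithmetic, and the per-patient aggregation can be illustrated with a median over four images. The sketch below uses only the class sizes and image types listed in the text. The four signature values are hypothetical, and no PyRadiomics call is made.

```python
from statistics import median

# Signatures per feature class, as listed in the text.
SIGNATURES_PER_CLASS = {
    "firstorder": 18,
    "glcm": 24,
    "gldm": 14,
    "glrlm": 16,
    "glszm": 16,
    "ngtdm": 5,
}

# One original image plus eight wavelet decompositions.
WAVELET_BANDS = ("HHH", "HHL", "HLH", "HLL", "LHH", "LHL", "LLH", "LLL")
IMAGE_TYPES = ("original",) + tuple("wavelet-" + band for band in WAVELET_BANDS)

ROIS = ("cortex", "medulla", "sinus")

per_image_type = sum(SIGNATURES_PER_CLASS.values())  # 93
per_roi = per_image_type * len(IMAGE_TYPES)          # 837
total = per_roi * len(ROIS)                          # 2511

# Hypothetical values of one signature in the four SWE images of one patient;
# the median across images is the value carried into the analysis.
values_from_four_images = [0.31, 0.35, 0.40, 0.52]
patient_value = median(values_from_four_images)

print(per_image_type, per_roi, total, patient_value)
```

The totals agree with the 837 signatures per ROI stated here and with the 2511 extracted signatures reported in the abstract for the three ROIs.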

Figure 1. (a,b) Two-dimensional shear wave elastography ultrasound (2D-SWE) images and radiomics analysis of regions of interest (ROIs) in a 63-year-old female patient at CKD stage 3, who recovered to CKD stage 2 at the last follow-up. The SWE value from the single examination of the left kidney cortex was 9.5 kPa, shown in the right box; (c,d) 2D-SWE images and radiomics analysis of ROIs in a 28-year-old female patient at CKD stage 3, who progressed to CKD stage 5 at the last follow-up; (+), (Í), (å) represent left kidney cortex, sinus, and medulla, respectively; green, yellow, and red masks in (b,d) represent the ROI of left kidney cortex, sinus, and medulla, respectively, drawn for radiomics analysis according to the ROIs of 2D-SWE.

Cox Regression, Machine Learning, and Deep Learning Modeling
Lasso regression and Cox regression were conducted using glmnet (v4.1-3) and survival (v3.2-10) in R (v4.0.5) and SPSS (v26). Features with a likelihood ratio test p < 0.1 in univariate Cox regression were entered into multivariate Cox regression using the stepwise-backward method. The proportional hazards assumption was verified by the chi-square test before Cox regression modeling. The likelihood ratio test was also used in the comparison of Cox regression models. A nomogram was built based on the results of multivariate Cox regression. The machine learning models SVM, RSF, SurvivalTree, and CoxBoost and the deep learning model DeepSurv were built using the same features as the Cox regression model with the scikit-survival package (v0.17.1) in Python (v3.7.0). The dataset was split randomly at an 8:2 ratio, with 80% for training and 20% for testing, to prevent overfitting of the machine learning or deep learning models. Hyperparameters were optimized through grid search and 10-fold cross-validation. Models were evaluated by the concordance index (C-index). A C-index >0.9 indicates a model with high accuracy; 0.7-0.9 indicates medium accuracy; 0.5-0.7 indicates poor accuracy. Ninety-five percent confidence intervals were calculated by bootstrapping 1000 times.
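To make the evaluation metric concrete, the following is a minimal pure-Python sketch of Harrell's concordance index, the accuracy bands quoted above, and a percentile bootstrap confidence interval. It is not the authors' pipeline, which used R and scikit-survival. The function names, the handling of tied times (ignored), and the toy data are illustrative assumptions.

```python
import random


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly by risk.

    A pair is comparable when the subject with the shorter time had the event.
    Tied risk scores count as half concordant; tied times are ignored.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def accuracy_band(c_index):
    """Accuracy bands as stated in the text; values below 0.5 are not covered."""
    if c_index > 0.9:
        return "high"
    if c_index >= 0.7:
        return "medium"
    if c_index >= 0.5:
        return "poor"
    return "unclassified"


def bootstrap_ci(times, events, risks, n_boot=1000, seed=0):
    """Percentile 95% CI of the C-index from n_boot resamples with replacement."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        c = concordance_index(
            [times[k] for k in idx],
            [events[k] for k in idx],
            [risks[k] for k in idx],
        )
        if c == c:  # drop resamples without comparable pairs (NaN)
            stats.append(c)
    if not stats:
        return float("nan"), float("nan")
    stats.sort()
    lower = stats[int(0.025 * len(stats))]
    upper = stats[min(len(stats) - 1, int(0.975 * len(stats)))]
    return lower, upper


# Toy data (hypothetical): follow-up months, event flags, predicted risk scores.
toy_times = [5, 10, 15, 20, 25, 30]
toy_events = [1, 1, 0, 1, 0, 0]
toy_risks = [0.9, 0.7, 0.8, 0.4, 0.3, 0.1]

toy_c = concordance_index(toy_times, toy_events, toy_risks)
print(round(toy_c, 4), accuracy_band(toy_c), bootstrap_ci(toy_times, toy_events, toy_risks))
```

On the toy data, 10 of the 11 comparable pairs are ranked correctly, giving a C-index of about 0.909.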

Statistical Analysis
Statistical analysis was performed with SPSS (v26) and R (v4.1.0). Normality was tested by the Kolmogorov-Smirnov method before applying Student's t-test to normally distributed data and the Mann-Whitney U test to non-normally distributed data. The chi-square test was used for the comparison of categorical variables. A paired t-test was applied for within-group comparisons. The log-rank test was applied for comparison between Kaplan-Meier curves. p < 0.05 was considered statistically significant.

Results
During the follow-up, three patients died. One of these deaths was due to multi-organ failure (at CKD stage 3 at the time of inclusion), while the causes of the other two were unknown (at CKD stage 2 at the time of inclusion). Six patients initiated renal replacement therapy, of whom two were at CKD stage 5 at the time of inclusion, two were at CKD 4, and two were at CKD 3. Of the remaining patients who did not die or initiate renal replacement therapy, 18 out of 187 (9.63%) had a sustained decrease (measured at least twice, >3 months in between) of eGFR over 25% from baseline, accompanied by a drop in CKD stage. Furthermore, 48 out of 187 (25.67%) had a sustained increase (measured at least twice, >3 months in between) in eGFR over 25% from baseline, which could be considered as a regression of CKD [30]. The total incidence of CKD progression was 25 out of 187 (13.37%). At the time of the last follow-up, 75 out of 187 (40.11%) patients were at CKD stage 1, 55 (29.41%) were at CKD 2, 29 (15.51%) were at CKD 3, 11 (5.88%) were at CKD 4, and 17 (9.09%) were at CKD 5. The incidence of a prognosis of CKD stage 3 or over was 57 out of 187 (30.48%). The baseline characteristics are shown in Table 1. The patients' Scr and eGFR at baseline and at the last follow-up are depicted in Figure S3.
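The stage distribution and percentages reported above can be reproduced with a short arithmetic check. The sketch below uses only the counts given in the text; the variable and function names are illustrative.

```python
TOTAL_PATIENTS = 187

# Patients per CKD stage at the last follow-up, as reported in the text.
stage_counts = {1: 75, 2: 55, 3: 29, 4: 11, 5: 17}


def percent(count, total=TOTAL_PATIENTS):
    """Percentage of the cohort, rounded to two decimals as in the text."""
    return round(100.0 * count / total, 2)


stage_percentages = {stage: percent(n) for stage, n in stage_counts.items()}

# Secondary outcome: CKD stage 3 or over at the last follow-up.
stage3_or_over = sum(n for stage, n in stage_counts.items() if stage >= 3)

print(sum(stage_counts.values()), stage_percentages)
print(stage3_or_over, percent(stage3_or_over))
```

The five stage counts sum to 187, and stages 3 to 5 together give the 57 patients (30.48%) quoted for the secondary outcome.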

Feature Selections for CKD Prognosis by Lasso Regression
In total, 973 features, including clinical index, pathological changes, ultrasound parameters, and radiomics signature, were entered into the lasso regression analysis. The 26 significant features selected were Scr, eGFR, 24-h urinary protein at baseline, glomerular global sclerosis rate, glomerular focal segmental sclerosis rate, interstitial inflammation, and 20 radiomics signatures. Details of the features and their coefficients are shown in Figure 2.

Cox Regression for CKD Prognosis
Univariate and multivariate Cox regression analyses were performed next, using both statistically and clinically significant parameters from the results of the differential comparison (Table 1) and lasso regression (Figure 2). The parameters of sinus wavelet HLH first-order maximum, sinus wavelet HLH first-order range, and medulla wavelet HLH first-order 10th percentile from ultrasound radiomics were excluded because their values were all less than 0.0001 and therefore meaningless for clinical use. As shown in Table 2, eGFR, Scr, ACR at baseline, median SWE value of left renal cortex (L_C_median), length of left kidney, and nine radiomics signatures were statistically significant for the secondary outcome. The proportional hazards assumption for the Cox regression was verified (Table S1).
Continuous features with p < 0.1 in the univariate Cox regression analysis were further assigned cutoff values for the secondary outcome based on the Kaplan-Meier method (Figure S4). The cutoff values for eGFR, Scr, ACR, length of left kidney, L_C_median, mean SWE value of left renal sinus (L_S_mean), and the nine radiomics signatures were 51.23 mL/min/1.73 m², 102 µmol/L, 1000 mg/g, 98 mm, 13 kPa, 15.5 kPa, 0.37, 0.16, 9.43, 2.68, 0.08, 0.46, 258.03, 0.25, and 3.66, respectively. Patients were then divided into a high group or a low group for each feature and plotted as Kaplan-Meier curves, together with the categorical features of tubular atrophy and artery/arteriole hyalinosis (Figure 3). Except for L_S_mean, all the subgroups of features were shown to be statistically different for the secondary outcome.
To better explain the clinical meaning of the Cox regression model, we built four further Cox regression models using the statistically significant features in a multivariate Cox regression analysis. Model-All used all the features. Model-Clin + Patho used the clinical features of eGFR, Scr, ACR at baseline, pathological features of tubular atrophy, artery/arteriole hyalinosis, and length of left kidney, which are commonly used in clinical practice. Model-Clin + SWE used the clinical features, length of left kidney, and the SWE parameters of L_C_median and L_S_mean. Model-Clin + SWE + Radiomics added the nine radiomics signatures. There was no problem with multicollinearity in our multivariable model according to the multicollinearity diagnosis (single variance inflation factor < 10 and average variance inflation factor < 6 [31,32]) in Table S2. The likelihood ratio chi-square test, C-index, and time-dependent ROC all illustrated that Model-All performed the best, with an average area under the curve (AUC) of the time-dependent ROC over 0.9 (Tables 3, S3 and S4; Figure 4). Model-Clin + SWE + Radiomics improved on the predictive ability of Model-Clin + SWE and Model-Clin + Patho.
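The dichotomization by cutoff and the Kaplan-Meier comparison described in this section can be sketched as follows. This is a minimal illustration rather than the authors' analysis: the 13 kPa cutoff for L_C_median comes from the text, whereas the toy patient data, the function names, and the choice to place values equal to the cutoff in the high group are assumptions.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((current_time, survival))
        at_risk -= n_leaving  # events and censored subjects leave the risk set
    return curve


def split_by_cutoff(values, cutoff):
    """Indices of the high (>= cutoff, assumed) and low (< cutoff) groups."""
    high = [k for k, v in enumerate(values) if v >= cutoff]
    low = [k for k, v in enumerate(values) if v < cutoff]
    return high, low


# Hypothetical patients: cortical SWE (kPa), months of follow-up, and whether
# the secondary outcome (CKD stage 3 or over) was reached.
l_c_median = [9.5, 11.0, 12.5, 14.0, 15.5, 17.0]
months = [6, 12, 18, 24, 30, 30]
reached = [1, 1, 0, 1, 0, 0]

high, low = split_by_cutoff(l_c_median, 13.0)
for name, group in (("high", high), ("low", low)):
    curve = kaplan_meier([months[k] for k in group], [reached[k] for k in group])
    print(name, curve)
```

A log-rank test would then be applied to the two groups, as stated in the statistical analysis; it is omitted here to keep the sketch short.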

Nomogram for CKD Prognosis
As shown in Figure 5, a prognostic nomogram was built based on the Cox regression model using all the statistically significant features from multivariate Cox regression analysis. The calibration curves for 1-year, 2-year, and 2.5-year survival with respect to CKD stage 3 or over demonstrated the good performance of the prognostic nomogram (Figure 5c-e). The decision curve for the net benefit demonstrated that the nomogram was more reliable at predicting survival after two years (Figure 5f,g).

Predicting Models for CKD Prognosis Using Machine Learning and Deep Learning
As shown in Figure 6, the RSF and CoxBoost prediction models performed best in the random-split test dataset (baseline characteristics in Table S5) among all the machine learning and deep learning models, with C-indices of 0.8095 (0.7938-0.8303) and 0.8139 (0.8037-0.8307), respectively. However, compared to the Cox regression model using the same features (Model-All), the machine learning and deep learning models showed a drop in predictive performance at 30 months.
Figure 5. (f,g) The decision curve for the nomogram. The pink line annotated as "None" represents the assumption that no patients have progressed to CKD stage 3 or over. The dotted line annotated as "All-xx" represents the assumption that all patients have progressed to CKD stage 3 or over. The further the model line is from the "None" or "All" lines, the greater the net benefit of the model. "Model-1" and "All-12" in (f) are at the time of 12 months, "Model-2" and "All-24" are at the time of 24 months, and "Model-3" and "All-30" are at the time of 30 months. The values "24.05", "25.37", and "27.15" in (g) are the quartiles of total follow-up time. "Model" in (f,g) represents the Cox regression model used to build the nomogram.

Discussion
In this cohort study, we found that L_C_median is an independent risk factor for CKD progression to CKD stage 3 or over by multivariate Cox regression, with a hazard ratio of 0.890 (0.796-0.994) (p < 0.05, Table 2). This finding might support the early prediction of CKD progression in clinical settings, especially in healthcare centers that are unable to perform kidney biopsies. Patients with a high risk of disease progression according to the nomogram can be treated more aggressively.
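As a worked illustration of how such a hazard ratio can be read, the sketch below rescales it to a larger change in the predictor and recovers an approximate Wald p-value from the reported confidence interval. It assumes that the hazard ratio is expressed per one-unit increase in L_C_median and that the interval is a Wald interval symmetric on the log scale. Neither assumption is stated in the text, and the function name is hypothetical.

```python
import math

# Hazard ratio and 95% confidence interval reported in the text.
hr, ci_low, ci_high = 0.890, 0.796, 0.994


def hr_for_change(hr_per_unit, delta):
    """Hazard ratio implied for a change of `delta` units in the predictor."""
    return hr_per_unit ** delta


# Cox coefficient and its standard error, recovered from the reported values.
beta = math.log(hr)
standard_error = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# Two-sided Wald p-value from the standard normal distribution.
z = beta / standard_error
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(round(hr_for_change(hr, 5), 3), round(p_two_sided, 3))
```

Read this way, a hazard ratio below 1 means that the estimated hazard falls as L_C_median rises, and the recovered p-value is consistent with the reported p < 0.05.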
Although few studies have reported the predictive value of SWE for the prognosis of CKD in adult native kidneys, our finding is consistent with Liu et al.'s finding of higher SWE values for the left and right renal cortex in children's CKD progression using the same machine [33]. Kennedy et al. also found that the renal cortex stiffness of allografts, reflected by point-shear elastography ultrasound, was increased at baseline in those who later developed graft loss [34]. This finding may be one of the reasons why high intra-subject variability was found by Radulescu et al. among CKD patients [35]. As shown in Figure S3d, some patients had worsening eGFR while others at the same CKD stage had stable or recovering eGFR. This was consistent with the insensitivity of eGFR in subclinical injury or initially adaptive repair [3].
Additionally, using SWE, we found that the patients who progressed to or stayed at CKD stage 3-5 did not differ significantly in any SWE parameter at the time of inclusion from the patients who stayed in or regressed to CKD stages 1-2 (Table 1). The reasons for this might be confounding factors. Studies have also shown that tissue viscosity might increase shear wave velocity [36,37]. Furthermore, tissue viscosity has been reported as a marker for hepatic necroinflammation [38]. This may also account for the non-linear relationship between SWE value and eGFR or fibrosis [12]. Furthermore, the SWE value in the medulla was more unstable than the value in the cortex because the organized microstructures of the loop of Henle and the vasa recta in the kidney medulla made the shear wave velocity vary according to the direction of measurement [39]. In line with the literature, we found that the SWE (Young's modulus) value in the cortex is more predictive than that in the medulla or sinus.
In addition, we also found nine radiomics signatures that were independent risk factors for CKD progression (p < 0.05). Furthermore, the Cox regression model containing them performed better than Model-Clin + Patho and Model-Clin + SWE (Figure 4; Tables 3, S3 and S4). These results represent the added value of radiomics in SWE. All of the significant radiomics signatures were from wavelet-transformed images, which offer proven reproducibility [16]. We manually delineated the ROIs for radiomics analysis based on the ROIs of the SWE ultrasound images instead of segmenting the whole renal cortex, sinus, or medulla. Because focal or segmental fibrosis might occur during CKD [40], radiomics signatures obtained from nearly the same region as the SWE measurement and the biopsy are more appropriate for comparison with the SWE features. To minimize the effect of the ROI position on the reproducibility of radiomics signatures, we used the median value from four images per patient, similar to the SWE parameters. Machine learning offers advantages in the handling of large-scale and complex clinical data. However, the risk of bias is receiving attention due to the selection of modeling features and overfitting [15]. Therefore, we built the machine learning models based on the same features as the traditional Cox regression model and minimized overfitting by random splitting, 10-fold cross-validation, and bootstrapping. We found that the Cox regression model was no worse than the machine learning models, as was reported previously [41]. The reason for this might be the overfitting of machine learning models in hidden layers or nodes.
The limitations of this study are its short follow-up time and single-center design, which may have caused selection bias. External validation and a 5-year or 10-year follow-up will be needed in our future studies.

Conclusions
In conclusion, the median SWE value of the left kidney cortex can independently predict a 2.5-year CKD prognosis of CKD stage 3 or over. Radiomics can improve the predictive performance of SWE for CKD progression. Traditional Cox regression modeling is no worse than machine learning.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/diagnostics12112678/s1, Figure S1: The flowchart of the study cohort; Table S1: Test of proportional hazards assumption for the multivariate Cox regression model; Figure S2: Value of shear wave elastography ultrasound in the left kidney cortex, sinus, medulla grouped by BMI; Table S2: The multicollinearity diagnosis for the multivariate Cox regression; Figure S3: Patients' serum creatinine and eGFR at baseline and last follow-up; Table S3: C-index of Cox regression models; Figure S4: Cutoff value based on Kaplan-Meier method; Table S4: Comparison of time-dependent ROCs of Cox regression models; Table S5: Baseline characters of the train and test cohort.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: All data mentioned in this manuscript are available upon reasonable request by email to the corresponding author, Shan Mou, shan_mou@126.com.