1. Introduction
Systemic lupus erythematosus (SLE) is an autoimmune disease that impairs multiple organ functions [
1]. Kidney injury is a common and severe organ-specific manifestation of SLE, occurring in up to 50% of adults and 70% of children with the disease [
2,
3]. Kidney injury worsens the prognosis of patients with SLE, and early diagnosis and active evaluation of SLE-related kidney injury play a pivotal role in improving treatment outcomes [
4]. Conventional laboratory markers, such as urinary protein-to-creatinine ratio, complement 3, complement 4, and anti-dsDNA antibodies, are not sensitive or specific enough for predicting kidney damage in patients with SLE. As a routine marker for clinical evaluation of renal injury, proteinuria is associated with postoperative acute kidney injury (AKI) and is a valuable risk-stratification tool in the post-AKI period [
5,
6]. However, it plays a limited role in predicting ongoing or relapsing SLE-related kidney injury because of its limited sensitivity and specificity and the constraints of conventional diagnostic methods [
7,
8,
9]. Renal biopsy is the gold standard for diagnosing SLE-related kidney injury, and it is a useful tool for evaluating clinical efficacy [
10,
11]. Yet, renal biopsy is an invasive procedure with a high risk of complications, which limits its widespread application. Therefore, it is necessary to find non-invasive and effective biomarkers to improve the prediction of SLE-related kidney injury, especially in SLE-related kidney injury with negative proteinuria [
7,
10].
Association rule mining is a method for finding frequent co-occurrences in large databases. More recently, this method has been applied to identify clinical risk factors and analyze hidden associations with prognostic survival, which can aid in disease diagnosis and clinical management [
11]. Association rule mining is commonly performed with the Apriori algorithm, a popular data mining algorithm for identifying interesting correlations. In our previous report, the Apriori algorithm was used to identify alpha-hydroxybutyrate dehydrogenase as a biomarker for predicting systemic lupus erythematosus with liver injury [
12].
Triglycerides represent a major source of available lipid substrates and are the main components of lipoproteins [
13,
14]. Triglyceride hydrolysates include glycerol and fatty acids, which can be provided for signaling, β-oxidation, and the assembly of very-low-density lipoprotein (VLDL) [
15]. Triglycerides are esters derived from glycerol and three fatty acid molecules [
16]. They are hydrophobic and cannot circulate freely in blood. Under physiological conditions, triglycerides are stored in hepatocytes or exported into the blood in the form of VLDL particles. Triglyceride measurement is a direct, accurate, and precise measurement of all triglycerides in plasma, and it has been applied in the diagnosis of cardiovascular diseases and hypertriglyceridemia [
17]. Elevated serum triglycerides may serve as a biomarker and as causal factors for atherosclerosis and atherosclerotic cardiovascular disease [
18]. Triglyceride levels are known to be increased in SLE as a result of altered lipoprotein metabolism [
19]. However, the value of triglycerides for predicting SLE-related kidney injury, especially kidney injury with negative proteinuria, remains unclear.
Here, we used an association rule mining approach based on the Apriori algorithm to identify an optimal logistic model of biomarkers for predicting SLE-related kidney injury with negative proteinuria. In addition, a diagnostic algorithm was proposed to predict SLE-related kidney injury.
2. Materials and Methods
2.1. Study Population
We collected the records of 158 consecutive hospitalized patients diagnosed with SLE at the First Hospital of Lanzhou University between July 2011 and January 2018. All fulfilled the 1997 diagnostic criteria of the American College of Rheumatology. The patients were aged 18 to 60 years; 103 of them had never received immunosuppressive therapy, and the other 55 had been off immunosuppressive therapy for more than 12 weeks. Patients who had undergone blood transfusion or had been diagnosed with malignancy, other autoimmune diseases, lymphoproliferative disorders, infections, or hematopoietic diseases were excluded. Additionally, 158 age- and sex-matched healthy controls were enrolled. All patients with SLE were divided into two subgroups: an SLE-related kidney injury group and an SLE-no kidney injury group. SLE-related kidney injury was defined as SLE meeting at least one of the following criteria: biopsy-proven lupus nephritis (class III, IV, V, III + V, and IV + V) [
20], serum creatinine > 108 μmol/L, proteinuria > 0.5 g/d, urine red blood cells > 5/HP, or urine pathology cast (P-CAST). The study protocol was approved by the Research Ethics Committee of the First Hospital of Lanzhou University (No. LDYYLL201731).
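As an illustration only, the "at least one criterion" definition above can be written as a small predicate. The function and parameter names below are hypothetical and were not part of the study's workflow; the thresholds are those stated in the text.

```python
def has_sle_kidney_injury(biopsy_proven_ln=False,
                          serum_creatinine_umol_l=None,
                          proteinuria_g_per_day=None,
                          urine_rbc_per_hp=None,
                          urine_p_cast=False):
    """Return True if at least one criterion from Section 2.1 is met.

    All names are hypothetical; a value of None means "not measured"
    and is treated as not meeting that criterion.
    """
    return any([
        biopsy_proven_ln,  # class III, IV, V, III + V, or IV + V
        serum_creatinine_umol_l is not None and serum_creatinine_umol_l > 108,
        proteinuria_g_per_day is not None and proteinuria_g_per_day > 0.5,
        urine_rbc_per_hp is not None and urine_rbc_per_hp > 5,
        urine_p_cast,
    ])


# Hypothetical patient with negative proteinuria but raised creatinine
flagged = has_sle_kidney_injury(serum_creatinine_umol_l=120,
                                proteinuria_g_per_day=0.1)
print(flagged)
```

Because the criteria are combined with a logical OR, a patient with negative proteinuria can still fall into the kidney injury group through any of the other criteria.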
2.2. Laboratory Values and Clinical Assessment
Demographic data, clinical characteristics, and laboratory test results of enrolled subjects were extracted from the medical records. These included sex, age, body mass index, systemic lupus erythematosus disease activity index 2000 (SLEDAI-2K) score, SLE damage index, routine blood tests, routine urine tests, lactate dehydrogenase (LDH), blood urea nitrogen, serum creatinine, uric acid, the ratio of aspartate aminotransferase to alanine aminotransferase (AST/ALT), total cholesterol, triglycerides, high-density lipoprotein (HDL), low-density lipoprotein (LDL), albumin, total protein, α-hydroxybutyrate dehydrogenase (α-HBDH), proteinuria, complement 3, complement 4, immunoglobulin G (IgG), immunoglobulin A, and immunoglobulin M. The estimated glomerular filtration rate (eGFR) was calculated from serum creatinine, age, and weight. A SLEDAI-2K score greater than 4 was defined as active SLE disease.
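The text does not name the formula used for eGFR. One widely used equation based on serum creatinine, age, and weight is the Cockcroft-Gault equation, which strictly estimates creatinine clearance and also requires sex. The sketch below assumes that equation purely for illustration; the function name and inputs are hypothetical, and it should not be read as the study's actual calculation.

```python
def cockcroft_gault_crcl(age_years, weight_kg, serum_creatinine_umol_l, female):
    """Estimated creatinine clearance (mL/min) by the Cockcroft-Gault equation.

    ASSUMPTION: the source does not state which formula was used.
    """
    # Convert creatinine from umol/L to mg/dL (1 mg/dL = 88.4 umol/L)
    scr_mg_dl = serum_creatinine_umol_l / 88.4
    crcl = (140 - age_years) * weight_kg / (72 * scr_mg_dl)
    # Conventional 0.85 correction factor for female patients
    return crcl * 0.85 if female else crcl


# Hypothetical inputs, not study data
crcl_male = cockcroft_gault_crcl(40, 72, 88.4, female=False)
crcl_female = cockcroft_gault_crcl(40, 72, 88.4, female=True)
print(round(crcl_male, 1), round(crcl_female, 1))
```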
2.3. Association Rule Mining (ARM) and Apriori Algorithm
ARM is a data mining technique that finds associations between items in various kinds of databases. A rule is defined as an implication of the form A ⇒ B, where the item sets A and B are called the “antecedent” and “consequent”, respectively. Association rules are evaluated by their support and confidence. The support of an association rule is defined as support (%) = [number of diseases A ∩ B]/[total number of diseases], and its confidence is defined as confidence (%) = [number of diseases A ∩ B]/[number of diseases A], where A ∩ B is the item set obtained by combining A with B. The support of an item set measures how common it is, and the confidence of an association rule measures the strength of the association. The lift of a rule is defined as lift = [(number of diseases A ∩ B) × (total number of diseases)]/[(number of diseases A) × (number of diseases B)]; a lift greater than 1 indicates that A and B occur together more often than would be expected if they were independent [
21,
22,
23].
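The three measures defined above can be computed directly from counts. The following is a minimal sketch on hypothetical toy records (not study data); the function name and item labels are illustrative assumptions.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent => consequent.

    records is a list of sets of items; antecedent and consequent are sets.
    """
    n_total = len(records)
    n_a = sum(1 for r in records if antecedent <= r)
    n_b = sum(1 for r in records if consequent <= r)
    n_ab = sum(1 for r in records if (antecedent | consequent) <= r)
    if n_a == 0 or n_b == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = n_ab / n_total
    confidence = n_ab / n_a
    lift = (n_ab * n_total) / (n_a * n_b)
    return support, confidence, lift


# Hypothetical toy records, one set of items per patient
records = [
    {"kidney_injury", "high_TG"},
    {"kidney_injury", "high_TG"},
    {"kidney_injury"},
    {"high_TG"},
    {"no_finding"},
]
support, confidence, lift = rule_metrics(records, {"kidney_injury"}, {"high_TG"})
print(support, confidence, lift)
```

In this toy example the rule holds in 2 of 5 records (support 0.4) and in 2 of the 3 records containing the antecedent (confidence about 0.67), giving a lift slightly above 1.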
The process of ARM can be divided into two steps. In the first step, all the frequent itemsets that have more than minimum support in the transaction database are found. In the second step, strong association rules that meet the minimum confidence level from frequent itemsets are generated. The Apriori algorithm is a classical ARM technique, and it computes the frequent itemsets in the database through several iterations. Then, the strong association rules that meet the criteria are found from the frequent itemsets.
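The two steps described above can be sketched in a few lines. This is a simplified illustration of the Apriori idea on hypothetical toy records, not the SPSS Modeler implementation used in the study; all function names and item labels are assumptions.

```python
from itertools import combinations


def apriori(records, min_support):
    """Step 1: return {itemset: support} for all frequent itemsets."""
    n = len(records)
    items = {item for r in records for item in r}
    current = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that reach the minimum support
        level = {}
        for c in current:
            s = sum(1 for r in records if c <= r) / n
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
        # Join: merge frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def strong_rules(frequent, min_confidence):
    """Step 2: return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(itemset, r):
                ante = frozenset(ante)
                conf = s / frequent[ante]  # subsets of frequent sets are frequent
                if conf >= min_confidence:
                    rules.append((ante, itemset - ante, s, conf))
    return rules


# Hypothetical toy records, not study data
records = [
    {"kidney_injury", "high_TG", "low_albumin"},
    {"kidney_injury", "high_TG"},
    {"kidney_injury", "low_albumin"},
    {"high_TG", "low_albumin"},
    {"kidney_injury", "high_TG", "low_albumin"},
]
frequent = apriori(records, min_support=0.6)
rules = strong_rules(frequent, min_confidence=0.7)
print(len(frequent), len(rules))
```

The pruning step is what distinguishes Apriori from brute-force enumeration: an itemset is only counted if all of its smaller subsets were already found to be frequent.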
In our study, the item set for association rule mining comprised 70 elements: 54 laboratory indicators, 15 patient demographics, and 1 disease status variable. Associations between these elements and SLE-related kidney injury were identified using the Apriori algorithm module in SPSS Modeler 18.0. The disease status variable was treated as the antecedent, and the laboratory indicators as the consequents.
2.4. Statistical Analysis
Quantitative data are presented as the mean ± standard deviation for normally distributed variables, and the chi-square test was used to analyze categorical variables. SPSS Modeler 18.0 was used for data mining. Student’s t-test was used for normally distributed variables, and the Mann-Whitney U-test was employed for the others. Logistic regression analysis was conducted to identify the independent risk factors for SLE-related kidney injury. In addition, Spearman correlation analysis was used to evaluate the correlation between biomarkers and the disease activity of SLE-related kidney injury. Furthermore, a receiver operating characteristic (ROC) curve was constructed to determine the value of biomarkers for predicting SLE-related kidney injury. Statistical analyses were performed with SPSS 22.0 statistical software (SPSS Inc., Chicago, IL, USA). p < 0.05 was considered statistically significant.
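To illustrate what the area under the ROC curve measures, the sketch below computes it as the probability that a randomly chosen case has a higher marker value than a randomly chosen non-case, with ties counted as one half. This is equivalent to the normalized Mann-Whitney U statistic. The function name and values are hypothetical, and the study itself used SPSS rather than this code.

```python
def roc_auc(values_cases, values_non_cases):
    """Area under the ROC curve via pairwise comparison of marker values."""
    wins = 0.0
    for case_value in values_cases:
        for other_value in values_non_cases:
            if case_value > other_value:
                wins += 1.0
            elif case_value == other_value:
                wins += 0.5  # ties contribute one half
    return wins / (len(values_cases) * len(values_non_cases))


# Hypothetical triglyceride values (mmol/L), not study data
with_injury = [2.1, 1.8, 2.5]
without_injury = [1.0, 1.9, 1.2]
auc = roc_auc(with_injury, without_injury)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to a marker with no discriminating ability, and 1.0 to perfect separation of the two groups.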
4. Discussion
Kidney injury is a common complication of SLE, yet non-invasive, easily accessible, and accurate predictive markers for SLE-related renal injury are lacking [
25]. The growing amount of electronically available data has augmented data sets [
26]. ARM is an important data mining technique. The Apriori algorithm, a classical ARM method for transaction data, is largely deterministic and can identify relationships between diseases and biomarkers in large amounts of data. In this study, we used ARM to identify non-invasive and easily accessible biomarkers, drawn from laboratory indicators, for predicting SLE-related kidney injury with negative proteinuria.
Triglycerides, HDL, LDL, LDH, AST/ALT, α-HBDH, total cholesterol, hemoglobin, PDW, hematocrit, RDW, and LYM were extracted from the laboratory indicators, and all of them were significantly different between patients with SLE and healthy controls. These results are consistent with previous studies [
27,
28,
29,
30]. Natural triglycerides consist of glycerol and three fatty acids. Both increased and decreased triglyceride levels are associated with human diseases [
15]. The elevated triglycerides may be a result of systemic inflammation in SLE [
31]. This elevation may be caused by reduced clearance and increased synthesis of lipoproteins in SLE patients. HDL dysfunction in nephrotic syndrome can impair lipoprotein lipase (LPL)-mediated lipolysis of triglyceride-rich lipoproteins, a process that plays a key role in the dysregulation of triglyceride-rich lipoproteins [
32]. These observations suggest that HDL abnormalities are associated with impaired triglyceride clearance. Patients with renal disease often have reduced clearance of triglyceride-rich lipoproteins due to hepatic lipase deficiency, which impairs the ability of the liver to metabolize triglycerides [
33]. In addition to inhibiting the clearance of circulating triglycerides, inhibition of LPL by high circulating angiopoietin-like 4 (ANGPTL4) can also increase the synthesis of triglycerides [
33,
34].
We demonstrated for the first time, by logistic regression analysis, that elevated triglycerides might be an independent risk factor for SLE-related kidney injury, and that SLE-related kidney injury occurred mainly in patients around 34 years of age. Furthermore, we found that more patients with SLE had kidney injury in the high-triglycerides group than in the low-triglycerides group. These results indicate that triglycerides are correlated with SLE-related kidney injury. The mechanisms underlying elevated triglycerides in SLE-related kidney injury are not completely clear. Higher sugar and fat intake could raise triglyceride concentrations, and hypertriglyceridemia may exacerbate renal damage following an inflammatory insult with increased accumulation of macrophages [
35].
Proteinuria, urine P-CAST, urea nitrogen, creatinine, IgG, albumin, total protein, and SLEDAI-2K are associated with kidney damage. Our results showed that SLEDAI-2K, urea nitrogen, serum creatinine, proteinuria, and urine P-CAST levels were significantly higher, while age, IgG, albumin, and total protein levels were markedly lower, in the high-triglycerides group. In addition, triglycerides were positively correlated with proteinuria and P-CAST and negatively correlated with serum albumin and IgG in patients with SLE-related kidney injury. These results further indicate that triglycerides are correlated with the disease activity of SLE-related kidney injury.
SLE-related kidney injury results in inflammatory cell infiltration, which leads to glomerular filtration barrier injury and tubular re-absorption damage in patients with SLE [
36]. The reduced glomerular filtration rate causes increased serum urea nitrogen and creatinine concentrations. In our results, albumin and total protein were significantly lower in the high-triglycerides group than in the low-triglycerides group, and albumin was negatively correlated with triglycerides. A possible explanation is that serum albumin and total protein are lost in the urine through the damaged kidneys of patients with SLE with high triglycerides.
The area under the ROC curve suggested that triglycerides could predict SLE-related kidney injury. Proteinuria is an important biomarker for diagnosing kidney injury [
37]. However, the amount of protein excreted is sometimes too small to be detected by conventional diagnostic methods. Identifying a molecular biomarker at the early stage of SLE-related kidney injury is therefore of great importance. Our results showed that 50% of patients with SLE-related kidney injury and negative proteinuria could be identified by high triglyceride levels. These results suggest that triglycerides may be a marker for predicting SLE-related kidney injury, especially in patients with negative proteinuria. Triglycerides combined with proteinuria could provide a better prediction of SLE-related kidney injury.
Our study illustrates that triglycerides level is significantly higher during the disease progression from SLE-no kidney injury to SLE-related kidney injury. A low eGFR often indicates severe kidney impairment [
38]. We further found that eGFR decreased as the level of triglycerides increased. These results suggest that triglycerides may reflect the occurrence and progression of SLE-related kidney injury. Further studies including more cases of SLE-related kidney injury are needed to confirm the association of serum triglycerides with renal function and proteinuria.