Machine Learning for Identifying Medication-Associated Acute Kidney Injury

Abstract: One of the prominent problems in clinical medicine is medication-induced acute kidney injury (AKI). Avoiding this problem can prevent patient harm and reduce healthcare expenditures. Several studies have been conducted to identify AKI-associated medications using statistical, data mining, and machine learning techniques. However, these studies are limited to assessing the impact of known nephrotoxic medications and do not comprehensively explore the relationship between medication combinations and AKI. In this paper, we present a population-based retrospective cohort study that employs automated data analysis techniques to identify medications and medication combinations that are associated with a higher risk of AKI. By integrating multivariable logistic regression, frequent itemset mining, and stratified analysis, this study is designed to explore the complex relationships between medications and AKI in a way that, to our knowledge, has not been attempted before. Through an analysis of prescription records of one million older patients stored in the healthcare administrative dataset at ICES (an independent, non-profit, world-leading research organization that uses population-based health and social data to produce knowledge on a broad range of healthcare issues), we identified 55 AKI-associated medications among 595 distinct medications and 78 AKI-associated medication combinations among 7748 frequent medication combinations. In addition, through a stratified analysis, we identified 37 cases where a particular medication was associated with an increased risk of AKI when used with another medication. Through consultation with a nephrologist and an electronic literature search, we have shown that our results are consistent with previous studies. This research demonstrates how automated analysis techniques can be used to accomplish data-driven tasks using massive clinical datasets.


Introduction
Acute kidney injury (AKI), defined as a sudden loss of kidney function over a short period of time, affects approximately 10% of patients admitted to hospitals worldwide [1,2]. It is associated with increased mortality, morbidity, and estimated incremental health care costs of more than $200 million in Canada annually [3]. Medication-induced nephrotoxicity is very common in clinical practice. It accounts for 19% of cases of AKI in a hospital setting [3][4][5][6][7][8] and is associated with increased healthcare expenditure [3,9]. For instance, using the medication utilization data in Canada for 2013, Morgan et al. (2016) have reported an estimated healthcare cost of $419 million due to inappropriate prescriptions [10].
Over the last two decades, the incidence rate of AKI has increased in Canada [11,12], the United States [13,14], and the United Kingdom [15]. The increasing occurrence of AKI is related to the changing spectrum of diseases. There is a growing body of evidence showing that patients with multiple comorbidities and extrarenal complications are at a higher risk of developing AKI [16,17]. For instance, Aikar et al. [18] have shown that a high comorbidity rate, measured by the Deyo-Charlson comorbidity index, is associated with AKI. In a study of 681 AKI patients admitted to the intensive care unit, the occurrence of comorbid conditions was high: 37% had coronary artery disease, 30% had chronic kidney disease, 29% had diabetes mellitus, and 21% had chronic liver disease [17]. As a patient's number of comorbid conditions grows, there is a rise in associated hospitalizations, physician visits, prescriptions, and expenses [19], ultimately leading to an increase in medication intake. Patients admitted to hospitals, particularly critically ill patients with multiple comorbidities, often take several medications, with up to 25% of these medications having nephrotoxic potential [9]. A 2005 study revealed that out of 7 million adverse medication event reports, 2.7% included an incidence of AKI, of which 16% involved known nephrotoxins, 18% involved possible nephrotoxins, and the rest involved new potential nephrotoxins [8].
The use of nephrotoxic medications is associated with 16%-25% of all AKI cases in the adult population [8,20]. Few studies have been conducted to identify medications that are commonly associated with AKI. Most of these studies have been limited to assessing the impact of known nephrotoxic medications [21][22][23]. In addition, information on medication combinations that can cause AKI is lacking in the literature. It is important to identify those combinations because a combination of multiple nephrotoxins may result in synergistic or cumulative nephrotoxicity, thus increasing the chance of renal failure [24]. For example, the risk of developing AKI increases by 53% for each additional nephrotoxic medication used by a patient [25]. Hence, it is important to identify not only nephrotoxic medications but also medication combinations that affect the risk of AKI. Rivosecchi et al., through an exhaustive literature search, further emphasize the need for a comprehensive understanding of how medication combinations alter the risk of AKI [22]. According to a CDC report in 2017, there are about 1000 known adverse medication effects and 5000 medications available in pharmacies (FastStats-Therapeutic Drug Use), making for approximately 125 billion possible adverse medication effects between all possible pairs of medications [26]. Thus, it is infeasible to assess medication-induced AKI through clinical trials at this scale. Moreover, a trial designed to determine whether or not to prescribe a potentially harmful combination would be unlikely to receive research ethics board approval.
Data analysis has the potential to address this challenge by employing methods and techniques from different fields, such as data mining, statistics, and machine learning, to accomplish various data-driven tasks [27]. It can be used to investigate clinical data to gain both novel and deep insights to help healthcare providers examine medication-induced nephrotoxicity. Recently, several studies have been conducted to identify drug-drug interactions, improve drug-safety science, and predict adverse drug reactions, using machine learning techniques [28][29][30][31][32][33]. For instance, Kandasamy et al. (2015) have developed a prediction model to identify drug-induced nephrotoxicity using human induced pluripotent stem cells and random forest [29]. In addition, Dey et al. (2018) have presented a deep learning framework to predict adverse drug reactions and detect molecular substructures associated with them [30]. An automatic method of processing adverse drug event reports using artificial intelligence and robotics was presented by Schmider et al. in 2019 [32]. Lysenko et al. (2018) have incorporated Mashup [34] and a gradient-boosted tree to predict drug toxicity using biological network data [31]. Although these studies are designed to deal with large bodies of data to solve different medication-related problems, the relationship between medications and AKI has not been studied before through automated data analysis. Automated data analysis techniques allow the incorporation of large quantities of data, which creates an opportunity to include additional information and to study individual medications and their combinations more comprehensively. It is essential to consider comorbidities while studying the effect of medications since it is not clear whether the underlying comorbidities or the medications increase the risk of developing AKI. In addition to comorbidity data, demographic data, such as age, sex, and region, are also considered as risk factors for AKI [35,36].
Therefore, any complete study that investigates nephrotoxic medications or combinations should include demographic and comorbidity data in the analysis. To date, there has been a lack of well-designed studies that consider demographic and comorbidity data while assessing the risk of developing AKI with the use of single or multiple medications. Even though the identification of nephrotoxic medications is crucial for improved patient care, it has not been studied thoroughly using machine learning techniques.
The objective of this study is to identify individual medications associated with AKI in hospitalized patients using a machine learning approach. We also identify AKI-associated medication combinations and investigate whether the use of multiple medications results in multiplicative effects on the risk of developing AKI. Finally, we investigate how our findings are consistent with data in the existing literature. Our study differs from other studies in three ways: (1) we consider all the frequently used medications in the study, whether they have been known to be nephrotoxic or not; (2) we use a frequent itemset mining algorithm to identify frequent medication combinations and multivariable logistic regression to investigate the association between medication combinations and AKI; and (3) we incorporate the patient's demographic and comorbidity features as potential covariates in the regression model.

Materials and Methods
This section describes the methodology we have employed to conduct the study. We describe the design process, study setting, workflow, data sources, cohort entry criteria, baseline covariates, medications, acute kidney injury, tools, and analysis processes.

Design Process and Participants
To help us understand how healthcare providers perform automated analysis, and to help us conceptualize and design our study, we adopted a participatory design method. It is a co-operative method that involves all stakeholders (e.g., designers, intermediary-users, and end-users) in the design process, to ensure the output of the analysis meets their needs [37]. A statistician, a clinician, an epidemiologist, and several computer scientists were involved in the design and evaluation process of this study. During the initial stage of the design process, we realized that healthcare providers usually perform medication-safety related studies in many ways. It is difficult to determine a single correct analytics technique for these tasks because different techniques have their strengths and weaknesses. As such, we interviewed healthcare experts to identify the data-driven tasks and analytics techniques with which they are familiar. Through our collaboration with healthcare experts at the ICES-KDT, located in London, Ontario, Canada (ICES is an independent, non-profit, world-leading research organization that uses population-based health and social data to produce knowledge on a broad range of healthcare issues; KDT is the Kidney Dialysis and Transplantation program), we identified four data-driven tasks to consider in designing this study: (1) studying the relationships between prescribed medications and AKI; (2) identifying medication combinations commonly prescribed to older patients; (3) examining the effect of a medication combination on AKI; and (4) investigating whether a certain medication is associated with an increased risk of developing AKI when used with another medication. We learned that healthcare experts usually rely on different regression techniques to accomplish such tasks. Thus, we decided to employ multivariable regression in this study. We also invited healthcare experts to provide us with formative feedback on design decisions and results.

Study Design and Setting
We performed a population-based retrospective cohort study in older adults from April 2014 to March 2016 in Ontario, Canada, using administrative health databases located at ICES. These datasets were linked using unique encoded identifiers and analyzed at ICES. The use of data in this project was authorized under Section 45 of Ontario's Personal Health Information Protection Act, which does not require review by a research ethics board.
Ontario has a population of approximately 13 million residents with universal access to hospital care and physician services, including 1.9 million people aged 65 years or older who have universal prescription drug coverage (14% of the population). We suppressed our results in cells with five or fewer patients to comply with privacy regulations and minimize the chance of re-identification of patients. Figure 1 illustrates the basic workflow of the study presented in this paper. In the first stage, we created an integrated dataset from five different health administrative databases stored at ICES. The data sources are explained in Section 2.4. Next, we applied the inclusion and exclusion criteria presented in Section 2.5 to build the final cohort. The demographic and comorbidity features were then encoded and transformed into appropriate forms for analysis in Section 2.6. Section 2.7 describes the outcome (i.e., AKI) and how we identified the incidence of AKI. A brief description of the cohort is presented in Section 2.8. After that, we performed individual and combination analysis, which are discussed in Sections 2.9 and 2.10, respectively.

Data Sources
We ascertained patient characteristics, drug prescriptions, and outcome data from five health administrative databases (Table A1). The datasets were linked using unique, encoded identifiers derived from health card numbers, and patient-level data were analyzed at ICES. We obtained vital statistics from the Ontario Registered Persons Database, which contains demographic data on all Ontario residents who have ever been issued a health card. We used the Ontario Drug Benefit Program database to identify prescription drug use. This database contains highly accurate records of all outpatient prescriptions dispensed to older patients, with an error rate of less than 1% [38]. We identified hospital admissions, baseline comorbidity, and emergency department (ED) visits data from the National Ambulatory Care Reporting System (ED visits) and the Canadian Institute for Health Information Discharge Abstract Database (hospitalizations). We used the International Classification of Diseases, tenth revision (post-2002) codes to identify baseline comorbidities. Baseline comorbidity data were also obtained from the Ontario Health Insurance Plan database, which includes claims for physician services. Coding definitions for the comorbidity data are presented in Table A2.

Cohort Entry Criteria
We identified a cohort of individuals aged 65 years or older who were admitted to the hospital or visited the ED between 1st April 2014 and 31st March 2016. The ED visit or hospital admission date served as the index or cohort entry date. If an individual had multiple ED visits or hospital admissions, we selected the first incident. Individuals with invalid data regarding the health card number, age, and sex were excluded. We also excluded (1) patients who previously received dialysis or a kidney transplant, as AKI is often no longer relevant once a patient develops end-stage kidney disease (diagnosis codes for exclusion criteria are shown in Table A3); and (2) patients who left the hospital against medical advice or without being seen by a physician.
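The entry rule above (age 65 or older, a visit inside the accrual window, first visit taken as the index date) can be sketched in a few lines. This is only an illustration in Python, not the SAS code used in the study; the record layout `(patient_id, age, visit_date)` and the sample values are hypothetical, and the dialysis, transplant, and discharge exclusions are omitted.

```python
from datetime import date

# Accrual window and age cut-off as stated in the text.
WINDOW_START = date(2014, 4, 1)
WINDOW_END = date(2016, 3, 31)
MIN_AGE = 65

def index_visits(visits):
    """Return {patient_id: index_date}, keeping each patient's first eligible visit.

    visits: iterable of (patient_id, age_at_visit, visit_date) tuples
    (a hypothetical layout, not the ICES schema).
    """
    index = {}
    for patient_id, age, visit_date in visits:
        if age < MIN_AGE or not (WINDOW_START <= visit_date <= WINDOW_END):
            continue  # outside the age or date criteria
        if patient_id not in index or visit_date < index[patient_id]:
            index[patient_id] = visit_date  # keep the earliest eligible visit
    return index

# Hypothetical example records.
visits = [
    ("p1", 70, date(2015, 6, 1)),
    ("p1", 70, date(2014, 9, 15)),   # earlier visit becomes the index date
    ("p2", 64, date(2015, 1, 10)),   # too young
    ("p3", 82, date(2016, 5, 2)),    # after the window closes
]
cohort = index_visits(visits)
```

Here only `p1` enters the cohort, with the earlier of the two visits as the index date.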

Input Features
There were a total of 5 demographic, 10 comorbidity, and 595 medication features in the cohort, which served as input for the analysis. Demographic information included age, sex, residency status (urban and rural), long term care, and socioeconomic status (income quintile, according to Statistics Canada). We used a 5-year look-back window to identify relevant baseline comorbidities, including diabetes mellitus, hypertension, heart failure, coronary artery disease, cerebrovascular disease, peripheral vascular disease, chronic liver disease, chronic kidney disease, major cancers, and kidney stones.
All of the features in the cohort were categorical. We converted the comorbidity features into binary forms. For instance, if a patient had a particular comorbid condition, its corresponding value was set to "1." Similarly, the sex feature was set to 1 if a patient was male, and the residency status feature was set to 1 if a patient resided in an urban area. The income feature took an integer value ranging from 1 to 5 to represent the income quintile of a particular patient. All these features from different data sources were integrated using the encoded identifiers derived by ICES. Finally, the features in the cohort were transformed into a format and scale that were suitable for the analysis. For each feature in the cohort, we recorded the last value before the index date. Thus, we aggregated multiple values (rows) of a single feature into one, by considering the latest value of that feature for each patient.
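As a rough illustration of the encoding and "latest value before the index date" aggregation described above, the following Python sketch builds one feature row for one patient. The field names, the three-condition list, and the choice to treat "before" as strictly earlier than the index date are assumptions made for the example.

```python
from datetime import date

# Subset of the ten comorbidities, for brevity (names are illustrative).
CONDITIONS = ["diabetes", "hypertension", "chronic_kidney_disease"]

def latest_before(records, index_date):
    """records: list of (record_date, value) pairs.

    Return the most recent value dated strictly before index_date,
    or None if no such record exists (strictness is an assumption).
    """
    eligible = [r for r in records if r[0] < index_date]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r[0])[1]

def encode(sex, residency, income_quintile, diagnosed):
    """Encode one patient: 1 = male, 1 = urban, quintile kept as 1..5."""
    row = {
        "male": int(sex == "M"),
        "urban": int(residency == "urban"),
        "income_quintile": income_quintile,
    }
    for condition in CONDITIONS:
        row[condition] = int(condition in diagnosed)
    return row

# Hypothetical patient with three income records around the index date.
index_date = date(2015, 6, 1)
income_history = [
    (date(2013, 1, 1), 2),
    (date(2015, 2, 1), 3),   # latest record before the index date
    (date(2015, 7, 1), 4),   # after the index date, ignored
]
row = encode("M", "rural", latest_before(income_history, index_date), {"diabetes"})
```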

Outcome
AKI was the outcome variable for all the regression models in this study. We identified the incidence of AKI in the first visit to the ED or hospital admission, between 1st April 2014 and 31st March 2016. The incidence of AKI was captured using the National Ambulatory Care Reporting System and the Canadian Institute for Health Information Discharge Abstract Database, based on the International Classification of Diseases, tenth revision diagnostic codes (i.e., N17). We set the value of the outcome variable to 1 if a patient was diagnosed with AKI. We recorded the first incidence of AKI, in case there were multiple episodes.

Individual Medication Analysis
We identified a total of 595 unique medications prescribed to about 1 million patients in the Ontario Drug Benefit Program database. The database includes medication name, medication dose, date filled, and route-of-administration of the prescriptions. We generated 595 binary features to record the use data for each medication and each patient. We set the value of a specific medication feature to 1 for a patient when the medication was dispensed to that patient in the 120 days prior to hospital presentation. When patients take a drug, it affects them differently, based on their body composition and metabolism. However, most physicians are not able to consider all of these factors when prescribing a medication. Thus, to investigate the association between medications and AKI, we intended to identify signals that affect a large population. If a particular signal is common in a large number of people (i.e., a population of one million patients), then the possibility of the existence of an association is very high. Our goal was to identify potential interactions that are not yet known or well understood. We considered this as an information retrieval problem, such that our models were designed to discover the possible relationships between each medication and AKI. We developed a multivariable logistic regression model to predict AKI based on the demographic, comorbidity, and medication data and observed the attribute representing medication to understand the relationship between a particular medication and AKI. Logistic regression is a special type of regression technique used to predict the outcome of a binary dependent feature from one or several predictors. We developed separate regression models for each individual medication (i.e., 595 models). For each model, the regression coefficient and p-value of the medication attribute were analyzed to identify potential associations.
The study was designed to assist healthcare experts at the ICES-KDT program in choosing potential candidates for their future drug-safety studies.
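The 120-day exposure rule used to build the binary medication features can be sketched as follows. This is an illustrative Python fragment rather than the study's code; the `(medication, dispense_date)` layout is hypothetical, and treating the window as the 120 days ending the day before the index date is an assumption about the boundary.

```python
from datetime import date, timedelta

LOOKBACK_DAYS = 120  # look-back window stated in the text

def medication_flags(dispensings, index_date, medications):
    """Return {medication: 0 or 1} for one patient.

    dispensings: list of (medication, dispense_date) pairs (hypothetical layout).
    A flag is 1 when the medication was dispensed within the look-back window
    [index_date - 120 days, index_date); the exact boundary is an assumption.
    """
    window_start = index_date - timedelta(days=LOOKBACK_DAYS)
    exposed = {m for m, d in dispensings if window_start <= d < index_date}
    return {m: int(m in exposed) for m in medications}

# Hypothetical example for one patient.
index_date = date(2015, 6, 1)
dispensings = [
    ("furosemide", date(2015, 4, 1)),   # 61 days before the index date
    ("metformin", date(2014, 12, 1)),   # 182 days before, outside the window
]
flags = medication_flags(dispensings, index_date,
                         ["furosemide", "metformin", "atenolol"])
```

In the study this would yield 595 such flags per patient, one for each distinct medication.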
The "glm" function in R was employed to implement multivariable logistic regression models [39]. Model covariates included demographic features and baseline comorbidities. Thus, the formula in R included AKI as the response and comorbidities, demographics, and medication as predictor variables. The value for the "family" argument in the "glm" function was set to "binomial." We used the "summary" function to get the estimate, p-value, z-score, and standard error for each coefficient in the model. In addition, the "confint" function was used to compute the confidence intervals, from which the odds ratios and their intervals were obtained by exponentiation.
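To make the link between a fitted coefficient, its standard error, and the reported odds ratio concrete, here is a minimal pure-Python sketch of logistic regression fitted by Newton-Raphson. It stands in for R's `glm(..., family = binomial)` for illustration only: the toy data are hypothetical, only one medication flag is included (covariates would simply be further columns), and the interval shown is a Wald interval, which is an assumption and may differ slightly from what `confint` returns in R.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def score_and_information(X, y, beta):
    """Gradient of the log-likelihood and the Fisher information matrix."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
        for a in range(k):
            grad[a] += (yi - p) * xi[a]
            for c in range(k):
                info[a][c] += p * (1.0 - p) * xi[a] * xi[c]
    return grad, info

def fit_logistic(X, y, iterations=25):
    """Newton-Raphson fit; returns (coefficients, standard errors)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad, info = score_and_information(X, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    # Standard errors: square roots of the diagonal of the inverse information.
    _, info = score_and_information(X, y, beta)
    se = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se.append(math.sqrt(solve(info, unit)[j]))
    return beta, se

# Toy data (hypothetical): columns are [intercept, medication flag].
# Exposed: 30 AKI / 70 no AKI; unexposed: 10 AKI / 90 no AKI.
X = [[1, 1]] * 100 + [[1, 0]] * 100
y = [1] * 30 + [0] * 70 + [1] * 10 + [0] * 90
beta, se = fit_logistic(X, y)
odds_ratio = math.exp(beta[1])
ci_low = math.exp(beta[1] - 1.96 * se[1])   # Wald 95% interval (assumption)
ci_high = math.exp(beta[1] + 1.96 * se[1])
```

With a single binary predictor the fitted odds ratio matches the cross-product ratio of the 2×2 table, (30 × 90) / (70 × 10), which gives a quick sanity check on the fit.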
To limit type I error when comparing multiple independent regression models, we lowered the alpha value using the Bonferroni correction to account for the number of comparisons being performed. For the regression models fitted to individual medications, we considered a p-value below the Bonferroni-corrected threshold of 8.4 × 10⁻⁵ (0.05 divided by the number of individual medications, 595) as statistically significant.
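The corrected thresholds follow directly from dividing 0.05 by the number of models. The short Python check below reproduces the value for the 595 individual medication models and, using the same rule, for the 7748 medication combination models; the function name is illustrative.

```python
ALPHA = 0.05  # family-wise significance level

def bonferroni_threshold(n_tests, alpha=ALPHA):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests

individual = bonferroni_threshold(595)     # one model per medication
combination = bonferroni_threshold(7748)   # one model per medication pair

print(f"{individual:.1e}")    # 8.4e-05
print(f"{combination:.1e}")   # 6.5e-06
```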

Medication Combination Analysis
In order to identify the medication combinations that are associated with AKI, we first prepared the medication combinations data. Since the number of individual medications is 595, the total number of possible combinations is very large. Hence, we used a data mining technique named Eclat [40] to select the frequent combinations that included prescription data of at least 0.07% of the total number of prescriptions. Eclat is a frequent itemset mining algorithm that uses a depth-first search to discover groups of items that frequently occur in a transaction database. An itemset that appears in at least a pre-defined number of transactions is called a frequent itemset. Each frequent medication combination was annotated with its support, that is, the number of times it appeared in the medication database.
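The core idea of Eclat, storing for each item the set of transaction ids that contain it and intersecting those sets to count support, can be sketched for medication pairs as follows. This is a simplified Python illustration, not the R implementation used in the study: it stops at pairs instead of recursing depth-first to larger itemsets, and it assumes one transaction per patient, which the text does not specify. The sample data and the 50% threshold are hypothetical.

```python
from itertools import combinations

def frequent_pairs(transactions, min_support_fraction):
    """Return {(med_a, med_b): support_count} for frequent medication pairs.

    transactions: list of sets of medication names (assumed one per patient).
    """
    # Vertical layout used by Eclat: medication -> set of transaction ids.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    min_count = min_support_fraction * len(transactions)
    # A pair can only be frequent if both of its members are frequent.
    frequent_items = sorted(i for i, t in tidsets.items() if len(t) >= min_count)

    result = {}
    for a, b in combinations(frequent_items, 2):
        support = len(tidsets[a] & tidsets[b])  # support by tidset intersection
        if support >= min_count:
            result[(a, b)] = support
    return result

# Hypothetical transactions.
transactions = [
    {"furosemide", "metformin"},
    {"furosemide", "metformin", "atenolol"},
    {"atenolol"},
    {"furosemide"},
]
pairs = frequent_pairs(transactions, min_support_fraction=0.5)
```

With 595 medications there are 595 × 594 / 2 = 176,715 possible pairs, so a support threshold of this kind is what keeps the number of regression models manageable.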
We only included combinations of two medications in this analysis and identified 7748 unique medication combinations. Then, we created binary features to record the presence of these combinations. We set the value of a specific combination feature to 1 for a patient when that patient had been dispensed all medications within the combination in the 120-day period before the index date. Similar to the individual medication analysis, we applied a multivariable logistic regression on each medication combination. The baseline covariates, such as demographics and comorbidities, and medication combination features were included as potential covariates in the models. We developed separate regression models for each medication combination identified using frequent itemset mining (i.e., 7748 models). The regression coefficients and p-values of the medication combination attributes were analyzed to identify combinations that are associated with AKI. We then performed a stratified analysis to examine potential medication-medication interactions further. We created a subset of medication combinations based on their significance in the regression models. Statistically significant combinations were detected by filtering the regression models based on a Bonferroni-corrected alpha value of 6.5 × 10⁻⁶ (0.05 divided by the number of medication combinations, 7748).
Stratified analysis was conducted on each medication available in one or more combinations in the above subset. To do this, we created a list of unique medications (i.e., base medications) from the chosen subset of medication combinations. Then, for each medication in the list, we identified the other medication that holds a combination with the base medication. In the next stage, we prepared two sub-cohorts. The first one includes both medications in the combination (base and other), and the second one excludes the other medication in the combination. Finally, we applied multivariable logistic regression on each sub-cohort that included the combination and/or base medication feature, along with the baseline covariates. The same process was followed for each medication available on the list.
In this analysis, for each unique medication combination, we obtained two models, one for each sub-cohort. To assess how the other medication affects the outcome of the base medication, we compared the odds ratio of the combination attribute in the first model with the odds ratio of the base medication attribute in the second model. We tested the significance of all models in the stratified analysis using a Bonferroni-corrected alpha value. We calculated the percentage change in odds ratios to report the result of this analysis.
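The text does not spell out the formula for the percentage change, so the Python sketch below assumes the usual relative difference: the odds ratio from the sub-cohort with the other medication, compared against the odds ratio from the sub-cohort without it. The function name and the example odds ratios are hypothetical.

```python
def percent_change_in_odds_ratio(or_with_other, or_without_other):
    """Relative change (in percent) of the odds ratio when the other
    medication is present, compared with the base medication alone.
    The formula is an assumption; the text only names the quantity."""
    return 100.0 * (or_with_other - or_without_other) / or_without_other

# Hypothetical odds ratios from the two sub-cohort models.
change = percent_change_in_odds_ratio(or_with_other=2.7, or_without_other=1.5)
```

Under this definition, a base medication whose odds ratio rises from 1.5 to 2.7 in the presence of the other medication would be reported as an 80% increase.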

Tools and Technologies
SAS was used to cut and prepare the dataset because ICES' administrative databases were stored on the SAS server [41]. In addition, we used R packages [42] to conduct the necessary statistical and machine learning analyses in this study. R was chosen because it (1) provides widespread support for carrying out data mining operations, such as frequent itemset mining and multivariable regression, (2) is available on the ICES workstations, (3) has a rich array of libraries, (4) is platform-independent and open-source, and (5) is continuously growing and providing updates with new features.

Results
This section describes the results of the study. The results of the individual medication analysis and medication combination analysis are discussed in Sections 3.1 and 3.2, respectively.

Individual Medications and AKI
Some of the commonly prescribed medications in the 120 days before the ED visit were Atorvastatin Calcium (24%), Rosuvastatin Calcium (22%), Hydrochlorothiazide (20%), Amlodipine Besylate (19%), and Metformin Hcl (16%). A binary logistic regression model was fit to each medication, where demographic and comorbidity features were included as potential risk factors in the model to test the research hypothesis regarding the relationship between the likelihood of developing AKI and specific medications. Table 2 shows the full list of medications with their p-values, odds ratios, confidence intervals, and standard errors. The medication classes are shown in brackets with medication names. We sorted medications based on the odds ratio of the medication attribute in each model. Out of 595 medications, 55 were found to be strongly associated with AKI (i.e., statistically significant after Bonferroni correction). Among these 55 medications, six were Diuretics, four were Beta-blockers, three were Oral Anti-Glycemic medications, three were Prostatic Hyperplasia medications, and the rest belonged to 33 other medication classes.
Among demographics, age, sex, residency status, and long-term care attributes showed statistically significant relationships with the probability of AKI. The fitted models revealed that, keeping all other attributes constant, the odds ratio of being diagnosed with AKI for males versus females ranged from 1.35 to 1.38. The odds for older age groups (i.e., 80 to <90 and ≥90) were higher. The odds for rural residents were 24%-28% lower than the odds for urban residents. Similarly, the odds for patients in long-term care were 41%-45% higher. By analyzing the comorbidity attributes in the models, we identified that AKI was more likely to be associated with chronic kidney disease, hypertension, diabetes, heart failure, and chronic liver disease. Among these attributes, chronic kidney disease, hypertension, and diabetes showed very strong associations. The average odds ratios for chronic kidney disease, hypertension, and diabetes patients were 1.81, 1.64, and 1.41, respectively.
In the next stage, we applied multivariable logistic regression on each selected combination. We filtered the combinations based on the p-value of the medication feature in each model. Among the 7748 combinations, we found 78 that were associated with an increased risk of AKI. Then, we performed stratified analysis on the strongly associated combinations and reported the percentage change in the odds ratio. We identified 37 cases where a base medication was associated with an increased risk of developing AKI when used with another medication. Table 3 contains a filtered list of combinations with a percentage change of more than 40%. It shows the medication names with classes, the odds ratios of the models with and without the other medication, and the percentage change in odds ratios. In the stratified analysis, we found 16 and 27 distinct medication classes representing the first (base medication) and second (other medication in combination) columns, respectively. The odds ratio increased by 80% when Indapamide was used with Clavulanic Acid Potassium or Amoxicillin. The combination of Allopurinol with Venlafaxine Hcl or Morphine Sulfate was associated with a 55% increase in the odds. The odds of being diagnosed with AKI increased when Alprazolam, Trandolapril, Metformin, Clonidine Hcl, Acetaminophen & Oxycodone Hcl, or Cefuroxime Axetil was used in combination with Furosemide. When Celecoxib, Pregabalin, or Atenolol was used with one of the Antipsychotic medications (Quetiapine), the average change in odds ratio was about 65%. It is interesting to note that Celecoxib (Anti-Inflammatory) was not found to be associated with AKI (Table 2) when used individually, but appeared to be AKI-associated when used with Mirtazapine (Antidepressant) or Quetiapine Fumarate (Antipsychotic).
The relationships between AKI and the potential covariates (i.e., demographics and comorbidities) in the combination models resembled those in the individual models. By analyzing the regression coefficients of the combination models, we identified that patients with AKI were more likely to be men, reside in urban areas, live in long-term care, and have chronic kidney disease, hypertension, diabetes, or heart failure. AKI was less likely to be associated with income quintile, peripheral vascular disease, chronic liver disease, and cerebrovascular disease.

Discussion
In this study, we demonstrated how machine learning techniques could help with the identification of potentially nephrotoxic medications using administrative health databases housed in ICES. Nephrotoxic medications are responsible for about 20% of episodes of AKI, and the rate of medication-induced nephrotoxicity leading to AKI among older patients is approximately 66% [43,44]. We have presented methods for identifying medications and medication combinations that are associated with AKI using regression and frequent itemset mining algorithms. We found that 9% of all the prescribed medications were possibly associated with AKI by analyzing the medication data of one million older patients included in our study. Our analysis identified Angiotensin II Receptor Blockers, Antibacterial Agents, Diuretics, Iron Preparations, Nonsteroidal Anti-Inflammatory drugs, and Xanthine Oxidase Inhibitors as medication classes that were associated with increasing the risk of AKI. In a recent study of the French national pharmacovigilance database, Pierson-Marchandise et al. (2017) found that the majority of cases of medication-induced AKI were related to Antibacterial Agents, Antineoplastic Agents, Diuretics, Anti-Inflammatory drugs, and agents acting on the Renin-Angiotensin system [45]. A similar conclusion was reached by a study of nursing home residents, where Ace Inhibitors, Angiotensin II Receptor Blockers, Antibiotics, and Diuretics were identified as the primary medication classes responsible for developing AKI.
Our study also aimed to investigate how the individual medication analysis results were consistent with what has been found in the previous studies. We first reviewed the results with a nephrologist and learnt that most of the statistically significant medications (Table 2) were already known to be associated with AKI, which confirmed the accuracy of our findings. We also conducted an electronic literature search to find the research papers that studied the relationships between these medications and AKI. To ensure that relevant papers were not missed in our search, we used a relatively large set of keywords. We used two sets of keywords. The first set represented the medication, and the second was concerned with AKI. For the second set, we used the following terms: "AKI", "acute kidney injury", "acute renal failure", "acute phosphate nephropathy", "acute prerenal failure", and "anuria". All the studies included in this literature search were published from 1995 until 2019. Through the literature search, we found studies that investigate the associations between 38 medications (among 55 identified medications) and AKI, which more comprehensively proved the efficacy of our study.
To explain the results of the individual medication analysis, we divided the identified medications into two main groups: known and likely-confounded. The medications in the known group were already known to be associated with AKI, and their relationships with AKI have previously been studied in the literature. The likely-confounded group contains medications that are used to treat AKI-associated conditions, that were included in studies of kidney function, or that have not been studied before. There is a lack of evidence regarding the association between AKI and some of these medications, such as Prochlorperazine Maleate and Terazosin. The complete assignment of medications to these groups is shown in Table 4. Out of 55 medications, 38 were in the known group and 17 were in the likely-confounded group. The key finding of the individual medication analysis was the list of medications in the likely-confounded group. These medications are suitable candidates for clinical drug-safety studies that investigate their potential association with AKI.
Through the medication combination analysis, we found that out of 25 thousand patients with AKI in our dataset, about 85% were prescribed multiple medications within 120 days prior to the index date. The incidence of AKI is usually higher among patients who are prescribed multiple medications. For instance, in a study of 38,782 adverse drug reaction events, about 66% of the 1254 reported AKI cases included two or more concomitantly prescribed medications [45]. Another study suggested that there were statistically significant associations between the duration of simultaneous medication use and the development of AKI [90]. Similarly, a study of Taiwan's National Health Insurance system showed that the concurrent use of certain medication classes (such as Diuretics, Beta Blockers, Calcium Channel Blockers, Alpha Blockers, Ace Inhibitors, Digoxin, and Platelet Aggregation Inhibitors) was strongly associated with the development of AKI [91]. To compare our findings with the existing literature, we discussed the results of the medication combination analysis in terms of medication classes, since most previous studies presented their results this way. As shown in Table 3, some of the AKI-associated combinations are Alpha Adrenergic Blocking Agents-and-Ace Inhibitors, Corticosteroids-and-Ace Inhibitors, Diuretics-and-Ace Inhibitors, Potassium Sparing Diuretics-and-Ace Inhibitors, Diuretics-and-Analgesics & Antipyretics, Tricyclic Antidepressant-and-Analgesics & Antipyretics, Alpha Adrenergic Blocking Agents-and-Angiotensin II Antagonist, and Antilipemic: Fibrates-and-Angiotensin II Antagonist. We found that using Diuretics in combination with certain medication classes is associated with an increased risk of developing AKI. In line with our findings, the effect on AKI of using Diuretics with Renin Angiotensin Aldosterone Agents, Ace Inhibitors, or Penicillin has been investigated in several studies [92][93][94][95][96][97][98].
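To make the idea of a frequent medication combination concrete, the following minimal sketch counts co-prescribed pairs of medication classes and keeps those that reach a minimum support. It is a simplification that assumes only pairs are of interest and does not reproduce the itemset mining algorithm or the support threshold used in the study. The function name `frequent_pairs` and the four-patient toy dataset are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(prescriptions, min_support):
    """Return {pair: support} for pairs prescribed together to at least
    `min_support` (a fraction between 0 and 1) of the patients.

    `prescriptions` is a list with one set of medication classes per patient.
    """
    n_patients = len(prescriptions)
    counts = Counter()
    for meds in prescriptions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(set(meds)), 2):
            counts[pair] += 1
    return {
        pair: count / n_patients
        for pair, count in counts.items()
        if count / n_patients >= min_support
    }


# Toy data (not from the study): classes prescribed in the look-back window.
patients = [
    {"Diuretics", "Ace Inhibitors"},
    {"Diuretics", "Ace Inhibitors", "Corticosteroids"},
    {"Diuretics"},
    {"Corticosteroids", "Ace Inhibitors"},
]
pairs = frequent_pairs(patients, min_support=0.5)
for pair, support in sorted(pairs.items()):
    print(pair, support)
```

In this toy example the Corticosteroids-and-Diuretics pair appears in only one of four patients and is therefore dropped. Each surviving pair would then be a candidate for the association analysis with AKI.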
To verify the results of the medication combination analysis, we compared our findings with a recent literature review [22]. In their study, Rivosecchi et al. identified 76 unique combinations of medication classes that were associated with AKI by assessing 2139 citations. Overall, 73.7% of selected medication classes were categorized as very low quality, 15.8% were of low quality, and 10.5% were considered medium quality. We found that our results are consistent with the studies included in this literature review. It is important to note that 19 medications in our study were not statistically significant individually but were found to be strongly associated with AKI when used with another medication (Tables 2 and 3). Our study also identified a few combinations of medication classes, such as Antipsychotic Agents-and-Anti-Inflammatory and Diuretics-and-Xanthine Oxidase Inhibitor, whose components have been studied individually, but for which there is a lack of evidence in the literature on how the combination is associated with AKI [99][100][101][102][103].
The main strength of this study is its exhaustive analysis of the medication usage patterns of one million hospitalized patients within a 120-day look-back window. It is noteworthy that all the patients were elderly (65 years or older), suffering from multiple diseases, and taking several potentially nephrotoxic medications. We included most of the frequently prescribed medications and investigated all possible combinations among them. To better isolate the impact of medications on AKI, we incorporated the patients' demographic and comorbidity features as covariates in the regression analysis. In addition, we performed a stratified analysis to investigate the synergistic effect of medication combinations on AKI. To our knowledge, this study introduced a novel analysis technique that integrates frequent itemset mining, regression, and stratification to identify medications and combinations that can potentially be associated with AKI.
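The logic of the stratified analysis can be illustrated with a small sketch that computes the odds ratio for AKI associated with one medication separately among patients who do and do not take a second medication. This is a crude, unadjusted calculation on invented counts, offered only as an assumption-laden illustration. The study itself used multivariable logistic regression with demographic and comorbidity covariates, which this sketch does not reproduce. The medication names "A" and "B" and all function names are hypothetical.

```python
def odds_ratio(exp_aki, exp_no_aki, unexp_aki, unexp_no_aki):
    """Crude odds ratio from a 2x2 table (exposure by AKI status)."""
    return (exp_aki * unexp_no_aki) / (exp_no_aki * unexp_aki)


def stratified_odds_ratios(records, drug, stratifier):
    """Odds ratio of AKI for `drug`, within strata defined by `stratifier`.

    `records` is a list of (medications, has_aki) tuples. Returns a dict
    keyed by True (stratifier taken) and False (stratifier not taken).
    """
    # Cell order per stratum: exposed/AKI, exposed/no AKI,
    # unexposed/AKI, unexposed/no AKI.
    tables = {True: [0, 0, 0, 0], False: [0, 0, 0, 0]}
    for meds, has_aki in records:
        table = tables[stratifier in meds]
        cell = (0 if has_aki else 1) if drug in meds else (2 if has_aki else 3)
        table[cell] += 1
    return {stratum: odds_ratio(*table) for stratum, table in tables.items()}


def make_records(meds, n_aki, n_no_aki):
    """Build toy patient records with the given AKI / non-AKI counts."""
    meds = frozenset(meds)
    return [(meds, True)] * n_aki + [(meds, False)] * n_no_aki


# Invented counts, chosen so that "A" matters only when "B" is also taken.
records = (
    make_records(["A", "B"], 6, 4)
    + make_records(["B"], 2, 8)
    + make_records(["A"], 2, 8)
    + make_records([], 2, 8)
)
result = stratified_odds_ratios(records, drug="A", stratifier="B")
print(result)
```

With these toy counts, the odds ratio for "A" is 6.0 among patients who also take "B" and 1.0 among those who do not. This is the kind of pattern in which a medication is associated with AKI only when used with another medication.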
This research also demonstrates how machine learning can be used to address a well-known problem in the medical domain. It highlights what needs to be considered when designing studies that are intended to incorporate machine learning techniques to support data-driven tasks using health administrative datasets.

Limitations
Our study has some limitations. First, our results can only be generalized to the elderly, as we only had complete medication data on those aged 65 and older. Second, our study population might have included clinically unstable patients who were admitted to the hospital or emergency department. This could be a confounding factor, as clinically unstable patients are more likely to take multiple concomitant medications, which increases their chances of developing AKI. Third, our findings can only be generalized to the population of Ontario, since the models were derived and validated in cohorts from hospitals in Ontario. Lastly, a patient may be prescribed a medication for multiple reasons, and these reasons, rather than the medication itself, may lead to the development of AKI. The study was designed to assist healthcare researchers at the ICES-KDT program in identifying potential candidates for their future medication-safety studies.

Conclusions
Medication-induced nephrotoxicity is one of the major causes of AKI worldwide. In the present study of the ICES database, we identified the individual medications and medication combinations that are potentially associated with AKI by applying a combination of regression and frequent itemset mining techniques. Throughout this paper, we have shown that our results are consistent with previous studies. Although most of the medications that we identified are already known to be associated with AKI, some of them have not yet been thoroughly studied. Our findings may raise awareness of the need for clinical research on these potentially nephrotoxic medications. To reduce the chance of patients developing AKI, attention should be directed at avoiding nephrotoxic treatments when an at-risk situation is identified. This requires not only careful monitoring by prescribers but also comprehensive studies of these medications. Ongoing research in this field might provide more reliable methods for detecting potentially nephrotoxic medications and their combinations, thus allowing timely intervention to prevent AKI. This research will also help machine learning researchers understand what needs to be considered when designing studies that incorporate machine learning methods to accomplish data-driven tasks using healthcare datasets.

Appendix A

Table A1. List of databases held at ICES (an independent, non-profit, world-leading research organization that uses population-based health and social data to produce knowledge on a broad range of healthcare issues).

C15, C18, C19, C20, C22, C25, C34, C50, C56, C61, C82, C83, C85, C91, C92, C93, C94, C95, D00, D010, D011, D012, I62, I630, I631, I632, I633, I634, I635, I638, I639, I64, H341, I600, I601, I602, I603, I604, I605, I606, I607, I609, I61, G450, G451, G452