Mathematical Modeling of Hydroxyurea Therapy in Individuals with Sickle Cell Disease

Sickle cell disease (SCD) is a chronic hemolytic anemia affecting millions worldwide with acute and chronic clinical manifestations and early mortality. While hydroxyurea (HU) and other treatment strategies managed to ameliorate disease severity, high inter-individual variability in clinical response and a lack of an ability to predict those variations need to be addressed to maximize the clinical efficacy of HU. We developed pharmacokinetics (PK) and pharmacodynamics (PD) models to study the dosing, efficacy, toxicity, and clinical response of HU treatment in more than eighty children with SCD. The clinical PK parameters were used to model the HU plasma concentration for a 24 h period, and the estimated daily average HU plasma concentration was used as an input to our PD models with approximately 1 to 9 years of data connecting drug exposure with drug response. We modeled the biomarkers mean cell volume and fetal hemoglobin to study treatment efficacy. For myelosuppression, we modeled red blood cells and absolute neutrophil count. Our models provided excellent fits for individuals with known or correctly inferred adherence. Our models can be used to determine the optimal dosing regimens and study the effect of non-adherence on HU-treated individuals.


Introduction
Sickle cell disease (SCD) is a hereditary disorder caused by a single gene mutation in the β-globin gene that produces sickle hemoglobin (HbS) [1]. HbS polymerizes when deoxygenated and is the nidus for the complex downstream pathobiology observed in individuals with SCD, including acute SCD-related complications (vaso-occlusive pain, acute chest syndrome, priapism, etc.) and the onset and progression of end-organ damage [2]. SCD affects approximately 100,000 people in the United States and millions globally, and every year an estimated 300,000 children are born with sickle cell anemia (SCA) across the globe [3,4]. Hydroxyurea (HU) is approved by the Food and Drug Administration for adults and children aged 2-18 years with SCD, but it is widely utilized in children beginning as early as 9 months of age [5,6]. Although HU has multiple therapeutic benefits in individuals with SCD, the primary benefits are through increasing fetal hemoglobin (HbF) and additionally increasing mean cell volume (MCV) and reducing absolute neutrophil count (ANC) and total white blood cell (WBC) counts. Clinically, HU reduces the frequency of vaso-occlusive pain crises, acute chest syndrome, number of transfusions required, and total hospitalizations [7][8][9][10]. However, there are challenges associated with HU treatment: significant interpatient variability in PK-PD, the need for timely prediction of the optimal dose, and low rates of adherence [11,12]. The first challenge is addressed in this work through PK-PD model formulation. The successful formulation of PK-PD models allows

Observations from the Data
We retrospectively analyzed data from the HUSTLE trial (NCT00305175) that were collected at St. Jude Children's Research Hospital to study the long-term effects of HU therapy in children with SCD. The data contained participants' demographics, PK, PD, and pharmacy refill records, as shown in Figure 1. The demographic data consisted of participants' gender, age, weight, height, etc. The number of males to females was 54 to 31. At the start of treatment, the subjects' ages ranged from 1.29 to 17.95 years old. The PK data were collected for 87 participants over 8 h at the beginning of HU treatment. The PK data did not include the exact plasma drug concentration versus time data. Instead, the PK data consisted of the AUC and other clinical PK parameters available from non-compartmental analysis (NCA). The PD data were collected for a larger cohort of 253 participants, with the data collection period ranging from less than 1 to 18 years across the population. The data consisted of complete blood count (CBC), MCV, and hemoglobin fractions via high-performance liquid chromatography such as hemoglobin A, F, and S versus time. Table 1 lists the summary of participants with SCD demographics, clinical PK parameters, and laboratory values of PD variables. The laboratory values were reported at the start (±10 days), after 6 (±1) months, and after 12 (±1) months of HU treatment. The pharmacy refill record provided the total dose given, the number of days for which it was given, and the days between participant visits. If the number of days for which the capsules were given was less than the number of days between participant visits, it indicated a clear case of non-adherence.  31. At the start of treatment, the subjects' ages ranged from 1.29 to 17.95 years old. The PK data were collected for 87 participants over 8 h at the beginning of HU treatment. The PK data did not include the exact plasma drug concentration versus time data. Instead, the PK data consisted of the AUC and other clinical PK parameters available from non-compartmental analysis (NCA). The PD data were collected for a larger cohort of 253 participants, with the data collection period ranging from less than 1 to 18 years across the population. The data consisted of complete blood count (CBC), MCV, and hemoglobin fractions via high-performance liquid chromatography such as hemoglobin A, F, and S versus time. Table 1 lists the summary of participants with SCD demographics, clinical PK parameters, and laboratory values of PD variables. The laboratory values were reported at the start (±10 days), after 6 (±1) months, and after 12 (±1) months of HU treatment. The pharmacy refill record provided the total dose given, the number of days for which it was given, and the days between participant visits. If the number of days for which the capsules were given was less than the number of days between participant visits, it indicated a clear case of non-adherence.    PK and PD data were available for 87 participants with SCD who were prescribed HU, and 85 participants' data were used to construct the model. Two participants were not included in the model because one had only a single data point, and one had no pharmacy data. Among the clinical variables, the key variables for modeling included MCV and HbF, biomarkers to indicate treatment efficacy. Additionally, to capture myelosuppression, the RBC and ANC profiles were modeled. Of particular interest to us was the ANC, as it helps clinicians decide the maximum tolerated dose (MTD) of HU administered. The MTD is determined when the ANC reaches the target range between 2000-4000 cells/µL.
In Figure 2, the average values and trends for the key variables of interest with HU treatment are seen over the course of 1 year of treatment. The two biomarkers, HbF and MCV, increased with the onset of HU treatment until 6 months, stabilizing following 6 months of treatment. The number of data points for HbF was lower than the number of data points for MCV, RBC, and ANC. As part of standard medical care, HbF was not collected at each visit. It was observed that some participants experienced decreases in MCV and HbF over time after 1 year of therapy, potentially due to non-adherence.
The ANC and WBC of individuals with SCD are elevated without therapy [32,33]. Hydroxyurea normalizes the average ANC and WBC. The average ANC decreased from 7000 cells/µL at the beginning to the desired level of 2000-4000 cells/µL after 6 months of HU treatment and remained stable afterward. The ANC and WBC (not shown in the figure) fluctuated for some participants, potentially due to changes in the drug amount, non-adherence, and several other reasons, such as common viral infections. With RBC, two factors are in play when HU is administered: (i) the drug decreases RBC due to myelosuppression, and (ii) the drug increases RBC due to reduced hemolysis, increasing the lifespan of RBC [34]. As a result, the average RBC did not fluctuate and remained stable at around 2.5 million cells/µL. The individual participant RBCs fluctuated within a constant range for some participants, while the myelosuppression effect was dominant for others.

Modeling
The different model components include modeling drug kinetics (PK) that describes how the drug gets absorbed, distributed, metabolized, and excreted from the body. For the PK model, the input is the drug dose, , and the output is the drug concentration in the plasma, . The second component includes modeling drug efficacy, which is captured by HbF and MCV dynamics. The efficacy model describes how the HbF and MCV levels change with respect to changes in . The third component includes modeling drug safety/toxicity, captured by how the blood cells such as ANC and RBC counts change

Modeling
The different model components include modeling drug kinetics (PK) that describes how the drug gets absorbed, distributed, metabolized, and excreted from the body. For the PK model, the input is the drug dose, D, and the output is the drug concentration in the plasma, C p . The second component includes modeling drug efficacy, which is captured by HbF and MCV dynamics. The efficacy model describes how the HbF and MCV levels change with respect to changes in C p . The third component includes modeling drug safety/toxicity, captured by how the blood cells such as ANC and RBC counts change against C p . Figure 3 shows the integrated PK-PD model components. Data analysis and modeling were performed in MATLAB R2020b [35]. HU is found to activate HbF through cellular signaling pathways [36][37][38]. For modeling the effect of HU on HbF on a cellular level, the mean cell fetal hemoglobin, F m , was calculated as shown in Appendix A. While calculating F m , the assumption was that HbF was uniformly distributed across all RBCs.

Dose Calculation
The challenge with dose calculation was that the dosing information was not provided when the participant laboratory samples were collected. The dosing information was obtained from the pharmacy refill records, which listed the total dose provided, the age at which the dosing was given, and the number of days for which the drug was given. The number of days between clinic visits, , at th visit, was computed by subtracting the participant's age between two consecutive visits, as shown in Figure 4. If , exceeded the number of days for which the drug was provided, , , the assumption was that the participant was consuming any extra capsules remaining from − 1th visit, , . Then, the number of days for which there was no capsule from the current, or prior, − 1 visits was calculated to compute the number of missed days between clinic visits, , . Therefore, the number of missed and extra doses was calculated by subtracting , from , and , as shown in the flowchart. With dose calculation, the primary assumption was that if the participant had the capsule available, they consumed it; otherwise, the dose was missed only due to lack of availability of the capsule. Suppose there were extra capsules accumulated from previous times. In that case, the assumption was that the participant used it later.
The daily dose was assumed to be constant between visits and calculated in mg/kg by dividing the total dose by participants' changing weight. The weight of participants was measured at every visit and was calculated by interpolation for in-between visits. Once , was determined, the non-adherent days were selected randomly from , . The everyday dose was plugged into the PK model to obtain the versus time profile.

Dose Calculation
The challenge with dose calculation was that the dosing information was not provided when the participant laboratory samples were collected. The dosing information was obtained from the pharmacy refill records, which listed the total dose provided, the age at which the dosing was given, and the number of days for which the drug was given. The number of days between clinic visits, N bcv,j at jth visit, was computed by subtracting the participant's age between two consecutive visits, as shown in Figure 4. If N bcv,j exceeded the number of days for which the drug was provided, N days,j−1 , the assumption was that the participant was consuming any extra capsules remaining from j − 1th visit, N extra,j−1 . Then, the number of days for which there was no capsule from the current, j or prior, j − 1 visits was calculated to compute the number of missed days between clinic visits, N nonad,j . Therefore, the number of missed and extra doses was calculated by subtracting N bcv,j from N days,j−1 and N extra,j−1 as shown in the flowchart. With dose calculation, the primary assumption was that if the participant had the capsule available, they consumed it; otherwise, the dose was missed only due to lack of availability of the capsule. Suppose there were extra capsules accumulated from previous times. In that case, the assumption was that the participant used it later.
The daily dose was assumed to be constant between visits and calculated in mg/kg by dividing the total dose by participants' changing weight. The weight of participants was measured at every visit and was calculated by interpolation for in-between visits. Once N nonad,j was determined, the non-adherent days were selected randomly from N bcv,j . The everyday dose was plugged into the PK model to obtain the C p versus time profile.  ,number of extra days for which capsules are available at th visit; , -number of initial extra capsules is assumed to be zero; , -number of non-adherence days at th visit.

Pharmacokinetic Model
The PK model consists of two compartments, a gastrointestinal (GI) tract and a plasma compartment ( Figure A1). Hydroxyurea is taken orally. It travels through the GI tract and is absorbed in the plasma with the first-order rate constant, . From plasma, HU is eliminated either via renal or metabolic pathways with the first-order rate constant, . To capture the different absorption profiles, as observed in individuals with SCD taking HU, a transit compartment model for absorption is considered, consisting of a series of compartments to introduce an exponential delay term [29,39]. It can adequately describe rapid or delayed absorption by varying the number of transit compartments. The rate of change in the amount of HU in the plasma compartment, , is given by, where ( + 1) is the number of transit compartments, is the drug amount in the final transit compartment in the gut calculated using Equation (A4) in Appendix A [29,39].
The volume of plasma, is obtained by using the following formula, where is the volume of blood obtained using an empirical formula. For subjects' weight ≥ 25 kg, the is obtained using the Nadler equation given below [40]:  . Flowchart to calculate the number of non-adherent days for jth visit from the pharmacy refill record. The pharmacy refill record contains the total dose given to the participant at every visit and the number of days for which it is given. This process assumes that the participant takes the capsule if they have it. Age j−1 and Age j -age in years at j − 1th and jth visit; N bcv,j -number of days between clinic visits; N days,j−1 -number of days HU was provided at j − 1th visit; N extra,j -number of extra days for which capsules are available at jth visit; N extra,init -number of initial extra capsules is assumed to be zero; N nonad,j -number of non-adherence days at jth visit.

Pharmacokinetic Model
The PK model consists of two compartments, a gastrointestinal (GI) tract and a plasma compartment ( Figure A1). Hydroxyurea is taken orally. It travels through the GI tract and is absorbed in the plasma with the first-order rate constant, k tr . From plasma, HU is eliminated either via renal or metabolic pathways with the first-order rate constant, k e . To capture the different absorption profiles, as observed in individuals with SCD taking HU, a transit compartment model for absorption is considered, consisting of a series of compartments to introduce an exponential delay term [29,39]. It can adequately describe rapid or delayed absorption by varying the number of transit compartments. The rate of change in the amount of HU in the plasma compartment, A p , is given by, where (N t + 1) is the number of transit compartments, a N t is the drug amount in the final transit compartment in the gut calculated using Equation (A4) in Appendix A [29,39]. The volume of plasma, V p is obtained by using the following formula, where V b is the volume of blood obtained using an empirical formula. For subjects' weight ≥ 25 kg, the V b is obtained using the Nadler equation given below [40]: For subjects' weight < 25 kg, the V b is scaled by 70 mL/kg as shown below [41,42]: The drug concentration in the plasma, C p is obtained by the following equation:

Parameter Estimation for PK Model
The parameter estimation for the PK model started with an initial guess for the PK parameters. The PK data provided did not consist of the time course of measured C p data points. Instead, it consisted of clinical PK parameters obtained from NCA. The clinical PK parameters in the data consisted of AUC, AUC ∞ , MRT ∞ , T max , C max , and λ z . The area under the first moment of the concentration-time curve extrapolating to ∞ is obtained by the following formula: The clinical PK parameters were calculated from the C p versus t plot obtained from the model. Ware et al. [14] measured HU concentrations in plasma at the following time points: t = 0, 15 min, 30 min, 1, 2, 4, 6, and 8 h after drug administration. The last time the measurements were made was 8 h after HU administration. So, 8 h was used as the last time point to obtain AUC, and 24 h was used as the time point to obtain AUC ∞ , AUMC ∞ . The AUC, AUC ∞ , and AUMC ∞ , from the model were calculated by the following equation: The maximum concentration, C max , and the time at which the drug reaches the peak value, T max , were calculated from the model. The rate constant of elimination, k e , was considered to be the same as λ z . The four model parameters, F, N, k tr , and k e , for every subject were calculated by minimizing the weighted sum of square error given below: where y j,clinical data is the clinical data value for the jth clinical PK parameter consisting of AUC, AUC ∞ , AUMC ∞ , T max , C max , and λ z .ŷ j (θ) is the model prediction for the jth clinical PK parameter, θ is the set of PK model parameters, w j is the weight associated with jth clinical data. Further, the two metrics were used as weights in the cost function shown in Equation (8). The first was when the individual data points for every clinical PK parameter were used as weights, and the second was when the means of every clinical PK parameter across all participants were used as weights. The means of every clinical PK parameter as weights produced better fits. The model was implemented in MATLAB R2020b [35] using MultiStart optimization algorithm to estimate θ. Once the PK model parameters were estimated, the C p versus t plot was obtained, from which the daily average C p , C p were calculated as shown in Appendix A). The PK model simulations were performed every day with dose as input, and the C p was computed for every day. The C p was then taken as the input for the PD models, which included modeling the effect of HU on erythropoiesis, leukopoiesis process, and HbF activation by HU. The erythropoiesis and leukopoiesis processes were modeled because HU targets the actively dividing cells present in the bone marrow in the initial stages of erythropoiesis and leukopoiesis, which eventually manifest in the cells in circulation [43].

Erythropoiesis and MCV Model
The erythropoiesis and MCV models to study the effect of HU on RBC and MCV were adapted from the work of Jayachandran et al., 2014 [44]. The erythropoiesis model divides the cells into five compartments, as shown in Figure 5. The stem cell compartment denoted as N se , consists of stem cells and early proliferating progenitors, which proliferate at the rate k pe . The proliferation is regulated by a cytokine, erythropoietin (EPO), whose production is regulated by RBCs in the periphery [45]. In SCD, RBCs undergo hemoglobin polymerization and hemolysis, resulting in decreased oxygen delivery to the cells and tissues. The hypoxia induces EPO production in the kidney, which upregulates erythroid progenitors [46]. The indirect effect of circulating cells on progenitors' proliferation is modeled here without incorporating the EPO expression. It is assumed that HU targets only proliferating cells in the N se compartment. The cells from the N se compartment transition and go through three precursor compartments where cells do not undergo proliferation but only maturation. The precursor compartments are denoted as N e1 , N e2 , N e3 . Finally, the precursor cells become fully functional erythrocytes, denoted as N e . The erythrocytes or RBCs circulate in the body for~120 days [47] and die at a rate dependent on the drug. The death rate is modeled as a function of HU because the erythrocyte half-life is dependent on HU exposure. This is due to HU increasing the lifespan of RBCs in addition to being myelosuppressive to stem cells and progenitors [34]. The model equations are given in Equation (9). The erythropoiesis and MCV models to study the effect of HU on RBC and MCV were adapted from the work of Jayachandran et al., 2014 [44]. The erythropoiesis model divides the cells into five compartments, as shown in Figure 5. The stem cell compartment denoted as , consists of stem cells and early proliferating progenitors, which proliferate at the rate . The proliferation is regulated by a cytokine, erythropoietin (EPO), whose production is regulated by RBCs in the periphery [45]. In SCD, RBCs undergo hemoglobin polymerization and hemolysis, resulting in decreased oxygen delivery to the cells and tissues. The hypoxia induces EPO production in the kidney, which upregulates erythroid progenitors [46]. The indirect effect of circulating cells on progenitors' proliferation is modeled here without incorporating the EPO expression. It is assumed that HU targets only proliferating cells in the compartment. The cells from the compartment transition and go through three precursor compartments where cells do not undergo proliferation but only maturation. The precursor compartments are denoted as , , . Finally, the precursor cells become fully functional erythrocytes, denoted as . The erythrocytes or RBCs circulate in the body for ~120 days [47] and die at a rate dependent on the drug. The death rate is modeled as a function of HU because the erythrocyte halflife is dependent on HU exposure. This is due to HU increasing the lifespan of RBCs in addition to being myelosuppressive to stem cells and progenitors [34]. The model equations are given in Equation (9).  To avoid complexity, the RBC-controlled EPO production and EPO-controlled progenitors' proliferation are bypassed, and the effect of RBCs on proliferation rate k pe is directly Pharmaceutics 2022, 14, 1065 9 of 25 modeled through a negative feedback mechanism; k pe is negatively correlated to RBCs and is modeled using Hill kinetics.
where k pe,max is the maximum proliferation rate, γ e is the steepness parameter for feedback, Ψ e is the feedback parameter. To model myelosuppression in the N se compartment by HU, the model has a death rate constant, k dse , which is dependent on C p and is modeled using Hill-type kinetics as shown below: where k dse,max is the maximum death rate constant due to HU, K dse,50 is the saturation constant for the effect of HU on RBC. The cells are transferred from the stem cell to precursor to erythrocyte compartments at a rate constant, k te . Further, the death rate of RBC is assumed to be dependent on C p to model the increased lifespan of RBC due to HU. The death rate constant k de for RBC is modeled as shown below: where k de,max is the maximum death rate constant for RBC, K de,50 is the saturation constant for the drug.

MCV Model
MCV is used as a biomarker to indicate treatment efficacy. The MCV model was adapted from the work of Jayachandran et al. [44]. MCV is obtained by dividing the total volume of RBCs by the total count of RBCs, assuming every RBC has the same volume. The total volume of cells in the circulation increases due to the influx of cells from the precursor compartments in the bone marrow and the HU-induced increase in MCV. These cells have baseline MCV, V m0 , and there is an increase in MCV due to HU. The increase in MCV due to HU is assumed to be a linear function of drug concentration, αC p . The total volume of cells decreases due to the death of RBCs with the current MCV, V m . The rate of change in total volume of all the RBCs, V TOT , is given by, The MCV is derived in Appendix A and given by the following formula:

Leukopoiesis Model
The leukopoiesis process produces leukocytes that play an essential role in defending the body against foreign invasions and inflammation [48]. The process was modeled to study the effect of HU on the progenitors and precursor cells, and eventually the leukocytes. A leukopoiesis model similar to the erythropoiesis model was adapted from the work of Jayachandran et al. [44], and the schematic is shown in Figure 6. The stem cells and proliferating cells are represented as N sl . The neutrophil precursors are represented as cells in three precursor compartments denoted as N l1 , N l2 , N l3 . The ANC in the circulation is represented by the cells in the final compartment, N l . The model equations are shown below: The proliferation of leukocytes is regulated by a cytokine, granulocyte-macrophage colony-stimulating factor (GM-CSF) [49]. The proliferation rate, , is inversely proportional to neutrophil count and is given by the following equation: where , is the maximum proliferation rate constant, is the steepness parameter for feedback, is the feedback parameter. HU targets the cells in the stem cell compartment, and the death rate constant, , is modeled by Hill-type kinetics, as shown below: where , is the maximum death rate constant, , is the saturation constant for the effect of HU on ANC.

Fetal Hemoglobin Model
For HbF, the formulated model captures its production in an average RBC due to HU. The assumption here is that every RBC makes the same amount of HbF. The HbF% is highest at birth and decreases rapidly until 4-6 months after birth, after which it diminishes gradually and reaches a minimum level after a year [50]. Some individuals have unusually high HbF levels even after 1 year of age due to hereditary persistence of fetal hemoglobin (HPFH), which protects against SCD symptoms [51,52]. The individuals with HPFH condition express elevated HbF levels in the range of 10-40% [53]. The high expression of HbF level in some individuals was correlated to their haplotype [54]. The baseline The proliferation of leukocytes is regulated by a cytokine, granulocyte-macrophage colony-stimulating factor (GM-CSF) [49]. The proliferation rate, k pl , is inversely proportional to neutrophil count and is given by the following equation: where k pl,max is the maximum proliferation rate constant, γ l is the steepness parameter for feedback, Ψ l is the feedback parameter. HU targets the cells in the stem cell compartment, and the death rate constant, k dsl , is modeled by Hill-type kinetics, as shown below: where k dsl,max is the maximum death rate constant, K dsl,50 is the saturation constant for the effect of HU on ANC.

Fetal Hemoglobin Model
For HbF, the formulated model captures its production in an average RBC due to HU. The assumption here is that every RBC makes the same amount of HbF. The HbF% is highest at birth and decreases rapidly until 4-6 months after birth, after which it diminishes gradually and reaches a minimum level after a year [50]. Some individuals have unusually high HbF levels even after 1 year of age due to hereditary persistence of fetal hemoglobin (HPFH), which protects against SCD symptoms [51,52]. The individuals with HPFH condition express elevated HbF levels in the range of 10-40% [53]. The high expression of HbF level in some individuals was correlated to their haplotype [54]. The baseline HbF varied between 0-28% in the HUSTLE data. Therefore, the model includes a basal rate of production of HbF, which is independent of HU to account for subjects' inherent machinery for the production of HbF that might vary with subjects' age. Studies showed that HU metabolizes into nitric oxide (NO) and its derivatives, such as hydroxylamine, urea, nitrite, and nitrate [55,56]. The NO binds to soluble guanylate cyclase (sGC) inside the cell and activates it [36,37]. The activated sGC is known to convert guanosine triphosphate (GTP) to cyclic guanosine monophosphate (cGMP). Studies suggested the role of cGMP in HU-induced activation of HbF [36,37], as shown in Figure 7   Without going into the complexities of signaling pathways, only two components are modeled here. One is intermediate produced from HU, and the other is HbF. The exact intermediates of HU are not known, and all the possible intermediates are clubbed into one. With this hypothesis, the rates of change in intermediates and HbF with time are modeled. In the model, HU is metabolized to an intermediate represented as C i . This intermediate could be NO or its derivatives. The C i production from HU happens through Michaelis-Menten kinetics owing to the involvement of enzymes in the degradation of HU into NO [57,58], and C i is degraded (Equation (18)). The first term in the HbF equation denotes the inherent or basal rate of production of HbF, k b f , in the absence of HU (Equation (19)). The second term represents the activated rate of production of HbF in the presence of HU through the intermediate C i and is modeled using Hill kinetics. The third term denotes the degradation of HbF. The model equation is shown below: where C i is the intermediate concentration, k met is the maximum rate constant for the intermediate production from HU, K met is the Michaelis constant, k di is the degradation rate constant for C i , k b f is the basal rate of production of HbF, k a f is the maximum rate of C i -induced HbF activation, n is the Hill coefficient, K a f is the half-saturation constant, k d f is the degradation constant for HbF.

Parameter Estimation
For the parameter estimation, multiple methods, including non-linear least-squares solver and derivative-free search, were run in series. A combination of functions lsqnonlin, fmincon, patternsearch was used from MATLAB R2020b [35], and the model was solved 25 times starting from 25 initial guesses generated randomly from a uniform distribution. The final parameter set that gave the lowest cost function and with a visually good-looking fit was selected.
The cost function, which was minimized, is the weighted sum of square errors, as shown below: where subscript i represents the time index, j represents the clinical variable index. y j (t i ) is the jth clinical data at ith time point,ŷ j ( t i |θ) is the model predicted jth clinical data at ith time point given model parameters, θ. Due to the optimization of more than one clinical variable, the cost function is normalized using weights, w j . N exp is the total number of clinical time points, and N var is the total number of clinical variables. The bifurcation analysis was performed to determine the parameter bounds for the ANC model in XPPAUT [59].

PK Model
In the PK model, the AUC, AUC ∞ , AUMC ∞ , C max , λ z , T max data from every participant were used to fit the individual PK model and estimate the PK parameters: F, k tr , N t , k e . The goodness-of-fit plots are shown in Figure 8. It shows the measured versus model-estimated values for the clinical PK parameters of AUC, AUC ∞ , AUMC ∞ , C max , λ z , T max , and the goodness-of-fit was measured by R 2 . Each of the dots represents individual participant data that were modeled. For most of the clinical variables such as AUC, AUC ∞ , AUMC ∞ , C max , and T max , the model estimates matched well with the measured values of these variables, as seen from R 2 ≥ 0.75. The model could not predict well for λ z with R 2 = 0.56, as for some participants, the estimated λ z was less than the measured value. Further, Table 2 lists the statistics of estimated PK parameters. The average value of bioavailability, F, was 0.12, which was lower than the F value of 0.7, or higher as reported earlier in the HU studies conducted in cancer [25,27]. Using the estimated PK model parameters, the C p versus time plot was obtained for 87 participants, as shown in Figure 9. The C p increased and reached its peak value in 1-2 h. The drug was cleared from the body within 24 h. The generated participant PK data resembled the true PK data from the HUSTLE study [14]. As shown in Ware et al. [14], participants with fast and slow absorption profiles were also seen from the PK plots generated here.  Using the estimated PK model parameters, the versus time plot was obtained for 87 participants, as shown in Figure 9. The increased and reached its peak value in 1-2 h. The drug was cleared from the body within 24 h. The generated participant PK data resembled the true PK data from the HUSTLE study [14]. As shown in Ware et al. [14], participants with fast and slow absorption profiles were also seen from the PK plots generated here.  Once the PK model parameters were estimated, the PK model was simulated daily with the dose calculated from the pharmacy data. Since the PK model gives , it is divided by to obtain . The was calculated daily by computing the participant's weight, height, and HCT. The four PK model parameters remained constant with time, but the change in dose changed the and other clinical PK parameters. Figure 10 demonstrates PK model simulations performed every day for a representative participant with the daily dose as input. The top plot shows the everyday dose, and the middle plot shows the versus time plot where the peak concentration, changes with change in the drug input. The ̅ was computed every day, and was assumed to be constant at ̅ for the entire day, which was then plugged into the PD model to study the effect of change in drug input on the biomarker dynamics. Once the PK model parameters were estimated, the PK model was simulated daily with the dose calculated from the pharmacy data. Since the PK model gives A p , it is divided by V p to obtain C p . The V p was calculated daily by computing the participant's weight, height, and HCT. The four PK model parameters remained constant with time, but the change in dose changed the C max and other clinical PK parameters. Figure 10 demonstrates PK model simulations performed every day for a representative participant with the daily dose as input. The top plot shows the everyday dose, and the middle plot shows the C p versus time plot where the peak concentration, C max changes with change in the drug input.
The C p was computed every day, and C p was assumed to be constant at C p for the entire day, which was then plugged into the PD model to study the effect of change in drug input on the biomarker dynamics.
with the dose calculated from the pharmacy data. Since the PK model gives , it is divided by to obtain . The was calculated daily by computing the participant's weight, height, and HCT. The four PK model parameters remained constant with time, but the change in dose changed the and other clinical PK parameters. Figure 10 demonstrates PK model simulations performed every day for a representative participant with the daily dose as input. The top plot shows the everyday dose, and the middle plot shows the versus time plot where the peak concentration, changes with change in the drug input. The ̅ was computed every day, and was assumed to be constant at ̅ for the entire day, which was then plugged into the PD model to study the effect of change in drug input on the biomarker dynamics.

MCV and RBC Models
The MCV and RBC models were fit to the clinical data of 85 individuals. Initial conditions were chosen from the baseline values of the individual data. Figure 11A shows C p plot in the middle, and the drug dose in mg/kg for every day in the bottom plot. The C p plot shows an increase, followed by a decrease and then an increase again in the C p value. This change in C p is essentially a manifestation of change in drug input. The change in C p is also reflected in the MCV behavior, because the MCV rises when C p increases and then drops as C p decreases, and so on. The model captures the dynamical changes in MCV with HU treatment initiation and with changes in C p and fits well for this fully adherent participant. Figure 11B depicts a non-adherent participant as seen from the C p profile for this participant. The regions of blue block in C p , for example, from 700 to 1000 days, indicates the presence of multiple non-adherent days. There is also a drop in the MCV data between 700 to 1000 days, indicating potential non-adherence. The model fits this drop in MCV due to non-adherence when the dosing profile contains the nonadherence information. On the other hand, the MCV data between 350 and 400 days suggest non-adherence, but the dosing profile, as seen from C p , does not contain nonadherence information. As a result, the model does not fit the drop in MCV in this region. Therefore, the model mimics adherent and non-adherent participant behavior subject to the condition that the dosing profile contains the non-adherence information. Figure 12 shows the observation versus individual prediction for all participants at all time points. The participants are color-coded, where each color represents an individual. This figure shows that the model fits well to data for most participants, and the data points fall within the 10% error of y = x line for~95% of total MCV data. the model mimics adherent and non-adherent participant behavior subject to the condition that the dosing profile contains the non-adherence information. Figure 12 shows the observation versus individual prediction for all participants at all time points. The participants are color-coded, where each color represents an individual. This figure shows that the model fits well to data for most participants, and the data points fall within the 10% error of y = x line for ~95% of total MCV data.  There are two broad categories of profiles observed for RBC data. In one category, the participant's RBC decreases when HU treatment is started and stabilizes after some time. These individuals show a clear myelosuppression trend. Another category is where the RBC data fluctuate and lack a clear trend. These individuals do not show myelosuppression. So, when the RBC data indicate myelosuppression, the model mimics that trend, as seen in Figure 13A. When the RBC is fluctuating, the model is not able to capture all the points. The model fits fluctuating RBC data with a straight line or a curve, as seen in Figure 13B, where the model fit tries to pass through as many data points as possible. There are two broad categories of profiles observed for RBC data. In one category, the participant's RBC decreases when HU treatment is started and stabilizes after some time. These individuals show a clear myelosuppression trend. Another category is where the RBC data fluctuate and lack a clear trend. These individuals do not show myelosuppression. So, when the RBC data indicate myelosuppression, the model mimics that trend, as seen in Figure 13A. When the RBC is fluctuating, the model is not able to capture all the points. The model fits fluctuating RBC data with a straight line or a curve, as seen in Figure 13B, where the model fit tries to pass through as many data points as possible.
time. These individuals show a clear myelosuppression trend. Another category is where the RBC data fluctuate and lack a clear trend. These individuals do not show myelosuppression. So, when the RBC data indicate myelosuppression, the model mimics that trend, as seen in Figure 13A. When the RBC is fluctuating, the model is not able to capture all the points. The model fits fluctuating RBC data with a straight line or a curve, as seen in Figure 13B, where the model fit tries to pass through as many data points as possible. Further, the observation versus individual prediction plot for all the RBC data is displayed in Figure 14. For most of the participants, the model fit lies within 10% error of the y = x line. For ~18% of the data points, the model fits lie outside the 10% error region. Here, Further, the observation versus individual prediction plot for all the RBC data is displayed in Figure 14. For most of the participants, the model fit lies within 10% error of the y = x line. For~18% of the data points, the model fits lie outside the 10% error region.
Here, for participants with clear myelosuppression trends, the model fits well to the data. For fluctuating RBC, the model does not capture the trends in data well. Table 3 gives the RBC and MCV model estimated parameter statistics. for participants with clear myelosuppression trends, the model fits well to the data. For fluctuating RBC, the model does not capture the trends in data well. Table 3 gives the RBC and MCV model estimated parameter statistics.

ANC Model
The ANC model was fit to the clinical data of 83 individuals, omitting two participants due to insufficient data points. The ANC of individuals with SCD is usually elevated, as leukocytes are recruited to adhere to the vessel wall and play a role in vasoocclusion [60]. In Figure 15A, the myelosuppression effect is evident, where ANC decreases from 8000 cells/µL and reaches a steady state between 2000-4000 cells/µL, which is the ideal target range for ANC. The above is an example of an adherent participant, as apparent from the C p profile here, and the model fits the data well in this case. Figure 15B Table 4 lists the statistics for the leukopoiesis model parameter estimates.   Table 4 lists the statistics for the leukopoiesis model parameter estimates.
subjects. Many data points, ~80%, fall outside the 10% error for individual predic because the neutrophil data are highly fluctuating. The model fits well to the data w clear myelosuppression trends in ANC are observed. However, the model perfo poorly when there are high ANC fluctuations. Table 4 lists the statistics for the leuko esis model parameter estimates.

HbF Model
The HbF model was fit to 81 individual participants' data, leaving 4 individuals out due to insufficient data points. The HbF model performance for the HbF participant data is demonstrated in Figure 17. The participant in Figure 17A is adherent, as seen from the C p profile where the participant is not missing a dose. The C p increases and decreases and then rises again. The initial increase in HbF is due to HU treatment initiation, and further changes in HbF follow a trend similar to that of the C p . The model fits this individual very well. The participant in Figure 17B is non-adherent at times, also shown in Figure 11B. Similarly to the MCV profile, the drop in the HbF was captured when the dose input contained the missing dose information. So, the HbF model can fit adherent and non-adherent participants conditionally, given that the dosing profile accurately describes non-adherence. and then rises again. The initial increase in HbF is due to HU treatment initiation, and further changes in HbF follow a trend similar to that of the ̅ . The model fits this individual very well. The participant in Figure 17B is non-adherent at times, also shown in Figure 11B. Similarly to the MCV profile, the drop in the HbF was captured when the dose input contained the missing dose information. So, the HbF model can fit adherent and non-adherent participants conditionally, given that the dosing profile accurately describes non-adherence. In Figure 18, it is seen that many data points fall outside the 10% error for individual predictions, with ~54% of data points outside this region. For some participants with data points outside the 10% error region, the individual predictions were higher than the observed values, indicating overprediction. This might happen when the participant starts missing the dose after HbF reaches its maximum saturation value. The lower clinical value of HbF indicates that the participant might have missed the dose. Still, if the dosing information does not contain those missing doses, the model will predict a higher level of HbF. Moreover, for many participants, the number of data points for HbF was scarce and lower than the number of data points for MCV and other variables. The scarcity of HbF data can lead to the model not representing the non-adherence in the dosing profile well. Table 5 summarizes the parameter estimates for the HbF model. In Figure 18, it is seen that many data points fall outside the 10% error for individual predictions, with~54% of data points outside this region. For some participants with data points outside the 10% error region, the individual predictions were higher than the observed values, indicating overprediction. This might happen when the participant starts missing the dose after HbF reaches its maximum saturation value. The lower clinical value of HbF indicates that the participant might have missed the dose. Still, if the dosing information does not contain those missing doses, the model will predict a higher level of HbF. Moreover, for many participants, the number of data points for HbF was scarce and lower than the number of data points for MCV and other variables. The scarcity of HbF data can lead to the model not representing the non-adherence in the dosing profile well. Table 5 summarizes the parameter estimates for the HbF model.

Discussion
During HU treatment, while HU is cleared from the body within 24 h, it takes weeks to see its effect on the biomarker levels. Existing models have focused mainly on predicting the HU PK and optimizing the dose based on the PK parameters; only a few studies have explored the relationship between PK and PD models. In this work, the focus was to build an integrated model to explain the substantial variability in the PK-PD of individuals with SCD receiving HU treatment. Our integrated PK-PD model can be used to quantitatively describe the treatment mechanism and be applied for planning dosing regimens.
The PK model gives reasonable accuracy by calculating and fitting the clinical PK parameter values obtained from NCA. Since the data that are matched are AUC, AUC ∞ , which are integrated values of C p over time, it is possible that even when the model is able to match AUC and other integrated clinical PK parameters, the model might not exactly replicate the actual time course of C p . This can cause model identifiability issues, as multiple parameter sets can estimate clinical PK parameters close to the data. In this case, matching C max and T max helps in making sure that in the model, the peak concentration occurs at the exact timepoint and value as the data provided. Therefore, having C p vs. time data would help us in improving PK model accuracy. Another challenge was that the everyday dosing information was not available. The daily dose was computed from the pharmacy records available for individual participants. While calculating everyday dose, it was assumed that the total dose remained constant in between visits. The C p was determined daily by simulating the PK model with the computed daily dose. This approach has limitations in that the PK model assumes the parameter to remain constant with the participant's age or with changes in other variables. However, the fact that these participants were pediatric made it more complicated, as they underwent several physical changes such as changes in height, weight, or physiological or anatomical changes, thus leading to the possibility that the absorption, distribution, metabolism, and excretion (ADME) of the drug might change. The change in ADME will dictate the change in PK model parameters.
Among the various PD clinical variables that were modeled, the MCV model performed well for most participants because of low variability. The MCV model was adapted from the work of Jayachandran et al., where the drug 6-mercaptopurine, similar to HU, increased MCV levels after initiation [44]. The HbF model performed well for those participants whose dosing information was likely accurate. The model fit well for participants who fell into the two categories: adherent participants and non-adherent participants, where non-adherence was seen in both the drug input and the data. However, when the drug input does not contain non-adherence information, but the data for HbF or MCV indicate non-adherence, a discrepancy occurs between the model and the data. Overall, these models help in describing the mechanism of HU in SCD. Further, some participants received blood transfusions. Separating such cases from non-adherence and incorporating and modeling blood transfusion will help improve the fit for the individuals when they receive a transfusion.
In the HbF model, one of the assumptions is that the basal rate of HbF production remains constant with age, but this need not be the case, especially if the participant is starting on HU before 1 year of age. Considering the dependency of basal HbF production rate on age might help improve the fits for participants under 1 year of age when they are initiated on HU treatment. Certain participants with a higher basal rate of production of HbF might indicate the presence of HPFH. Another assumption of HbF is that every RBC makes the same amount of HbF, but it is seen that certain RBCs produce a higher amount of HbF than others. Few studies have captured the distribution of F cell percentage [61]. A similar assumption was made for the MCV model, where every RBC was assumed to have the same cell volume.
The RBC and ANC data showed either a clear myelosuppression trend or a lack of any trend. When there was a clear myelosuppression trend, the model performed reasonably well. In contrast, when there was a lack of a clear trend, the model fit the data with a periodic solution if the data exhibited some periodicity. The model matched the data with a straight line or curve if the data did not show any periodicity. The ANC data of participants fluctuated. There are various reasons for neutrophils to demonstrate this behavior. The fluctuations in ANC can be due to disease-related complications and treatment-related issues as well as regular physiological changes. Additionally, the increase in ANC can be due to non-adherence and common infections such as cold, cough, fever, and flu. The model presented in this work did not consider such fluctuations, so it could not capture the participants who showed these fluctuations.
The models developed here could pave the path for individualized treatment of individuals affected with SCD quantitatively, which could help save time and effort for clinicians as well as participants. The models formulated in this work could be used to determine the individual trajectory of key biomarkers as well as keep the blood cell counts within the target range and determine the optimal dose in a short time span compared to the time spent in the clinic to determine the MTD. The multiple responses of individuals with SCD demand a thorough analysis and monitoring of participants' biomarkers, blood cell counts, and metabolites. When a patient is unresponsive, the interesting thing to explore will be whether the treatment is not very effective due to PK-related effects such as the lower activity of the transporting proteins or PD-related effects such as lower HbF synthesis. There appears to be a need to track non-adherence more rigorously so that model predictions can more closely correlate with clinical measurements. of Hematology (ASH) and National Heart, Lung, and Blood Institute (NHLBI). The other authors declare no conflicts of interest.

Mean Cell Fetal Hemoglobin Calculation
There are two options to compute F m :

1.
F m (pg) is calculated by multiplying HbF (%) by mean cell hemoglobin, MCH (pg), as shown below: 2. F m (pg) is also calculated by multiplying HbF (%) by hemoglobin, Hb (g/dL), dividing by hematocrit, HCT (%), and multiplying by MCV (fL). In this transit compartment model, a simple mass balance equation describing the rate of change in drug in different compartments is set up using mass-action kinetics. The drug amount in different compartments is given by [29,39] where ( + 1) is the number of transit compartments, and are the drug amounts in the 0 and th compartments, respectively, is the transit rate constant, is the drug bioavailability, is the drug input. Solving Equation (A3) analytically, the drug amount in the final compartment in the gut, , is given by,

. Average Drug Concentration Calculation
The ̅ is calculated using the formula below: Figure A1. Schematic of hydroxyurea pharmacokinetic model.
In this transit compartment model, a simple mass balance equation describing the rate of change in drug in different compartments is set up using mass-action kinetics. The drug amount in different compartments is given by [29,39], da 0 dt = −k tr a 0 ; a 0 (t = 0) = FD da i dt = k tr (a i − a i−1 ); a i (t = 0) = 0; f or i = {1, 2, .., N t } (A3) where (N t + 1) is the number of transit compartments, a 0 and a i are the drug amounts in the 0 and ith compartments, respectively, k tr is the transit rate constant, F is the drug bioavailability, D is the drug input. Solving Equation (A3) analytically, the drug amount in the final compartment in the gut, a N t , is given by, a N t = FD (k tr t) N t e −k tr t N t ! (A4) The C p is calculated using the formula below:

MCV Calculation
The MCV is given by, The rate of change in MCV is obtained from the above equation as follows: After substituting Equations (9) and (13) in (A7), the rate of change in MCV is given by,