Temporal, Location- and Symptom-Specific Likelihood of Patient-Reported Sensory Symptoms Related to Oxaliplatin-Induced Peripheral Neuropathy (OIPN) in Patients Receiving Oxaliplatin for Three Months

Simple Summary
There are no known effective preventative interventions for oxaliplatin-induced peripheral neuropathy (OIPN) sensory symptoms of numbness, tingling and pain other than limiting drug exposure. With the shorter 3-month duration of oxaliplatin increasingly being used, compared to the previous 6-month standard, we were motivated to quantify the temporal, location- and symptom-specific likelihood of patient-reported sensory symptoms related to OIPN in 141 patients from the placebo arms of two multisite OIPN prevention trials exposed to oxaliplatin for 3 months. Despite a shorter duration of oxaliplatin, we show that OIPN was still pervasive, with patients experiencing considerable mild to moderate numbness and tingling in the lower and upper distal extremities. To avoid the debilitating sequelae from OIPN and to ensure that patients continue to receive the most efficacious doses of oxaliplatin, identification of effective OIPN preventative interventions is still needed, regardless of whether oxaliplatin is planned to be given for 3 versus 6 months.

Abstract
While oxaliplatin-induced peripheral neuropathy (OIPN) is more common and severe in patients who receive the previous standard, 6-month oxaliplatin-based treatment, we hypothesized that OIPN was still pervasive in patients who received shorter, 3-month treatment regimens. Using six EORTC QLQ-CIPN20 questions that quantify numbness (N), tingling (T) and shooting/burning pain (P) in upper/lower distal extremities, we aimed to quantify patient-reported responses over 3 months (6 cycles) of oxaliplatin regarding symptom-specific timing, location and severity. For each question, patients were asked how each of the sensory symptoms had affected them during the preceding week, with 1 = "Not at all", 2 = "A little", 3 = "Quite a bit" and 4 = "Very much". A proportional odds model for the cumulative log odds of response, which allowed symptom-specific patient heterogeneity to be estimated, was applied to a pooled dataset from the placebo arms of two multisite OIPN prevention trials and fit separately to the upper/lower distal extremities. For each symptom, we report the cycle-specific marginal probabilities for each response. In 141 patients, substantial patient heterogeneity in the likelihood, at a given cycle, of a more severe response for a symptom was present. Distinct patterns in the probabilities for each response over time for N and T were observed between the upper/lower distal extremities, while the probabilities of a response >1 for P were largely negligible in both locations. Despite the decrease in exposure to oxaliplatin from 6 to 3 months, OIPN was still pervasive, with patients experiencing considerable N and T in the fingers (or hands) and toes (or feet).


Introduction
Oxaliplatin is a key component of the FOLFOX (85 mg/m² every two weeks) and CAPOX (130 mg/m² every three weeks) multi-drug regimens that are used to treat colorectal cancers. However, neuropathy is a common and troublesome long-term side effect of oxaliplatin. In total, 85-95% of patients receiving oxaliplatin experience oxaliplatin-induced peripheral neuropathy (OIPN) [1]. No known effective agents are recommended for the prevention of OIPN. OIPN has numerous characteristics, including numbness (N), tingling (T) and shooting/burning pain (P), which spread proximally to affect both lower and upper extremities.
The results of the SCOT trial [2], which is the largest single randomized study regarding the adjuvant treatment of colorectal cancer, recently demonstrated the non-inferiority of a shorter duration (3 months as opposed to 6 months) of either of the two adjuvant chemotherapy regimens for patients with high-risk stage II and stage III cancer of the colon and rectum. This finding was consistent with the meta-analysis of six worldwide studies conducted by the IDEA collaboration and, consequently, the recommended duration of adjuvant FOLFOX and CAPOX treatment for colorectal cancer has recently changed from 6 months to 3 months, particularly for those low-risk patients with T1-3, N1 disease [3,4]. Further, the SCOT trial reported that patients, both high- and low-risk, treated with adjuvant therapy for the shorter 3-month duration had substantially lower rates of neuropathy (N, T and P) in their hands and feet. These data support the use of shorter 3-month oxaliplatin treatment courses.
Despite the decrease in total cycles and cumulative oxaliplatin dosage, some patients still experience dose-limiting OIPN. In our Ca/Mg neuropathy prevention trial, for instance, where 94% of the 353 enrolled patients presented with T1-3 disease and received 6 months (12 cycles) of FOLFOX, the median cycle number when patients experienced dose-limiting OIPN was cycle 6 (3 months) and only about 60% were able to receive full-dose oxaliplatin by this time point; this serves as an indirect measure of OIPN, as most patients who stop receiving full-dose oxaliplatin do so due to OIPN [5]. Because the cumulative doses of oxaliplatin are similar between the 3-month FOLFOX regimen (85 mg/m² × 6 doses) and the multi-drug 3-month CAPOX regimen (130 mg/m² × 4 doses), they are expected to have a similar effect on dose-limiting OIPN. From these data, we hypothesized that, while OIPN is more common and severe in patients who receive 6-month oxaliplatin-based treatment, OIPN is still pervasive in patients who receive shorter 3-month treatment regimens.
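As a quick check of the statement that the two 3-month regimens deliver similar cumulative doses, the per-dose amounts and dose counts quoted above can be multiplied out. The symbols $D_{\mathrm{FOLFOX}}$ and $D_{\mathrm{CAPOX}}$ are introduced here only for illustration, and the calculation assumes planned doses with no reductions or delays:

$$
D_{\mathrm{FOLFOX}} = 85\ \mathrm{mg/m^2} \times 6 = 510\ \mathrm{mg/m^2}, \qquad
D_{\mathrm{CAPOX}} = 130\ \mathrm{mg/m^2} \times 4 = 520\ \mathrm{mg/m^2}.
$$

Under these assumptions, the planned cumulative doses differ by about 2%.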
The aim of the current research study, therefore, is two-fold. First, using pooled data from the placebo arms of two completed OIPN prevention clinical trials of patients receiving FOLFOX for colon cancer, our aim is to develop and evaluate a statistical model that is compatible with the complex clinical course of OIPN during 3 months (6 cycles) of oxaliplatin to quantify the likelihood of symptom-specific timing, location and severity of patient-reported responses of N, T and P in the upper/lower distal extremities. Second, our aim is to illustrate how this information could directly inform the study design of subsequent placebo-controlled OIPN symptom intervention trials, whereby patients receive the shorter 3-month duration (6 cycles) of oxaliplatin.
To address these two aims, this article is organized as follows: In Section 2, we describe the data used to support the first aim. Herein, we define the statistical model and detail the analytic approach. Then, we present the results, in Section 3, obtained by applying the statistical model separately to the upper/lower distal extremities. In Section 4, we address the second aim. Herein, we illustrate how the results obtained from addressing the first aim can be used to design a two-arm, randomized, placebo-controlled OIPN study to determine whether an experimental intervention is effective at preventing OIPN compared with placebo in patients receiving the shorter, 3-month duration (6 cycles) of oxaliplatin. In Section 5, we conclude our presentation by summarizing our research study and discussing additional points of consideration in the design and analysis of such trials.

Data
The current study uses data from two completed randomized double-blind placebo-controlled trials conducted to evaluate potential therapy for prevention of oxaliplatin-induced neuropathy (North Central Cancer Treatment Group [NCCTG] N08CB; MC11C4) [5,6]. In MC11C4, 50 patients, scheduled to undergo oxaliplatin-based therapy (FOLFOX) for stages II-III (67%) or stage IV (33%) colon cancer, were randomized to receive venlafaxine or placebo through their last dose of oxaliplatin. In NCCTG N08CB, 353 patients with stages II-III (94%) or stage IV (6%) colon cancer undergoing adjuvant therapy with FOLFOX were randomly assigned to intravenous calcium/magnesium before and after oxaliplatin, a placebo before and after, or calcium/magnesium before and placebo after. Neither study supported the use of the interventional agents for preventing OIPN in clinical practice. Because aim 1 of the current research study is to quantify the temporal, location- and symptom-specific likelihood of patient-reported sensory symptoms related to OIPN in the natural history setting during the first 3 months (6 cycles) of oxaliplatin exposure, we focused exclusively on the placebo arms in studies MC11C4 and NCCTG N08CB and on the first 3 months (6 cycles) of FOLFOX treatment. It is important to note that all patients included in this analysis received oxaliplatin for treatment of colorectal cancer while receiving placebo for prevention of OIPN.
Patient-reported responses from the 6 European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Questionnaire-Chemotherapy-Induced Peripheral Neuropathy (QLQ-CIPN20) questions that quantify N, T and P in upper/lower distal extremities (Table S1; Q31-36) were recorded on day 1 of each 2-week cycle prior to oxaliplatin treatment (cycles 1-6), as well as 2 weeks after the 6th cycle (day 1 of cycle 7, prior to oxaliplatin treatment) in the placebo arms of the two recently completed OIPN prevention trials. This means that cycle 1 served as a baseline (prior to oxaliplatin exposure), while patient-reported responses measured at cycles 2-6 and at 2 weeks post-cycle 6 served as the post-baseline measurements. For each question, patients were asked how each of the sensory symptoms had affected them during the preceding week, with 1 = "Not at all", 2 = "A little", 3 = "Quite a bit" and 4 = "Very much". For a single patient, the format of the ordinal data in the fingers (or hands) is shown in Table 1A. For the 3 EORTC QLQ-CIPN20 questions that quantify N, T and P in the toes (or feet), the format of the ordinal data is the same.
Table 1B. Summary of notation.
$y_{its}$: Response for patient $i$ at cycle $t$ and symptom $s$; 1 = "Not at all", 2 = "A little", 3 = "Quite a bit", 4 = "Very much". Due to data sparsity, response outcomes of 3-4 ("Quite a bit" and "Very much") were combined such that there were $K = 3$ response categories and $K - 1 = 2$ cumulative logits.
$T2, \ldots, T7$: Indicator variables for cycles, $T2 = I(t = 2)$, $T3 = I(t = 3)$, $T4 = I(t = 4)$, $T5 = I(t = 5)$, $T6 = I(t = 6)$ and $T7 = I(t = 7)$ (two weeks post-cycle 6), such that cycle 1 ($t = 1$) was the reference category.
$s$: Symptom; 1 = N (numbness), 2 = T (tingling), 3 = P (shooting/burning pain). Indicator variables $N = I(s = 1)$ and $T = I(s = 2)$, such that P ($s = 3$) was the reference category.
$(u_{i1}, u_{i2}, u_{i3})$: Multivariate random effect that describes patient heterogeneity for N, T and P. The set $\{(u_{i1}, u_{i2}, u_{i3})\}$ was assumed to be independent draws from a multivariate normal distribution, $N(0, \Sigma)$, with possibly different variances and nonzero correlations.
$\sigma^2_1, \sigma^2_2, \sigma^2_3$: The larger $\sigma^2_1$, $\sigma^2_2$ or $\sigma^2_3$, the more heterogeneous the likelihood, at a given cycle, of a more severe response for that symptom across patients.
$\rho_{12}, \rho_{13}, \rho_{23}$: Correlations between the random effects: $\rho_{12}$ for $\{u_{i1}, u_{i2}\}$, $\rho_{13}$ for $\{u_{i1}, u_{i3}\}$ and $\rho_{23}$ for $\{u_{i2}, u_{i3}\}$. These allowed the random effects for each symptom to be correlated and the degree of correlation to be different.
Note. Here, $y_{its}$ represents the response at cycle $t$ on symptom $s$ for patient $i$ (summary of notation is further detailed in Table 1B). The format of the ordinal data in the toes (or feet) is the same.

Model
Wolf and colleagues (2012) [7] recommended that symptoms in the lower extremities should be modeled distinctly from those in the upper extremities. Following this recommendation, two separate models were assumed, one for fingers (or hands) (Model 1) and one for the toes (or feet) (Model 2). Because interest lay in characterizing effects that referred both to the 4 individual response categories and to groupings of response categories, a proportional odds model for the cumulative log odds of response, which allowed symptom-specific patient heterogeneity to be estimated, was applied to the pooled dataset from the placebo arms of the two OIPN prevention trials and fit separately to the distal extremities. Due to data sparsity, however, response outcomes of 3-4 ("Quite a bit" and "Very much") were combined, such that there were K = 3 response categories and K − 1 = 2 cumulative logits. Using the notation detailed in Table 1B, we can express Model 1 (and similarly for Model 2) as the log odds of being greater than or equal to a particular response category and we can write

$$
\begin{aligned}
\operatorname{logit}\left[P(y_{its} \ge k \mid u_{is})\right] &= \log \frac{P(y_{its} \ge k)}{1 - P(y_{its} \ge k)} \\
&= \alpha_k + \beta_1 N + \beta_2 T + \beta_3 T2 + \beta_4 T3 + \beta_5 T4 + \beta_6 T5 + \beta_7 T6 + \beta_8 T7 \\
&\quad + \beta_9 N \cdot T2 + \beta_{10} N \cdot T3 + \beta_{11} N \cdot T4 + \beta_{12} N \cdot T5 + \beta_{13} N \cdot T6 + \beta_{14} N \cdot T7 \\
&\quad + \beta_{15} T \cdot T2 + \beta_{16} T \cdot T3 + \beta_{17} T \cdot T4 + \beta_{18} T \cdot T5 + \beta_{19} T \cdot T6 + \beta_{20} T \cdot T7 \\
&\quad + u_{is}, \qquad k = 2, 3.
\end{aligned}
$$
In this generalized linear mixed model for the cumulative logits, $\log \frac{P(y_{its} \ge k)}{1 - P(y_{its} \ge k)}$ describes the log odds of two cumulative probabilities and measures how likely the response is to be in category k or higher versus in a category lower than k. Specifically, one of the two cumulative logits describes the log odds of responding "Quite a bit"/"Very much" (y = 3) versus "Not at all" or "A little" (y ∈ {1, 2}), while the other cumulative logit describes the log odds of responding "A little" or "Quite a bit"/"Very much" (y ∈ {2, 3}) versus "Not at all" (y = 1). Because of the proportional odds assumption, the effect, as measured by the difference in log odds, is constrained to be the same for each cumulative logit (y = 3 vs. y ∈ {1, 2} and y ∈ {2, 3} vs. y = 1). Notably, the model has patient effects for each symptom. The patient effects $(u_{i1}, u_{i2}, u_{i3})$ form a multivariate random effect that describes patient heterogeneity for N, T and P.
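To make the link between the two cumulative logits and the three response categories concrete, the following minimal Python sketch converts a pair of cumulative log odds into category probabilities. The intercepts and the linear predictor `eta` (standing in for the fixed effects plus the patient effect $u_{is}$) are hypothetical values chosen for illustration, not estimates from the trials.

```python
import math

def expit(x):
    """Inverse logit: maps log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def category_probs(alpha2, alpha3, eta):
    """Return (P(y=1), P(y=2), P(y=3)) when logit P(y >= k) = alpha_k + eta.

    The same eta shifts both cumulative logits, which is the
    proportional odds constraint. alpha3 must not exceed alpha2,
    otherwise P(y >= 3) would be larger than P(y >= 2).
    """
    if alpha3 > alpha2:
        raise ValueError("alpha3 must be <= alpha2")
    p_ge2 = expit(alpha2 + eta)  # "A little" or worse
    p_ge3 = expit(alpha3 + eta)  # "Quite a bit"/"Very much"
    return (1.0 - p_ge2, p_ge2 - p_ge3, p_ge3)

# Hypothetical intercepts and effects, for illustration only.
baseline = category_probs(alpha2=-2.0, alpha3=-4.0, eta=0.0)
later = category_probs(alpha2=-2.0, alpha3=-4.0, eta=1.5)
print(baseline, later)
```

Because `eta` enters both logits identically, a positive shift lowers the probability of "Not at all" and raises the probability of the most severe category by the same amount on the log odds scale.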

Analysis
To evaluate the appropriateness of the proportional odds assumption in Models 1 and 2, we also fit the corresponding non-proportional odds model and applied the Bayesian information criterion (BIC) to confirm that the BIC expressed a preference for the more parsimonious proportional odds model (data not shown). Further, the BIC was applied to evaluate the variance-covariance structure of each model.
Multiple imputation was applied to handle missing responses [8]. We know that oxaliplatin dose (mg) is both predictive of the probability of missingness and predictive of the responses. However, we had no interest in making inference on the oxaliplatin dose or conditional upon the oxaliplatin dose; in other words, oxaliplatin dose was treated as an extraneous variable, but incorporated into the imputation model because the inclusion of an extraneous variable that is highly correlated with the response greatly improves the imputations. Given the extraneous variable, the missing-at-random assumption became defensible; therefore, multiple imputation was applied to handle missing responses and the multiple imputation model included the extraneous variable, but, otherwise, was compatible with the analysis model.
For each symptom, we report the estimated difference in log odds of a more severe response between each cycle (cycles 2-6 and, at 2 weeks, post-cycle 6) and cycle 1 (day 1 of cycle 1, prior to oxaliplatin treatment) along with the corresponding 95% confidence interval (CI). Additionally, for each symptom, we report the cycle-specific marginal or population-averaged probabilities (averaged over patients) for each response outcome and the corresponding 95% CI.
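The distinction between patient-specific and population-averaged probabilities can be illustrated with a short sketch that averages the conditional category probabilities over a normally distributed random intercept. This is only an illustration under stated assumptions: it uses a plain midpoint-rule integration rather than the adaptive Gauss-Hermite quadrature used for model fitting, a single random intercept, and hypothetical intercepts and linear predictor. The variance 4.75 is the value reported for the random intercept in Model 2.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def normal_pdf(u, sigma):
    """Density of N(0, sigma^2) at u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def marginal_category_probs(alpha2, alpha3, eta, sigma, n_steps=2000, width=8.0):
    """Average P(y=1), P(y=2), P(y=3) over u ~ N(0, sigma^2), sigma > 0.

    Midpoint rule on [-width*sigma, width*sigma]; the conditional model
    is logit P(y >= k | u) = alpha_k + eta + u.
    """
    lo = -width * sigma
    step = 2.0 * width * sigma / n_steps
    p_ge2 = 0.0
    p_ge3 = 0.0
    for j in range(n_steps):
        u = lo + (j + 0.5) * step
        weight = normal_pdf(u, sigma) * step
        p_ge2 += weight * expit(alpha2 + eta + u)
        p_ge3 += weight * expit(alpha3 + eta + u)
    return (1.0 - p_ge2, p_ge2 - p_ge3, p_ge3)

sigma = math.sqrt(4.75)  # random-intercept SD implied by the Model 2 variance
# Hypothetical intercepts and linear predictor, for illustration only.
marginal = marginal_category_probs(alpha2=-2.0, alpha3=-4.0, eta=4.0, sigma=sigma)
conditional_p_ge2 = expit(-2.0 + 4.0)  # probability for a patient with u = 0
print(marginal, conditional_p_ge2)
```

With a large random-effect variance, the population-averaged probability is pulled toward 0.5 relative to the probability for a "typical" patient with u = 0, which is why the two summaries should not be read interchangeably.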
The SAS procedure PROC GLIMMIX was used to fit both Model 1 and Model 2 using adaptive Gauss-Hermite quadrature to approximate the likelihood function in obtaining maximum likelihood estimates. The number of quadrature points was chosen based on monitoring the convergence of estimates and standard errors. The SAS procedure PROC MI was used to perform multiple imputation with 50 imputed datasets assuming an arbitrary missing-data pattern with the fully conditional specification method.
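The text does not state how estimates from the 50 imputed datasets were combined. A common choice is Rubin's rules, sketched below in Python as an assumption rather than a description of the authors' procedure. The function name `pool_rubin` and the example numbers are hypothetical.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    q_bar = mean(estimates)         # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, math.sqrt(total)

# Hypothetical log-odds estimates from three imputed datasets.
pooled_estimate, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled_estimate, pooled_se)
```

The pooled standard error exceeds the average within-imputation standard error whenever the estimates differ across imputations, which reflects the extra uncertainty due to the missing responses.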

Results
One hundred forty-one patients scheduled to undergo oxaliplatin-based therapy (FOLFOX) for stages II-III (87%) or stage IV (13%) colon cancer were randomized to the placebo arm in OIPN prevention trials NCCTG N08CB (N = 118) and MC11C4 (N = 23) and were thus included in these analyses. Among the 141 patients analyzed, the median age (range indicated in square brackets) in years was 56.0 [28.0, 83.0], 51% were female and 89% were Caucasian. On day 1 of cycle 1, prior to oxaliplatin treatment, the number (%) of patients who reported "Not at all" (y = 1) in both the upper and lower extremities for N, T and P was 121 (88%), 121 (88%) and 134 (97%), respectively. Two weeks after cycle 6, the number (%) of patients who reported "Not at all" (y = 1) in both the upper and lower extremities for N, T and P was 54 (47%), 28 (24%) and 89 (77%), respectively.

Model 1-Fingers (or Hands)
At a given cycle, substantial patient heterogeneity in the likelihood of a more severe response for a symptom was present and the degree of heterogeneity differed considerably between symptoms, with the largest degree of heterogeneity seen with P. The estimated variance components $\hat{\sigma}^2_1$ for $\{u_{i1}\}$, $\hat{\sigma}^2_2$ for $\{u_{i2}\}$ and $\hat{\sigma}^2_3$ for $\{u_{i3}\}$ corresponding to N, T and P were 4.55, 3.18 and 8.92, respectively; the estimated standard deviations were 2.13, 1.78 and 2.99. The large degree of heterogeneity implies tremendous between-patient variability in the log odds of a more severe response at a given post-oxaliplatin cycle, compared with cycle 1. Further, concerning the estimated correlation, patient effects $(u_{i1}, u_{i2}, u_{i3})$ were highly correlated ($\hat{\rho}_{NT} = 0.86$, $\hat{\rho}_{NP} = 0.75$, $\hat{\rho}_{TP} = 0.80$), with the largest correlation being between N and T. Figure 1 (LEFT) graphically displays the estimated log odds of a more severe response at cycles 2, 3, 4, 5 and 6, as well as 2 weeks post-cycle 6, compared with cycle 1 for N, T and P. For N, the estimated log odds compared with cycle 1 increased in a somewhat linear fashion through 2 weeks post-cycle 6; however, the effect was considerably less marked than for T. For T, the estimated log odds compared with cycle 1 peaked at cycle 4 and then plateaued. Due to the large degree of patient heterogeneity seen with P, the confidence intervals for the estimated log odds compared with cycle 1 were so wide that they precluded any meaningful interpretation.
Distinct patterns in the population-averaged probabilities (averaged over patients) for each response over time for N and T were observed (Figure 2A (LEFT) and Figure 2B (LEFT)). Although the probability Pr(Y = 1 "Not at all") decreased over time for N, the probability Pr(Y = 2 "A little") increased but never exceeded the probability Pr(Y = 1 "Not at all") following 6 cycles. The probabilities Pr(Y = 1 "Not at all") and Pr(Y = 2 "A little") for N 2 weeks after cycle 6 were 0.535 (95% CI: 0.397, 0.674) and 0.422 (95% CI: 0.305, 0.539), respectively. Further, the probability Pr(Y = 3/4 "Quite a bit"/"Very Much") for N never exceeded 0.10. Following 6 cycles of oxaliplatin, the probability Pr(Y = 3/4 "Quite a bit"/"Very Much") for N was 0.043 (95% CI: 0.019, 0.066). For T, however, there was a marked decrease in the probability Pr(Y = 1 "Not at all"), which leveled off at a probability of approximately 0.20 by cycle 4. That marked decrease was accompanied by a corresponding marked increase in the probability Pr(Y = 2 "A little"), which leveled off at a probability of approximately 0.60 by cycle 4. The probabilities Pr(Y = 1 "Not at all") and Pr(Y = 2 "A little") for T, 2 weeks after cycle 6, were 0.176 (95% CI: 0.102, 0.250) and 0.631 (95% CI: 0.598, 0.664), respectively. Additionally, the probability Pr(Y = 3/4 "Quite a bit"/"Very Much") for T increased to approximately 0.20 by cycle 4 and then stabilized. Following 6 cycles of oxaliplatin, the probability Pr(Y = 3/4 "Quite a bit"/"Very Much") for T was 0.193 (95% CI: 0.114, 0.272).

Model 2-Toes (or Feet)
The BIC expressed a preference for a model with a single patient-level random intercept $u_i$. The estimated variance of the random intercept was relatively large, $\hat{\sigma}^2_u = 4.75$, indicating that, at a given cycle, patients were highly heterogeneous in the likelihood of reporting a more severe response. Figure 1 (RIGHT) graphically displays the estimated log odds of a more severe response at cycles 2, 3, 4, 5 and 6, as well as 2 weeks after cycle 6, compared with cycle 1 for N, T and P. Compared with cycle 1, the estimated log odds generally showed a steady increase through 2 weeks post-cycle 6 for N and T. As seen with the fingers (or hands), the estimated log odds compared with cycle 1 for P were too imprecise to discern a pattern.
As with the fingers (or hands), distinct patterns in the population-averaged probabilities (averaged over patients) for each response over time for N and T were also observed in the toes (or feet) (Figure 2A (RIGHT) and Figure 2B (RIGHT)). The probability Pr(Y = 1 "Not at all") decreased only modestly over time for N, such that, 2 weeks after cycle 6, the probability Pr(Y = 1 "Not at all") was 0.680 (95% CI: 0.549, 0.811). The probability Pr(Y = 2 "A little") for N increased only modestly, such that, 2 weeks after cycle 6, the probability Pr(Y = 2 "A little") was 0.286 (95% CI: 0.174, 0.398). The probability Pr(Y = 3/4 "Quite a bit"/"Very Much") for N never exceeded 0.05. Following 6 cycles of oxaliplatin, the probability Pr(Y = 3/4 "Quite a bit"/"Very Much") for N was 0.034 (95% CI: 0.013, 0.055).

Study Design
The proportional odds model fit these data well. To address aim 2, let us suppose now that we are interested in designing a two-arm, randomized, placebo-controlled OIPN study to determine whether an experimental intervention is effective at preventing OIPN compared with placebo in patients receiving 6 cycles of oxaliplatin. Here, the primary endpoint is the serially measured sensory scores for N, T and P in upper/lower extremities self-reported by the patient on day one of each cycle prior to oxaliplatin treatment (cycles 1-6) and then 2 weeks post-cycle 6. For the primary analysis, the same proportional odds model used in the data analysis of the pooled placebo arms (NCCTG N08CB; MC11C4) is adopted, one for the fingers (or hands) and one for the toes (or feet).
Because there are two cumulative logits (K − 1 = 2), the model has two intercepts, $\alpha_2$ and $\alpha_3$. Because of randomization, it is reasonable to assume that patients in the two arms have the same expected scores at baseline for all three symptoms; therefore, we assume the parameters $\beta_{21} = \beta_{22} = \beta_{23} = 0$. The 18 parameters from $\beta_{24}$ to $\beta_{41}$ allow the patterns of change from baseline in the odds of being greater than or equal to a particular response category for N, T and P to differ between the two arms (i.e., these parameters represent the arm-by-time interaction). The null hypothesis of no arm-by-time interaction versus the alternative hypothesis can then be expressed as $H_0: \beta_{24} = \beta_{25} = \cdots = \beta_{41} = 0$ versus $H_a$: at least one of these parameters is not equal to zero.
We compare the Wald statistic to a χ 2 distribution with 18 degrees of freedom, which equals the number of parameters being tested.
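The comparison of a Wald statistic with a $\chi^2$ distribution on 18 degrees of freedom can be sketched as follows. The function below uses the closed-form upper-tail probability that holds for even degrees of freedom, and the Wald statistic value is hypothetical; in practice it would come from the fitted model.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2m the tail is exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # j = 0 term of the sum
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

wald_statistic = 35.0  # hypothetical value, for illustration only
p_value = chi2_sf_even_df(wald_statistic, df=18)
reject_h0 = p_value < 0.05
print(p_value, reject_h0)
```

Equivalently, the null hypothesis is rejected at the 5% level when the Wald statistic exceeds the 95th percentile of the $\chi^2_{18}$ distribution, which is approximately 28.87.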

Scenarios and Simulating Power
Two scenarios were considered. In scenario 1, we assumed no arm-by-time interaction (i.e., responses over time coincide) such that the log odds compared with cycle 1 (baseline) were the same in each arm and corresponded to the log odds obtained from our data analysis of the pooled placebo arms (NCCTG N08CB; MC11C4). In scenario 2, we hypothesized an arm-by-time interaction such that a large effect for N, a moderate effect for T and no effect for P were assumed. Figure 3 graphically shows the assumed effects in scenario 2 for N and T in the fingers (or hands) and toes (or feet). In this illustration, we sought to randomize 100 patients (50 per arm). Power calculations were obtained via simulation. In total, 1000 datasets were generated from each model. We assumed a two-sided significance level of α = 0.05 for both tests such that there was no adjustment for multiplicity. The proportion of times we rejected the null hypothesis, or equivalently the power, was recorded separately for the fingers (or hands) and toes (or feet). For each symptom and at each two-week cycle (2, 3, 4, 5 and 6), as well as at 2 weeks post-cycle 6, we report the estimated average log odds (compared with cycle 1, baseline) over the 1000 datasets generated as a measure of the true estimate and the corresponding empirical standard error. The power simulations were conducted using SAS.
Figure 3. Design Application. Scenario 2: An arm-by-time interaction such that a large effect for numbness, a moderate effect for tingling and no effect for shooting/burning pain was assumed. The log odds compared with cycle 1 (baseline) for the placebo arm (blue) were based on the analysis of the pooled placebo arms from two recently completed OIPN prevention trials (NCCTG N08CB; MC11C4).
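The general logic of simulation-based power (generate many datasets under an assumed effect, test each one, and record the rejection proportion) can be sketched in Python. This is a deliberately simplified stand-in, not the authors' SAS program: it dichotomizes a single post-baseline response ("A little" or worse), uses one patient-level random intercept, and replaces the 18-degree-of-freedom Wald test from the proportional odds mixed model with a two-proportion z-test. All parameter values are hypothetical.

```python
import math
import random

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def simulate_arm(n, alpha2, effect, sigma, rng):
    """Count patients in one arm whose response is >= 2 ("A little" or worse)."""
    count = 0
    for _ in range(n):
        u = rng.gauss(0.0, sigma)  # patient-level random intercept
        if rng.random() < expit(alpha2 + effect + u):
            count += 1
    return count

def z_test_rejects(x1, n1, x2, n2, z_crit=1.959964):
    """Two-sided two-proportion z-test at the 5% level (simplified stand-in)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        return False
    return abs((x1 / n1 - x2 / n2) / se) > z_crit

def simulated_power(n_per_arm, alpha2, placebo_effect, treated_effect,
                    sigma, n_sims=1000, seed=1):
    """Proportion of simulated trials in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        x_placebo = simulate_arm(n_per_arm, alpha2, placebo_effect, sigma, rng)
        x_treated = simulate_arm(n_per_arm, alpha2, treated_effect, sigma, rng)
        if z_test_rejects(x_placebo, n_per_arm, x_treated, n_per_arm):
            rejections += 1
    return rejections / n_sims

# Hypothetical parameter values: identical arms versus a large treatment effect.
type_one_error = simulated_power(50, -2.0, 3.0, 3.0, sigma=2.0)
power = simulated_power(50, -2.0, 3.0, 0.5, sigma=2.0)
print(type_one_error, power)
```

When both arms share the same effect, the rejection proportion estimates the Type I error; when the arms differ, it estimates power for that assumed difference. A faithful version would refit the full proportional odds mixed model to each simulated dataset.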

Results
The results from generating 1000 datasets under scenario 1 are shown in Table 2A,B for the fingers (or hands) and toes (or feet), respectively. The proportion of times we rejected $H_0$: no arm-by-time interaction was 5.2% and 4.9% for the fingers (or hands) and toes (or feet), respectively. Because we generated the data assuming no arm-by-time interaction in scenario 1, we were able to reasonably recover the Type I error of 5%.
The results from generating 1000 datasets under scenario 2 are shown in Table 3A,B for the fingers (or hands) and toes (or feet), respectively. The proportion of times we rejected $H_0$: no arm-by-time interaction was 98.1% and 94.8% for the fingers (or hands) and toes (or feet), respectively. In summary, testing at the 5% significance level, if we randomized 100 patients in a 1:1 fashion to either the intervention or placebo, the study would have 98.1% and 94.8% power to detect the stated interaction effects for the fingers (or hands) and toes (or feet), respectively.
Table 2. Design application. Scenario 1: For each log odds (compared with cycle 1, baseline), we report the average estimate over the 1000 datasets generated as a measure of the true estimate and the corresponding standard deviation (SD) or empirical standard error. The "true" estimates of the log odds compared with cycle 1 (baseline) for the placebo arm were based on the analysis of the pooled placebo arms from two recently completed OIPN prevention trials (NCCTG N08CB; MC11C4). The "true" estimates of the log odds compared with cycle 1 (baseline) assumed for the treatment arm were hypothesized as being the same as the placebo arm (i.e., responses over time coincide). Note that cycle 1 served as a baseline (prior to oxaliplatin exposure), while patient-reported responses measured at each two-week cycle (2-6), as well as at 2 weeks post-cycle 6 (day 1 of cycle 7, prior to oxaliplatin treatment), served as the post-baseline measurements.
Table 3. Design application. Scenario 2: For each log odds (compared with cycle 1, baseline), we report the average estimate over the 1000 datasets generated as a measure of the true estimate and the corresponding standard deviation (SD) or empirical standard error. The "true" estimates of the log odds compared with cycle 1 (baseline) for the placebo arm were based on the analysis of the pooled placebo arms from two recently completed OIPN prevention trials (NCCTG N08CB; MC11C4). The "true" estimates of the log odds compared with cycle 1 (baseline) assumed for the treatment arm were hypothesized consistent with an arm-by-time interaction such that a large effect for numbness, a moderate effect for tingling and no effect for pain was assumed. Note that cycle 1 served as a baseline (prior to oxaliplatin exposure), while patient-reported responses measured at each two-week cycle (2-6), as well as at 2 weeks post-cycle 6 (day 1 of cycle 7, prior to oxaliplatin treatment), served as the post-baseline measurements.

Discussion
The results of this evaluation demonstrate that, despite a shorter duration of oxaliplatin (3 months instead of 6 months), OIPN was still a pervasive problem, with patients experiencing considerable mild to moderate N and T in the lower and upper distal extremities. To avoid the debilitating sequelae from OIPN [9,10] and to ensure that patients continue to receive the most efficacious doses of oxaliplatin, identification of effective OIPN preventative interventions is still needed, regardless of whether oxaliplatin is planned to be given for 3 months or 6 months.
To our knowledge, this is the first time that the likelihood, together with its corresponding uncertainty, of the symptom-specific timing, location and severity of patient-reported responses over 3 months (6 cycles) of oxaliplatin has been formally quantified and reported in the literature. In addition to providing patients and clinicians with an accurate quantification of the natural history of OIPN specifically germane to the adoption of a shorter 3-month duration of oxaliplatin, these population-averaged probability estimates (averaged over patients), which were obtained from a large cohort of patients who received a placebo, can directly inform the design of a subsequent, hypothesis-driven, placebo-controlled OIPN trial. Specifically, these population-averaged probabilities represent the natural history of the symptom-specific timing, location and severity of patient-reported responses over 3 months of oxaliplatin that would be expected on the placebo arm of a subsequent trial. We demonstrated, by way of illustration, how to express and operationalize the scientifically interesting hypothesis for an arm-by-time interaction and how to simulate statistical power to compare two arms in terms of their patterns of change from baseline in the odds of being greater than or equal to a particular response category for N, T and P separately in the upper/lower extremities. Adopting such an approach would ensure that the study design is compatible with the analytic method used in the primary analysis. The SAS programs developed for this paper to simulate power can be made available from the corresponding author on request.
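To make the phrase "change from baseline in the odds of being greater than or equal to a particular response category" concrete, the following sketch converts a baseline probability and a log odds (compared with baseline) into the implied probability at a later cycle. It is a simplified illustration only: it works on a single linear predictor and ignores the averaging over patient-level random effects that population-averaged estimates require. The function name `probability_after_shift` and the numerical inputs are hypothetical and are not taken from the trial data.

```python
import math


def probability_after_shift(p_baseline, log_odds_vs_baseline):
    """Probability of a response >= a given category after shifting the
    baseline odds by exp(log_odds_vs_baseline)."""
    if not 0.0 < p_baseline < 1.0:
        raise ValueError("p_baseline must lie strictly between 0 and 1")
    baseline_odds = p_baseline / (1.0 - p_baseline)
    shifted_odds = baseline_odds * math.exp(log_odds_vs_baseline)
    return shifted_odds / (1.0 + shifted_odds)


# Hypothetical inputs: 5% of patients report a response >= 2 at baseline,
# and the log odds compared with baseline at a later cycle is 1.5.
p_later = probability_after_shift(0.05, 1.5)
print(f"Implied probability at the later cycle: {p_later:.3f}")
```

In a design exercise, a hypothesized intervention effect could be expressed by assigning the treatment arm a smaller log odds than the placebo arm at each post-baseline cycle and comparing the implied probabilities.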
Designing and analyzing data from a placebo-controlled OIPN symptom intervention trial based on an overall test of the arm-by-time interaction is appealing for several reasons. First, in such a trial, we are primarily interested in testing hypotheses that compare the intervention and the placebo in terms of changes in the responses over time. In a randomized study, baseline symptoms are expected to be similar between arms. Under the null hypothesis (the same pattern of change over time in both arms), the responses over time therefore coincide. Second, the overall test of the arm-by-time interaction is completely general. The overall test does not target any specific pattern for the difference in responses over time between the arms. If the difference between arms takes a form different from the pattern hypothesized as part of the study design, one can still achieve statistical significance based on the overall test for an arm-by-time interaction. This would not be true, for example, if the study were designed by choosing a within-patient change from baseline endpoint, say, based on a total sensory neuropathy score, at a single point in time (e.g., 2 weeks post-cycle 6). The added sensitivity with such a specific endpoint comes with a price. If the study fails to detect a between-arm difference in the change from baseline to 2 weeks post-cycle 6, which would have been chosen in advance, the overall test for an arm-by-time interaction can still obtain a statistically significant result due to its generality. Third, while the overall test of interaction does not indicate how the two arms differ, it is straightforward to examine the regression coefficients from the fitted model and their standard errors to ascertain where the differences lie over time and according to which symptom (N, T and P).
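As an illustration of why an overall test does not depend on any particular pattern of differences, the sketch below computes a joint Wald-type chi-square statistic across several interaction coefficients. It rests on strong simplifying assumptions that are not part of this paper: the coefficient estimates are treated as independent (a fitted model would supply a full covariance matrix), the degrees of freedom are even so that the chi-square tail probability has a closed form, and all estimates and standard errors are hypothetical. The source does not state which test statistic was used, and the function names are illustrative.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable for even df,
    using exp(-x/2) * sum_{j < df/2} (x/2)^j / j!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def wald_statistic_independent(estimates, standard_errors):
    """Sum of squared z-scores; valid only if estimates are independent."""
    return sum((b / se) ** 2 for b, se in zip(estimates, standard_errors))


# Hypothetical arm-by-time coefficients for cycles 2-6 and 2 weeks post-cycle 6.
estimates = [-0.2, -0.5, -0.9, -1.1, -1.3, -1.2]
standard_errors = [0.45, 0.45, 0.45, 0.45, 0.45, 0.45]

w = wald_statistic_independent(estimates, standard_errors)
p_value = chi2_sf_even_df(w, df=len(estimates))
print(f"Wald statistic = {w:.2f}, p = {p_value:.4f}")
```

Because the statistic accumulates evidence from every post-baseline time point, a difference that emerges at cycles other than the one anticipated in the design still contributes to the test.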
There is no accepted primary endpoint in the design and analysis of randomized clinical trials evaluating OIPN in cancer patients receiving oxaliplatin-based chemotherapy [11]. In our opinion, it is appealing to define the primary endpoint as the serially measured EORTC QLQ-CIPN20 sensory scores for N, T and P in upper/lower extremities, self-reported by the patient on day 1 of each cycle prior to oxaliplatin treatment (cycles 1-6) and 2 weeks after the 6th cycle, and then adopt our statistical modeling approach, which appreciates the nuanced and complex clinical course of OIPN that presents with substantial patient heterogeneity. The QLQ-CIPN20 was developed by the EORTC to assess CIPN [12]. The EORTC QLQ-CIPN20 is a multidimensional tool that has been well validated and captures the broad scope of the symptom experience; the tool recognizes that patient-reported outcomes are better tools for measuring symptoms than are physician-determined means. The statistical model we used to analyze such an endpoint was compatible with the complex clinical course of OIPN and could quantify the likelihood of symptom-specific timing, location and severity of patient-reported responses of N, T and P in the upper and lower distal extremities. In addition, the proposed modeling approach could adequately handle missing data (intermittent and due to dropout) by incorporating the patients' serial oxaliplatin doses, which are highly correlated with the response, into the imputation model to improve the imputations. Missing data are a major issue in data analysis: 25% of patients discontinue oxaliplatin therapy given for 6 months (12 cycles) and, in two-thirds of these patients, the reason for discontinuation is OIPN symptoms [5]. Lastly, we showed how such a seemingly complex model can be straightforwardly used to design a hypothesis-driven phase III clinical trial to meaningfully compare a novel intervention to placebo such that the statistical design is compatible with the statistical analysis.
Substantial patient heterogeneity in the likelihood of a more severe response for a symptom was observed, and the degree of heterogeneity differed considerably between symptoms in the fingers (or hands), with the largest degree of heterogeneity seen with P. We do not fully understand why this variation occurs. Suspected reasons vary widely in the literature; they are likely multifactorial and may include genetic factors, certain comorbidities and the mechanism of nerve injury. Without a clear understanding of the patient-level factors affecting OIPN, we were reluctant to adjust for them in the statistical models; rather, we accounted for symptom heterogeneity across patients by including random effects for each symptom in the models. Another limitation of our research study is that analytic approaches other than the one adopted in this article may be better at detecting intervention effects in OIPN trials; therefore, future research is needed to compare the performance of different analytic approaches.

Conclusions
While previous reports described patients' experience with OIPN when oxaliplatin was planned to be given for 6 months (12 cycles), this paper provided a more accurate quantification of a patient's treatment experience with shorter (3 months/6 cycles) oxaliplatin treatment. Distinct patterns in the probabilities for each response over time for N and T were observed between the upper/lower distal extremities, while the probability of a response >1 for P was largely negligible in both locations. Despite the decrease in exposure to oxaliplatin from 6 to 3 months, OIPN is still pervasive, with patients experiencing considerable N and T in the fingers (or hands) and toes (or feet). Identification of effective OIPN preventative interventions is still needed when oxaliplatin is planned to be given for the shorter, 3-month duration (6 cycles). We illustrated how the probability estimates for each response over time derived in this article can be used to design subsequent placebo-controlled OIPN trials.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cancers14051212/s1, Table S1: EORTC QLQ-CIPN20-The first 6 questions (Q31-36) correspond to numbness, tingling, and shooting/burning pain in the upper/lower distal extremities.
Institutional Review Board Statement: The data for this manuscript came from two clinical trials, namely, North Central Cancer Treatment Group (NCCTG) N08CB and MC11C4, as stipulated in the manuscript. NCCTG is now part of the Alliance for Clinical Trials in Oncology. Both of the clinical trials were approved by the Mayo Clinic ethical committee, which we label as the Mayo Clinic Internal Review Board. The approval numbers for these clinical trials are IRB#12-004385 (NCCTG N08CB) and IRB#11-007327 (MC11C4).

Informed Consent Statement: Informed consent was obtained from all subjects who participated in the two clinical trials involved in this current report.
Data Availability Statement: All data presented from the two clinical trials involved in this report are available on request from the corresponding author.