Inter-Rater Reliability, Construct Validity, and Feasibility of the Modified “Which Health Approaches and Treatments Are You Using?” (WHAT) Questionnaires for Assessing the Use of Complementary Health Approaches in Pediatric Oncology

Background: This study aimed to test the inter-rater reliability, construct validity, and feasibility of the modified “Which Health Approaches and Treatments Are You Using?” (WHAT) questionnaires in pediatric oncology; Methods: Parent–child dyads were invited to complete the self- and proxy-report modified WHAT, the Pediatric Quality of Life Inventory, a demographics survey, a diary of the child’s recent use of complementary health approaches (CHA), and a questionnaire assessing aspects of feasibility. Parents were also asked to complete a survey on their satisfaction with their child’s use of CHA; Results: Twenty-four dyads completed the study. The mean weighted kappa showed strong inter-rater reliability (κ = 0.77, SE = 0.056) and strong agreement between the modified WHAT and the diary (self-report [κ = 0.806, SE = 0.046] and proxy-report [κ = 0.894, SE = 0.057]). Significant differences between recent and non-recent CHA users were found only for ease of access to CHA (self-report [p = 0.02], proxy-report [p < 0.001]). The mean scores of the feasibility scale (out of 7.0) for the self- and proxy-report were 5.64 (SD = 0.23) and 5.81 (SD = 0.22), respectively, indicating the feasibility of the modified WHAT; Conclusions: The findings provide initial evidence of the reliability and validity of the modified WHAT and their feasibility. Further research is needed to test the theoretical relationships and further explore the validity and reliability of the modified WHAT.


Introduction
Complementary health approaches (CHA), also known as complementary and alternative medicine, are commonly used by children with cancer, with reported use ranging from 6% to 100% worldwide (median = 57.8%, n = 7219 from 34 countries) [1]. CHA encompasses a diverse group of healthcare products and practices that are not part of conventional medicine or the mainstream healthcare system, including nutritional (e.g., dietary supplements and herbs), psychological (e.g., mindfulness and spiritual practices), and physical (e.g., massage and spinal manipulation) approaches, or combinations thereof (e.g., yoga, acupuncture, and dance or art therapies) [2]. CHA is usually used in conjunction with conventional cancer treatments [1,3–5]. This may be beneficial to relieve the cancer symptoms and side effects of cancer therapies [6–8]. However, the use of some

Materials and Methods
A prospective descriptive design was used. An adapted version of the Behavioural Model of Health Services Use [33] was used to conceptualize the underlying relationships between CHA use and the relevant evidence-based variables. This study was guided by the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) guideline [34]. Ethical approval was obtained from the Research and Ethics Boards of the Hospital for Sick Children (SickKids; REB# 1000072728) and the University of Toronto (Protocol# 41989) before the study began. Written informed consent was obtained from patients and parents before participation.

Hypotheses
The following were the study hypotheses:

Inter-rater reliability (H1).
There will be a moderate to strong inter-rater reliability (κ ≥ 0.5) between the children's responses to the self-report version of the modified WHAT and their parents' responses to the proxy-report version [35][36][37].

Convergent construct validity (H2).
There will be a moderate to strong agreement (κ ≥ 0.5) between the responses to both the self- and proxy-report modified WHAT questionnaires and the responses to related questions included in the self- and proxy-report electronic diaries of child use of CHA over the previous four weeks.

Positive perceived health status of the person after using CHA.

Feasibility of administration (H4):
The mean scores on the five-item, seven-point Likert-type feasibility of administration scale are at least 5.0 (out of 7.0) [52], indicating that both the child self-report and parent proxy-report versions of the modified WHAT questionnaires are easy to use, understand, and follow.

Participants
A sample of children with cancer (8-18 years) receiving therapy at SickKids and one parent of each child was invited to participate using convenience sampling. Eligible children were English-speaking, 8-18 years of age, diagnosed with cancer, undergoing cancer treatment for at least three months after diagnosis or post-cancer treatment, and were users of at least one type of CHA according to the self- or proxy-report; the parent also needed to be English-speaking. The exclusion criteria for the child and parent participants were severe cognitive impairments or major comorbid illnesses that could preclude the questionnaires' completion, as determined by their treating HCPs.

Procedure
Eligible children and their parents were identified by reviewing the inpatient and outpatient daily lists at SickKids, and consents were obtained from interested parent-child dyads. The participants were asked to independently complete the short weekly diary for four weeks; email prompts were sent that reminded them to complete the diary every seven days (i.e., days 1, 8, 15, and 22). At the end of four weeks, all participants received email invitations to complete short electronic surveys (both child self-report and parent proxy-report): the modified WHAT, Pediatric Quality of Life Inventory (PedsQL), the feasibility of administration survey, and demographic and health information surveys. The parents were also asked to complete a short survey on their satisfaction with their child's CHA use. If the participants did not complete the diary or surveys within a day of receiving the relevant invitation, up to two daily reminders were sent, starting on the day following the initial invitation. If the participants failed to complete the diary or surveys by the day following the second reminder, they were considered lost to follow-up. Figure 2 shows the study schema and the assessment timeline.

Measures
We used the Research Electronic Data Capture (REDCap) web-based application to collect data [53]. Table 1 summarizes all measures used in this study. Since there is no gold standard measure for CHA in pediatric oncology, the research team developed brief child self- and parent proxy-report electronic diaries to record child CHA use in order to enable testing of the convergent construct validity of the modified WHAT questionnaires. The diary asked three simple questions: the types of CHA used in the previous four weeks, the reasons for use, and whether they were helpful. The completion time of the diary was up to five minutes. The child self- and parent proxy-report versions of the modified WHAT questionnaires (13 and 15 items, respectively) are electronic disease-specific CHA questionnaires designed to assess and initiate clinical discussions about CHA use in pediatric oncology. The modified WHAT questionnaires have been found to have face and content validity in children with cancer [54], and they ask questions about the child's use of CHAs since their cancer diagnosis and over the previous four weeks and about their plans for use in the future. The feasibility of administration survey comprises five items marked on a 7-point scale and has a completion time of up to two minutes [52]. This survey asks whether the modified WHAT instructions and items are easy to understand, use, and follow; whether the children and parents would take the time to complete the questionnaires (i.e., acceptability: if the child or parent were given the modified WHAT, would they complete it); and whether the completion time is appropriate [55]. A mean score of at least 5.0 out of a possible 7.0 would be considered a high level of feasibility [52].
The PedsQL 4.0 Generic Core Scale has shown evidence of reliability and validity and has a completion time of five minutes [56]. Cronbach's alpha internal consistency reliability for the full-scale PedsQL has been reported as 0.88 for the child self-report and 0.93 for the parent proxy-report. Construct validity has been demonstrated using the known-groups method; PedsQL scores for healthy children were compared to those of children with cancer, and the healthy children scored higher [56]. The scale assessing parent satisfaction with CHA use by their child comprises four items marked on a 5-point scale, takes less than a minute to complete, and asks whether parents are satisfied with the outcome of the recent use of CHA and with its availability, cost, and safety. There is evidence of reliability for the original questionnaire, with Cronbach's alpha coefficients of 0.76-0.95 [57]. The research team also developed brief demographic and health information surveys to ask about sex, gender, age, race, educational background, household income, the parents' use of CHA, the cost of CHA, the parents' perceptions of CHA benefits, and the child's diagnosis and cancer treatment.
Finally, the Intensity of Treatment Rating scale (ITR) 3.0 was used to collect data from patients' charts to categorize the intensity of the cancer treatment of the participating children based on treatment modality (surgery, chemotherapy, radiation therapy, and/or transplant) and the stage and risk level [58]. ITR 3.0 has evidence of face and content validity, and the findings of the inter-rater reliability show a high intraclass correlation coefficient of 0.86 [58].

Data Analysis
The data were analyzed using IBM Statistical Package for Social Sciences (SPSS) 28.0 software [59]. Descriptive statistics (means, standard deviations, and percentages) were calculated to describe the participant characteristics, and weighted kappas for the nominal variables were calculated to determine the inter-rater agreement between the responses to the child self-report and parent proxy-report WHAT questionnaires. Weighted kappas were also calculated to determine convergent construct validity between the recent CHA use reports (both self and proxy) in the WHAT questionnaires and the related reports in the diaries. The kappa levels of agreement were defined as follows: κ ≤ 0 indicates no agreement, 0.01-0.20 indicates slight agreement, 0.21-0.40 indicates fair agreement, 0.41-0.60 indicates moderate agreement, 0.61-0.80 indicates strong agreement, and 0.81-1.00 indicates almost perfect agreement [60]. Fisher's exact tests for the nominal variables and Mann-Whitney tests for the ordinal variables were used to examine the relationship between CHA use and the related variables from our adapted conceptual model; due to the small sample size, the planned t-tests for the continuous variables could not be used, and Mann-Whitney U tests were used instead. For all tests, the significance level was set at α = 0.05. Descriptive statistics (means, standard deviations, and percentages) were used to describe the feasibility of administration scores; since the feasibility of administration scale had no previous evidence of reliability, Cronbach's alpha was calculated for both the self- and proxy-report versions to determine internal consistency, with a coefficient of 0.7 or higher considered acceptable [61].
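The agreement statistic and the interpretation bands described above can be illustrated with a minimal sketch. The analysis itself was run in SPSS; the code below is not that procedure. It assumes ordered response categories and linear disagreement weights (the weighting scheme is not stated in the text), and the function names, response labels, and example data are hypothetical.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Cohen's weighted kappa for two raters over ordered categories.

    Assumption: `categories` lists the response options in order, and
    disagreement weights are linear (or quadratic if requested).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Joint and marginal counts of the two raters' responses.
    joint = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    row = Counter(index[a] for a in rater_a)
    col = Counter(index[b] for b in rater_b)

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    observed = sum(disagreement(i, j) * c for (i, j), c in joint.items()) / n
    expected = sum(
        disagreement(i, j) * row[i] * col[j]
        for i in range(k) for j in range(k)
    ) / (n * n)
    if expected == 0:
        return float("nan")  # undefined when both raters use one category
    return 1 - observed / expected


def agreement_label(kappa):
    """Map kappa to the bands quoted in the text, using continuous cut points."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "strong"
    return "almost perfect"


# Hypothetical dyad: child and parent answers to four items.
options = ["never", "sometimes", "often"]
child = ["never", "sometimes", "often", "often"]
parent = ["never", "often", "often", "sometimes"]
kappa = weighted_kappa(child, parent, options)
print(round(kappa, 3), agreement_label(kappa))  # 0.429 moderate
```

Because the published bands leave small gaps (for example, between 0.80 and 0.81), the sketch treats each upper bound as an inclusive cut point; a different rounding convention could shift borderline values into the neighbouring band.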
Power and sample size were calculated to allow the identification of a significant and strong agreement (i.e., κ2 = 0.7 or greater for each item of the modified WHAT questionnaires) over and above a null value of moderate agreement (κ1 = 0.5); that is, the required sample size was calculated so that the study would be sufficiently powered such that the confidence interval around items with a strong agreement would not cover κ = 0.5. With a power of 80% and alpha of 0.05, a minimum sample of 69 parent-child dyads was required [62]. Assuming 20% of the participants would drop out or be lost to follow-up [63], we aimed to enroll 83 parent-child dyads.
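As a worked check of the enrollment target, the reported figure is consistent with inflating the required sample by the expected attrition rate. The multiplicative form below is an assumption; the text does not state how the adjustment was applied.

```latex
n_{\text{enroll}} = \lceil 69 \times (1 + 0.20) \rceil = \lceil 82.8 \rceil = 83
```

Dividing by the expected retention instead, 69 / 0.80 = 86.25, would give a slightly larger target; the reported 83 matches the multiplicative form.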

Results
Due to the impact of the COVID-19 pandemic on recruitment, we were unable to recruit the entire targeted sample. The results in this section are therefore preliminary.

Participants
A total of 42/106 (39.6%) parent-child dyads who were approached provided consent to participate, of whom 24/42 (57.1%) completed the study (Figure 3). The mean age of the 24 children was 14.3 years (SD = 2.8), and 50% were female, with diverse cancer diagnoses and school-grade levels. The mean age of the parents was 42.4 years (SD = 8.6), and the majority were married (n = 19), were mothers (n = 17), and had graduate degrees (n = 14). The participants' characteristics are outlined in Table 2.

Inter-Rater Reliability
Twenty-four parent-child dyads participated in determining the inter-rater reliability of the child self- and parent proxy-report versions of the modified WHAT questionnaires. Findings (Table 3) showed that 22/24 (91.7%) parent-child dyads had moderate to almost perfect agreement (κ = 0.54-1.00) between them; the other two were fair (κ = 0.232) and slight (κ = 0.087). The mean of the overall weighted kappa showed strong inter-rater reliability (κ = 0.77, SE = 0.056).
Weighted kappa coefficients (Table 4) showed strong to almost perfect agreement (κ = 0.66-1.00) between the child self-report questionnaire and the self-report CHA diary regarding recent use for 17 (89.5%) of the 19 children analyzed. There was no agreement (κ = 0) for the other two (10.5%) children's responses. The mean of the overall weighted kappa indicated strong agreement between the modified WHAT and the diary (κ = 0.806, SE = 0.046).
Of the 24 parents, 19 (79.2%) reported their child's use of CHA in the previous four weeks. Weighted kappa coefficients (Table 5) showed strong to almost perfect agreement (κ = 0.74-1.00) between the proxy-report questionnaire and the proxy-report CHA diary regarding recent use for all 19 participants (100%). The mean of the overall weighted kappa indicated strong agreement between the modified WHAT and the diary (κ = 0.894, SE = 0.057). For example, these findings suggest that for parent #3, there was almost perfect agreement between the parent's responses to the modified WHAT and the diary when averaged over all questions and responses.

Relationships between Variables Measured via the Modified WHAT and Theoretically Relevant Factors
Preliminary findings of the relationships between CHA use and the conceptually relevant variables are outlined in Table 6. As hypothesized, there was a significant difference between self-reported recent CHA users (14/18, 77.8%) and non-recent CHA users (2/6, 33.3%) regarding ease of access to CHA (p = 0.02); a significant difference was also found between proxy-reported CHA users (16/19, 84.2%) and non-recent users (0/5, 0%) (p < 0.001). Contrary to the stated hypotheses, there were no significant differences between recent CHA users and non-recent users, either self- or proxy-reported, regarding child age, parental education, the parents' use of CHA, family income, parent employment status, the parents' satisfaction with conventional cancer treatment, disease duration, the intensity of cancer treatment, the health-related quality of life score, the perceived helpfulness of the CHA used, the parents' expectations of CHA use, the parents' satisfaction with CHA use, and monthly CHA costs. There were also no significant differences, either self- or proxy-reported, between the respondents who intended a future use of CHA and those who did not regarding the perceived helpfulness of the child's recent use of CHA, the monthly costs of CHA, and the parents' satisfaction with the child's recent use of CHA.
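For readers unfamiliar with the test applied to the nominal variables, the following is a minimal sketch of a two-sided Fisher's exact test on a 2×2 table. It is not the SPSS procedure used in the study; the function name is hypothetical, and the counts in the example are illustrative only and are not taken from Table 6.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("cell counts must be non-negative")
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    if n == 0:
        raise ValueError("table must contain at least one observation")

    def ways(x):
        # Number of tables with x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = ways(a)
    # Integer comparison avoids floating-point ties between equal tables.
    extreme = sum(
        ways(x) for x in range(lowest, highest + 1) if ways(x) <= observed
    )
    return extreme / comb(n, col1)


# Illustrative counts only (hypothetical): rows are two groups,
# columns are "yes" / "no" on a binary item.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))  # 0.4857
```

The exact test conditions on the table margins, which is why it remains usable with the small cell counts typical of a sample of this size.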
The participants reported an average completion time of 11.54 (SD = 9.39) minutes for the self-report version of the modified WHAT and 11.92 (SD = 10.43) minutes for the proxy-report version; most of the children (16/24, 67%) and parents (15/24, 63%) found the amount of time "acceptable" or "very acceptable". The feasibility of administration scale showed high internal consistency reliability, with Cronbach's alphas of 0.853 for the self-report version and 0.829 for the parent-report version.
* Participants were asked to rate each question using a 1-7 scale, with 1 being very unlikely and 7 very likely. ** Participants were asked to rate each question using a 1-7 scale, with 1 being too long and 7 too short.
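The internal consistency statistic and the feasibility threshold reported above can be sketched as follows. This is a minimal illustration, not the SPSS computation; the function names are hypothetical, and the responses are invented for a five-item, seven-point scale.

```python
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(responses) < 2:
        raise ValueError("at least two respondents are required")
    k = len(responses[0])
    if k < 2 or any(len(r) != k for r in responses):
        raise ValueError("each respondent needs the same number (>= 2) of items")
    # Sample variance of each item and of the total score per respondent.
    item_variances = [variance(item) for item in zip(*responses)]
    total_variance = variance([sum(r) for r in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def scale_mean(responses):
    """Mean of the respondents' mean item scores."""
    return mean(mean(r) for r in responses)


# Hypothetical responses: five respondents, five items scored 1-7.
feasibility = [
    [6, 6, 5, 6, 6],
    [5, 5, 5, 6, 5],
    [7, 6, 7, 7, 6],
    [4, 5, 4, 5, 4],
    [6, 7, 6, 6, 7],
]
alpha = cronbach_alpha(feasibility)
print(round(alpha, 3), alpha >= 0.7)       # 0.936 True
print(round(scale_mean(feasibility), 2))   # 5.68, above the 5.0 threshold
```

With a 0.7 cut-off for acceptable internal consistency and a 5.0 threshold for the mean score, both checks reduce to simple comparisons once the two statistics are computed.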

Discussion
This study evaluated the inter-rater reliability, construct validity, and feasibility of administration of the first electronic cancer-specific CHA questionnaires (i.e., self- and proxy-report modified WHAT) designed for use by HCPs to assess the children's use of CHA and to initiate clinical discussions about CHA with the children with cancer (8-18 years) and their parents. The preliminary findings provide initial evidence of reliability and validity of the modified WHAT questionnaires and their feasibility of administration in clinical settings. The small sample size reduced the study's statistical power and limited its ability to detect significant associations between variables. Only one hypothesized relationship was confirmed (the recent use of CHAs was significantly associated with ease of access to CHAs), but the findings offer promising initial insights into the potential of the modified electronic WHAT questionnaires.
The modified WHAT questionnaires demonstrated initial evidence of inter-rater reliability for assessing the use of CHA in children with cancer, which was not explored previously [31]. A total of 22 out of 24 dyads showed moderate to almost perfect agreement (κ = 0.54-1.00); only two dyads showed slight (κ = 0.087) and fair (κ = 0.232) agreement. No clear reason was found to explain this low level of agreement in the two dyads. Future qualitative research involving interviews with parent-child dyads could provide insight into the reasons for such disagreements and may help to identify strategies to improve the inter-rater reliability of the questionnaires.
Overall, the high agreement between the self- and proxy-report versions for most of the dyads suggests that the modified WHAT questionnaires capture the same information about the child's use of CHA from the child or the parents. Self-report and proxy-report are two common methods of assessing the children's experiences and perspectives in healthcare [64], and although the findings suggest high inter-rater reliability, it is important for HCPs to consider both when assessing the children's use of CHA. In the context of children with cancer, self-reports may be limited by the child's physical, emotional, and cognitive development, while proxy-reports may be biased by the proxy's own perceptions or a lack of discussion between the parents and the child; for example, older children may use CHA without informing their parents. Thus, HCPs should seek information from both sources to gain a comprehensive understanding and should use this information to begin clinical discussions about CHA with the children and their parents.
Preliminary examination of the construct validity of the modified WHAT suggests good convergent validity, with strong to almost perfect agreement regarding CHA use between the WHAT questionnaires and both the self- and proxy-report diaries. There was no clear explanation for the disagreement (κ = 0) between the WHAT responses and the diary entries of two of the children (both female, aged 14 and 16). Further research with a larger sample is therefore needed to confirm the convergent construct validity of the modified WHAT questionnaires.
Notably, only one of the theory-based hypotheses was confirmed, which may be due to the sample size falling short of the number calculated in the power analysis. Nevertheless, there is limited empirical data regarding the construct validity of CHA questionnaires in children with cancer [31], and the current study, despite its limitations, offers a valuable first step. Future research with larger sample sizes could adapt the current conceptual model to evaluate the theoretical relationships between CHA use and its related constructs and to further explore the potential of the modified WHAT questionnaires in clinical settings.
In this study, the electronic modified WHAT questionnaires were shown to be feasible (short, easy to use, and not burdensome) clinical cancer-specific questionnaires for children with cancer in both self- and proxy-report versions, confirming the hypothesis of feasibility in clinical settings [65]. Questionnaires with high feasibility can improve the quality of self-reporting while minimizing missing data [64,66–70], and research has also shown that children and families prefer electronic screen-based questionnaires [66], suggesting that screen-based administration might help children stay focused and engaged [64].
The main limitation of this study is the small sample size, which may reduce the generalizability of the findings and lowered the statistical power of the study, precluding subgroup analysis and limiting the ability to detect significant relationships between variables in hypothesis testing (H3, type II error). A larger sample could generate more accurate estimates of the reliability and validity of the modified WHAT. The study was also conducted at a single institution and included only parent-child dyads who speak English, which may limit the generalizability of the results to other contexts. Recruitment was conducted over eight months at one study site. During this period, 106 eligible dyads were approached, of which only 24 completed the study. The potential reasons for the low enrollment and the dropout rates were that the children were overwhelmed or too sick to participate and that their parents were too busy or not interested in participating in research, because oncology settings are typically already research-intensive, with many calls for participants. The participants also needed to complete multiple surveys over four weeks, which might have imposed a burden. To reduce this burden, the research team ensured that the surveys were brief and easy to complete. Additional sites would therefore likely be needed for future research to recruit 83 parent-child dyads within the same time frame. Future studies are needed to establish the reliability and validity of the modified WHAT questionnaires in other pediatric oncology settings using structural equation modeling, which may yield more precise information regarding the measurement properties. Convenience sampling is another limitation of this study, as it may introduce selection bias; however, eligible participants were invited consecutively based on specific inclusion criteria to minimize selection bias, and recruitment ran over a relatively long period of time, which ensured the inclusion of a diverse sample of participants.
Further research is also needed to explore pediatric oncology HCPs' perceptions of the clinical utility of the modified WHAT questionnaires, which are important because they would shed light on the likelihood that the questionnaires will be used (i.e., acceptance) [71], their perceived clinical usefulness, and the perceived barriers and facilitators of implementation in routine clinical practice. Such perspectives could be captured using a qualitative descriptive approach [72], like the one successfully used to explore the clinical utility of psychosocial screening tools in pediatric oncology [73]. Individual semi-structured interviews with a purposive sample of oncologists and oncology fellows, nurse practitioners, and registered nurses could be conducted, which would be recorded and transcribed verbatim and then analyzed using a simple content analysis approach to generate clinical utility themes [74]. If the saturation of themes were to occur within the first 12 interviews [75], three participants from each professional group would be needed to achieve maximum variation.
Finally, future research should test the modified WHAT questionnaires with a range of cultural groups, including newly arrived immigrants to Canada. Immigrant families comprise a large and steadily growing segment of the Canadian population [76], so considering their perspectives and tailoring the modified WHAT questionnaires to their CHA use is vital, especially as CHA differ between cultural contexts [27,28,77–82]. Refinement of the questionnaires would help HCPs to ask important clinical questions and enrich their knowledge of CHA use by these important populations. As an extension, research could be conducted to test the reliability and validity of the questionnaires in low- and middle-income countries, where CHA use is high among children with cancer but disclosure rates are low compared to high-income nations [1]. The modified WHAT questionnaires could be helpful in facilitating discussions with the children and their families and in generating data to compare CHA use between such countries. To achieve this, cross-cultural validity testing and linguistic adaptations would be needed, together with low-tech alternatives for administration that would fit with the available resources.

Conclusions
This study evaluated the inter-rater reliability, construct validity, and feasibility of electronic disease-specific CHA questionnaires in children with cancer. The preliminary findings suggest that the modified WHAT questionnaires are reliable and valid for assessing a child's use of CHAs, consistent between child and parent reports, and feasible for use in clinical settings. The pilot nature of this study and the use of an adapted conceptual model offer a valuable guide for future measurement property testing of the modified WHAT questionnaires. This study shows the potential of modified electronic WHAT questionnaires for assessing CHA use and initiating clinical discussions about CHA in pediatric oncology. Future research should address the limitations of this study, especially the sample size, and further explore the validity and clinical utility of the questionnaires.

Table 2. Characteristics of study participants.

Table 3. Weighted kappa of the dyads' responses on the two versions of the modified WHAT (self- and proxy-report).

Table 4. Weighted kappa of the children's responses on the self-report versions of the modified WHAT and the self-report diary of recent use of CHA.
* Findings did not meet the alternative hypothesis. Children 1, 10, 16, 21, and 22 reported no recent use of CHA in both the modified WHAT and the diary. Interpretation: For example, these findings suggest that for child #3, there was no agreement between the child's responses to the modified WHAT and the diary when averaged over all questions and responses.

Table 5. Weighted kappa of the parents' responses on the proxy-report versions of the modified WHAT and the proxy-report diary of child's recent use of CHA.
Parents 8, 10, 16, 21, and 22 reported no recent use of CHA by their child in both the modified WHAT and the diary.

Table 6. Preliminary findings of relationships between CHA use and its related variables on the WHAT and other theoretically relevant variables.
+ More than one option is possible per participant. * Denotes statistical significance (p-value < 0.05); all other p-values were not significant.

Table 7. Feasibility of administration testing.