Adherence to CONSORT Guidelines and Reporting of the Determinants of External Validity in Clinical Oncology Randomized Controlled Trials: A Review of Trials Published in Four Major Journals between 2013 and 2015

Our primary objective was to determine the proportion of trials that report the number of patients assessed for eligibility before randomization. We performed a systematic retrieval and analysis of all phase II, III, and IV RCTs published between 2013 and 2015 in four high-impact-factor journals in the field of clinical oncology. Among the 456 RCTs reviewed, 236 trials (51.8%) reported the number of patients assessed for eligibility. Among these 236 trials, the reasons for patient exclusion could be found in 184 trials (78%). A flow diagram was presented in 452 trials (99.1%), and 98 trials (21.5%) included a discussion on generalizability. Reporting the parameters of external validity in medical oncology RCTs remains challenging. Improving adherence to the 2010 CONSORT guidelines concerning the enrollment process could help clinicians and health policymakers establish to whom trial results apply.


Introduction
The number of scientific research papers published each year is expanding rapidly [1,2]. The Consolidated Standards of Reporting Trials (CONSORT) statement was developed by a group of biomedical journal editors to improve the quality of randomized controlled trial (RCT) reporting. It consists of a set of evidence-based recommendations published in 1996 and updated in 2010 [3,4]. The CONSORT statement has since been endorsed by the International Committee of Medical Journal Editors [5] and is recommended by the Enhancing the QUAlity and Transparency Of health Research (EQUATOR) network, which is an international initiative to improve the reliability and value of published health research literature [6]. Even though its recommendations are not enforced consistently [7], their adoption has translated into measurable improvements in many RCT quality criteria [8,9], including the method of sequence generation and allocation concealment.
Nevertheless, the reporting of the enrolment process and the other determinants of external validity (also called generalizability) is challenging [10,11]. The assessment of external validity is of paramount importance as it allows clinicians to determine whether they can extrapolate findings from a study to the patients they care for in everyday practice. It can be effective only if details are provided about the participants of the study sample and the source population from which they were recruited [12].
Providing information on the enrollment process, particularly the difference between the number of patients screened for eligibility and the number ultimately randomized, is crucial for several reasons. First, it gives an estimate of the proportion of all potentially eligible people who met the study requirements, which may indicate how stringently the inclusion and exclusion criteria were applied to select participants. Second, it allows the detection of arbitrary exclusions, which may introduce selection biases and thus affect the representativeness of the participants included in the trial. Finally, it reveals the number of participants who withdrew consent before randomization. Thus, details of the enrollment process are relevant both to assessing the generalizability of trial findings and to optimizing recruitment to RCTs by helping to identify potential obstacles to accrual.
The CONSORT statement, updated in 2010, makes several recommendations pertaining to external validity (Table 1). Specifically, the flow of participants throughout the study, from enrollment to analysis, should be detailed, and it is strongly recommended to present it in diagram form (Figure 1). The number of patients assessed for eligibility should be stated in the diagram, along with the reasons for patient exclusion. A discussion on the generalizability of the trial findings should also be included.
This review was performed to evaluate the adherence to CONSORT guidelines concerning the enrollment process and other external validity determinants in medical oncology RCTs. Our objectives were to determine the proportion of trials that report the number of patients assessed for eligibility before randomization and to identify variables affecting this reporting. We further aimed to determine the proportion of trials that state the reasons for patient exclusion and include a flow diagram and a discussion on the generalizability of the findings.

Study Design and Trial Selection
We performed a systematic retrieval and analysis of all phase II, III, and IV RCTs in the field of clinical oncology published 3 to 5 years after the publication of the updated version of the CONSORT statement in 2010 (i.e., between January 2013 and December 2015) in four high-impact-factor peer-reviewed journals: Journal of Clinical Oncology, Journal of the National Cancer Institute, Lancet Oncology, and The New England Journal of Medicine.
One reviewer (SA) screened all titles and abstracts for relevance. The exclusion criteria were (1) studies presenting sub-group analyses, secondary endpoints, or follow-up studies; (2) studies presenting interim results; (3) studies reporting results from multiple studies simultaneously; (4) studies investigating non-therapeutic interventions; or (5) studies in which the unit of randomization was not individual patients.

Data Extraction
For each paper, data were extracted by one reviewer (SA) and compared to data independently extracted by a second reviewer (CD, CL, M-AT, AG, or JL) using a standardized form (Supplementary Material File S1). Discrepancies were resolved by consensus between SA and JL and by a third reviewer when no consensus was reached between SA and JL. The following information was recorded: journal, date of paper publication, cancer type, cancer stage, intervention type, study phase, sponsor type and name, mention of the number of patients assessed for eligibility or screened in the main article or the supplementary appendix, number of patients assessed for eligibility or screened, number of patients enrolled or registered, number of patients excluded before randomization and number of patients randomized, reasons for patient exclusion with the numbers of patients for each reason, inclusion of a flow diagram (whether or not the section on eligibility was there) in the main article or the supplementary appendix, whether the trial had a positive or negative outcome, and presence of a discussion on the generalizability of the trial findings.

Definition of Trial Characteristics
For this review, cancer stages were defined as follows: localized was equivalent to stage I, locally advanced was equivalent to stages II and III, and advanced was equivalent to stage IV or recurrent cancer. For hematologic cancers, the stage was other/hematologic cancer. A trial was considered positive when the investigational treatment was shown to be superior (or non-inferior in the case of non-inferiority studies) to the standard treatment. The presence of a discussion on the generalizability of the trial findings was assessed by reviewing the discussion section of the articles.
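The stage definitions above amount to a small lookup rule. The following is a minimal sketch of that rule, not the authors' extraction tooling; the function name `classify_stage` and the `hematologic` flag are hypothetical names introduced only for illustration.

```python
def classify_stage(stage: str, hematologic: bool = False) -> str:
    """Map a reported stage to the review's stage category.

    Hypothetical helper: the category labels follow the definitions in the
    text; the function name and arguments are illustrative assumptions.
    """
    if hematologic:
        # Hematologic cancers were grouped separately, regardless of stage.
        return "other/hematologic cancer"
    normalized = stage.strip().upper()
    if normalized == "I":
        return "localized"
    if normalized in {"II", "III"}:
        return "locally advanced"
    if normalized in {"IV", "RECURRENT"}:
        return "advanced"
    raise ValueError(f"Unrecognized stage: {stage!r}")


print(classify_stage("III"))        # locally advanced
print(classify_stage("recurrent"))  # advanced
```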

Statistical Analyses
Descriptive statistics included the frequency and proportion estimations for categorical factors and median and interquartile ranges for continuous variables. The inter-rater agreement between the reviewers was estimated using simple and weighted Cohen's kappa coefficients for categorical factors and intra-class correlation coefficients for continuous variables. Bivariate analysis included Pearson's chi-square test or Fisher's exact test when appropriate. Cross-tabulation and tests on proportions were performed between the trials that reported the number of patients assessed for eligibility before randomization and the following variables: journal, year of publication, cancer type, cancer stage, intervention type, study phase, and sponsor type. Bivariate log-binomial models were fitted to compare the pairwise proportions between these groups using Tukey-Kramer adjusted p-values. Cross-tabulation and tests on proportions were also performed between trials, including a discussion on the generalizability of the findings and the following variables: study phase and whether the trial was positive or negative. Statistical analyses were performed using SAS Statistical Software v.9.4 (SAS Institute, Cary, NC, USA) with a two-sided significance level set at p < 0.05.
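To make the agreement statistic concrete, the sketch below computes a simple (unweighted) Cohen's kappa for two raters in plain Python. It is an illustration only: the analyses in this review were run in SAS, and the ratings shown are hypothetical values, not study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: "does the trial discuss generalizability?"
reviewer_1 = ["yes", "yes", "no", "no", "yes", "no", "no", "no", "yes", "no"]
reviewer_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "no", "yes", "no"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

Here the two raters agree on 8 of 10 items, but because agreement expected by chance is 0.52, kappa is noticeably lower than the raw agreement rate.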

Results
A total of 592 RCTs met the inclusion criteria. Of these, 136 trials were excluded due to meeting at least one exclusion criterion (Figure 2). The characteristics of the 456 included trials are presented in Table 2. The 456 included studies are listed in Supplementary Material File S2. Among the included trials, 228 (50%) were published in Journal of Clinical Oncology, 10 (2.2%) were published in Journal of the National Cancer Institute, 167 (36.6%) were published in Lancet Oncology, and 51 (11.2%) were published in The New England Journal of Medicine. About one-third of the trials were published each year from 2013 to 2015. The distribution of cancer/disease types is described in Table 2, with each of the main cancer categories representing 10%-15% of the trials. About half of the solid cancer stages were advanced. More than two-thirds of the studies were phase III and/or IV. Industry funding was present in 66.2% of the trials. The median sample size was 364 patients, ranging between 31 and 7576 participants.
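As a quick arithmetic check, the journal distribution can be recomputed from the counts given above. This is a minimal sketch; the variable names are illustrative and only the counts come from the text.

```python
# Counts of included trials per journal, as reported in the text.
journal_counts = {
    "Journal of Clinical Oncology": 228,
    "Journal of the National Cancer Institute": 10,
    "Lancet Oncology": 167,
    "The New England Journal of Medicine": 51,
}

total_included = sum(journal_counts.values())  # 592 retrieved - 136 excluded
percentages = {
    journal: round(100 * count / total_included, 1)
    for journal, count in journal_counts.items()
}

for journal, pct in percentages.items():
    print(f"{journal}: {pct}%")
```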
The inter-rater agreement between reviewers (Table 3) was almost perfect (Cohen's kappa, κ ≥ 0.81) for most variables, except for the discussion on the generalizability of trial findings (κ = 0.49). For the study phase "not found" (κ = 0.58) and the sponsor types "none" (κ = 0.5) and "not found" (κ = 0.45), the inter-rater agreement was moderate. This is due to the small number of trials in these categories and the fact that some reviewers did not look in other resources (e.g., www.clinicaltrials.gov, accessed date 25 January 2023) for missing information.
The number of patients assessed for eligibility before randomization was reported in the main article in 219 trials, while 17 trials reported this information in the supplementary appendix, for a total of 236 trials, representing a proportion of 51.8% (236/456 trials). This proportion did not vary in a statistically significant manner with the cancer stage and did not increase over time, but it varied significantly between journals, cancer type, intervention type, study phase, and sponsor type, even though the absolute number of trials in each sub-group for cancer type and intervention type was small (Table 4).
Among the 236 trials that reported the number of patients assessed for eligibility, 78% (184/236 trials) mentioned the reasons for patient exclusion. The reasons for patient exclusion were "not meeting inclusion criteria" in 73.7% (174/236 trials), declining to participate or withdrawing consent in 59.3% (140/236 trials), death in 14.4% (34/236 trials), adverse events in 14.4% (34/236 trials), loss to follow-up in 5.9% (14/236 trials), and other reasons in 49.2% (116/236 trials).
A total of 422 trials included a flow diagram in the main article, and an additional 30 trials included the diagram in the supplementary appendix, for a total of 452 trials, representing 99.1% (452/456) of trials. Finally, 21.5% (98/456) of trials included a discussion on the generalizability of the results. This proportion was significantly higher among phase III and/or IV trials compared to phase II trials (85.2% (69/81 trials) vs. 14.8% (12/81 trials), p = 0.03). It was also higher among the trials that were considered positive compared to those considered negative (26.3% (67/255 trials) vs. 15.4% (31/201 trials), p = 0.005) (Table 5).
* Tests were based on Pearson's Chi-squared test. ¤ After the exclusion of trials that were phase II and III at the same time and trials where the study phase was not found.
a,b When the analysis (last column) was statistically significant (p < 0.05), all pairwise comparisons among groups were tested using the Tukey-Kramer adjusted p-value. Pairwise comparisons that were significantly different from one another are indicated by superscripts as follows: when the values for 2 groups do share a common superscript, they are significantly different (p < 0.05), whereas if the values do not share a common superscript, they are not significantly different.
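The comparison between positive and negative trials can be reproduced from the reported counts with a Pearson chi-square test on a 2x2 table. The sketch below is illustrative only: it is plain Python rather than the SAS code used for the review, it assumes no continuity correction, and the function name `chi_square_2x2` is a hypothetical helper.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the test statistic and the p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("Every row and column total must be positive")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Counts from the text: trials with a generalizability discussion,
# among positive trials (67/255) and negative trials (31/201).
chi2, p = chi_square_2x2(67, 255 - 67, 31, 201 - 31)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the p-value rounds to the reported p = 0.005.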

Discussion
Our review provides a systematic assessment of the reporting of external validity quality criteria, as defined by CONSORT guidelines, in RCTs in the field of medical oncology from influential journals. Even though 99% of the trials reported a flow diagram of patients, in about half of them, the diagram started at the "randomization" stage. Thus, information on how many patients were screened for eligibility for the trial was missing. Among trials reporting the number of patients assessed for eligibility, almost one-quarter did not report the reasons for patient exclusion. A discussion on the generalizability of the findings was present in only 26% of the trials that reported positive results.
Previous reviews have examined the reporting of the determinants of external validity in various medical specialties. In a report of 113 RCTs of healthcare interventions published in six major journals in 2004, 79% included a flow diagram, and 60% of those reported the number of patients assessed for eligibility, representing 47% of all trials in the sample [10]. Another report of 469 RCTs indexed in PubMed core clinical journals in 2009 revealed that only 56% included a flow diagram, of which 81% reported the number of patients assessed for eligibility, or 45% of all trials in the sample, with the reasons for patient exclusion mentioned in 60% of the trials [11]. Another review restricted to the field of medical oncology included 357 RCTs published between 2005 and 2009 and showed that only 60% of these studies included a flow diagram [7]. Compared with our findings, the results of this latter review suggest that the publication of the revised version of the CONSORT statement positively influenced the inclusion of the flow diagram of patient selection.
When patients eligible for a clinical trial were retrospectively evaluated in cancer centers, 83.5% of patients fitting the main criteria in breast trials met all the eligibility criteria [13], and this number was 68% in hematological trials [14]. The recruitment fraction (number of patients recruited over the number potentially eligible) was 19.7% in breast trials and 23.1% in hematological trials. A study looking specifically at which eligibility criteria were a barrier to recruitment could not identify a unique category of eligibility criteria precluding enrollment, as no specific eligibility criterion was reported by all trials as impeding enrollment [15].
It should be noted that the recommendation from CONSORT to provide details regarding the enrollment process is found in the suggested flow diagram but not in the checklist. Adding mandatory items related to the flow diagram in the checklist would probably help readers to evaluate whether the study population constitutes a highly selected subgroup and assess the risk of selection bias. In fact, as Palys and Berger explained [16], the CONSORT checklist is a subset of the Chalmers scale [17], which is more complete and stipulates the importance of detailing the entire enrollment process.
The four journals we selected officially endorse the CONSORT statement and provide links to the CONSORT website in their "Instructions to authors". One solution to improve adherence to the CONSORT statement would be to ask authors to complete the CONSORT checklist when submitting their manuscripts. Another solution would be to change the wording in the "Instructions to authors" to make adherence to the CONSORT statement mandatory rather than recommended, as Shamseer et al. suggested [18].
However, we recognize that it is challenging to collect these data, especially in large multicenter trials. Still, the reasons for screening failure are usually captured in large multicenter clinical trials, and the data should be made available. At least, the steps of the recruitment process should be detailed in the study protocol, and attempts should be made to implement them in a uniform manner across different centers and research professionals. The investigators should try to record the number of people identified as potentially eligible in the pre-screening, so that we can estimate the number of patients that need to be screened for every patient enrolled in the study. Nevertheless, the number of patients who have been offered participation in a trial by a physician and have declined to participate before being screened for eligibility would be difficult to collect in a meaningful way. The process of pre-screening is subject to selection bias.
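The screening-to-enrollment ratio mentioned above is a simple quotient once both counts are recorded. The following is a minimal sketch with hypothetical numbers (not drawn from any trial in this review); the function name `patients_screened_per_enrolled` is an illustrative assumption.

```python
def patients_screened_per_enrolled(n_screened: int, n_enrolled: int) -> float:
    """Number of patients screened for every patient enrolled."""
    if n_enrolled <= 0:
        raise ValueError("At least one patient must be enrolled")
    if n_screened < n_enrolled:
        # Screened patients can never be fewer than enrolled patients.
        raise ValueError("n_screened must be >= n_enrolled")
    return n_screened / n_enrolled


# Hypothetical trial: 600 patients screened, 400 enrolled.
ratio = patients_screened_per_enrolled(600, 400)
print(f"{ratio:.1f} patients screened per patient enrolled")
```

A ratio close to 1 suggests that most screened patients were enrolled, whereas a high ratio points to restrictive criteria or frequent refusals, which is why the reasons for exclusion matter alongside the counts.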
Our study has limitations. First, we evaluated data over 3 years (2013-2015) on the assumption that a 3-year period would be reasonable to assess the changes following the publication of the updated CONSORT guidelines in 2010, but we cannot exclude that reporting has evolved since 2016. Another limitation of our study is that the presence of a discussion on the generalizability of the trial findings is a subjective concept that depends on the reader's judgment and the clinical context, as reflected by the lower inter-rater agreement coefficient between our reviewers for this variable (κ = 0.49). We were aware of this caveat, so we reexamined this variable when there was no consensus between the first two reviewers. To help readers interpret trial findings, a section entitled "To whom do the results of this trial apply?" in manuscripts, as suggested by Rothwell in 2005 [19], would be helpful. Lastly, there seemed to be confusion in the articles we reviewed between the terms "assessed for eligibility/screened" and "enrolled/registered". The enrollment process of an RCT consists of three main steps: the first is to define a target population (i.e., a pre-screening in which the eligibility criteria guide the investigators to select the target population to approach for consent), the second is to screen the potential participants to determine their eligibility (the number of screened patients will always be equal to or larger than the number of enrolled patients), and the third is to invite eligible patients to enroll. The terms are not interchangeable; therefore, if a trial reported the number of patients "enrolled/registered", we recorded it as such, even if it is possible that the authors meant "assessed for eligibility/screened". We feel that this could be clarified in the next revision of the CONSORT statement.
The importance of the terms and steps of the enrollment process should be thoroughly emphasized, along with the importance of using them appropriately. Examining the numbers of excluded patients against the unmet inclusion criteria and the met exclusion criteria is crucial to the transparency and evaluation of the reliability of the enrollment process. Reporting the entire enrollment process should be mandatory. As noted above, the recommendation from CONSORT to provide details regarding the enrollment process is found in the suggested flow diagram but not in the checklist; these details should be added to the checklist.
The generalizability of the present study is limited to oncology trials. Nevertheless, they were published in four major journals in oncology that maintain high publication standards. All four selected journals are high-impact-factor journals publishing pivotal trials in oncology. Whether the present study findings could be generalizable to more modest journals and fields other than oncology remains unknown. Additional studies are necessary to examine these issues.

Conclusions
Our findings demonstrate that the reporting of some parameters of external validity of RCTs in medical oncology journals has been improved (e.g., presentation of the flow diagram), while others could be optimized in the future (e.g., reporting the numbers of patients assessed for eligibility). In order to facilitate the evaluation of the generalizability of trial results, investigators should be encouraged to collect and report data on all patients who consented to proceed with screening and subsequently enrolled if found eligible for the trial; if not enrolled, the reasons why they were not recruited should be reported as well.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/curroncol30020160/s1, File S1: Standardized form used for data extraction. File S2: List of the revised reports in the present study.
Institutional Review Board Statement: Ethical review and approval were waived for this study due to the use of published data only.