Comparative Efficacy and Safety of Tirbanibulin for Actinic Keratosis of the Face and Scalp in Europe: A Systematic Review and Network Meta-Analysis of Randomized Controlled Trials

Actinic keratosis (AK) is a chronic skin condition that may progress to cutaneous squamous cell carcinoma. We conducted a systematic review of efficacy and safety for key treatments for AK of the face and scalp, including the novel 5-day tirbanibulin 1% ointment. MEDLINE, PubMed, Embase, Cochrane Library, clinical trial registries and regulatory body websites were searched. The review included 46 studies, of which 35 studies included interventions commonly used in Europe and were sufficiently homogeneous to inform a Bayesian network meta-analysis of complete clearance against topical placebo or vehicle. The network meta-analysis revealed the following odds ratios and 95% credible intervals: cryosurgery 13.4 (6.2–30.3); diclofenac 3% 2.9 (1.9–4.3); fluorouracil 0.5% + salicylic acid 7.6 (4.6–13.5); fluorouracil 4% 30.3 (9.1–144.7); fluorouracil 5% 35.0 (10.2–164.4); imiquimod 3.75% 8.5 (3.5–22.4); imiquimod 5% 17.9 (9.1–36.6); ingenol mebutate 0.015% 12.5 (8.1–19.9); photodynamic therapy with aminolevulinic acid 24.1 (10.9–52.8); photodynamic therapy with methyl aminolevulinate 11.7 (6.0–21.9); tirbanibulin 1% 11.1 (6.2–20.9). Four sensitivity analyses (restricted to studies assessing efficacy after one treatment cycle only, to a treatment area of ≤25 cm², and to assessment at 8 weeks or more post-treatment, and one using a single placebo/vehicle node) confirmed the findings from the base case. Safety outcomes were assessed qualitatively. These results suggest that tirbanibulin 1% offers a novel treatment for AK, with a single short treatment period, favourable safety profile and efficacy, in line with existing topical treatments available in Europe.


Introduction
Actinic keratosis (AK) is a chronic, recurrent skin condition caused by long-term sun exposure, which leads to skin damage presenting as small, red, rough, scaly lesions [1]. These lesions are often asymptomatic but may be sore or itch [2]. AK is a heterogeneous condition in its pathophysiology, clinical manifestation, histologic features and disease course [3]. Data on the prevalence of AK in Europe are still scarce [4,5], although prevalence increases with age, is positively correlated with male gender [6], and is highest in countries
A MEDLINE (OvidSP) search strategy (Supplementary Figure S1) was designed to identify RCTs on the interventions of interest for patients with AK of the face or scalp and was translated appropriately for six further databases (Supplementary Table S1). Searches of trial registers, regulatory body websites and systematic review resources were also conducted (Supplementary Table S1). No date or language restrictions were applied to the searches.
A single reviewer assessed the search results and removed the obviously irrelevant records, such as those about ineligible diseases or conducted in children. Following this, titles, abstracts and then full texts were screened independently by two reviewers, with any disagreements adjudicated by a third reviewer.
Where results for one trial were reported in more than one paper, all related papers were identified and grouped together to ensure that participants in individual trials were only included once. Data extraction and quality assessment (using the Cochrane Risk of Bias tool version 1.0 [24]) were undertaken by a single reviewer with a second reviewer checking all data points. For each outcome, data were extracted at all time points reported. Papers reporting pooled studies were only used where individual study data were not available for either of the studies being pooled.
The Cochrane Risk of Bias tool [24] considers seven criteria. Studies that adequately addressed all seven criteria were judged to be of 'high quality'. When one or more of the criteria were rated as 'unclear' (i.e., insufficient information was reported to assess the criteria) but all the other criteria were well addressed, the study was judged to have an overall 'unclear' risk of bias. When one or more of the criteria were not adequately addressed, the study was considered to have 'serious methodological concerns'.
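The overall judgement rule described above can be written as a small function. This is a minimal sketch, not the review's actual tooling: the rating labels 'low', 'unclear' and 'high' are hypothetical stand-ins for 'adequately addressed', 'unclear' and 'not adequately addressed'.

```python
from typing import Iterable

VALID_RATINGS = {"low", "unclear", "high"}  # hypothetical labels, see lead-in


def overall_risk_of_bias(ratings: Iterable[str]) -> str:
    """Map seven per-criterion ratings to the overall judgement."""
    ratings = [r.lower() for r in ratings]
    if len(ratings) != 7:
        raise ValueError("expected ratings for seven criteria")
    if any(r not in VALID_RATINGS for r in ratings):
        raise ValueError("unknown rating label")
    # Any criterion not adequately addressed dominates the judgement.
    if "high" in ratings:
        return "serious methodological concerns"
    # Otherwise a single 'unclear' criterion makes the whole study 'unclear'.
    if "unclear" in ratings:
        return "unclear"
    return "high quality"
```

Note that the order of the checks matters: a study with both 'unclear' and inadequately addressed criteria falls into the 'serious methodological concerns' category.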

Feasibility Assessment Methods
Following data extraction, the similarity of the included trials and their suitability for combining in an NMA was qualitatively assessed in accordance with guidance from the Pharmaceutical Benefits Advisory Committee (PBAC) [29]. Trials were compared based on their designs and risk of bias, characteristics of the recruited patient population, interventions included (e.g., doses, frequency and duration of treatment), outcomes reported (including timepoints of assessment and follow-up) and outcome measures used. Any trials deemed excessively methodologically or clinically heterogeneous were not eligible for inclusion in any synthesis (see Supplementary Section S2).

Statistical Analysis Methods
Data were prioritized in this order: intent-to-treat population, data for the full analysis set, data for the per-protocol population. Outcome data for completers were only used when no other data were available.
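The priority order above amounts to picking the first available analysis population from a fixed list. A minimal sketch, assuming hypothetical short labels ("ITT", "FAS", "PP", "completers") for the four populations:

```python
# Priority order as stated in the text; the labels themselves are assumptions.
PRIORITY = ("ITT", "FAS", "PP", "completers")


def select_population(available):
    """Return the highest-priority population present in `available`, or None."""
    for population in PRIORITY:
        if population in available:
            return population
    return None


# Example: a study reporting per-protocol and completer data only.
chosen = select_population({"PP", "completers"})
```

Completer data are selected only when none of the other three populations is reported, which matches the stated fallback.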
Bayesian NMA was applied to the proportion of patients experiencing the outcome of interest. A regression model with a binomial likelihood and a logit link function was used [30]. Relative treatment effects were estimated as log odds ratios (LORs) and were transformed to odds ratios (ORs) for presentation, with 95% credible intervals (CrI) also reported. Model fit was assessed using the deviance information criterion (DIC).
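The transformation from LORs to ORs can be illustrated with posterior draws. The sketch below is an assumption-laden illustration, not the authors' code: the draws are simulated from a normal distribution rather than taken from a fitted model, and the median is used as the point estimate. Because the exponential is monotone, quantiles of the LOR map directly to quantiles of the OR.

```python
import math
import random


def summarize_or(lor_draws):
    """Return (median OR, 2.5% bound, 97.5% bound) from draws of a log odds ratio."""
    draws = sorted(lor_draws)

    def quantile(p):
        # Linear interpolation between the two nearest order statistics.
        pos = p * (len(draws) - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        return draws[lo] + (draws[hi] - draws[lo]) * (pos - lo)

    # Exponentiate the LOR quantiles to obtain the OR and its credible interval.
    return tuple(math.exp(quantile(p)) for p in (0.5, 0.025, 0.975))


random.seed(1)
# Simulated stand-in for MCMC output; mean 2.4 and SD 0.3 are arbitrary.
simulated_lor = [random.gauss(2.4, 0.3) for _ in range(20000)]
median_or, lower, upper = summarize_or(simulated_lor)
```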
Both fixed and random effects models were fitted. Due to the heterogeneity of the studies included in the NMA, random effects models were deemed more appropriate and are reported in this manuscript. Networks were plotted as node = treatment, edge = study, and were examined for connectedness.
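Connectedness of a network in which nodes are treatments and edges are studies can be checked with a breadth-first search. This is a minimal sketch; the example studies are hypothetical and do not reproduce the evidence network of this review.

```python
from collections import defaultdict, deque


def is_connected(studies):
    """studies: iterable of lists of treatment labels, one list per study."""
    graph = defaultdict(set)
    for arms in studies:
        for a in arms:
            graph[a]  # register the node even if the study has a single arm
            for b in arms:
                if a != b:
                    graph[a].add(b)
    if not graph:
        return True
    start = next(iter(graph))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    # Connected when every treatment is reachable from the starting node.
    return len(seen) == len(graph)


# Hypothetical example: two placebo-controlled studies with no common node.
split_network = [["TIRBA1%", "PLAC_TOP"], ["MAL_PDT", "PLAC_PDT"]]
# Adding a hypothetical study that bridges the two components.
joined_network = split_network + [["MAL_PDT", "PLAC_TOP"]]
```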
Non-informative prior distributions were used, with trial-specific baseline and treatment effects assigned Normal(0, 1000) priors. For random effects models, a weakly informative prior distribution was used for the between-study heterogeneity parameter as the number of links in the networks to inform the estimate of this parameter was relatively low. A log-normal(−2.29, 1.58²) distribution was used, as suggested by Turner et al. [31], for between-study heterogeneity when analyzing 'symptoms reflecting continuation/end of condition' data.
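To give a sense of scale for the heterogeneity prior, its quantiles can be computed in closed form. The sketch below assumes the prior is placed on the between-study variance and that its parameters are a log-scale mean of −2.29 and a log-scale standard deviation of 1.58; both readings are assumptions about the notation rather than statements from the text.

```python
import math
from statistics import NormalDist

MU = -2.29    # assumed mean on the log scale
SIGMA = 1.58  # assumed standard deviation on the log scale


def heterogeneity_quantile(p):
    """Quantile of a log-normal(MU, SIGMA^2) distribution at probability p."""
    return math.exp(MU + SIGMA * NormalDist().inv_cdf(p))


prior_median = heterogeneity_quantile(0.5)    # equals exp(MU), about 0.10
prior_lower = heterogeneity_quantile(0.025)
prior_upper = heterogeneity_quantile(0.975)
```

Under these assumptions the prior is centred on a small heterogeneity variance but keeps a long right tail, which is what "weakly informative" is generally taken to mean.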
Studies with zero counts for both/all arms did not contribute evidence to the network. This is a consequence of modeling relative treatment effects and is not surmountable by methods such as adding 0.5 to every arm [30]. Therefore, studies with zero counts for both/all arms were excluded from the NMA.
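The exclusion rule for studies with zero events in every arm can be sketched as a simple filter. The study names and counts below are hypothetical; each arm is represented as an (events, patients) pair.

```python
def drop_all_zero_studies(studies):
    """Keep only studies in which at least one arm has one or more events.

    studies: dict mapping study name to a list of (events, patients) pairs.
    """
    return {
        name: arms
        for name, arms in studies.items()
        if any(events > 0 for events, _ in arms)
    }


hypothetical = {
    "study_A": [(12, 50), (2, 50)],  # events in both arms: kept
    "study_B": [(0, 40), (0, 42)],   # zero events in all arms: excluded
    "study_C": [(5, 30), (0, 31)],   # zero in one arm only: kept
}
retained = drop_all_zero_studies(hypothetical)
```

A study with zero events in only one arm is retained, since the text excludes only studies with zero counts for both/all arms.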
Heterogeneity was quantitatively assessed for all contrasts in each network informed by two or more studies. Pairwise meta-analyses were conducted and quantified with the I² statistic. Inconsistency was assessed using the node-splitting method [32] where feasible within the network geometry.
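For a single pairwise contrast, I² can be derived from Cochran's Q under an inverse-variance fixed-effect pooling. The following is a minimal sketch of that standard calculation, not the authors' implementation; inputs are (log odds ratio, variance) pairs, and the helper for 2×2 counts assumes no zero cells.

```python
import math


def log_or_and_variance(r1, n1, r0, n0):
    """Log odds ratio and its variance from events/patients in two arms.

    Assumes all four cells of the 2x2 table are non-zero.
    """
    a, b, c, d = r1, n1 - r1, r0, n0 - r0
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def i_squared(effects):
    """I-squared (as a percentage) from a list of (estimate, variance) pairs."""
    weights = [1 / variance for _, variance in effects]
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100
```

For example, two studies with estimates 0.0 and 2.0 and unit variances give Q = 2 on one degree of freedom, so I² = 50%.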

Timepoints for Outcome Assessment
Given that each intervention of interest has a different recommended (per label) length of treatment and expected time to optimum patient outcomes, to apply a blanket "timepoint of interest" on a per outcome basis would have increased heterogeneity. Instead, the point at which each outcome was assessed was specific to each intervention. The following list contains references to the labeling data used to inform these decisions. Not all formulations/doses were available in all markets at the time of conducting the analyses.
The following timepoints were established prior to the conduct of the NMA:
• Tirbanibulin 1% (TIRBA1%): 57 days after start of treatment
• 5-fluorouracil 5% (5FU5%) and 5-fluorouracil 4% (5FU4%) [
For studies reporting outcomes at multiple timepoints, outcome data were selected to be as close as possible to these timepoints. A sensitivity analysis (see Section 2.5.4) investigated the impact of excluding studies assessing efficacy outcomes at very early timepoints, i.e., less than eight weeks after the end of treatment.
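Selecting the reported timepoint closest to a pre-specified target can be sketched as below. The tie-breaking rule (prefer the earlier timepoint when two are equally distant) is an assumption introduced for illustration; the text does not state how ties were handled. The reported days in the example are hypothetical.

```python
def closest_timepoint(reported_days, target_day):
    """Return the reported assessment day nearest to the target day.

    Ties are broken in favour of the earlier day (an illustrative assumption).
    """
    return min(reported_days, key=lambda day: (abs(day - target_day), day))


# Target for TIRBA1% as stated in the text: 57 days after start of treatment.
TIRBA_TARGET_DAY = 57
selected = closest_timepoint([29, 60, 85], TIRBA_TARGET_DAY)
```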

Sensitivity Analysis: Single Course Data Only
In the base case NMA, all studies were analyzed together regardless of the number of courses or sessions of treatment assessed. To increase homogeneity across studies, and comparability with trials of TIRBA1%, we ran a sensitivity analysis including only studies assessing one course or session of treatment. The results were compared with those of the base case analysis (including all studies) to identify differences that may have been due to increased heterogeneity in the base case analysis.

Sensitivity Analysis: ≤25 cm² Assessment Area Only
Labeling for TIRBA1% [20] suggests treatment across an area of maximum 25 cm², reflecting the design of the included TIRBA1% trials. In order to increase homogeneity of the evidence base with the two studies of TIRBA1%, a sensitivity analysis of the outcome complete clearance was conducted to include only studies in which the skin area assessed is ≤25 cm².

Sensitivity Analysis: Single Placebo Node
The analyses reported in this paper were conducted with two placebo nodes: topical placebo/vehicle (PLAC_TOP) and placebo PDT (PLAC_PDT) (see Supplementary Section 2 for more details). Approaches differ in the existing literature. Our methodology is aligned to that of Ezzedine et al. 2021 [47], but Vegter and Tolley 2014 [21] and Gupta 2013 [48] both merged the two placebo nodes to obtain a more compact network. We assessed the impact of these different approaches by conducting a sensitivity analysis of complete clearance in which all placebo/vehicle treatments were pooled into one single node (PLAC).
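Pooling the two placebo nodes amounts to relabelling arms before the network is built. A minimal sketch using the node names from the text; the example study is hypothetical.

```python
# Node names are taken from the text; the mapping collapses both into PLAC.
PLACEBO_MERGE = {"PLAC_TOP": "PLAC", "PLAC_PDT": "PLAC"}


def pool_placebo_nodes(arms):
    """Relabel placebo arms to the single PLAC node, leaving others unchanged."""
    return [PLACEBO_MERGE.get(treatment, treatment) for treatment in arms]


# Hypothetical three-arm study with an active PDT arm and both placebo types.
pooled = pool_placebo_nodes(["MAL_PDT", "PLAC_PDT", "PLAC_TOP"])
```

After relabelling, a study that contained both placebo types has two arms on the same node, so a real implementation would also need a rule for combining or dropping duplicate arms; that step is not shown here.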

Sensitivity Analysis: Studies Assessing Outcomes at ≥8 Weeks after Treatment
A sensitivity analysis was conducted to include only studies that assessed efficacy at eight weeks or more after the end of treatment. This analysis was designed to exclude studies assessing efficacy prior to complete epidermal regeneration of the treated area. The regeneration cycle of the epidermis usually ranges from 35–45 days [49], but is subject to environmental differences [50]. In an elderly population, and those with chronically sun-damaged skin, we took eight weeks to be a realistic yet conservative assumption of regeneration time, before which earlier assessment of response could lead to over- or under-estimation of the clearance, as new or residual lesions may be masked by local skin reactions in the treatment area.
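The three study-level restrictions used in the sensitivity analyses (single course, assessment area ≤25 cm², and assessment ≥8 weeks after the end of treatment) can each be expressed as a filter over study characteristics. The sketch below uses a hypothetical `Study` record and made-up studies; field names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Study:
    name: str
    courses: int                 # courses or sessions of treatment assessed
    area_cm2: float              # size of the assessed skin area
    weeks_post_treatment: float  # end of treatment to efficacy assessment


FILTERS = {
    "single_course": lambda s: s.courses == 1,
    "area_le_25cm2": lambda s: s.area_cm2 <= 25,
    "ge_8_weeks": lambda s: s.weeks_post_treatment >= 8,
}

studies = [
    Study("hypothetical_A", courses=1, area_cm2=25.0, weeks_post_treatment=8.0),
    Study("hypothetical_B", courses=2, area_cm2=25.0, weeks_post_treatment=12.0),
    Study("hypothetical_C", courses=1, area_cm2=50.0, weeks_post_treatment=4.0),
]

# One subset of study names per sensitivity analysis.
subsets = {
    label: [s.name for s in studies if keep(s)] for label, keep in FILTERS.items()
}
```

Each filter is applied separately to the full set of studies, mirroring the way each sensitivity analysis is compared with the base case on its own.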

Results of the Literature Searches and Screening
Searches were conducted between 24 June 2020 and 2 September 2020 and identified 4129 records. Following deduplication, 2712 records were assessed for relevance. A total of 145 documents were excluded at full text screening (see Supplementary Table S2). The PRISMA flow diagram is shown in Figure 1, with explanatory details of exclusion reasons in Supplementary Section S1.3.
A total of 46 studies reported in 86 documents were included in the systematic review and are listed in Supplementary Table S3. Six additional papers were included but not data extracted (see Supplementary Table S4); one of these papers was not published in English and the remaining five papers reported pooled results of studies for which disaggregated data were already available.

Results of the Feasibility Assessment
Key characteristics of the 46 studies included in the review are presented in Supplementary Table S5.
Following the qualitative assessment of similarity, 6 of the 46 studies were deemed to be unsuitable for inclusion in the NMA (see Supplementary Section 2.1). One study [94] did not report the number of patients assessed per arm, meaning that insufficient data were available for the NMA. The other five studies [58,76,88,90,93] were unsuitable as the treatment dose, length or schedule assessed differed from both the US and European labels and was insufficiently similar to other included studies assessing the same treatments (see Supplementary Table S5 for further details).
Due to differences in the US and EU labeling for IMQ5%, studies assessing this intervention were split into those assessing the US (treatment for 16 weeks [96]: IMQ5%_USA) and European (one or two treatment periods of 4 weeks [44]: IMQ5%_EU) schedules. Of the 40 studies remaining, a further five studies [55–57,80,83] were not relevant to a European perspective, as they assessed the US posology of IMQ5%. Only studies assessing IMQ5%_EU contributed to the analysis of the Europe perspective. Thirty-five studies (Supplementary Table S6) were therefore eligible for inclusion in the analyses. Not all the 35 studies included in the qualitative analyses and NMA reported data for all outcomes; therefore, some interventions are not represented in some analyses.
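The study flow through the feasibility assessment can be checked with simple arithmetic, using only the counts reported in the text above.

```python
included_in_review = 46
# Six studies deemed unsuitable: one with missing per-arm patient numbers
# and five whose dose, length or schedule differed from both labels.
unsuitable_for_nma = 1 + 5
# Five studies assessed the US posology of IMQ5% only.
us_posology_only = 5

remaining_after_feasibility = included_in_review - unsuitable_for_nma
eligible_for_europe = remaining_after_feasibility - us_posology_only
```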

Qualitative Synthesis: Europe
Following the feasibility assessment, outcomes were summarized through a qualitative synthesis, with the exception of complete clearance for which quantitative synthesis with NMA was feasible. A summary is reported below and full results for outcomes other than complete clearance can be found in Supplementary Section S3.

Lesion Count Reduction
No networks assessing this outcome were possible due to insufficient reporting of data by the included studies. Data for the qualitative analysis of lesion count reduction were available for TIRBA1%, MAL_PDT, IM0.015%, DICLO3%, 5FU4%, 5FU5%, 5FU0.5% + SA, ALA_PDT and IMQ3.75%. Definitions of lesion count reduction were not well reported, and it was not always clear whether the included studies reported a mean or median percentage reduction. Further information can be found in Supplementary Section S3.1.1.

Discontinuation due to AEs or LSRs
Rates of discontinuation due to TEAEs, TRAEs, local AEs, or LSRs are detailed in Table 2.

Incidence of Severe LSRs
Data on the incidence of at least one severe LSR following treatment were available for TIRBA1%, DICLO3%, 5FU0.5% + SA, ALA_PDT, IMQ3.75% and IMQ5%_EU. No data on the incidence of specific severe LSRs were available for CRYO, IM0.015%, MAL_PDT, or 5FU (4% or 5%). Data are presented in Table 3 and additional narrative description can be found in Supplementary Section S3.1.3.

Base Case Analysis
The base case analysis included all eligible studies relevant to the European perspective, assessing any number of courses of treatment, and reporting the outcome complete clearance. The network diagram and ORs for the base case analysis are presented in Figure 3, with data displayed in Table 4. Details of posology, duration and number of cycles assessed are presented for all studies in Supplementary Table S5. In the base case, all active treatments were associated with higher odds of complete clearance than topical placebo/vehicle (5FU0.5% + SA 7. 5FU5% and 5FU4% had higher ORs than other active treatments; however, their credible intervals were wide and overlapped with those of other treatments. DICLO3% had the lowest OR of the active treatments. ORs were otherwise comparable for TIRBA1%, MAL_PDT, IMQ5%_EU, IMQ3.75%, IM0.015%, CRYO, ALA_PDT, 5FU5%, 5FU4% and 5FU0.5% + SA, with overlapping credible intervals.
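The overlap of credible intervals can be checked directly from the ORs reported in the abstract. The sketch below maps the abstract's treatment names to the abbreviations used in this section (an assumption, in particular that imiquimod 5% corresponds to IMQ5%_EU). Overlap of intervals against placebo/vehicle is only a descriptive check and is not a formal pairwise comparison between active treatments.

```python
# (OR, lower 95% CrI, upper 95% CrI) versus topical placebo/vehicle,
# as reported in the abstract.
BASE_CASE = {
    "CRYO": (13.4, 6.2, 30.3),
    "DICLO3%": (2.9, 1.9, 4.3),
    "5FU0.5% + SA": (7.6, 4.6, 13.5),
    "5FU4%": (30.3, 9.1, 144.7),
    "5FU5%": (35.0, 10.2, 164.4),
    "IMQ3.75%": (8.5, 3.5, 22.4),
    "IMQ5%_EU": (17.9, 9.1, 36.6),
    "IM0.015%": (12.5, 8.1, 19.9),
    "ALA_PDT": (24.1, 10.9, 52.8),
    "MAL_PDT": (11.7, 6.0, 21.9),
    "TIRBA1%": (11.1, 6.2, 20.9),
}


def intervals_overlap(a, b):
    """True when the credible intervals of two (OR, lower, upper) tuples overlap."""
    return a[1] <= b[2] and b[1] <= a[2]


def overlapping_with(reference):
    """Treatments whose interval overlaps that of the reference treatment."""
    return [
        name
        for name, estimate in BASE_CASE.items()
        if name != reference and intervals_overlap(estimate, BASE_CASE[reference])
    ]


overlap_tirba = overlapping_with("TIRBA1%")
overlap_diclo = overlapping_with("DICLO3%")
```

With these values, every active treatment except DICLO3% has an interval overlapping that of TIRBA1%, and the DICLO3% interval overlaps only that of IMQ3.75%.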

Sensitivity Analysis: Single Course Data Only
Application of a single-course-of-treatment-only filter resulted in the exclusion of ALA_PDT, CRYO, IMQ3.75%, IMQ5% and MAL_PDT. Otherwise, the results of the analysis based on a subset of studies reporting complete clearance after a single course of treatment were consistent with those from the base case analysis (see Supplementary Figure S2). All active treatments were associated with higher odds of complete clearance than topical placebo/vehicle (5FU0.5% + SA 6.

Sensitivity Analysis: ≤25 cm² Assessment Area Only
Application of a filter to include only studies assessing a treatment area of ≤25 cm² resulted in the exclusion of ALA_PDT, IMQ3.75%, 5FU4% and 5FU5%. Results of the complete clearance analysis based on this subset of studies were broadly consistent with the base case analysis (see Supplementary Figure S3). The following active treatments were associated with higher odds of complete clearance than topical placebo/vehicle: 5FU0.5% + SA 6.

Sensitivity Analysis: Single Placebo Node
A sensitivity analysis of complete clearance was conducted whereby the two placebo nodes in the base case network (PLAC_TOP and PLAC_PDT) were pooled into a single placebo node (PLAC); the results were consistent with those of the base case analysis (see Supplementary Figure S4). This suggests that the equivalency of placebos is an acceptable assumption. All active treatments were associated with higher odds of complete clearance than placebo/vehicle (5FU0.5% + SA 7.

Sensitivity Analysis: Studies Assessing Efficacy ≥8 Weeks after Treatment Only
This sensitivity analysis of complete clearance included only studies assessing efficacy ≥8 weeks after the end of treatment: 5FU4% and 5FU5% were both excluded from this sensitivity analysis as the single study of 5FU4% or 5% relevant to the European perspective [63] assessed complete clearance at four weeks following the end of treatment. In this sensitivity analysis, the following active treatments were associated with higher odds of complete clearance than topical placebo/vehicle: 5FU0.5% + SA 6.

Discussion
This systematic review and NMA provides a comprehensive assessment of the comparative efficacy and safety of existing treatments for AK in Europe, including the novel treatment TIRBA1%.
In the base case analysis, the active treatments demonstrated higher odds of complete clearance than topical placebo/vehicle. TIRBA1% appeared superior to DICLO3% and similarly efficacious to MAL_PDT, IMQ5%_EU, IMQ3.75%, IM0.015%, CRYO, ALA_PDT, 5FU5%, 5FU4% and 5FU0.5% + SA. The same pattern of treatment effects over placebo can be seen in the sensitivity analysis including only studies assessing after a single treatment cycle, or in the sensitivity analysis assessing a treatment area of ≤25 cm², designed to more closely reflect the labeled posology for TIRBA1% [20,97].
Other recent NMAs of treatments for AK found that ALA_PDT [21,98], IMQ5% [21], 5FU0.5% + SA [21,47,48], and 5FU [47,48,99] were associated with the highest probability of achieving clearance. Interestingly, a recent NMA of long-term efficacy [22] concluded that at 12-month follow-up, 5FU5% "did not show significant long-term efficacy over placebo/vehicle for participant complete . . . clearance". It is possible that 5FU efficacy varies with time, with a stronger effect sooner after treatment; this may be a confounder of the analysis. The study forming the evidence body for 5FU4% and 5FU5% [63] had, however, a high risk of bias, with incomplete reporting of blinding and no reporting of the main efficacy outcome (complete clearance) for the two placebo arms.
In the base case analysis of complete clearance, the OR for DICLO3% was lower than the other active comparators, with credible intervals overlapping with IMQ3.75% only. One factor which may have contributed to this result is the hyaluronic acid gel vehicle used in some trials of DICLO3%, which may have some efficacy of its own and lead to comparatively high response in the placebo/vehicle arm. Thus, the relative efficacy of DICLO3% may be underestimated for this reason.
In the sensitivity analysis including only studies assessing a treatment area of ≤25 cm², all active treatments showed consistently overlapping credible intervals with TIRBA1% (except DICLO3%, which showed a very small overlap in one sensitivity analysis only). The consistency of these results suggests that the comparative efficacy of treatments may be independent of the size of the treatment field. We note that, since conducting these analyses, IM0.015% has been withdrawn from use in the EU and US [26,100].
The efficacy outcome lesion count reduction was not reported in sufficient detail to allow a quantitative analysis. Any amount of treatment was associated with lesion count reductions, both within the intervention arms and when compared with placebo/vehicle. However, the comparisons with placebo/vehicle were sparse and the evidence was not robust (as also concluded by Steeb et al., who stated that "The mean reduction of lesions and occurrence of adverse events was poorly reported" [98]).
Like Steeb et al. 2021 [98], the current review also found reporting of data on the relative safety profiles of the interventions assessed to be inconsistent and sparse (particularly in the case of severe LSRs). However, the safety profile of TIRBA1% showed low rates of severe LSRs (<11% for any given LSR [53,54]) while IMQ5%_EU and IMQ3.75% showed relatively high rates of some severe LSRs with 31% [61] and 25% [59] of patients experiencing severe redness and erythema, respectively. No data on the incidence of any specific severe LSR were found through this review for CRYO, IM0.015%, MAL_PDT, or 5FU (4% or 5%). Overall, the lack of reporting of severe LSRs is a major weakness of the evidence base, especially given the clinical significance of these outcomes for treatment selection, which should be addressed in future work.
Reported rates of discontinuation due to AEs were low across all interventions assessed. In both the TIRBA1% trials, 0% rates of discontinuation due to TRAEs, TEAEs, local AEs or LSRs were reported in the active (353 patients) and placebo (349 patients) arms, suggesting very high treatment adherence and tolerability. Length of treatment regimen has an impact on patient experience, and it is possible that single course treatments offering similar efficacy to multiple course treatments may have an advantage in terms of cost and patient convenience [101,102]. TIRBA1% therefore offers potential advantages over other existing treatments, particularly given its short duration of treatment (once-daily application for five days), which may be of particular interest for patients with limited adherence or limited capacity for self-application, a therapeutic benefit acknowledged by the Scottish Medicines Consortium [103].

Limitations and Assumptions
Although the feasibility assessment identified which outcomes were suitable for the NMA, some remaining limitations due to heterogeneity of the studies need to be acknowledged. Sensitivity analyses were used to assess the impact of the possible main sources of heterogeneity.
Only four of the 46 studies (9%) included in the review were considered to have an overall low risk of bias [51–54], and in many of the included studies, it was not possible to truly blind the patients and/or study personnel involved in administering the intervention. While this was unavoidable for those trials assessing interventions requiring different methods of administration, it introduced heterogeneity in the methods across the included studies and may have impacted the relative results obtained for the treatment comparisons informed by open-label studies.
The RCTs included in this review were characterized by substantial differences in their designs. Of the studies contributing to the analysis of complete clearance (Table 1), 16 RCTs were designed to evaluate strictly one course of treatment (independently of the recovery of the patient), while another 15 RCTs presented a design that allowed for multiple courses or sessions of treatment. Within this latter group of studies, some designs were more comparable to a real-life situation. Patients were assessed by the clinician after a course of treatment and could be prescribed additional courses of treatment, as deemed appropriate for the individual patient. This design introduced a high variability in the scheduling of the drug administration both within and between studies. Previous NMAs of AK on the face and scalp also incorporated all the courses of treatment in the main analysis but acknowledged that heterogeneity was introduced with this approach [21]. We assessed the effects of this with a sensitivity analysis (Section 3.4.2 and Supplementary Figure S2) including only RCTs that evaluated one course of treatment, the results of which reflected those of the base case analysis.
Across the 35 trials relevant to a European perspective, there were 29 placebo/vehicle arms. These placebo/vehicle arms were different in their formulations (as creams, gels, ointments, patches, and placebo PDT were used) and schedules of administration. We retained topical placebo/vehicle as a separate node to placebo PDT (with cream or patch sensitizer) but assumed equivalency of all topical placebos/vehicles. This assumption was consistent with previous NMAs of treatment for AK of the face or scalp [21,22,47,48]. Another earlier NMA [48], based on a Cochrane review, assumed equivalence of all topical placebos/vehicles and placebo PDT. In order to assess the effect of assuming placebo equivalence, we performed a sensitivity analysis (see Section 3.4.4 and Supplementary Figure S4). Results suggested that this assumption was reasonable and should not represent a major limitation.
One potential source of heterogeneity in NMAs in this clinical area is the variety of timepoints at which outcomes are reported. To mitigate this potential risk, clinical input and regulatory labels were used to define a set of common timepoints of outcome reporting for each treatment. Sensitivity analysis including only studies assessing efficacy ≥8 weeks after end of treatment (Section 3.4.5) excluded studies that could be confounded by the ongoing regeneration cycle of the epidermis and confirmed that the results of the base case are robust to this source of heterogeneity for all interventions except for IMQ5%_EU (5FU4% and 5FU5% were excluded by the analysis having an earlier timepoint of outcome assessment).
Safety outcomes, including the incidence of severe LSRs, were generally inconsistently reported and it is difficult to draw conclusions from the available data regarding the relative safety profiles of the AK treatments assessed by this review. Incidence of severe LSRs is likely to have an impact on patient tolerability and treatment satisfaction, and better reporting of the incidence of severe LSRs in future studies of treatments for AK would allow more accurate conclusions to be drawn.
Mechanical pretreatments, such as curettage, are hard to assess in an NMA because of poor reporting but should be kept in mind as a possible source of clinical heterogeneity. In trials allowing prior mild curettage this may confound assessment of the efficacy of interventions in both active and placebo/vehicle arms. As suggested by Vegter and Tolley [21], "curettage may have caused an underestimation of the true effectiveness of the active treatments in these studies".
All the above-mentioned factors contributed to heterogeneity in the sample. In the quantitative assessment of between-study heterogeneity, it was found to be generally low, but was still a potential concern in the evidence that informed some treatment comparisons. Random effects models were used to account for these differences between studies, and weakly informative prior distributions were used to estimate the between-study variance. The consistency of results between the base case and sensitivity analyses suggests that the findings obtained are robust to the heterogeneity present in our sample.

Conclusions
All active treatments commonly used in Europe for the treatment of AK demonstrated higher odds of complete clearance than topical placebo/vehicle, though credible intervals were generally wide and were overlapping for comparisons across most treatments. We identified some concerns on the robustness of the evidence for some interventions, such as 5FU5% and 4%. The results should be interpreted with caution considering the wide credible intervals due to limited availability of data in the networks. In the qualitative assessment of safety outcomes, TIRBA1% showed low rates of severe LSRs while IMQ5%_EU and IMQ3.75% showed relatively high rates of some severe LSRs. No data on the incidence of severe LSRs were found for CRYO, IM0.015%, MAL_PDT, or 5FU (4% or 5%). TIRBA1% showed 0% rates of discontinuation due to TRAEs, TEAEs, local AEs or LSRs across all patients treated in both included studies, indicating very high treatment adherence.
Tirbanibulin 1% ointment provides clinicians with a novel field-directed treatment option in management of patients with AK of the face and scalp, with a single short (5-day) treatment period, once daily application, good safety profile and efficacy comparable with existing topical treatments.
Funding: This review and manuscript were commissioned and funded by Almirall and no public grant funding was received.
Institutional Review Board Statement: No Institutional Review Board Statement or ethical approval were required for this systematic review and network meta-analysis as it utilized data from published studies and clinical study reports.
Informed Consent Statement: Not applicable; systematic review.
Data Availability Statement: Data supporting the quantitative analysis of complete clearance can be found in Table 4. Data supporting the qualitative analyses of additional outcomes are described in Tables 2 and 3, and the Supplementary Material.