1. Introduction
According to the World Health Organization (WHO), both overweight and obesity are characterized by excessive body fat deposits, with obesity recognized as a chronic, complex condition detrimental to health [1]. The current prevalence of these conditions is alarming, with 43% of adults worldwide being overweight as of 2022 [1]. This issue is particularly severe in the UK. According to the Health Survey for England 2022, 64% of adults in England are living with obesity or overweight. Similar trends are observed across Scotland, Wales and Northern Ireland. The 2023 Scottish Health Survey reported the highest obesity rate in the country’s history, reaching 32%; in Wales, 65% of men and 57% of women reported being obese or overweight in the 2022/23 national survey; and the Health Survey Northern Ireland 2023/24 found 69% of men and 59.5% of women to be in the same category [2]. Overweight is the second leading risk factor contributing to poor health and death, linked to 30 diseases, including cancer, cardiovascular diseases, and diabetes [2]. Various stakeholders argue that overweight and obesity rates have reached epidemic proportions and represent one of the greatest global public health challenges [3,4,5].
Given the chronic and multifactorial nature (biological, genetic, hormonal, sociocultural, and behavioral) of obesity, it has become clear that its treatment requires a long-term comprehensive approach [
6,
7,
8]. In recent years, glucagon-like peptide-1 receptor agonists (GLP-1 RAs) such as semaglutide have emerged as promising weight loss medications. These medications mimic the effects of GLP-1, a hormone that enhances insulin secretion, regulates appetite, and slows gastric emptying, significantly promoting weight loss [
8]. However, global health institutions such as the WHO and the UK National Institute for Health and Care Excellence (NICE) advise that these medications should only ever be taken as a supplement to comprehensive lifestyle interventions under the guidance of a multidisciplinary team [
9].
The advent of GLP-1 RAs as weight-loss supplements has coincided with the rapid growth of digital health solutions [
10]. Among these solutions are digital weight-loss services (DWLSs), which utilize mobile apps, web-based consultations, text messaging, and digital personal assistants to deliver care. A key benefit of DWLSs is that they mitigate geographic and temporal barriers to obesity care by removing the need to attend recurrent in-person clinical visits. Research has also found that patients are more comfortable discussing their overweight and obesity in digital settings [
11,
12].
While DWLSs improve access to care, their long-term effectiveness hinges on sustained digital engagement and behavioral adherence to in-app tools, rather than just medication uptake. Numerous studies within digital weight management have established that frequent interaction with the platform—such as regular weight tracking, goal logging, and message exchange with coaches—is strongly and positively correlated with superior weight loss outcomes [
13,
14]. This active, ongoing engagement is vital as it directly facilitates the recommended lifestyle interventions that underpin successful GLP-1 RA therapy, effectively translating the passive receipt of medication into an adherent behavioral strategy for chronic disease management [
15,
16].
Some DWLSs, such as Juniper health, a large provider operating in the UK, Australia, Germany and Japan, adhere to the NICE advice to underpin pharmacotherapy with behavioral therapy and multidisciplinary guidance [17]. Early evidence suggests that these services can deliver safe and effective care [18,19]. However, several questions remain unanswered around the intersection of adherence and effectiveness in unsubsidized real-world DWLSs [20]. Previous studies of Juniper programs have reported an average adherence duration of 171.2 days in Australia [17] and 183.1 days in the UK [21], and found program cost, medication availability, and dissatisfaction with outcomes to be the three most common discontinuation reasons [17]. However, studies of the Juniper DWLS to date have not only reported effectiveness outcomes over periods shorter than these average adherence periods, but have also failed to assess the extent to which program adherence impacts effectiveness. Markers such as medication schedule adherence, program pause duration and weight tracker engagement could feasibly have a significant effect on weight loss outcomes.
Given the inherent variability and non-randomized nature of observational studies, patients inevitably deviate from the suggested treatment pathway. To rigorously and transparently quantify the impact of such deviations, we frame our research question using the estimand framework defined in the International Council for Harmonisation (ICH) E9(R1) addendum on estimands [22]. This approach allows for the precise definition of two treatment effects: the efficacy estimand (representing efficacy under ideal adherence) and the treatment estimand (representing effectiveness of the strategy in the real world, including non-adherence and dropout).
In line with this framework, this study aims to retrospectively assess 12-month weight loss and adherence in a cohort of patients from the Juniper UK DWLS. Outcomes are reported for both estimands: the efficacy estimand (efficacy under ideal adherence) and the treatment estimand (effectiveness in a real-world context, including non-adherence). The study compares the outcomes of the ideal adherence cohort with those of a modified full cohort, i.e., one that includes patients who deviated from the suggested clinical pathway or discontinued early. The study also examines how weight loss is affected by demographic factors, program pauses, weight tracker engagement levels, and semaglutide brand (Wegovy and Ozempic).
2. Materials and Methods
2.1. Study Design
This investigation applied a retrospective cohort study design to analyze the trajectory of all patients who initiated the Juniper UK DWLS between 1 January 2023 and 1 May 2024. All patients were prescribed semaglutide by Juniper clinicians following NICE guidelines on eligibility and dosing and were further supported by a multidisciplinary care team (MDT). All data were retrieved from the Juniper central data repository on Metabase, an open-source business intelligence tool. The Just Reasonable Independent Research Ethics Committee approved the study on 11 August 2025 (IREC015). Investigators followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement guidelines throughout every phase of the study [23].
2.2. Program Overview
Prospective patients of the Juniper UK DWLS complete a pre-consultation questionnaire that a pharmacist independent prescriber uses to determine their eligibility for the service. These questionnaires contain up to 100 questions and are often supported with medical imaging, pathology results, and/or reports from previous clinicians, should prescribing practitioners request them. All patients are required to submit 2 photos (1 front-on view; 1 side-on view) of themselves in ‘activewear or swimwear’ where their abdomen, face and knees are visible.
Eligibility decisions are primarily based on the product information documents for semaglutide brands (Wegovy and Ozempic), which detail body mass index (BMI) ranges, contraindications, and drug interactions. Inclusion criteria included a BMI of ≥30 kg/m² for the general population. However, reflecting clinical guidelines that account for heterogeneous disease risk, the threshold was lowered to ≥27 kg/m² for patients with at least one weight-related comorbidity (e.g., symptomatic cardiovascular disease, disruptive sleep apnoea) or patients of non-Caucasian ethnicity. The lower BMI threshold for non-Caucasian individuals is justified by evidence indicating that certain ethnic groups (e.g., South Asian populations) have an elevated risk for cardiometabolic complications at lower BMI values [24]. Key exclusion criteria included the following contraindications: multiple endocrine neoplasia syndrome type 2, a personal or family history of medullary thyroid cancer, acute gallbladder disease, acute pancreatitis, hypoglycaemia, a severe mental health condition, known hypersensitivity to semaglutide or any of the product’s components, type 1 or type 2 diabetes, and current or planned pregnancy.
All prescribing decisions are automatically uploaded to the Juniper clinical auditing repository on Jira, which uses data analytics to alert auditors whenever an incorrect decision or safety error has been committed. Patients of the Juniper UK DWLS are allocated a multidisciplinary team (MDT), which includes an independent pharmacist prescriber, a university-qualified health coach, a pharmacist, and a medical support officer. Juniper UK patients communicate with their MDT through the program’s mobile phone app to facilitate care coordination and data management (via the app’s link to the Juniper central data repository).
The Juniper UK DWLS has only ever provided medication-supported therapy, i.e., it has never offered standalone lifestyle or GLP-1 RA/dual GIP/GLP-1 RA treatment. Upon payment of their first monthly subscription fee, Juniper patients receive a month’s supply of medication and access to personalized lifestyle coaching. Coaching is delivered via the Juniper app and includes access to multimodal educational materials, progress trackers, and meal and exercise plans. Patients can request changes to the diet and exercise plans at any stage of their care journey. Patients continue to receive medication at monthly intervals up until (and including) month 4 unless they cancel payment and/or request to opt out of the program. Prescribing practitioners are required to assess patients at the 5-month point of their care journey. Patients do not receive their 5th medication order and ongoing lifestyle coaching unless they are approved by their prescribing practitioner to continue the Juniper program at their 5-month follow-up consultation. While patients are encouraged by their MDT to upload weight data to the app on a fortnightly basis, the follow-up consultation represents the first moment since program initiation when weight data are a requirement for continuing the program. Patients are instructed to report side effects and associated severity levels to their MDT whenever they arise. Side effect severity was determined by patients, who were given the following matrix to guide their assessment:
Mild: Side effect is tolerable and easy to manage
Moderate: Side effect is noticeable but ok to manage
Severe: Side effect is uncomfortable and hard to manage
2.3. Medication Titration Schedule
All Juniper prescribing practitioners follow the titration guidelines provided by the relevant medication supplier. These are as follows:
Ozempic (0.25 mg once weekly for weeks 1–4; 0.5 mg for weeks 5–8; 1 mg for weeks 9–12; and 2 mg from week 13 onwards)
Wegovy (0.25 mg once weekly for weeks 1–4; 0.5 mg for weeks 5–8; 1 mg for weeks 9–12; 1.7 mg for weeks 13–16; and 2.4 mg from week 17 onwards)
The choice between Wegovy and Ozempic was primarily determined by product availability at the time of prescribing. When both products were available, patient preference served as the assignment criterion. Neither medication was prescribed off-label. Prescribing clinicians use their professional judgment to determine whether titration schedules should be delayed or lowered in certain cases (e.g., if a patient reports adverse events or indicates that they have missed a dose).
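The two titration schedules listed above can be expressed as a small lookup table. The following is a minimal sketch only: the names `TITRATION` and `scheduled_dose` are hypothetical, and the function returns the nominal scheduled dose, ignoring any delays or dose reductions that a prescribing clinician may apply.

```python
# Weekly doses (mg) keyed by the first week at which each dose applies,
# transcribed from the supplier schedules listed above.
TITRATION = {
    "Ozempic": [(1, 0.25), (5, 0.5), (9, 1.0), (13, 2.0)],
    "Wegovy": [(1, 0.25), (5, 0.5), (9, 1.0), (13, 1.7), (17, 2.4)],
}


def scheduled_dose(brand: str, week: int) -> float:
    """Return the nominal weekly dose (mg) for a 1-indexed treatment week."""
    if week < 1:
        raise ValueError("week must be >= 1")
    dose = TITRATION[brand][0][1]
    for start_week, step_dose in TITRATION[brand]:
        # Keep the dose of the latest step that has already started.
        if week >= start_week:
            dose = step_dose
    return dose
```

Under this sketch, the two brands share the same nominal dose up to week 12 and diverge from week 13 onwards.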
2.4. Program Cost
Throughout the study period, monthly fees for the Juniper UK DWLS were determined by a patient’s semaglutide (Wegovy and Ozempic) dose. For doses between 0.25 mg and 1 mg, patients paid 189 British Pounds. For doses between 1.7 mg and 2 mg, patients paid 249 British Pounds; and for doses above 2 mg, patients paid 299 British Pounds.
2.5. Endpoints
To generate meaningful and robust findings, patient outcomes were assessed based on two primary estimands, defined in accordance with the International Council for Harmonisation (ICH) E9(R1) framework:
Efficacy estimand: This estimand reflects the effectiveness of the treatment under the hypothetical condition of full protocol adherence. It corresponded to patients who received between 8 and 15 medication orders, reported weight measurements within a 12-month post-initiation assessment window (341–379 days), and did not pause treatment for any longer than 90 days.
Treatment (Intention-to-Treat) estimand: This estimand reflects the treatment strategy’s effectiveness in a real-world setting. It included all patients in the efficacy estimand plus those who demonstrated limited adherence, i.e., those who received fewer than 8 or more than 15 orders, paused for longer than 90 days, or received between 8 and 15 orders but did not track weight within the 12-month post-initiation assessment window (341–379 days).
The efficacy estimand cohort was selected to reflect optimal effectiveness under conditions of sustained adherence, which required exclusion criteria that mitigate the impact of major intercurrent events. Patients were required to have received between 8 and 15 medication orders over the 12-month period. The minimum of 8 orders ensured adequate treatment exposure, while the maximum of 15 orders accounted for logistical variations such as pre-ordering for travel, without including excessive ordering that would suggest protocol deviation. A single or cumulative pause in medication supply exceeding 90 days resulted in exclusion from this estimand. This 90-day cutoff serves as an established determinant of non-adherence over a 12-month assessment period in real-world studies [25], and was necessary to minimize the confounding risk associated with prolonged periods of non-exposure to the study medication.
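The cohort rules described above can be sketched as a simple classification function. This is an illustrative sketch rather than the study's actual data pipeline: the function name, argument names, and category labels are hypothetical, and the order in which the criteria are checked (orders, then weight window, then pause) is an assumption.

```python
from typing import Iterable


def classify_patient(orders: int, pause_days: int, weigh_in_days: Iterable[int]) -> str:
    """Assign a patient to the efficacy estimand or to a limited-adherence category.

    orders        -- medication orders received over the 12-month period
    pause_days    -- longest single or cumulative pause in medication supply (days)
    weigh_in_days -- days since initiation on which a weight was recorded
    """
    # Order count must fall between 8 and 15 inclusive.
    if orders < 8 or orders > 15:
        return "limited_adherence_orders"
    # At least one weight entry must fall in the 341-379 day window.
    if not any(341 <= day <= 379 for day in weigh_in_days):
        return "no_12_month_weight"
    # Pauses of more than 90 days exclude the patient from the efficacy estimand.
    if pause_days > 90:
        return "paused_over_90_days"
    return "efficacy"
```

Every category returned here remains part of the treatment estimand; only the `"efficacy"` category enters the efficacy estimand.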
Primary endpoints were mean weight loss and the proportion of patients who reached ≥5%, ≥10%, and ≥15% weight loss milestones in the efficacy estimand. Secondary endpoints included these measures in the treatment estimand (all 3 categories combined), along with side effect incidence in both efficacy and treatment estimands. Side effect incidence was assessed as a binary variable (yes/no), recording whether a patient experienced any side effects during the 12-month program. Investigators also analyzed the distribution of side effect severity based on each patient’s single highest severity rating reported during that time. Missing 12-month weight data in the treatment estimand were imputed using the Baseline Observation Carried Forward (BOCF) method. Correlation metrics included the effect of total semaglutide orders, pause incidence (yes/no), pause duration, and side effect incidence (yes/no) on weight loss. Key descriptive data were the mean time from treatment commencement to initial side effects and the proportion of patients who paused treatment. While the Juniper platform includes coaching messages and meal logging, weight tracking frequency was selected as the primary engagement metric for this analysis because it provided the most consistent and quantifiable longitudinal data across the entire cohort. Further analyses explored the impact of demographic factors on weight loss outcomes and side effect incidence.
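As an illustration of the BOCF rule mentioned above, a missing 12-month weight is replaced by the baseline weight, which yields a weight loss of 0% for that patient. The sketch below uses a hypothetical function name and assumes weights in kilograms; it is not taken from the study's analysis code.

```python
from typing import Optional


def weight_loss_pct_bocf(baseline_kg: float, followup_kg: Optional[float]) -> float:
    """Percentage weight loss, carrying the baseline forward when follow-up is missing."""
    if followup_kg is None:
        # BOCF: treat the missing follow-up weight as equal to baseline.
        followup_kg = baseline_kg
    return 100.0 * (baseline_kg - followup_kg) / baseline_kg


def reached_milestone(loss_pct: float, threshold_pct: float) -> bool:
    """True when the patient lost at least the given percentage of baseline weight."""
    return loss_pct >= threshold_pct
```

Because every imputed patient contributes 0% weight loss, BOCF pulls the cohort-level summary towards zero, which is why it is described as a conservative approach.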
2.6. Statistical Analysis
Normality of the data distributions was evaluated using quantile-quantile plots and Shapiro–Wilk tests. Given the significant departures from normality (p < 0.05) observed in the Shapiro–Wilk tests, non-parametric methods were declared a priori as the primary approach for inference involving continuous outcomes. Means and standard deviations are retained for descriptive purposes only. The effect of continuous independent variables on weight loss percentage was assessed using Spearman’s rank correlation test. Categorical independent variables (excluding the medication comparison) were analyzed using the Mann–Whitney U test (for binary variables) or the Kruskal–Wallis test (for multi-level variables). The Mann–Whitney U test results are reported using the Wilcoxon rank-sum statistic (W), where W is the sum of the ranks for the smaller group. Post hoc pairwise comparisons following the Kruskal–Wallis test were corrected using the Holm-Bonferroni method to control the family-wise error rate (FWER). Associations between categorical variables, such as the proportion of patients achieving weight loss milestones (≥5%, ≥10%, ≥15%) versus medication brand or patient group, were compared using chi-square tests.
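The Holm-Bonferroni step-down correction named above can be sketched in a few lines. The study itself was run in R, so this Python version is only an illustration of the procedure; `holm_adjust` is a hypothetical helper name.

```python
from typing import List, Sequence


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Holm-Bonferroni adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted p-values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

A pairwise comparison is then declared significant when its adjusted p-value falls below the chosen FWER level (for example, 0.05).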
To robustly compare the effect of Wegovy versus Ozempic on 12-month weight loss while adjusting for potential confounding variables (e.g., adherence behaviors and patient selection), Propensity Score Matching (PSM) was employed. Propensity scores were estimated using logistic regression, modeling the probability of receiving Wegovy (the treatment) based on the following covariates: age, initial BMI, initial weight, ethnicity, and pause incidence (yes/no). All covariates used in the PSM model had complete data. Nearest neighbor matching (1:1) with a caliper of 0.2 of the logit standard deviation was used to create the final matched cohort. The primary comparison of weight loss percentage within the matched cohort was then conducted using the Mann–Whitney U test, and the magnitude of the adjusted difference was reported using Cohen’s d and the 95% Confidence Interval of the mean difference from a linear model. The d statistic is presented to provide a standardized effect size for the difference in means, complementing the primary inference drawn from the non-parametric test (Mann–Whitney U) on medians. All visualizations and statistical analyses were conducted using RStudio, version 2023.06.1+524 (RStudio: Integrated Development Environment for R, Boston, MA, USA).
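The matching step described above can be sketched as greedy 1:1 nearest-neighbor matching on the logit of the propensity score. This is a simplified illustration, not the study's R implementation: it assumes the propensity scores have already been estimated, it computes the caliper from the standard deviation of the logit across both groups combined, and it processes treated patients in input order. The function names are hypothetical.

```python
import math
import statistics
from typing import List, Sequence, Tuple


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def match_nearest_neighbor(
    treated_ps: Sequence[float],
    control_ps: Sequence[float],
    caliper_sd: float = 0.2,
) -> List[Tuple[int, int]]:
    """Greedy 1:1 matching without replacement; returns (treated_idx, control_idx) pairs."""
    t_logit = [logit(p) for p in treated_ps]
    c_logit = [logit(p) for p in control_ps]
    # Caliper width: a fraction of the SD of the logit (assumption: pooled over both groups).
    caliper = caliper_sd * statistics.stdev(t_logit + c_logit)
    available = set(range(len(c_logit)))
    pairs = []
    for t_idx, t_val in enumerate(t_logit):
        if not available:
            break
        # Closest remaining control on the logit scale.
        best = min(available, key=lambda c_idx: abs(c_logit[c_idx] - t_val))
        if abs(c_logit[best] - t_val) <= caliper:
            pairs.append((t_idx, best))
            available.remove(best)
        # Treated patients with no control inside the caliper stay unmatched.
    return pairs
```

Patients left unmatched by this procedure are excluded from the matched cohort, which is the mechanism by which matching reduces the analyzed sample.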
2.7. Sensitivity Analysis for Missing Outcome Data
To assess the robustness of the primary treatment estimand findings, which employed the conservative BOCF method, a prespecified sensitivity analysis was performed comparing the BOCF results to two alternative scenarios: Complete Case Analysis (CCA) and Multiple Imputation (MI). CCA included only those patients with observed (non-missing) 12-month weight loss data. MI was conducted under the assumption of Missing at Random (MAR) using the Multiple Imputation by Chained Equations (MICE) framework [
26]. Twenty imputed datasets were generated, and pooled results were calculated using Rubin’s rules [
27]. The imputation model included the primary outcome variable (12-month weight loss percentage) and the following prognostic factors and auxiliary variables: age, initial BMI, initial weight, ethnicity (binary), pause incidence (binary), product type, and total medication orders. The analysis compared the median weight loss percentage and the proportion achieving clinical milestones (≥5%) across the three scenarios (BOCF, CCA and MI) to quantify how conclusions change under different missing data assumptions. While BOCF was utilized as a conservative ‘floor’ for real-world effectiveness, MI was included to provide a more robust estimate by accounting for the potential weight loss trajectories of patients who discontinued treatment.
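The pooling step under Rubin's rules can be illustrated as follows. This is a minimal sketch with a hypothetical function name; it takes one point estimate and one variance per imputed dataset and returns the pooled estimate and its total variance.

```python
import statistics
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float], variances: Sequence[float]) -> Tuple[float, float]:
    """Pool per-imputation estimates and variances using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled_estimate = statistics.fmean(estimates)
    # Within-imputation variance: mean of the per-dataset variances.
    within = statistics.fmean(variances)
    # Between-imputation variance: sample variance of the estimates (divisor m - 1).
    between = statistics.variance(estimates)
    total_variance = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, total_variance
```

With twenty imputed datasets, as used here, the between-imputation component is inflated by a factor of 1 + 1/20 before being added to the within-imputation variance.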
2.8. Dose-Restricted Sensitivity Analysis
To assess the robustness of the medication comparison against potential confounding by dose, a prespecified sensitivity analysis was conducted. This analysis was restricted to the largest comparable dose range used by both products. The cohort was filtered to include only those patients in the Efficacy Estimand who never received a weekly dose exceeding 1.0 mg (i.e., they remained on or below the maximum shared dose between Wegovy and Ozempic). The primary outcome (12-month weight loss percentage) in this dose-restricted cohort was compared using the Mann–Whitney U test.
3. Results
Of the 7279 patients who initiated semaglutide-supported treatment during the study period, 1678 (23.05%) met all efficacy estimand criteria. A further 791 (10.87%) patients received between 8 and 15 medication orders but did not submit weight data within the 341–379-day window (i.e., they were adherent to medication but not to weight tracking), and 224 (3.08%) met both medication order and weight data entry criteria but paused for longer than 90 days (Figure 1). Of the remaining 4586 patients, 4488 (97.86%) received fewer than 8 medication orders and 98 (2.14%) received more than 15 orders. In total (irrespective of medication order count and weight data submissions), 2858 (39.26%) patients in the treatment estimand had a treatment pause of longer than 90 days.
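The cohort flow reported above can be checked with simple arithmetic. The sketch below uses the counts from the text; the dictionary keys and the helper name are hypothetical labels, and all percentages are computed against the full cohort of 7279 (whereas the text reports the last two groups as shares of the remaining 4586 patients).

```python
# Counts transcribed from the paragraph above.
TOTAL = 7279
cohort_counts = {
    "efficacy_estimand": 1678,
    "orders_ok_no_12_month_weight": 791,
    "orders_and_weight_ok_paused_over_90_days": 224,
    "fewer_than_8_orders": 4488,
    "more_than_15_orders": 98,
}


def share_of_total(count: int, total: int = TOTAL) -> float:
    """Percentage of the full cohort, unrounded."""
    return 100.0 * count / total


# The five groups are mutually exclusive and should partition the cohort.
assert sum(cohort_counts.values()) == TOTAL

for name, count in cohort_counts.items():
    print(f"{name}: {count} ({share_of_total(count):.2f}%)")
```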
3.1. Treatment Outcomes Under Ideal Adherence Conditions
Primary analysis on the efficacy estimand (Table 1) found a mean percentage weight loss at 12 months after treatment initiation of 15.67% (±8.0). Additionally, a substantial proportion of patients achieved clinically meaningful weight loss milestones, with 1552 (92.49%) losing at least 5% of their baseline weight, 1274 (75.92%) reaching ≥10% weight loss, and 901 (53.69%) achieving ≥15% weight loss. Only 42 (2.5%) patients experienced weight gain or reported no weight loss. Reported side effects were common, with 1274 (75.98%) patients noting at least one side effect over the course of their treatment. The mean time from program initiation to first reported side effect was 43.87 (±29.11) days. In total, 536 (31.94%) patients in the efficacy estimand paused their treatment on at least one occasion for a mean period of 14.17 (±22.42) days. The mean number of semaglutide orders for this cohort was 12.98 (±1.89).
3.2. Medication Subgroup Comparison
Among patients in the efficacy estimand, 536 (31.94%) received Wegovy and 1142 (68.05%) received Ozempic (
Table 1). Initial unadjusted analysis revealed a significant baseline imbalance in patient characteristics, notably in pause incidence (Wegovy 13.11% vs. Ozempic 40.76%; Std. Mean Diff. = −0.82). To account for this confounding, Propensity Score Matching (PSM) was performed, creating 536 matched pairs (1072 patients). The matching process resulted in the exclusion of 606 patients (36.1%) who were outside the common support region or could not be matched within the 0.2 caliper.
3.2.1. Propensity Score Matching Diagnostics
The PSM successfully achieved covariate balance.
Table 2 presents the Standardized Mean Differences (SMD) for all covariates before and after matching; all post-match SMDs were below 0.1. Furthermore, the overlap of propensity score distributions, shown in the PS Distribution Plot (
Figure 2), indicates satisfactory common support across the treatment and control groups post-matching.
The subsequent non-parametric (Mann–Whitney U) test on the matched cohort revealed that the difference in 12-month weight loss percentage between participants using Ozempic and those using Wegovy remained highly statistically significant (W = 111,385, p < 0.001). In the PSM cohort, participants treated with Ozempic experienced a significantly lower median weight loss percentage (Median = 14.0%; IQR [9.6%; 19.0%]) than those who received Wegovy (Median = 17.0%; IQR [12.0%; 23.0%]). The adjusted median difference was 3.0 percentage points. The effect size was medium and favored Wegovy (Cohen’s d = 0.38; 95% CI [0.26, 0.51]).
Correspondingly, chi-square tests on the unadjusted efficacy cohort revealed that a significantly higher proportion of Wegovy users reached key weight loss thresholds (≥5%, ≥10%, and ≥15%; all p < 0.001). No significant difference was observed between the two groups in the proportion of patients who experienced weight stability or increase (Wegovy = 2.05% vs. Ozempic = 2.71%, p = 0.5).
3.2.2. Dose-Restricted Sensitivity Analysis Outcomes
The sensitivity analysis restricted the cohort to 567 patients who met all efficacy estimand criteria but never received a weekly dose exceeding 1.0 mg (Table 3). The analysis confirmed that the weight loss difference between the brands remained highly statistically significant (W = 20,954, p ≤ 0.001) even when the confounding effect of high doses was removed. In this restricted cohort, patients treated with Wegovy achieved a median weight loss of 17.0% (IQR [13.00%; 23.00%]), and those treated with Ozempic achieved a median weight loss of 14.00% (IQR [8.52%; 21.0%]). The adjusted mean difference favoring Wegovy was 3.55 percentage points (95% CI: 1.86 to 5.24), representing a medium effect size (Cohen’s d = 0.42).
3.3. Efficacy vs. Treatment Estimands
A comparison between the efficacy and treatment estimands (Table 4) revealed substantial differences in clinical outcomes. The distribution of weight loss percentage was assessed for both the treatment and efficacy estimands. A Mann–Whitney U test revealed that the median 12-month weight loss percentage was significantly higher in the efficacy group (Median = 15.0%; IQR [10.0%; 21.0%]) than in the treatment group (Median = 5.9%; IQR [0.6%; 13.0%]) (W = 4,902,466, p < 0.001). Chi-square tests found that a significantly higher proportion of patients in the efficacy group achieved clinically relevant weight loss milestones (≥5%: E 92.49%, T 70.78%, χ²(1) = 623.16, p < 0.001; ≥10%: E 75.92%, T 48.88%, χ²(1) = 801.06, p < 0.001; ≥15%: E 53.69%, T 30.88%, χ²(1) = 666.98, p < 0.001), and a smaller proportion failed to lose weight (E 2.50% vs. T 12.79%, χ²(1) = 258.75, p < 0.001).
The average number of medication orders was notably higher in the efficacy group than in the treatment group (E 12.98; T 6.97), a difference that was statistically significant (W = 5,618,985, p < 0.001). Treatment pause prevalence was significantly lower in the efficacy group (E 31.94%; T 57.02%), with a mean pause length of 14.17 days versus 106.57 days in the treatment group. Side effects were reported in 75.98% of patients in the efficacy group and 54.07% in the treatment group. However, no statistically significant difference was observed in the distribution of time to first side effect between groups, with a mean of 43.87 days for the efficacy group and 36.18 days for the treatment group.
3.4. Demographics
Demographic analysis (Table 5) performed on the treatment estimand found a mean age of 42.83 years (±11.43), and 91.19% of patients were female. The efficacy estimand was slightly older on average (45.40 ± 10.74 years) but had a comparably high proportion of females (93.02%). The majority of participants in both cohorts identified as Caucasian (efficacy estimand (EE) = 87.16%; treatment estimand (TE) = 81.80%). Other ethnic groups included individuals of Asian origin (TE = 7.74%; EE = 5.31%), Black African or African Caribbean (TE = 5.43%; EE = 2.74%), Middle Eastern (TE = 1.25%; EE = 1%) and Latino/Hispanic (TE = 1.00%; EE = 1%). Baseline clinical characteristics showed a mean BMI of 35.13 kg/m² (±5.48) and weight of 96.88 kg (±17.60) in the treatment estimand, with slightly lower BMI and weight in the efficacy estimand (34.63 ± 5.23 kg/m² and 95.82 ± 17.24 kg).
A multiple linear regression model was created to predict 12-month weight loss in the efficacy estimand (
Table 6). The model’s predictor variables included age, sex at birth, longest pause length, days to weight loss measurement, ethnicity, maximum order count, product, weight tracker use, initial BMI, and side effects reported. Variance inflation factor scores were checked to assess multi-collinearity in the predictor variables, and all were well below a score of 5, indicating multi-collinearity was not a concern. The model was statistically significant overall (F(14, 1663) = 25.64,
p < 0.001); however, the effect size was only weak to moderate (Adjusted R-squared = 0.172).
Several variables were found to be significant predictors of weight loss percentage. Older age at consultation (b = −0.0012, p < 0.001) and male sex (b = −0.038, p < 0.001) were associated with significantly lower weight loss. Participants identifying as Asian (including Indian subcontinent) had significantly lower weight loss percentage outcomes than individuals who identified as White/Caucasian (b = −0.019, p < 0.05). Higher medication order count (b = 0.0074, p < 0.001), use of Wegovy (b = 0.0139, p = 0.0019), higher track count (b = 0.00029, p < 0.001), and side effect incidence (b = 0.0243, p < 0.001) were associated with significantly higher weight loss percentages. Longest pause length and initial BMI were not significant predictors in this model (p > 0.05).
Further demographic analysis of the efficacy estimand (Table 7) revealed more details about the effect of age, weight tracker use and medication count on weight loss. A Kruskal–Wallis test found a statistically significant difference in weight loss across age groups, H(4) = 82.33, p = 0.0099. A Holm-Bonferroni post hoc test revealed that participants in the 50–59 age group (Median = 14.43%) and the 60+ age group (Median = 13.31%) experienced significantly lower weight loss percentages than those in the <30 (Median = 17.98%) and 30–39 (Median = 16.01%) age groups (p < 0.001 and p < 0.01, respectively). Participants in the <30 age group experienced significantly higher weight loss percentages than those in the 40–49 age group (Median = 15.31%) (p < 0.05). No other pairwise comparisons were statistically significant (p > 0.05).
An additional Kruskal–Wallis test found that the association between the number of medication orders and weight loss was statistically significant, H(5) = 73.77, p < 0.001. Results from a Holm-Bonferroni post hoc test revealed that patients who received 13 to 15 orders lost significantly more weight (Median = 16.63%) than those who received 8, 9, 10 and 11 orders, but not those who received 12 orders. The only other differences were found between patients who received 12 orders (Median = 15.41%) and those who had 10 (Median = 11.87%) and 8 (Median = 12.24%) orders, respectively. After restricting the cohort to patients with 12 or more orders (N = 1338), weight loss was also found to be significantly associated with weight tracker use, H(5) = 109.01, p < 0.001. A Holm-Bonferroni test found that patients in the 40–59, 60–79, 80–99, and 100+ track count groups demonstrated significantly higher weight loss percentages than those in the <20 (Median = 11.83%) and 20–39 (Median = 14.37%) track count groups (p < 0.001 for all comparisons). Additionally, the 100+ track count group (Median = 20.24%) demonstrated significantly higher weight loss percentages than the 40–59 group (Median = 16.79%), while a modest increase in tracking, from <20 entries to 20–39 entries, was also associated with a statistically significant increase in weight loss (Figure 3).
3.5. Sensitivity Analysis of the Treatment Estimand
The sensitivity analysis was conducted to assess the robustness of the primary Treatment Estimand outcomes, which employed the conservative BOCF method. Results were compared against CCA and MI under the MAR assumption (
Table 8). The analysis confirms that the primary BOCF method was highly conservative. The median weight loss percentage increased substantially from the BOCF estimate of 5.90% to 7.40% (CCA) and 8.25% (MI). Similarly, the proportion of patients achieving the 5% milestone increased from 54.21% (BOCF) to 62.40% (CCA) and 57.59% (MI). The increase in both median weight loss and milestone achievement observed under the MI scenario suggests that the primary (BOCF) analysis substantially underestimated the effectiveness of the treatment strategy. Consequently, the MI estimate (8.25%) likely provides a more plausible reflection of the real-world impact for the total cohort by adjusting for the conservative bias inherent in the BOCF method. The overall finding that outcomes in the treatment estimand are significantly inferior to the efficacy estimand remains consistent across all scenarios.
4. Discussion
This study contributes valuable findings to the scarce literature on unsubsidized medicated DWLSs. Although recent research suggested that such services can play an important role in combatting the obesity epidemic, nuance was lacking at the intersection of program adherence and effectiveness. Specifically, previous studies had demonstrated that the vast majority of adherent patients in real-world medicated DWLSs could lose a clinically meaningful amount of weight across a variety of countries [
28,
29,
30]. Prior research on the Juniper UK program found that patients who were supplemented with Semaglutide lost a mean of 10.1% of their baseline weight after 16 weeks, while those supplemented with Tirzepatide lost a mean of 13.79% at the same interval [
30,
31]. Studies that focused on program adherence had reported a mean adherence period of 183.1 days in the Juniper UK program [
21] and 171.2 days among Juniper Australia patients [
17]. These latter investigations also revealed that a large proportion of patients paused or discontinued the Juniper program due to its cost, which is likely a key factor militating against patient adherence in other unsubsidized DWLSs. Thus, the extent to which weight loss outcomes differed between adherent (efficacy estimand) and full real-world DWLS cohorts (treatment estimand) was a key question prior to this study. In addition to this knowledge gap, little was known about the effect of Semaglutide brand and medication order count on 12-month weight loss outcomes.
4.1. The Adherence Gap: Efficacy vs. Treatment Outcomes
The first finding of interest was the marked discrepancy between the efficacy and treatment estimates (15.67% vs. 7.88%), which highlights the central importance of adherence and attrition in real-world digital obesity care. This gap suggests that while the pharmacological intervention is highly effective under ideal conditions, its clinical impact is heavily moderated by the challenges of long-term program retention and the various financial or behavioral barriers to persistence inherent in an unsubsidized service.
Following this was the finding that only 1678 (23.05%) patients satisfied all efficacy estimand criteria, which stipulated that patients receive between 8 and 15 medication orders and make a minimum of one weight data submission within 341–379 days post program initiation. A previous study of the Juniper UK program reported a near-identical dropout rate of 77.3% after 5 months of treatment [
18]; however, that study did not present a distribution of dropout reasons. In this study, 791 (10.87%) patients were excluded for having failed to submit weight data during the specified period (despite meeting the medication order requirement). An assessment of medication adherence would include these patients and therefore leave a total of 2469 (33.92%) patients who adhered to the semaglutide schedule reasonably well.
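The adherence figures above can be reconciled with a short arithmetic check. This is a sketch using only the counts and percentages quoted in the text; the variable names are illustrative, and the total cohort size is deliberately not assumed.

```python
# Counts and percentages as quoted in the text.
efficacy_n, efficacy_pct = 1678, 23.05        # met all efficacy estimand criteria
no_weight_n, no_weight_pct = 791, 10.87       # met the order requirement, no weight submitted

# Patients who met the medication order requirement, with or without weight data.
medication_adherent_n = efficacy_n + no_weight_n
medication_adherent_pct = round(efficacy_pct + no_weight_pct, 2)

# Share of the cohort that did not meet the efficacy estimand criteria.
not_in_efficacy_pct = round(100.0 - efficacy_pct, 2)

print(medication_adherent_n, medication_adherent_pct, not_in_efficacy_pct)
```

The final value is the complement of the 23.05% efficacy estimand share, which is the figure being compared with the 77.3% dropout rate reported in the earlier study.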
The study’s weight loss findings were also notable. Firstly, the efficacy estimand’s mean weight loss percentage of 15.67% is relatively high compared to other 12-month studies of real-world semaglutide-supported weight loss interventions [
28,
32]. The figure is also close to the 16.9% mean weight loss percentage recorded in the efficacy (trial product) estimand of the STEP 1 clinical trial [
32]. The proportion of patients who achieved the 5, 10, and 15 percent milestones was also comparable to the latter trial, including 92.49% of patients who lost a clinically significant amount of weight (≥5%), and over half (53.69%) of the cohort reaching the 15 percent milestone. The finding that women lost significantly more weight than men is consistent with a previous study of a Juniper UK semaglutide cohort and other clinical trials [
30,
33]. Further research needs to be conducted to determine whether there are biological mechanisms responsible for this trend. Similarly, the discovery that patients of Asian ethnicity lost, on average, less weight than Caucasian patients aligns with the results from a large clinical study [
34]. The latter suggested that lower initial BMI may explain this trend, but body composition or visceral adipose tissue may also be important factors. Although this study did not find a correlation between weight loss and initial BMI like previous studies of the Juniper DWLS, it is possible that the small proportion of Asian patients had a lower initial BMI than Caucasian patients.
4.2. Comparative Medication Profiles: Wegovy vs. Ozempic
A key methodological concern regarding the medication comparison was the potential for confounding, particularly due to the significant difference in adherence behaviors (pause incidence) between the Wegovy and Ozempic groups. To more robustly account for confounding, we conducted a PSM analysis. The success of the PSM in balancing all measured covariates reduces the likelihood that baseline characteristics or adherence behaviors drove the observed difference, suggesting that the weight loss advantage for Wegovy may be related to the medication/dosage profile and not to patient characteristics or baseline adherence behaviors.
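For readers unfamiliar with PSM, the matching step can be sketched as follows. This section does not describe the matching algorithm that was used, so the greedy 1:1 nearest-neighbour approach, the caliper value, the patient identifiers, and the propensity scores below are all assumptions chosen for illustration. The scores are taken as already estimated (for example, from a logistic regression of brand on baseline covariates).

```python
def greedy_match(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, control: dicts mapping a patient id to a propensity score.
    A pair is kept only if the score difference is within the caliper.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet matched
    pairs = []
    # Process treated patients in ascending order of propensity score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement
    return pairs

# Hypothetical propensity scores for illustration only.
wegovy = {"w1": 0.30, "w2": 0.55, "w3": 0.90}
ozempic = {"o1": 0.28, "o2": 0.52, "o3": 0.70, "o4": 0.10}

pairs = greedy_match(wegovy, ozempic, caliper=0.05)
print(pairs)
```

Here the third Wegovy patient has no Ozempic counterpart within the caliper and is left unmatched, which illustrates how matching trades sample size for comparability. Outcomes would then be compared within the matched pairs only, after checking covariate balance.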
The analysis found that the median weight loss among Juniper UK patients treated with Wegovy was 3.0 percentage points higher than those treated with Ozempic (Median 17.0% vs. Median 14.0%), and that the former were statistically more likely to reach all 3 weight loss milestones. The initial hypothesis that this superior outcome was simply due to the higher maximum dose of Wegovy (2.4 mg vs. 2.0 mg) was directly challenged by a dose-restricted sensitivity analysis (
Section 3.2.2). This analysis, which compared only patients who never exceeded the shared dose of 1.0 mg, demonstrated that the observed weight loss advantage for Wegovy persisted (median difference: 3.0 percentage points,
p < 0.001). This suggests that the observed variation in outcomes between the brands may be present even at equivalent low-to-mid doses, potentially driven by formulation differences or other subtle factors, rather than being solely dependent on the high-dose ceilings. To the knowledge of the investigators, this was the first real-world study to compare the two brands under the same program. We recognize the modest explanatory power of the Multiple Linear Regression model (Adjusted
R² = 0.171). This is expected in a retrospective, real-world cohort study utilizing routine service data. Unlike tightly controlled clinical trials, the model inherently omits critical, unmeasured confounders that drive weight loss, such as specific dietary intake, physical activity levels, and patient motivation. The model’s value lies in identifying the independent associations between specific program levers (such as medication count and weight tracker use) and weight loss outcomes rather than achieving high predictive power.
4.3. The Engagement-Outcome Association
Regarding medication count, while a steady positive association was observed between the number of semaglutide orders and weight loss, the post hoc analysis revealed a clear divide at the 12-order mark. In other words, patients who received 8–11 orders tended to lose a comparable amount of weight (median range: 11.87–13.38%), with median weight loss increasing significantly in the sub-group of patients who received 12 orders but flattening again among patients who received 13–15 orders. A possible explanation for this is that patients who received 12 or more orders may have also exhibited near ‘perfect’ adherence to the behavioral aspect of the Juniper UK program, whereas those who received 8–11 orders may have had periods where their adherence was slightly lower. Findings on the positive relationship between weight tracker use and weight loss were even stronger than those reported in previous studies on the Juniper DWLS. In this case, a steady upward trend was observed across the five tracker use brackets, with a difference of 8.41 percentage points in median weight loss recorded between the <20 group (Median = 11.83%) and the ≥100 group (Median = 20.24%). A likely reason this relationship was clearer in this study than in previous Juniper studies is that the latter only reported patient outcomes after 16 weeks.
Side effects were reported in 75.92% of patients, of which 86.65% were of mild severity, 12.33% of medium severity, and 1.02% were severe. These figures are largely consistent with previous studies of semaglutide weight loss populations. Interestingly, side effects were reported in only 54.07% of treatment estimand patients, of which 0.7% were severe, suggesting that reasons such as program cost played a more significant role in program discontinuation, as previous Juniper studies have demonstrated. Program cost could have feasibly played a role in the decision of the 2858 (39.26%) patients to pause treatment for longer than 90 days. This association between cost and adherence has also been well-established in other chronic care services [
35,
36].
As expected, weight loss outcomes in the treatment estimand were inferior to the efficacy estimand. The fact that both the mean figure (7.88%; ±8.46) and proportion of patients reaching the ≥5% milestone (54.21%) in the treatment estimand were comparable to other real-world semaglutide studies [
20,
25] suggests that, on the whole, there was nothing remarkable about the Juniper intervention.
These findings have considerable implications for patients and providers of real-world DWLSs. Research has consistently demonstrated that GLP-1 RAs can play a significant role in combating the obesity epidemic. However, major health organizations have emphasized that they should only be used as a supplement to lifestyle interventions under guidance from multidisciplinary care teams. DWLSs have improved access to such interventions by facilitating care continuity and coordination. While emerging research on real-world medicated DWLSs has generated some positive outcomes, several questions remain unanswered. Findings from this study reinforce the knowledge that adherent DWLS patients usually achieve good weight-loss outcomes. Arguably more importantly, they indicate that these outcomes improve the more patients adhere to their medication schedule and track their weight loss over a 12-month period. They also suggest that semaglutide brand (Wegovy vs. Ozempic) is significantly associated with weight loss for patients who progress through to their eighth order. Finally, the study’s drop-out rate consolidates knowledge that general perseverance with real-world unsubsidized DWLSs remains a considerable challenge for patients. Although discontinuation reason data were not available for this study, it is likely that the high cost of the Juniper UK program was a significant factor in patient discontinuation. Future research should seek to compare outcomes between subsidized and unsubsidized patients of the same medicated real-world DWLS.
The significant gradient observed between engagement (weight tracking) and clinical success underscores the importance of the behavioral dimension in medicated obesity care. This aligns with recent research in bariatric surgery candidates, where validated psychometric instruments are routinely used to capture psychological and psychosocial outcomes to predict long-term success [
37,
38]. By incorporating a similar biopsychosocial framework into digital weight-loss services, specifically through structured psychological and behavioral assessment, providers can better understand the drivers of adherence and transition from a purely pharmacological model to a truly multidisciplinary care pathway.
The study had several limitations. Firstly, the study’s sample was predominantly Caucasian (82.8%) and female (91.2%), which, combined with a lack of data on socioeconomic status and specific comorbidities, limits the generalizability of these findings across more diverse populations. Given that this is an unsubsidized service, it is likely that individuals from lower socioeconomic backgrounds are underrepresented, further narrowing the external validity of the results. Secondly, all weight and side effect data were self-reported; consequently, they may have been affected by social desirability and reporting biases. Unlike traditional clinical settings, these measurements lack external validation from calibrated scales or independent clinical verification. Thirdly, quality of life and body composition data were not systematically collected by Juniper, which prevented investigators from expanding on the clinical relevance of the study’s findings. Fourthly, discontinuation reason data were not available for this study and thus, investigators were not able to enrich treatment estimand findings with discontinuation categories. Fifthly, medication adherence was proxied by order frequency. While this confirms the patient possessed the medication, it does not guarantee clinical administration or adherence to the injection schedule. Finally, the efficacy estimand, while providing insight into the drug’s potential under optimal conditions, was subject to selection bias. Patients who adhere to a 12-month regimen are likely to possess higher baseline motivation and health literacy than those who discontinue early. Consequently, these results may overstate the outcomes achievable by the general population of initiators.
It is also important to consider the potential for reverse causality or bidirectional motivation regarding engagement; while frequent weight tracking may facilitate weight loss through accountability, it is equally plausible that patients experiencing early success are more motivated to log their progress. Furthermore, while the PSM analysis adjusted for baseline patient characteristics, medication assignment was also influenced by contextual factors such as global supply constraints and product access in the UK during the study period. These unmeasured motivational and logistical factors should be considered when interpreting the associations between engagement, medication brand, and clinical outcomes.