Systematic Review

Efficacy of Psilocybin-Assisted Therapy in Major Depressive Disorder: A Systematic Review and Meta-Analysis

by Angel Labra-Lorenzana 1, Dania Nimbe Lima-Sánchez 2,*, Christian Alejandro Delaflor-Wagner 3, Diana Martínez-Hernández 4, Christian Ramos-Jiménez 5 and Christian Gabriel Toledo-Lozano 6,*

1 Psychiatric Hospital Fray Bernardino Álvarez, National Autonomous University of Mexico, Mexico City 14080, Mexico
2 Digital Health Department, Faculty of Medicine, National Autonomous University of Mexico, Mexico City 04510, Mexico
3 Department of Clinical Research, National Medical Center “20 de Noviembre”, Mexico City 03100, Mexico
4 Faculty of Medicine, Anahuac University of Oaxaca, Oaxaca 71248, Mexico
5 Integrated Program in Neuroscience, McGill University, Montreal, QC H3A 0G4, Canada
6 Research Coordination, National Medical Center “20 de Noviembre”, Mexico City 03100, Mexico
* Authors to whom correspondence should be addressed.
Psychiatry Int. 2026, 7(3), 137; https://doi.org/10.3390/psychiatryint7030137 (registering DOI)
Submission received: 17 March 2026 / Revised: 7 June 2026 / Accepted: 8 June 2026 / Published: 15 June 2026
(This article belongs to the Section Addiction Psychiatry)

Abstract

Background: This systematic review and meta-analysis evaluates the efficacy and safety of psilocybin-assisted psychotherapy (PAP) for adults with major depressive disorder (MDD). Methods: A PROSPERO-registered search (CRD42024561979) of CENTRAL, Scopus, PsycINFO, and MEDLINE (2010–2024) identified clinical trials assessing PAP. Risk of bias was assessed using RoB 2 for randomized controlled trials (RCTs), while non-randomized studies were appraised separately. Evidence certainty was evaluated using GRADE. Results: Ten trials were included; eight provided quantitative data. PAP was associated with large short-term reductions in depressive symptom severity. The overall pooled effect was large (d = 1.15, 95% CI 0.83–1.48), though within-subject designs yielded larger estimates (d = 1.63) than between-subject controlled comparisons (d = 0.96). Adverse events were transient and manageable, with no increased risk of serious adverse events on dosing days. Primary risk-of-bias concerns included functional unblinding. Conclusions: PAP may produce clinically meaningful, large short-term reductions in depressive symptoms. However, long-term efficacy remains understudied, and the overall certainty of evidence is low to moderate. Larger, rigorously blinded trials are required.

1. Introduction

Major depressive disorder (MDD) is one of the leading causes of disability worldwide [1]. It is estimated that up to 4.4% of the adult population lives with depression. Patients with MDD experience a significant impact on their quality of life and on their social and occupational functioning, as well as an increased risk of medical and psychiatric comorbidities [2]. Between 1990 and 2007, years lived with disability (YLD) attributed to depressive disorders increased by 33.4% (95% CI: 31.1–35.8%), which made them the third leading cause of YLD across all ages in 2007. Between 2007 and 2017, they increased by a further 14.3% (95% CI: 13.1–15.6%), which demonstrates the growing burden of disability associated with depressive disorders worldwide during these decades [3].
Despite advances in the pharmacological and psychotherapeutic treatment of depression, a considerable percentage of people with MDD do not respond adequately to standard treatment, which underlines the need to explore therapeutic alternatives for this condition. It is estimated that between 20% and 30% of patients with MDD receiving pharmacological treatment develop treatment-resistant depression (TRD), defined as a lack of response to at least two antidepressant trials of adequate dose and duration [2,3,4].
In view of the substantial and growing global burden associated with MDD and TRD, there is an urgent need to synthesize emerging evidence on innovative therapeutic strategies. The aim of this review and meta-analysis is to evaluate the efficacy and safety of psilocybin-assisted treatment in adult patients with MDD or TRD, compared with standard antidepressant therapies and placebo.
TRD is associated with cumulative functional deterioration, a greater risk of chronicity, and elevated rates of suicidal ideation. Moreover, even patients who respond to antidepressants may experience significant limitations. These drugs take several weeks to produce a clinical benefit and are only moderately effective, and many patients experience troublesome side effects such as sexual dysfunction, nausea, and weight gain, which can lead to non-adherence to treatment [4,5]. Patients with TRD represent approximately 30.9% of the population with MDD treated in the United States and account for approximately 47.2% of the total annual economic cost associated with pharmacologically treated depressive disorder [4].
Over approximately the last 30 years, interest in psychedelic substances has been renewed and has grown steadily, as these compounds have attracted the attention of diverse scientific groups because of their potential to treat mental disorders, including MDD and TRD [6,7,8]. In particular, psilocybin, a compound present in certain types of fungi, acts as a partial agonist of serotonin 5-HT2A receptors. This mechanism promotes cognitive flexibility, associative learning, insightful experiences, and neuronal plasticity, which is essential to restructuring the neuronal networks affected by depressive disorders [9].
At a neurobiological level, psilocybin acts as a potent agonist of the serotonergic receptor subtype 5-HT2A, producing acute changes in brain activity and connectivity, a mechanism of action distinct from that of conventional antidepressants. Its administration has been associated with a reduction in the activity of the default mode network, facilitated access to autobiographical memories, and a more positive attentional bias toward emotional stimuli [10]. These neuropsychological effects have been accompanied by rapid reductions in depressive and anxious symptomatology in controlled trials.
Pilot and early-phase trials have shown that one or two doses of psilocybin combined with psychotherapy may improve mood, even in patients with long-lasting depression that is refractory to previous treatments. However, these studies have been limited in sample size and follow-up time, which in turn limits the generalizability of their findings [10,11,12].
Compared with prior reviews and meta-analyses [12,13,14], the present review incorporates a comparison of several recently published trials expanding the evidence base and focuses exclusively on adults with primary MDD or TRD. Additionally, unlike earlier reviews, we examine dose-specific effects (10–20 mg vs. 25–30 mg), include quantitative synthesis of short-term safety outcomes, and provide appraisal of psychotherapeutic support and set/setting variables. These variables strengthen the originality and relevance of the current review.
Previous meta-analyses evaluating psilocybin for depressive disorders have reported promising, although heterogeneous, findings. Differences across studies have included variability in effect sizes, follow-up duration, study design, and the representation of treatment-resistant populations. In addition, important aspects, such as dose-specific effects, the role of structured psychotherapeutic support, short-term safety outcomes, and the influence of set and setting variables, have not been consistently or comprehensively examined. Consequently, although the initial evidence is encouraging, a broader synthesis of the available literature with a meta-analytic approach remains necessary to clarify these inconsistencies and better inform evidence-based clinical practice and future guideline development. Therefore, the objective of the present systematic review and meta-analysis was to evaluate the efficacy and safety of psilocybin-assisted therapy in adults with major depressive disorder (MDD), including treatment-resistant depression (TRD), compared with placebo or standard antidepressant treatments. Furthermore, this review aimed to provide a more detailed assessment of different psilocybin dose ranges, psychotherapeutic support models, and short-term safety outcomes in order to consolidate and contextualize the currently available evidence.

2. Materials and Methods

2.1. Search Strategy

The study protocol was registered at PROSPERO on 25 June 2024 with the number CRD42024561979 and follows the PRISMA guidelines for systematic reviews. A comprehensive search was conducted in PubMed (including all MEDLINE-indexed records), the Cochrane Central Register of Controlled Trials (CENTRAL), PsycINFO, and Scopus. The search covered the period from 1 January 2010 to 31 July 2024 and was restricted to publications in English and Spanish.
In PubMed, we used a combination of MeSH terms and free-text keywords, allowing retrieval of both MEDLINE-indexed articles and non-indexed records. Search strategies for CENTRAL, PsycINFO, and Scopus were adapted using equivalent controlled vocabulary and free-text terms. All searches followed PRISMA 2020 recommendations, and the complete search syntax for each database is provided in Table A1.
Data extraction was performed on 14 September 2024. Two researchers independently and in duplicate used the Rayyan research platform for the study selection. Disagreements were resolved through consensus or discussion. Out of 850 articles identified, 10 clinical trials were included in the review. For the meta-analysis, trials with clinically similar characteristics were pooled, including comparable psilocybin dosage and time points from the baseline of outcome measurement. Likewise, for the safety results, comparisons were made of adverse events during psilocybin sessions.

2.2. Inclusion and Exclusion Criteria

The studies included were selected according to the following criteria: (1) Population: adults aged 18 years or older diagnosed with MDD or TRD according to DSM-IV, DSM-5, DSM-5-TR, ICD-10, or ICD-11 diagnostic criteria. (2) Intervention: psilocybin administration, with or without adjunctive psychotherapeutic support. (3) Comparators: placebo, active control, delayed-treatment control, or medications acting on serotonin, dopamine, or noradrenaline neurotransmission, such as antidepressants and mood stabilizers. (4) Outcomes: primary outcomes included depressive symptom severity measured with validated instruments (e.g., the Montgomery–Åsberg Depression Rating Scale [MADRS], GRID Hamilton Depression Rating Scale [GRID-HAMD], Quick Inventory of Depressive Symptomatology [QIDS], or Beck Depression Inventory [BDI]). Secondary outcomes included response, remission, adverse events, and serious adverse events (SAEs). SAEs were defined according to each primary study’s criteria; these definitions are summarized in Appendix B, Table A2 (serious adverse events definitions). (5) Study design: randomized controlled trials, open-label trials, and prospective follow-up extensions.
We excluded studies according to the following criteria: (1) studies including participants with severe medical comorbidity associated with increased mortality (e.g., active cancer); (2) studies assessing psilocybin microdosing; (3) studies involving adolescents or participants < 18 years old; and (4) case reports, reviews, commentaries, preclinical studies, and trials not reporting depressive outcomes.

2.3. Risk of Bias Assessment

Risk of bias was assessed according to study design. Randomized controlled trials were evaluated using the Cochrane Risk of Bias 2 tool. Non-randomized, open-label, and follow-up studies were appraised separately using an appropriate non-randomized study quality-assessment framework and were not pooled with randomized trials unless methodological and clinical comparability was justified. Two reviewers independently performed the assessments, and disagreements were resolved by consensus.
The certainty of evidence for the main outcomes was assessed using the GRADE approach. Outcomes assessed included short-term clinical response, remission, serious adverse events, and any adverse events. Certainty was rated as high, moderate, low, or very low according to risk of bias, inconsistency, indirectness, imprecision, and publication bias. A summary of findings table was prepared to present the certainty ratings and the main reasons for downgrading. GenAI [ChatGPT-5.2-Plus] was used to assist in the generation of the risk of bias graph [15] and the narrative appraisal of open-label studies. The assessment of risk of bias is presented in Figure 1. The narrative appraisal of open-label studies is presented in Figure 2.

2.4. Data Extraction

The outcome of interest was the effect of psilocybin on depressive symptoms as objectified by changes in scores before and after the intervention using internationally validated scales. The mean score per time point was extracted from the baseline, and the standard deviation was reported for the intervention and control groups. However, if this information was not reported, the number of patients reporting a response to the intervention was extracted, as well as the number of patients reporting a response in the control group.
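As an illustration of how extracted group means and standard deviations translate into a standardized effect, the following minimal sketch computes a between-group Cohen's d and an approximate standard error. The function name, the sign convention, and the input values are assumptions made for illustration; the variance formula is a common large-sample approximation and may not match the exact computation used for the pooled estimates.

```python
import math

def cohens_d_between(mean_ctrl, sd_ctrl, n_ctrl, mean_trt, sd_trt, n_trt):
    """Cohen's d for two independent groups, with an approximate standard error.

    d is oriented so that a lower (better) depression score in the
    treated group gives a positive value.
    """
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n_ctrl - 1) * sd_ctrl**2 + (n_trt - 1) * sd_trt**2) / (n_ctrl + n_trt - 2)
    d = (mean_ctrl - mean_trt) / math.sqrt(pooled_var)
    # Common large-sample approximation to the sampling variance of d.
    var_d = (n_ctrl + n_trt) / (n_ctrl * n_trt) + d**2 / (2 * (n_ctrl + n_trt))
    return d, math.sqrt(var_d)

# Hypothetical scores, not taken from any included trial.
d, se = cohens_d_between(24.0, 6.0, 12, 9.0, 6.0, 12)
print(f"d = {d:.2f}, SE = {se:.2f}")
```

Within-subject (pre-post) designs require a different standardizer and depend on the pre-post correlation, so this sketch applies to between-group comparisons only.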
Regarding the safety results, to homogenize the data, the following information was extracted: the number of severe adverse events during psilocybin administration in the intervention and control groups and whether any adverse event was reported. Data were collected independently and in duplicate by two reviewers using Excel from Microsoft 365 through a standardized database that included the following data: author, year, psilocybin dosage, number of dosing sessions, trial design, scale employed to measure changes in depressive symptoms, time point from baseline at which the scales were applied, and demographic data.
Because the included trials reported statistical dispersion inconsistently, and frequently omitted standard deviation (SD) values for continuous outcomes at one or more time points, some values had to be derived. Following Chapter 6 of the Cochrane Handbook for Systematic Reviews of Interventions [22], missing SDs were derived from the standard errors (SE), 95% confidence intervals (CIs), or exact p-values reported by the original authors. This step allowed all eligible trials to be converted into a common effect size metric (Cohen’s d) with its corresponding standard error, so that they could be weighted and pooled in the meta-analysis despite the heterogeneous reporting formats of the primary literature.
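The two simplest of these conversions can be sketched as follows. This is a minimal illustration for a single group mean; the function names and the example interval are hypothetical. The default critical value of 1.96 assumes a normal approximation; for small samples the interval is usually built from a t distribution, in which case the matching t critical value should be supplied. The conversion from exact p-values is not shown.

```python
import math

def sd_from_se(se, n):
    """SD of a single group recovered from the standard error of its mean."""
    return se * math.sqrt(n)

def sd_from_ci(lower, upper, n, crit=1.96):
    """SD recovered from a confidence interval around a single group mean.

    crit is the critical value used to build the interval: 1.96 for a
    95% CI under a normal approximation. Pass the matching t critical
    value when the interval comes from a small sample.
    """
    se = (upper - lower) / (2 * crit)
    return sd_from_se(se, n)

# Hypothetical example: 95% CI of 10.0 to 14.0 around a mean from 49 participants.
print(round(sd_from_ci(10.0, 14.0, 49), 2))
```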
Discrepancies in data extraction were resolved by discussion.

2.5. Data Analysis

Effect sizes were expressed as Cohen’s d with their corresponding standard errors to standardize outcomes across studies that used different depression rating scales (GRID-HAMD, MADRS, QIDS-SR). A classical random-effects meta-analysis was conducted in JASP (version 0.96) [23] with between-study variance estimated via restricted maximum-likelihood estimation (REML). Because the number of included studies was small, the significance of pooled effects and meta-regression coefficients was evaluated using the Knapp–Hartung adjustment, which provides more conservative inference by replacing the standard normal test with a t-distribution. When a study contributed multiple effect sizes from different follow-up time points, these were clustered at the study level to account for within-study dependency. Studies were classified into two subgroups based on design, within-subjects and between-subjects (randomized controlled trials), and this variable was entered as a subgroup moderator. Two additional moderators were included in the meta-regression: type of depression scale and weeks of follow-up after psilocybin administration. Residual heterogeneity was quantified with the Q-test and tau. Leave-one-out case-wise diagnostics were used to identify influential observations, and a sensitivity analysis was conducted by removing all flagged studies to evaluate the robustness of the pooled estimates (see Supplementary Materials for full diagnostic criteria and results).
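To make the pooling step concrete, the sketch below computes a random-effects estimate with a Knapp–Hartung standard error. It is an illustration under stated assumptions, not a reproduction of the JASP analysis: between-study variance is estimated with the closed-form DerSimonian–Laird moment estimator rather than REML, multiple effect sizes per study are not clustered, and the function name and return keys are hypothetical.

```python
import math

def random_effects_hk(effects, ses):
    """Random-effects pooled estimate with a Knapp-Hartung standard error.

    Assumption: tau^2 comes from the DerSimonian-Laird moment estimator,
    chosen only because it has a closed form; it is not REML.
    """
    k = len(effects)
    if k < 2:
        raise ValueError("at least two effect sizes are required")
    v = [s**2 for s in ses]
    w = [1 / vi for vi in v]

    # Fixed-effect mean and Cochran's Q, needed for the moment estimator.
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each sampling variance.
    w_star = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)

    # Knapp-Hartung variance: weighted residual spread scaled by k - 1.
    hk_var = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w_star, effects)) / ((k - 1) * sum(w_star))
    return {"pooled": pooled, "tau2": tau2, "se_hk": math.sqrt(hk_var), "df": k - 1}

# Hypothetical effect sizes and standard errors.
print(random_effects_hk([0.0, 1.0, 2.0], [0.5, 0.5, 0.5]))
```

Under the Knapp–Hartung adjustment, the pooled estimate divided by `se_hk` is referred to a t distribution with `df` degrees of freedom instead of the standard normal.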

2.6. Deviations from PROSPERO Protocol

The PROSPERO protocol prespecified standardized mean differences (SMDs) and relative risks (RRs) as summary measures, with heterogeneity assessed using the I2 statistic and certainty of evidence evaluated using the GRADE framework. During data extraction, substantial methodological and clinical heterogeneity across studies was identified, including the use of different depression rating scales (GRID-HAMD, MADRS, QIDS-SR), mixed follow-up time points, within-subject and between-subject designs, and variable reporting formats for continuous and dichotomous outcomes. To improve comparability across studies, effect sizes were standardized and expressed as Cohen’s d with corresponding standard errors.
A classical random-effects meta-analysis was conducted in JASP (version 0.96) [23], with between-study variance estimated using restricted maximum-likelihood estimation (REML). Given the small number of included studies, pooled effects and meta-regression coefficients were evaluated using the Knapp–Hartung adjustment to provide more conservative inference. When studies contributed multiple follow-up effect sizes, these were clustered at the study level to account for within-study dependency. Subgroup analyses were performed according to study design (within-subject vs. randomized controlled trials), and additional moderators included depression scale type and follow-up duration after psilocybin administration.
Residual heterogeneity was assessed using the Q statistic and tau (τ). Sensitivity analyses, including leave-one-out diagnostics, were conducted to evaluate the robustness of pooled estimates. Certainty of evidence was assessed using the GRADE framework for the main efficacy and safety outcomes. However, because of the exploratory nature of the evidence base, the small number of trials per comparison, mixed study designs, and heterogeneous reporting of outcomes, GRADE ratings were interpreted cautiously and are presented as outcome-level certainty judgments rather than definitive clinical recommendations.

2.7. Ethical Considerations

Because no new human subject data were collected, this study was exempt from Institutional Review Board (IRB) approval and informed consent requirements.

3. Results

3.1. Study Results

A PRISMA flow diagram presents the study selection process (Figure 3). The PubMed (including all MEDLINE-indexed records), CENTRAL, PsycINFO, and Scopus searches identified 850 publications. Rayyan identified 140 duplicates at import, which were removed, leaving 710 articles for further review. In total, 83 publications proceeded to the full-text evaluation stage. Two of these were excluded because they had restricted access and could not be obtained from the authors. Of the remaining 81, a further 71 were excluded: 18 were additional duplicates that emerged during full-text screening because multiple database records linked to the same article, 11 were protocol registers, 19 were ongoing studies with no results, and 23 were secondary analyses. Finally, 10 clinical trials were included in our study.
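The selection counts reported above can be reconciled with a short arithmetic check. All numbers are taken from the text; the variable names are illustrative only.

```python
# Counts as reported in the study selection paragraph.
identified = 850
duplicates_at_import = 140
screened = identified - duplicates_at_import

full_text = 83
not_retrieved = 2
full_text_exclusions = {
    "additional duplicates": 18,
    "protocol registers": 11,
    "ongoing studies without results": 19,
    "secondary analyses": 23,
}

included = full_text - not_retrieved - sum(full_text_exclusions.values())
print(screened, included)
```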
Certainty of evidence was assessed using GRADE. Overall certainty was low for the pooled short-term antidepressant effect. Within-subject studies showed larger effects but lower certainty due to a lack of comparators, expectancy effects, functional unblinding, and confounding, while between-subject controlled studies provided more robust but still limited evidence. Follow-up duration did not significantly moderate the effect (Appendix C, Table A3).
A total of ten clinical trials met the inclusion criteria for the systematic review, out of which eight provided sufficient quantitative data to be included in the efficacy meta-analysis. The meta-analyzed studies incorporated diverse methodological approaches, which were subsequently stratified into two main subgroups: within-subjects evaluations (including open-label trials and longitudinal follow-ups) [18,19] and between-subjects comparisons comprising randomized controlled trials (RCTs) [5,10,11,17,20].
For the safety results, the meta-analysis of severe adverse events during the intervention included the studies by Goodwin, Davis, Gukasyan, Carhart-Harris, and von Rotz [5,10,11,17,20]. The meta-analysis of any adverse event during the intervention covered two dose ranges: the studies by von Rotz and Davis [10,17], in which doses of 10 mg to 20 mg of psilocybin were compared, and the studies by Gukasyan, Goodwin, and Carhart-Harris [5,11,20], in which doses of 25 mg to 30 mg were compared.

3.2. Study Characterization

Sociodemographic characteristics of the included studies are presented in Table 1. Across trials, participants with medical and psychiatric comorbidities were generally excluded, and comorbidities were therefore reported as absent, reflecting stringent eligibility criteria in psilocybin RCTs. Ten clinical trials were identified and included based on the above-mentioned criteria: six were randomized [5,10,11,16,17,20], three were open-label trials [9,18,19], and one was a placebo-controlled, within-subject, fixed-order design [21]. Of these trials, the study by Gukasyan [20] is a 12-month follow-up of the study by Davis [10], and the 2018 study by Carhart-Harris [18] is a six-month follow-up of his 2016 work [9].
The studies by Davis and Gukasyan [10,20] used a delayed-treatment condition as a control to differentiate the psilocybin intervention from spontaneous symptom improvement. The exploratory study by Goodwin and the feasibility study by Carhart-Harris did not have a control group [9,19]. A previous study by Goodwin [5], a dose-finding, parallel-group, randomized clinical trial, compared two psilocybin conditions with a 1 mg psilocybin dose as a control. The studies by Raison, von Rotz, and Sloshower [16,17,21] compared the psilocybin condition with a placebo. However, the Sloshower study used each participant as their own control, with a fixed-order design in which placebo was given first, followed by psilocybin [21]. The 2021 Carhart-Harris trial compared two 25 mg doses of psilocybin with escitalopram, which was escalated to a 20 mg dose [11].
Regarding psilocybin dose and number of sessions, the studies by Carhart-Harris [9,18] evaluated two oral doses separated by one week, the first being a low psilocybin dose of 10 mg, and the second being a high psilocybin dose of 25 mg. The study by von Rotz assessed a single moderate psilocybin dose of 0.215 mg/kg, which was rounded up to 16 mg in a participant weighing 70 kg [17]. The studies by Davis and Gukasyan [10,20] evaluated two oral doses separated by a mean of 1.6 weeks; the first dose was a moderately high psilocybin dose (20 mg/70 kg), and the second was a high psilocybin dose (30 mg/70 kg). The study by Carhart-Harris evaluated two oral doses of 25 mg of psilocybin separated by three weeks [11]. The studies by Goodwin and Raison [5,16] evaluated a single dose of 25 mg of psilocybin. Finally, the study by Sloshower assessed the highest dose of oral psilocybin (0.3 mg/kg, maximum dose 35 mg) [21].
Across included studies, mean ages ranged from 36.7 to 44.1 years. None of the included trials reported efficacy or safety outcomes stratified specifically for participants older than 65 years; therefore, an age-based subgroup analysis was not feasible. Medical and psychiatric comorbidities were generally excluded or inconsistently reported across studies, limiting the possibility of pooled subgroup analyses by comorbidity status and reducing generalizability to more complex real-world clinical populations.

3.3. Efficacy Results

All trials reported reductions in depressive symptoms following psilocybin, although not all comparisons reached statistical significance. Davis conducted a randomized, waiting-list-controlled trial in which 27 participants with moderate-to-severe MDD were randomized to the immediate-treatment group (n = 13, analyzed) or the delayed-treatment group (n = 11, analyzed), which received the same treatment eight weeks later [10]. They received two psilocybin sessions separated by a mean of 1.6 weeks; the first was a moderately high dose (20 mg/70 kg), and the second was a high dose (30 mg/70 kg). Mean (SD) GRID-HAMD scores in the immediate-treatment group were lower at weeks 1 (8.0 [7.1]) and 4 (8.5 [5.7]) than the scores at the corresponding weeks 5 (23.8 [5.4]) and 8 (23.5 [6.0]) in the delayed-treatment group.
Gukasyan conducted a 12-month follow-up analysis, in which all 24 participants completed the long-term follow-up assessments. GRID-HAMD scores decreased from a mean (SD) of 22.8 (3.9) at the pretreatment baseline to 9.3 (8.8) at 3 months, 7.0 (7.7) at 6 months, and 7.7 (7.9) at 12 months post-treatment. This is the longest follow-up among the included studies, and the final response and remission rates reported were 75% and 58%, respectively [20].
Goodwin conducted an open-label, exploratory study in a population of participants diagnosed with treatment-resistant MDD. They received a single dose of 25 mg psilocybin as an adjunct to antidepressant treatment. Nineteen participants completed the study and continued their SSRI medication throughout the follow-up period and to the end of the study (week 3). The medications the participants were taking included sertraline (50 mg to 200 mg), escitalopram (5 mg to 20 mg), fluoxetine (20 mg to 80 mg), vilazodone (20 mg to 40 mg), paroxetine (60 mg), and citalopram (20 mg). At baseline, the mean MADRS was 31.7 (SD = 5.7), and at week 3, it was 16.8 (95% confidence interval [CI], 11.2–22.4), which is equivalent to a mean change from baseline of −14.9 (95% CI, −20.7 to −9.2). Response and remission rates were both 42.1% at week 3 [19].
Carhart-Harris conducted an open-label feasibility study on a population of participants with treatment-resistant MDD. Twelve participants received two doses of psilocybin, a low oral dose of 10 mg and a high dose of 25 mg, a week apart. QIDS depression scores were reduced from baseline (19.2, 2.0) to week 1 (7.4, 4.9) and 3 months (10.0, 6.0) post-treatment, observing a maximum effect at week 2 (6.3, 4.6) [9]. Afterward, Carhart-Harris performed a 6-month follow-up analysis, recruiting seven additional participants who completed all measures. Beck Depression Inventory (BDI) score at baseline was 34.5 (7.3), at week 1, 11.8 (11.1), at 3 months, 19.2 (13.9), and at 6 months, 19.5 (13.9), which demonstrated a significant reduction in depression scores throughout the study. However, two participants were on venlafaxine throughout the trial; six participants began new courses of antidepressant medication after the 3-month time point, and five participants received psychotherapy around the 3-month time point, which confounds the results of the trial [18].
Goodwin conducted the largest trial to date in a population with a diagnosis of MDD. Two hundred thirty-three participants underwent randomization: 79 participants were assigned to the 25 mg psilocybin group, 75 participants were assigned to the 10 mg psilocybin group, and 79 were assigned to the 1 mg psilocybin group (control). At baseline, depression was categorized according to the MADRS score as moderate in 30% and severe in 68% of participants. The least-squares mean change from baseline to week 3 in the MADRS score was −12.0 (95% CI, −14.6 to −9.3) points in the 25 mg group, −7.9 (95% CI, −10.6 to −5.2) points in the 10 mg group, and −5.4 (95% CI, −8.1 to −2.7) points in the 1 mg group. The difference between the 25 mg and the 1 mg group was significant (difference in least-squares mean change, −6.6; 95% CI, −10.2 to −2.9; p < 0.001). However, the difference between the 10 mg and 1 mg groups was not significant. The sustained response at week 12 was 20% for the 25 mg group, 5% for the 10 mg group, and 10% in the 1 mg group [5].
Raison conducted a randomized trial to assess a single 25 mg psilocybin dose vs. an active placebo (niacin). One hundred forty-four participants were randomized, 51 were included in the intent-to-treat psilocybin analysis, and 53 were included in the intent-to-treat niacin analysis. The psilocybin group had a greater reduction in MADRS score from baseline to day 43 (mean difference, −12.3 [95% CI, −17.5 to −7.2], p < 0.001) [16].
Moreover, von Rotz conducted a double-blind, randomized trial to assess a single psilocybin dose of 0.215 mg/kg (rounded up to 16 mg in a 70 kg person) versus a placebo. Fifty-two participants were randomized; 26 were analyzed in the psilocybin group, and 26 were analyzed in the placebo group. Fourteen days after the intervention, the psilocybin condition showed a decrease in MADRS score of −13.0 points compared to baseline, which was greater than that reported in the placebo condition (95% CI −15.0 to −1.3; Cohen’s d = 0.97; p = 0.0011). At this time point, the response rate was 58% in the psilocybin group, and remission was reported in 54% of the participants (using MADRS) [17].
Sloshower conducted an exploratory, placebo-controlled, within-subject, fixed-order study in which each participant served as their own control. Subjects received placebo first, followed by two sessions of psilocybin (0.3 mg/kg, maximum dose of 35 mg) 4 weeks apart. This design was selected to minimize carryover effects. Nineteen participants were enrolled and analyzed, although only fifteen participants completed both dosing sessions. All participants achieved clinically significant responses by the end of week 6; 33% responded after placebo, and 66.7% responded after psilocybin. The mean (SD) GRID-HAMD score (reported in Supplementary Content) was 22.6 (1.6) at baseline (the day before the first dose), 17.4 (1.7) at week 1 after placebo, and 9.5 (1.7) at week 1 after psilocybin, although this improvement was not statistically significant [21].
In 2021, Carhart-Harris conducted a double-blind, randomized trial to assess two psilocybin doses of 25 mg separated by 3 weeks versus escitalopram. Fifty-nine participants underwent randomization; 30 were allocated to the psilocybin group and 29 to the escitalopram group. The mean change in QIDS-SR16 score compared to baseline at week 6 was −8.0 in the psilocybin group, compared with −6.0 in the escitalopram group, without a statistically significant difference between groups (p = 0.17) [11].
Furthermore, different types of psychological support were provided in all trials before, during, and after the psilocybin intervention. In most cases, psychological support was described as non-directive and focused on the creation of therapeutic rapport, as well as integrative sessions after the psychedelic experience. Different mental health professionals provided support, although not all of the trials specify the number of sessions and specific techniques employed.
Clinical outcomes are summarized in Table 2.

3.4. Meta-Analysis

3.4.1. Efficacy

Both within-subjects and between-subjects designs yielded large, significant effects of psilocybin on depression scores. Within-subjects designs produced significantly larger estimates than RCTs; Qm(1) = 5.88, p = 0.015. Residual heterogeneity was non-significant in both subgroups. Table 3 summarizes the pooled effect sizes by subgroup, and the individual study effects are shown in Figure 4.
Weeks of follow-up did not moderate effects in either subgroup. The depression rating scale was not a significant moderator in within-subjects studies but strongly moderated the between-subjects pooled effect; F(2, 5) = 23.44, p = 0.003. Relative to GRID-HAMD, MADRS-based studies yielded smaller effects (B = −1.47, p = 0.001) and QIDS-SR studies yielded even smaller effects (B = −2.02, p = 0.002). In addition, a formal subgroup meta-analysis comparing MDD versus TRD was not feasible for the reasons detailed in the Limitations section.
Leave-one-out diagnostics flagged three studies with influential observations: Carhart-Harris [18], with one effect size showing a large standardized residual (3.30) and near-zero covariance ratio; Davis [10], with two flagged effect sizes (DFFITS = −2.38 and 1.33); and Goodwin [5], an extreme outlier in the between-subjects subgroup (standardized residual = −5.75, covariance ratio = 0.001), as shown in Figure 5. Carhart-Harris [11] had maximum leverage (hat = 1.00), precluding leave-one-out estimation.
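The idea behind leave-one-out diagnostics can be sketched as follows. This is a simplified illustration, not the diagnostic procedure used in this review: it uses fixed-effect inverse-variance pooling for brevity, reports only the shift in the pooled estimate (not standardized residuals, DFFITS, covariance ratios, or hat values), and runs on hypothetical data with placeholder labels.

```python
import math

def pooled_estimate(effects, variances):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    est = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return est, math.sqrt(1.0 / sum(weights))

def leave_one_out(labels, effects, variances):
    """Re-pool after dropping each effect size in turn and record the shift."""
    full, _ = pooled_estimate(effects, variances)
    rows = []
    for i, label in enumerate(labels):
        rest_y = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        est, _ = pooled_estimate(rest_y, rest_v)
        rows.append((label, est, est - full))
    return full, rows

# HYPOTHETICAL effect sizes; "D" is deliberately an outlier
labels = ["A", "B", "C", "D", "E"]
effects = [1.0, 1.2, 0.9, 3.0, 1.1]
variances = [0.05, 0.06, 0.05, 0.08, 0.07]

full, rows = leave_one_out(labels, effects, variances)
most_influential = max(rows, key=lambda r: abs(r[2]))
for label, est, shift in rows:
    print(f"without {label}: d = {est:.2f} (shift {shift:+.2f})")
print("largest shift when removing:", most_influential[0])
```

An observation whose removal shifts the pooled estimate far more than the others is a candidate for the kind of sensitivity analysis reported in this section.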
After excluding the three influential studies, the pooled effect remained large and significant, d = 1.21, 95% CI [0.90, 1.51], p < 0.001, with near-zero heterogeneity, Q(7) = 3, as shown in Table 4.
The subgroup difference was no longer significant, Qm(1) = 0.05, p = 0.818, suggesting the divergence in the primary analysis was driven by the excluded studies. Depression scale remained a significant moderator in the between-subjects subgroup (B = −1.31, p = 0.002). Case-wise diagnostics flagged residual influence from Carhart-Harris [18] within subjects, and one Raison [16] effect size; no further exclusions were made.
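As a rough consistency check on the statistics reported above, the sketch below converts a moderator statistic with 1 degree of freedom into a p-value and computes I² from Q and its degrees of freedom. It assumes a chi-square reference distribution for Qm(1) and the standard truncated definition of I²; the function names are illustrative. Because the published statistics are rounded, small discrepancies from the reported p-values are expected.

```python
import math

def chi2_p_value_df1(q):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # For 1 df, P(X > q) = P(|Z| > sqrt(q)) = erfc(sqrt(q / 2))
    return math.erfc(math.sqrt(q / 2.0))

def i_squared(q, df):
    """Higgins' I^2 in percent, truncated at zero."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

p_primary = chi2_p_value_df1(5.88)      # subgroup test in the primary analysis
p_sensitivity = chi2_p_value_df1(0.05)  # subgroup test after exclusions
i2_sensitivity = i_squared(3.0, 7)      # Q(7) = 3 after exclusions

print(f"primary Qm p-value: {p_primary:.3f}")
print(f"sensitivity Qm p-value: {p_sensitivity:.3f}")
print(f"I^2 after exclusions: {i2_sensitivity:.1f}%")
```

When Q is smaller than its degrees of freedom, as with Q(7) = 3, the truncated I² is zero, which is what "near-zero heterogeneity" corresponds to under this definition.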

3.4.2. Safety

The meta-analysis of severe adverse events included the studies by Goodwin, Davis, Gukasyan, Carhart-Harris, and von Rotz [5,10,11,17,20]. The risk difference (RD) on the day of the intervention (n = 5 studies; RD = −0.01, 95% CI: −0.04 to 0.02; p = 0.63) indicated no statistically significant increase in the risk of severe adverse events in the psilocybin group, as illustrated in Figure 6.
The meta-analysis of any adverse events reported during the intervention included the studies by von Rotz, Gukasyan, and Davis [10,17,20]. The RD was calculated separately for psilocybin doses of 10 mg to 20 mg (n = 2 studies; RD = 0.09, 95% CI: −0.22 to 0.39; p = 0.58) and 25 mg to 30 mg (n = 3 studies; RD = 0.12, 95% CI: 0.00 to 0.23; p = 0.04). There was no clear evidence of a greater risk of adverse events in the low-dose subgroup (10 mg–20 mg). In the high-dose subgroup (25 mg–30 mg), adverse events tended to be less frequent in the control group. However, the overall effect was not significant (p = 0.14), as presented in Figure 7.
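The relationship between a pooled risk difference, its confidence interval, and its p-value can be illustrated with a short back-calculation. The sketch assumes a symmetric Wald-type 95% interval on the RD scale and uses the rounded dose-subgroup values reported above, so the recovered p-values are approximations only; `wald_check` is an illustrative name.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution

def wald_check(estimate, ci_low, ci_high):
    """Recover the standard error, z statistic, and two-sided p-value from a 95% CI."""
    se = (ci_high - ci_low) / (2.0 * Z_95)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p

# Rounded values as reported for the two dose subgroups
reported = {
    "10-20 mg": (0.09, -0.22, 0.39),
    "25-30 mg": (0.12, 0.00, 0.23),
}

checks = {dose: wald_check(*vals) for dose, vals in reported.items()}
for dose, (se, z, p) in checks.items():
    print(f"{dose}: SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

A lower confidence limit that sits at zero after rounding, as in the high-dose subgroup, corresponds to a p-value close to the 0.05 threshold, which is why that result is best read as borderline.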

4. Discussion

The meta-analysis showed large short-term reductions in depressive symptom severity following psilocybin-assisted therapy. However, effect estimates differed according to study design. Within-subject and pre–post studies produced larger effects than between-subject controlled comparisons, suggesting that uncontrolled designs may overestimate treatment effects due to expectancy effects, functional unblinding, regression to the mean, or contextual therapeutic factors. Although between-subject randomized or controlled studies provide more robust evidence, their certainty remains limited by small sample sizes, heterogeneous comparators, differences in depression rating scales, and short follow-up periods. Therefore, while the results support a promising antidepressant signal, they should be interpreted cautiously.
Ten clinical trials were included in the systematic review, and eight provided sufficient quantitative data for the efficacy meta-analysis, which used a random-effects model. Beyond the design-related differences in effect size, the certainty of the evidence is limited by the small number of trials and by heterogeneity across designs, comparators, scales, and follow-up periods.
Our findings align with recent systematic reviews evaluating the short-term improvement of depressive severity following psilocybin therapy, while emphasizing the need for more robust long-term data [12]. Metaxa and Clarke [13] similarly observed significant reductions in MADRS and QIDS scores at 3 to 6 weeks, which is highly consistent with our pooled effect sizes. Furthermore, the systematic review by Salvetti [14] highlighted dose-dependent responses and the critical importance of psychological support—elements that are consistently reflected in the trials included in our analysis. Our study incorporates recent randomized and exploratory trials—such as Goodwin, Raison, and von Rotz [5,16,17]—focusing exclusively on major depressive disorder (MDD). This allows us to provide a detailed breakdown of optimal dosing, short-term safety, and the relevance of the psychotherapeutic framework.
To test the robustness of the pooled estimate, a sensitivity analysis removed influential observations, such as Davis [10] and the open-label data from Carhart-Harris [18], which reported unusually large effect sizes. Even after excluding these influential studies, the overall pooled effect remained large and significant (d = 1.21 for the full dataset; d = 1.42 for the between-subjects RCTs) with near-zero residual heterogeneity. Furthermore, meta-regression revealed that the duration of follow-up (in weeks) did not significantly moderate the effects, which is compatible with an antidepressant response that is maintained over the evaluated periods. However, the choice of depression scale strongly moderated the effect size in between-subject trials; studies using the MADRS or QIDS-SR yielded significantly smaller estimates than those using the GRID-HAMD. These meta-regression findings should be interpreted with caution. Although they were useful for exploring potential sources of heterogeneity, the number of included studies and effect sizes per moderator was limited. Therefore, these analyses were likely underpowered, and their findings should be considered exploratory and hypothesis-generating rather than confirmatory. Non-significant moderator effects should not be interpreted as evidence that no moderation exists, while significant associations may be unstable and require confirmation in larger, prospectively designed studies.
A notable characteristic of the included trials is the variable standardization of the psychotherapeutic protocol. This is highly relevant because, as pointed out in a comprehensive review by Reiff [24], part of the therapeutic effect of psychedelics results from complex interactions between the substance and the mindset of the patient (the “set”), as well as interactions with the environment and the therapist (the “setting”). Notably, the trial conducted by Sloshower [21] sought to address this by following a manual based on Acceptance and Commitment Therapy (ACT) as a therapeutic frame for the sessions. As that manual describes, psychedelic-assisted psychotherapy is generally structured in three distinct parts: preparatory sessions, support during the acute dosing sessions, and subsequent psychological integration [25].
The pooled safety data did not show an increased risk of severe adverse events with psilocybin compared with control conditions. Most treatment-emergent adverse events (such as headache, nausea, and transient anxiety) were mild and resolved on the day of onset [11]. Furthermore, an exploratory Phase II study by Goodwin [19] indicated that a single 25 mg dose of psilocybin can be administered safely and effectively as an adjunct to an ongoing SSRI. This is an important finding, as it suggests that the problematic process of withdrawing patients from serotonergic antidepressants prior to psilocybin therapy may not always be necessary.
As clinical trial phases continue to progress, a critical point to consider is future access to treatment and its health-economic impact. To date, cost-effectiveness analyses evaluating psilocybin treatment for depression have yet to be published [8,26,27]. However, comparisons can be drawn from other available and emerging treatments. For instance, the FDA-approved treatment for TRD, esketamine (Spravato), can cost between US $4800 and US $6800 for the first month of treatment, and up to US $3600 per month during maintenance therapy [27]. Furthermore, a recent health-economic model analyzing midomafetamine-assisted therapy (MDMA-AT) for chronic post-traumatic stress disorder estimated the total cost of a treatment course at US $48,376. This cost is largely driven by the resource-intensive nature of the required 90 min preparation and integration sessions, as well as the 8 h interventional sessions requiring constant therapist monitoring [28]. Given that psilocybin therapy demands a similarly intensive psychotherapeutic framework, establishing its cost-effectiveness is an essential gap to address for future health policy implementation. Finally, while other classic psychedelics such as ayahuasca and LSD have also demonstrated efficacy for depressive symptoms [7], these studies have placed comparatively less emphasis on exploring safety and long-term economic outcomes.

Limitations

Several limitations within the current body of evidence and our meta-analytical approach must be acknowledged. First, from a statistical perspective, there is a pervasive lack of complete data reporting in the original trials, particularly missing standard deviation (SD) values for continuous outcomes across multiple time points. This necessitated the imputation of SDs or the derivation of values via graphical extraction tools, adhering to the methodologies outlined in the Cochrane Handbook for Systematic Reviews of Interventions. While these techniques are validated, they introduce a degree of estimation error that complicates direct meta-analytic comparisons.
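For readers unfamiliar with how a missing SD can be derived, the following sketch shows two common conversions described in the Cochrane Handbook: from a standard error and from a 95% confidence interval of a single group mean. The numbers are hypothetical, the function names are illustrative, and the sketch does not reproduce the specific imputation choices made for each trial in this review. The confidence-interval conversion uses the large-sample normal quantile; for small samples the Handbook recommends a t-distribution quantile instead.

```python
import math

def sd_from_se(se, n):
    """SD of a group from the standard error of its mean."""
    return se * math.sqrt(n)

def sd_from_ci(lower, upper, n, z=1.96):
    """SD of a group from the 95% CI of its mean (large-sample approximation)."""
    return math.sqrt(n) * (upper - lower) / (2.0 * z)

# HYPOTHETICAL group: n = 30, mean 20.0, SE 1.5, so the 95% CI is 20.0 +/- 2.94
n = 30
sd_a = sd_from_se(1.5, n)
sd_b = sd_from_ci(17.06, 22.94, n)

print(f"SD from SE: {sd_a:.2f}")
print(f"SD from CI: {sd_b:.2f}")
```

Both routes should agree when the interval was built from the same standard error, which makes the pair a convenient cross-check when a trial reports both quantities.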
Second, the included trials present substantial methodological heterogeneity. Study designs ranged from open-label and waitlist-controlled trials to rigorous randomized controlled trials (RCTs). Furthermore, there was significant variability in dosing protocols—such as fixed doses versus weight-based doses—and varying frequencies of psilocybin administration and psychotherapy sessions. Crucially, our meta-regression identified that the choice of psychometric instrument acts as a major moderating variable. Specifically, trials utilizing the MADRS or QIDS-SR yielded significantly more conservative effect size estimates compared to those utilizing the GRID-HAMD. This scale-driven variance suggests that clinical interpretations of efficacy are heavily influenced by the specific tools used to measure depressive symptoms.
A subgroup analysis comparing participants with non-treatment-resistant MDD versus TRD was considered but could not be formally conducted. Several included trials enrolled mixed MDD/TRD populations, used heterogeneous operational definitions of TRD, or did not report independent effect estimates stratified by treatment-resistance status. Consequently, any pooled comparison between MDD and TRD would have been underpowered and at risk of ecological bias. This is an important limitation, as TRD may differ from non-resistant MDD in clinical severity, neurobiological mechanisms, chronicity, and treatment responsiveness. Future trials should report stratified outcomes by TRD status to allow more clinically informative meta-analytic comparisons.
Third, the profound psychoactive nature of psilocybin makes true double-blinding exceptionally difficult, rendering most trials highly susceptible to functional unblinding and expectancy bias. This is particularly problematic in within-subject and open-label designs, where the anticipation of receiving a psychedelic substance may artificially inflate the reported antidepressant effects.
Finally, demographic and health-economic limitations remain prominent. The patients across the included trials are overwhelmingly White, non-Hispanic, and from upper socioeconomic backgrounds, which severely restricts the generalizability of these findings to diverse clinical populations. Furthermore, robust cost-effectiveness analyses for psilocybin-assisted therapy have not yet been published. Given the resource-intensive nature of the required psychological support—similar to the substantial costs recently modeled for MDMA-assisted therapy or the maintenance costs of intranasal esketamine—understanding the economic viability of this treatment is an urgent gap that must be addressed prior to widespread clinical implementation.

5. Conclusions

The aggregated quantitative evidence shows that psilocybin-assisted therapy produces a clinically meaningful reduction in depressive symptoms for patients with MDD and TRD that remained stable over the evaluated follow-up periods, maintaining a large effect size (d = 1.42 for controlled trials) even after sensitivity analyses excluded influential outliers. This antidepressant efficacy is accompanied by an acceptable short-term safety profile, with adverse events largely restricted to transient, mild-to-moderate effects on the dosing day. However, despite these promising results, psilocybin-assisted therapy is not yet ready for uncritical mainstream adoption. To translate this therapeutic potential into an accessible and scalable psychiatric treatment, future large-scale phase 3 trials must implement standardized, manualized psychotherapeutic frameworks (such as Acceptance and Commitment Therapy), recruit ethnically and socioeconomically diverse clinical populations, establish long-term cost-effectiveness, and develop innovative active-placebo designs to mitigate the persistent unblinding and expectancy biases inherent to psychedelic research.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/psychiatryint7030137/s1 [29].

Author Contributions

Conceptualization, A.L.-L., D.N.L.-S., and C.G.T.-L.; methodology, A.L.-L., D.N.L.-S., and C.G.T.-L.; project administration, A.L.-L., D.N.L.-S., and C.G.T.-L.; formal analysis, all authors; investigation, all authors; validation, all authors; writing—original draft preparation, A.L.-L., D.N.L.-S., and C.G.T.-L.; writing—review and editing, all authors. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

Acknowledgments

During the preparation of this manuscript, the authors used GenAI [ChatGPT-5.2-Plus] for the purpose of assisting in the generation of the risk of bias graph. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Search strategy.
Search strategy: MEDLINE
((“psilocybin”[MeSH Terms] OR “psilocybin”[All Fields] OR “psilocybine”[All Fields] OR “psilocybin s”[All Fields]) AND (“depressed”[All Fields] OR “depression”[MeSH Terms] OR “depression”[All Fields] OR “depressions”[All Fields] OR “depression s”[All Fields] OR “depressive disorder”[MeSH Terms] OR (“depressive”[All Fields] AND “disorder”[All Fields]) OR “depressive disorder”[All Fields] OR “depressivity”[All Fields] OR “depressive”[All Fields] OR “depressively”[All Fields] OR “depressiveness”[All Fields] OR “depressives”[All Fields])) AND ((controlledclinicaltrial[Filter] OR randomizedcontrolledtrial[Filter]) AND (humans[Filter]))
Search strategy: CENTRAL
  • ..nlp psilocybin depression {No Related Terms}
  • limit 1 to year = “2010–2024”
  • limit 2 to (trial registry record and full text and year = “2010–2024” and english language)
Search strategy: PsycINFO
  • ..nlp psilocybin depression {No Related Terms}
  • limit 1 to year = “2010–2024”
  • limit 2 to (trial registry record and full text and year = “2010–2024” and english language)
Search strategy: Scopus
(KEY(psilocybin) AND KEY (depression)) AND (LIMIT-TO (SUBJAREA, “MEDI”) OR LIMIT-TO (SUBJAREA, “NEUR”)) AND (LIMIT-TO (DOCTYPE, “ar”)) AND (LIMIT-TO (LANGUAGE, “English”))

Appendix B

Table A2. Serious adverse events definitions [5,9,10,11,16,17,18,19,20,21].
| Study | Serious adverse event definition and safety monitoring |
| --- | --- |
| Davis (2021) | No operational definition of serious adverse events (AEs) reported. Physiological parameters measured: heart rate and blood pressure measured pre-dose and at 30–360 min post-dose. Psychological/behavioral measures: challenging emotional experiences such as fear or sadness evaluated through the Mystical Experience Questionnaire (MEQ30) and Challenging Experience Questionnaire (CEQ26). Suicide-related AEs: the Columbia Suicide Severity Rating Scale (C-SSRS) was assessed at every in-person visit. |
| Gukasyan (2022) | No operational definition of serious AEs reported. Same as Davis (2021); in addition, symptoms indicative of hallucinogen persisting perceptual disorder (HPPD) were solicited. |
| Goodwin (2023) | Treatment-emergent adverse events (TEAEs) were coded using the Medical Dictionary for Regulatory Activities (MedDRA) version 23.0. TEAEs are any new or worsening adverse events occurring after treatment initiation. |
| Carhart-Harris (2016) | No operational definition of serious AEs reported. Safety was evaluated through the following components. Vital signs: blood pressure and heart rate measured at baseline, 30, 60, 120, 180, 240, 300 and 360 min after dosing. Observer ratings of acute psychological effects at the same time points (scale 0–4). The Acute Altered State of Consciousness questionnaire (11D-ASC) was completed 6–7 h after dosing. Telephone follow-up 1 day after the low dose. In-person clinical evaluation 1 week after the high dose. All baseline questionnaires and assessments were completed. |
| Carhart-Harris (2018) | Same as Carhart-Harris (2016). Additionally, the following definition is provided: adverse events were defined as any patient-reported or clinician-observed side effects following treatment. At each post-treatment visit, participants were explicitly asked whether they had experienced any side effects related to the intervention. Additionally, any spontaneously reported or observed side effects were documented. |
| Goodwin (2022) | Adverse events were defined as any new or worsening symptom after dosing and were coded using MedDRA v23.0. Serious AEs followed ICH-GCP criteria, including any life-threatening event, hospitalization, significant disability, medically important condition, or suicidal ideation/behavior identified via C-SSRS. Safety monitoring included vital signs (screening, baseline, Days 1–2), clinical labs (screening, Day 2, Week 3), and 12-lead ECG (screening, Day 2). |
| Raison (2023) | AEs included any new or worsening symptom after dosing and were graded for severity, seriousness, and causality. Solicited AEs included suicidal ideation (via C-SSRS or MADRS item 10), elevated blood pressure or heart rate requiring medication, drug overdose with suicidal intent, headache, nausea, and visual perceptual effects. Serious AEs followed standard regulatory criteria (death, life-threatening events, hospitalization, or significant/persistent disability). |
| von Rotz (2023) | AEs were monitored at each visit and defined as clinically relevant symptoms persisting beyond acute psilocybin effects. Safety monitoring included psychological and physical well-being, suicidality, vital signs, and concomitant medication use. Blood pressure and pulse were evaluated hourly, and rescue medications (e.g., nifedipine, diazepam, olanzapine) were available but not required. |
| Sloshower (2023) | No operational definition of serious AEs. Safety was monitored throughout each dosing session (vital signs every 30 min for 2 h, then hourly) and at all follow-up visits. No rescue medications were needed during dosing sessions. |
| Carhart-Harris (2021) | AEs were defined as any new or worsening symptom occurring between dosing day 1 and week 6 and were coded using MedDRA v23.0. Recording procedures included structured patient questioning at every visit (“How have you been since your last visit?”), telephone follow-up, and clinician observation at the trial site. |

Appendix C

Table A3. GRADE summary of findings for psilocybin-assisted therapy in major depressive disorder based on standardized effect sizes.
| Outcome | Evidence Base | Effect Estimate | Certainty of Evidence | Main Reasons for Rating |
| --- | --- | --- | --- | --- |
| Short-term reduction in depressive symptom severity, full dataset | 8 studies; 17 effect-size estimates; mixed within-subject and between-subject designs | Cohen’s d = 1.15, 95% CI 0.83–1.48 | Low | Downgraded due to risk of bias, functional unblinding, expectancy effects, heterogeneous study designs, different depression scales, and limited number of trials. |
| Short-term reduction in depressive symptom severity, within-subject designs | Open-label, longitudinal, or within-subject comparisons | Cohen’s d = 1.63, 95% CI 1.13–2.14 | Very low to low | Downgraded due to lack of independent comparators, open-label or pre–post designs, risk of confounding, regression to the mean, expectancy effects, and functional unblinding. |
| Short-term reduction in depressive symptom severity, between-subject controlled designs | Randomized or controlled comparisons | Cohen’s d = 0.96, 95% CI 0.54–1.37 | Low to moderate | Controlled designs provide more robust evidence, but certainty was downgraded due to functional unblinding, small number of trials, heterogeneous comparators, different depression scales, and limited follow-up. |
| Sensitivity analysis after exclusion of influential studies | Restricted dataset after leave-one-out diagnostics | Cohen’s d = 1.21, 95% CI 0.90–1.51 | Low | The effect remained large and statistically significant after excluding influential observations; however, certainty remains limited by small number of studies, study-design heterogeneity, and residual risk of bias. |
| Serious adverse events during psilocybin administration | Controlled trials reporting serious adverse events | RD = −0.01, 95% CI −0.04 to 0.02 | Low | Downgraded due to few events, small sample size, short safety follow-up, and imprecision. No clear increase in serious adverse events was observed. |
| Any adverse events during psilocybin administration | Controlled trials reporting adverse events | Overall RD = 0.10, 95% CI −0.03 to 0.23 | Low to very low | Downgraded due to heterogeneous adverse-event definitions, variable reporting, dose differences, imprecision, and limited number of trials. Most adverse events were transient. |
| Long-term efficacy | Follow-up and extension studies | Sustained improvement reported in selected cohorts | Very low | Downgraded due to non-randomized follow-up, lack of independent long-term comparators, possible co-interventions, attrition, and limited generalizability. |
| Applicability to older adults and patients with comorbidities | Subgroup evidence not available | Not estimable | Very low | No included trial reported efficacy or safety outcomes stratified for participants older than 65 years. Medical and psychiatric comorbidities were generally excluded or inconsistently reported. |

References

  1. Depressive Disorder (Depression). Available online: https://www.who.int/news-room/fact-sheets/detail/depression (accessed on 13 May 2026).
  2. Halaris, A.; Sohl, E.; Whitham, E.A. Treatment-Resistant Depression Revisited: A Glimmer of Hope. J. Pers. Med. 2021, 11, 155.
  3. James, S.L.; Abate, D.; Abate, K.H.; Abay, S.M.; Abbafati, C.; Abbasi, N.; Abbastabar, H.; Abd-Allah, F.; Abdela, J.; Abdelalim, A.; et al. Global, regional, and national incidence, prevalence, and years lived with disability for 354 diseases and injuries for 195 countries and territories, 1990–2017: A systematic analysis for the Global Burden of Disease Study 2017. Lancet 2018, 392, 1789–1858.
  4. Zhdanava, M.; Pilon, D.; Ghelerter, I.; Chow, W.; Joshi, K.; Lefebvre, P.; Sheehan, J.J. The Prevalence and National Burden of Treatment-Resistant Depression and Major Depressive Disorder in the United States. J. Clin. Psychiatry 2021, 82, 20m13699.
  5. Goodwin, G.M.; Aaronson, S.T.; Alvarez, O.; Arden, P.C.; Baker, A.; Bennett, J.C.; Bird, C.; Blom, R.E.; Brennan, C.; Brusch, D.; et al. Single-Dose Psilocybin for a Treatment-Resistant Episode of Major Depression. N. Engl. J. Med. 2022, 387, 1637–1648.
  6. Rosenblat, J.D.; Husain, M.I.; Lee, Y.; McIntyre, R.S.; Mansur, R.B.; Castle, D.; Offman, H.; Parikh, S.V.; Frey, B.N.; Schaffer, A.; et al. The Canadian Network for Mood and Anxiety Treatments (CANMAT) task force report: Serotonergic psychedelic treatments for major depressive disorder. Can. J. Psychiatry Rev. Can. Psychiatr. 2023, 68, 5–21.
  7. Ko, K.; Kopra, E.I.; Cleare, A.J.; Rucker, J.J. Psychedelic therapy for depressive symptoms: A systematic review and meta-analysis. J. Affect. Disord. 2023, 322, 194–204.
  8. Haikazian, S.; Chen-Li, D.C.J.; Johnson, D.E.; Fancy, F.; Levinta, A.; Husain, M.I.; Mansur, R.B.; McIntyre, R.S.; Rosenblat, J.D. Psilocybin-assisted therapy for depression: A systematic review and meta-analysis. Psychiatry Res. 2023, 329, 115531.
  9. Carhart-Harris, R.L.; Bolstridge, M.; Rucker, J.; Day, C.M.J.; Erritzoe, D.; Kaelen, M.; Bloomfield, M.; Rickard, J.A.; Forbes, B.; Feilding, A.; et al. Psilocybin with psychological support for treatment-resistant depression: An open-label feasibility study. Lancet Psychiatry 2016, 3, 619–627.
  10. Davis, A.K.; Barrett, F.S.; May, D.G.; Cosimano, M.P.; Sepeda, N.D.; Johnson, M.W.; Finan, P.H.; Griffiths, R.R. Effects of Psilocybin-Assisted Therapy on Major Depressive Disorder: A Randomized Clinical Trial. JAMA Psychiatry 2021, 78, 481.
  11. Carhart-Harris, R.; Giribaldi, B.; Watts, R.; Baker-Jones, M.; Murphy-Beiner, A.; Murphy, R.; Martell, J.; Blemings, A.; Erritzoe, D.; Nutt, D.J. Trial of Psilocybin versus Escitalopram for Depression. N. Engl. J. Med. 2021, 384, 1402–1411.
  12. Li, L.J.; Mo, Y.; Shi, Z.M.; Huang, X.B.; Ning, Y.P.; Wu, H.W.; Yang, X.-H.; Zheng, W. Psilocybin for major depressive disorder: A systematic review of randomized controlled studies. Front. Psychiatry 2024, 15, 1416420.
  13. Metaxa, A.M.; Clarke, M. Efficacy of psilocybin for treating symptoms of depression: Systematic review and meta-analysis. BMJ 2024, 385, e078084.
  14. Salvetti, G.; Saccenti, D.; Moro, A.S.; Lamanna, J.; Ferro, M. Comparison between Single-Dose and Two-Dose Psilocybin Administration in the Treatment of Major Depression: A Systematic Review and Meta-Analysis of Current Clinical Trials. Brain Sci. 2024, 14, 829.
  15. ChatGPT. Available online: https://chatgpt.com/g/g-p-68b8c359d2b48191836a27843165d489-articulo-psilo/c/6a065f84-64b0-83e8-954b-8ccd98d42c0f (accessed on 13 May 2026).
  16. Raison, C.L.; Sanacora, G.; Woolley, J.; Heinzerling, K.; Dunlop, B.W.; Brown, R.T.; Kakar, R.; Hassman, M.; Trivedi, R.P.; Robison, R.; et al. Single-Dose Psilocybin Treatment for Major Depressive Disorder: A Randomized Clinical Trial. JAMA 2023, 330, 843.
  17. Von Rotz, R.; Schindowski, E.M.; Jungwirth, J.; Schuldt, A.; Rieser, N.M.; Zahoranszky, K.; Seifritz, E.; Nowak, A.; Nowak, P.; Jäncke, L.; et al. Single-Dose Psilocybin-Assisted Therapy in Major Depressive Disorder: A Placebo-Controlled, Double-Blind, Randomised Clinical Trial. eClinicalMedicine 2023, 56, 101809.
  18. Carhart-Harris, R.L.; Bolstridge, M.; Day, C.M.J.; Rucker, J.; Watts, R.; Erritzoe, D.E.; Kaelen, M.; Giribaldi, B.; Bloomfield, M.; Pilling, S.; et al. Psilocybin with Psychological Support for Treatment-Resistant Depression: Six-Month Follow-Up. Psychopharmacology 2018, 235, 399–408.
  19. Goodwin, G.M.; Croal, M.; Feifel, D.; Kelly, J.R.; Marwood, L.; Mistry, S.; O’Keane, V.; Peck, S.K.; Simmons, H.; Sisa, C.; et al. Psilocybin for Treatment Resistant Depression in Patients Taking a Concomitant SSRI Medication. Neuropsychopharmacology 2023, 48, 1492–1499.
  20. Gukasyan, N.; Davis, A.K.; Barrett, F.S.; Cosimano, M.P.; Sepeda, N.D.; Johnson, M.W.; Griffiths, R.R. Efficacy and Safety of Psilocybin-Assisted Treatment for Major Depressive Disorder: Prospective 12-Month Follow-Up. J. Psychopharmacol. 2022, 36, 151–158.
  21. Sloshower, J.; Skosnik, P.D.; Safi-Aghdam, H.; Pathania, S.; Syed, S.; Pittman, B.; D’Souza, D.C. Psilocybin-Assisted Therapy for Major Depressive Disorder: An Exploratory Placebo-Controlled, Fixed-Order Trial. J. Psychopharmacol. 2023, 37, 698–706.
  22. Higgins, J.P.T.; Altman, D.G.; Gotzsche, P.C.; Juni, P.; Moher, D.; Oxman, A.D.; Savović, J.; Schulz, K.F.; Weeks, L.; Sterne, J.A.C.; et al. The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials. BMJ 2011, 343, d5928.
  23. JASP Team. JASP (Version 0.96.0) [Computer Software]. 2026. Available online: https://jasp-stats.org/ (accessed on 12 May 2026).
  24. Reiff, C.M.; Richman, E.E.; Nemeroff, C.B.; Carpenter, L.L.; Widge, A.S.; Rodriguez, C.I.; Kalin, N.H.; McDonald, W.M.; the Work Group on Biomarkers and Novel Treatments, a Division of the American Psychiatric Association Council of Research. Psychedelics and Psychedelic-Assisted Psychotherapy. Am. J. Psychiatry 2020, 177, 391–410.
  25. Sloshower, J.; Guss, J.; Krause, R.; Wallace, R.M.; Williams, M.T.; Reed, S.; Skinta, M.D. Psilocybin-Assisted Therapy of Major Depressive Disorder Using Acceptance and Commitment Therapy as a Therapeutic Frame. J. Context. Behav. Sci. 2020, 15, 12–19.
  26. McCarthy, B.; Bunn, H.; Santalucia, M.; Wilmouth, C.; Muzyk, A.; Smith, C.M. Dextromethorphan-Bupropion (Auvelity) for the Treatment of Major Depressive Disorder. Clin. Psychopharmacol. Neurosci. 2023, 21, 609–616.
  27. Salahudeen, M.S.; Wright, C.M.; Peterson, G.M. Esketamine: New Hope for the Treatment of Treatment-Resistant Depression? A Narrative Review. Ther. Adv. Drug Saf. 2020, 11, 2042098620937899.
  28. Stanicic, F.; Zah, V.; Grbic, D.; De Angelo, D. Cost-Effectiveness of Midomafetamine-Assisted Therapy (MDMA-AT) in Chronic and Treatment-Resistant Post-Traumatic Stress Disorder of Moderate or Higher Severity: A Health-Economic Model. PLoS ONE 2024, 19, e0313569.
  29. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
Figure 1. Risk-of-bias assessment by study design. Risk-of-bias assessment for the randomized controlled trials included in the review [5,10,11,16,17].
Figure 2. Narrative appraisal of study limitations. Narrative appraisal of study limitations across open-label, non-randomized, and follow-up studies included in the review [9,18,19,20,21].
Figure 3. PRISMA flow diagram for new systematic reviews.
Figure 4. Efficacy effect size. Efficacy effect size of included studies [5,10,11,16,17,18,20].
Figure 5. Residual funnel plot.
Figure 6. Severe adverse events. Severe adverse events of included studies [5,10,11,17,20].
Figure 7. Any adverse events. Any adverse events of included studies [5,10,11,17,20].
Table 1. Sociodemographic characteristics. Sociodemographic characteristics of the studies included in the review [5,9,10,11,16,17,18,19,20,21].
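| Study | N | Women (%) | Mean age (years) |
| --- | --- | --- | --- |
| Davis (2021) | 24 | 67 | 39.8 |
| Gukasyan (2022) | 24 | 67 | 39.8 |
| Goodwin (2023) | 19 | 68.4 | 42.2 |
| Carhart-Harris (2016) | 12 | 50.0 | 42.6 |
| Carhart-Harris (2018) | 20 | 30.0 | 44.1 |
| Goodwin (2022) | 233 | 51.9 | 39.8 |
| Raison (2023) | 104 | 50.0 | 41.1 |
| von Rotz (2023) | 52 | 63.5 | 36.7 |
| Sloshower (2023) | 19 | 68.4 | 42.79 |
| Carhart-Harris (2021) | 59 | 33.9 | 41.2 |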
Table 2. Summary of clinical outcomes. Summary of clinical outcomes of the studies included in the review [5,9,10,11,16,17,18,19,20,21].
| Study | Design/Diagnosis | Dose/Sessions | N randomized (I/C/arms) | Primary scale | Main results | Response % (n) | Remission % (n) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Davis et al. (2021) | RCT parallel; MDD/TRD. | 20 mg; 30 mg | 27 (15/12); 24 completed both sessions (13/11) | GRID-HAMD | Week 5: 8.0 (7.1) vs. 23.8 (5.4); d = 2.5 (95% CI 1.4–3.5); p < 0.001. Week 8: 8.5 (5.7) vs. 23.5 (6.0); d = 2.6 (95% CI 1.5–3.7); p < 0.001 | Week 1: 71% (17/24); Week 4: 71% (17/24) | Week 1: 58% (14/24); Week 4: 54% (13/24) |
| Gukasyan et al. (2022) | RCT parallel; MDD/TRD. 12-month follow-up from Davis 2021. | 20 mg; 30 mg | 27 (15/12); 24 completed both sessions (13/11) | GRID-HAMD | 3 months: 9.3 (8.8); d = 2.0 (95% CI 1.3–2.7); p < 0.001. 6 months: 7.0 (7.7); d = 2.6 (95% CI 1.7–3.4); p < 0.001. 12 months: 7.7 (7.9); d = 2.4 (95% CI 1.6–3.2); p < 0.001 | 3 months: 67%; 6 months: 79%; 12 months: 75% | 3 months: 58%; 6 months: 71%; 12 months: 58% |
| Goodwin et al. (2023) | Phase II, exploratory, open-label, fixed-dose clinical trial; TRD. | 25 mg | 19 | MADRS | Baseline: 31.7 (SD 5.77). Week 3: 16.8 (95% CI 11.2–22.4). Change from baseline: −14.9 (95% CI −20.7 to −9.2). Significant improvement evident from Day 2 and maintained through Week 3 | Day 2: 63.2%; Week 1: 57.9%; Week 2: 57.9%; Week 3: 42.1% | Day 2: 52.6%; Week 1: 47.4%; Week 2: 42.1%; Week 3: 42.1% |
| Carhart-Harris et al. (2016) | Open-label feasibility study; TRD. | 10 mg; 25 mg | 12 | BDI | Week 1: 8.7 (8.4); Δ −25.0 (95% CI −20.1 to −29.9); p = 0.002. 3 months: 15.2 (11.0); Δ −18.5 (95% CI −11.8 to −25.2); p = 0.002 | Week 1: NR; 3 months: 58% (7/12) | Week 1: 67% (8/12); 3 months: 42% (5/12) |
| Carhart-Harris et al. (2018) | Open-label feasibility study; TRD. Six-month follow-up from Carhart-Harris 2016. | 10 mg; 25 mg | 20; 19 completed measurements | BDI | Mean (SD): Baseline 34.5 (7.3); 1 week 11.8 (11.1); 3 months 19.2 (13.9); 6 months 19.5 (13.9). Change vs. baseline: −22.7 (10.6) at 1 week; −15.3 (13.7) at 3 months; −14.9 (12.0) at 6 months. Effect sizes: d = 2.5 (1 week); d = 1.4 (3 and 6 months); all p < 0.001 | 6 months: 67% (6/9) (subset of responders) | 6 months: NR |
| Goodwin et al. (2022) | Phase II, double-blind, dose-finding, parallel-group RCT; MDD/TRD. | 25 mg; 10 mg | 233 (25 mg: 79; 10 mg: 75; 1 mg: 79); 77, 65, and 68 included in the per-protocol analysis | MADRS | Week 3: 25 mg: Δ −12.0 (SE 1.3), 95% CI −14.6 to −9.3; vs. 1 mg: −6.6 (95% CI −10.2 to −2.9); p < 0.001. 10 mg: Δ −7.9 (SE 1.4), 95% CI −10.6 to −5.2; vs. 1 mg: −2.5 (95% CI −6.2 to 1.2); p = 0.18. 1 mg: Δ −5.4 (SE 1.4), 95% CI −8.1 to −2.7 | Week 3: 25 mg: 37% (29/79); 10 mg: 19% (14/75); 1 mg: 18% (14/79) | Week 3: 25 mg: 29% (23/79); 10 mg: 9% (7/75); 1 mg: 8% (6/79) |
| Raison et al. (2023) | Phase II, 2-group, clinical RCT; MDD. | 25 mg | 104 (51/53); placebo-like active comparator (niacin 100 mg) | MADRS | Day 8: −17.8 vs. −5.8; mean difference ≈ −12.0 (95% CI −16.6 to −7.4); p < 0.001. Day 15: −19.2 vs. −8.0; p < 0.001. Day 29: −19.2 vs. −9.1; p < 0.001. Day 43: −19.1 (−22.7 to −15.5); mean difference −12.3 (95% CI −17.5 to −7.2); p < 0.001 | Psilocybin: 41.7% (20/48); Niacin: 11.4% (5/44); OR = 5.60 (95% CI 1.87–16.74), p = 0.002. Sustained response across days 8, 15, 29 and 43 | Psilocybin: 25% (12/48); Niacin: 9.1% (4/44); OR = 3.37 (95% CI 0.99–11.47), p = 0.04. Sustained response across days 8, 15, 29 and 43 |
| von Rotz et al. (2023) | RCT; double-blind, placebo-controlled, parallel-group design (single-center); MDD. | 16 mg | 52 (26/26) | MADRS | Day 2: −14.4 (95% CI −5.5 to −16.3); p = 0.0002; d = 1.14. Week 2 (day 14): −13.0 (95% CI −15.0 to −1.3); Cohen’s d = 0.97; p = 0.0011 | Week 2: 58% (15/26) | Week 2: 54% (14/26) |
| Sloshower et al. (2023) | Placebo-controlled, within-subject, fixed-order exploratory trial (placebo first, then psilocybin); MDD. | 0.3 mg/kg | 19; 15 completed both sessions | GRID-HAMD | Week 6: Reduction of −12.4 points after psilocybin; effect size d = 1.02–2.27, larger than placebo (d = 0.65–0.99); p < 0.0001 for treatment main effect. Greater improvement at 2 days post-psilocybin than 2 days post-placebo | Week 6 overall response: 100% (15/15); Placebo: 33.3% (5/15); Psilocybin: 66.7% (10/15) | Week 6 overall remission: 66.7% (10/15); Placebo: 20% (3/15); Psilocybin: 46.7% (7/15) |
| Carhart-Harris et al. (2021) | Phase 2, double-blind RCT; MDD. | 25 mg; 25 mg | 59 (30/29); C: escitalopram 20 mg | QIDS-SR | Week 6: Psilocybin: −8.0 ± 1.0; Escitalopram: −6.0 ± 1.0. Between-group difference: −2.0 (95% CI −5.0 to 0.9), p = 0.17; no statistically significant difference. Secondary outcomes: larger reductions with psilocybin on HAM-D-17 (−10.5 vs. −5.1), MADRS (−14.4 vs. −7.2), and BDI-1A (−18.4 vs. −10.8) | Week 6: Psilocybin: 70% (21/30); Escitalopram: 48% (14/29) | Week 6: Psilocybin: 57% (17/30); Escitalopram: 28% (8/29) |
Abbreviations: BDI-1A = Beck Depression Inventory–IA; CI = Confidence interval; d = Cohen’s d; C = control; GRID-HAMD = Grid Hamilton Depression Rating Scale; I = intervention; MADRS = Montgomery–Åsberg Depression Rating Scale; MDD = major depressive disorder; NR = not reported; OR = odds ratio; QIDS-SR16 = Quick Inventory of Depressive Symptomatology–Self Report; RCT = randomized controlled trial; SD = standard deviation; SE = standard error; SHAPS = Snaith–Hamilton Pleasure Scale; TRD = treatment-resistant depression.
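As a rough illustration of how two kinds of entries in Table 2 relate to the underlying summary data, the sketch below recomputes a Cohen's d from group means and SDs (the Davis et al. Week 5 row) and an unadjusted odds ratio from response counts (the Raison et al. row). It is not the authors' analysis code. The function names are hypothetical, the pooled-SD formula and the Wald interval are standard textbook choices assumed here, and the mapping of the completer counts (13/11) to the two Davis et al. arms is an assumption taken from the table's I/C ordering.

```python
import math

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (b minus a) using the pooled SD."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_b - mean_a) / math.sqrt(pooled_var)

def odds_ratio_wald(events_t, n_t, events_c, n_c, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI computed on the log scale."""
    a, b = events_t, n_t - events_t   # treatment: responders, non-responders
    c, d = events_c, n_c - events_c   # control: responders, non-responders
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

# Davis et al. (2021), Week 5 GRID-HAMD: 8.0 (7.1) vs. 23.8 (5.4).
# Assumption: n = 13 for the first group and n = 11 for the second.
d_davis = cohens_d(8.0, 7.1, 13, 23.8, 5.4, 11)

# Raison et al. (2023), response: psilocybin 20/48 vs. niacin 5/44.
or_raison, lo_raison, hi_raison = odds_ratio_wald(20, 48, 5, 44)

print(f"Davis d  = {d_davis:.2f}")
print(f"Raison OR = {or_raison:.2f} (95% CI {lo_raison:.2f} to {hi_raison:.2f})")
```

Under these assumptions the recomputed values land close to the tabulated d = 2.5 and OR = 5.60 (95% CI 1.87–16.74). Small differences are expected if the original trials used adjusted or model-based estimates.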
Table 3. Pooled effect sizes by study design subgroup.
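| Subgroup | k | d | 95% CI | t | p | τ | Qe | P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Within-subjects | 8 | 1.75 | [1.31, 2.20] | 10.97 | <0.001 | 0.199 | 6.02 | 0.198 |
| Between-subjects | 9 | 1.31 | [1.08, 1.54] | 14.47 | <0.001 | 0.141 | 5.24 | 0.387 |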
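The columns of Table 3 can be cross-checked against each other with simple arithmetic. The sketch below assumes that t is the pooled estimate divided by its standard error, so SE = d / t, and that the interval is symmetric around d. The function name is hypothetical and nothing here reproduces the JASP model itself.

```python
# Values copied from Table 3: (d, CI lower, CI upper, t)
table3 = {
    "Within-subjects": (1.75, 1.31, 2.20, 10.97),
    "Between-subjects": (1.31, 1.08, 1.54, 14.47),
}

def implied_se_and_multiplier(d, lo, hi, t):
    """Back out the standard error from d and t, then the CI multiplier.

    Assumes t = d / SE and a symmetric interval d +/- multiplier * SE.
    """
    se = d / t
    half_width = (hi - lo) / 2
    return se, half_width / se

results = {name: implied_se_and_multiplier(*vals) for name, vals in table3.items()}

for name, (se, mult) in results.items():
    print(f"{name}: SE = {se:.3f}, implied CI multiplier = {mult:.2f}")
```

Both implied multipliers come out well above the normal-theory value of 1.96. That would be consistent with intervals built on a t distribution with few degrees of freedom, although the table itself does not state which method was used.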
Table 4. Sensitivity analysis: pooled effects after exclusion of influential studies.
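| Subgroup | k | d | 95% CI | t | p | τ | Qe | P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Within-subjects | 5 | 1.46 | [0.75, 2.17] | 8.85 | 0.013 | 0.087 | 2.33 | 0.311 |
| Between-subjects | 6 | 1.42 | [1.28, 1.57] | 30.51 | <0.001 | 0.000 | 0.46 | 0.928 |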
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

