Search Results (572)

Search Parameters:
Keywords = generative music

21 pages, 788 KB  
Review
A Focused Survey of Generative AI-Based Music Therapy Systems: Recent Progress and Open Challenges
by Jin S. Seo
Appl. Sci. 2026, 16(9), 4120; https://doi.org/10.3390/app16094120 - 23 Apr 2026
Viewed by 106
Abstract
Generative artificial intelligence (AI)-based music generation has the potential to create new opportunities for music therapy; however, integrated examinations of generative AI and music therapy remain limited. This paper provides a focused survey of recent studies that apply generative AI within music therapy-related contexts, examining how such approaches have been explored in relation to therapeutic considerations, including emotional and physiological regulation. Rather than offering an exhaustive historical review, we analyze generative AI-augmented music therapy systems from a system-level perspective, focusing on their overall design and implementation. Based on this survey, we discuss open research challenges at the intersection of generative music, adaptive systems, and digital health, and outline future research directions toward scalable and personalized generative AI-based music therapy. Full article
(This article belongs to the Special Issue Advances in Digital Health Technologies)
24 pages, 386 KB  
Article
Curating Awareness and Hope: Performing Field and Finzi as Gentle Climate Activism
by Mine Doğantan-Dack
Arts 2026, 15(4), 84; https://doi.org/10.3390/arts15040084 - 17 Apr 2026
Viewed by 371
Abstract
This article presents an autoethnographic narrative account of curating and performing two pieces for solo piano and string orchestra—Climate Concerto by Brian Field and Eclogue by Gerald Finzi—to advocate for climate action. It discusses the selection of a concert venue that could be “thickly lived”, offering layers of cultural, historical and aesthetic resonance, and a concert date that could generate “interaction chains”, where engagement in one event motivates engagement in others. The article reflects on the multiple forms of loss brought about by the climate emergency, exploring Field’s musical portrayal of environmental loss and Finzi’s evocation of a harmonious human-nature relationship, which highlights a way of being-in-the-world that has been lost. In response to pervasive pessimism and dystopian narratives in climate communication, the discussion foregrounds hope as a powerful motivator for positive action, showing how the narrative scope of Field’s large-scale forms and the aesthetic beauty of Finzi’s music can elicit felt hope. The article also advocates for gentle musical activism for climate action, emphasising music’s capacity to cultivate relational sensitivity, ethical responsiveness, and collective responsibility toward each other and the world—even amid ecological crisis, social fragmentation, and uncertainty. Full article
(This article belongs to the Special Issue Creating Musical Experiences)
15 pages, 1420 KB  
Article
DC-MEPV: Dual-Channel Assisted Music Emotion Perception and Visualization in Acousto-Optic Synergistic Intelligent Cockpits
by Wei Shen, Xingang Mou, Songqing Le, Zhixing Zong and Jiaji Li
Appl. Sci. 2026, 16(8), 3800; https://doi.org/10.3390/app16083800 - 13 Apr 2026
Viewed by 293
Abstract
We propose a Dual-Channel assisted Music Emotion Perception and Visualization (DC-MEPV) framework designed for ambient lighting in intelligent vehicle cockpits, addressing the increasing demand for advanced human–machine interaction in the automotive industry. This framework consists of three main components: the Multi-Scale Feature Extraction Block (MSFEB), the Global Sequence Modeling Block (GSMB), and the Emotional Color Visualization Algorithm (ECV-Algo). The MSFEB extracts valence and arousal (V-A) features from dual channels at multiple temporal scales, with each channel employing a hybrid neural network architecture to capture multi-scale emotional representations. The GSMB integrates positional encoding, bidirectional long short-term memory (BiLSTM) networks, and multi-head self-attention mechanisms to dynamically model global emotional sequences. The ECV algorithm utilizes personalized emotion–color association rules to achieve expressive emotion-driven lighting visualization based on a continuous mapping from emotion space to color space. We conducted comprehensive comparison and ablation experiments to evaluate the model’s emotion perception performance, and designed three metrics to evaluate the quality of the generated visualizations. The model outperformed other networks in both comparative and ablation experiments. Additionally, the generated lights demonstrated strong performance in terms of CIEDE2000 variation rates, unique color ratios, and joint histogram entropy. DC-MEPV achieved excellent performance in emotion perception and visualizations on the DEAM and PMEmo datasets. Full article
(This article belongs to the Section Electrical, Electronics and Communications Engineering)
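As a rough illustration of what a continuous mapping from emotion space to colour space can look like, the Python sketch below turns a valence-arousal point into an RGB triple. It is not the paper's ECV-Algo: the function name `va_to_rgb`, the choice of the V-A angle as hue, and the brightness rule are all assumptions made for illustration only.

```python
import colorsys
import math

def va_to_rgb(valence, arousal):
    """Map a valence-arousal point in [-1, 1]^2 to an 8-bit (R, G, B) tuple.

    Hypothetical rule (not taken from the paper): the angle of the point
    in the V-A plane selects the hue, and its distance from the origin
    sets the saturation, so near-neutral emotions come out almost grey.
    """
    # Clamp inputs so out-of-range predictions still give a valid colour.
    v = max(-1.0, min(1.0, valence))
    a = max(-1.0, min(1.0, arousal))
    # Angle in [0, 2*pi) rescaled to a hue in [0, 1].
    hue = (math.atan2(a, v) % (2 * math.pi)) / (2 * math.pi)
    saturation = min(1.0, math.hypot(v, a))
    # Assumed brightness rule: higher arousal gives a brighter light.
    value = 0.6 + 0.4 * (a + 1.0) / 2.0
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, value)
    return round(r * 255), round(g * 255), round(b * 255)
```

Because the mapping is continuous in both inputs, a smoothly varying V-A trajectory produces smoothly varying light, which is the property the abstract emphasises; a personalised variant would replace the fixed hue rule with per-user associations.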
24 pages, 1821 KB  
Article
MAVIS: Multi-Stem Audio Visualisation in Immersive Spaces Framework
by Jethro Shell and Sophy Smith
Electronics 2026, 15(8), 1559; https://doi.org/10.3390/electronics15081559 - 8 Apr 2026
Viewed by 334
Abstract
The visualisation of music has gained traction in both research and musical composition in recent years. The increased accessibility of immersive technologies, such as virtual reality (VR) and other forms of mixed reality (MR), lends itself to the examination of how visualisation can impact the perception of audio virtual worlds. In this paper, we propose the MAVIS (Multi-stem Audio Visualisation in Immersive Spaces) design framework, an approach to generating a visualisation of multi-stem structured orchestral music in a virtual world. This research explores the impact on participants’ interaction with an orchestral musical composition through the use of two framework iterations informed by use cases. The resulting final design structure outlined in this article points towards constructing multi-stem virtual orchestral experiences through three pillars: semantic consistency, spatial agency, and complexity control. Whilst this research serves to propose a design intervention, future work requires a more extensive participant testing approach, coupled with an exploration of additional multimodal analysis. Full article
9 pages, 195 KB  
Essay
Cultural Diversity in Music Education: An Agenda for the Second Quarter of the 21st Century
by Huib Schippers
Educ. Sci. 2026, 16(4), 585; https://doi.org/10.3390/educsci16040585 - 7 Apr 2026
Viewed by 403
Abstract
In the late 1990s, there was much speculation on what music and music education would look like at the beginning of the 21st century. Few predicted the level of change that we have witnessed since then. In fact, developments in technologies, demographics, societies and global relations that have taken place in the world over the past 100 years would have been nigh unimaginable decade by decade, and keep coming with ever-increasing intensity. Travel, trade and technology have connected people and cultures in myriad and often wonderful ways. But inequities, divisions, and conflicts also reached new heights, with the first half of the 2020s subject to a seemingly endless stream of natural and manmade disasters and conflicts. Inevitably, all of these developments impacted on the world of music in general, and also on music education. In this essay, I try to summarise some key experiences and observations of my own first fifty years of living musical diversity (a world that started to open before me when I began learning Indian sitar in Amsterdam in 1975), and efforts across five continents that I have been involved in or researched. Juxtaposing this with key literature on the topic provides a broad basis for presenting ideas and views on progress towards giving musical practices from across the globe an appropriate place in music education at all levels: in community settings, schools, and institutions for professional training of performers and educators. In that process, I identify three critical junctures which can simultaneously present obstacles and opportunities for positive change: (1) terminologies, social inclusion, and the politics of diversity; (2) musical dynamics, technology, and institutional change; and (3) evolutions and revolutions in music learning and teaching. 
These inform a challenging but clear agenda for scholars, policy makers, institutional leaders, practising musicians and music educators worldwide who strive for more inclusive, diverse, equitable and relevant practices. Full article
(This article belongs to the Special Issue Music Education: Current Changes, Future Trajectories)
21 pages, 3333 KB  
Article
A Methodological Framework for Runtime Ontology Evolution in Dynamic Environments
by Valeria Seidita, Lucrezia Mosca and Antonio Chella
Appl. Sci. 2026, 16(7), 3494; https://doi.org/10.3390/app16073494 - 3 Apr 2026
Viewed by 374
Abstract
Intelligent systems operating in real-world environments are often required to make decisions in contexts that are only partially known at design time. In such scenarios, the assumption of a static and fully specified knowledge base becomes unrealistic, limiting the system’s ability to adapt to novel situations. This challenge is particularly relevant for robotic systems, whose behavior cannot be entirely pre-programmed when operating in dynamic and evolving environments. This paper proposes a methodological and architectural approach for the runtime update of ontologies and knowledge bases, enabling intelligent systems to autonomously adapt their internal representation of the world during execution. The proposed approach enables the system to identify knowledge gaps by distinguishing between previously unknown concepts and known concepts enriched with newly observed instances, and to integrate such information into the ontology in a controlled and consistent manner. The approach is implemented as an end-to-end pipeline that combines visual perception, semantic interpretation through large language models, and a robust ontology update mechanism. Particular attention is devoted to ensuring formal consistency during runtime evolution, addressing challenges such as the generation of valid OWL constructs, the management of inverse properties, datatype normalization, and the prevention of semantic degradation over iterative updates. By enabling knowledge-driven adaptation at runtime, the proposed framework supports autonomous decision-making in environments that cannot be fully anticipated at design time. The approach was developed within the MUSIC4D and MHARA projects, which explore the use of intelligent systems in dynamic, partially structured contexts, focusing on knowledge-based adaptation. Full article
22 pages, 3896 KB  
Article
Experimental Validation of an SDR-Based Direction of Arrival Estimation Testbed
by Nikita Sheremet and Grigoriy Fokin
Information 2026, 17(4), 313; https://doi.org/10.3390/info17040313 - 24 Mar 2026
Viewed by 400
Abstract
Advanced mobile communication standards of the fifth and subsequent generations widely use beamforming technology. While many publications on this topic rely on simulation tools, some work has been dedicated to experimental testing using software-defined radio (SDR) platforms. These platforms are often expensive and require significant expertise to configure. This paper proposes a novel cost-effective method for combining a pair of dual-channel Universal Software Radio Peripheral (USRP) B210 boards into a four-element antenna array direction of arrival estimation testbed using Metronom synchronization devices. The hardware and developed software implementation is detailed, including the antenna layout and software modules, based on USRP Hardware Driver, that provide the frequency and time synchronization necessary for amplitude-phase processing. Experimental validation of the testbed using the MUltiple SIgnal Classification (MUSIC) algorithm demonstrates high stability of angle of arrival estimates, with a standard deviation not exceeding 0.4°. The algorithm achieved a resolution of 16.1° for two sources, which surpasses the half-power beamwidth of 25.6°. The theoretical significance of this work lies in the scientific validation of combining SDR devices with the precise synchronization required for beamforming. Its practical value is in enabling the experimental testing of beamforming without the need for costly multichannel SDR hardware. Full article
(This article belongs to the Section Wireless Technologies)
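For readers unfamiliar with MUSIC, the standard textbook form of the estimator is reproduced below in LaTeX. The notation is generic and is an assumption here, not taken from the paper: a uniform linear array of M elements with spacing d, wavelength λ, N snapshots x(n), and a noise subspace E_n spanned by the eigenvectors of the sample covariance that belong to its M − K smallest eigenvalues, K being the number of sources.

```latex
\hat{\mathbf{R}} = \frac{1}{N}\sum_{n=1}^{N}\mathbf{x}(n)\,\mathbf{x}^{H}(n)
  = \mathbf{E}_s\boldsymbol{\Lambda}_s\mathbf{E}_s^{H}
  + \mathbf{E}_n\boldsymbol{\Lambda}_n\mathbf{E}_n^{H},
\qquad
\mathbf{a}(\theta) = \bigl[\,1,\ e^{j2\pi d\sin\theta/\lambda},\ \dots,\
  e^{j2\pi (M-1) d\sin\theta/\lambda}\,\bigr]^{T},
\qquad
P_{\mathrm{MUSIC}}(\theta) =
  \frac{1}{\mathbf{a}^{H}(\theta)\,\mathbf{E}_n\mathbf{E}_n^{H}\,\mathbf{a}(\theta)}
```

The angle estimates are the K largest peaks of the pseudospectrum. The steering vector is only valid if the channels are phase-coherent, which is why the frequency and time synchronization described in the abstract matters for a testbed built from separate SDR boards.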
23 pages, 56439 KB  
Article
Multipath Credibility Selection for Robust UWB Angle-of-Arrival Estimation in Narrow Underground Corridors
by Jianjia Li, Baoguo Yu, Songzuo Cui, Menghuan Yang, Jun Zhao, Runjia Su and Runze Tian
Sensors 2026, 26(6), 2002; https://doi.org/10.3390/s26062002 - 23 Mar 2026
Viewed by 460
Abstract
Waveguide-like propagation in elongated underground environments—utility corridors, logistics tunnels—generates dense multipath that can cause the earliest or strongest resolvable channel impulse response (CIR) component to originate from a specular reflection rather than the direct line-of-sight (LOS) path. In the single-anchor CIR-tap-based implementations common to practical ultra-wideband (UWB) systems, baseline estimators such as phase-difference-of-arrival (PDOA) and MUSIC rely on selecting a single dominant CIR component, producing large angle-of-arrival (AoA) errors whenever the selected path is a reflection. We propose a multipath credibility selection (MCS) AoA estimator, MCS-AoA, that does not require explicit LOS/NLOS classification. The algorithm scores each resolvable CIR component with four credibility factors—amplitude significance, time-of-flight (TOF) consistency, inter-baseline phase–geometry agreement, and cross-baseline coherence—and fuses retained candidates into a credibility-weighted spatial covariance matrix for 2D MUSIC search. Field experiments on a custom five-channel coherent UWB platform compare MCS-AoA against six baselines—PDOA, MUSIC, MVDR/Capon, TLS-ESPRIT, PwMUSIC, and DNN-AoA. In an underground corridor (5–40 m), MCS-AoA achieves an azimuth/elevation MAE of 1.00°/1.46°, outperforming all baselines (PDOA: 2.26°/2.49°; MUSIC: 1.76°/2.40°; next-best PwMUSIC: 1.44°/2.17°); in a logistics tunnel (5–80 m), it achieves a 1.19° overall azimuth MAE. Simulations corroborate these gains, with a 0.71° azimuth RMSE at 80 m (69.3% reduction over PDOA) and 86.6% of estimates falling within 1°. Full article
(This article belongs to the Section Navigation and Positioning)
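The sensitivity of the PDOA baseline to tap selection can be seen from a minimal two-antenna sketch in Python. This is an illustration under stated assumptions, not the paper's implementation: the function name `pdoa_angle_deg`, the sign convention (antenna b leads in phase for positive angles), and the requirement that the spacing not exceed half a wavelength are all assumed here.

```python
import cmath
import math

def pdoa_angle_deg(tap_a, tap_b, spacing_m, wavelength_m):
    """Angle of arrival in degrees from broadside, from one CIR tap per antenna.

    Assumes a plane wave and antenna spacing <= wavelength / 2, so the
    phase difference is unambiguous. Whatever path the chosen tap belongs
    to (direct or reflected) is the path whose angle is returned.
    """
    # Phase of tap_b relative to tap_a, wrapped to [-pi, pi].
    delta_phi = cmath.phase(tap_b * tap_a.conjugate())
    # Plane-wave model: delta_phi = 2*pi * spacing * sin(theta) / wavelength.
    sin_theta = delta_phi * wavelength_m / (2 * math.pi * spacing_m)
    # Guard against noise pushing the argument slightly outside [-1, 1].
    sin_theta = max(-1.0, min(1.0, sin_theta))
    return math.degrees(math.asin(sin_theta))
```

Since the estimate depends only on the phase of the one selected tap, picking a specular reflection instead of the line-of-sight component yields the reflection's angle, which is the failure mode the abstract attributes to single-component estimators.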
19 pages, 992 KB  
Article
Hybrid Music Similarity with Hypergraph and Siamese Network
by Sera Kim, Youngjun Kim, Jaewon Lee and Dalwon Jang
Big Data Cogn. Comput. 2026, 10(3), 96; https://doi.org/10.3390/bdcc10030096 - 21 Mar 2026
Viewed by 414
Abstract
This paper proposes a novel method for measuring music similarity. Existing music similarity measurements have often been used for music appreciation, but this paper proposes a method for measuring the similarity between music samples which are used for music production. Conventional music recommendation approaches often rely on either metadata-based similarity or audio-based feature similarity in isolation, which limits their effectiveness in sample-based recommendation scenarios where both compositional context and acoustic characteristics are important. To address this limitation, the proposed framework combines a hypergraph-based information similarity module with a feature-based similarity module learned using Siamese networks and triplet loss. In the information-based module, metadata attributes such as beats per minute (BPM), genre, chord, key, and instrument are modeled as vertices in a hypergraph, and Random Walk–Word2Vec embeddings are learned to capture structural relationships between music samples and their attributes. In parallel, the feature-based module employs vertex-specific Siamese networks trained on instrument and key classification tasks to learn perceptual similarity directly from audio signals. The two modules are trained independently and jointly utilized at the recommendation stage to provide attribute-specific similarity results for a given query sample. Results show that the proposed system achieves high Precision@k across multiple attributes and forms stable similarity structures in the embedding space, even without relying on user interaction data. These results reflect embedding consistency evaluated over the entire dataset where training and retrieval are performed on the same sample pool, rather than generalization to unseen samples. 
These results demonstrate that the proposed hybrid framework effectively captures both structural and perceptual similarity among music samples and is well suited for sample-based music recommendation in music production environments. Full article
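Two quantities named in this abstract, triplet loss and Precision@k, have simple standard definitions, sketched below in Python. The Euclidean distance, the margin of 0.2, and the function names are assumptions for illustration; the abstract does not state which distance or margin the authors used.

```python
import math

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on Euclidean distances between embeddings.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

def precision_at_k(retrieved_labels, query_label, k):
    """Fraction of the top-k retrieved samples that share the query's label."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for label in retrieved_labels[:k] if label == query_label)
    return hits / k
```

For example, `precision_at_k(["piano", "piano", "guitar", "piano"], "piano", 3)` counts two matching labels among the first three results and so evaluates to 2/3. Computing it per attribute (instrument, key, and so on) gives the attribute-specific scores the abstract refers to.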
25 pages, 4349 KB  
Article
Research on AUV Underwater Localization Method Based on an n-Shaped Array
by Chuang Han, Mengran Gao, Tao Shen and Chengli Guo
Sensors 2026, 26(6), 1845; https://doi.org/10.3390/s26061845 - 15 Mar 2026
Viewed by 300
Abstract
During continuous navigation of the mother ship, an autonomous underwater vehicle (AUV) can be recovered through an underwater hangar, and the accurate localization of the AUV relative to the mother ship is a key step in the recovery process. To address the AUV localization problem, an n-shaped hydrophone array is designed based on the spatial configuration of the underwater hangar. Since underwater acoustic signals are susceptible to multipath propagation, co-channel interference, and other transmission impairments, the signals received by the array often exhibit coherence. Accordingly, a far-field sound source localization method based on the n-shaped array is proposed. The proposed algorithm first applies spatial smoothing to the x-axis and y-axis subarrays and jointly constructs a received data vector, followed by eigenvalue decomposition of the corresponding covariance matrix. The Multiple Signal Classification (MUSIC) algorithm is then employed to obtain coarse estimates of the source angles. These coarse estimates are subsequently used as initial values for the Space-Alternating Generalized Expectation-maximization (SAGE) algorithm, which performs refined optimization of the angular parameters in a continuous parameter space, thereby effectively improving estimation accuracy. Furthermore, the proposed algorithm is extended from far-field scenarios to near-field localization. Simulation results demonstrate that the proposed method achieves good parameter estimation performance. Full article
20 pages, 5457 KB  
Article
High-Precision Time-of-Arrival Estimation in HF Sensor Networks via Multipath Separation and Independent Tracking
by Qiwei Ji and Huabing Wu
Sensors 2026, 26(5), 1640; https://doi.org/10.3390/s26051640 - 5 Mar 2026
Viewed by 368
Abstract
High-frequency (HF) sensor networks play an irreplaceable role in remote sensing and emergency communications but suffer severely from ionospheric multipath interference, which degrades Time-of-Arrival (TOA) estimation accuracy. Conventional methods, such as the Generalized Cross-Correlation (GCC) and standard Delay-Locked Loops (DLL), often treat multipath components as noise, leading to significant measurement bias in dynamic environments. To address this, we propose a Multipath Separation and Independent Tracking (MSIT) architecture. This framework transforms multipath interference into valuable observables by establishing a closed-loop synergy: a Maximum Likelihood Estimation (MLE)-based module iteratively separates signal components, while parallel tracking loops update phase and delay parameters. Additionally, a super-resolution MUSIC algorithm is employed for initialization to resolve sub-chip multipath components. Simulations demonstrate that under disturbed channel conditions, the MSIT method achieves a mean delay estimation error reduction of about two orders of magnitude relative to the GCC method. Furthermore, field experiments on the Xi’an–Ürümqi link demonstrate its capability to stably resolve and track multiple propagation paths in real-world environments. This approach significantly enhances the measurement precision and reliability of HF sensing systems. Full article
(This article belongs to the Section Communications)
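The bias that multipath introduces into correlation-based delay estimation can be reproduced with a very small Python sketch. It implements plain time-domain cross-correlation with integer lags, without the frequency weighting used by GCC variants, and is not the paper's MSIT method; the function name `xcorr_delay` and its parameters are hypothetical.

```python
def xcorr_delay(reference, received, max_lag):
    """Return the lag in samples (0..max_lag) that maximises the
    cross-correlation between a known reference and the received signal."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate the reference against the received signal shifted by `lag`.
        score = sum(
            reference[n] * received[n + lag]
            for n in range(len(reference))
            if n + lag < len(received)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

With the 7-chip Barker code `[1, 1, 1, -1, -1, 1, -1]` delayed by 5 samples, the function returns 5. If a second copy with gain 1.5 is added at a delay of 9 samples, the correlation peak moves to 9, so a single-peak estimator reports the stronger echo rather than the first arrival. Treating each path as a separate observable, as the abstract proposes, is one way around this.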
36 pages, 7153 KB  
Article
Benchmarking an Integrated Deep Learning Pipeline for Robust Detection and Individual Counting of the Greater Caribbean Manatee
by Fabricio Quirós-Corella, Athena Rycyk, Beth Brady and Priscilla Cubero-Pardo
Appl. Sci. 2026, 16(5), 2446; https://doi.org/10.3390/app16052446 - 3 Mar 2026
Viewed by 372
Abstract
The Greater Caribbean manatee faces significant conservation challenges due to a lack of demographic data in low-visibility habitats. To address this, we present a refined automated manatee counting method pipeline integrating deep learning-based call detection with unsupervised individual counting. We resolved significant computational bottlenecks by implementing an offline feature extraction strategy, bypassing a 13-h processing lag for 43,031 audio samples. To mitigate overfitting in imbalanced bioacoustic datasets, non-parametric bootstrap resampling was employed to generate 100,000 balanced spectrograms. Benchmarking revealed that transfer learning via a VGG-16 backbone achieved a mean 10-fold cross-validation accuracy of 98.92% (±0.08%) and an F1-score of 98.08% for genuine vocalizations. Following detection, individual counting utilized k-means clustering on prioritized music information retrieval descriptors—spectral bandwidth, centroid, and roll-off—to resolve distinct acoustic signatures. This framework identified three individuals with a silhouette coefficient of 79.20%, demonstrating superior cohesion over previous benchmarks. These results confirm the automatic manatee count method as a robust, scalable framework for generating the scientific evidence required for regional conservation policies. Full article
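The class-balancing step mentioned in this abstract, non-parametric bootstrap resampling, can be sketched in a few lines of Python. The function name `bootstrap_balance`, the dictionary input, and the fixed seed are assumptions for illustration; the abstract does not describe the authors' actual data structures or sampling code.

```python
import random

def bootstrap_balance(samples_by_class, per_class, seed=0):
    """Draw `per_class` items with replacement from every class.

    samples_by_class: dict mapping a label to a non-empty list of items
    (for example spectrogram file names). Returns a shuffled list of
    (item, label) pairs with the same number of draws per class.
    """
    rng = random.Random(seed)
    balanced = []
    for label, items in samples_by_class.items():
        # Sampling with replacement: minority-class items will repeat.
        draws = rng.choices(items, k=per_class)
        balanced.extend((item, label) for item in draws)
    rng.shuffle(balanced)
    return balanced
```

Because minority-class items are repeated rather than synthesised, resampling should be done inside each cross-validation training fold; otherwise copies of the same recording can land in both the training and validation splits.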
21 pages, 5543 KB  
Article
Evaluation of Mechanical Properties and Interface Interactions in Thermoplastic Composites Including Discarded Musical Instrument Reeds
by Tetsuo Takayama and Syunsuke Oneda
Recycling 2026, 11(3), 45; https://doi.org/10.3390/recycling11030045 - 2 Mar 2026
Cited by 1 | Viewed by 578
Abstract
This study investigates the material recycling potential of discarded wind instrument reeds (Arundo donax), which are conventionally incinerated, by compounding them with thermoplastics (thermoplastic polyolefin, TPO; polybutylene succinate, PBS). After recovered reeds were pulverized and injection-molded at 10 and 30 wt% concentrations, their mechanical and interfacial properties were evaluated. Experimentally obtained results indicate that waste reeds function as effective reinforcing agents, particularly when combined with biodegradable PBS. Incorporating 30 wt% reed flour into PBS enhanced flexural strength by approximately 1.7 times and flexural modulus by 2.8 times compared to the neat resin. This superior performance relative to TPO composites is attributed to robust interfacial hydrogen bonding among PBS carbonyl groups and the hydroxyl groups on the reed surface. Additionally, thermal and spectroscopic analyses revealed that these strong interactions elevate the crystallization temperature and generate a “Rigid Amorphous Phase” (RAF) that facilitates efficient stress transfer. These research findings demonstrate the feasibility of creating high-quality, bio-based composites, offering a sustainable method to reduce petroleum reliance and carbon dioxide emissions by upcycling musical waste. Full article
20 pages, 1419 KB  
Article
Building Prototype Evolution Pathway for Emotion Recognition in User-Generated Videos
by Yujie Liu, Zhenyang Dong, Yante Li and Guoying Zhao
Big Data Cogn. Comput. 2026, 10(3), 73; https://doi.org/10.3390/bdcc10030073 - 28 Feb 2026
Viewed by 512
Abstract
Large-scale pretrained foundation models are increasingly essential for affective analysis in user-generated videos. However, current approaches typically reuse generic multi-modal representations directly with task-specific adapters learned from scratch, and their performance is limited by the large affective domain gap and scarce emotion annotations. To address these issues, we introduce a novel paradigm that leverages auxiliary cross-modal priors to enhance unimodal emotion modeling, effectively exploiting modality-shared semantics and modality-specific inductive biases. Specifically, we propose a progressive prototype evolution framework that gradually transforms a neutral prototype into discriminative emotional representations through fine-grained cross-modal interactions with visual cues. The auxiliary prior serves as a structural constraint, reframing the adaptation challenge from a difficult domain shift problem into a more tractable prototype shift within the affective space. To ensure robust prototype construction and guided evolution, we further design category-aggregated prompting and bidirectional supervision mechanisms. Extensive experiments on VideoEmotion-8, Ekman-6, and MusicVideo-6 validate the superiority of our approach, achieving state-of-the-art results and demonstrating the effectiveness of leveraging auxiliary modality priors for foundation-model-based emotion recognition. Full article
(This article belongs to the Special Issue Sentiment Analysis in the Context of Big Data)
48 pages, 4777 KB  
Review
Predictors of the Effectiveness of Psychedelics in Treating Depression—A Scoping Review
by James Chmiel and Filip Rybakowski
Int. J. Mol. Sci. 2026, 27(5), 2202; https://doi.org/10.3390/ijms27052202 - 26 Feb 2026
Cited by 1 | Viewed by 1452
Abstract
Psychedelic-assisted therapies (PATs) can produce rapid and sustained antidepressant effects, yet variability in response remains substantial. Identifying predictors and moderators is essential for optimising patient selection, preparation, and delivery. This review aimed to map and synthesise the evidence on the predictors of antidepressant response to classic/serotonergic psychedelics administered with psychotherapeutic support in adults with depressive disorders, including treatment-resistant depression. Following PRISMA-ScR principles, we conducted a scoping review of major biomedical and psychology databases (PubMed (MEDLINE), Embase, PsycINFO, and Web of Science) and trial registries (searches September–October 2025), supplemented by reference-list screening. We included randomised trials, open-label studies, and naturalistic cohorts reporting associations between candidate predictors (baseline traits/clinical features, set/setting variables, acute in-session phenomenology, and biological measures) and validated depression outcomes. We charted study characteristics, analytic approaches (including moderation/mediation where available), and indicators of robustness (e.g., adjustment for overall intensity, preregistration, external validation). A total of 48 studies were included in the review. Across study designs, process-level features during the dosing session were the most consistent correlates of antidepressant improvement. Greater emotional breakthrough, mystical/unitive experiences, and ego dissolution-linked reappraisal/insight generally predicted larger and more durable symptom reductions, whereas anxiety-dominant or dysphoric states tended to attenuate benefit, often independent of overall subjective intensity. Set and setting, particularly a stronger therapeutic alliance and music experienced as resonant, predicted both the emergence of therapeutically salient acute experiences and downstream clinical gains. 
Baseline moderators showed smaller and mixed effects: PTSD comorbidity sometimes weakened trajectories; extensive prior psychedelic exposure was associated with smaller incremental gains; demographics were typically uninformative. Converging biological findings associated better outcomes with markers consistent with increased neural flexibility and plasticity (e.g., less segregated network dynamics; EEG indices), alongside peripheral changes implicating neurotrophic, inflammatory, and HPA axis pathways. Current evidence suggests that antidepressant response in PATs is driven less by static patient characteristics and more by what occurs during dosing and how the context shapes that experience. Optimising preparation, alliance, and music; facilitating emotional breakthrough and meaning making; and mitigating anxious dysregulation are actionable levers. Future trials should harmonise measures, pre-specify and validate moderators/mediators, intensively sample in-session experience and physiology, and report benefits and harms more consistently. Full article
(This article belongs to the Special Issue Advances in the Pharmacology of Depression and Mood Disorders)