Text-to-Speech and AI Music

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Information Processes".

Deadline for manuscript submissions: 14 May 2026

Special Issue Editor


Dr. Xulong Zhang
Guest Editor
School of Computer Science and Technology, Fudan University, Shanghai 200433, China
Interests: text to speech; voice conversion; talking face; music AI; large model

Special Issue Information

Dear Colleagues,

Audio research has long been dominated by two primary areas of study: speech and music. With rapid advances in deep learning and the emergence of large-scale models, both domains now face heightened demands and many new challenges. This Special Issue aims to bring together researchers working on speech synthesis and music artificial intelligence (AI) to contribute their insights and findings and to advance both fields.

The scope of this Special Issue is broad and covers topics at the intersection of audio technology and AI. We welcome submissions that address, but are not limited to, the following research directions:

  • Speech synthesis: the creation of natural-sounding speech from text using AI models;
  • Voice conversion: techniques for altering the voice characteristics of one speaker to match those of another;
  • Talking face generation: the synthesis of visual speech movements in faces that correspond with the generated speech;
  • Melody extraction: algorithms for extracting the main melody from complex musical compositions;
  • Vocal accompaniment separation: methods for isolating vocals from an instrumental accompaniment in a musical piece;
  • Singing voice synthesis: the generation of singing voices from lyrics or melodies;
  • Automatic music composition: AI-driven processes for creating original musical pieces;
  • Humming recognition: technologies that identify songs based on a user's humming;
  • Music AI: broad explorations into how AI can innovate and enhance music creation, performance, and interaction.

Objectives and Goals

The primary objectives of this Special Issue are to:

  • Showcase the latest research and developments in speech synthesis and music AI;
  • Foster interdisciplinary collaboration among researchers, engineers, and industry professionals;
  • Encourage the submission of high-quality, original research that addresses current challenges and proposes innovative solutions;
  • Disseminate knowledge and promote the exchange of ideas to inspire future research directions;
  • Facilitate the integration of theoretical advancements with practical applications in the audio industry.

Dr. Xulong Zhang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • text to speech
  • voice conversion
  • talking face
  • singing voice synthesis
  • melody extraction
  • music AI

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (1 paper)

Review
30 pages, 894 KiB
From Tools to Creators: A Review on the Development and Application of Artificial Intelligence Music Generation
by Lijun Wei, Yuanyu Yu, Yuping Qin and Shuang Zhang
Information 2025, 16(8), 656; https://doi.org/10.3390/info16080656 (registering DOI) - 31 Jul 2025
Abstract
Artificial intelligence (AI) has emerged as a significant driving force in the development of technology and industry. It is also integrated with music, as music AI, in music generation and analysis. Music AI originated from early algorithmic composition techniques in the mid-20th century. Recent advancements in machine learning and neural networks have enabled innovative music generation and exploration. This article surveys the development history and technical route of music AI and analyzes the current status and limitations of music AI across various areas, including music generation and composition, rehabilitation and treatment, and education and learning. It reveals that music AI has become a promising creator in the field of music generation. The influence of music AI on the music industry and the challenges it encounters are explored. Additionally, an emotional music generation system driven by multimodal signals is proposed. Although music AI technology still needs further improvement, continued technological breakthroughs will give it a more profound impact on all areas of music.
(This article belongs to the Special Issue Text-to-Speech and AI Music)