Advances in Speech Recognition and Speech Processing Technology

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Electronic Multimedia".

Deadline for manuscript submissions: 15 May 2026

Special Issue Editor

Dr. Weiwei Lin
Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University, Hong Kong
Interests: speech synthesis; text-to-speech; representation learning; multimodal language models

Special Issue Information

Dear Colleagues,

Speech processing has become a cornerstone of human–computer interaction, powering a wide range of applications from virtual assistants and customer service bots to accessibility technologies and intelligent tutoring systems. Recent advancements in multi-modal large language models (MLLMs) have further expanded the frontiers of speech-related technologies by enabling seamless integration of speech, text, and visual modalities. These innovations are redefining how machines understand, generate, and interact with human speech.

At the heart of this transformation lie key areas such as speech synthesis, speech recognition, speaker verification, emotion recognition, voice conversion, and text-to-speech systems. These technologies are rapidly evolving, with growing emphasis on personalization, robustness, data efficiency, and real-time performance. Moreover, as speech becomes a primary input/output channel for AI systems, ensuring high-quality and trustworthy speech generation and understanding is more critical than ever.

This Special Issue aims to bring together the latest research and developments in speech processing, with a special focus on the intersection with multi-modal large language models and emerging AI capabilities. We invite high-quality contributions addressing theoretical innovations, practical systems, and comprehensive reviews in areas including, but not limited to, the following:

  *   Multi-modal large language models for speech and audio applications;

  *   End-to-end speech recognition in noisy and multilingual environments;

  *   Neural speech synthesis and zero-shot text-to-speech;

  *   Robust speaker verification and anti-spoofing techniques;

  *   Emotion recognition from speech and multi-modal signals;

  *   Voice conversion and style transfer in speech generation;

  *   Self-supervised and unsupervised learning for speech tasks;

  *   Speech processing for low-resource and endangered languages;

  *   Personalization and speaker adaptation in speech systems;

  *   Ethical, privacy, and fairness concerns in speech AI;

  *   Benchmarking and evaluation of speech models and datasets;

  *   Efficient and lightweight models for on-device speech processing;

  *   Applications of speech technology in healthcare, education, and accessibility;

  *   Integration of speech interfaces in robotics and virtual environments.

We welcome original research articles, comprehensive reviews, and system demonstrations that push the boundaries of speech processing and foster its integration into broader AI ecosystems.

Dr. Weiwei Lin
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • multi-modal large language models
  • speech synthesis
  • speech processing
  • speech recognition
  • speaker verification
  • emotion recognition
  • text-to-speech
  • voice conversion

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers

This special issue is now open for submission.