Modeling of Multimodal Speech Recognition and Language Processing
A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Electronic Multimedia".
Deadline for manuscript submissions: closed (15 April 2025)
Special Issue Editors
Interests: automatic lyrics transcription; speech recognition; speech-to-singing conversion; singing information processing; music information retrieval; multi-modal processing
Interests: multi-modal fusion; speaker localization and tracking; speech-related topics
Interests: multi-modal processing; speaker recognition; active speaker detection; self-supervised learning
Interests: artificial intelligence; artificial neural networks; brain-inspired computational intelligence
Special Issue Information
Dear Colleagues,
This Special Issue, ‘Modeling of Multimodal Speech Recognition and Language Processing,’ aims to delve into the rapidly evolving landscape of automatic speech recognition (ASR) and language processing. It seeks to collate papers that explore innovative approaches to bridging the gap between human speech comprehension and computational interpretation, and that emphasize the development of novel techniques to enhance ASR and language modeling, particularly in challenging settings such as diverse noisy environments, multimodal contexts, and multi-lingual speech recognition.
By concentrating on challenging real-world scenarios, we encourage researchers to push the boundaries of existing knowledge and contribute ground-breaking solutions to the field. Furthermore, this Special Issue is designed to provide a comprehensive resource for researchers, both newcomers and experts, by presenting cutting-edge research, methodologies, and insights that are directly applicable to real-world ASR and language processing challenges.
This Special Issue seeks to build upon the foundation laid by prior research in ASR and language processing, and to extend the existing literature by focusing on emerging challenges, such as multimodal recognition and security concerns, that have gained prominence in recent years. By addressing these gaps in the literature, we aim to offer a forward-looking perspective on ASR and language processing, showcasing practical solutions and insights that align with contemporary demands. Researchers can expect to find valuable references and inspiration to address the most pressing issues in the field, making this Special Issue a pivotal addition to the existing body of work.
Topics of interest include, but are not limited to, the following:
- Robust speech recognition;
- Language modeling;
- Multi-lingual speech recognition;
- Audio-visual speech recognition;
- Fast decoding techniques;
- Representation learning for audio, text, and/or vision;
- Speaker recognition for speech recognition;
- Audio security and adversarial attacks on speech recognition models;
- Large speech models;
- Large language models for speech recognition.
Dr. Xiaoxue Gao
Dr. Xinyuan Qian
Dr. Ruijie Tao
Dr. Malu Zhang
Dr. Zhaojie Luo
Guest Editors
Manuscript Submission Information
Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.
Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts are available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.
Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.
Keywords
- robust speech recognition
- novel approaches for speech recognition
- multi-lingual speech recognition
- audio-visual speech recognition
- language modeling
- self-supervised learning for speech processing
- representation learning for audio, language and vision
Benefits of Publishing in a Special Issue
- Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
- Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
- Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
- External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
- e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.
Further information on MDPI's Special Issue policies can be found here.