Search Results (8)

Search Parameters:
Keywords = lack of parallel corpus

17 pages, 2827 KB  
Article
Low-Resourced Alphabet-Level Pivot-Based Neural Machine Translation for Translating Korean Dialects
by Junho Park and Seong-Bae Park
Appl. Sci. 2025, 15(17), 9459; https://doi.org/10.3390/app15179459 - 28 Aug 2025
Abstract
Developing a machine translator from a Korean dialect to a foreign language presents significant challenges due to the lack of a parallel corpus for direct dialect translation. To solve this issue, this paper proposes a pivot-based machine translation model consisting of two sub-translators. The first sub-translator is a sequence-to-sequence model with minGRU as the encoder and GRU as the decoder; it normalizes a dialect sentence into a standard sentence and employs alphabet-level tokenization. The second sub-translator is a legacy translator, such as an off-the-shelf neural machine translator or an LLM, which translates the normalized standard sentence into a foreign sentence. The effectiveness of alphabet-level tokenization and the minGRU encoder for the normalization model is demonstrated through empirical analysis: alphabet-level tokenization proves more effective for Korean dialect normalization than other widely used sub-word tokenizations, and the minGRU encoder performs comparably to a GRU encoder while being faster and better at managing longer token sequences. The pivot-based translation method is also validated through a broad range of experiments, and its effectiveness in translating Korean dialects to English, Chinese, and Japanese is demonstrated empirically.
(This article belongs to the Special Issue Deep Learning and Its Applications in Natural Language Processing)
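
The alphabet-level tokenization used by the normalization model corresponds to decomposing each Hangul syllable into its constituent letters (jamo). As a rough illustration, and not the authors' implementation, the following Python sketch performs this decomposition using the standard Unicode arithmetic for precomposed Hangul syllables:

```python
# Minimal sketch of alphabet-level (jamo) tokenization for Korean.
# Each precomposed Hangul syllable (U+AC00..U+D7A3) algorithmically
# encodes a lead consonant, a vowel, and an optional tail consonant.
LEADS = [chr(0x1100 + i) for i in range(19)]          # choseong (initial consonants)
VOWELS = [chr(0x1161 + i) for i in range(21)]         # jungseong (vowels)
TAILS = [""] + [chr(0x11A8 + i) for i in range(27)]   # jongseong (final consonants)

def to_jamo(text: str) -> list[str]:
    """Split each Hangul syllable into its alphabet-level jamo tokens."""
    tokens = []
    for ch in text:
        code = ord(ch)
        if 0xAC00 <= code <= 0xD7A3:                  # precomposed Hangul syllable
            offset = code - 0xAC00
            lead, rem = divmod(offset, 21 * 28)
            vowel, tail = divmod(rem, 28)
            tokens.append(LEADS[lead])
            tokens.append(VOWELS[vowel])
            if tail:                                  # tail index 0 means "no tail"
                tokens.append(TAILS[tail])
        else:                                         # pass non-Hangul characters through
            tokens.append(ch)
    return tokens

print(to_jamo("사투리"))  # each syllable yields 2-3 jamo tokens
```

Because every syllable expands into two or three tokens, sequences become much longer, which is one reason the paper's emphasis on an encoder that handles long token sequences efficiently matters.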

23 pages, 1009 KB  
Article
Enhancement of English-Bengali Machine Translation Leveraging Back-Translation
by Subrota Kumar Mondal, Chengwei Wang, Yijun Chen, Yuning Cheng, Yanbo Huang, Hong-Ning Dai and H. M. Dipu Kabir
Appl. Sci. 2024, 14(15), 6848; https://doi.org/10.3390/app14156848 - 5 Aug 2024
Cited by 1 | Viewed by 2952
Abstract
An English-Bengali machine translation (MT) application converts English text into a corresponding Bengali translation. MT for resource-rich language pairs, such as English-German, started decades ago, but MT for language pairs lacking large parallel corpora remains challenging. In our study, we employed back-translation to improve translation accuracy. Back-translation yields a pseudo-parallel corpus, and the generated (pseudo) corpus can be added to the original dataset to obtain an augmented dataset. However, the new data can be regarded as noisy, because they are generated by models that, unlike human translators, may not be well trained or well evaluated. Since the original output of a translation model is a probability distribution over candidate words, different decoding methods, such as beam search, top-k random sampling, and random sampling with temperature T, are used to make the model more robust. Notably, top-k random sampling and random sampling with temperature T are more commonly used and often better decoding methods than beam search. To this end, our study compares LSTM (Long Short-Term Memory, as a baseline) and Transformer. Our results show that Transformer (BLEU: 27.80 in validation, 1.33 in test) outperforms LSTM (3.62 in validation, 0.00 in test) by a large margin on the English-Bengali translation task. (Evaluating LSTM and Transformer without any augmented data is our baseline study.) We also incorporate two decoding methods, top-k random sampling and random sampling with temperature T, for back-translation, which help improve the translation accuracy of the model. The results show that data generated by back-translation without top-k or temperature sampling ("no strategy") helps improve accuracy (BLEU 38.22, +10.42 on validation; 2.07, +0.74 on test). Specifically, back-translation with top-k sampling is less effective (k=10: BLEU 29.43, +1.83 on validation; 1.36, +0.03 on test), while sampling with a proper value of T makes the model achieve a higher score (T=0.5: BLEU 35.02, +7.22 on validation; 2.35, +1.02 on test). This implies that in English-Bengali MT, we can augment the training set through back-translation using random sampling with a proper temperature T.
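
The decoding strategies compared for back-translation are easy to state concretely. The following sketch, with an illustrative vocabulary size and function names that are assumptions rather than the paper's code, applies top-k random sampling and temperature sampling to the logits of a single decoder step:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_with_temperature(logits: np.ndarray, T: float = 0.5) -> int:
    """Sharpen (T < 1) or flatten (T > 1) the distribution, then sample."""
    scaled = logits / T
    probs = np.exp(scaled - scaled.max())             # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

def sample_top_k(logits: np.ndarray, k: int = 10) -> int:
    """Restrict sampling to the k highest-scoring candidate words."""
    top = np.argpartition(logits, -k)[-k:]            # indices of the k best words
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()
    return int(top[rng.choice(k, p=probs)])

logits = rng.normal(size=32000)                       # one decoder step, 32k vocabulary
print(sample_with_temperature(logits, T=0.5), sample_top_k(logits, k=10))
```

A small T concentrates probability mass on high-scoring words, which matches the paper's finding that T=0.5 produced the most useful pseudo-parallel data.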

18 pages, 4741 KB  
Article
Research on a Mongolian Text to Speech Model Based on Ghost and ILPCnet
by Qing-Dao-Er-Ji Ren, Lele Wang, Wenjing Zhang and Leixiao Li
Appl. Sci. 2024, 14(2), 625; https://doi.org/10.3390/app14020625 - 11 Jan 2024
Viewed by 1527
Abstract
The core challenge of speech synthesis technology is converting text into audible audio that meets users' needs. In recent years, the quality of end-to-end speech synthesis models has improved significantly. However, owing to the characteristics of the Mongolian language and the lack of an audio corpus, research on Mongolian speech synthesis has produced few results, and problems remain with performance and synthesis quality. First, the phoneme information of Mongolian was further refined, and a Bang-based pre-training model was constructed to reduce the error rate of synthesized Mongolian words. Second, a Mongolian speech synthesis model based on Ghost and ILPCnet, named the Ghost-ILPCnet model, was proposed. It improves on the Para-WaveNet acoustic model by replacing ordinary convolution blocks with stacked Ghost modules, generating Mongolian acoustic features in parallel and increasing the speed of speech generation. At the same time, the improved ILPCnet vocoder offers high synthesis quality and low complexity compared to other vocoders. Finally, extensive experiments were conducted to verify the effectiveness of the proposed model. The experimental results show that the Ghost-ILPCnet model has a simple structure, fewer generation parameters, lower hardware requirements, and can be trained in parallel. Its synthesized speech achieved an average subjective opinion score of 4.48 and a real-time rate of 0.0041. The model ensures the naturalness and clarity of synthesized speech, speeds up synthesis, and effectively improves the performance of Mongolian speech synthesis.
(This article belongs to the Special Issue Audio, Speech and Language Processing)
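
For readers unfamiliar with Ghost modules, the following PyTorch sketch shows the general idea in 1-D convolutional form, following GhostNet: a small "primary" convolution produces intrinsic feature maps, and cheap depthwise operations generate the remaining "ghost" maps. The channel sizes and kernel width are illustrative assumptions; the actual Ghost-ILPCnet hyperparameters are not reproduced here.

```python
import torch
import torch.nn as nn

class GhostModule1d(nn.Module):
    """Ghost module (after GhostNet) in 1-D form, a cheap stand-in for a full convolution."""
    def __init__(self, in_ch: int, out_ch: int, ratio: int = 2, kernel: int = 3):
        super().__init__()
        primary_ch = out_ch // ratio                  # "intrinsic" feature maps
        cheap_ch = out_ch - primary_ch                # "ghost" maps from cheap ops
        self.primary = nn.Sequential(
            nn.Conv1d(in_ch, primary_ch, kernel, padding=kernel // 2),
            nn.BatchNorm1d(primary_ch), nn.ReLU(),
        )
        self.cheap = nn.Sequential(                   # depthwise conv: far cheaper
            nn.Conv1d(primary_ch, cheap_ch, kernel, padding=kernel // 2,
                      groups=primary_ch),
            nn.BatchNorm1d(cheap_ch), nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        intrinsic = self.primary(x)
        ghost = self.cheap(intrinsic)
        return torch.cat([intrinsic, ghost], dim=1)   # full set of output channels

x = torch.randn(1, 80, 200)                           # (batch, mel channels, frames)
print(GhostModule1d(80, 128)(x).shape)                # torch.Size([1, 128, 200])
```

Half of the output channels come from the cheap depthwise branch, which is where the parameter and speed savings reported for the model come from.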

18 pages, 4709 KB  
Article
Framework for Handling Rare Word Problems in Neural Machine Translation System Using Multi-Word Expressions
by Kamal Deep Garg, Shashi Shekhar, Ajit Kumar, Vishal Goyal, Bhisham Sharma, Rajeswari Chengoden and Gautam Srivastava
Appl. Sci. 2022, 12(21), 11038; https://doi.org/10.3390/app122111038 - 31 Oct 2022
Cited by 19 | Viewed by 3541
Abstract
Machine Translation (MT) systems are now being improved using Neural Machine Translation (NMT). Natural language processing (NLP) researchers have shown that NMT systems struggle with out-of-vocabulary (OOV) words and multi-word expressions (MWEs) in text. OOV terms are those not included in the vocabulary used by the NMT system. MWEs are phrases consisting of at least two terms that are treated as a single unit. MWEs are of great importance in NLP, linguistic theory, and MT systems. In this article, OOV words and MWEs are handled for a Punjabi-to-English NMT system. A Punjabi-to-English parallel corpus containing MWEs was developed and used to train different NMT models. Punjabi is a low-resource language, lacking a large parallel corpus for building NLP tools, and this work attempts to improve the accuracy of the Punjabi-to-English NMT system by using named entities and MWEs in the corpus. The developed NMT models were assessed through human evaluation (adequacy, fluency, and overall rating) as well as automated metrics such as the bilingual evaluation understudy (BLEU) and translation error rate (TER) scores. Results show that using word embeddings (WE) and the MWE corpus increased translation accuracy for the Punjabi-English language pair. The best BLEU scores obtained were 15.45 for the small test set, 43.32 for the medium test set, and 34.5 for the large test set. The best TER scores obtained were 57.34% for the small test set, 37.29% for the medium test set, and 53.79% for the large test set.
(This article belongs to the Special Issue New Technologies and Applications of Natural Language Processing)
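
One common way to make an NMT system treat an MWE as a single unit is to merge its words into one token during preprocessing. The sketch below illustrates this idea with a toy English lexicon and an underscore-joined token format, both of which are assumptions for illustration rather than the paper's Punjabi pipeline:

```python
# Hypothetical MWE lexicon; a real one would come from the MWE-annotated corpus.
MWES = {("prime", "minister"), ("high", "court")}
MAX_LEN = max(len(m) for m in MWES)

def merge_mwes(tokens: list[str]) -> list[str]:
    """Greedily replace each known MWE with a single underscore-joined token."""
    out, i = [], 0
    while i < len(tokens):
        for n in range(MAX_LEN, 1, -1):               # prefer the longest match
            span = tuple(t.lower() for t in tokens[i:i + n])
            if span in MWES:
                out.append("_".join(tokens[i:i + n]))
                i += n
                break
        else:                                         # no MWE starts at this position
            out.append(tokens[i])
            i += 1
    return out

print(merge_mwes("The Prime Minister met the High Court judges".split()))
# ['The', 'Prime_Minister', 'met', 'the', 'High_Court', 'judges']
```

Merging keeps the translation model from splitting an expression whose meaning is not compositional, which is the core of the rare-word problem the paper addresses.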

17 pages, 905 KB  
Article
High-Performance English–Chinese Machine Translation Based on GPU-Enabled Deep Neural Networks with Domain Corpus
by Lanxin Zhao, Wanrong Gao and Jianbin Fang
Appl. Sci. 2021, 11(22), 10915; https://doi.org/10.3390/app112210915 - 18 Nov 2021
Cited by 11 | Viewed by 4752
Abstract
The ability to automate machine translation has various applications in international commerce, medicine, travel, education, and text digitization. Owing to Chinese's different grammar and lack of clear word boundaries, translating from word-based languages (e.g., English) to Chinese is challenging. This article implements a GPU-enabled deep learning machine translation system based on a domain-specific corpus. Our system takes English text as input and uses an encoder-decoder model with an attention mechanism, based on Google's Transformer, to produce Chinese output. The model was trained using a simple self-designed entropy loss function and an Adam optimizer on English-Chinese bilingual sentences from the News section of the UM-Corpus. The parallel training process of our model can be performed on common laptops, desktops, and servers with one or more GPUs. At training time, we not only track the loss over training epochs but also measure the quality of our model's translations with the BLEU score. We also provide an easy-to-use web interface for managing corpora, training projects, and trained models. The experimental results show that we can achieve a maximum BLEU score of 29.2, which can be improved further by tuning other hyperparameters. GPU-enabled model training runs over 15x faster than on a multi-core CPU, giving us a shorter turn-around time. As a case study, we compare the performance of our model to Baidu's, which shows that our model can compete with an industry-level translation system. We argue that our deep-learning-based translation system is particularly suitable for teaching purposes and small/medium-sized enterprises.
(This article belongs to the Special Issue Hardware-Aware Deep Learning)
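
Measuring translation quality with BLEU at training time, as described above, can be done with an off-the-shelf scorer. The sketch below uses the sacrebleu package on a toy hypothesis/reference pair; the sentences are illustrative, and the paper's own evaluation code is not reproduced here:

```python
import sacrebleu

# Toy development-set translations; in practice these would come from
# decoding a held-out split of the UM-Corpus News domain after each epoch.
hypotheses = ["这是一个测试。"]
references = [["这是一个测试。"]]                     # one stream per reference set

# The "zh" tokenizer segments Chinese characters before n-gram matching.
bleu = sacrebleu.corpus_bleu(hypotheses, references, tokenize="zh")
print(f"BLEU = {bleu.score:.1f}")                    # logged alongside the loss
```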

16 pages, 414 KB  
Article
An Investigation of EAP Teachers’ Views and Experiences of E-Learning Technology
by Sundeep Dhillon and Neil Murray
Educ. Sci. 2021, 11(2), 54; https://doi.org/10.3390/educsci11020054 - 1 Feb 2021
Cited by 25 | Viewed by 6072
Abstract
The near-universal use of electronic learning (e-learning) in higher education (HE) today requires that students and teachers be equipped with the requisite digital literacy skills. The small-scale pilot study we report on here explored the views and experiences of EAP (English for Academic Purposes) teachers regarding their development of digital literacy skills, their application of e-learning technology in their teaching, and their perceptions of its value as a learning tool, areas on which there has been little research to date. A convergent parallel mixed-methods approach was adopted, in which a survey was administered to the research participants and a follow-up focus group was conducted. Analysis of the data revealed that the EAP practitioners surveyed utilised a range of online tools such as video, plagiarism software, and corpus linguistics tools. Participants cited a number of benefits of e-learning, including increased student engagement and motivation, the development of learner autonomy, and the cultural capital it represents for students' future careers. The limitations identified included a lack of time for teachers to develop digital literacy and insufficient pre- and in-service training opportunities focused on the effective use of digital technologies and on managing technical issues. We conclude with a series of recommendations to facilitate EAP teachers' development and use of e-learning in their practice.
(This article belongs to the Section Higher Education)

19 pages, 970 KB  
Article
Fusion-ConvBERT: Parallel Convolution and BERT Fusion for Speech Emotion Recognition
by Sanghyun Lee, David K. Han and Hanseok Ko
Sensors 2020, 20(22), 6688; https://doi.org/10.3390/s20226688 - 23 Nov 2020
Cited by 30 | Viewed by 6447
Abstract
Speech emotion recognition predicts the emotional state of a speaker from the person's speech, adding an element for creating more natural human–computer interactions. Earlier studies on emotion recognition were primarily based on handcrafted features and manual labels. With the advent of deep learning, there have been efforts to apply deep-network-based approaches to emotion recognition. As deep learning automatically extracts salient features correlated with speaker emotion, it brings certain advantages over handcrafted-feature-based methods. There are, however, challenges in applying it to emotion recognition, because the data required to train deep networks properly are often lacking. There is therefore a need for a new deep-learning-based approach that exploits the available information in a given speech signal to the maximum extent possible. Our proposed method, called "Fusion-ConvBERT", is a parallel fusion model consisting of bidirectional encoder representations from transformers (BERT) and convolutional neural networks (CNNs). Extensive experiments were conducted on the EMO-DB and Interactive Emotional Dyadic Motion Capture (IEMOCAP) emotion corpora, and the proposed method outperformed state-of-the-art techniques in most test configurations.
(This article belongs to the Special Issue Sensor Fusion for Object Detection, Classification and Tracking)
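
The parallel-fusion idea can be sketched simply: extract features from the same spectrogram with a convolutional branch and a transformer-encoder branch, then combine them for classification. The PyTorch sketch below is a schematic stand-in with assumed dimensions and concatenation-based fusion; Fusion-ConvBERT itself builds on pretrained BERT-style representations and a more elaborate fusion scheme:

```python
import torch
import torch.nn as nn

class ParallelFusionSER(nn.Module):
    """Toy parallel fusion of a CNN branch and a transformer branch for SER."""
    def __init__(self, n_mels: int = 64, d_model: int = 128, n_classes: int = 7):
        super().__init__()
        self.cnn = nn.Sequential(                      # local spectro-temporal patterns
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)), nn.Flatten(),  # -> 16 * 4 * 4 = 256 features
        )
        self.proj = nn.Linear(n_mels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(256 + d_model, n_classes)

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, frames, n_mels) log-mel spectrogram
        cnn_feat = self.cnn(spec.unsqueeze(1))         # treat as a 1-channel image
        ctx = self.encoder(self.proj(spec)).mean(dim=1)  # pooled global context
        return self.head(torch.cat([cnn_feat, ctx], dim=-1))

logits = ParallelFusionSER()(torch.randn(2, 300, 64))
print(logits.shape)                                    # torch.Size([2, 7])
```

Running the two branches in parallel on the same input, rather than stacking them, lets the classifier see both local convolutional patterns and global transformer context, which is the premise of the paper.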

18 pages, 2531 KB  
Article
UPC: An Open Word-Sense Annotated Parallel Corpora for Machine Translation Study
by Van-Hai Vu, Quang-Phuoc Nguyen, Joon-Choul Shin and Cheol-Young Ock
Appl. Sci. 2020, 10(11), 3904; https://doi.org/10.3390/app10113904 - 4 Jun 2020
Cited by 5 | Viewed by 4231
Abstract
Machine translation (MT) has recently attracted much research on various advanced techniques (i.e., statistical and deep-learning-based) and achieved great results for popular languages. However, research involving low-resource languages such as Korean often suffers from a lack of openly available bilingual resources. In this research, we built open, extensive parallel corpora for training MT models, named the Ulsan parallel corpora (UPC). Currently, UPC contains two parallel corpora: a Korean-English and a Korean-Vietnamese dataset. The Korean-English dataset has over 969 thousand sentence pairs, and the Korean-Vietnamese corpus consists of over 412 thousand sentence pairs. Furthermore, the high rate of homographs in Korean causes word-ambiguity issues in MT. To address this problem, we developed a powerful word-sense annotation system, named UTagger, based on a combination of sub-word conditional probability and knowledge-based methods. We applied UTagger to UPC and used these corpora to train both statistical and deep-learning-based neural MT systems. The experimental results demonstrate that high-quality MT systems (in terms of Bilingual Evaluation Understudy (BLEU) and Translation Error Rate (TER) scores) can be built using UPC. Both UPC and UTagger are available for free download and use.
(This article belongs to the Special Issue Machine Learning and Natural Language Processing)
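
What word-sense annotation adds to a corpus is easy to illustrate: homograph tokens are rewritten with sense tags so the MT system learns separate translations for each sense. The sketch below uses a hypothetical sense lexicon and a "__NN" tag format; UTagger's actual disambiguation, combining sub-word conditional probabilities with a knowledge base, is far more sophisticated:

```python
# Hypothetical lexicon: homograph -> {context cue -> sense id}.
# "배" can mean pear, boat, or belly depending on context.
SENSE_LEXICON = {
    "배": {"먹다": "01", "타다": "02", "아프다": "03"},
}

def annotate(tokens: list[str]) -> list[str]:
    """Append a sense tag to each known homograph based on nearby cue words."""
    out = []
    for i, tok in enumerate(tokens):
        senses = SENSE_LEXICON.get(tok)
        if senses:
            window = tokens[max(0, i - 3): i + 4]     # small context window
            # Stand-in for UTagger's statistical scoring: pick the sense
            # whose cue word appears nearby, defaulting to "00".
            sense = next((s for cue, s in senses.items() if cue in window), "00")
            out.append(f"{tok}__{sense}")
        else:
            out.append(tok)
    return out

print(annotate("나는 배 를 타다".split()))  # ['나는', '배__02', '를', '타다']
```

Sense-tagged tokens such as 배__02 then behave as distinct vocabulary items during both statistical and neural MT training.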