Search Results (33)

Search Parameters:
Keywords = low-resource neural machine translation

17 pages, 1467 KiB  
Article
Confidence-Based Knowledge Distillation to Reduce Training Costs and Carbon Footprint for Low-Resource Neural Machine Translation
by Maria Zafar, Patrick J. Wall, Souhail Bakkali and Rejwanul Haque
Appl. Sci. 2025, 15(14), 8091; https://doi.org/10.3390/app15148091 - 21 Jul 2025
Viewed by 182
Abstract
The transformer-based deep learning approach represents the current state-of-the-art in machine translation (MT) research. Large-scale pretrained transformer models produce state-of-the-art performance across a wide range of MT tasks for many languages. However, such deep neural network (NN) models are often data-, compute-, space-, power-, and energy-hungry, typically requiring powerful GPUs or large-scale clusters to train and deploy. As a result, they are often regarded as “non-green” and “unsustainable” technologies. Distilling knowledge from large deep NN models (teachers) to smaller NN models (students) is a widely adopted sustainable development approach in MT as well as in broader areas of natural language processing (NLP), speech, and image processing. However, distilling large pretrained models presents several challenges. First, training time and cost increase with the volume of data used to train a student model, which can pose a challenge for translation service providers (TSPs) with limited training budgets. Moreover, the CO2 emissions generated during model training are typically proportional to the amount of data used, contributing to environmental harm. Second, when querying teacher models, including encoder–decoder models such as NLLB, the translations they produce for low-resource languages may be noisy or of low quality. This can undermine sequence-level knowledge distillation (SKD), as student models may inherit and reinforce errors from inaccurate labels. In this study, the teacher model’s confidence estimation is employed to filter from the distilled training data those instances for which the teacher exhibits low confidence. We tested our methods on a low-resource Urdu-to-English translation task operating within a constrained training budget in an industrial translation setting. Our findings show that confidence estimation-based filtering can significantly reduce the cost and CO2 emissions associated with training a student model without a drop in translation quality, making it a practical and environmentally sustainable solution for TSPs.
(This article belongs to the Special Issue Deep Learning and Its Applications in Natural Language Processing)
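The filtering step can be pictured with a minimal sketch. Assuming the teacher's decoder exposes per-token log-probabilities for each translation it generates, a sequence-level confidence score can be computed and low-confidence pairs dropped before student training; the threshold and field names below are illustrative, not the authors' implementation.

```python
import math

def sequence_confidence(token_logprobs):
    """Average per-token log-probability, mapped to a [0, 1] confidence score."""
    avg_logprob = sum(token_logprobs) / max(len(token_logprobs), 1)
    return math.exp(avg_logprob)  # geometric mean of token probabilities

def filter_distilled_corpus(examples, threshold=0.5):
    """Keep only source/teacher-translation pairs the teacher is confident about.

    `examples` is an iterable of dicts with the (assumed) keys:
      'src'            - source sentence (e.g., Urdu)
      'teacher_hyp'    - teacher translation (e.g., English)
      'token_logprobs' - per-token log-probs emitted while decoding 'teacher_hyp'
    """
    kept = []
    for ex in examples:
        if sequence_confidence(ex["token_logprobs"]) >= threshold:
            kept.append((ex["src"], ex["teacher_hyp"]))
    return kept
```

Because the student is then trained only on the retained pairs, training time, cost, and the associated CO2 emissions shrink roughly in proportion to the fraction filtered out.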

35 pages, 2649 KiB  
Review
Integrating Radiogenomics and Machine Learning in Musculoskeletal Oncology Care
by Rahul Kumar, Kyle Sporn, Akshay Khanna, Phani Paladugu, Chirag Gowda, Alex Ngo, Ram Jagadeesan, Nasif Zaman and Alireza Tavakkoli
Diagnostics 2025, 15(11), 1377; https://doi.org/10.3390/diagnostics15111377 - 29 May 2025
Cited by 2 | Viewed by 879
Abstract
Musculoskeletal tumors present a diagnostic challenge due to their rarity, histological diversity, and overlapping imaging features. Accurate characterization is essential for effective treatment planning and prognosis, yet current diagnostic workflows rely heavily on invasive biopsy and subjective radiologic interpretation. This review explores the evolving role of radiogenomics and machine learning in improving diagnostic accuracy for bone and soft tissue tumors. We examine the integration of quantitative imaging features from MRI, CT, and PET with genomic and transcriptomic data to enable non-invasive tumor profiling. AI-powered platforms employing convolutional neural networks (CNNs) and radiomic texture analysis show promising results in tumor grading, subtype differentiation (e.g., osteosarcoma vs. Ewing sarcoma), and the prediction of mutation signatures (e.g., TP53, RB1). Moreover, we highlight the use of liquid biopsy and circulating tumor DNA (ctDNA) as emerging diagnostic biomarkers, coupled with point-of-care molecular assays, to enable early and accurate detection in low-resource settings. The review concludes by discussing translational barriers, including data harmonization, regulatory challenges, and the need for multi-institutional datasets to validate AI-based diagnostic frameworks. This article synthesizes current advancements and provides a forward-looking view of precision diagnostics in musculoskeletal oncology.
(This article belongs to the Special Issue Advances in Musculoskeletal Imaging: From Diagnosis to Treatment)

30 pages, 6387 KiB  
Article
Transformer-Based Re-Ranking Model for Enhancing Contextual and Syntactic Translation in Low-Resource Neural Machine Translation
by Arifa Javed, Hongying Zan, Orken Mamyrbayev, Muhammad Abdullah, Kanwal Ahmed, Dina Oralbekova, Kassymova Dinara and Ainur Akhmediyarova
Electronics 2025, 14(2), 243; https://doi.org/10.3390/electronics14020243 - 8 Jan 2025
Cited by 1 | Viewed by 2894
Abstract
Neural machine translation (NMT) plays a vital role in modern communication by bridging language barriers and enabling effective information exchange across diverse linguistic communities. Due to the limited availability of data in low-resource languages, NMT faces significant translation challenges. Data sparsity limits NMT models’ ability to learn, generalize, and produce accurate translations, which leads to low coherence and poor context awareness. This paper proposes a transformer-based approach incorporating an encoder–decoder structure, bilingual curriculum learning, and contrastive re-ranking mechanisms. Our approach enriches the training dataset using back-translation and enhances the model’s contextual learning through BERT embeddings. An incomplete-trust (in-trust) loss function is introduced to replace the traditional cross-entropy loss during training. The proposed model effectively handles out-of-vocabulary words and integrates named entity recognition techniques to maintain semantic accuracy. Additionally, the self-attention layers in the transformer architecture enhance the model’s syntactic analysis capabilities, which enables better context awareness and more accurate translations. Extensive experiments are performed on a diverse Chinese–Urdu parallel corpus, developed using human effort and publicly available datasets such as OPUS, WMT, and WiLi. The proposed model demonstrates a BLEU score improvement of 1.80% for Zh→Ur and 2.22% for Ur→Zh compared to the highest-performing comparative model. This significant enhancement indicates better translation quality and accuracy.
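As a rough illustration of the re-ranking idea, the sketch below scores an n-best list from beam search with an external contextual scorer (for example, cosine similarity between BERT embeddings of the source and the hypothesis) and interpolates it with the decoder's own log-probability. The interpolation weight and the scoring function are assumptions for illustration, not the paper's exact formulation.

```python
def rerank(src, candidates, score_fn, lam=0.5):
    """Pick the best hypothesis from an n-best list.

    src        : source sentence
    candidates : list of (hypothesis, decoder_logprob) pairs from beam search
    score_fn   : callable (src, hyp) -> contextual adequacy score, e.g. cosine
                 similarity between BERT embeddings of source and hypothesis
    lam        : weight between decoder confidence and the re-ranker score
    """
    def combined(cand):
        hyp, decoder_logprob = cand
        return lam * decoder_logprob + (1.0 - lam) * score_fn(src, hyp)

    return max(candidates, key=combined)[0]
```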

15 pages, 4255 KiB  
Article
Enhancing Neural Machine Translation Quality for Kannada–Tulu Language Pairs through Transformer Architecture: A Linguistic Feature Integration
by Musica Supriya, U Dinesh Acharya and Ashalatha Nayak
Designs 2024, 8(5), 100; https://doi.org/10.3390/designs8050100 - 12 Oct 2024
Cited by 1 | Viewed by 1831
Abstract
The rise of intelligent systems demands good machine translation models that are less data-hungry and more efficient, especially for low- and extremely-low-resource languages with little or no data available. By integrating a linguistic feature to enhance the quality of translation, we have developed a generic Neural Machine Translation (NMT) model for Kannada–Tulu language pairs. The NMT model uses the Transformer architecture, a state-of-the-art approach, to translate text from Kannada to Tulu and learns from parallel data. Kannada and Tulu are both low-resource Dravidian languages, with Tulu recognised as an extremely-low-resource language. Dravidian languages are morphologically rich and highly agglutinative, and only a few NMT models exist for Kannada–Tulu language pairs; these exhibit poor translation scores as they fail to capture the linguistic features of the languages. The proposed generic approach can benefit other low-resource Indic languages that have smaller parallel corpora for NMT tasks. Evaluation metrics such as Bilingual Evaluation Understudy (BLEU), character-level F-score (chrF), and Word Error Rate (WER) are used to assess the improved translation quality of the linguistic-feature-embedded NMT model. These results hold promise for further experimentation with other low- and extremely-low-resource language pairs.
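The three reported metrics can be computed with standard open-source tooling. The snippet below is a minimal sketch assuming sacreBLEU and jiwer, which are not necessarily the packages the authors used, and the toy sentences are placeholders.

```python
import sacrebleu  # pip install sacrebleu
import jiwer      # pip install jiwer

hyps = ["the cat sat on the mat"]          # model outputs (placeholder sentences)
refs = ["the cat is sitting on the mat"]   # reference translations

bleu = sacrebleu.corpus_bleu(hyps, [refs]).score   # corpus-level BLEU
chrf = sacrebleu.corpus_chrf(hyps, [refs]).score   # character-level F-score
wer = jiwer.wer(refs, hyps)                        # word error rate (0.0 = perfect)

print(f"BLEU {bleu:.2f}  chrF {chrf:.2f}  WER {wer:.2%}")
```

Note that BLEU and chrF are higher-is-better while WER is lower-is-better, so improvements show up in opposite directions.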

11 pages, 1288 KiB  
Article
Efficient Adaptation: Enhancing Multilingual Models for Low-Resource Language Translation
by Ilhami Sel and Davut Hanbay
Mathematics 2024, 12(19), 3149; https://doi.org/10.3390/math12193149 - 8 Oct 2024
Cited by 4 | Viewed by 2586
Abstract
This study focuses on the neural machine translation task for the TR-EN language pair, which is considered a low-resource language pair. We investigated fine-tuning strategies for pre-trained language models, specifically the effectiveness of parameter-efficient adapter methods for fine-tuning multilingual pre-trained language models, experimenting with various combinations of LoRA and bottleneck adapters. The combination of LoRA and bottleneck adapters demonstrated superior performance compared to the other methods and required fine-tuning only 5% of the pre-trained language model’s parameters. The proposed method enhances parameter efficiency and reduces computational costs. Compared to full fine-tuning of the multilingual pre-trained language model, it showed only a 3% difference in the BLEU score; thus, nearly the same performance was achieved at a significantly lower cost. Additionally, models using only bottleneck adapters performed worse despite having a higher parameter count. Although adding LoRA alone to pre-trained language models did not yield sufficient performance, the proposed combination improved machine translation. The results obtained are promising, particularly for low-resource language pairs, as the proposed method requires less memory and computational load while maintaining translation quality.
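As a minimal sketch of the parameter-efficient setup, the snippet below attaches LoRA adapters to a multilingual sequence-to-sequence model with the Hugging Face peft library. The checkpoint name, rank, and target modules are illustrative assumptions, and the bottleneck adapters that the paper combines with LoRA are omitted here.

```python
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model, TaskType

# Multilingual pre-trained model; the checkpoint is illustrative, not necessarily the one used.
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8, lora_alpha=16, lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],  # inject low-rank updates into attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small share of weights is trainable
```

With such a configuration, only the injected low-rank matrices are updated during fine-tuning, which is what keeps the trainable-parameter share in the low single digits.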

13 pages, 772 KiB  
Article
A Mongolian–Chinese Neural Machine Translation Method Based on Semantic-Context Data Augmentation
by Huinuan Zhang, Yatu Ji, Nier Wu and Min Lu
Appl. Sci. 2024, 14(8), 3442; https://doi.org/10.3390/app14083442 - 19 Apr 2024
Cited by 1 | Viewed by 1660
Abstract
Neural machine translation (NMT) typically relies on large bilingual parallel corpora for effective training. Mongolian, as a low-resource language, has relatively few parallel corpora, resulting in poor translation performance. Data augmentation (DA) is a practical and promising way to address data sparsity and limited semantic diversity by expanding the size and structure of the available data. In order to address the issues of data sparsity and semantic inconsistency in Mongolian–Chinese NMT, this paper proposes a new semantic-context DA method. This method adds an additional semantic encoder to the original translation model, which utilizes both source and target sentences to generate different semantic vectors that enhance each training instance. The results show that this method significantly improves the quality of Mongolian–Chinese NMT, with an increase of approximately 2.5 BLEU points compared to the basic Transformer model. Compared to the basic model, this method can achieve the same translation results with about half of the data, greatly improving translation efficiency.
(This article belongs to the Special Issue Natural Language Processing: Theory, Methods and Applications)
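One simplified reading of the semantic-context idea is sketched below: a small auxiliary encoder pools a sentence into a single semantic vector, which is then broadcast onto the main encoder's token-level states. The module sizes and the fusion-by-addition choice are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SemanticContextEncoder(nn.Module):
    """Toy semantic encoder: mean-pools token embeddings into one context vector."""
    def __init__(self, vocab_size, d_model=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, token_ids):                     # (batch, seq_len)
        pooled = self.embed(token_ids).mean(dim=1)    # (batch, d_model)
        return torch.tanh(self.proj(pooled))          # semantic context vector

def augment_encoder_states(encoder_states, semantic_vec):
    """Add the semantic vector to every encoder time step (broadcast over seq_len)."""
    return encoder_states + semantic_vec.unsqueeze(1)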

28 pages, 933 KiB  
Article
A Systematic Evaluation of Recurrent Neural Network Models for Edge Intelligence and Human Activity Recognition Applications
by Varsha S. Lalapura, Veerender Reddy Bhimavarapu, J. Amudha and Hariram Selvamurugan Satheesh
Algorithms 2024, 17(3), 104; https://doi.org/10.3390/a17030104 - 28 Feb 2024
Cited by 10 | Viewed by 2827
Abstract
Recurrent Neural Networks (RNNs) are an essential class of supervised learning algorithms. Complex tasks like speech recognition, machine translation, sentiment classification, weather prediction, etc., are now performed by well-trained RNNs. Local or cloud-based GPU machines are used to train them. However, inference is now shifting to miniature, mobile, and IoT devices, and even micro-controllers. Due to their colossal memory and computing requirements, mapping RNNs directly onto resource-constrained platforms is challenging. Edge-intelligent RNNs (EI-RNNs) must satisfy performance and memory-footprint requirements at the same time, without compromising one for the other. This study’s aim was to provide an empirical evaluation and optimization of historic as well as recent RNN architectures for high-performance and low-memory-footprint goals. We focused on Human Activity Recognition (HAR) tasks based on wearable sensor data for embedded healthcare applications. We evaluated and optimized six different recurrent units, namely Vanilla RNNs, Long Short-Term Memory (LSTM) units, Gated Recurrent Units (GRUs), Fast Gated Recurrent Neural Networks (FGRNNs), Fast Recurrent Neural Networks (FRNNs), and Unitary Gated Recurrent Neural Networks (UGRNNs), on eight publicly available time-series HAR datasets. We used the hold-out and cross-validation protocols for training the RNNs, and applied low-rank parameterization, iterative hard thresholding, and sparse retraining for compression. We found that efficient training (i.e., dataset handling and preprocessing procedures, hyperparameter tuning, and so on) and suitable compression methods (such as low-rank parameterization and iterative pruning) are critical in optimizing RNNs for performance and memory efficiency. We implemented inference of the optimized models on a Raspberry Pi.
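The two compression ideas named above, low-rank parameterization and hard thresholding, can be sketched in a few lines of PyTorch. The rank and sparsity values are illustrative, and the functions operate on a single weight matrix rather than a full RNN as in the study.

```python
import torch

def low_rank_factorize(weight, rank):
    """Approximate a dense weight matrix W (out x in) by two thin factors U @ V."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]      # (out, rank), singular values folded in
    V_r = Vh[:rank, :]                # (rank, in)
    return U_r, V_r                   # replaces one large matmul by two small ones

def hard_threshold(weight, sparsity=0.9):
    """Return a copy with all but the largest-magnitude (1 - sparsity) fraction zeroed."""
    k = int(weight.numel() * (1.0 - sparsity))            # number of weights to keep
    threshold = weight.abs().flatten().kthvalue(weight.numel() - k).values
    return weight * (weight.abs() > threshold)
```

In practice the factors or the sparse mask are refined over several retraining rounds rather than applied once.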

19 pages, 3185 KiB  
Article
The Task of Post-Editing Machine Translation for the Low-Resource Language
by Diana Rakhimova, Aidana Karibayeva and Assem Turarbek
Appl. Sci. 2024, 14(2), 486; https://doi.org/10.3390/app14020486 - 5 Jan 2024
Cited by 6 | Viewed by 3337
Abstract
In recent years, machine translation has made significant advancements; however, its effectiveness can vary widely depending on the language pair. Languages with limited resources, such as Kazakh, Uzbek, Kalmyk, Tatar, and others, often encounter challenges in achieving high-quality machine translations. Kazakh is an agglutinative language with complex morphology, making it a low-resource language. This article addresses the task of post-editing machine translation for the Kazakh language. The research begins by discussing the history and evolution of machine translation and how it has developed to meet the unique needs of languages with limited resources. The research resulted in the development of a machine translation post-editing system. The system utilizes modern machine learning methods, starting with neural machine translation using a bidirectional recurrent neural network (BRNN) model in the initial post-editing stage. Subsequently, a transformer model is applied to further edit the text; complex structural and grammatical forms are processed, and abbreviations are replaced. Practical experiments were conducted on various texts: news publications, legislative documents, IT texts, etc. This article serves as a valuable resource for researchers and practitioners in the field of machine translation, shedding light on effective post-editing strategies to enhance translation quality, particularly in scenarios involving languages with limited resources such as Kazakh and Uzbek. The obtained results were evaluated using specialized metrics: BLEU, TER, and WER.
(This article belongs to the Section Computing and Artificial Intelligence)

14 pages, 6671 KiB  
Article
Neural Machine Translation Research on Syntactic Information Fusion Based on the Field of Electrical Engineering
by Yanna Sang, Yuan Chen and Juwei Zhang
Appl. Sci. 2023, 13(23), 12905; https://doi.org/10.3390/app132312905 - 1 Dec 2023
Viewed by 1919
Abstract
Neural machine translation has achieved good translation results but needs further improvement in low-resource and domain-specific translation. To this end, this paper proposes incorporating source-language syntactic information into neural machine translation models. Two novel approaches, namely Contrastive Language–Image Pre-training (CLIP) and Cross-attention Fusion (CAF), were compared to a baseline Transformer model on EN–ZH and ZH–EN machine translation in the electrical engineering domain. In addition, an ablation study on the effect of both proposed methods is presented. The CLIP pre-training method improved significantly over the baseline system, with BLEU scores in the EN–ZH and ZH–EN tasks increasing by 3.37 and 3.18 percentage points, respectively.
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications)
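A CLIP-style pre-training objective pairs two views of the same sentence (here, plausibly a textual encoding and a syntactic encoding) with a symmetric contrastive loss. The sketch below shows a generic InfoNCE formulation under that assumption; the encoders themselves and the temperature are placeholders, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def clip_style_loss(text_emb, syntax_emb, temperature=0.07):
    """Symmetric InfoNCE loss pairing each sentence with its own syntactic representation.

    text_emb, syntax_emb: (batch, dim) embeddings from the two encoders.
    """
    text_emb = F.normalize(text_emb, dim=-1)
    syntax_emb = F.normalize(syntax_emb, dim=-1)
    logits = text_emb @ syntax_emb.t() / temperature      # (batch, batch) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```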

24 pages, 1246 KiB  
Article
adaptMLLM: Fine-Tuning Multilingual Language Models on Low-Resource Languages with Integrated LLM Playgrounds
by Séamus Lankford, Haithem Afli and Andy Way
Information 2023, 14(12), 638; https://doi.org/10.3390/info14120638 - 29 Nov 2023
Cited by 27 | Viewed by 13200
Abstract
The advent of Multilingual Language Models (MLLMs) and Large Language Models (LLMs) has spawned innovation in many areas of natural language processing. Despite the exciting potential of this technology, its impact on developing high-quality Machine Translation (MT) outputs for low-resource languages remains relatively under-explored. Furthermore, an open-source application dedicated to both fine-tuning MLLMs and managing the complete MT workflow for low-resource languages remains unavailable. We aim to address these imbalances through the development of adaptMLLM, which streamlines all processes involved in the fine-tuning of MLLMs for MT. This open-source application is tailored for developers, translators, and users who are engaged in MT. It is particularly useful for newcomers to the field, as it significantly streamlines the configuration of the development environment. An intuitive interface allows for easy customisation of hyperparameters, and the application offers a range of metrics for model evaluation and the capability to deploy models as a translation service directly within the application. As a multilingual tool, we used adaptMLLM to fine-tune models for two low-resource language pairs: English to Irish (EN↔GA) and English to Marathi (EN↔MR). Compared with baselines from the LoResMT2021 Shared Task, the adaptMLLM system demonstrated significant improvements: an improvement of 5.2 BLEU points was observed in the EN→GA direction, and an increase of 40.5 BLEU points was recorded in the GA→EN direction, representing relative improvements of 14% and 117%, respectively. Significant improvements in the translation performance of the EN↔MR pair were also observed, notably in the MR→EN direction, with an increase of 21.3 BLEU points, which corresponds to a relative improvement of 68%. Finally, a fine-grained human evaluation of the MLLM output on the EN↔GA pair was conducted using the Multidimensional Quality Metrics and Scalar Quality Metrics error taxonomies. The application and models are freely available.
(This article belongs to the Special Issue Machine Translation for Conquering Language Barriers)

13 pages, 2003 KiB  
Article
Neural Machine Translation of Electrical Engineering with Fusion of Memory Information
by Yuan Chen, Zikang Liu and Juwei Zhang
Appl. Sci. 2023, 13(18), 10279; https://doi.org/10.3390/app131810279 - 13 Sep 2023
Viewed by 1647
Abstract
This paper proposes a new neural machine translation model for electrical engineering that combines a transformer with gated recurrent unit (GRU) networks. By fusing global information and memory information, the model effectively improves the performance of low-resource neural machine translation. Unlike traditional transformers, our proposed model includes two different encoders: one is the global information encoder, which focuses on contextual information, and the other is the memory encoder, which is responsible for capturing recurrent memory information. The model with these two types of attention can encode both global and memory information and learn richer semantic knowledge. Because transformers require a global attention calculation for each word position, their time and space complexity both grow quadratically with the length of the source-language sequence; when the source sequence becomes very long, the transformer’s performance declines sharply. We therefore propose a GRU-based memory information encoder to mitigate this drawback. The model proposed in this paper achieves a maximum improvement of 2.04 BLEU points over the baseline model in the low-resource electrical engineering domain.
(This article belongs to the Special Issue Natural Language Processing (NLP) and Applications)

21 pages, 1620 KiB  
Article
Deep Models for Low-Resourced Speech Recognition: Livvi-Karelian Case
by Irina Kipyatkova and Ildar Kagirov
Mathematics 2023, 11(18), 3814; https://doi.org/10.3390/math11183814 - 5 Sep 2023
Cited by 3 | Viewed by 1831
Abstract
Recently, there has been a growth in the number of studies addressing the automatic processing of low-resource languages. The lack of speech and text data significantly hinders the development of speech technologies for such languages. This paper introduces an automatic speech recognition system for Livvi-Karelian. Acoustic models based on time-delay neural networks and hidden Markov models were trained using a limited speech dataset of 3.5 h. To augment the data, pitch and speech-rate perturbation, SpecAugment, and their combinations were employed. Language models based on 3-grams and neural networks were trained using written texts and transcripts. The achieved word error rate of 22.80% is comparable to that of other low-resource languages. To the best of our knowledge, this is the first speech recognition system for Livvi-Karelian. The results obtained can be of significance for the development of automatic speech recognition systems not only for Livvi-Karelian but also for other low-resource languages, as well as for related fields such as machine translation. Future work includes experiments with Karelian data using techniques such as transfer learning and DNN language models.
(This article belongs to the Special Issue Recent Advances in Neural Networks and Applications)
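The SpecAugment part of the augmentation pipeline can be sketched with torchaudio's masking transforms. The mel-spectrogram settings and mask widths below are illustrative assumptions, and the pitch and speech-rate perturbations used in the paper are not shown.

```python
import torch
import torchaudio.transforms as T

# SpecAugment-style masking applied to a log-mel spectrogram (illustrative parameters).
mel = T.MelSpectrogram(sample_rate=16000, n_mels=80)
freq_mask = T.FrequencyMasking(freq_mask_param=15)  # mask up to 15 mel channels
time_mask = T.TimeMasking(time_mask_param=35)       # mask up to 35 frames

waveform = torch.randn(1, 16000)        # stand-in for one second of 16 kHz speech
spec = mel(waveform).log1p()
augmented = time_mask(freq_mask(spec))  # one masked copy; repeat for more variants
```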

15 pages, 2746 KiB  
Article
Neural Machine Translation of Electrical Engineering Based on Integrated Convolutional Neural Networks
by Zikang Liu, Yuan Chen and Juwei Zhang
Electronics 2023, 12(17), 3604; https://doi.org/10.3390/electronics12173604 - 25 Aug 2023
Cited by 1 | Viewed by 2031
Abstract
Research has shown that neural machine translation performs poorly on low-resource and domain-specific parallel corpora. In this paper, we focus on the problem of neural machine translation in the field of electrical engineering. To address the mistranslation caused by the Transformer model’s limited ability to extract feature information from certain sentences, we propose two new models that integrate a convolutional neural network (CNN) as a feature extraction layer into the Transformer model. The feature information extracted by the CNN is fused separately in the source-side and target-side models, which enhances the Transformer model’s ability to extract feature information, optimizes model performance, and improves translation quality. On the electrical engineering dataset, the proposed source-side and target-side models improved BLEU scores by 1.63 and 1.12 percentage points, respectively, compared to the baseline model. In addition, the two models proposed in this paper can learn rich semantic knowledge without relying on auxiliary knowledge such as part-of-speech tagging and named entity recognition, which saves human effort and time.
(This article belongs to the Special Issue Natural Language Processing and Information Retrieval)
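A minimal way to picture a CNN feature-extraction layer in front of a Transformer encoder is shown below: a 1-D convolution captures local n-gram features of the token embeddings and is fused back with a residual connection. The layer sizes and the residual fusion are assumptions for illustration, not the paper's exact source-side or target-side design.

```python
import torch
import torch.nn as nn

class ConvFeatureLayer(nn.Module):
    """1-D convolution over token embeddings, fused back via a residual connection."""
    def __init__(self, d_model=512, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                                     # x: (batch, seq_len, d_model)
        local = self.conv(x.transpose(1, 2)).transpose(1, 2)  # local n-gram features
        return x + torch.relu(local)                          # fuse with the original embeddings

encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
encoder = nn.Sequential(ConvFeatureLayer(), nn.TransformerEncoder(encoder_layer, num_layers=6))
```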

17 pages, 892 KiB  
Article
Part-of-Speech Tags Guide Low-Resource Machine Translation
by Zaokere Kadeer, Nian Yi and Aishan Wumaier
Electronics 2023, 12(16), 3401; https://doi.org/10.3390/electronics12163401 - 10 Aug 2023
Cited by 3 | Viewed by 1930
Abstract
Neural machine translation models are guided by a loss function to select source-sentence features and generate results close to human annotation. When data resources are abundant, neural machine translation models can focus on the features needed to produce high-quality translations, including part-of-speech (POS) tags and other grammatical features. However, models cannot focus precisely on these features when data resources are limited, because the lack of samples makes the model overfit before it can attend to them. Previous works have enriched the features by integrating source POS tags or by using multitask methods; however, these methods either utilize only the source POS or produce translations by introducing generated target POS tags. We propose introducing POS information based on multitask methods and reconstructors. We obtain POS tags via an additional encoder and decoder and compute the corresponding loss functions, which are used together with the machine translation loss to optimize the parameters of the entire model and make the model pay attention to POS features. The POS features the model attends to guide the translation process and alleviate the problem that models cannot focus on POS features in low-resource settings. Experiments on multiple translation tasks show that the method improves BLEU by 0.4 to 1 points compared with the baseline model.
(This article belongs to the Special Issue Natural Language Processing and Information Retrieval)
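The multitask training signal described in the abstract amounts to a weighted sum of the translation loss and the auxiliary POS losses. A minimal sketch is below; the weights and the presence of both a source-side and a target-side (reconstructor) POS term are illustrative assumptions about how such an objective is typically combined, not the paper's exact formula.

```python
def joint_loss(mt_loss, src_pos_loss, tgt_pos_loss, alpha=0.3, beta=0.3):
    """Combine the translation loss with auxiliary POS-tagging losses.

    mt_loss      : cross-entropy of the translation decoder
    src_pos_loss : loss of the POS decoder attached to the encoder
    tgt_pos_loss : loss of the POS decoder attached to the reconstructor/decoder side
    alpha, beta  : illustrative weights; the optimum is task-dependent
    """
    return mt_loss + alpha * src_pos_loss + beta * tgt_pos_loss
```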

15 pages, 2108 KiB  
Article
A Scenario-Generic Neural Machine Translation Data Augmentation Method
by Xiner Liu, Jianshu He, Mingzhe Liu, Zhengtong Yin, Lirong Yin and Wenfeng Zheng
Electronics 2023, 12(10), 2320; https://doi.org/10.3390/electronics12102320 - 21 May 2023
Cited by 72 | Viewed by 3992
Abstract
Amid the rapid advancement of neural machine translation, the challenge of data sparsity has been a major obstacle. To address this issue, this study proposes a general data augmentation technique for various scenarios. It examines the difficulty of obtaining diverse, high-quality parallel corpora in both rich- and low-resource settings, and integrates the low-frequency word substitution method and the reverse translation (back-translation) approach for complementary benefits. Additionally, the method improves the pseudo-parallel corpus generated by reverse translation by substituting low-frequency words, and includes a grammar error correction module to reduce grammatical errors in low-resource scenarios. The experimental data are partitioned into rich- and low-resource scenarios at a 10:1 ratio, and the experiments verify the necessity of grammatical error correction for the pseudo-corpus in low-resource scenarios. Models and methods from the backbone network and related literature are chosen for comparative experiments. The experimental findings demonstrate that the data augmentation approach proposed in this study is suitable for both rich- and low-resource scenarios and is effective in enhancing the training corpus to improve the performance of translation tasks.
(This article belongs to the Special Issue Natural Language Processing and Information Retrieval)
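One simple reading of the low-frequency word substitution step is sketched below: words that fall under a frequency threshold in the (pseudo-)parallel corpus are swapped for alternatives drawn from a substitution table. The threshold, probability, and toy synonym dictionary are illustrative assumptions; in the paper the substitutions are applied to the corpus produced by reverse translation.

```python
import random
from collections import Counter

def low_frequency_substitute(sentences, synonyms, min_count=5, p=0.3):
    """Replace rare words with alternatives to diversify a (pseudo-)parallel corpus.

    `synonyms` maps a rare word to candidate replacements; here it is a toy dictionary,
    in practice it could come from embeddings or a thesaurus.
    """
    counts = Counter(tok for sent in sentences for tok in sent.split())
    augmented = []
    for sent in sentences:
        toks = [
            random.choice(synonyms[t])
            if counts[t] < min_count and t in synonyms and random.random() < p
            else t
            for t in sent.split()
        ]
        augmented.append(" ".join(toks))
    return augmented
```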
