Frontiers in Machine Translation

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: closed (31 March 2022) | Viewed by 28,347

Special Issue Editors


Dr. Mihael Arcan
Guest Editor
Insight Centre for Data Analytics, Data Science Institute, Galway, Ireland
Interests: Natural Language Processing; Machine Translation; Dialogue Systems; Knowledge Graphs

Dr. Eva Vanmassenhove
Guest Editor
Tilburg School of Humanities and Digital Sciences, Department Cognitive Science and Artificial Intelligence, Tilburg, The Netherlands
Interests: Machine Translation; Natural Language Processing; Linguistics; Gender in Language

Dr. Dimitar Shterionov
Guest Editor
Tilburg School of Humanities and Digital Sciences, Department Cognitive Science and Artificial Intelligence, Tilburg, The Netherlands
Interests: Machine Translation; Natural Language Processing; Machine Learning; Artificial Intelligence

Special Issue Information

The MDPI journal Information invites submissions to a Special Issue on “Frontiers in Machine Translation”.

Since its inception in the mid-1950s, Machine Translation (MT) has steadily increased its impact on the translation industry and on translation practitioners. With current state-of-the-art systems reaching unprecedented translation quality and response times, and covering ever more language pairs and domains, MT has become an inseparable part of the translation workflow.

The current success of MT relies heavily on large, high-quality datasets and advanced computational resources. As such, low-resource use cases remain challenging. Beyond being resource-demanding, current systems still lack effective mechanisms for handling terminological expressions, named entities, coreference resolution, and various other (language-specific) linguistic phenomena.

The purpose of this Special Issue is to address such challenges and barriers that MT faces, and to present novel approaches to MT, new datasets, new ways to incorporate external knowledge into current frameworks, and more. Researchers in the field are invited to contribute their original and unpublished work.

Topics of interest include but are not limited to:

  • Comparison of machine translation models (rule-based vs. phrase-based vs. neural);
  • Machine translation for low-resourced languages;
  • Usage of non-parallel corpora for machine translation;
  • Inclusion of linguistic/semantic knowledge into machine translation;
  • Usage of named entities and terminological expressions in machine translation;
  • Multimodal machine translation;
  • Unsupervised or semi-supervised approaches to machine translation;
  • Novel approaches to automatic post-editing;
  • Automatic evaluation methods for machine translation.
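On the last topic, the core of classic automatic evaluation is n-gram overlap with a reference translation. Below is a minimal, illustrative sentence-level BLEU sketch; real toolkits add smoothing, tokenization, and corpus-level aggregation, all omitted here for clarity.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    # Geometric mean of clipped n-gram precisions, times a brevity penalty.
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        if clipped == 0:
            return 0.0
        log_prec += math.log(clipped / total) / max_n
    # Penalize candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_prec)

cand = "the cat is on the mat".split()
print(round(bleu(cand, cand), 3))  # identical sentences score 1.0
```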

Dr. Mihael Arcan
Dr. Eva Vanmassenhove
Dr. Dimitar Shterionov
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for the submission of manuscripts are available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Machine translation
  • Parallel corpora
  • Low-resourced languages
  • Automatic post-editing
  • Multimodal approaches
  • Named entities
  • Terminology
  • Multilinguality

Published Papers (8 papers)


Research


19 pages, 625 KiB  
Article
Human Evaluation of English–Irish Transformer-Based NMT
by Séamus Lankford, Haithem Afli and Andy Way
Information 2022, 13(7), 309; https://doi.org/10.3390/info13070309 - 25 Jun 2022
Cited by 5 | Viewed by 2113
Abstract
In this study, a human evaluation is carried out on how hyperparameter settings impact the quality of Transformer-based Neural Machine Translation (NMT) for the low-resourced English–Irish pair. SentencePiece models using both Byte Pair Encoding (BPE) and unigram approaches were appraised. Variations in model architectures included modifying the number of layers, evaluating the optimal number of heads for attention and testing various regularisation techniques. The greatest performance improvement was recorded for a Transformer-optimized model with a 16k BPE subword model. Compared with a baseline Recurrent Neural Network (RNN) model, a Transformer-optimized model demonstrated a BLEU score improvement of 7.8 points. When benchmarked against Google Translate, our translation engines demonstrated significant improvements. Furthermore, a quantitative fine-grained manual evaluation was conducted which compared the performance of machine translation systems. Using the Multidimensional Quality Metrics (MQM) error taxonomy, a human evaluation of the error types generated by an RNN-based system and a Transformer-based system was explored. Our findings show the best-performing Transformer system significantly reduces both accuracy and fluency errors when compared with an RNN-based model.
(This article belongs to the Special Issue Frontiers in Machine Translation)
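The subword segmentation appraised in this paper is built on byte-pair encoding. Below is a minimal, illustrative sketch of the BPE merge procedure on a toy corpus with a tiny merge count; it is not the authors' SentencePiece setup, which learns a 16k-symbol vocabulary.

```python
from collections import Counter

def bpe_merges(words, num_merges):
    # Represent each word as a tuple of symbols (characters to start).
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge to every word in the vocabulary.
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

# Toy corpus: the frequent pair ('l', 'o') is merged first.
print(bpe_merges(["low", "lower", "lowest", "low", "low"], 3))
```

In a real system the learned merges are applied to unseen words at translation time, so rare words decompose into known subwords rather than becoming out-of-vocabulary tokens.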

15 pages, 311 KiB  
Article
On the Use of Mouse Actions at the Character Level
by Ángel Navarro and Francisco Casacuberta
Information 2022, 13(6), 294; https://doi.org/10.3390/info13060294 - 09 Jun 2022
Cited by 2 | Viewed by 1405
Abstract
Neural Machine Translation (NMT) has improved performance in several tasks up to human parity. However, many companies still use Computer-Assisted Translation (CAT) tools, among other aids, to achieve perfect translations. Among these tools, we find Interactive-Predictive Neural Machine Translation (IPNMT) systems, whose main feature is facilitating machine–human interaction. In the most conventional systems, the human user fixes a translation error by typing the correct word and sending this feedback to the machine, which generates a new translation that satisfies it. In this article, we remove the need to type corrections by using the bandit feedback obtained from the cursor position when the user performs a Mouse Action (MA). Our system generates a new translation that fixes the error using only the error position. The user can perform multiple MAs at the same position if the error is not fixed, each of which increases the correction probability. One of the main objectives in the IPNMT field is reducing the required human effort in order to optimize translation time. With the proposed technique, an 84% reduction in the number of keystrokes can be achieved while still generating perfect translations. For this reason, we recommend the use of this technique in IPNMT systems.

14 pages, 871 KiB  
Article
Investigating Contextual Influence in Document-Level Translation
by Prashanth Nayak, Rejwanul Haque, John D. Kelleher and Andy Way
Information 2022, 13(5), 249; https://doi.org/10.3390/info13050249 - 12 May 2022
Cited by 4 | Viewed by 2542
Abstract
Current state-of-the-art neural machine translation (NMT) architectures usually do not take document-level context into account. However, the document-level context of a source sentence to be translated could encode valuable information to guide the MT model to generate a better translation. In recent times, MT researchers have turned their focus to this line of MT research. As an example, hierarchical attention network (HAN) models use document-level context for translation prediction. In this work, we studied translations produced by the HAN-based MT systems. We examined how contextual information improves translation in document-level NMT. More specifically, we investigated why context-aware models such as HAN perform better than vanilla baseline NMT systems that do not take context into account. We considered Hindi-to-English, Spanish-to-English and Chinese-to-English for our investigation. We experimented with the formation of conditional context (i.e., neighbouring sentences) of the source sentences to be translated in HAN to predict their target translations. Interestingly, we observed that the quality of the target translations of specific source sentences highly relates to the context in which the source sentences appear. Based on their sensitivity to context, we classify our test set sentences into three categories, i.e., context-sensitive, context-insensitive and normal. We believe that this categorization may change the way in which context is utilized in document-level translation.

11 pages, 228 KiB  
Article
Improving English-to-Indian Language Neural Machine Translation Systems
by Akshara Kandimalla, Pintu Lohar, Souvik Kumar Maji and Andy Way
Information 2022, 13(5), 245; https://doi.org/10.3390/info13050245 - 11 May 2022
Cited by 7 | Viewed by 4487
Abstract
Most Indian languages lack sufficient parallel data for Machine Translation (MT) training. In this study, we build English-to-Indian language Neural Machine Translation (NMT) systems using the state-of-the-art transformer architecture. In addition, we investigate the utility of back-translation and its effect on system performance. Our experimental evaluation reveals that the back-translation method helps to improve the BLEU scores for both English-to-Hindi and English-to-Bengali NMT systems. We also observe that back-translation is more useful in improving the quality of weaker baseline MT systems. In addition, we perform a manual evaluation of the translation outputs and observe that the BLEU metric cannot always analyse the MT quality as well as humans. Our analysis shows that MT outputs for the English–Bengali pair are actually better than the BLEU metric suggests.
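Back-translation, as investigated in this paper, pairs monolingual target-language text with machine-generated source sides to create synthetic training data. A schematic sketch follows; the reverse model here is a toy word-level stand-in, not the paper's NMT system.

```python
def back_translate(mono_target_sents, reverse_translate):
    """Create synthetic (source, target) pairs from monolingual
    target-language text. `reverse_translate` is any target->source
    MT model supplied by the caller."""
    synthetic = []
    for tgt in mono_target_sents:
        src = reverse_translate(tgt)   # synthetic source side
        synthetic.append((src, tgt))   # paired with the authentic target
    return synthetic

# Stand-in "reverse model": a toy Hindi->English word lexicon.
toy_lexicon = {"पानी": "water", "घर": "house"}
reverse = lambda s: " ".join(toy_lexicon.get(w, w) for w in s.split())

pairs = back_translate(["पानी घर"], reverse)
print(pairs)  # synthetic pair usable as extra NMT training data
```

The synthetic pairs are then mixed with the authentic parallel data to train the forward (source-to-target) system, which is why the target side being human-written matters: the model learns to produce fluent target text even from noisy synthetic sources.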

17 pages, 734 KiB  
Article
Leveraging Frozen Pretrained Written Language Models for Neural Sign Language Translation
by Mathieu De Coster and Joni Dambre
Information 2022, 13(5), 220; https://doi.org/10.3390/info13050220 - 23 Apr 2022
Cited by 7 | Viewed by 3167
Abstract
We consider neural sign language translation: machine translation from signed to written languages using encoder–decoder neural networks. Translating sign language videos to written language text is especially complex because of the difference in modality between source and target language and, consequently, the required video processing. At the same time, sign languages are low-resource languages, their datasets dwarfed by those available for written languages. Recent advances in written language processing and success stories of transfer learning raise the question of how pretrained written language models can be leveraged to improve sign language translation. We apply the Frozen Pretrained Transformer (FPT) technique to initialize the encoder, decoder, or both, of a sign language translation model with parts of a pretrained written language model. We observe that the attention patterns transfer in zero-shot to the different modality and, in some experiments, we obtain higher scores (from 18.85 to 21.39 BLEU-4). Especially when gloss annotations are unavailable, FPTs can increase performance on unseen data. However, current models appear to be limited primarily by data quality and only then by data quantity, limiting potential gains with FPTs. Therefore, in further research, we will focus on improving the representations used as inputs to translation models.

14 pages, 344 KiB  
Article
Lexical Diversity in Statistical and Neural Machine Translation
by Mojca Brglez and Špela Vintar
Information 2022, 13(2), 93; https://doi.org/10.3390/info13020093 - 15 Feb 2022
Cited by 3 | Viewed by 3070
Abstract
Neural machine translation systems have revolutionized translation processes in terms of quantity and speed in recent years, and they have even been claimed to achieve human parity. However, the quality of their output has also raised serious doubts and concerns, such as loss in lexical variation, evidence of “machine translationese”, and its effect on post-editing, which results in “post-editese”. In this study, we analyze the outputs of three English to Slovenian machine translation systems in terms of lexical diversity in three different genres. Using both quantitative and qualitative methods, we analyze one statistical and two neural systems, and we compare them to a human reference translation. Our quantitative analyses based on lexical diversity metrics show diverging results; however, translation systems, particularly neural ones, mostly exhibit larger lexical diversity than their human counterparts. Nevertheless, a qualitative method shows that these quantitative results are not always a reliable tool to assess true lexical diversity and that a lot of lexical “creativity”, especially by neural translation systems, is often unreliable, inconsistent, and misguided.
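Among the quantitative lexical diversity metrics such studies rely on, the type-token ratio (TTR) is the simplest; a moving-average variant reduces its sensitivity to text length. The sketch below is illustrative only and is not claimed to be this study's exact metric set.

```python
def type_token_ratio(tokens):
    # Ratio of distinct word types to total tokens; higher = more diverse.
    return len(set(tokens)) / len(tokens)

def mattr(tokens, window=5):
    # Moving-average TTR: mean TTR over fixed-size sliding windows,
    # which makes texts of different lengths more comparable.
    if len(tokens) < window:
        return type_token_ratio(tokens)
    ttrs = [type_token_ratio(tokens[i:i + window])
            for i in range(len(tokens) - window + 1)]
    return sum(ttrs) / len(ttrs)

text = "the cat sat on the mat and the dog sat too".split()
print(round(type_token_ratio(text), 3))  # 8 types / 11 tokens
```

As the abstract notes, such counts can mislead: a system may score high on TTR by producing varied but inconsistent or wrong word choices, which is why the study pairs the metrics with qualitative analysis.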

21 pages, 2307 KiB  
Article
Recent Advances in Dialogue Machine Translation
by Siyou Liu, Yuqi Sun and Longyue Wang
Information 2021, 12(11), 484; https://doi.org/10.3390/info12110484 - 22 Nov 2021
Cited by 6 | Viewed by 2615
Abstract
Recent years have seen a surge of interest in dialogue translation, a significant application task for machine translation (MT) technology. However, it has so far not been extensively explored due to its inherent characteristics, including data limitation, discourse properties and personality traits. In this article, we give the first comprehensive review of dialogue MT, including well-defined problems (e.g., 4 perspectives), collected resources (e.g., 5 language pairs and 4 sub-domains), representative approaches (e.g., architecture, discourse phenomena and personality) and useful applications (e.g., hotel-booking chat system). After a systematic investigation, we also build a state-of-the-art dialogue NMT system by leveraging a breadth of established approaches such as novel architectures, popular pre-training and advanced techniques. Encouragingly, we push the state-of-the-art performance up to 62.7 BLEU points on a commonly used benchmark by using mBART pre-training. We hope that this survey will significantly promote research in dialogue MT.

Review


17 pages, 15504 KiB  
Review
Sign Language Avatars: A Question of Representation
by Rosalee Wolfe, John C. McDonald, Thomas Hanke, Sarah Ebling, Davy Van Landuyt, Frankie Picron, Verena Krausneker, Eleni Efthimiou, Evita Fotinea and Annelies Braffort
Information 2022, 13(4), 206; https://doi.org/10.3390/info13040206 - 18 Apr 2022
Cited by 12 | Viewed by 6411
Abstract
Given the achievements in automatically translating text from one language to another, one would expect to see similar advancements in translating between signed and spoken languages. However, progress in this effort has lagged in comparison. Typically, machine translation consists of processing text from one language to produce text in another. Because signed languages have no generally accepted written form, translating spoken to signed language requires the additional step of displaying the language visually as animation through the use of a three-dimensional (3D) virtual human commonly known as an avatar. Researchers have been grappling with this problem for over twenty years, and it is still an open question. With the goal of developing a deeper understanding of the challenges posed by this question, this article gives a summary overview of the unique aspects of signed languages, briefly surveys the technology underlying avatars and performs an in-depth analysis of the features in a textual representation for avatar display. It concludes with a comparison of these features and makes observations about future research directions.
