Frontiers in Machine Translation

A special issue of Information (ISSN 2078-2489). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: closed (31 March 2022) | Viewed by 28,347

Special Issue Editors


Dr. Mihael Arcan
Guest Editor
Insight Centre for Data Analytics, Data Science Institute, Galway, Ireland
Interests: Natural Language Processing; Machine Translation; Dialogue Systems; Knowledge Graphs

Dr. Eva Vanmassenhove
Guest Editor
Tilburg School of Humanities and Digital Sciences, Department Cognitive Science and Artificial Intelligence, Tilburg, The Netherlands
Interests: Machine Translation; Natural Language Processing; Linguistics; Gender in Language

Dr. Dimitar Shterionov
Guest Editor
Tilburg School of Humanities and Digital Sciences, Department Cognitive Science and Artificial Intelligence, Tilburg, The Netherlands
Interests: Machine Translation; Natural Language Processing; Machine Learning; Artificial Intelligence

Special Issue Information

The MDPI journal Information invites submissions to a Special Issue on “Frontiers in Machine Translation”.

Since its inception in the mid-1950s, Machine Translation (MT) has steadily increased its impact on the translation industry and on translation practitioners. With current state-of-the-art systems reaching unprecedented translation quality and response times, and covering ever more language pairs and domains, MT has become an inseparable part of the translation workflow.

The current success of MT relies heavily on large, high-quality datasets and advanced computational resources. As such, low-resource use cases remain challenging. Beyond being resource-demanding, current systems still lack effective mechanisms for handling terminological expressions, named entities, coreference resolution, and various other (language-specific) linguistic phenomena.

The purpose of this Special Issue is to address such challenges and barriers that MT faces, and to present novel approaches to MT, new datasets, new ways to incorporate external knowledge into current frameworks, and more. Researchers in the field are invited to contribute their original and unpublished work.

Topics of interest include but are not limited to:

  • Comparison of machine translation models (rule-based vs. phrase-based vs. neural);
  • Machine translation for low-resourced languages;
  • Usage of non-parallel corpora for machine translation;
  • Inclusion of linguistic/semantic knowledge into machine translation;
  • Usage of named entities and terminological expressions in machine translation;
  • Multimodal machine translation;
  • Unsupervised or semi-supervised approaches to machine translation;
  • Novel approaches to automatic post-editing;
  • Automatic evaluation methods for machine translation.
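On the last topic, the core of classic automatic evaluation is n-gram overlap with a reference translation. Below is a minimal, illustrative sentence-level BLEU sketch; real toolkits add smoothing, tokenization, and corpus-level aggregation, all omitted here for clarity.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    # Geometric mean of clipped n-gram precisions, times a brevity penalty.
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        if clipped == 0:
            return 0.0
        log_prec += math.log(clipped / total) / max_n
    # Penalize candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_prec)

cand = "the cat is on the mat".split()
print(round(bleu(cand, cand), 3))  # identical sentences score 1.0
```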

Dr. Mihael Arcan
Dr. Eva Vanmassenhove
Dr. Dimitar Shterionov
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for the submission of manuscripts are available on the Instructions for Authors page. Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Machine translation
  • Parallel corpora
  • Low-resourced languages
  • Automatic post-editing
  • Multimodal approaches
  • Named entities
  • Terminology
  • Multilinguality

Published Papers (8 papers)


Research


19 pages, 625 KiB  
Article
Human Evaluation of English–Irish Transformer-Based NMT
by Séamus Lankford, Haithem Afli and Andy Way
Information 2022, 13(7), 309; https://doi.org/10.3390/info13070309 - 25 Jun 2022
Cited by 5 | Viewed by 2113
Abstract
In this study, a human evaluation is carried out on how hyperparameter settings impact the quality of Transformer-based Neural Machine Translation (NMT) for the low-resourced English–Irish pair. SentencePiece models using both Byte Pair Encoding (BPE) and unigram approaches were appraised. Variations in model architectures included modifying the number of layers, evaluating the optimal number of heads for attention and testing various regularisation techniques. The greatest performance improvement was recorded for a Transformer-optimized model with a 16k BPE subword model. Compared with a baseline Recurrent Neural Network (RNN) model, a Transformer-optimized model demonstrated a BLEU score improvement of 7.8 points. When benchmarked against Google Translate, our translation engines demonstrated significant improvements. Furthermore, a quantitative fine-grained manual evaluation was conducted which compared the performance of machine translation systems. Using the Multidimensional Quality Metrics (MQM) error taxonomy, a human evaluation of the error types generated by an RNN-based system and a Transformer-based system was explored. Our findings show the best-performing Transformer system significantly reduces both accuracy and fluency errors when compared with an RNN-based model.
(This article belongs to the Special Issue Frontiers in Machine Translation)
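The subword segmentation appraised in this paper is built on byte-pair encoding. Below is a minimal, illustrative sketch of the BPE merge procedure on a toy corpus with a tiny merge count; it is not the authors' SentencePiece setup, which learns a 16k-symbol vocabulary.

```python
from collections import Counter

def bpe_merges(words, num_merges):
    # Represent each word as a tuple of symbols (characters to start).
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the winning merge to every word in the vocabulary.
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

# Toy corpus: the frequent pair ('l', 'o') is merged first.
print(bpe_merges(["low", "lower", "lowest", "low", "low"], 3))
```

In a real system the learned merges are applied to unseen words at translation time, so rare words decompose into known subwords rather than becoming out-of-vocabulary tokens.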

15 pages, 311 KiB  
Article
On the Use of Mouse Actions at the Character Level
by Ángel Navarro and Francisco Casacuberta
Information 2022, 13(6), 294; https://doi.org/10.3390/info13060294 - 09 Jun 2022
Cited by 2 | Viewed by 1405
Abstract
Neural Machine Translation (NMT) has improved performance in several tasks up to human parity. However, many companies still use Computer-Assisted Translation (CAT) tools, among other aids, to achieve perfect translations. Among these tools, we find Interactive-Predictive Neural Machine Translation (IPNMT) systems, whose main feature is facilitating machine–human interaction. In the most conventional systems, the human user fixes a translation error by typing the correct word and sending this feedback to the machine, which generates a new translation that satisfies it. In this article, we remove the need to type corrections by using the bandit feedback obtained from the cursor position when the user performs a Mouse Action (MA). Our system generates a new translation that fixes the error using only the error position. The user can perform multiple MAs at the same position if the error is not fixed, each of which increases the correction probability. One of the main objectives in the IPNMT field is reducing the required human effort in order to optimize translation time. With the proposed technique, an 84% reduction in the number of keystrokes can be achieved while still generating perfect translations. For this reason, we recommend the use of this technique in IPNMT systems.

14 pages, 871 KiB  
Article
Investigating Contextual Influence in Document-Level Translation
by Prashanth Nayak, Rejwanul Haque, John D. Kelleher and Andy Way
Information 2022, 13(5), 249; https://doi.org/10.3390/info13050249 - 12 May 2022
Cited by 4 | Viewed by 2542
Abstract
Current state-of-the-art neural machine translation (NMT) architectures usually do not take document-level context into account. However, the document-level context of a source sentence to be translated could encode valuable information to guide the MT model to generate a better translation. In recent times, MT researchers have turned their focus to this line of MT research. As an example, hierarchical attention network (HAN) models use document-level context for translation prediction. In this work, we studied translations produced by the HAN-based MT systems. We examined how contextual information improves translation in document-level NMT. More specifically, we investigated why context-aware models such as HAN perform better than vanilla baseline NMT systems that do not take context into account. We considered Hindi-to-English, Spanish-to-English and Chinese-to-English for our investigation. We experimented with the formation of conditional context (i.e., neighbouring sentences) of the source sentences to be translated in HAN to predict their target translations. Interestingly, we observed that the quality of the target translations of specific source sentences highly relates to the context in which the source sentences appear. Based on their sensitivity to context, we classify our test set sentences into three categories, i.e., context-sensitive, context-insensitive and normal. We believe that this categorization may change the way in which context is utilized in document-level translation.

11 pages, 228 KiB  
Article
Improving English-to-Indian Language Neural Machine Translation Systems
by Akshara Kandimalla, Pintu Lohar, Souvik Kumar Maji and Andy Way
Information 2022, 13(5), 245; https://doi.org/10.3390/info13050245 - 11 May 2022
Cited by 7 | Viewed by 4487
Abstract
Most Indian languages lack sufficient parallel data for Machine Translation (MT) training. In this study, we build English-to-Indian language Neural Machine Translation (NMT) systems using the state-of-the-art transformer architecture. In addition, we investigate the utility of back-translation and its effect on system performance. Our experimental evaluation reveals that the back-translation method helps to improve the BLEU scores for both English-to-Hindi and English-to-Bengali NMT systems. We also observe that back-translation is more useful in improving the quality of weaker baseline MT systems. In addition, we perform a manual evaluation of the translation outputs and observe that the BLEU metric cannot always analyse the MT quality as well as humans. Our analysis shows that MT outputs for the English–Bengali pair are actually better than the BLEU metric suggests.
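Back-translation, as investigated in this paper, pairs monolingual target-language text with machine-generated source sides to create synthetic training data. A schematic sketch follows; the reverse model here is a toy word-level stand-in, not the paper's NMT system.

```python
def back_translate(mono_target_sents, reverse_translate):
    """Create synthetic (source, target) pairs from monolingual
    target-language text. `reverse_translate` is any target->source
    MT model supplied by the caller."""
    synthetic = []
    for tgt in mono_target_sents:
        src = reverse_translate(tgt)   # synthetic source side
        synthetic.append((src, tgt))   # paired with the authentic target
    return synthetic

# Stand-in "reverse model": a toy Hindi->English word lexicon.
toy_lexicon = {"पानी": "water", "घर": "house"}
reverse = lambda s: " ".join(toy_lexicon.get(w, w) for w in s.split())

pairs = back_translate(["पानी घर"], reverse)
print(pairs)  # synthetic pair usable as extra NMT training data
```

The synthetic pairs are then mixed with the authentic parallel data to train the forward (source-to-target) system, which is why the target side being human-written matters: the model learns to produce fluent target text even from noisy synthetic sources.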

17 pages, 734 KiB  
Article
Leveraging Frozen Pretrained Written Language Models for Neural Sign Language Translation
by Mathieu De Coster and Joni Dambre
Information 2022, 13(5), 220; https://doi.org/10.3390/info13050220 - 23 Apr 2022
Cited by 7 | Viewed by 3167
Abstract
We consider neural sign language translation: machine translation from signed to written languages using encoder–decoder neural networks. Translating sign language videos to written language text is especially complex because of the difference in modality between source and target language and, consequently, the required video processing. At the same time, sign languages are low-resource languages, their datasets dwarfed by those available for written languages. Recent advances in written language processing and success stories of transfer learning raise the question of how pretrained written language models can be leveraged to improve sign language translation. We apply the Frozen Pretrained Transformer (FPT) technique to initialize the encoder, decoder, or both, of a sign language translation model with parts of a pretrained written language model. We observe that the attention patterns transfer in zero-shot to the different modality and, in some experiments, we obtain higher scores (from 18.85 to 21.39 BLEU-4). Especially when gloss annotations are unavailable, FPTs can increase performance on unseen data. However, current models appear to be limited primarily by data quality and only then by data quantity, limiting potential gains with FPTs. Therefore, in further research, we will focus on improving the representations used as inputs to translation models.

14 pages, 344 KiB  
Article
Lexical Diversity in Statistical and Neural Machine Translation
by Mojca Brglez and Špela Vintar
Information 2022, 13(2), 93; https://doi.org/10.3390/info13020093 - 15 Feb 2022
Cited by 3 | Viewed by 3070
Abstract
Neural machine translation systems have revolutionized translation processes in terms of quantity and speed in recent years, and they have even been claimed to achieve human parity. However, the quality of their output has also raised serious doubts and concerns, such as loss in lexical variation, evidence of “machine translationese”, and its effect on post-editing, which results in “post-editese”. In this study, we analyze the outputs of three English to Slovenian machine translation systems in terms of lexical diversity in three different genres. Using both quantitative and qualitative methods, we analyze one statistical and two neural systems, and we compare them to a human reference translation. Our quantitative analyses based on lexical diversity metrics show diverging results; however, translation systems, particularly neural ones, mostly exhibit larger lexical diversity than their human counterparts. Nevertheless, a qualitative method shows that these quantitative results are not always a reliable tool to assess true lexical diversity and that a lot of lexical “creativity”, especially by neural translation systems, is often unreliable, inconsistent, and misguided.
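Among the quantitative lexical diversity metrics such studies rely on, the type-token ratio (TTR) is the simplest; a moving-average variant reduces its sensitivity to text length. The sketch below is illustrative only and is not claimed to be this study's exact metric set.

```python
def type_token_ratio(tokens):
    # Ratio of distinct word types to total tokens; higher = more diverse.
    return len(set(tokens)) / len(tokens)

def mattr(tokens, window=5):
    # Moving-average TTR: mean TTR over fixed-size sliding windows,
    # which makes texts of different lengths more comparable.
    if len(tokens) < window:
        return type_token_ratio(tokens)
    ttrs = [type_token_ratio(tokens[i:i + window])
            for i in range(len(tokens) - window + 1)]
    return sum(ttrs) / len(ttrs)

text = "the cat sat on the mat and the dog sat too".split()
print(round(type_token_ratio(text), 3))  # 8 types / 11 tokens
```

As the abstract notes, such counts can mislead: a system may score high on TTR by producing varied but inconsistent or wrong word choices, which is why the study pairs the metrics with qualitative analysis.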

21 pages, 2307 KiB  
Article
Recent Advances in Dialogue Machine Translation
by Siyou Liu, Yuqi Sun and Longyue Wang
Information 2021, 12(11), 484; https://doi.org/10.3390/info12110484 - 22 Nov 2021
Cited by 6 | Viewed by 2615
Abstract
Recent years have seen a surge of interest in dialogue translation, a significant application task for machine translation (MT) technology. However, it has so far not been extensively explored due to its inherent characteristics, including data limitation, discourse properties and personality traits. In this article, we give the first comprehensive review of dialogue MT, including well-defined problems (e.g., 4 perspectives), collected resources (e.g., 5 language pairs and 4 sub-domains), representative approaches (e.g., architecture, discourse phenomena and personality) and useful applications (e.g., hotel-booking chat system). After a systematic investigation, we also build a state-of-the-art dialogue NMT system by leveraging a breadth of established approaches such as novel architectures, popular pre-training and advanced techniques. Encouragingly, we push the state-of-the-art performance up to 62.7 BLEU points on a commonly used benchmark by using mBART pre-training. We hope that this survey will significantly promote research in dialogue MT.

Review


17 pages, 15504 KiB  
Review
Sign Language Avatars: A Question of Representation
by Rosalee Wolfe, John C. McDonald, Thomas Hanke, Sarah Ebling, Davy Van Landuyt, Frankie Picron, Verena Krausneker, Eleni Efthimiou, Evita Fotinea and Annelies Braffort
Information 2022, 13(4), 206; https://doi.org/10.3390/info13040206 - 18 Apr 2022
Cited by 12 | Viewed by 6411
Abstract
Given the achievements in automatically translating text from one language to another, one would expect to see similar advancements in translating between signed and spoken languages. However, progress in this effort has lagged in comparison. Typically, machine translation consists of processing text from one language to produce text in another. Because signed languages have no generally accepted written form, translating spoken to signed language requires the additional step of displaying the language visually as animation through the use of a three-dimensional (3D) virtual human commonly known as an avatar. Researchers have been grappling with this problem for over twenty years, and it is still an open question. With the goal of developing a deeper understanding of the challenges posed by this question, this article gives a summary overview of the unique aspects of signed languages, briefly surveys the technology underlying avatars and performs an in-depth analysis of the features in a textual representation for avatar display. It concludes with a comparison of these features and makes observations about future research directions.
