Special Issue "Advances in Computer-Aided Translation Technology"
A special issue of Informatics (ISSN 2227-9709).
Deadline for manuscript submissions: 30 April 2019
Dr. Joke Daems
Dr. Arda Tezcan
Translation technology has become an integral part of the life of a professional translator. Computer-aided translation (CAT) tools have evolved over the years from basic translation memory systems to full-fledged translation environment tools (TEnTs), offering a wide range of support to the professional translator. Moreover, these environments attempt to reach the optimal level of human–machine interaction by increasingly integrating translation memory (TM) and machine translation (MT) suggestions in more interactive ways. However, with the growing variety of MT paradigms and changing translation workflows (e.g., collaborative translation), new challenges lie ahead.
For this Special Issue we seek novel, original contributions across the entire spectrum of computer-aided translation technology, covering advances in the following areas:
- Matching and retrieval of segments in translation memories
- Integration of TM and MT suggestions
- Integration of client-specific terminology in neural MT
- Multilingual terminology extraction
- Quality estimation of MT and TM suggestions
- Translation quality assurance
- Automatic methods for translation memory cleaning and maintenance
- Productivity measurements
- Effort prediction and price estimation
- Methods for collaborative translation
- Post-editing guidelines and best practices
- Intelligent interface design
- User-adaptive systems
- Automatic speech recognition for dictating translations
- Integration with text authoring tools
Prof. Dr. Lieve Macken
Dr. Joke Daems
Dr. Arda Tezcan
Manuscript Submission Information
Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.
Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Informatics is an international peer-reviewed open access quarterly journal published by MDPI.
Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 350 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.
The list below represents only planned manuscripts. Some of these manuscripts have not yet been received by the Editorial Office. Papers submitted to MDPI journals are subject to peer review.
1. Title: Focusing on the users of computer-aided translation tools
Author: Britta Upsing, Leibniz-Institut für Bildungsforschung
Abstract: According to a recent representative survey by Schmitt, Gerstmeyer, and Müller (2016), 75 percent of professional translators use computer-aided translation (CAT) tools. This high percentage seems to confirm the assumption that translation technology has become an integral part of the life of a translator. Still, this survey also shows that one quarter of translators do not participate in this trend. This paper focuses on the users (and non-users) of CAT tools. It investigates in detail the motivations of translators for and against CAT tool use. Furthermore, it analyses how workflows, translation assignments, work environment, and attitude towards technology impact this decision and the translator’s satisfaction with translation technology. The study is based on semi-structured qualitative interviews with 20 professional translators/reviewers. For the interpretation of the interviews, qualitative content analysis is used. Results provide insight into translators’ working practices: they indicate, for instance, that switching to CAT tool use may have been enforced by the translation buyer, but that continued use is voluntary. Translators appreciate that CAT tools facilitate routine work, but results also show that under certain circumstances CAT tools may have a detrimental effect on translators’ creativity, their view of their work and their self-image. These findings will give a new perspective on the possible design space of CAT tools.
2. Title: Black-box interactive translation prediction
Author: Daniel Torregrosa
Abstract: Interactive translation prediction (ITP) is a modality of computer-aided translation that assists professional translators by offering context-based computer-generated continuation suggestions as they type. While most state-of-the-art ITP systems follow a glass-box approach, meaning that they are tightly coupled to an adapted machine translation system, we propose a black-box approach which does not need access to the inner workings of the bilingual resources used to generate the suggestions. In this paper we explore different ways of generating suggestions and evaluate the performance of the models using both automatic and human evaluation.
3. Title: Misalignment detection for web scraped corpora
Author: Joachim Van den Bogaert
Abstract: To build state-of-the-art Neural Machine Translation (NMT) systems, millions of high-quality parallel sentences are needed. Typically, large amounts of data are scraped from multilingual websites and aligned into datasets for training. Many tools exist for automatic alignment of such datasets. However, the quality of the resulting aligned corpus can be disappointing. In this paper we present a tool for automatic misalignment detection (MAD). We treat the task of determining whether a pair of aligned sentences constitutes a genuine translation as a supervised regression problem. We train our algorithm on a manually labeled dataset in the FR-NL language pair. Our algorithm uses an MT system and Levenshtein distance as a similarity score in combination with support vector regression. It achieves a Pearson correlation of 0.85, and, treated as a classification problem, it reaches an AUC of 0.97. We use our tool to create a high-quality aligned corpus and show that it can improve the performance of an NMT system.
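As a rough illustration of the similarity score mentioned in this abstract, the sketch below computes a normalized Levenshtein similarity between a machine-translated source sentence and its aligned target sentence. It is not the authors' implementation: the function names, the normalization by the longer string's length, and the example sentences are assumptions made for illustration, and both the MT step and the support vector regression step are left out.

```python
def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity(mt_output: str, target: str) -> float:
    """Similarity in [0, 1]; 1.0 means the two strings are identical.

    Assumption: distance is normalized by the length of the longer string.
    """
    longest = max(len(mt_output), len(target))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(mt_output, target) / longest


# Hypothetical example: MT output of a source sentence vs. two candidate targets.
mt_of_source = "de kat slaapt op de bank"
aligned_target = "de kat slaapt op de sofa"
misaligned_target = "morgen regent het in Gent"
print(similarity(mt_of_source, aligned_target))
print(similarity(mt_of_source, misaligned_target))
```

In a setup like the one the abstract describes, a score of this kind would presumably serve as an input feature to the regression model rather than as a decision on its own.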
4. Title: Comparing Post-editing based on Four Editing Actions against Translating with an Auto-Complete Feature
Author: Félix do Carmo
Affiliation: ADAPT Centre / CTTS, Dublin City University
Abstract: This paper describes an empirical experiment that was carried out during a workshop with 50 translators as the initial data analysis stage of a study on the development of tools to learn and support editing work. In the workshop, translators were asked to try a typical auto-complete feature, and then to test an experimental interface, constrained by the notion of editing as being composed of four actions (deleting, inserting, moving and replacing). The results of this workshop show that the constrained editing interface, although more intrusive, was considered by the translators as a useful approach for planning and for pedagogical purposes. Furthermore, the experiment allowed us to do a broad analysis of different factors playing a role in editing decisions, such as texts, segments, users, the work mode and the editing actions, as measured by edit scores and throughput.
5. Title: Post-editing neural MT in medical LSP: lexicogrammatical patterns and distortion in the communication of specialized knowledge
Author: Hanna Martikainen
Affiliation: Université Paris Diderot, CLILLAC-ARP, EA 3967, 75013, Paris, France
Abstract: The recent arrival on the market of high-performing neural MT engines will likely lead to a profound transformation of the translation profession. The purpose of this study is to explore how this paradigm change impacts the post-editing process, with a focus on lexicogrammatical patterns that are used in the communication of specialized knowledge. A corpus of 100 medical abstracts pre-translated from English into French by the neural MT engine DeepL and post-edited by Master’s students in translation was used to study potential distortions in the translation of lexicogrammatical patterns. The results suggest that neural MT tends to lead to specific sources of distortion in the translation of these patterns, not unlike what has previously been observed in human translation. These observations highlight the need to pay specific attention to lexicogrammatical patterns when post-editing neural MT in order to achieve functional equivalence in the translation of specialized texts.
Keywords: neural MT; post-editing; functional equivalence; distortion; lexicogrammatical patterns; medical LSP
6. Title: Speech synthesis in the translation revision process: evidence from eye-tracking and error analysis
Authors: Alina Secară1, Dragoș Ciobanu2, Valentina Ragni3
1 University of Leeds (+44-(0)113-343-3365)
2 University of Leeds
3 University of Leeds
Abstract: This article presents initial results from an experimental eye-tracking study of the use of speech synthesis for a revision task performed with a computer-assisted translation (CAT) tool, memoQ. The experiment was designed to investigate if and how the presence of sound affects both translators’ viewing behaviour and revision quality. For the former, the participants’ eye movements were monitored during the revision task, while for the latter the level of error detection and correction, which is a common indicator of quality, was used. Immediately after carrying out the revisions, the subjects also completed a questionnaire on their perceived experience using speech synthesis.
This article will describe the methodology in detail, present a statistical analysis of eye-tracking, error and questionnaire data, discuss the results and consider the implications not only for professionals working in the private and IO sector, but also for trainee translators and related pedagogical practices in the Higher Education sector.
Keywords: speech synthesis, translation revision, computer-assisted translation (CAT), eye-tracking, error detection, error analysis
7. Title: Improving the Translation Environment for Professional Translators
Authors: Vincent Vandeghinste, Tom Vanallemeersch, Liesbeth Augustinus, Bram Bulté, Frank Van Eynde, Joris Pelemans, Lyan Verwimp, Patrick Wambacq, Geert Heyman, Marie-Francine Moens, Iulianna van der Lek-Ciudin, Frieda Steurs, Ayla Rigouts Terryn, Els Lefever, Arda Tezcan, Lieve Macken, Véronique Hoste, Sven Coppers, Jens Brulmans, Jan Van den Bergh, Kris Luyten and Karin Coninx
Abstract: When using CAT systems in a typical, professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological side. This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts through bilingual lexicon induction, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
8. Title: The Role of Machine Translation Quality Estimation in the Post-Editing Workflows
Authors: Hanna Bechara, Constantin Orasan, Marcos Zampieri, Carla Parra
Abstract: As Machine Translation (MT) has become increasingly ubiquitous, so has its use in professional translation workflows. However, its proliferation in the translation industry has brought about new challenges in the field of Post-Editing (PE). We are now faced with a need to find effective tools to assess the quality of MT systems to avoid underpayments and mistrust by professional translators. In this scenario, one promising field of study is MT Quality Estimation (MTQE), as this aims to determine the quality of a translation and its degree of post-editing difficulty. However, its impact on translation workflows and on translators’ cognitive load is still to be fully explored.
9. Title: Translation quality and effort prediction in professional MT post-editing
Authors: Moritz Schaeffer, Jennifer Vardaro, Silvia Hansen-Schirra
Affiliation: Johannes Gutenberg-University Mainz
Abstract: The main focus of this controlled eye-tracking and key-logging study is to determine the most efficient way of error detection and correction in neural machine translated and post-edited texts in the European Commission’s Directorate-General for Translation (DGT). The experiment was informed by quality analyses of authentic DGT corpora including automatic (Hjerson, Popović 2011) and manual (MQM framework, Lommel 2014) error annotations as well as linear regressions to identify error categories in MT output and post-edited texts. The corpus results show that lexical errors (particularly mistranslations), terminology errors and stylistic changes pose the most frequent problems to post-editors.
For this study, carried out in Translog II (Carl 2012), temporal, technical and cognitive effort is operationalized through eye movements and typing behaviour on test sentences containing the above-mentioned error categories, compared with control sentences without errors. Twenty-five professional DGT translators post-edited 100 English-German neural machine translated sentences from the DGT corpus. We will examine the effect of the three error types on early (first fixation durations, gaze durations) and late eye movement measures (e.g., total reading time and scanpath measures) as well as typing behaviour. On the basis of statistical regression analyses we will be able to predict how much temporal, technical and cognitive effort is associated with the different error categories with respect to the recognition and correction of the three error types during the DGT post-editing process. In addition, the behavioural data of the DGT translation professionals will be compared to those of a group of 25 translation students. Behavioural differences between the two groups will allow for further predictions regarding the effect of expertise on the post-editing process.