Open Access Article
Entropy 2017, 19(9), 340; https://doi.org/10.3390/e19090340

Investigating the Relationship between Classification Quality and SMT Performance in Discriminative Reordering Models

1 Department of Computer Engineering, University of Isfahan, Isfahan 81746-73441, Iran
2 Center for Language and Cognition, University of Groningen, Groningen 9712 EK, The Netherlands
3 ADAPT Centre, School of Computing, Dublin City University, Dublin 9, Ireland
* Authors to whom correspondence should be addressed.
Received: 2 June 2017 / Revised: 18 June 2017 / Accepted: 23 June 2017 / Published: 24 August 2017
(This article belongs to the Section Statistical Mechanics)

Abstract

Reordering is one of the most important factors affecting the quality of the output in statistical machine translation (SMT). A considerable number of the approaches proposed to address the reordering problem are discriminative reordering models (DRMs). The core component of a DRM is a classifier that tries to predict the correct word order of the sentence. Unfortunately, the relationship between classification quality and ultimate SMT performance has not been investigated to date. Understanding this relationship will allow researchers to select the classifier that results in the best possible MT quality. It might be assumed that there is a monotonic relationship between classification quality and SMT performance, i.e., that any improvement in classification performance will be reflected in overall SMT quality. In this paper, we experimentally show that this assumption does not always hold, i.e., an improvement in classification performance might actually degrade the quality of an SMT system, as measured by automatic MT evaluation metrics. However, we show that if the improvement in classification performance is large enough, we can expect the SMT quality to improve as well. In addition, we show that there is a negative relationship between classification accuracy and SMT performance in imbalanced parallel corpora. For these types of corpora, we provide evidence that, for the evaluation of the classifier, macro-averaged metrics such as macro-averaged F-measure are better suited than accuracy, the metric commonly used to date.
Keywords: statistical machine translation; reordering model; classification; performance; correlation; intrinsic evaluation
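
The abstract's point about accuracy versus macro-averaged F-measure on imbalanced data can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the article: the two orientation labels ("monotone" and "swap"), the 90/10 class split, and the predictions of the two hypothetical classifiers. It only shows how the two metrics can rank classifiers differently; it says nothing about the resulting SMT quality.

```python
def accuracy(gold, pred):
    """Fraction of examples whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced test set: 90 "monotone" and 10 "swap" examples.
gold = ["monotone"] * 90 + ["swap"] * 10

# Hypothetical classifier A: always predicts the majority class.
majority = ["monotone"] * 100

# Hypothetical classifier B: 80 of 90 "monotone" and 8 of 10 "swap" correct.
balanced = ["monotone"] * 80 + ["swap"] * 10 + ["swap"] * 8 + ["monotone"] * 2

for name, pred in [("majority", majority), ("balanced", balanced)]:
    print(f"{name}: accuracy={accuracy(gold, pred):.2f}, "
          f"macro-F1={macro_f1(gold, pred):.2f}")
```

On this toy data the majority-class classifier has the higher accuracy (0.90 against 0.88) but a much lower macro-averaged F-measure (about 0.47 against about 0.75), because it never predicts the minority class.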
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
MDPI and ACS Style

Kazemi, A.; Toral, A.; Way, A.; Monadjemi, A.; Nematbakhsh, M. Investigating the Relationship between Classification Quality and SMT Performance in Discriminative Reordering Models. Entropy 2017, 19, 340.
Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
Entropy, EISSN 1099-4300, published by MDPI AG, Basel, Switzerland.