Fine-Tuning Methods and Dataset Structures for Multilingual Neural Machine Translation: A Kazakh–English–Russian Case Study in the IT Domain
Abstract
1. Introduction
- Fine-tuning analysis: a thorough comparison of single-stage and two-stage fine-tuning strategies to balance domain specialization and generalization in multilingual NMT.
- Dataset structure assessment: an analysis of triplet and non-triplet dataset structures, demonstrating their impact on cross-lingual consistency and translation accuracy.
- High-quality dataset: a case study using an IT-domain dataset of 50,000 manually translated and expert-validated Kazakh–English–Russian triplets to demonstrate domain-specific adaptation.
- Generalizable framework: a reproducible framework for NMT optimization, with suggestions for new methodologies and broader applicability to resource-constrained languages.
2. Literature Review
3. Materials and Methods
3.1. Dataset
3.1.1. IT-Domain Dataset (KK ↔ EN ↔ RU)
3.1.2. Out-of-Domain Dataset
3.2. Methods
3.2.1. No Language Left Behind Method
3.2.2. Domain-Specific Analysis
3.2.3. Dataset Type Comparison
3.2.4. Fine-Tuning Setup
3.2.5. Evaluation Metrics
3.2.6. Statistical Significance Testing
4. Experimental Setup
- Single-stage vs. two-stage fine-tuning: S-Exp-1 vs. D-Exp-1, S-Exp-2 vs. D-Exp-2, and S-Exp-3 vs. D-Exp-3, to assess whether sequential training mitigates catastrophic forgetting, enhancing IT-domain accuracy without sacrificing generalization.
- In-domain vs. out-of-domain: Exp-1 vs. Exp-2 and Exp-3, to evaluate whether IT-specific training outperforms general-domain training on technical texts.
- Triplet vs. non-triplet: Exp-1 and Exp-2 vs. Exp-4, to compare whether simultaneous three-language training improves cross-lingual coherence over individual pair training.
- LangPair corpus impact: S-Exp-1 vs. S-Exp-3 and S-Exp-2 vs. Exp-4, to test the contribution of non-triplet data.
- Commercial baselines: benchmarking against the Google Translate and Yandex Translate APIs; a minimal inference sketch for the NLLB-200 baseline model follows this list.
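For reference, the sketch below shows how translations can be obtained from the base NLLB-200-distilled-1.3B checkpoint (the Exp-2 baseline). It is a minimal sketch assuming the Hugging Face transformers API and FLORES-200 language codes (eng_Latn, kaz_Cyrl, rus_Cyrl); the decoding settings are illustrative, not the paper's exact configuration.

```python
# Minimal sketch: translating with the base NLLB-200 checkpoint (Exp-2 baseline).
# Assumes the Hugging Face transformers library; decoding settings are illustrative.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_ID = "facebook/nllb-200-distilled-1.3B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

def translate(text: str, tgt_lang: str) -> str:
    """Translate `text` into the FLORES-200 target code, e.g. 'kaz_Cyrl'."""
    inputs = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **inputs,
        # NLLB selects the output language by forcing its code as the first token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_length=128,
        num_beams=5,  # illustrative; the paper's decoding settings may differ
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

print(translate("Deep learning algorithms optimize neural networks.", "kaz_Cyrl"))
```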
5. Results
6. Discussion
6.1. Quantitative Analysis of Translation Outputs
6.2. Qualitative Analysis of Translation Outputs
6.3. Limitations and Future Work
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Costa-Jussà, M.R.; Cross, J.; Çelebi, O.; Elbayad, M.; Heafield, K.; Heffernan, K.; Kalbassi, E.; Lam, J.; Licht, D.; Maillard, J.; et al. No language left behind: Scaling human-centered machine translation. arXiv 2022, arXiv:2207.04672.
- Saunders, D. Domain adaptation and multi-domain adaptation for neural machine translation: A survey. J. Artif. Intell. Res. 2022, 75, 351–424.
- Luong, M.T.; Manning, C.D. Stanford neural machine translation systems for spoken language domains. In Proceedings of the 12th International Workshop on Spoken Language Translation: Evaluation Campaign, Da Nang, Vietnam, 3–4 December 2015; pp. 76–79.
- Chu, C.; Dabre, R.; Kurohashi, S. An empirical comparison of domain adaptation methods for neural machine translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, BC, Canada, 30 July–4 August 2017; pp. 385–391.
- Gao, Y.; Hou, F.; Wang, R. A Novel Two-step Fine-tuning Framework for Transfer Learning in Low-Resource Neural Machine Translation. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Mexico City, Mexico, 16–21 June 2024; pp. 3214–3224.
- Yeshpanov, R.; Polonskaya, A.; Varol, H.A. KazParC: Kazakh Parallel Corpus for Machine Translation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italy, 20–25 May 2024; pp. 9633–9644.
- Sennrich, R.; Haddow, B.; Birch, A. Neural Machine Translation of Rare Words with Subword Units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 7–12 August 2016; pp. 1715–1725.
- Chowdhury, K.D.; Hasanuzzaman, M.; Liu, Q. Multimodal Neural Machine Translation for Low-resource Language Pairs using Synthetic Data. In Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP, Melbourne, Australia, 19 July 2018; pp. 33–42.
- Gururangan, S.; Marasović, A.; Swayamdipta, S.; Lo, K.; Beltagy, I.; Downey, D.; Smith, N.A. Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 6–10 July 2020; pp. 8342–8360.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017.
- Domhan, T. How much attention do you need? A granular analysis of neural machine translation architectures. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 15–20 July 2018; pp. 1799–1808.
- Ruder, S.; Peters, M.E.; Swayamdipta, S.; Wolf, T. Transfer learning in natural language processing. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Tutorials, Minneapolis, MN, USA, 2–7 June 2019; pp. 15–18.
- Bapna, A.; Firat, O. Simple, Scalable Adaptation for Neural Machine Translation. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; pp. 1538–1548.
- Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-rank adaptation of large language models. In Proceedings of the International Conference on Learning Representations (ICLR), Online, 25–29 April 2022.
- Pham, M.Q.; Crego, J.M.; Yvon, F. Multi-Domain Adaptation in Neural Machine Translation with Dynamic Sampling Strategies. In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, Ghent, Belgium, 1–3 June 2022; pp. 13–22.
- Wang, Y.; Zhai, C.; Awadalla, H.H. Multi-task Learning for Multilingual Neural Machine Translation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 8–12 November 2020; pp. 1022–1034.
- He, J.; Sun, S.; Peng, S.; Xu, J.; Jia, X.; Li, W. Contrastive Preference Learning for Neural Machine Translation. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Mexico City, Mexico, 16–21 June 2024; pp. 2724–2735.
- Thompson, B.; Gwinnup, J.; Khayrallah, H.; Duh, K.; Koehn, P. Overcoming catastrophic forgetting during domain adaptation of neural machine translation. In Proceedings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, MN, USA, 2–7 June 2019; pp. 2062–2068.
- Xu, H.; Ebner, S.; Yarmohammadi, M.; White, A.S.; Van Durme, B.; Murray, K. Gradual Fine-Tuning for Low-Resource Domain Adaptation. In Proceedings of the Second Workshop on Domain Adaptation for NLP, Kyiv, Ukraine, 19–20 April 2021; pp. 214–221.
- Zhang, X.; Shapiro, P.; Kumar, G.; McNamee, P.; Carpuat, M.; Duh, K. Curriculum Learning for Domain Adaptation in Neural Machine Translation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; pp. 1903–1915.
- Saunders, D.; DeNeefe, S. Domain adapted machine translation: What does catastrophic forgetting forget and why? In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA, 12–16 November 2024; pp. 12660–12671.
- Dakwale, P.; Monz, C. Fine-tuning for neural machine translation with limited degradation across in- and out-of-domain data. In Proceedings of the Machine Translation Summit XVI: Research Track, Nagoya, Japan, 18–22 September 2017; pp. 156–169.
- Koehn, P.; Knowles, R. Six Challenges for Neural Machine Translation. In Proceedings of the First Workshop on Neural Machine Translation, Vancouver, BC, Canada, 3–4 July 2017; pp. 28–39.
- Freitag, M.; Al-Onaizan, Y. Fast domain adaptation for neural machine translation. arXiv 2016, arXiv:1612.06897.
- Miceli-Barone, A.V.; Haddow, B.; Germann, U.; Sennrich, R. Regularization techniques for fine-tuning in neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 7–11 September 2017; pp. 1489–1494.
- Karyukin, V.; Rakhimova, D.; Karibayeva, A.; Turganbayeva, A.; Turarbek, A. The neural machine translation models for the low-resource Kazakh–English language pair. PeerJ Comput. Sci. 2023, 9, e1224.
- Ruiz, A.T.; Edman, L.; Spenader, J.; Yeshmagambetova, G. Neural machine translation for English–Kazakh with morphological segmentation and synthetic data. In Proceedings of the Fourth Conference on Machine Translation, Florence, Italy, 1–2 August 2019; pp. 386–392.
- Makazhanov, A.; Myrzakhmetov, B.; Kozhirbayev, Z. On various approaches to machine translation from Russian to Kazakh. In Proceedings of the 5th International Conference on Turkic Languages Processing (TurkLang 2017), Kazan, Russia, 18–21 October 2017; pp. 195–209.
- Littell, P.; Lo, C.-k.; Larkin, S.; Stewart, D. Multi-source transformer for Kazakh-Russian-English neural machine translation. In Proceedings of the Fourth Conference on Machine Translation, Florence, Italy, 1–2 August 2019; pp. 267–274.
- Sundetova, A.; Forcada, M.; Tyers, F. A free/open-source machine translation system from English to Kazakh. In Proceedings of the International Conference on Turkic Languages Processing (TurkLang 2015), Kazan, Russia, 17–19 September 2015; pp. 78–90.
- Rakhimova, D.; Karibayeva, A. Aligning and extending technologies of parallel corpora for the Kazakh language. East.-Eur. J. Enterp. Technol. 2022, 4, 32–39.
- Wang, M.; Hu, F. The application of the NLTK library for Python natural language processing in corpus research. Theory Pract. Lang. Stud. 2021, 11, 1041–1049.
- Koehn, P.; Hoang, H.; Birch, A.; Callison-Burch, C.; Federico, M.; Bertoldi, N.; Cowan, B.; Shen, W.; Moran, C.; Zens, R.; et al. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, Prague, Czech Republic, 23–30 June 2007; pp. 177–180.
- Bañón, M.; Chen, P.; Haddow, B.; Heafield, K.; Hoang, H.; Esplà-Gomis, M.; Forcada, M.; Kamran, A.; Kirefu, F.; Koehn, P.; et al. ParaCrawl: Web-scale acquisition of parallel corpora. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 4555–4567.
- Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.J. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 7–12 July 2002; pp. 311–318.
- Popović, M. chrF: Character n-gram F-score for automatic MT evaluation. In Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisbon, Portugal, 17–18 September 2015; pp. 392–395.
- Snover, M.; Madnani, N.; Dorr, B.; Schwartz, R. Fluency, adequacy, or HTER? Exploring different human judgments with a tunable MT metric. In Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece, 30–31 March 2009; pp. 259–268.
- Banerjee, S.; Lavie, A. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, Ann Arbor, MI, USA, June 2005; pp. 65–72.
- Koehn, P. Statistical Significance Tests for Machine Translation Evaluation. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain, 25–26 July 2004; pp. 388–395.
IT-domain corpus statistics; paired values (a/b) refer to the first/second language of each pair.
Pair | Lines | Sentences | Tokens | Types
---|---|---|---|---
KK ↔ EN | 50,000 | 49,849/49,854 | 911,578/1,112,194 | 57,506/30,190 |
KK ↔ RU | 50,000 | 49,849/49,845 | 911,578/1,027,658 | 57,506/64,148 |
EN ↔ RU | 50,000 | 49,854/49,845 | 1,112,194/1,027,658 | 30,190/64,148 |
IT-domain dataset splits (sentence pairs per direction).
Pair | Train | Valid | Test
---|---|---|---
KK ↔ EN | 40,705 | 4600 | 4695 |
KK ↔ RU | 40,705 | 4600 | 4695 |
EN ↔ RU | 40,705 | 4600 | 4695 |
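Since every IT-domain record is a Kazakh–English–Russian triplet, the six directed training pairs follow mechanically from one aligned file. The sketch below illustrates that flattening step; the TSV file name and column names are assumptions for illustration.

```python
# Minimal sketch: expanding KK-EN-RU triplets into six directed translation pairs.
# The file name and column layout are assumptions, not the paper's actual format.
import csv
from itertools import permutations

LANGS = ("kk", "en", "ru")

def triplets_to_pairs(tsv_path: str):
    """Yield (src_lang, tgt_lang, src_text, tgt_text) for all 6 directions."""
    with open(tsv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            for src, tgt in permutations(LANGS, 2):
                yield src, tgt, row[src], row[tgt]

# Example: the 40,705-triplet train split would yield 6 * 40,705 directed examples.
# pairs = list(triplets_to_pairs("it_domain_train.tsv"))
```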
KazParC dataset splits (sentence pairs per direction).
Pair | Train | Valid | Test
---|---|---|---
KK ↔ EN | 290,877 | 72,719 | 4750 |
KK ↔ RU | 290,785 | 72,697 | 4750 |
KK ↔ TR | 289,720 | 72,430 | 4750 |
EN ↔ RU | 290,764 | 72,692 | 4750 |
EN ↔ TR | 289,913 | 72,479 | 4750 |
RU ↔ TR | 290,899 | 72,725 | 4750 |
Pair | Train | Valid |
---|---|---|
KK ↔ EN | 1,608,345 | 178,705 |
KK ↔ RU | 1,608,703 | 178,745 |
KK ↔ TR | 1,612,282 | 179,143 |
EN ↔ RU | 1,606,061 | 178,452 |
EN ↔ TR | 1,609,707 | 178,857 |
RU ↔ TR | 1,609,224 | 178,803 |
Dataset | Pair | Train | Valid |
---|---|---|---|
NLA | KK ↔ RU | 400,000 | 50,000 |
KazNU | KK ↔ EN | 400,000 | 50,000 |
ParaCrawl | EN ↔ RU | 400,000 | 50,000 |
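The LangPair corpus is thus a non-triplet mixture of three independent bilingual sources. The sketch below shows one way to assemble such a mixture with the Hugging Face datasets library; all file paths and column names are illustrative assumptions.

```python
# Minimal sketch: building the non-triplet LangPair mixture from three
# independent bilingual sources. Paths and column names are illustrative.
from datasets import Dataset, concatenate_datasets

def load_bitext(src_path, tgt_path, src_lang, tgt_lang):
    """Load two line-aligned plain-text files into a dataset of directed pairs."""
    with open(src_path, encoding="utf-8") as fs, open(tgt_path, encoding="utf-8") as ft:
        src_lines = [line.strip() for line in fs]
        tgt_lines = [line.strip() for line in ft]
    assert len(src_lines) == len(tgt_lines), "bitext files must be aligned"
    return Dataset.from_dict({
        "src": src_lines, "tgt": tgt_lines,
        "src_lang": [src_lang] * len(src_lines),
        "tgt_lang": [tgt_lang] * len(tgt_lines),
    })

langpair_train = concatenate_datasets([
    load_bitext("nla.kk", "nla.ru", "kaz_Cyrl", "rus_Cyrl"),              # NLA, KK-RU
    load_bitext("kaznu.kk", "kaznu.en", "kaz_Cyrl", "eng_Latn"),          # KazNU, KK-EN
    load_bitext("paracrawl.en", "paracrawl.ru", "eng_Latn", "rus_Cyrl"),  # ParaCrawl, EN-RU
]).shuffle(seed=42)
```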
Experiment | Training Data | Validation Data | Test Data | Setup |
---|---|---|---|---|
Exp-1 | IT-domain train set | IT-domain valid set | IT-domain/KazParC | In-domain |
Exp-2 | None (base NLLB-200-distilled-1.3B model) | None | IT-domain/KazParC | Out-of-domain |
Exp-3 | KazParC train set | KazParC valid set | IT-domain/KazParC | Out-of-domain |
Exp-4 | LangPair train set | LangPair valid set | IT-domain/KazParC | Out-of-domain |
Exp-5 | KazParC + LangPair train set | KazParC + LangPair valid set | IT-domain/KazParC | Out-of-domain |
S-Exp-1 | IT-domain + KazParC train set | IT-domain + KazParC valid set | IT-domain/KazParC | Single-stage |
S-Exp-2 | IT-domain + KazParC + LangPair train set | IT-domain + KazParC + LangPair valid set | IT-domain/KazParC | Single-stage |
S-Exp-3 | IT-domain + LangPair train set | IT-domain + LangPair valid set | IT-domain/KazParC | Single-stage |
D-Exp-1 | Stage 1: KazParC train set; Stage 2: IT-domain train set | Stage 1: KazParC valid set; Stage 2: IT-domain valid set | IT-domain/KazParC | Two-stage |
D-Exp-2 | Stage 1: KazParC + LangPair train set; Stage 2: IT-domain train set | Stage 1: KazParC + LangPair valid set; Stage 2: IT-domain valid set | IT-domain/KazParC | Two-stage |
D-Exp-3 | Stage 1: LangPair train set; Stage 2: IT-domain train set | Stage 1: LangPair valid set; Stage 2: IT-domain valid set | IT-domain/KazParC | Two-stage |
Google | None (Google Translate) | None | IT-domain/KazParC | Comparisons
Yandex | None (Yandex Translate) | None | IT-domain/KazParC | Comparisons |
Experiment | Batch Size (Per-Device Train/Eval) | Evaluation Steps | Epochs | Duration (Hours) |
---|---|---|---|---|
Exp-1 | 6 | 100 | 30 | 43.47 |
Exp-3 | 3 | 5000 | 3 | 137.83 |
Exp-4 | 3 | 5000 | 10 | 97.06 |
Exp-5 | 2 | 5000 | 2 | 183.91 |
S-Exp-1 | 2 | 5000 | 2 | 155.74 |
S-Exp-2 | 1 | 10,000 | 2 | 369.65 |
S-Exp-3 | 4 | 1000 | 2 | 20.58 |
D-Exp-1 (Stage 1) | 3 | 5000 | 3 | 137.83 |
D-Exp-1 (Stage 2) | 6 | 200 | 30 | 38.37 |
D-Exp-2 (Stage 1) | 2 | 5000 | 2 | 183.91 |
D-Exp-2 (Stage 2) | 6 | 500 | 30 | 38.04 |
D-Exp-3 (Stage 1) | 4 | 1000 | 2 | 97.06 |
D-Exp-3 (Stage 2) | 6 | 500 | 30 | 37.93 |
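The two-stage (D-Exp) runs chain two such configurations: a general-domain stage followed by IT-domain specialization of the resulting checkpoint. The sketch below wires up the D-Exp-1 values from the table above, assuming the Hugging Face Seq2SeqTrainer and pre-tokenized, padded datasets; settings not listed in the table (learning rate, warmup, etc.) are left at library defaults for illustration.

```python
# Minimal sketch: two-stage fine-tuning (D-Exp-1) with the hyperparameters
# listed in the table above. Datasets are placeholders for tokenized corpora;
# unlisted settings stay at transformers defaults for illustration.
from transformers import (AutoModelForSeq2SeqLM, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

def run_stage(model, train_ds, valid_ds, out_dir, batch, eval_steps, epochs):
    args = Seq2SeqTrainingArguments(
        output_dir=out_dir,
        per_device_train_batch_size=batch,
        per_device_eval_batch_size=batch,
        evaluation_strategy="steps",  # renamed eval_strategy in newer releases
        eval_steps=eval_steps,
        num_train_epochs=epochs,
    )
    Seq2SeqTrainer(model=model, args=args,
                   train_dataset=train_ds, eval_dataset=valid_ds).train()
    return model

# Placeholders: replace with tokenized versions of the corpora tabulated above.
kazparc_train = kazparc_valid = it_train = it_valid = None

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-1.3B")
# Stage 1: general-domain adaptation on KazParC (batch 3, eval every 5000, 3 epochs).
model = run_stage(model, kazparc_train, kazparc_valid, "d_exp1_stage1", 3, 5000, 3)
# Stage 2: IT-domain specialization (batch 6, eval every 200, 30 epochs).
model = run_stage(model, it_train, it_valid, "d_exp1_stage2", 6, 200, 30)
```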
Evaluation on the IT-domain test set; each cell lists BLEU / chrF / METEOR / TER (TER: lower is better).
Pair | Exp-1 | Exp-2 | Exp-3 | Exp-4 | Exp-5 | S-Exp-1 | S-Exp-2 | S-Exp-3 | D-Exp-1 | D-Exp-2 | D-Exp-3 | Google | Yandex
---|---|---|---|---|---|---|---|---|---|---|---|---|---
en-kk | 0.44 / 0.72 / 0.71 / 0.25 | 0.13 / 0.53 / 0.61 / 0.29 | 0.22 / 0.61 / 0.62 / 0.16 | 0.12 / 0.52 / 0.57 / 0.35 | 0.23 / 0.62 / 0.74 / 0.17 | 0.48 / 0.73 / 0.75 / 0.18 | 0.48 / 0.72 / 0.76 / 0.19 | 0.41 / 0.69 / 0.72 / 0.35 | 0.44 / 0.71 / 0.73 / 0.24 | 0.44 / 0.70 / 0.70 / 0.24 | 0.42 / 0.70 / 0.70 / 0.26 | 0.48 / 0.78 / 0.75 / 0.33 | 0.43 / 0.76 / 0.73 / 0.34
en-ru | 0.48 / 0.74 / 0.71 / 0.25 | 0.30 / 0.62 / 0.70 / 0.26 | 0.38 / 0.67 / 0.71 / 0.12 | 0.29 / 0.64 / 0.66 / 0.32 | 0.38 / 0.70 / 0.70 / 0.14 | 0.52 / 0.76 / 0.77 / 0.17 | 0.51 / 0.76 / 0.77 / 0.18 | 0.51 / 0.71 / 0.73 / 0.32 | 0.49 / 0.76 / 0.72 / 0.25 | 0.48 / 0.74 / 0.72 / 0.25 | 0.48 / 0.73 / 0.71 / 0.26 | 0.51 / 0.76 / 0.75 / 0.33 | 0.53 / 0.77 / 0.77 / 0.32
kk-en | 0.38 / 0.69 / 0.79 / 0.33 | 0.25 / 0.56 / 0.69 / 0.40 | 0.31 / 0.64 / 0.73 / 0.41 | 0.24 / 0.57 / 0.68 / 0.42 | 0.32 / 0.63 / 0.73 / 0.42 | 0.46 / 0.73 / 0.78 / 0.35 | 0.46 / 0.73 / 0.78 / 0.35 | 0.44 / 0.72 / 0.77 / 0.52 | 0.46 / 0.75 / 0.80 / 0.33 | 0.48 / 0.75 / 0.80 / 0.33 | 0.48 / 0.75 / 0.79 / 0.33 | 0.47 / 0.73 / 0.79 / 0.34 | 0.34 / 0.65 / 0.71 / 0.46
kk-ru | 0.39 / 0.70 / 0.70 / 0.40 | 0.22 / 0.57 / 0.62 / 0.49 | 0.30 / 0.64 / 0.71 / 0.39 | 0.25 / 0.59 / 0.61 / 0.50 | 0.29 / 0.62 / 0.70 / 0.40 | 0.46 / 0.73 / 0.72 / 0.37 | 0.46 / 0.73 / 0.72 / 0.38 | 0.42 / 0.71 / 0.69 / 0.50 | 0.44 / 0.72 / 0.71 / 0.39 | 0.44 / 0.72 / 0.71 / 0.39 | 0.43 / 0.72 / 0.71 / 0.40 | 0.39 / 0.69 / 0.67 / 0.44 | 0.41 / 0.70 / 0.69 / 0.41
ru-en | 0.42 / 0.71 / 0.81 / 0.29 | 0.33 / 0.64 / 0.75 / 0.27 | 0.41 / 0.71 / 0.76 / 0.37 | 0.27 / 0.65 / 0.70 / 0.31 | 0.42 / 0.70 / 0.77 / 0.37 | 0.50 / 0.75 / 0.80 / 0.31 | 0.50 / 0.76 / 0.81 / 0.31 | 0.49 / 0.75 / 0.80 / 0.32 | 0.52 / 0.77 / 0.81 / 0.30 | 0.52 / 0.77 / 0.81 / 0.30 | 0.52 / 0.77 / 0.81 / 0.30 | 0.51 / 0.76 / 0.81 / 0.30 | 0.47 / 0.74 / 0.79 / 0.34
ru-kk | 0.41 / 0.74 / 0.72 / 0.37 | 0.16 / 0.55 / 0.57 / 0.36 | 0.22 / 0.56 / 0.73 / 0.35 | 0.18 / 0.52 / 0.57 / 0.54 | 0.23 / 0.63 / 0.73 / 0.36 | 0.47 / 0.77 / 0.75 / 0.34 | 0.47 / 0.77 / 0.75 / 0.34 | 0.41 / 0.74 / 0.71 / 0.35 | 0.44 / 0.76 / 0.73 / 0.36 | 0.44 / 0.76 / 0.73 / 0.36 | 0.43 / 0.75 / 0.72 / 0.37 | 0.39 / 0.73 / 0.69 / 0.40 | 0.39 / 0.74 / 0.70 / 0.40
Evaluation on the KazParC test set; same cell format (BLEU / chrF / METEOR / TER).
Pair | Exp-1 | Exp-2 | Exp-3 | Exp-4 | Exp-5 | S-Exp-1 | S-Exp-2 | S-Exp-3 | D-Exp-1 | D-Exp-2 | D-Exp-3 | Google | Yandex
---|---|---|---|---|---|---|---|---|---|---|---|---|---
en-kk | 0.14 / 0.55 / 0.46 / 0.65 | 0.12 / 0.51 / 0.61 / 0.49 | 0.22 / 0.61 / 0.53 / 0.56 | 0.12 / 0.52 / 0.43 / 0.76 | 0.21 / 0.61 / 0.53 / 0.56 | 0.21 / 0.61 / 0.53 / 0.56 | 0.22 / 0.61 / 0.53 / 0.56 | 0.13 / 0.53 / 0.44 / 0.72 | 0.12 / 0.59 / 0.51 / 0.59 | 0.19 / 0.59 / 0.51 / 0.59 | 0.14 / 0.55 / 0.46 / 0.65 | 0.30 / 0.65 / 0.61 / 0.59 | 0.18 / 0.58 / 0.52 / 0.65
en-ru | 0.30 / 0.65 / 0.61 / 0.50 | 0.32 / 0.65 / 0.80 / 0.26 | 0.39 / 0.70 / 0.67 / 0.42 | 0.30 / 0.61 / 0.59 / 0.55 | 0.38 / 0.70 / 0.66 / 0.42 | 0.38 / 0.70 / 0.66 / 0.42 | 0.39 / 0.70 / 0.66 / 0.42 | 0.32 / 0.65 / 0.62 / 0.50 | 0.35 / 0.67 / 0.64 / 0.46 | 0.34 / 0.67 / 0.64 / 0.46 | 0.30 / 0.64 / 0.61 / 0.51 | 0.41 / 0.71 / 0.74 / 0.53 | 0.39 / 0.70 / 0.75 / 0.51
kk-en | 0.23 / 0.56 / 0.59 / 0.60 | 0.24 / 0.56 / 0.69 / 0.50 | 0.33 / 0.63 / 0.67 / 0.49 | 0.26 / 0.57 / 0.60 / 0.61 | 0.32 / 0.63 / 0.66 / 0.50 | 0.33 / 0.63 / 0.66 / 0.49 | 0.32 / 0.63 / 0.66 / 0.50 | 0.26 / 0.58 / 0.60 / 0.60 | 0.27 / 0.59 / 0.62 / 0.56 | 0.27 / 0.59 / 0.62 / 0.56 | 0.23 / 0.56 / 0.59 / 0.61 | 0.31 / 0.62 / 0.62 / 0.58 | 0.28 / 0.60 / 0.65 / 0.60
kk-ru | 0.20 / 0.57 / 0.52 / 0.64 | 0.22 / 0.57 / 0.62 / 0.49 | 0.30 / 0.64 / 0.60 / 0.52 | 0.24 / 0.59 / 0.55 / 0.58 | 0.29 / 0.64 / 0.60 / 0.52 | 0.29 / 0.64 / 0.60 / 0.52 | 0.30 / 0.64 / 0.60 / 0.52 | 0.24 / 0.59 / 0.55 / 0.58 | 0.24 / 0.60 / 0.56 / 0.59 | 0.24 / 0.60 / 0.55 / 0.59 | 0.20 / 0.56 / 0.52 / 0.64 | 0.29 / 0.61 / 0.61 / 0.53 | 0.29 / 0.63 / 0.62 / 0.61
ru-en | 0.32 / 0.63 / 0.66 / 0.48 | 0.35 / 0.64 / 0.73 / 0.41 | 0.43 / 0.71 / 0.74 / 0.38 | 0.25 / 0.56 / 0.59 / 0.61 | 0.43 / 0.71 / 0.74 / 0.38 | 0.42 / 0.71 / 0.74 / 0.38 | 0.42 / 0.71 / 0.74 / 0.38 | 0.34 / 0.62 / 0.66 / 0.46 | 0.36 / 0.67 / 0.70 / 0.44 | 0.35 / 0.66 / 0.69 / 0.44 | 0.31 / 0.61 / 0.65 / 0.46 | 0.42 / 0.71 / 0.74 / 0.43 | 0.43 / 0.71 / 0.69 / 0.44
ru-kk | 0.14 / 0.54 / 0.45 / 0.66 | 0.15 / 0.56 / 0.57 / 0.55 | 0.23 / 0.63 / 0.56 / 0.54 | 0.17 / 0.58 / 0.49 / 0.66 | 0.23 / 0.63 / 0.55 / 0.54 | 0.23 / 0.62 / 0.55 / 0.55 | 0.23 / 0.63 / 0.56 / 0.54 | 0.16 / 0.57 / 0.48 / 0.62 | 0.19 / 0.59 / 0.51 / 0.59 | 0.18 / 0.59 / 0.51 / 0.60 | 0.14 / 0.55 / 0.46 / 0.66 | 0.24 / 0.62 / 0.55 / 0.62 | 0.23 / 0.62 / 0.56 / 0.64
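All four scores in these tables are standard corpus-level metrics and can be reproduced with common tooling. The sketch below assumes the sacrebleu and evaluate packages and rescales scores to the 0–1 range used in the tables.

```python
# Minimal sketch: corpus-level BLEU / chrF / METEOR / TER, scaled to 0-1
# to match the tables above. Assumes the sacrebleu and evaluate packages.
import sacrebleu
import evaluate

def score_corpus(hypotheses, references):
    """hypotheses: list[str]; references: list[str] (one reference per sentence)."""
    bleu = sacrebleu.corpus_bleu(hypotheses, [references]).score / 100
    chrf = sacrebleu.corpus_chrf(hypotheses, [references]).score / 100
    ter = sacrebleu.corpus_ter(hypotheses, [references]).score / 100
    meteor = evaluate.load("meteor").compute(
        predictions=hypotheses, references=references)["meteor"]
    return {"BLEU": bleu, "chrF": chrf, "METEOR": meteor, "TER": ter}

print(score_corpus(
    ["Artificial intelligence transforms the IT industry."],
    ["Artificial Intelligence is Transforming the IT Industry"],
))
```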
Statistical significance of selected BLEU differences (paired bootstrap resampling); the CI columns give the confidence interval for the first and second system in each comparison.
Comparison | Direction | BLEU Difference | p-Value | CI (System 1) | CI (System 2)
---|---|---|---|---|---
S-Exp-1 vs. D-Exp-1 | EN → KK | 0.03 | 0.021 | [0.45, 0.51] | [0.42, 0.48] |
S-Exp-1 vs. D-Exp-1 | KK → RU | 0.02 | 0.048 | [0.46, 0.50] | [0.44, 0.48] |
S-Exp-2 vs. D-Exp-2 | RU → EN | 0.02 | 0.067 | [0.49, 0.53] | [0.51, 0.55] |
Exp-1 vs. Exp-4 | EN → KK | 0.05 | 0.008 | [0.45, 0.51] | [0.40, 0.46] |
Exp-1 vs. Exp-4 | KK → RU | 0.04 | 0.012 | [0.46, 0.50] | [0.42, 0.46] |
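The p-values and confidence intervals above follow the paired bootstrap resampling procedure of Koehn (2004): test sentences are resampled with replacement, BLEU is recomputed per resample, and the distribution of score differences yields the significance estimate. A minimal sketch, assuming sacrebleu and NumPy, with an illustrative 1000 resamples and 95% interval:

```python
# Minimal sketch: paired bootstrap resampling for BLEU (Koehn, 2004).
# Assumes sacrebleu and numpy; 1000 resamples and a 95% CI are illustrative.
import numpy as np
import sacrebleu

def paired_bootstrap(hyp_a, hyp_b, refs, n_resamples=1000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(refs)

    def bleu(hyp, idx):
        return sacrebleu.corpus_bleu(
            [hyp[i] for i in idx], [[refs[i] for i in idx]]).score / 100

    diffs, scores_a = [], []
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)   # resample sentences with replacement
        a, b = bleu(hyp_a, idx), bleu(hyp_b, idx)
        scores_a.append(a)
        diffs.append(a - b)
    diffs = np.array(diffs)
    p_value = float(np.mean(diffs <= 0))   # P(system A is not better than B)
    ci_a = (float(np.percentile(scores_a, 2.5)),
            float(np.percentile(scores_a, 97.5)))
    return {"mean_diff": float(diffs.mean()), "p_value": p_value, "ci_a": ci_a}
```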
| Direction | Source Text | Reference Text | Model | Translated Text | BLEU↑ | chrF↑ | METEOR↑ | TER↓ |
|---|---|---|---|---|---|---|---|---|
| EN → KK | Deep learning algorithms optimize neural networks. | Терең оқыту алгоритмдері нейрондық желілерді оңтайландырады. | S-Exp-1 | Терең оқыту алгоритмдері нейрондық желілерді оңтайландырады. | 1.00 | 1.00 | 1.00 | 0.00 |
| | | | Exp-4 | Терең білім алгоритмдері желілерді жақсартады. | 0.10 | 0.42 | 0.25 | 0.50 |
| KK → RU | Бағдарламалық жасақтама әзірлеу тиімділігін арттырады. | Повышает эффективность разработки программного обеспечения. | S-Exp-1 | Разработка программного обеспечения повышает эффективность. | 0.19 | 0.83 | 0.20 | 1.00 |
| | | | Exp-4 | Создание программ улучшает производительность. | 0.09 | 0.25 | 0.00 | 1.00 |
| RU → EN | Искусственный интеллект преобразует ИТ-индустрию. | Artificial Intelligence is Transforming the IT Industry | S-Exp-1 | Artificial intelligence transforms the IT industry. | 0.15 | 0.63 | 0.70 | 0.43 |
| | | | Exp-4 | Artificial intellect changes IT sphere. | 0.08 | 0.31 | 0.15 | 0.71 |