Generative AI for Code Translation: A Systematic Mapping Study †
Abstract
1. Introduction
2. Research Methodology
- Inclusion Criteria:
- IC1: Only fully published research papers from conferences, books or journals.
- IC2: The paper must focus primarily on Generative AI for code translation, covering topics such as source code transformation, code migration, or AI-driven programming language conversion.
- IC3: Only English-language papers are considered.
- Exclusion Criteria:
- EX1: Publication Type: Short papers, posters, and editorials are excluded.
- EX2: Studies that do not focus on Generative AI applied to Code Translation are excluded.
- EX3: Identical studies found in multiple databases are filtered out, retaining only the most complete version.
- RQ1: For every selected paper, we captured key contextual information such as the document type (journal article, conference paper, or book chapter), the publication source (name of the publisher or digital repository), and the date of publication [6].
- RQ2: We identified the publication types (journal articles, conference papers, and book chapters) that appeared most frequently and significantly within this branch of research [8].
- RQ3: Each paper was classified according to the primary existing research types:
- Evaluation Research (ER): Papers that analyze how well existing Generative AI models for code translation perform by measuring them against established benchmarks [7].
- Solution Proposal (SP): Papers that either develop new Generative AI techniques for code translation or significantly enhance existing ones.
- Experience Papers (EP): Papers in which the authors report on their experiences and the difficulties they faced when applying AI models to translate code between languages [2].
- Review: Papers that synthesize the most detailed and up-to-date information on the use of Generative AI in code translation [2].
- RQ4: We recorded the AI models and architectures most frequently cited in the literature, which include transformer-based models, LSTM models, autoregressive models, and large language models.
- RQ5: We collected information on the datasets most commonly used for training and evaluating Generative AI models for code translation, including dataset names, providers, sizes, and the reasons for their use [5].
- RQ6: We classified the type of empirical research as a survey, a history-based evaluation (HBE), or a case study [9].
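The classification scheme above can be sketched as a simple extraction record; the field and facet names below are illustrative, not the authors' actual extraction form:

```python
from dataclasses import dataclass
from enum import Enum


class ResearchType(Enum):
    EVALUATION = "ER"        # Evaluation Research
    SOLUTION = "SP"          # Solution Proposal
    EXPERIENCE = "EP"        # Experience Paper
    REVIEW = "Review"


class EmpiricalType(Enum):
    SURVEY = "Survey"
    HBE = "HBE"              # history-based evaluation
    CASE_STUDY = "Case study"


@dataclass
class ExtractionRecord:
    """One row of the data extracted per selected paper."""
    year: int                 # publication year
    pub_type: str             # journal article / conference paper / book chapter
    research_type: ResearchType
    models: list[str]         # e.g. transformer-based, LSTM, autoregressive, LLM
    datasets: list[str]       # e.g. CodeNet, CodeSearchNet
    empirical_type: EmpiricalType
```

Representing the facets as enums keeps the classification closed-ended, which is what makes the counts in the results tables directly comparable across papers.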
3. Results and Discussion
3.1. Overview of the Selected Studies
3.2. RQ1: What Are the Years of Publication for Research on Generative AI for Code Translation?
3.3. RQ2: What Types of Publications Contribute the Most to This Research Area?
3.4. RQ3: What Are the Research Types in Studies on Generative AI for Code Translation?
3.5. RQ4: What Are the Most Commonly Used AI Models and Architectures for Code Translation?
3.6. RQ5: Which Datasets Are Frequently Used for Training and Evaluating Generative AI Models for Code Translation?
3.7. RQ6: What Types of Empirical Studies Have Been Conducted in the Application of Generative AI for Code Translation?
4. Threats to Validity
- Study Selection: To identify the relevant papers for this mapping study, we formulated a search string aligned with our research questions, which we used to conduct an automatic search across different digital databases related to this line of work. We then applied specific selection criteria to filter the results. However, some relevant studies may not have been captured through this search process. To address this limitation, we reviewed the reference lists of the selected papers to identify and include additional pertinent studies.
- Publication Bias: A common tendency in the literature is for researchers to report only positive findings and to emphasize the superior performance of their proposed models, which may lead to an overestimation of their actual effectiveness. To mitigate this threat to validity, we introduced an inclusion criterion favoring studies that compare the proposed models against alternative approaches, including underperforming ones.
- Data Extraction Bias: To minimize the risk of inaccuracies during this process, two authors independently assessed each selected paper and extracted the necessary information pertinent to addressing the research questions.
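As a hedged sketch of this dual-extraction protocol (the comparison logic is illustrative; the paper does not describe specific tooling), the fields on which the two independent extractors disagree can be flagged for joint resolution:

```python
def flag_disagreements(extractor_a: dict, extractor_b: dict) -> list[str]:
    """Return the extraction fields on which the two authors disagree.

    Each argument maps a field name (e.g. "research_type", "datasets")
    to the value one author recorded for a given paper. Fields present
    in only one record also count as disagreements.
    """
    fields = extractor_a.keys() | extractor_b.keys()
    return sorted(f for f in fields if extractor_a.get(f) != extractor_b.get(f))
```

Any field returned by this check would then be resolved by discussion between the two authors before the record enters the final dataset.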
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Macedo, M.; Tian, Y.; Cogo, F.R.; Adams, B. Exploring the Impact of the Output Format on the Evaluation of Large Language Models for Code Translation. In Proceedings of the International Conference on AI Foundation Models and Software Engineering (Forge), Lisbon, Portugal, 14 April 2024. [Google Scholar]
- Luo, Y.; Yu, R.; Zhang, F.; Liang, L.; Xiong, Y. Bridging Gaps in LLM Code Translation: Reducing Errors with Call Graphs and Bridged Debuggers. In Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (ASE), Sacramento, CA, USA, 27 October–1 November 2024. [Google Scholar]
- Gandhi, S.; Patwardhan, M.; Khatri, J.; Vig, L.; Medicherla, R.K. Translation of Low-Resource COBOL to Logically Correct and Readable Java leveraging High-Resource Java Refinement. In Proceedings of the 1st International Workshop on Large Language Models for Code (LLM4Code), Lisbon, Portugal, 20 April 2024. [Google Scholar]
- Huang, D.; Liu, X.; Wei, J.; Jiang, L. A Survey on Neural Code Translation. In Proceedings of the International Conference on Neural Information Processing (ICONIP), Auckland, New Zealand, 2–6 December 2024; Springer: Berlin/Heidelberg, Germany, 2024. [Google Scholar]
- Li, T.R.; Zhang, H.; Chen, P.; Qian, X. A Comparative Study of LLMs in Code Translation. In ACM Conference on Software Language Engineering (SLE); ACM: New York, NY, USA, 2024. [Google Scholar]
- Petersen, K.; Feldt, R.; Mujtaba, S.; Mattsson, M. Systematic Mapping Studies in Software Engineering. In Proceedings of the 12th International Conference on Evaluation and Assessment in Software Engineering (EASE), Bari, Italy, 26–27 June 2008. [Google Scholar]
- Fang, D.; Song, P.; Zhang, Z. LLM-Powered Code Translation with Semantic Preservation. In Proceedings of the International Conference on Software Engineering (ICSE), Lisbon, Portugal, 14–20 April 2024. [Google Scholar]
- Kitchenham, B.; Charters, S. Guidelines for Performing Systematic Literature Reviews in Software Engineering; EBSE: Durham, UK, 2007. [Google Scholar]
- Fernández, D.M.; Passoth, J.-H. Empirical software engineering: From discipline to interdiscipline. J. Syst. Softw. 2019, 148, 170–179. [Google Scholar] [CrossRef]
- Oliveira, F.; Costa, A.; Barreto, R.; Quintero, S. LLM-Coded: A Code Translation Framework for Improving Software Compatibility. In Proceedings of the International Conference on Software Maintenance and Evolution (ICSME), Flagstaff, AZ, USA, 6–11 October 2024. [Google Scholar]
- He, J.; Yang, Z.; Sun, L.; Tang, Q. Cross-Language Code Translation via LLM-Based Transfer Learning. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Jeju, Korea, 3–9 August 2024. [Google Scholar]
- Zhu, M.; Karim, M.; Lourentzou, I.; Yao, D.D. Semi-Supervised Code Translation Overcoming the Scarcity of Parallel Code Data. In Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (ASE), Sacramento, CA, USA, 27 October–1 November 2024. [Google Scholar]
- Lei, B.; Ding, C.; Chen, L.; Lin, P.-H.; Liao, C. Creating a Dataset for High-Performance Computing Code Translation using LLMs: A Bridge Between OpenMP Fortran and C++. In Proceedings of the High Performance Extreme Computing Conference (HPEC), Boston, MA, USA, 25–29 September 2023. [Google Scholar]
- Feng, Z.; Guo, D.; Tang, D.; Duan, N.; Feng, X.; Gong, M.; Shou, L.; Qin, B.; Liu, T.; Jiang, D.; et al. CodeBERT: A Pre-Trained Model for Programming and Natural Languages. arXiv 2020, arXiv:2002.08155. [Google Scholar]
- Husain, H.; Wu, H.-H.; Gazit, T.; Allamanis, M.; Brockschmidt, M. CodeSearchNet Challenge: Evaluating the State of Semantic Code Search. arXiv 2019, arXiv:1909.09436. [Google Scholar]
- Puri, R.; Kung, D.S.; Janssen, G.; Zhang, W.; Domeniconi, G.; Zolotov, V.; Dolby, J.; Chen, J.; Choudhury, M.; Decker, L.; et al. CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks. arXiv 2021, arXiv:2105.12655. [Google Scholar]
- Guo, D.; Ren, S.; Lu, S.; Feng, Z.; Tang, D.; Liu, S.; Zhou, L.; Duan, N.; Svyatkovskiy, A.; Fu, S.; et al. GraphCodeBERT: Pre-training Code Representations with Data Flow. arXiv 2020, arXiv:2009.08366. [Google Scholar]
- Rozière, B.; Lachaux, M.-A.; Chanussot, L.; Lample, G. Unsupervised Translation of Programming Languages. arXiv 2020, arXiv:2006.03511. [Google Scholar] [CrossRef]
- Xu, F.F.; Jiang, Z.; Yin, P.; Vasilescu, B.; Neubig, G. Incorporating External Knowledge through Pre-training for Natural Language to Code Generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 6045–6052. [Google Scholar]
- Xiao, C.; Tang, M.; Zhao, W. Deep Transfer Learning for Code Translation Across Languages. In Proceedings of the ACM Conference on Knowledge Discovery and Data Mining (KDD), Barcelona, Spain, 25–29 August 2024; ACM: New York, NY, USA, 2024. [Google Scholar]




| ID | Research Question | Rationale |
|---|---|---|
| RQ1 | What are the years of publication for research on Generative AI for Code Translation? | Distinct periods of heightened interest and chronological patterns of research activity were revealed. |
| RQ2 | What types of publications contribute the most to this research area? | Determines whether certain publication outlets have consistently hosted work in this area. |
| RQ3 | What are the research types in studies on Generative AI for Code Translation? | Identifies the different types of studies focusing on Generative AI for Code Translation. |
| RQ4 | What are the most commonly used AI models and architectures for Code Translation? | Indicates which models and architectures are most prominently utilised in Generative AI for Code Translation. |
| RQ5 | Which datasets are frequently used for training and evaluating Generative AI models for Code Translation? | Identifies key data resources leveraged by researchers, offering insight into how model performance is benchmarked. |
| RQ6 | What types of empirical studies have been conducted in the application of Generative AI for Code Translation? | Categorises the investigations as observational, experimental, or case-based. |
| Publication Year | Number of Papers | Percentage (%) |
|---|---|---|
| 2020 | 1 | 1.89 |
| 2021 | 3 | 5.66 |
| 2022 | 6 | 11.32 |
| 2023 | 10 | 18.87 |
| 2024 | 29 | 54.72 |
| 2025 | 4 | 7.55 |
| Total | 53 | 100 |
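The percentage column above is each year's share of the 53 selected studies; a quick sketch of the computation behind it:

```python
# Paper counts per publication year, as reported in the table above.
counts = {2020: 1, 2021: 3, 2022: 6, 2023: 10, 2024: 29, 2025: 4}

total = sum(counts.values())  # 53 selected studies in total
percentages = {year: round(100 * n / total, 2) for year, n in counts.items()}
# 2024 alone accounts for round(100 * 29 / 53, 2) == 54.72 percent
```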
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Rgaguena, A.; Chlioui, I.; Radgui, M. Generative AI for Code Translation: A Systematic Mapping Study. Eng. Proc. 2025, 112, 33. https://doi.org/10.3390/engproc2025112033

