Cross-Lingual Sentiment Classification in Sustainable Mobility: A Zero-Shot Domain Transfer Evaluation Framework
Abstract
1. Introduction
2. Materials and Methods
2.1. Dataset and Multilingual Translation Pipeline
2.2. Zero-Shot Sentiment Classification Model
2.3. Evaluation Protocol
- Distributional coherence analysis: We assess whether the predicted sentiment class proportions remain consistent across languages. Significant deviations from the source-language distribution may indicate systematic translation drift or language-specific model bias.
- Confidence-based evaluation: We analyze average confidence scores per sentiment class and per language as a proxy for model certainty. In the absence of gold-standard annotations, this metric provides an interpretable diagnostic signal of model behavior without requiring labeled target-language data. Low confidence scores are interpreted as indicators of model uncertainty, which typically arises in ambiguous or linguistically complex cases such as irony, idiomatic expressions, or mixed-sentiment structures. Accordingly, high confidence in the polarized classes and lower confidence in neutral or ambiguous cases are the expected, interpretable outcome.
- Qualitative taxonomy of low-confidence cases: Using heuristic rules, we assign each low-confidence sentence to a dominant linguistic pattern where one can be identified. The recurring patterns that emerge provide actionable guidance for future pre-processing and fine-tuning efforts.
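
The first two protocol components can be illustrated with a minimal sketch. It assumes that the confidence score is the maximum softmax probability over the three sentiment classes, which is a common convention but is not confirmed by this section. The function names (`predict_with_confidence`, `summarize`) and the record layout are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

# Assumed label order for a three-class sentiment head (hypothetical).
LABELS = ("negative", "neutral", "positive")


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(logits):
    """Return (label, confidence), taking confidence as the top softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


def summarize(records):
    """Aggregate (language, label, confidence) records.

    Returns, per language and predicted class, the class proportion
    (distributional coherence) and the mean confidence (model certainty).
    """
    counts = defaultdict(lambda: defaultdict(int))
    conf_sums = defaultdict(lambda: defaultdict(float))
    for language, label, confidence in records:
        counts[language][label] += 1
        conf_sums[language][label] += confidence

    summary = {}
    for language, by_label in counts.items():
        total = sum(by_label.values())
        summary[language] = {
            label: {
                "proportion": n / total,
                "mean_confidence": conf_sums[language][label] / n,
            }
            for label, n in by_label.items()
        }
    return summary
```

Because neither quantity depends on gold labels, the same summary can be computed for any target language for which only translated text is available.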
2.4. Implementation Details
3. Results
3.1. Overall Sentiment Distributions and Model Confidence
3.2. Language-Specific Sentiment Patterns
3.3. Qualitative Taxonomy of Low-Confidence Cases
3.4. Limitations of Quantitative Evaluation
4. Discussion
4.1. Methodological Contributions
4.2. Generalizability of the Framework and Domain Suitability
4.3. Limitations and Future Directions
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| NLP | Natural Language Processing |
| UGC | User-Generated Content |
| UGSC | User Gold Standard Corpus |
| UGSC-ML | User Gold Standard Corpus Multilingual |
| CLSA | Cross-Lingual Sentiment Analysis |
| XLM-RoBERTa | Cross-lingual Language Model Robustly Optimized BERT Pretraining Approach |
| mBERT | Multilingual Bidirectional Encoder Representations from Transformers |
| XLM | Cross-lingual Language Model Pretraining |
| ABSA | Aspect-Based Sentiment Analysis |
| GDPR | General Data Protection Regulation |
| SDGs | Sustainable Development Goals |
References
- Gudmundsson, H.; Marsden, G.; Zietsman, J. Sustainable Transportation: Indicators, Frameworks, and Performance Management; Springer: Cham, Switzerland, 2016.
- European Commission. Sustainable and Smart Mobility Strategy—Putting European Transport on Track for the Future. 2020. Available online: https://transport.ec.europa.eu (accessed on 17 September 2025).
- United Nations. Transforming Our World: The 2030 Agenda for Sustainable Development. 2015. Available online: https://sdgs.un.org/goals (accessed on 26 April 2025).
- Gobierno Vasco. Estrategia de Movilidad Sostenible de Euskadi 2030. 2025. Available online: https://www.euskadi.eus/plan-director-del-transporte-sostenible/web01-a2kudeak/es/ (accessed on 29 April 2025).
- Castillo, H.; Pitfield, D.E. ELASTIC—A Methodological Framework for Identifying and Selecting Sustainable Transport Indicators. Transp. Res. Part D Transp. Environ. 2010, 15, 179–188.
- Gitto, S.; Mancuso, P. Brand perceptions of airports using social networks. J. Air Transp. Manag. 2019, 75, 153–163.
- Grant-Muller, S.M.; Gal-Tzur, A.; Minkov, E.; Nocera, S.; Kuflik, T.; Shoor, I. Enhancing transport data collection through social media sources: Methods, challenges and opportunities for textual data. IET Intell. Transp. Syst. 2015, 9, 407–417.
- Joshi, P.; Santy, S.; Budhiraja, A.; Bali, K.; Choudhury, M. The State and Fate of Linguistic Diversity and Inclusion in the NLP World. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 6282–6293.
- Liu, B. Sentiment Analysis and Opinion Mining; Morgan & Claypool Publishers: San Rafael, CA, USA, 2012; Volume 5.
- Cambria, E.; Schuller, B.; Xia, Y.; Havasi, C. New Avenues in Opinion Mining and Sentiment Analysis. IEEE Intell. Syst. 2013, 28, 15–21.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008.
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- Collins, C.; Hasan, S.; Ukkusuri, S.V. A novel transit rider satisfaction metric: Rider sentiments measured from online social media data. J. Public Transp. 2013, 16, 21–45.
- Hadiuzzaman, M.; Das, T.; Hasnat, M.M.; Hossain, S.; Musabbir, S.R. Structural equation modeling of user satisfaction of bus transit service quality based on stated preferences and latent variables. Transp. Plan. Technol. 2017, 40, 257–277.
- Serna, A.; Soroa, A.; Agerri, R. Applying Deep Learning Techniques for Sentiment Analysis to Assess Sustainable Transport. Sustainability 2021, 13, 2397.
- Conneau, A.; Khandelwal, K.; Goyal, N.; Chaudhary, V.; Wenzek, G.; Guzman, F.; Grave, E.; Ott, M.; Zettlemoyer, L.; Stoyanov, V. Unsupervised Cross-Lingual Representation Learning at Scale. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL), Online, 5–10 July 2020; pp. 8440–8451.
- Conneau, A.; Lample, G. Cross-Lingual Language Model Pretraining. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Curran Associates Inc.: Red Hook, NY, USA, 2019; pp. 7057–7067.
- Barbieri, F.; Espinosa Anke, L.; Camacho-Collados, J. TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, Online, 16–20 November 2020; pp. 1644–1650.
- Barnes, J.; Klinger, R.; Schulte im Walde, S. Projecting Embeddings for Domain Adaptation: Joint Modeling of Sentiment Analysis in Diverse Domains. In Proceedings of the 27th International Conference on Computational Linguistics (COLING), Santa Fe, NM, USA, 20–26 August 2018; pp. 818–829.
- Hu, J.; Ruder, S.; Siddhant, A.; Neubig, G.; Firat, O.; Johnson, M. XTREME: A Massively Multilingual Multi-Task Benchmark for Evaluating Cross-Lingual Generalization. In Proceedings of the 37th International Conference on Machine Learning (ICML), Virtual Event, 13–18 July 2020; pp. 4411–4421.
- FitzGerald, J.; Hench, C.; Peris, C.; Mackie, S.; Rottmann, K.; Sanchez, A.; Natarajan, P. MASSIVE: A 1M-Example Multilingual Natural Language Understanding Dataset. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), Toronto, ON, Canada, 9–14 July 2023; pp. 4277–4302.
- Fan, A.; Bhosale, S.; Schwenk, H.; Ma, Z.; El-Kishky, A.; Goyal, S.; Joulin, A. Beyond English-Centric Multilingual Machine Translation. J. Mach. Learn. Res. 2021, 22, 4839–4886.
- NLLB Team. No Language Left Behind: Scaling Human-Centered Machine Translation. arXiv 2022, arXiv:2207.04672.
- Serna, A. UGSC Multilingual Sentiment Dataset for Sustainable Mobility. Zenodo. 2026. Available online: https://zenodo.org/records/15085521 (accessed on 10 April 2026).
- Serna, A.; Gerrikagoitia, J.K. UGSC Multilingual Sentiment: Code and Resources. 2026. Available online: https://github.com/ainhoaserna/UGSC-multilingual-sentiment (accessed on 10 April 2026).
- Agerri, R.; Soroa, A.; Serna, A. Sustainable Transport Sentiment Corpus. 2021. Available online: https://github.com/ixa-ehu/sustainable-transport-sentiment-corpus (accessed on 21 September 2025).
- Barbieri, F.; Espinosa Anke, L.; Camacho-Collados, J. CardiffNLP Twitter XLM-RoBERTa Base Sentiment Model. Hugging Face 2022. Available online: https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base-sentiment (accessed on 29 April 2025).
- Hall, E.T. Beyond Culture; Anchor Books: New York, NY, USA, 1976.
- Hofstede, G. Culture’s Consequences: Comparing Values, Behaviors, Institutions and Organizations Across Nations, 2nd ed.; Sage: Thousand Oaks, CA, USA, 2001.



| Language | Sentiment | Prediction Count | Mean Confidence | Hypothesis Support |
|---|---|---|---|---|
| English | Negative | 129 | 0.760 | H1 ✓ |
| English | Neutral | 73 | 0.580 | H1 ✓ |
| English | Positive | 173 | 0.770 | H1 ✓ |
| French | Negative | 147 | 0.809 | H1 ✓ |
| French | Neutral | 19 | 0.576 | H1 ✓ |
| French | Positive | 209 | 0.841 | H1 ✓ |
| German | Negative | 114 | 0.762 | H1 ✓ |
| German | Neutral | 89 | 0.642 | H1 ✓ |
| German | Positive | 172 | 0.850 | H1 ✓ |
| Italian | Negative | 152 | 0.785 | H2: higher neg |
| Italian | Neutral | 43 | 0.587 | H1 ✓ |
| Italian | Positive | 180 | 0.831 | H1 ✓ |
| Spanish | Negative | 132 | 0.775 | H1 ✓ |
| Spanish | Neutral | 71 | 0.648 | H1 ✓ |
| Spanish | Positive | 172 | 0.848 | H1 ✓ |
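
As a worked illustration of the distributional coherence analysis, the prediction counts in the table above can be converted to class proportions and compared against a reference language. The sketch below uses the total variation distance as the comparison measure and takes English as the reference distribution. Both choices are assumptions made for illustration; this section does not state which measure or reference the authors used.

```python
# Prediction counts copied from the table above (375 sentences per language).
COUNTS = {
    "English": {"negative": 129, "neutral": 73, "positive": 173},
    "French": {"negative": 147, "neutral": 19, "positive": 209},
    "German": {"negative": 114, "neutral": 89, "positive": 172},
    "Italian": {"negative": 152, "neutral": 43, "positive": 180},
    "Spanish": {"negative": 132, "neutral": 71, "positive": 172},
}


def proportions(counts):
    """Convert raw class counts into class proportions."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two distributions over the same labels."""
    return 0.5 * sum(abs(p[label] - q[label]) for label in p)


# Assumption: English is treated as the reference (source-language) distribution.
reference = proportions(COUNTS["English"])
drift = {
    language: total_variation(proportions(counts), reference)
    for language, counts in COUNTS.items()
}

for language, distance in sorted(drift.items(), key=lambda item: item[1]):
    print(f"{language:8s} TVD vs. English = {distance:.3f}")
```

A distance of zero means the predicted class proportions match the reference exactly; larger values flag languages whose predicted distribution departs more from it, for instance through a smaller neutral share.
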
| Pattern | Example | Observed Issue | Effect on Model |
|---|---|---|---|
| Mixed Sentiment | “Cheap but always late.” | Conflicting polarity signals | Low confidence; unstable label |
| Conditional/Hypothetical | “It would be great if buses arrived on time.” | Speculative, non-assertive sentiment | Tends to neutral; low certainty |
| Idiomatic (ES) | “Me dejó frío.” | Cultural idiom for disappointment | Misinterpreted as neutral/positive |
| Idiomatic (DE) | “Nicht der Rede wert.” | Implicit dissatisfaction masked by idiom | Misclassified or uncertain |
| Irony/Sarcasm | “Just what I needed—another delayed train.” | Surface polarity inverts actual sentiment | Wrong polarity; low confidence |
| Informal Punctuation | “Always late,,,, again.” | Repeated punctuation introduces noise | Low token-level alignment |
| Translation Drift (FR) | EN: “The ride was okay.” → FR: “Le trajet était agréable.” | Slight positive shift in translation | Polarity misalignment |
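
To make the idea of heuristic categorization concrete, the sketch below assigns a sentence to one of the surface-detectable patterns in the table above. The regular expressions, the two-entry idiom list, and the rule order are hypothetical and are not the rules used by the authors. Irony and translation drift are deliberately left out, because they cannot be detected reliably from surface cues in a single sentence.

```python
import re

# Hypothetical idiom lexicon, seeded only with the two examples from the table.
IDIOMS = ("me dejó frío", "nicht der rede wert")

# Hypothetical surface rules, checked in order; the first match wins.
RULES = [
    # The same punctuation mark repeated three or more times.
    ("Informal Punctuation", re.compile(r"([,.!?])\1{2,}")),
    # A modal verb combined with "if", in either order.
    (
        "Conditional/Hypothetical",
        re.compile(
            r"\b(would|could|should)\b.*\bif\b|\bif\b.*\b(would|could|should)\b",
            re.IGNORECASE,
        ),
    ),
    # A contrastive connective joining clauses of possibly opposite polarity.
    ("Mixed Sentiment", re.compile(r"\b(but|however|although)\b", re.IGNORECASE)),
]


def categorize(sentence):
    """Return the first matching pattern name, or None if no rule applies."""
    lowered = sentence.lower()
    if any(idiom in lowered for idiom in IDIOMS):
        return "Idiomatic"
    for name, pattern in RULES:
        if pattern.search(sentence):
            return name
    return None
```

Sentences for which `categorize` returns `None` would remain uncategorized and require manual inspection, which is consistent with a taxonomy that covers only those low-confidence cases with an identifiable dominant pattern.
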
| Pattern | N Cases | % of Categorized Low-Conf. | Mean Confidence |
|---|---|---|---|
| Irony/Sarcasm | 8 | 7.1% | 0.456 |
| Idiomatic Expressions (ES/DE) | 24 | 21.2% | 0.447 |
| Conditional/Hypothetical | 17 | 15.0% | 0.458 |
| Mixed Sentiment | 28 | 24.8% | 0.460 |
| Informal Punctuation | 2 | 1.8% | 0.469 |
| Translation Drift | 34 | 30.1% | 0.454 |
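
The percentages in the table above are shares of the categorized low-confidence cases only, which total 113. The short check below recomputes those shares from the reported counts and also derives a count-weighted mean confidence across categories. The weighted mean is an additional summary computed here for illustration and is not a figure reported in the source.

```python
# Rows copied from the table above: (pattern, number of cases, mean confidence).
ROWS = [
    ("Irony/Sarcasm", 8, 0.456),
    ("Idiomatic Expressions (ES/DE)", 24, 0.447),
    ("Conditional/Hypothetical", 17, 0.458),
    ("Mixed Sentiment", 28, 0.460),
    ("Informal Punctuation", 2, 0.469),
    ("Translation Drift", 34, 0.454),
]

total_cases = sum(n for _, n, _ in ROWS)

# Share of each pattern among categorized cases, rounded as in the table.
shares = {name: round(100 * n / total_cases, 1) for name, n, _ in ROWS}

# Count-weighted mean confidence over all categorized cases.
weighted_mean_conf = sum(n * conf for _, n, conf in ROWS) / total_cases

for name, n, conf in ROWS:
    print(f"{name:32s} n={n:3d}  share={shares[name]:4.1f}%  mean conf={conf:.3f}")
print(f"Total categorized cases: {total_cases}")
print(f"Count-weighted mean confidence: {weighted_mean_conf:.3f}")
```
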
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Serna, A.; Gerrikagoitia, J.K.; de Oña, J. Cross-Lingual Sentiment Classification in Sustainable Mobility: A Zero-Shot Domain Transfer Evaluation Framework. AI 2026, 7, 216. https://doi.org/10.3390/ai7060216

