1. Introduction
Large language models (LLMs) are rapidly becoming ubiquitous decision-support systems, guiding decisions from financial planning to medical advice across linguistically diverse global populations [1,2,3,4,5,6]. As these systems mediate increasingly consequential decisions, a critical question arises: do LLMs provide consistent recommendations when queried in different languages? Recent evidence suggests troubling inconsistencies, with multilingual models exhibiting substantial cross-linguistic variation in moral reasoning [7,8,9]. This linguistic dependency threatens the reliability of LLM-based systems deployed across multilingual contexts, where users reasonably expect equivalent queries to produce consistent guidance regardless of input language.
The human analog of this phenomenon, the foreign language effect [10,11], provides both context and urgency. People make systematically more utilitarian moral choices when reasoning in foreign versus native languages, driven by reduced emotional engagement rather than enhanced deliberation. Prior work has documented similar cross-linguistic variation in LLM moral reasoning [7,9,12], yet strategies for mitigation remain underexplored.
We investigate this problem through two interconnected objectives. First, we provide empirical documentation of cross-linguistic inconsistencies in LLM moral reasoning within behavioral economics contexts, specifically distributive justice scenarios involving trade-offs between efficiency, equity, and self-interest [13,14]. Unlike abstract moral dilemmas (e.g., trolley problems), these scenarios involve quantifiable trade-offs directly relevant to real-world resource allocation decisions. Second, we systematically evaluate whether domain persona prompting (i.e., embedding professional expertise in prompts) can serve as a practical intervention to reduce these inconsistencies. Our approach differs from prior work that documented cross-linguistic variation but did not test interventions, and from cultural prompting studies that use demographic identity rather than professional expertise as a moderating framework [15].
We conduct a large-scale evaluation across 1,201,200 independent queries, comparing baseline language effects (English vs. Korean) against persona-injected conditions across ten professional domains. We report three principal findings. First, language fundamentally shapes baseline moral reasoning: five of six scenarios exhibit statistically significant cross-linguistic divergence with effect sizes ranging from 9 to 56 percentage points, including complete preference reversals. Second, domain persona injection substantially reduces these gaps, achieving a 62.7% average reduction, with normative domains (sociology, economics, law, philosophy, and history) demonstrating greater effectiveness than technical domains. Third, persona-based moderation encounters systematic boundary conditions: scenarios presenting isolated ethical conflict (particularly large uncompensated self-sacrifice) resist intervention.
These findings advance three research areas. For AI alignment, we introduce domain persona prompting as a practical intervention achieving substantial but imperfect gap reduction. For computational social science, we extend the homo silicus framework [14] to multilingual contexts, revealing that LLMs exhibit effects resembling the human foreign language effect. For cross-cultural AI research, we establish that compensatory integration (the ability to synthesize trade-offs across multiple ethical dimensions) determines when professional framing can bridge language gaps.
The remainder of this paper is organized as follows. Section 2 reviews prior work on LLMs as economic agents, cross-linguistic moral reasoning, and persona prompting interventions. Section 3 details our experimental design, translation protocol, and alignment metrics. Section 4 presents findings addressing baseline language effects (RQ1), persona-based moderation (RQ2), and boundary conditions (RQ3). Section 5 discusses theoretical mechanisms, parallels to human psycholinguistics, practical implications, limitations, and future research directions. Section 6 concludes with a summary of contributions.
3. Methodology
3.1. Experimental Design
Following Horton [14], we employ LLMs as simulated economic agents, or homo silicus, to investigate cross-linguistic consistency in moral reasoning. Critically, we are not using LLMs as a data analysis tool but rather as the object of study itself. We examine whether these increasingly deployed AI systems exhibit language-dependent moral preferences and whether professional framing can reduce such dependencies. This approach offers a key methodological advantage: unlike traditional human subject experiments, where individual differences confound treatment effects, computational agents allow us to systematically vary experimental parameters (language, professional context) while holding the underlying decision-maker constant, thereby isolating the causal impact of prompt framing on moral judgments [14,24].
We implement a hierarchically structured design with two primary conditions tested across two languages (English and Korean) and six distributive justice scenarios adapted from the social preferences literature in behavioral economics [13]. The persona-injected condition introduces professional framing through ten academic domains (i.e., economics, law, philosophy, history, sociology, environmental science, mathematics, finance, engineering, and computer science), each represented by 1000 distinct personas from the PersonaHub dataset [25]. Each persona responds to all six scenarios with ten repetitions per scenario, yielding 600,000 responses per language. The no-persona baseline condition presents identical scenarios without professional framing, with 100 repetitions per scenario per language (1200 responses total). The complete sampling structure, which totals 1,201,200 independent model queries, is detailed in Section 3.7; each query represents an independent decision by the same model under systematically varied linguistic and professional framing conditions.
This design enables systematic assessment of: (1) baseline language-dependent variation in distributive preferences, (2) how professional expertise modulates cross-linguistic alignment across ten domains, and (3) scenario-specific boundary conditions where domain framing succeeds or fails to bridge language-based differences.
Figure 1 illustrates the experimental pipeline from scenario input through persona selection, prompt construction, model inference, and structured output parsing.
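To make the design concrete, the following minimal sketch (constant names are ours) verifies that the sampling parameters stated above reproduce the reported query totals:

```python
# Sketch: decompose the total query count implied by the design in Section 3.1.
# All numbers come from the text; variable names are illustrative.

LANGUAGES = 2          # English, Korean
N_SCENARIOS = 6        # Charness-Rabin distributive justice scenarios
DOMAINS = 10           # academic disciplines
PERSONAS_PER_DOMAIN = 1000
REPS_PERSONA = 10      # repetitions per persona-scenario combination
REPS_BASELINE = 100    # repetitions per scenario in the no-persona condition

persona_queries = LANGUAGES * N_SCENARIOS * DOMAINS * PERSONAS_PER_DOMAIN * REPS_PERSONA
baseline_queries = LANGUAGES * N_SCENARIOS * REPS_BASELINE

assert persona_queries == 1_200_000
assert baseline_queries == 1_200
assert persona_queries + baseline_queries == 1_201_200
```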
3.2. Model Selection and Configuration
We use Google’s Gemini 2.0 Flash [26] as our primary model, accessed via the LangChain API [27]. Model selection prioritized three criteria: (1) documented multilingual proficiency with explicit Korean language support; (2) cost-effectiveness enabling large-scale evaluation (1.2 M+ queries) within practical computational budgets; and (3) sufficient context window size (1 M input tokens, 8192 output tokens) for persona embeddings and scenario descriptions.
We configured the model with temperature 1.0 to enable stochastic sampling necessary for distributional analysis, generating response variability across repeated queries while maintaining semantic coherence within individual responses. All other parameters (e.g., top-p and top-k) were set to API defaults. We pinned the model version to gemini-2.0-flash-001 throughout data collection to ensure consistency across experimental conditions.
We employed zero-shot chain-of-thought prompting [28] to encourage deliberative reasoning, instructing the model to explain its decision before providing a binary choice (“Left” or “Right”). Responses were returned in structured JSON format to ensure reliable parsing. Our analysis focuses exclusively on choice distributions; reasoning traces were not analyzed in this study. Complete response format specifications are provided in Appendix B.
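As an illustration, a minimal configuration along these lines might look as follows, assuming LangChain’s Google GenAI integration; the ChoiceResponse schema is our illustrative rendering of the response format specified in Appendix B, not a verbatim reproduction:

```python
# Sketch: pinned model configuration (assumes the langchain-google-genai
# package; treat exact parameter names as illustrative).
from typing import Literal

from langchain_google_genai import ChatGoogleGenerativeAI
from pydantic import BaseModel, Field

class ChoiceResponse(BaseModel):
    """Structured JSON response: reasoning first, then a binary choice."""
    reasoning: str = Field(description="Explanation of the decision, elicited first")
    choice: Literal["Left", "Right"]

llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash-001",  # pinned version used throughout data collection
    temperature=1.0,               # stochastic sampling for distributional analysis
    # top_p / top_k left at API defaults, as in the paper
)
structured_llm = llm.with_structured_output(ChoiceResponse)
```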
3.3. Behavioral Economics Scenarios
We adapt six distributive justice scenarios from Charness and Rabin [13], selected from the same experimental stimulus set employed in prior LLM simulation work [14]. These scenarios provide established empirical benchmarks with human subjects and systematic variation in trade-offs between efficiency (total payoff maximization), equity (payoff equality), and self-interest (own payoff maximization). Each scenario presents a binary choice between two monetary allocations, with the respondent assuming the role of Person B (recipient). Table 2 presents the complete scenario set with human baseline choices from the original study.
The six scenarios systematically vary preference dimensions (a compact encoding of the payoffs follows the list):
Berk29 isolates efficiency preferences: Person B’s payoff remains constant (USD 400) while choosing whether to increase Person A’s payoff by USD 350, testing pure efficiency orientation independent of self-interest.
Berk26 tests equity preferences against self-interest: Person B must sacrifice USD 400 to achieve equal distribution (USD 400, USD 400), measuring willingness to bear costs for fairness.
Berk23 tests for spite: Person B must forgo USD 200 to prevent Person A from receiving USD 800, a preference rarely observed in human populations.
Berk15 tests self-interest against aligned equity and efficiency: Person B sacrifices USD 100 personal payoff to achieve both perfect equality and USD 300 higher total surplus, requiring rejection of self-interest alone.
Barc8 tests disadvantageous inequality aversion: choosing Right increases total surplus by USD 300 but places Person B in an inferior position (USD 500 vs. USD 700), requiring acceptance of earning less than the counterpart.
Barc2 tests equity against efficiency: choosing Left maintains equal payoffs (USD 400 each), while choosing Right increases total surplus by USD 325 but reduces Person B’s payoff by USD 25 and creates inequality.
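For reference, the allocations described above can be encoded as (Person A, Person B) payoff pairs. The values below are reconstructed from the scenario descriptions and the original Charness and Rabin design; the Left/Right assignments in particular are our reconstruction and should be checked against Table 2:

```python
# Sketch: the six scenarios as (Person A, Person B) payoff pairs in USD.
SCENARIO_PAYOFFS = {
    #            Left (A, B)   Right (A, B)
    "Berk29": ((400, 400), (750, 400)),  # efficiency: raise A's payoff at no cost to B
    "Berk26": ((0, 800),   (400, 400)),  # equity vs. self-interest: B gives up USD 400
    "Berk23": ((800, 200), (0, 0)),      # spite: forgo USD 200 to deny A USD 800
    "Berk15": ((0, 500),   (400, 400)),  # self-interest vs. aligned equity + efficiency
    "Barc8":  ((300, 600), (700, 500)),  # disadvantageous inequality aversion
    "Barc2":  ((400, 400), (750, 375)),  # equity vs. efficiency
}

def surplus(option: tuple[int, int]) -> int:
    """Total payoff of an allocation (the efficiency dimension)."""
    return sum(option)
```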
3.4. Persona Design and Selection
To simulate domain-specific moral reasoning, we employed personas from ten academic disciplines: economics, law, philosophy, history, sociology, environmental science, mathematics, finance, engineering, and computer science. English personas were sourced from the elite subset of the PersonaHub dataset [25], which provides 1000 high-quality, narrative-based persona profiles per discipline. The elite subset was selected for its rich contextual descriptions reflecting disciplinary worldviews, ethical orientations, and decision-making heuristics characteristic of professional domains.
For each domain, we sampled all 1000 available personas, yielding 10,000 unique profiles. Each persona provides a detailed narrative describing the individual’s professional background, domain expertise, and characteristic reasoning patterns. For example, a computer science persona reads as follows: “A software developer looking for a way to simplify the integration of GPRS technology into embedded system designs. Interested in developing a stable and efficient software stack while seeking products that are easy to use with minimal technical knowledge requirements, yet capable of reliable data transmission and compatibility with other network protocols.”
We embedded these personas directly into prompts as contextual framing, enabling the model to condition responses on domain-specific perspectives, but without explicit instruction about how domain expertise should influence moral judgments. Representative persona descriptions from each domain are provided in Appendix C.1.
3.5. Prompt Construction
Each experimental prompt consisted of three components: persona framing (when applicable), scenario presentation, and response instructions. In the persona-injected condition, personas were embedded as contextual paragraphs describing the agent’s academic background, ethical orientation, and decision-making heuristics. This approach provides substantive professional context enabling domain-appropriate reasoning patterns, rather than simple role labels (e.g., “You are an economist”). The model was not explicitly instructed how domain expertise should influence choices; instead, persona descriptions were designed to implicitly activate domain-specific cognitive frames. Following persona embedding (or immediately in the no-persona condition), each prompt presented the distributive justice scenario with consistent structure across all experimental conditions. The prompt achieved the following: (1) identified the respondent as Person B in a social preferences experiment; (2) presented two allocation options with explicit payoffs for both Person A and Person B labeled “Left” and “Right”; and (3) requested a choice with a brief explanation. The standard instruction without persona framing reads as follows:
“You are Person B in a distributive justice scenario. Person A and B (you) must allocate money between themselves. You, as Person B, choose between: …Please explain your reasoning and then state your choice as either ‘Left’ or ‘Right’.” This structured format ensured consistent task framing while allowing natural language reasoning to emerge through zero-shot chain-of-thought prompting (Section 3.2). Complete prompt templates for both baseline and persona-injected conditions, including Korean translations, are provided in Appendix D.
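A minimal sketch of this assembly logic, with templates abbreviated (the full wording is in Appendix D), might read as follows:

```python
# Sketch: prompt assembly for the two conditions (templates abbreviated).
def build_prompt(scenario_text: str, persona: str | None = None) -> str:
    header = (
        "You are Person B in a distributive justice scenario. "
        "Person A and B (you) must allocate money between themselves."
    )
    task = (
        f"You, as Person B, choose between: {scenario_text}\n"
        'Please explain your reasoning and then state your choice as '
        'either "Left" or "Right".'
    )
    if persona is not None:
        # Persona-injected condition: prepend the narrative profile as
        # contextual framing, with no instruction on how it should matter.
        return f"{persona}\n\n{header}\n{task}"
    return f"{header}\n{task}"  # no-persona baseline
```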
3.6. Cross-Linguistic Translation Protocol
All experimental materials (including personas, scenario descriptions, and task instructions) were adapted for Korean through a four-stage translation protocol employing LLM-as-a-Judge methodology [29] to ensure semantic equivalence while preserving naturalness and task comprehension:
Initial translation: Korean translations were generated using Gemini 2.0 Flash with the prompt: “Translate the following English text to Korean, preserving exact meaning and nuance, using correct professional terminology, ensuring native-like expression, and maintaining disciplinary perspective.”
Multi-instance verification: Five independent Gemini 2.0 Flash instances (with temperature 0.3 to reduce sampling variability) evaluated each translation for the following: (a) semantic accuracy and cultural appropriateness, (b) preservation of disciplinary framing and domain-specific terminology, (c) naturalness of Korean formal register, and (d) parallel scenario framing and payoff salience. Each instance provided binary accept/reject judgments with justifications.
Consensus-based acceptance: Translations receiving ≥3 acceptance votes were retained. Rejected translations were regenerated and re-evaluated until achieving the consensus threshold.
Back-translation verification: Final Korean materials were back-translated to English using an independent API call. Gemini 2.0 Flash then evaluated semantic equivalence between the original and back-translated English texts, flagging instances where meaning diverged. Flagged materials were revised and resubmitted through the verification pipeline.
This fully automated protocol enabled scalable verification of 10,000+ persona translations while maintaining consistency in evaluation criteria. We acknowledge that LLM-based verification cannot fully substitute for expert human judgment in detecting subtle pragmatic or cultural nuances [29]; however, the combination of multi-instance verification, consensus voting, and back-translation checking provides systematic quality control appropriate for the scale of our experimental design [30,31]. Monetary values were presented in US dollars in both languages to eliminate currency conversion confounds and maintain direct comparability with the original human baseline [13]. The exact prompt templates used in each stage of the translation pipeline are documented in Appendix D (Appendix D.4).
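The consensus and back-translation stages can be summarized schematically as follows; here translate, judge, and back_translate stand in for the API calls described above, and the retry cap is our illustrative addition (the protocol itself iterates until consensus):

```python
# Sketch: consensus-based acceptance with back-translation checking.
# judge(src, tgt) stands in for one temperature-0.3 evaluation call and is
# assumed to return True (accept) or False (reject).
N_JUDGES = 5
CONSENSUS_THRESHOLD = 3  # translations need >= 3 accept votes

def translate_with_consensus(english, translate, judge, back_translate, max_tries=5):
    for _ in range(max_tries):
        korean = translate(english)
        votes = sum(judge(english, korean) for _ in range(N_JUDGES))
        if votes < CONSENSUS_THRESHOLD:
            continue  # rejected by consensus: regenerate and re-evaluate
        # Back-translation check: accept only if meaning is preserved.
        if judge(english, back_translate(korean)):
            return korean
    raise RuntimeError("verification failed; flag for manual revision")
```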
3.7. Data Collection and Sampling
Our sampling design balances two analytical objectives: characterizing domain-specific variation in persona-conditioned responses requires dense sampling across domains, while establishing language-specific base rates requires only precise proportion estimation. Consequently, we employ differential repetition rates across conditions: 10 repetitions per persona-scenario combination (persona-injected) versus 100 repetitions per scenario (no-persona baseline).
Table 3 summarizes the complete sampling structure, yielding 1,201,200 total observations: 1,200,000 from the persona-injected condition (10,000 per scenario–language–domain combination) and 1200 from the no-persona baseline (100 per scenario–language pair).
Data collection proceeded through asynchronous API queries using the LangChain framework [27]. Each call is independently stateless, without conversational memory, ensuring that each of the 1,201,200 queries represents a fully independent decision context without requiring explicit randomization. Invalid responses (defined as non-conforming JSON outputs, API errors, or missing choice indicators) occurred in <0.5% of queries and were replaced through resampling until target sample sizes were achieved. The JSON response format and complete field specifications are documented in Appendix B. Sample experimental materials, including representative persona descriptions and model responses, are provided in Appendix C.
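A simplified sketch of this collection loop is shown below, assuming the structured_llm runnable from the Section 3.2 sketch; the concurrency cap is illustrative:

```python
# Sketch: stateless asynchronous querying with resampling of invalid
# responses (which occurred in <0.5% of calls).
import asyncio

SEM = asyncio.Semaphore(50)  # illustrative concurrency cap

async def query_once(prompt: str):
    """One independent, stateless decision; resample until a valid choice parses."""
    while True:
        try:
            async with SEM:
                response = await structured_llm.ainvoke(prompt)
            if response.choice in ("Left", "Right"):
                return response
        except Exception:
            pass  # API error or non-conforming output: resample

async def collect(prompts: list[str]):
    return await asyncio.gather(*(query_once(p) for p in prompts))
```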
3.8. Alignment Metrics
Prior work on cross-linguistic LLM moral reasoning has relied primarily on accuracy metrics, agreement rates, or qualitative comparisons [7,8,12]. Separately, research using LLMs as behavioral economics agents has compared model responses to human baselines [13], though such comparisons remained qualitative rather than formally quantified [14]. These existing approaches are insufficient for our research objectives, which require (1) measuring signed deviation from human preferences to determine whether cross-linguistic differences move responses toward or away from human norms; (2) quantifying intervention effects to isolate the impact of persona prompting from baseline tendencies; and (3) assessing whether interventions improve human alignment, since gap reduction is meaningful only if convergence occurs toward human preferences. We therefore introduce three complementary metrics tailored to our experimental design. For each scenario and condition, let $p$ denote the proportion of Left choices, with superscripts identifying the condition (baseline, persona, or human) and subscripts identifying language $\ell$ and domain $d$.
3.8.1. Human Deviation Index (HDI)
HDI quantifies baseline deviation from empirical human behavior in the no-persona condition as follows:
$$\mathrm{HDI}_{\ell} = p^{\mathrm{base}}_{\ell} - p^{\mathrm{human}},$$
where $p^{\mathrm{base}}_{\ell}$ is the proportion of Left choices in the no-persona baseline for language $\ell$, and $p^{\mathrm{human}}$ is the human baseline from Charness and Rabin [13]. Positive values indicate stronger Left preference than humans; negative values indicate weaker preference.
3.8.2. Persona Effect Magnitude (PEM)
PEM captures the behavioral shift induced by domain persona framing:
$$\mathrm{PEM}_{d,\ell} = p^{\mathrm{persona}}_{d,\ell} - p^{\mathrm{base}}_{\ell},$$
where $p^{\mathrm{persona}}_{d,\ell}$ is the proportion of Left choices for domain $d$ in language $\ell$ under persona injection. Positive PEM indicates that persona injection increases Left preference relative to baseline; negative PEM indicates decreased preference.
3.8.3. Persona–Human Alignment Score (PHAS)
PHAS quantifies whether persona conditioning moves LLM behavior closer to human preferences:
$$\mathrm{PHAS}_{d,\ell} = p^{\mathrm{persona}}_{d,\ell} - p^{\mathrm{human}}.$$
Values near zero indicate strong alignment with human preferences; larger absolute values indicate divergence. Comparing PHAS across domains reveals which professional contexts facilitate or hinder human-like moral reasoning.
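The three metrics reduce to signed differences of proportions, as the sketch below illustrates. It also includes our reading of the gap-reduction statistic reported later in the paper; that definition is our assumption, not taken verbatim from the text:

```python
# Sketch: the three alignment metrics over Left-choice proportions.
def hdi(p_base: float, p_human: float) -> float:
    """Human Deviation Index: signed baseline deviation from human behavior."""
    return p_base - p_human

def pem(p_persona: float, p_base: float) -> float:
    """Persona Effect Magnitude: shift induced by domain persona framing."""
    return p_persona - p_base

def phas(p_persona: float, p_human: float) -> float:
    """Persona-Human Alignment Score: signed deviation from humans under personas."""
    return p_persona - p_human

def gap_reduction(p_en_base, p_ko_base, p_en_persona, p_ko_persona) -> float:
    """Fractional shrinkage of the English-Korean gap under persona injection
    (our assumed definition of the reduction statistic)."""
    baseline_gap = abs(p_en_base - p_ko_base)
    persona_gap = abs(p_en_persona - p_ko_persona)
    return 1.0 - persona_gap / baseline_gap
```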
5. Discussion
5.1. Summary of Key Findings
Our investigation reveals three principal findings. First, language fundamentally shapes baseline moral preferences: five of six scenarios exhibited statistically significant cross-linguistic divergence with effect sizes ranging from 9 to 56 percentage points, including complete preference reversals (Berk15: 92% Korean vs. 36% English). These differences reflect systematic value hierarchies rather than uniform biases; English and Korean prioritize competing dimensions (efficiency vs. equity, self-interest vs. collective welfare) in scenario-dependent patterns.
Second, domain persona injection reduces cross-linguistic gaps by 62.7% on average, with normative disciplines demonstrating 23.9% greater effectiveness than technical domains. Third, personas encounter systematic boundary conditions: scenarios presenting isolated ethical conflict without compensatory dimensions, particularly large uncompensated self-sacrifice (Berk26), exhibit both minimal gap reduction (3.3 percentage points) and inconsistent cross-linguistic response patterns.
5.2. Theoretical Mechanism and Parallels to Human Moral Psychology
Our results closely parallel the foreign-language effect observed in human moral cognition [10,11], where individuals make more utilitarian choices when reasoning in foreign versus native languages. While humans exhibit this effect through reduced emotional engagement, LLMs activate different value systems depending on the input language; these values were learned from culturally distinct training corpora. Korean prompts consistently elicited stronger self-interest maximization (Berk15: 92%), while English favored egalitarian outcomes (Berk29, Barc2), suggesting that language acts as a cue that triggers statistically dominant response patterns from language-specific training corpora [32].
Meanwhile, our persona intervention reveals a novel finding absent from the human psycholinguistics literature: professional identity framing can moderate language-dependent moral reasoning, whereas humans exhibit persistent foreign language effects even under explicit instruction [11]. LLMs demonstrated a 62.7% gap reduction through domain personas, suggesting that LLMs respond better to alignment interventions than humans do.
We propose that this moderation operates through a compensatory integration mechanism. Domain personas succeed by providing professional reasoning frameworks that are not tied to any particular culture, enabling the model to weigh multiple ethical considerations together. When scenarios involve trade-offs between efficiency, equality, and self-interest (such as Berk15, where a USD 100 personal sacrifice achieves both perfect equality and a USD 300 gain in total surplus), professional identities offer coherent guidance transcending language-specific value hierarchies. This explains both high gap reduction and consistent cross-linguistic PHAS.
Conversely, personas fail when scenario structure prevents ethical integration. Berk26 presents pure self-interest versus altruism (USD 0 vs. USD 400 for oneself) with no efficiency justification (total welfare constant at USD 800). Without compensatory dimensions, professional frameworks cannot provide guidance that reconciles the conflict, allowing each language’s distinct moral priorities to dominate, explaining both minimal gap reduction and cross-linguistic PHAS inconsistency.
This mechanism yields specific testable predictions: scenarios with moderate self-sacrifice paired with large efficiency gains should maintain persona effectiveness, while scenarios combining large sacrifice with minimal compensation should exhibit boundary conditions similar to Berk26. The framework also explains why normative disciplines with explicit theories of distributive justice outperform technical domains lacking such frameworks (Table 6).
5.3. Explaining Cross-Linguistic Patterns: Cultural Norms and Training Data
While Section 5.2 established how persona intervention moderates language effects through compensatory integration, a critical question remains: why do English prompts favor egalitarian distributions while Korean prompts emphasize self-interest? This pattern appears counterintuitive given traditional characterizations of Korean culture as collectivist. We propose three complementary explanations.
5.3.1. Dynamic Collectivism: In-Group Versus Out-Group Boundaries
Korean collectivism operates through “dynamic collectivism” [33]: applying collectivist norms to in-group members but individualistic norms to out-group members. Our scenarios present unrelated individuals without specified group affiliation. In Korean dynamic collectivism, such contexts may activate competitive self-preservation rather than prosocial redistribution. Korean prosocial tendencies operate primarily within defined in-groups (family, company, nation), not toward unaffiliated strangers [34].
This explains Berk15, where 92% of Korean responses chose the selfish option. Without in-group bonds, Korean prompts activate pragmatic self-interest characteristic of out-group transactions. Moreover, modern Korean society exhibits “cultural duality”, where both collectivistic and individualistic values coexist, with context determining which dominates [35].
5.3.2. Differential Training Data Content
LLM training data comprises 80 to 90% English content [36], affecting not just linguistic proficiency but encoded cultural frameworks. English corpora overrepresent Western philosophical discourse on distributive justice and egalitarian principles from Anglo-American political philosophy and the behavioral economics literature [37]. Korean training data reflects different priorities: contemporary Korean digital content heavily features business and economic discourse emphasizing efficiency, competitive advantage, and hierarchical relationships shaped by Korea’s rapid industrialization and Confucian heritage [38,39]. While systematic cross-linguistic corpus analysis of distributional semantics remains limited, the documented emphasis on competition and efficiency in Korean business discourse suggests potential differences in how economic preferences are encoded compared to English corpora’s stronger egalitarian discourse traditions.
5.3.3. Entanglement and Implications
We cannot definitively separate cultural norms from training data composition: training data reflects cultural production, and cultural norms shape online discourse. The patterns we observe emerge from interactions among the following: (a) genuine cultural differences (particularly in-group versus out-group distinctions); (b) differential representation of philosophical versus pragmatic discourse; and (c) multilingual corpus composition biases.
Critically, cross-linguistic inconsistency is not merely a technical translation problem. Language activates fundamentally different moral frameworks learned from culturally distinct training data. Users in different linguistic markets receive systematically different guidance because models encode different value hierarchies from language-specific corpora.
5.4. Implications for Multilingual LLM Deployment
Our findings have direct implications for deploying LLMs in multilingual contexts requiring consistent moral reasoning. Practitioners often assume that translating prompts preserves behavioral equivalence, an assumption our baseline results decisively refute. Organizations deploying LLMs for decision support across linguistic markets face substantial alignment risks, particularly in scenarios producing preference reversals.
Domain persona injection offers a practical mitigation strategy requiring minimal architectural modification (prompt engineering rather than retraining). By embedding professional context from normative disciplines, developers can reduce cross-linguistic gaps by 62.7% on average, with peak reductions exceeding 84%. However, our boundary conditions demand caution: scenarios involving isolated ethical conflict resist persona-based moderation. In such contexts, developers should avoid relying on LLM recommendations when cross-linguistic consistency is critical, implement ensemble methods combining predictions across multiple languages and domains, or explicitly reframe queries to introduce compensatory dimensions.
More broadly, our results suggest that multilingual LLM evaluation should routinely assess cross-linguistic consistency, not merely per-language performance [7,8]. A model exhibiting high human alignment in English may show poor alignment in Korean on identical scenarios, a pattern invisible to monolingual evaluation. Benchmarks should therefore report both within-language accuracy and cross-linguistic variance as complementary alignment metrics.
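As a sketch of what such complementary reporting might look like (function and variable names are ours, not an established benchmark API):

```python
# Sketch: within-language alignment alongside cross-linguistic variance.
# left_rates_by_lang maps each language to a list of Left-rates, one per scenario.
from statistics import mean, pstdev

def per_language_alignment(left_rates, human_rates):
    """Mean absolute deviation from human baselines within one language."""
    return mean(abs(m - h) for m, h in zip(left_rates, human_rates))

def cross_linguistic_variance(left_rates_by_lang):
    """Per-scenario spread of Left-rates across languages, averaged."""
    per_scenario = zip(*left_rates_by_lang.values())
    return mean(pstdev(rates) for rates in per_scenario)
```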
5.5. Limitations
Our findings derive from a single language pair (English and Korean) evaluated on a single model (Gemini 2.0 Flash), which limits generalizability in two important ways.
First, English and Korean represent a specific pairing: both have relatively strong representation in LLM training data compared to lower-resource languages [40], and both reflect economically developed contexts with distinct cultural characteristics. Cross-linguistic patterns may differ substantially for language pairs with greater typological distance, more extreme training data disparities [36], or fundamentally different cultural value systems [41]. Languages with minimal training data representation or those from non-Western contexts underrepresented in digital corpora [8,36,37] may exhibit larger cross-linguistic gaps or different patterns of persona effectiveness.
Second, cross-linguistic consistency and intervention effectiveness may vary across model families due to differences in architectural design, multilingual pretraining strategies, and alignment procedures [12]. Our demonstration that persona prompting reduces cross-linguistic gaps by 62.7% provides proof of concept, but validation across multiple models and language pairs remains essential for establishing the generality of this intervention strategy.
Beyond model and language scope, several methodological considerations warrant acknowledgment. Our human baseline derives from a single Western study [13], introducing potential cultural confounds. Ideally, cross-linguistic LLM evaluation would be benchmarked against matched human samples from each language community [9]. Our translation protocol employed LLM-as-a-judge methodology for scalability [29], enabling verification of 10,000+ personas; while multi-instance consensus and back-translation provide systematic quality control, subtle pragmatic nuances may require expert human validation in future work.
We analyze choice distributions without examining reasoning traces, precluding insight into whether personas modulate decisions through altered value weighting or different ethical frameworks. Finally, our scenarios focus on distributive preferences in economic contexts. Extensions to deontological dilemmas, virtue ethics scenarios, and harm-based decisions would establish whether persona-based moderation generalizes beyond economic distributive justice.
5.6. Future Research Directions
Our findings open several promising directions. First, systematic exploration of the compensatory integration mechanism through targeted scenario design could establish precise boundary conditions by parametrically varying self-interest magnitude, efficiency gains, and inequality outcomes. Second, extending the analysis to typologically distant language pairs, particularly those with documented moral value differences between individualist versus collectivist cultures [42], would test whether our findings reflect universal mechanisms or English-Korean-specific patterns.
Third, investigating alternative persona framing strategies, including simpler role specifications (“You are a behavioral economist”), hybrid approaches combining demographic and professional identities, or adversarial prompting, might achieve comparable gap reduction with reduced prompt complexity. Fourth, examining persona effects in interactive, multi-turn contexts would assess whether consistency persists when LLMs engage in extended moral deliberation, handle challenges to initial positions, or exhibit priming effects from earlier exchanges. These factors are all critical for real-world deployment.
Moreover, our work focuses on economic distributive justice, but cross-linguistic consistency matters across diverse domains: medical triage, legal reasoning, educational advice, and content moderation. Extending persona-based interventions to these contexts would establish domain generality and identify task-specific boundary conditions, advancing toward truly multilingual AI systems that maintain coherent values independent of query language.
Finally, our work establishes that domain persona prompting can reduce cross-linguistic inconsistency, but comparative evaluation of alternative intervention strategies remains an important direction. Future research should systematically compare domain personas against other approaches, including cultural prompting, hybrid strategies combining demographic and professional framing, explicit fairness instructions, chain-of-thought variations, or adversarial prompting techniques. Such head-to-head comparisons would reveal which intervention types are most effective for different scenario structures, language pairs, and deployment contexts. Additionally, investigating combinations of interventions (e.g., domain persona plus explicit value alignment instructions) may yield synergistic effects beyond what single interventions are capable of achieving.
6. Conclusions
The global deployment of large language models as decision-support systems assumes that translated prompts preserve behavioral equivalence. Our findings refute this assumption: linguistic framing fundamentally shapes LLM moral reasoning, producing substantial cross-linguistic gaps and complete preference reversals in distributive justice scenarios.
Domain persona prompting offers practical mitigation. Professional identity framing reduces cross-linguistic gaps by 62.7% on average by enabling compensatory integration across ethical dimensions: efficiency, equality, and self-interest. However, systematic boundary conditions also exist. Scenarios presenting isolated ethical conflict without compensatory dimensions resist intervention, revealing fundamental limits to prompt-based alignment strategies.
These findings bridge human psycholinguistics and AI alignment research. LLMs exhibit language-dependent moral reasoning similar to the foreign language effect observed in humans, while these computational agents demonstrate greater amenability to intervention through explicit professional framing. This suggests a path towards culturally neutral expert frameworks that can partially override language-specific values learned during pretraining, though with limitations.
For practitioners, the implications are clear. Multilingual deployments cannot rely on translation alone, as models aligned to human preferences in one language may diverge substantially in others. Persona-based framing provides an immediately actionable mitigation requiring only prompt engineering, but effectiveness also depends on scenario structure. For researchers, cross-linguistic consistency should be evaluated as a primary alignment metric alongside within-language performance.
As LLMs increasingly mediate consequential decisions across linguistically diverse populations, ensuring consistent moral reasoning becomes paramount. Our work demonstrates that such consistency is achievable but not automatic, requiring deliberate intervention informed by scenario structure and linguistic framing mechanisms. Building multilingual AI systems with language-independent values remains an open challenge.