Search Results (315)

Search Parameters:
Keywords = linguistics domain

22 pages, 341 KB  
Review
The Role of Artificial Intelligence in Enhancing ESG Disclosure Quality in Accounting
by Jiacheng Liu, Ye Yuan and Zhelun Zhu
J. Risk Financial Manag. 2026, 19(1), 58; https://doi.org/10.3390/jrfm19010058 - 9 Jan 2026
Viewed by 103
Abstract
As corporate sustainability reporting evolves into a pivotal resource for investors, regulators, and stakeholders, the imperative to evaluate and elevate ESG disclosure quality intensifies amid persistent challenges like opacity, inconsistency, and greenwashing. This review synthesizes interdisciplinary insights from accounting, finance, and computational linguistics on artificial intelligence (AI), particularly natural language processing (NLP) and machine learning (ML), as a transformative force in this domain. We delineate ESG disclosure quality across four operational dimensions: readability, comparability, informativeness, and credibility. By integrating cutting-edge methodological innovations (e.g., transformer-based models for semantic analysis), empirical linkages between AI-extracted signals and market/governance outcomes, and normative discussions on AI’s auditing potential, we demonstrate AI’s efficacy in scaling measurement, harmonizing heterogeneous narratives, and prototyping greenwashing detection. Nonetheless, causal evidence linking managerial AI adoption to stakeholder-perceived enhancements remains limited, compounded by biases in multilingual applications and interpretability deficits. We propose a forward-looking agenda, prioritizing cross-lingual benchmarking, curated greenwashing datasets, AI-assurance pilots, and interpretability standards, to harness AI for substantive, equitable improvements in ESG reporting and accountability. Full article
30 pages, 588 KB  
Article
Comparative Performance Analysis of Large Language Models for Structured Data Processing: An Evaluation Framework Applied to Bibliometric Analysis
by Maryam Abbasi, Paulo Váz, José Silva, Filipe Cardoso, Filipe Sá and Pedro Martins
Appl. Sci. 2026, 16(2), 669; https://doi.org/10.3390/app16020669 - 8 Jan 2026
Viewed by 138
Abstract
The proliferation of Large Language Models (LLMs) has transformed natural language processing (NLP) applications across diverse domains. This paper presents a comprehensive comparative analysis of three state-of-the-art language models—GPT-4o, Claude-3, and Julius AI—evaluating their performance across systematic NLP tasks using standardized datasets and evaluation frameworks. We introduce a reusable evaluation methodology incorporating five distinct prompt engineering techniques (Prefix, Cloze, Anticipatory, Heuristic, and Chain of Thought) applied to three categories of linguistic challenges: data extraction, aggregation, and contextual reasoning. Using a bibliometric analysis use case as our evaluation domain, we demonstrate the framework’s application to structured data processing tasks common in academic research, business intelligence, and data analytics applications. Our experimental design utilized a curated Scopus bibliographic dataset containing 3212 academic publications to ensure reproducible and objective comparisons, representing structured data processing tasks. The results demonstrated significant performance variations across models and tasks, with GPT-4o achieving 89.3% average accuracy, Julius AI reaching 85.7%, and Claude-3 demonstrating 72.1%. Claude-3 also showed notably high prompt sensitivity (consistency score: 74.3%, compared with GPT-4o: 91.2% and Julius AI: 86.7%). This study revealed critical insights into prompt sensitivity, contextual understanding limitations, and the effectiveness of different prompting strategies for specific task categories. Statistical analysis using repeated measures ANOVA and pairwise t-tests with Bonferroni’s correction confirmed significant differences between models (F(2, 132) = 142.3, p < 0.001), with effect sizes ranging from 0.51 to 1.33. Response time analysis showed task-dependent latency patterns: for data extraction tasks, Claude-3 averaged 1.9 s (fastest), GPT-4o 2.1 s, and Julius AI 2.8 s; for contextual reasoning tasks, latency increased to 3.8 s for Claude-3, 4.5 s for GPT-4o, and 5.8 s for Julius AI. Overall averages were 3.2 s for GPT-4o, 4.1 s for Julius AI, and 2.8 s for Claude-3. While specific performance metrics reflect current model versions (GPT-4o: gpt-4o-2024-05-13; Claude-3 Opus: 20240229; Julius AI: v2.1.4), the evaluation framework provides a reusable methodology for ongoing LLM assessment as new versions emerge. These findings provide practical guidance for researchers and practitioners in selecting appropriate LLMs for domain-specific applications and highlight areas requiring further development in language model capabilities. While demonstrated on bibliometric data, this evaluation framework is generalizable to other structured data processing domains. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
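
A minimal sketch of the statistical comparison this abstract describes, assuming per-task accuracy scores for each model are already available; the data, task counts, and score values below are illustrative placeholders, not the authors' dataset. It runs a repeated-measures ANOVA followed by Bonferroni-corrected pairwise t-tests, mirroring the reported analysis.

```python
# Sketch: repeated-measures ANOVA + Bonferroni-corrected pairwise t-tests
# over per-task accuracy scores for three LLMs. All numbers are illustrative.
import itertools
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
n_tasks = 45  # hypothetical number of evaluation items per model

# One accuracy score per (task, model); replace with real per-task results.
scores = {
    "GPT-4o":    rng.normal(0.893, 0.05, n_tasks),
    "Julius AI": rng.normal(0.857, 0.05, n_tasks),
    "Claude-3":  rng.normal(0.721, 0.08, n_tasks),
}
long = pd.DataFrame(
    [(t, m, s) for m, vals in scores.items() for t, s in enumerate(vals)],
    columns=["task", "model", "accuracy"],
)

# Repeated-measures ANOVA: the same tasks are evaluated by every model.
res = AnovaRM(long, depvar="accuracy", subject="task", within=["model"]).fit()
print(res.anova_table)

# Pairwise paired t-tests with Bonferroni correction (3 comparisons).
pairs = list(itertools.combinations(scores, 2))
for a, b in pairs:
    t, p = stats.ttest_rel(scores[a], scores[b])
    print(f"{a} vs {b}: t={t:.2f}, p_bonferroni={min(p * len(pairs), 1.0):.4g}")
```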

19 pages, 463 KB  
Review
Family Caregiver Burden in Providing Home Healthcare for Migrant Older Adults: A Scoping Review
by Areej Al-Hamad, Yasin M. Yasin, Lujain Yasin and Shrishti Kumar
Fam. Sci. 2026, 2(1), 2; https://doi.org/10.3390/famsci2010002 - 8 Jan 2026
Viewed by 86
Abstract
Background/Objectives: Family members are the principal providers of home-based care for migrant older adults. Linguistic, cultural, and structural barriers within health systems exacerbate the caregiver burden across emotional, physical and financial domains. Although home healthcare services may alleviate this burden, variability in access, cultural safety, and care coordination can also intensify it. This scoping review maps the evidence on the burden experienced by family caregivers who deliver home-based healthcare to migrant older adults and examines how these arrangements affect caregivers’ health and well-being. It synthesizes the literature on facilitators and barriers—including access, cultural-linguistic fit, coordination with formal services, and legal/immigration constraints—and distills implications for policy and practice to strengthen equitable, culturally responsive home care. Method: The Joanna Briggs Institute (JBI) scoping review framework was used to conduct the review. A comprehensive search was performed across six databases (CINAHL, Scopus, Web of Science, PsycINFO, MEDLINE and Sociological Abstracts) for articles published between 2000 and 2025. Studies were selected based on predefined inclusion criteria focusing on the family caregiver burden in providing home healthcare for migrant older adults. Data extraction and thematic analysis were conducted to identify key themes. Results: The review identified 20 studies across various geographical regions, highlighting four key themes: (1) Multidimensional Caregiver Burden, (2) The Influence of Gender, Family Hierarchy, and Migratory Trajectories on Caregiving, (3) Limited Access to Formal and Culturally Appropriate Support, and (4) Health Outcomes, Coping, and the Need for Community-Based Solutions. Conclusions: System-level reforms are required to advance equity in home healthcare for aging migrants. Priorities include establishing accountable cultural-safety training for providers; expanding multilingual access across intake, assessment, and follow-up; and formally recognizing and resourcing family caregivers (e.g., navigation support, respite, training, and financial relief). Investment in community-driven programs, frameworks and targeted outreach—co-designed with migrant communities—can mitigate isolation and improve uptake. While home healthcare is pivotal, structural inequities and cultural barriers continue to constrain equitable access. Addressing these gaps demands coordinated policy action, enhanced provider preparation, and culturally responsive care models. Future research should evaluate innovative frameworks that integrate community partnerships and culturally responsive practices to reduce the caregiver burden and improve outcomes for migrant families. Full article

20 pages, 657 KB  
Article
Evaluating Intralingual Machine Translation Quality: Application of an Adapted MQM Scheme to German Plain Language
by Silvana Deilen, Sergio Hernández Garrido, Ekaterina Lapshinova-Koltunski, Chris Maaß and Annie Werner
Information 2026, 17(1), 53; https://doi.org/10.3390/info17010053 - 6 Jan 2026
Viewed by 112
Abstract
This paper presents the results of a study in which we conducted a fine-grained error analysis for intralingual machine translations into Plain Language. As there are no established error schemes for intralingual translations, we adapted the MQM scheme to fit the purposes of intralingual translation and expanded it with error categories that are only relevant to intralingual translation. Our study has revealed that substantial differences exist between general-purpose and domain-specific models, with fine-tuned systems achieving notably higher accuracy and fewer severe errors across most categories. We found that across all four models, most errors occurred in the “Accuracy” category, closely followed by errors in the “Linguistic conventions” category, and that all evaluated models produced persistent issues, particularly in terms of accuracy, linguistic conventions, and alignment with the target audience. In addition, we identified subcategories from the MQM scheme that are primarily relevant to interlingual translation, such as “Textual conventions”. Furthermore, we found that manual error annotation is resource-intensive and subjective, highlighting the urgent need for the development of automatic or semi-automatic error annotation tools. We also discuss difficulties that arose in the annotation process and show how methodological limitations might be overcome in future studies. Our findings provide practical directions for improving both machine translation technology and quality assurance frameworks for intralingual translation into Plain Language. Full article
(This article belongs to the Special Issue Human and Machine Translation: Recent Trends and Foundations)
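
As a small illustration of the kind of fine-grained error accounting an adapted MQM scheme implies, the sketch below tallies annotated errors by category and severity and folds them into a per-text penalty. The category names, severity weights, and example annotations are hypothetical placeholders, not the authors' actual scheme.

```python
# Sketch: aggregate MQM-style error annotations by category and severity.
# Categories, severity weights, and example annotations are hypothetical.
from collections import Counter
from dataclasses import dataclass

SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}  # assumed weights

@dataclass
class ErrorAnnotation:
    category: str   # e.g. "Accuracy", "Linguistic conventions", "Audience fit"
    severity: str   # "minor" | "major" | "critical"
    span: str       # offending text span

annotations = [
    ErrorAnnotation("Accuracy", "major", "omitted condition"),
    ErrorAnnotation("Linguistic conventions", "minor", "nominal style retained"),
    ErrorAnnotation("Accuracy", "minor", "added detail"),
]

by_category = Counter(a.category for a in annotations)
penalty = sum(SEVERITY_WEIGHTS[a.severity] for a in annotations)

print("errors per category:", dict(by_category))
print("weighted penalty for this text:", penalty)
```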

35 pages, 3498 KB  
Article
PSYCH—Psychometric Assessment of Large Language Model Characters: An Exploration of the German Language
by Nane Kratzke, Niklas Beuter, André Drews and Monique Janneck
Analytics 2026, 5(1), 5; https://doi.org/10.3390/analytics5010005 - 6 Jan 2026
Viewed by 173
Abstract
Background: Existing evaluations of large language models (LLMs) largely emphasize linguistic and factual performance, while their psychometric characteristics and behavioral biases remain insufficiently examined, particularly beyond English-language contexts. This study presents a systematic psychometric screening of LLMs in German using the validated Big Five Inventory-2 (BFI-2). Methods: Thirty-two contemporary commercial and open-source LLMs completed all 60 BFI-2 items 60 times each (once with and once without having to justify their answers), yielding over 330,000 responses. Models answered independently, under male and female impersonation, and with and without required justifications. Responses were compared to German human reference data using Welch’s t-tests (p<0.01) to assess deviations, response stability, justification effects, and gender differences. Results: At the domain level, LLM personality profiles broadly align with human means. Facet-level analyses, however, reveal systematic deviations, including inflated agreement—especially in Agreeableness and Aesthetic Sensitivity—and reduced Negative Emotionality. Only a few models show minimal deviations. Justification prompts significantly altered responses in 56% of models, often increasing variability. Commercial models exhibited substantially higher response stability than open-source models. Gender impersonation affected up to 25% of BFI-2 items, reflecting and occasionally amplifying human gender differences. Conclusions: This study introduces a reproducible psychometric framework for benchmarking LLM behavior against validated human norms and shows that LLMs produce stable yet systematically biased personality-like response patterns. Psychometric screening could therefore complement traditional LLM evaluation in sensitive applications. Full article
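
A minimal sketch of the deviation test the abstract describes, assuming one array of repeated LLM ratings per BFI-2 item and the human reference mean, SD, and sample size are available; all numbers below are made up. It applies Welch's t-test at p < 0.01, as in the study, using summary statistics for the human side.

```python
# Sketch: Welch's t-test comparing repeated LLM ratings on one BFI-2 item
# against human reference statistics. All numbers are illustrative.
import numpy as np
from scipy.stats import ttest_ind_from_stats

ALPHA = 0.01  # significance level used in the study

# 60 repeated answers of one model to one item (1-5 Likert scale), hypothetical.
llm_ratings = np.array([4, 4, 5, 4, 4, 3, 4, 5, 4, 4] * 6)

# Hypothetical German human reference statistics for the same item.
human_mean, human_sd, human_n = 3.4, 0.9, 1000

t_stat, p_value = ttest_ind_from_stats(
    mean1=llm_ratings.mean(), std1=llm_ratings.std(ddof=1), nobs1=len(llm_ratings),
    mean2=human_mean, std2=human_sd, nobs2=human_n,
    equal_var=False,  # Welch's t-test
)
print(f"t = {t_stat:.2f}, p = {p_value:.3g}, deviates: {p_value < ALPHA}")
```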

16 pages, 258 KB  
Article
The Cosmic Extension of Fin: Aesthetics of Perceptual, Reflexive and Sensual Temporality in Nabokov’s Ada
by Juan Wu
Humanities 2026, 15(1), 10; https://doi.org/10.3390/h15010010 - 5 Jan 2026
Viewed by 135
Abstract
Fin-de-siècle decadence—marked by symbolism, dandyism, aesthetic withdrawal, and defiance of bourgeois norms—has long been reimagined beyond its original European contours. Vladimir Nabokov’s Ada or Ardor: A Family Chronicle exemplifies this transformation by extending decadent aesthetics into the domains of modern physics, perception, and experimental temporality. While Ada is often read as a retreat into aestheticism, this paper argues that Nabokov reconfigures decadence through a radical engagement with time, science, and sensual consciousness. Through Van Veen’s philosophical treatise “The Texture of Time”—a burlesque of Bergsonian introspection—Nabokov constructs a vision of purified, de-spatialized, and self-reflexive time that destabilises the boundary between decadent and modernist aesthetics. The novel fuses metaphysical decadence with Bergsonian duration, creating a poetic meditation on temporality as both perceptual and sensual experience. Through intricate linguistic play—anagrams, palindromes, and recursive narrative structures—Nabokov fashions a labyrinthine temporality that mirrors the paradoxes of the decadent imagination: time that is linear yet cyclical, finite yet infinitely recurrent. Positioning Ada within broader debates on the afterlife of decadence, this paper examines how Nabokov preserves the movement’s aesthetic essence while transforming it through scientific analogy and linguistic experimentation. Ada simultaneously honours and subverts decadence, reimagining its hedonism and nostalgia within a cosmological framework that renders temporality itself a site of aesthetic play and metaphysical desire. Full article
(This article belongs to the Special Issue The Use and Misuse of Fin-De-Siècle Decadence and Its Imagination)
17 pages, 1042 KB  
Article
Cross-Cultural Identification of Acoustic Voice Features for Depression: A Cross-Sectional Study of Vietnamese and Japanese Datasets
by Phuc Truong Vinh Le, Mitsuteru Nakamura, Masakazu Higuchi, Lanh Thi My Vuu, Nhu Huynh and Shinichi Tokuno
Bioengineering 2026, 13(1), 33; https://doi.org/10.3390/bioengineering13010033 - 27 Dec 2025
Viewed by 354
Abstract
Acoustic voice analysis demonstrates potential as a non-invasive biomarker for depression, yet its generalizability across languages remains underexplored. This cross-sectional study aimed to identify a set of cross-culturally consistent acoustic features for depression screening using distinct Vietnamese and Japanese voice datasets. We analyzed anonymized recordings from 251 participants, comprising 123 Vietnamese individuals assessed via the self-report Beck Depression Inventory (BDI) and 128 Japanese individuals assessed via the clinician-rated Hamilton Depression Rating Scale (HAM-D). From 6373 features extracted with openSMILE, a multi-stage selection pipeline identified 12 cross-cultural features, primarily from the auditory spectrum (AudSpec), Mel-Frequency Cepstral Coefficients (MFCCs), and logarithmic Harmonics-to-Noise Ratio (logHNR) domains. The cross-cultural model achieved a combined Area Under the Curve (AUC) of 0.934, with performance disparities observed between the Japanese (AUC = 0.993) and Vietnamese (AUC = 0.913) cohorts. This disparity may be attributed to dataset heterogeneity, including mismatched diagnostic tools and differing sample compositions (clinical vs. mixed community). Furthermore, the limited number of high-risk cases (n = 33) warrants cautious interpretation regarding the reliability of reported AUC values for severe depression classification. These findings suggest the presence of a core acoustic signature related to physiological psychomotor changes that may transcend linguistic boundaries. This study advances the exploration of global vocal biomarkers but underscores the need for prospective, standardized multilingual trials to overcome the limitations of secondary data analysis. Full article
(This article belongs to the Special Issue Voice Analysis Techniques for Medical Diagnosis)
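
To make the screening step concrete, here is a heavily simplified sketch, assuming a matrix of the 12 selected acoustic features and binary depression labels are already in hand; the openSMILE extraction and the actual feature-selection pipeline are not shown, and the logistic-regression classifier is an assumption rather than the paper's model. It fits a simple model and reports the AUC, the metric quoted above.

```python
# Sketch: train/evaluate a screening model on pre-selected acoustic features
# and report AUC. Data are synthetic placeholders, not the study datasets.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(42)
n_speakers, n_features = 251, 12          # 12 cross-cultural features
X = rng.normal(size=(n_speakers, n_features))
y = rng.integers(0, 2, size=n_speakers)   # 1 = high depression risk (placeholder)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"AUC = {auc:.3f}")
```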

13 pages, 284 KB  
Article
Two-Stage Domain Adaptation for LLM-Based ASR by Decoupling Linguistic and Acoustic Factors
by Lin Zheng, Xuyang Wang, Qingwei Zhao and Ta Li
Appl. Sci. 2026, 16(1), 60; https://doi.org/10.3390/app16010060 - 20 Dec 2025
Viewed by 303
Abstract
Large language models (LLMs) have been increasingly applied in Automatic Speech Recognition (ASR), achieving significant advancements. However, the performance of LLM-based ASR (LLM-ASR) models remains unsatisfactory when applied across domains due to domain shifts between acoustic and linguistic conditions. To address this challenge, we propose a decoupled two-stage domain adaptation framework that separates the adaptation process into text-only and audio-only stages. In the first stage, we leverage abundant text data from the target domain to refine the LLM component, thereby improving its contextual and linguistic alignment with the target domain. In the second stage, we employ a pseudo-labeling method with unlabeled audio data in the target domain and introduce two key enhancements: (1) incorporating decoupled auxiliary Connectionist Temporal Classification (CTC) loss to improve the robustness of the speech encoder under different acoustic conditions; (2) adopting a synchronous LLM tuning strategy, allowing the LLM to continuously learn linguistic alignment from pseudo-labeled transcriptions enriched with domain textual knowledge. The experimental results demonstrate that our proposed methods significantly improve the performance of LLM-ASR in the target domain, achieving a relative word error rate reduction of 19.2%. Full article
(This article belongs to the Special Issue Speech Recognition: Techniques, Applications and Prospects)
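
The headline figure above is a relative word error rate reduction. As a quick reference for how such a number is computed, here is a small sketch with a word-level edit-distance WER and the relative-reduction formula; the reference/hypothesis strings and WER values are examples, not the paper's data.

```python
# Sketch: word error rate via edit distance, plus the relative reduction
# used to report domain-adaptation gains. Strings are illustrative only.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

def relative_reduction(baseline: float, adapted: float) -> float:
    return (baseline - adapted) / baseline

base_wer = wer("turn on the living room lights", "turn on living room light")
adapted_wer = wer("turn on the living room lights", "turn on the living room light")
print(f"baseline WER={base_wer:.3f}, adapted WER={adapted_wer:.3f}, "
      f"relative reduction={relative_reduction(base_wer, adapted_wer):.1%}")
```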

36 pages, 1158 KB  
Article
A Novel Linguistic Framework for Dynamic Multi-Criteria Group Decision-Making Using Hedge Algebras
by Hoang Van Thong, Luu Quoc Dat, Nguyen Cat Ho and Nhu Van Kien
Appl. Sci. 2026, 16(1), 30; https://doi.org/10.3390/app16010030 - 19 Dec 2025
Viewed by 253
Abstract
Dynamic multi-criteria group decision-making (MCGDM) is widely applied in complex real-world settings where multiple experts evaluate alternatives across diverse criteria under uncertain and evolving conditions. This study proposes a transparent and interpretable linguistic (L-) framework for dynamic MCGDM grounded in hedge algebras (HA), a mathematical formalism that provides explicit algebraic and semantic structures for L-domains. A novel binary L-aggregation operator is developed using the 4-tuple semantic representation of HA, ensuring closure, commutativity, monotonicity, partial associativity, the existence of an identity element, and semantic consistency throughout the aggregation process. Using this operator, a two-stage dynamic decision-making model is developed—(i) L-FAHP for determining the criterion weights in dynamic environments, and (ii) L-FTOPSIS for ranking alternatives—where both methods are formulated on HA L-scales. To address temporal dynamics, a dynamic L-aggregation mechanism is further proposed to integrate current expert judgments with historical evaluations through a semantic decay factor, enabling the controlled attenuation of outdated information. A case study on enterprise digital transformation readiness illustrates that the proposed framework enhances semantic interpretability, maintains stability under uncertainty, and more accurately captures the temporal evolution of expert assessments. These results underscore the practical value and applicability of the HA-based dynamic L-approach in complex decision environments where expert knowledge and temporal variability are critical. Full article
(This article belongs to the Section Electrical, Electronics and Communications Engineering)
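
The temporal mechanism described above can be illustrated with a deliberately crude sketch: linguistic assessments are mapped to numeric semantic values in [0, 1] (a simplification standing in for the paper's 4-tuple hedge-algebra semantics, which is not reproduced here), and older evaluation periods are attenuated by a decay factor before aggregation. The term values and the decay constant are assumptions.

```python
# Sketch: decay-weighted aggregation of linguistic assessments over time.
# The term-to-value map is a crude stand-in for hedge-algebra semantics.
TERM_VALUE = {                 # assumed semantic values in [0, 1]
    "very low": 0.1, "low": 0.3, "medium": 0.5, "high": 0.7, "very high": 0.9,
}

def aggregate_over_time(history: list[list[str]], decay: float = 0.8) -> float:
    """Aggregate per-period expert judgments, attenuating older periods.

    history[-1] is the current period; history[0] is the oldest.
    `decay` in (0, 1] controls how quickly outdated information fades.
    """
    weighted_sum, weight_total = 0.0, 0.0
    latest = len(history) - 1
    for t, judgments in enumerate(history):
        w = decay ** (latest - t)                 # older periods weigh less
        period_mean = sum(TERM_VALUE[j] for j in judgments) / len(judgments)
        weighted_sum += w * period_mean
        weight_total += w
    return weighted_sum / weight_total

history = [["medium", "high"], ["high", "high"], ["very high", "high"]]
print(f"aggregated readiness score: {aggregate_over_time(history):.3f}")
```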

20 pages, 597 KB  
Article
The Language of Numbers: Reading Comprehension and Applied Math Problem-Solving
by Dana Sury and Lia Pilchin
Behav. Sci. 2025, 15(12), 1746; https://doi.org/10.3390/bs15121746 - 17 Dec 2025
Viewed by 679
Abstract
Reading and mathematics are intricately linked through shared cognitive processes that underpin developmental relationships across domains. Despite extensive research on early-grade links between reading and basic arithmetic, gaps persist in understanding how reading comprehension (RC) supports applied math problem-solving (AMP) in older students and non-English contexts. The current study investigates the grade-level relationship between RC and AMP in typically developing Hebrew-speaking fourth (N = 41) and eleventh graders (N = 43), focusing on the contributions of working memory (WM), reading fluency, and arithmetic fluency. Results indicated significant positive associations between RC and AMP in both age groups. In fourth graders, arithmetic fluency partially statistically mediated the RC-AMP relationship in a cross-sectional mediation model. This indicates that students rely on computational proficiency to translate textual understanding into solutions. In contrast, eleventh graders exhibited a direct RC-AMP link, reflecting advanced comprehension and metacognitive strategies as computational skills are automatized. WM showed stronger correlations with RC and AMP among younger students, whereas these associations were weaker in older students. These findings support a Developmental Linguistic–Cognitive Scaffold Model, highlighting age-related shifts in cognitive and linguistic mechanisms supporting AMP. The results emphasize the need for integrated curricula incorporating RC strategies to enhance mathematical reasoning, particularly in morphologically rich languages like Hebrew. Full article
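
A compact sketch of the regression-based mediation logic the abstract refers to (reading comprehension to arithmetic fluency to applied problem-solving), assuming standardized scores are available. The synthetic data and the simple total/direct/indirect decomposition below are illustrative, not the study's actual model or estimates.

```python
# Sketch: regression-based mediation (RC -> arithmetic fluency -> AMP).
# Synthetic standardized data; coefficients are illustrative only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 41  # fourth-grade sample size reported in the study
rc = rng.normal(size=n)                                  # reading comprehension
fluency = 0.5 * rc + rng.normal(scale=0.8, size=n)       # arithmetic fluency
amp = 0.3 * rc + 0.4 * fluency + rng.normal(scale=0.7, size=n)

# Path a: RC -> mediator (arithmetic fluency)
a = sm.OLS(fluency, sm.add_constant(rc)).fit().params[1]
# Paths b and c': AMP regressed on RC and the mediator together
bc = sm.OLS(amp, sm.add_constant(np.column_stack([rc, fluency]))).fit().params
c_prime, b = bc[1], bc[2]
# Total effect c: AMP regressed on RC alone
c = sm.OLS(amp, sm.add_constant(rc)).fit().params[1]

print(f"total effect c = {c:.3f}, direct c' = {c_prime:.3f}, "
      f"indirect a*b = {a * b:.3f}")
```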

36 pages, 8767 KB  
Article
AI-Powered Multimodal System for Haiku Appreciation Based on Intelligent Data Analysis: Validation and Cross-Cultural Extension Potential
by Renjie Fan and Yuanyuan Wang
Electronics 2025, 14(24), 4921; https://doi.org/10.3390/electronics14244921 - 15 Dec 2025
Viewed by 348
Abstract
This study proposes an artificial intelligence (AI)-powered multimodal system designed to enhance the appreciation of traditional poetry, using Japanese haiku as the primary application domain. At the core of the system is an intelligent data analysis pipeline that extracts key emotional features from poetic texts. A fine-tuned Japanese BERT model is employed to compute three affective indices—valence, energy, and dynamism—which form a quantitative emotional representation of each haiku. These features guide a generative AI workflow: ChatGPT constructs structured image prompts based on the extracted affective cues and contextual information, and these prompts are used by DALL·E to synthesize stylistically consistent watercolor illustrations. Simultaneously, background music is automatically selected from an open-source collection by matching each poem’s affective vector with that of instrumental tracks, producing a coherent multimodal (text, image, sound) experience. A series of validation experiments demonstrated the reliability and stability of the extracted emotional features, as well as their effectiveness in supporting consistent cross-modal alignment. These results indicate that poetic emotion can be represented within a low-dimensional affective space and used as a bridge across linguistic and artistic modalities. The proposed framework illustrates a novel integration of affective computing and natural language processing (NLP) within cultural computing. Because the underlying emotional representation is linguistically agnostic, the system holds strong potential for cross-cultural extensions, including applications to Chinese classical poetry and other forms of traditional literature. Full article
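
The music-selection step can be illustrated with a short sketch: each poem and each instrumental track is represented by a (valence, energy, dynamism) vector, and the track whose vector is closest to the poem's is chosen. The vectors and track names below are invented placeholders, and cosine similarity is one plausible matching rule, not necessarily the exact one the system uses.

```python
# Sketch: match a haiku's affective vector (valence, energy, dynamism)
# to the closest instrumental track. Vectors and names are placeholders.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

haiku_vec = np.array([0.62, 0.35, 0.48])      # e.g. output of the BERT pipeline

track_library = {                              # hypothetical open-source tracks
    "quiet_koto.ogg":   np.array([0.70, 0.30, 0.40]),
    "storm_drums.ogg":  np.array([0.20, 0.90, 0.85]),
    "spring_flute.ogg": np.array([0.80, 0.45, 0.55]),
}

best_track = max(track_library, key=lambda t: cosine(haiku_vec, track_library[t]))
print("selected background music:", best_track)
```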

32 pages, 1073 KB  
Article
Cross-Linguistic Moral Preferences in Large Language Models: Evidence from Distributive Justice Scenarios and Domain Persona Interventions
by Seongyu Jang, Chaewon Jeong, Jimin Kim and Hyungu Kahng
Electronics 2025, 14(24), 4919; https://doi.org/10.3390/electronics14244919 - 15 Dec 2025
Viewed by 492
Abstract
Large language models (LLMs) increasingly serve as decision-support systems across linguistically diverse populations, yet whether they reason consistently across languages remains underexplored. We investigate whether LLMs exhibit language-dependent preferences in distributive justice scenarios and whether domain persona prompting can reduce cross-linguistic inconsistencies. Using six behavioral economics scenarios adapted from canonical social preferences research, we evaluate Gemini 2.0 Flash across English and Korean in both baseline and persona-injected conditions, yielding 1,201,200 observations across ten professional domains. Results reveal substantial baseline cross-linguistic divergence: five of six scenarios exhibit significant language effects (9–56 percentage point gaps), including complete preference reversals. Domain persona injection reduces these gaps by 62.7% on average, with normative disciplines (sociology, economics, law, philosophy, and history) demonstrating greater effectiveness than technical domains. Systematic boundary conditions emerge: scenarios presenting isolated ethical conflict resist intervention. These findings parallel human foreign-language effects in moral psychology while demonstrating that computational agents are more amenable to alignment interventions. We propose a compensatory integration framework explaining when professional framing succeeds or fails, providing practical guidance for multilingual LLM deployment, and establishing cross-linguistic consistency as a critical alignment metric. Full article
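
The reported gap-reduction figure can be made concrete with a small sketch: for each scenario, the cross-linguistic gap is the absolute difference between the English and Korean preference rates, and the persona effect is the percentage by which that gap shrinks. The rates below are invented, not the paper's results.

```python
# Sketch: cross-linguistic preference gap and its reduction under a persona.
# Preference rates (share choosing the egalitarian option) are invented.
def gap(rate_en: float, rate_ko: float) -> float:
    return abs(rate_en - rate_ko)

baseline = {"en": 0.71, "ko": 0.28}            # baseline condition
with_persona = {"en": 0.64, "ko": 0.49}        # e.g. a hypothetical "economist" persona

g0 = gap(baseline["en"], baseline["ko"])
g1 = gap(with_persona["en"], with_persona["ko"])
reduction = (g0 - g1) / g0

print(f"baseline gap = {g0 * 100:.0f} pp, persona gap = {g1 * 100:.0f} pp, "
      f"reduction = {reduction:.1%}")
```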

14 pages, 2851 KB  
Article
Automated Building of a Multidialectal Parallel Arabic Corpus Using Large Language Models
by Khalid Almeman
Data 2025, 10(12), 208; https://doi.org/10.3390/data10120208 - 12 Dec 2025
Viewed by 674
Abstract
The development of Natural Language Processing applications tailored for diverse Arabic-speaking users requires specialized Arabic corpora, which are currently lacking in existing Arabic linguistic resources. Therefore, in this study, a multidialectal parallel Arabic corpus is built, focusing on the travel and tourism domain. By leveraging the text generation and dialectal transformation capabilities of Large Language Models, an initial set of approximately 100,000 parallel sentences was generated. Following a rigorous multi-stage deduplication process, 50,010 unique parallel sentences were obtained from Modern Standard Arabic (MSA) and five major Arabic dialects—Saudi, Egyptian, Iraqi, Levantine, and Moroccan. This study presents the detailed methodology of corpus generation and refinement, describes the characteristics of the generated corpus, and provides a comprehensive statistical analysis highlighting the corpus size, lexical diversity, and linguistic overlap between MSA and the five dialects. This corpus represents a valuable resource for researchers and developers in Arabic dialect processing and AI applications that require nuanced contextual understanding. Full article
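
As a hedged illustration of the multi-stage deduplication the abstract mentions, the sketch below normalizes parallel rows (whitespace collapsing and basic Arabic diacritic stripping) and removes exact duplicates. The normalization rules are assumptions and deliberately simpler than the paper's pipeline.

```python
# Sketch: normalize and exact-deduplicate parallel MSA/dialect rows.
# The normalization rules here are simplified assumptions.
import re
import unicodedata

ARABIC_DIACRITICS = re.compile(r"[\u0610-\u061A\u064B-\u065F\u0670]")

def normalize(text: str) -> str:
    text = unicodedata.normalize("NFC", text)
    text = ARABIC_DIACRITICS.sub("", text)        # drop short vowels / marks
    return re.sub(r"\s+", " ", text).strip()

def deduplicate(rows: list[dict[str, str]]) -> list[dict[str, str]]:
    """Keep the first occurrence of each normalized parallel row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(normalize(v) for _, v in sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

rows = [
    {"msa": "أين المطار؟", "egy": "فين المطار؟"},
    {"msa": "أين  المطار؟", "egy": "فين المطار؟"},   # whitespace variant
]
print(f"{len(rows)} rows -> {len(deduplicate(rows))} unique rows")
```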

35 pages, 2974 KB  
Article
Multi-Agent Coordination Strategies vs. Retrieval-Augmented Generation in LLMs: A Comparative Evaluation
by Irina Radeva, Ivan Popchev, Lyubka Doukovska and Miroslava Dimitrova
Electronics 2025, 14(24), 4883; https://doi.org/10.3390/electronics14244883 - 11 Dec 2025
Viewed by 901
Abstract
This paper evaluates multi-agent coordination strategies against single-agent retrieval-augmented generation (RAG) for open-source language models. Four coordination strategies (collaborative, sequential, competitive, hierarchical) were tested across Mistral 7B, Llama 3.1 8B, and Granite 3.2 8B using 100 domain-specific question–answer pairs (3100 total evaluations). Performance was assessed using Composite Performance Score (CPS) and Threshold-aware CPS (T-CPS), aggregating nine metrics spanning lexical, semantic, and linguistic dimensions. Under the tested conditions, all 28 multi-agent configurations showed degradation relative to single-agent baselines, ranging from −4.4% to −35.3%. Coordination overhead was identified as a primary contributing factor. Llama 3.1 8B tolerated Sequential and Hierarchical coordination with minimal degradation (−4.9% to −5.3%). Mistral 7B with shared context retrieval achieved comparable results. Granite 3.2 8B showed degradation of 14–35% across all strategies. Collaborative coordination exhibited the largest degradation across all models. Study limitations include evaluation on a single domain (agriculture), use of 7–8B parameter models, and homogeneous agent architectures. These findings suggest that single-agent RAG may be preferable for factual question-answering tasks in local deployment scenarios with computational constraints. Future research should explore larger models, heterogeneous agent teams, role-specific prompting, and advanced consensus mechanisms. Full article
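
To clarify how a composite score such as CPS can aggregate heterogeneous metrics, here is a minimal sketch that takes a weighted mean of nine per-answer metrics already scaled to [0, 1] and then computes the relative degradation of a multi-agent configuration against the single-agent baseline. The metric names, weights, and values are invented, not the paper's CPS definition.

```python
# Sketch: weighted composite score over nine [0, 1] metrics, plus relative
# degradation vs. a single-agent RAG baseline. Names/values are invented.
METRIC_WEIGHTS = {          # assumed equal weights over nine metrics
    "rouge_l": 1, "bleu": 1, "bertscore": 1, "answer_relevance": 1,
    "faithfulness": 1, "completeness": 1, "fluency": 1,
    "grammaticality": 1, "lexical_diversity": 1,
}

def composite_score(metrics: dict[str, float]) -> float:
    total_w = sum(METRIC_WEIGHTS.values())
    return sum(METRIC_WEIGHTS[k] * metrics[k] for k in METRIC_WEIGHTS) / total_w

single_agent = dict(zip(METRIC_WEIGHTS,
                        [0.62, 0.48, 0.86, 0.81, 0.77, 0.70, 0.88, 0.90, 0.64]))
multi_agent  = dict(zip(METRIC_WEIGHTS,
                        [0.55, 0.41, 0.82, 0.74, 0.69, 0.66, 0.85, 0.88, 0.60]))

base, multi = composite_score(single_agent), composite_score(multi_agent)
print(f"single-agent composite score: {base:.3f}")
print(f"multi-agent  composite score: {multi:.3f}  "
      f"(degradation: {(multi - base) / base:+.1%})")
```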

21 pages, 2065 KB  
Article
Reading and Writing Abilities in Students with Mild Nonspecific Intellectual Disability: A Multivariate Examination of Literacy and Cognitive Processing Abilities
by Urszula Sajewicz-Radtke, Ariadna Beata Łada-Maśko, Paweł Jurek, Michał Olech and Bartosz Mikołaj Radtke
J. Intell. 2025, 13(12), 161; https://doi.org/10.3390/jintelligence13120161 - 8 Dec 2025
Viewed by 562
Abstract
Individuals with mild nonspecific intellectual disability (NSID) often exhibit delayed literacy development, yet how cognitive–linguistic processing profiles influence literacy in this population remains unclear. This study investigated literacy development in this population, considering the cognitive–linguistic mechanisms. The Specialist Battery for the Diagnosis of Cognitive Abilities and School Skills was used to assess cognitive–linguistic abilities and literacy-related skills in 122 participants. Fuzzy C-means clustering was used to identify processing profiles. Developmental age equivalents in literacy were estimated using local regression models and matched comparisons with typically developing peers. Two cognitive–linguistic profiles emerged: globally weaker and moderately developed. Those with NSID performed significantly lower than their peers in all domains. Their literacy skills aligned with those of children 2–4 years younger and plateaued after age 15. Cognitive–linguistic heterogeneity in students with NSID should guide targeted literacy interventions. The findings inform ICD-11 educational expectations for individuals with mild NSID. Full article
(This article belongs to the Special Issue Intelligence Testing and Its Role in Academic Achievement)
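
As a stand-in for the clustering step described above, here is a compact, self-contained fuzzy C-means pass over standardized cognitive–linguistic scores. The data are synthetic, the two-cluster setting mirrors the two reported profiles, and the implementation is deliberately minimal rather than the study's actual procedure.

```python
# Sketch: minimal fuzzy C-means over synthetic standardized scores,
# illustrating how soft processing profiles could be identified.
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, n_iter=100, seed=0):
    """Return (centers, membership) for c fuzzy clusters (fuzzifier m > 1)."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), c))
    u /= u.sum(axis=1, keepdims=True)            # memberships sum to 1 per sample
    for _ in range(n_iter):
        um = u ** m
        centers = (um.T @ X) / um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        u = 1.0 / (dist ** (2 / (m - 1)))        # standard FCM membership update
        u /= u.sum(axis=1, keepdims=True)
    return centers, u

rng = np.random.default_rng(1)
# 122 participants x 5 standardized cognitive-linguistic scores (synthetic).
X = np.vstack([rng.normal(-0.8, 0.5, (60, 5)), rng.normal(0.3, 0.5, (62, 5))])
centers, membership = fuzzy_cmeans(X, c=2)
profiles = membership.argmax(axis=1)
print("profile sizes:", np.bincount(profiles))
```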