Search Results (79)

Search Parameters:
Keywords = linguistic style

15 pages, 336 KiB  
Article
Mitigation, Rapport, and Identity Construction in Workplace Requests
by Spyridoula Bella
Languages 2025, 10(8), 179; https://doi.org/10.3390/languages10080179 - 25 Jul 2025
Viewed by 289
Abstract
This study investigates how Greek professionals formulate upward requests and simultaneously manage rapport and workplace identity within hierarchical exchanges. The data comprise 400 written requests elicited through a discourse-completion task from 100 participants, supplemented by follow-up interviews. Integrating pragmatic perspectives on request mitigation with Spencer-Oatey’s Rapport-Management model and a social constructionist perspective on identity, the analysis reveals a distinctive “direct-yet-mitigated” style: syntactically direct head acts (typically want- or need-statements) combined with various mitigating devices. This mitigation enables speakers to preserve superiors’ face, assert entitlement, and invoke shared corporate goals in a single move. Crucially, rapport work is intertwined with identity construction. Strategic oscillation between deference and entitlement projects four recurrent professional personae: the deferential subordinate, the competent and deserving employee, the cooperative team-player, and the rights-aware negotiator. Speakers shift among these personae to calibrate relational distance, demonstrating that rapport management functions not merely as a politeness calculus but as a resource for dynamic identity performance. This study thus bridges micro-pragmatic choices and macro social meanings, showing how linguistic mitigation safeguards interpersonal harmony while scripting desirable workplace selves.
(This article belongs to the Special Issue Greek Speakers and Pragmatics)
19 pages, 3338 KiB  
Article
Researching Stylistic Neutrality for Map Evaluation
by Rita Viliuviene and Sonata Vdovinskiene
ISPRS Int. J. Geo-Inf. 2025, 14(7), 278; https://doi.org/10.3390/ijgi14070278 - 16 Jul 2025
Viewed by 169
Abstract
Stylistic neutrality is the basis for the stylistic evaluation of maps. Furthermore, the stylistic neutrality of a map as a cartographic text may be related to objectivity. However, what constitutes stylistic neutrality is not clearly stated in the field of cartography. The problem is complicated by the fact that the stylistically neutral image is a hypothetical image. The aim of this research is to investigate stylistic neutrality by exploring the peculiarities of cartographic language functioning in different fields of social activity. The research combines descriptive analysis, stylistic analysis, cartographic and interpretative methods. Firstly, the research reveals the concept of cartographic stylistic neutrality, in line with the cartographic linguistic paradigm. Secondly, an analysis of the characteristics of cartographic language in different fields of social activity from the point of view of stylistic neutrality is carried out. Thirdly, an example is developed to illustrate stylistic cartographic neutrality. Stylistic neutrality is characterised by the stylistic features of cartographic language: clarity, accuracy, conciseness, calmness, abstractness, temperance, neutrality and moderateness. The style of cartographic production for inventory and research activities is closest to stylistic neutrality, while the style of reflective activity is the most expressive and acts as a source of concreteness for stylistic neutrality.

19 pages, 1186 KiB  
Article
Synthetic Patient–Physician Conversations Simulated by Large Language Models: A Multi-Dimensional Evaluation
by Syed Ali Haider, Srinivasagam Prabha, Cesar Abraham Gomez-Cabello, Sahar Borna, Ariana Genovese, Maissa Trabilsy, Bernardo G. Collaco, Nadia G. Wood, Sanjay Bagaria, Cui Tao and Antonio Jorge Forte
Sensors 2025, 25(14), 4305; https://doi.org/10.3390/s25144305 - 10 Jul 2025
Viewed by 597
Abstract
Background: Data accessibility remains a significant barrier in healthcare AI due to privacy constraints and logistical challenges. Synthetic data, which mimics real patient information while remaining both realistic and non-identifiable, offers a promising solution. Large Language Models (LLMs) create new opportunities to generate high-fidelity clinical conversations between patients and physicians. However, the value of this synthetic data depends on careful evaluation of its realism, accuracy, and practical relevance. Objective: To assess the performance of four leading LLMs (ChatGPT 4.5, ChatGPT 4o, Claude 3.7 Sonnet, and Gemini 2.5 Pro) in generating synthetic transcripts of patient–physician interactions in plastic surgery scenarios. Methods: Each model generated transcripts for ten plastic surgery scenarios. Transcripts were independently evaluated by three clinically trained raters using a seven-criterion rubric: Medical Accuracy, Realism, Persona Consistency, Fidelity, Empathy, Relevancy, and Usability. Raters were blinded to the model identity to reduce bias. Each criterion was rated on a 5-point Likert scale, yielding 840 total evaluations. Descriptive statistics were computed, and a two-way repeated measures ANOVA was used to test for differences across models and metrics. In addition, transcripts were analyzed using automated linguistic and content-based metrics. Results: All models achieved strong performance, with mean ratings exceeding 4.5 across all criteria. Gemini 2.5 Pro received perfect mean scores (5.00 ± 0.00) in Medical Accuracy, Realism, Persona Consistency, Relevancy, and Usability. Claude 3.7 Sonnet matched these scores in Persona Consistency and Relevancy and led in Empathy (4.96 ± 0.18). ChatGPT 4.5 also achieved perfect scores in Relevancy, with high scores in Empathy (4.93 ± 0.25) and Usability (4.96 ± 0.18). ChatGPT 4o demonstrated consistently strong but slightly lower performance across most dimensions. ANOVA revealed no statistically significant differences across models (F(3, 6) = 0.85, p = 0.52). Automated analysis showed substantial variation in transcript length, style, and content richness: Gemini 2.5 Pro generated the longest and most emotionally expressive dialogues, while ChatGPT 4o produced the shortest and most concise outputs. Conclusions: Leading LLMs can generate medically accurate, emotionally appropriate synthetic dialogues suitable for educational and research use. Despite high performance, demographic homogeneity in generated patients highlights the need for improved diversity and bias mitigation in model outputs. These findings support the cautious, context-aware integration of LLM-generated dialogues into medical training, simulation, and research.
(This article belongs to the Special Issue Feature Papers in Smart Sensing and Intelligent Sensors 2025)
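The mean ± SD figures quoted above are plain descriptive statistics over the 5-point Likert ratings; a minimal sketch of that reporting style (the model names and rater scores below are illustrative, not the study's data):

```python
import statistics

def summarize_ratings(ratings):
    """Mean and sample standard deviation per model, rounded to two
    decimals, matching the 'mean +/- SD' style of reporting above."""
    out = {}
    for model, scores in ratings.items():
        mean = statistics.fmean(scores)
        sd = statistics.stdev(scores) if len(scores) > 1 else 0.0
        out[model] = (round(mean, 2), round(sd, 2))
    return out

# Hypothetical ratings from three blinded raters on one criterion.
demo = {"model_a": [5, 5, 5], "model_b": [5, 5, 4]}
print(summarize_ratings(demo))  # {'model_a': (5.0, 0.0), 'model_b': (4.67, 0.58)}
```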

20 pages, 4254 KiB  
Article
Positional Component-Guided Hangul Font Image Generation via Deep Semantic Segmentation and Adversarial Style Transfer
by Avinash Kumar, Irfanullah Memon, Abdul Sami, Youngwon Jo and Jaeyoung Choi
Electronics 2025, 14(13), 2699; https://doi.org/10.3390/electronics14132699 - 4 Jul 2025
Viewed by 400
Abstract
Automated font generation for complex, compositional scripts like Korean Hangul presents a significant challenge due to the 11,172 characters and their complicated component-based structure. While existing component-based methods for font image generation acknowledge the compositional nature of Hangul, they often fail to explicitly leverage the crucial positional semantics of its basic elements as initial, middle, and final components, known as Jamo. This oversight can lead to structural inconsistencies and artifacts in the generated glyphs. This paper introduces a novel two-stage framework that directly addresses this gap by imposing a strong, linguistically informed structural principle on the font image generation process. In the first stage, we employ a You Only Look Once version 8 for Segmentation (YOLOv8-Seg) model, a state-of-the-art instance segmentation network, to decompose Hangul characters into their basic components. Notably, this process generates a dataset of position-aware semantic components, categorizing each jamo according to its structural role within the syllabic block. In the second stage, a conditional Generative Adversarial Network (cGAN) is explicitly conditioned on these extracted positional components to perform style transfer with high structural information. The generator learns to synthesize a character’s appearance by referencing the style of the target components while preserving the content structure of a source character. Our model achieves state-of-the-art performance, reducing L1 loss to 0.2991 and improving the Structural Similarity Index (SSIM) to 0.9798, quantitatively outperforming existing methods like MX-Font and CKFont. This position-guided approach demonstrates significant quantitative and qualitative improvements over existing methods in structured script generation, offering enhanced control over glyph structure and a promising approach for generating font images for other complex, structured scripts.
(This article belongs to the Special Issue Applications of Computer Vision, 3rd Edition)
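The L1 loss reported above (0.2991) is the mean absolute per-pixel difference between a generated glyph image and its ground truth; a dependency-free sketch with toy 2×2 images (pixel values are illustrative):

```python
def l1_loss(img_a, img_b):
    """Mean absolute per-pixel difference between two equally sized
    grayscale images (pixel values in [0, 1]); lower is better."""
    total, count = 0.0, 0
    for row_a, row_b in zip(img_a, img_b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count

generated = [[0.0, 1.0], [0.5, 0.0]]
target = [[0.0, 1.0], [1.0, 0.0]]
print(l1_loss(generated, target))  # 0.125
```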

27 pages, 1023 KiB  
Article
Exploring Legislative Textual Data in Brazilian Portuguese: Readability Analysis and Knowledge Graph Generation
by Gisliany Lillian Alves de Oliveira, Breno Santana Santos, Marianne Silva and Ivanovitch Silva
Data 2025, 10(7), 106; https://doi.org/10.3390/data10070106 - 1 Jul 2025
Viewed by 562
Abstract
Legislative documents are crucial to democratic societies, defining the legal framework for social life. In Brazil, legislative texts are particularly complex due to extensive technical jargon, intricate sentence structures, and frequent references to prior legislation. The country’s civil law tradition and multicultural context introduce further interpretative and linguistic challenges. Moreover, the study of Brazilian Portuguese legislative texts remains underexplored, lacking legal-specific models and datasets. To address these gaps, this work proposes a data-driven approach utilizing large language models (LLMs) to analyze these documents and extract knowledge graphs (KGs). A case study was conducted using 1869 proposals from the Legislative Assembly of Rio Grande do Norte (ALRN), spanning January 2019 to April 2024. The Llama 3.2 3B Instruct model was employed to extract KGs representing entities and their relationships. The findings support the method’s effectiveness in producing coherent graphs faithful to the original content. Nevertheless, challenges remain in resolving entity ambiguity and achieving full relationship coverage. Additionally, readability analyses using metrics for Brazilian Portuguese revealed that ALRN proposals require superior reading skills due to their technical style. Ultimately, this study advances legal artificial intelligence by providing insights into Brazilian legislative texts and promoting transparency and accessibility through natural language processing techniques.
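The readability analysis mentioned above can be illustrated with a Flesch-style index; the coefficients below follow a commonly cited Brazilian Portuguese adaptation, and the vowel-run syllable counter is a rough approximation, so treat both as assumptions rather than the paper's exact metric:

```python
import re

def flesch_pt(text):
    """Flesch Reading Ease with coefficients often quoted for the
    Brazilian Portuguese adaptation (assumed here, not taken from
    the paper).  Syllables are approximated as runs of vowels."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-zÀ-ÿ]+", text)
    syllables = sum(
        len(re.findall(r"[aeiouáéíóúâêôãõàü]+", w.lower())) or 1
        for w in words
    )
    asl = len(words) / len(sentences)   # average sentence length
    asw = syllables / len(words)        # average syllables per word
    return 248.835 - 1.015 * asl - 84.6 * asw

print(round(flesch_pt("A lei vale."), 2))  # 132.99
```

Higher scores indicate easier text; dense legislative prose drives both the sentence length and syllable terms up, pushing the score down.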

22 pages, 1291 KiB  
Article
Linguistic Summarization and Outlier Detection of Blended Learning Data
by Pham Dinh Phong, Pham Thi Lan and Tran Xuan Thanh
Appl. Sci. 2025, 15(12), 6644; https://doi.org/10.3390/app15126644 - 13 Jun 2025
Viewed by 472
Abstract
Linguistic summarization of data is an active research trend in data mining because it has many useful practical applications. It aims to extract an optimal set of linguistic summaries from numeric data. The blended learning format is now popular in higher education at both undergraduate and graduate levels. Many machine learning techniques, such as classification, regression, clustering, and forecasting, have been applied to evaluate learning activities or predict the learning outcomes of students. However, few studies have examined transforming the data of blended learning courses into knowledge represented as linguistic summaries. This paper proposes a method of linguistic summarization of blended learning data collected from a learning management system to extract compact sets of interpretable linguistic summaries for understanding the common rules of blended learning courses by utilizing enlarged hedge algebras. The extracted linguistic summaries, expressed as natural language sentences, are easy for humans to understand. Furthermore, a method of detecting exceptional cases or outliers of the learning courses, based on linguistic summaries expressing common rules in different scenarios, is also proposed. The experimental results on two real-world datasets from two courses, Discrete Mathematics and Introduction to Computer Science, show that the proposed methods have promising practical applications. They can help students and lecturers find the best way to enhance their learning methods and teaching style.
(This article belongs to the Section Computing and Artificial Intelligence)
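The paper builds its summaries with enlarged hedge algebras; as a generic illustration of the underlying idea, here is the classic Yager-style truth degree for a protoform summary "Q records are S" (the membership functions and the login example are illustrative assumptions, not the paper's construction):

```python
def tri(x, a, b, c):
    """Triangular fuzzy membership function over [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def truth_degree(records, summarizer, quantifier):
    """Truth of 'Q records are S': average the fuzzy memberships of
    the summarizer S over the records, then pass that proportion
    through the fuzzy quantifier Q."""
    memberships = [summarizer(r) for r in records]
    proportion = sum(memberships) / len(memberships)
    return quantifier(proportion)

# Illustrative summary: "most students have high forum activity".
high_activity = lambda r: tri(r["logins"], 20, 60, 100)
most = lambda p: tri(p, 0.3, 0.8, 1.3)

students = [{"logins": 60}, {"logins": 60}, {"logins": 20}]
print(round(truth_degree(students, high_activity, most), 3))  # 0.733
```

A summary whose truth degree falls far below those of the common rules flags a candidate outlier, which is the intuition behind the paper's second method.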

17 pages, 3915 KiB  
Article
Bangla Character Detection Using Enhanced YOLOv11 Models: A Deep Learning Approach
by Mahbuba Aktar, Nur Islam and Chaoyu Yang
Appl. Sci. 2025, 15(11), 6326; https://doi.org/10.3390/app15116326 - 4 Jun 2025
Viewed by 1147
Abstract
Recognising the Bangla alphabet remains a significant challenge within the fields of computational linguistics and artificial intelligence, primarily due to the script’s inherent structural complexity and wide variability in writing styles. The Bangla script is characterised by intricate ligatures, overlapping diacritics, and visually similar graphemes, all of which complicate automated recognition tasks. Despite ongoing advancements in deep learning (DL), machine learning (ML), and image processing (IP), accurately identifying Bangla characters continues to be a demanding and unresolved issue. A key limitation lies in the absence of robust detection frameworks capable of accommodating the script’s complex visual patterns and nuances. To address this gap, we propose an enhanced object detection model based on the YOLOv11 architecture, incorporating a ResNet50 backbone for improved feature extraction. The YOLOv11 framework is particularly effective in capturing discriminative features from input images, enabling real-time detection with high precision. This is especially beneficial in overcoming challenges such as character overlap and stylistic diversity, which often hinder conventional recognition techniques. Our approach was evaluated on a custom dataset comprising 50 primary Bangla characters (including vowels and consonants) along with 10 numerical digits. The proposed model achieved a recognition confidence of 99.9%, markedly outperforming existing methods in terms of accuracy and robustness. This work underscores the potential of single-shot detection models for the recognition of complex scripts such as Bangla. Beyond its technical contributions, the model has practical implications in areas including the digitisation of historical documents, the development of educational tools, and the advancement of inclusive multilingual technologies. By effectively addressing the unique challenges posed by the Bangla script, this research contributes meaningfully to both computational linguistics and the preservation of linguistic heritage.

36 pages, 2347 KiB  
Article
TSTBench: A Comprehensive Benchmark for Text Style Transfer
by Yifei Xie, Jiaping Gui, Zhengping Che, Leqian Zhu, Yahao Hu and Zhisong Pan
Entropy 2025, 27(6), 575; https://doi.org/10.3390/e27060575 - 29 May 2025
Viewed by 1185
Abstract
In recent years, researchers in computational linguistics have shown a growing interest in the style of text, with a specific focus on the text style transfer (TST) task. While numerous innovative methods have been proposed, it has been observed that the existing evaluations are insufficient to validate the claims and precisely measure the performance. This challenge primarily stems from rapid advancements and diverse settings of these methods, with the associated (re)implementation and reproducibility hurdles. To bridge this gap, we introduce a comprehensive benchmark for TST known as TSTBench. TSTBench includes a codebase encompassing implementations of 13 state-of-the-art algorithms and a standardized protocol for text style transfer. Based on the codebase and protocol, we have conducted thorough experiments across seven datasets, resulting in a total of 7000+ evaluations. Our work provides extensive analysis from various perspectives, explores the performance of representative baselines across various datasets, and offers insights into the task and evaluation processes to guide future research in TST.

22 pages, 6166 KiB  
Article
PCcGE: Personalized Chinese Couplet Generation and Evaluation Framework Based on Large Language Models
by Zhigeng Pan, Xianliang Xia, Fuchang Liu and Minglang Zheng
Appl. Sci. 2025, 15(9), 4996; https://doi.org/10.3390/app15094996 - 30 Apr 2025
Viewed by 416
Abstract
Couplets, consisting of a pair of clauses, are an important form of Chinese intangible cultural heritage, playing a significant role in the education and transmission of traditional Chinese culture. By engaging in couplet creation, students can enhance their Chinese comprehension and expression skills, literary creativity, and cultural identity. Personalized Chinese couplet (PCc) generation entails creating paired clauses that meet specific requirements while adhering to certain linguistic rules (e.g., morphological and syntactical symmetry). However, generating PCcs and evaluating the results is a challenging task that requires both cultural context and language understanding. Large Language Models (LLMs) have powerful learning and language comprehension abilities, providing new possibilities for addressing the challenges. In this study, we propose a framework for generating and evaluating PCcs using LLMs. First, we construct a couplet database, then use a retrieval method and design a specific prompt to provide a pair of clauses as references to guide the LLM following the rules of couplet style. Second, we construct a custom PCc generation dataset to train the base model, improving its ability for this task. Finally, we introduce a debate method based on LLMs to evaluate the quality of the generated couplets. By simulating adversarial human debate processes, it obtains more comprehensive and nuanced reference data for evaluation purposes. The experimental results show that our approach effectively generates and evaluates couplets. Reduced creation difficulty promotes couplet education and the preservation of Chinese intangible cultural heritage. Positive feedback from participants indicates that our framework can enhance user engagement, offer a positive PCc creation experience, and contribute to the education and transmission of couplet culture.
(This article belongs to the Special Issue Applications of Digital Technology and AI in Educational Settings)

28 pages, 3332 KiB  
Article
Classifying and Characterizing Fandom Activities: A Focus on Superfans’ Posting and Commenting Behaviors in a Digital Fandom Community
by Yeoreum Lee and Sangkeun Park
Appl. Sci. 2025, 15(9), 4723; https://doi.org/10.3390/app15094723 - 24 Apr 2025
Viewed by 2629
Abstract
As digital fandom communities expand and diversify, user engagement patterns increasingly shape the social and emotional fabric of online platforms. In the era of Industry 4.0, data-driven approaches are transforming how online communities understand and optimize user engagement. In this study, we examine how different forms of activity, specifically posting and commenting, characterize fandom engagement on Weverse, a global fan community platform. By applying a clustering approach to large-scale user data, we identify distinct subsets of heavy users, separating those who focus on creating posts (post-heavy users) from those who concentrate on leaving comments (comment-heavy users). A subsequent linguistic analysis using the Linguistic Inquiry and Word Count (LIWC) tool revealed that post-heavy users typically employ a structured, goal-oriented style with collective pronouns and formal tones, whereas comment-heavy users exhibit more spontaneous, emotionally rich expressions enhanced by personalized fandom-specific slang and extensive emoji use. Building on these findings, we propose design implications such as pinning community-driven content, offering contextual translations for fandom-specific slang, and introducing reaction matrices that address the unique needs of each group. Taken together, our results underscore the value of distinguishing multiple dimensions of engagement in digital fandoms, providing a foundation for more nuanced platform features that can enhance positive user experience, social cohesion, and sustained community growth.
(This article belongs to the Special Issue Human-Computer Interaction in Smart Factory and Industry 4.0)
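The LIWC analysis above boils down to measuring what share of a user's tokens fall into each word category; a toy sketch with a hand-made lexicon (real LIWC uses licensed dictionaries, stemming rules, and many more categories):

```python
def category_rates(text, lexicon):
    """LIWC-style rates: share of tokens falling into each word
    category; naive whitespace tokenization for brevity."""
    tokens = text.lower().split()
    return {
        cat: sum(1 for t in tokens if t in words) / len(tokens)
        for cat, words in lexicon.items()
    }

# Toy categories echoing the findings above: collective pronouns
# and positive-emotion words.
lexicon = {
    "we": {"we", "our", "us"},
    "posemo": {"love", "great", "happy"},
}
print(category_rates("we love our team so much", lexicon))
```

Comparing such rates between the post-heavy and comment-heavy clusters is what surfaces the stylistic contrast the abstract describes.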

18 pages, 2430 KiB  
Article
The Art of Replication: Lifelike Avatars with Personalized Conversational Style
by Michele Nasser, Giuseppe Fulvio Gaglio, Valeria Seidita and Antonio Chella
Robotics 2025, 14(3), 33; https://doi.org/10.3390/robotics14030033 - 13 Mar 2025
Viewed by 1440
Abstract
This study presents an approach for developing digital avatars replicating individuals’ physical characteristics and communicative style, contributing to research on virtual interactions in the metaverse. The proposed method integrates large language models (LLMs) with 3D avatar creation techniques, using what we call the Tree of Style (ToS) methodology to generate stylistically consistent and contextually appropriate responses. Linguistic analysis and personalized voice synthesis enhance conversational and auditory realism. The results suggest that ToS offers a practical alternative to fine-tuning for creating stylistically accurate responses while maintaining efficiency. This study outlines potential applications and acknowledges the need for further work on adaptability and ethical considerations.
(This article belongs to the Special Issue Human–AI–Robot Teaming (HART))

16 pages, 1051 KiB  
Article
Kafka’s Literary Style: A Mixed-Method Approach
by Carsten Strathausen, Wenyi Shang and Andrei Kazakov
Humanities 2025, 14(3), 61; https://doi.org/10.3390/h14030061 - 12 Mar 2025
Viewed by 922
Abstract
In this essay, we examine how the polyvalence of meaning in Kafka’s texts is engineered both semantically (on the narrative level) and syntactically (on the linguistic level), and we ask whether a computational approach can shed new light on the long-standing debate about the major characteristics of Kafka’s literary style. A mixed-method approach means that we seek out points of connection that interlink traditional humanist (i.e., interpretative) and computational (i.e., quantitative) methods of investigation. Following the introduction, the second section of our article provides a critical overview of the existing scholarship from both a humanist and a computational perspective. We argue that the main methodological difference between traditional humanist and AI-enhanced computational studies of Kafka’s literary style lies not in the use of statistics but in the new interpretative possibilities enabled by AI methods to explore stylistic features beyond the scope of human comprehension. In the third and fourth sections of our article, we will introduce our own stylometric approach to Kafka, detail our methods, and interpret our findings. Rather than focusing on training an AI model capable of accurately attributing authorship to Kafka, we examine whether AI could help us detect significant stylistic differences between the writing Kafka himself published during his lifetime (Kafka Core) and his posthumous writings edited and published by Max Brod.
(This article belongs to the Special Issue Franz Kafka in the Age of Artificial Intelligence)
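For readers unfamiliar with stylometric comparison, Burrows's Delta is the standard baseline distance in authorship studies of this kind; the sketch below is a generic illustration of that measure, not necessarily the feature set or method the authors used:

```python
import statistics

def burrows_delta(freqs_a, freqs_b, corpus_freqs):
    """Burrows's Delta: z-score each text's relative frequency of
    common function words against the corpus, then average the
    absolute differences of the z-scores."""
    total = 0.0
    for word, samples in corpus_freqs.items():
        mu = statistics.fmean(samples)
        sigma = statistics.stdev(samples)
        z_a = (freqs_a.get(word, 0.0) - mu) / sigma
        z_b = (freqs_b.get(word, 0.0) - mu) / sigma
        total += abs(z_a - z_b)
    return total / len(corpus_freqs)

# Hypothetical relative frequencies of two German function words.
corpus = {"und": [0.02, 0.04], "der": [0.03, 0.05]}
text_a = {"und": 0.04, "der": 0.05}
text_b = {"und": 0.02, "der": 0.03}
print(round(burrows_delta(text_a, text_b, corpus), 3))  # 1.414
```

A small Delta between a posthumous text and the Kafka Core profile would suggest stylistic continuity despite Brod's editing; a large one would not.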

21 pages, 721 KiB  
Article
Be Sure to Use the Same Writing Style: Applying Authorship Verification on Large-Language-Model-Generated Texts
by Janith Weerasinghe, Ovendra Seepersaud, Genesis Smothers, Julia Jose and Rachel Greenstadt
Appl. Sci. 2025, 15(5), 2467; https://doi.org/10.3390/app15052467 - 25 Feb 2025
Viewed by 1927
Abstract
Recently, there have been significant advances and wide-scale use of generative AI in natural language generation. Models such as OpenAI’s GPT3 and Meta’s LLaMA are widely used in chatbots, to summarize documents, and to generate creative content. These advances raise concerns about abuses of these models, especially in social media settings, such as large-scale generation of disinformation, manipulation campaigns that use AI-generated content, and personalized scams. We used stylometry (the analysis of style in natural language text) to analyze the style of AI-generated text. Specifically, we applied an existing authorship verification (AV) model that can predict if two documents are written by the same author on texts generated by GPT2, GPT3, ChatGPT and LLaMA. Our AV model was trained only on human-written text and was effectively used in social media settings to analyze cases of abuse. We generated texts by providing the language models with fanfiction snippets and prompting them to complete the rest of it in the same writing style as the original snippet. We then applied the AV model across the texts generated by the language models and the human written texts to analyze the similarity of the writing styles between these texts. We found that texts generated with GPT2 had the highest similarity to the human texts. Texts generated by GPT3 and ChatGPT were very different from the human snippet, and were similar to each other. LLaMA-generated texts had some similarity to the original snippet but also had similarities with other LLaMA-generated texts and texts from other models. We then conducted a feature analysis to identify the features that drive these similarity scores. This analysis helped us answer questions like which features distinguish the language style of language models and humans, which features are different across different models, and how these linguistic features change over different language model versions. The dataset and the source code used in this analysis have been made public to allow for further analysis of new language models.
(This article belongs to the Special Issue Data Mining and Machine Learning in Social Network Analysis)
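The AV model itself is a trained classifier, but the intuition of comparing two writing styles can be sketched with character n-gram profiles and cosine similarity, a generic stand-in for the authors' model rather than their actual approach:

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Character n-gram profile, a common stylometric representation."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def profile_similarity(text_a, text_b, n=3):
    """Cosine similarity of two character n-gram profiles:
    1.0 means identical profiles, 0.0 means no shared n-grams."""
    pa, pb = char_ngrams(text_a, n), char_ngrams(text_b, n)
    dot = sum(count * pb[gram] for gram, count in pa.items())
    norm_a = math.sqrt(sum(v * v for v in pa.values()))
    norm_b = math.sqrt(sum(v * v for v in pb.values()))
    return dot / (norm_a * norm_b)

print(round(profile_similarity("the same style", "the same style"), 3))  # 1.0
```

Ranking model-generated completions by such similarity to the human snippet mirrors, at a toy scale, the comparison the abstract reports across GPT2, GPT3, ChatGPT, and LLaMA outputs.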

17 pages, 701 KiB  
Article
Choosing One’s Words: Conversational Indirectness and Humor Style in Two Distinct Cultural Groups
by Tanisha Y. Berrios, Dun-Ya Hu and Jyotsna Vaid
Behav. Sci. 2025, 15(3), 252; https://doi.org/10.3390/bs15030252 - 23 Feb 2025
Viewed by 1470
Abstract
We investigated cultural differences in the relationship between conversational indirectness and styles of humor use. Our study compared responses of English first language (L1) users (n = 56) and Korean first language users studying in the US (n = 32) on the Conversational Indirectness Scale and the Humor Styles Questionnaire. We found no overall group differences in conversational indirectness. Instead, higher indirectness for interpretation than for production was noted, but only in the English L1 group. This group also showed a positive correlation between interpretation and production scores; no such association was found in the Korean sample. On the humor style measure, scores for affiliative and self-enhancing humor were significantly higher in the English L1 group compared to the Korean group; the English L1 group also showed a positive correlation between these two dimensions, and between self-enhancing and self-defeating humor. Both groups showed low identification with self-defeating and aggressive humor styles. There was a significant positive correlation in the Korean group between these two styles. Finally, an association between conversational indirectness and humor style was noted in each group: in both groups, a significant positive correlation was found between indirectness in production and aggressive humor. Additionally, for the English L1 group a significant positive correlation was found between self-defeating humor and indirectness in production and interpretation. These findings demonstrate cultural differences in humor use and an intriguing relationship between the tendency to produce linguistic meanings indirectly and uses of humor considered to be less positive.
(This article belongs to the Special Issue Humor Use in Interpersonal Relationships)
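The associations reported above are correlations between questionnaire scores; the Pearson coefficient behind such findings can be computed directly (a dependency-free sketch; the sample values are hypothetical):

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired score
    lists: +1 is a perfect positive linear relationship, -1 a
    perfect negative one, 0 no linear relationship."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical indirectness-in-production vs. aggressive-humor scores.
print(pearson_r([1, 2, 3], [2, 4, 6]))  # 1.0
```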

18 pages, 307 KiB  
Article
Who Will Author the Synthetic Texts? Evoking Multiple Personas from Large Language Models to Represent Users’ Associative Thesauri
by Maxim Bakaev, Svetlana Gorovaia and Olga Mitrofanova
Big Data Cogn. Comput. 2025, 9(2), 46; https://doi.org/10.3390/bdcc9020046 - 18 Feb 2025
Viewed by 996
Abstract
Previously, it was suggested that the “persona-driven” approach can contribute to producing sufficiently diverse synthetic training data for Large Language Models (LLMs), which are currently about to run out of real natural language texts. In our paper, we explore whether personas evoked from LLMs through HCI-style descriptions could indeed imitate human-like differences in authorship. To this end, we ran an associative experiment with 50 human participants and four artificial personas evoked from the popular LLM-based services GPT-4(o) and YandexGPT Pro. For each of the five stimulus words selected from university websites’ homepages, we asked both groups of subjects to come up with 10 short associations (in Russian). We then used cosine similarity and Mahalanobis distance to measure the distance between the association lists produced by different humans and personas. While the difference in similarity was significant for different human associators and for different gender and age groups, this was not the case for the different personas evoked from ChatGPT or YandexGPT. Our findings suggest that the LLM-based services so far fall short of imitating the associative thesauri of different authors. The outcome of our study might be of interest to computational linguists, as well as AI/ML scientists and prompt engineers.
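Of the two distance measures used above, Mahalanobis distance is the less familiar; under a diagonal-covariance assumption (a simplification: the full version inverts the covariance matrix of the embedding dimensions) it reduces to a variance-scaled Euclidean distance:

```python
def mahalanobis_diag(x, mean, variances):
    """Mahalanobis distance assuming a diagonal covariance matrix:
    each squared deviation is scaled by that dimension's variance,
    so deviations along high-variance dimensions count for less."""
    return sum(
        (xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, variances)
    ) ** 0.5

# Toy 2-D embedding: a point two units out in each dimension, where
# the second dimension has four times the variance of the first.
print(mahalanobis_diag([2.0, 2.0], [0.0, 0.0], [1.0, 4.0]))  # ~2.236
```

In the study's setting, each association list would first be mapped to a vector (e.g., an embedding), and distances between humans' and personas' vectors compared; the vectors here are purely illustrative.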