Generative Language Models for Personality-Based Utterances in Novels: A Character Clustering Approach
Abstract
1. Introduction
- 1. We present a language model tuned to translate a given utterance into a new utterance that better reflects a desired personality in a novel. To the best of our knowledge, this is the first language model designed to generate personality-reflecting utterances in novels.
- 2. We propose a method for building this model, which involves (1) clustering characters by the words in their utterances and (2) designing prompts that turn the character clusters into instruction-tuning data (a minimal sketch of the clustering step follows this list).
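To make the clustering step concrete, here is a minimal sketch, assuming TF-IDF bag-of-words features over each character's pooled utterances and k-means grouping; the utterance strings and the cluster count are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal sketch of character clustering by utterance words:
# each character is represented by the TF-IDF vector of all words
# in their utterances and grouped with k-means.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Hypothetical input: one concatenated utterance string per character.
character_utterances = {
    "Gabriel Syme": "I serve as the ultimate authority in detection matters ...",
    "Mary Crawford": "What is so important about apprehending her ...",
    # ... all characters extracted from the corpus
}

names = list(character_utterances)
vectors = TfidfVectorizer(stop_words="english").fit_transform(
    character_utterances.values()
)

# The number of clusters is an illustrative assumption.
kmeans = KMeans(n_clusters=2, random_state=0, n_init="auto").fit(vectors)
for name, label in zip(names, kmeans.labels_):
    print(f"cluster {label}: {name}")
```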
2. Related Work
2.1. Transformer Decoder-Based Language Models
2.2. The Big Five Personality Traits (OCEAN)
3. Materials and Methods
3.1. Validity of Using Character Utterances
3.2. Generative Language Model for Character Utterances
3.2.1. Character Clustering Using Utterances
3.2.2. Instruction-Tuning Dataset Preparation
4. Experiment
4.1. Experiment Setup
4.2. Evaluation Methodology
5. Results and Discussion
Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
| Trait | Low Score Keyword ↔ High Score Keyword |
|---|---|
| Openness | conventional ↔ original |
| | down-to-earth ↔ imaginative |
| | uncreative ↔ creative |
| | uncurious ↔ curious |
| | prefers routine ↔ prefers variety |
| Conscientiousness | careless ↔ careful |
| | lazy ↔ hardworking |
| | disorganized ↔ well-organized |
| | undependable ↔ reliable |
| | sloppy ↔ neat |
| Extraversion | reserved ↔ talkative |
| | aloof ↔ friendly |
| | quiet ↔ sociable |
| | retiring ↔ outgoing |
| | passive ↔ active |
| Agreeableness | selfish ↔ selfless |
| | rude ↔ courteous |
| | irritable ↔ good-natured |
| | suspicious ↔ trusting |
| | stubborn ↔ flexible |
| Neuroticism | calm ↔ worrying |
| | relaxed ↔ high-strung |
| | secure ↔ insecure |
| | comfortable ↔ self-conscious |
| | hardy ↔ vulnerable |
| Title | Character Name | O | C | E | A | N |
|---|---|---|---|---|---|---|
| Northanger Abbey | Eleanor Tilney | 1 | 2 | −1 | 1 | 1 |
| The Sport of the Gods | Joe Hamilton | 2 | −1 | 2 | −1 | 2 |
| Mansfield Park | Henry Crawford | −2 | −2 | 3 | 2 | −1 |
| Sense and Sensibility | Lucy Steele | 1 | −1 | 2 | 1 | 1 |
| A Passage to India | Mr. McBryde | −1 | 1 | −1 | −1 | −1 |
| Format of Instruction Data |
|---|
| Instruction: |
| Transform the following neutral statement into the utterance of the Representative character. Your response should reflect the Representative character’s unique tone, rhetorical style, and personality. Do not repeat the input. Just write a single stylistically transformed response. |
| Input: |
| {Neutralized utterance} |
| Representative characters: |
| {Representative character of the book} |
| Response: |
| {Original utterance} |
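The template maps a neutralized utterance (the input) back to its original, character-specific form (the training target). Below is a minimal sketch of assembling such records; the field names, helper function, and JSONL file layout are assumptions for illustration, not the authors' released format.

```python
import json

# Instruction-tuning template from the table above.
TEMPLATE = (
    "Instruction:\n"
    "Transform the following neutral statement into the utterance of the "
    "Representative character.\n"
    "Your response should reflect the Representative character's unique tone, "
    "rhetorical style, and personality.\n"
    "Do not repeat the input. Just write a single stylistically transformed response.\n"
    "Input:\n{neutralized}\n"
    "Representative characters:\n{representative}\n"
    "Response:\n"
)

def build_record(neutralized: str, representative: str, original: str) -> dict:
    """One instruction-tuning example: prompt filled from the template,
    original utterance used as the target completion."""
    return {
        "prompt": TEMPLATE.format(neutralized=neutralized, representative=representative),
        "response": original,
    }

# Hypothetical neutralized/original pair.
record = build_record(
    neutralized="It's surprising that this happened. It seems like there's a recurring issue.",
    representative="Gabriel Syme",
    original="I can't believe you did this! You're always messing things up.",
)
with open("instruction_data.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```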
| Category | Content |
|---|---|
| Prompt | You are a model that transforms input sentences by neutralizing personality in the original sentence. |
| Message (Excerpt) | “You are generating data which will be used to train a machine learning model.” “You will be given a {prompt} of the model we want to train, which is used to generate several response data samples based on each row of {sentence}.” “You will do so in this format:” |
| Format | Prompt ______ {sentence} ______ Response ______ $response_goes_here ______ |
| In the message | “You should only generate one response per turn based on {sentence}.” “For each turn, make the example slightly more complex than the last, while ensuring diversity.” “Make sure your examples are unique and diverse, yet high-quality and complex enough to train a well-performing model.” “Here is the type of model we want to train:” |
| Example (few-shot) | Prompt ______ I can’t believe you did this! You’re always messing things up. ______ Response ______ It’s surprising that this happened. It seems like there’s a recurring issue. ______ |
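The table lists only the system prompt and message excerpts, so the following is a hedged sketch of how one neutralization call could look; the client library, model name, and `neutralize` helper are illustrative assumptions, using the OpenAI chat-completions API as an example backend.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# System prompt taken verbatim from the table above.
SYSTEM_PROMPT = (
    "You are a model that transforms input sentences by neutralizing "
    "personality in the original sentence."
)

def neutralize(utterance: str) -> str:
    """Request one neutralized rewrite of a character utterance,
    following the single-response-per-turn rule in the message excerpt."""
    completion = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": utterance},
        ],
    )
    return completion.choices[0].message.content

print(neutralize("I can't believe you did this! You're always messing things up."))
```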
| Hyperparameter | Value |
|---|---|
| Learning rate | 5 × |
| Batch size (per device) | 2 |
| Epochs | 3 |
| Optimizer | AdamW |
| LR scheduler | linear |
| LoRA target modules | q_proj, v_proj |
| LoRA rank | 4 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
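These settings map directly onto a PEFT + Hugging Face Transformers fine-tuning setup. A minimal sketch follows; the base checkpoint, output path, and the learning-rate exponent (the table's value is truncated after “5 ×”) are assumptions.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# LoRA settings from the hyperparameter table.
lora_config = LoraConfig(
    r=4,                                  # LoRA rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# Illustrative base model; the experiments cover several open checkpoints.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="lora-personality",        # illustrative path
    per_device_train_batch_size=2,
    num_train_epochs=3,
    learning_rate=5e-5,                   # assumed exponent; table shows only "5 ×"
    lr_scheduler_type="linear",
    optim="adamw_torch",                  # AdamW
)
```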
| | # Chars/Cluster | # Utterances | Avg. Words (Neutralized) | Avg. Words (Original) | Std. Words (Neutralized) | Std. Words (Original) |
|---|---|---|---|---|---|---|
| Train dataset | 50 | 144 | 15.1 | 15.5 | 21.7 | 12.2 |
| Test dataset | - | 50 | 22.8 | 22.1 | 26.8 | 17.7 |
| No. | Model | Cluster | Prompt | Generated Response | Generated Response |
|---|---|---|---|---|---|
| 1 | Llama3.1-8b | Gabriel Syme | I serve as the ultimate authority in detection matters. When Gregson, Lestrade, or Athelney Jones encounter challenges beyond their expertise, they bring the issue to me. I carefully analyze the information, providing a specialist’s perspective … | As an expert detective, it is my duty to provide guidance when necessary. While I am content being behind the scenes, working tirelessly to uncover the truth, I take great pride in using my specialized knowledge and skills to help others succeed in their investigations. | “You speak of recognition? It eludes me like an elusive thief.” |
| 2 | Llama3.1-8b | Gabriel Syme | Naturally, that would be the case. | Certainly! That is exactly what I mean to say. | Of course! It only stands to reason. |
| 3 | Llama3.1-8b | Gabriel Syme | If he doesn’t come to me, then I’ll have to find another way to reach him. | The police detective stood on his feet as if he were about to charge at some invisible enemy. | It’s time for a plan B, my friend. If he refuses to play ball, we’ll just have to take matters into our own hands. We can’t let this case slip through our fingers; it’s too important! So gather up your wits and get ready for some detective work - because if he won’t help us, we’ll make sure that he does! |
| 4 | Llama3.1-8b | Mary Crawford | It’s imperative that we apprehend her. | “My dear Emma,” Mary Crawford would say with a mischievous glint … “We must proceed swiftly lest anyone discovers our clandestine mission.” | What is so important about apprehending her? If she doesn’t want to be apprehended, then why would anyone bother with it? |
| 5 | Llama3.1-8b | Mary Crawford | I serve as the ultimate authority in detection matters. When Gregson, Lestrade, or Athelney Jones encounter challenges beyond their expertise, they bring the issue to me. I carefully analyze the information, providing a specialist’s perspective … | Ah yes, dear Mary, serving as the master detective certainly brings its share of excitement! … | People come knocking on my door all the time for help when things get tough … |
| 6 | Llama3.1-8b | Mary Crawford | Lady Catherine de Bourgh, is known for her strong personality and high expectations. | “Ah! The idea of ‘expectation’ does cause some to lose sight of true values,” Mary commented with a mischievous glint … | “A person who doesn’t respect themselves won’t receive it from others.” |
| Title | Character | Author | Role | Core Traits | Symbolism |
|---|---|---|---|---|---|
| The Man Who Was Thursday | Gabriel Syme | G.K. Chesterton | Undercover detective seeking moral order within an anarchist council. | Stoic, rational, insightful, morally firm, philosophical | Represents hidden order beneath chaos. |
| Mansfield Park | Mary Crawford | Jane Austen | Charming socialite whose wit and worldliness contrast the protagonist’s virtue. | Witty, charming, worldly, independent, morally flexible | Symbol of modern secular appeal and moral ambiguity. |
| Model | Size | Cluster | Avg. #Win of | Avg. #Win of |
|---|---|---|---|---|
| Llama2 | 7B | Gabriel Syme | 29.67 ± 1.15 | 20.33 ± 1.15 |
| | 7B | Mary Crawford | 36.67 ± 2.31 | 13.33 ± 2.31 |
| | 13B | Gabriel Syme | 22.67 ± 0.58 | 27.33 ± 0.58 |
| | 13B | Mary Crawford | 29.00 ± 1.73 | 21.00 ± 1.73 |
| Llama3.1 | 8B | Gabriel Syme | 27.00 ± 0.00 | 23.00 ± 0.00 |
| | 8B | Mary Crawford | 28.33 ± 1.53 | 21.67 ± 1.53 |
| Gemma2 | 9B | Gabriel Syme | 26.67 ± 0.58 | 23.33 ± 0.58 |
| | 9B | Mary Crawford | 28.67 ± 1.15 | 21.33 ± 1.15 |
| Falcon3 | 7B | Gabriel Syme | 19.67 ± 0.58 | 30.33 ± 0.58 |
| | 7B | Mary Crawford | 21.33 ± 0.58 | 28.67 ± 0.58 |
| | 10B | Gabriel Syme | 19.00 ± 2.00 | 31.00 ± 2.00 |
| | 10B | Mary Crawford | 20.00 ± 1.00 | 30.00 ± 1.00 |
| Cluster | Sum #Win of | Sum #Win of | Win Rate of | Win Rate of |
|---|---|---|---|---|
| Gabriel Syme | 434 | 466 | 48.22% | 51.78% |
| Mary Crawford | 492 | 408 | 54.67% | 45.33% |
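As a quick consistency check, the win rates in the table follow directly from the raw win counts (900 pairwise judgments per cluster):

```python
# Verify the reported win rates from the raw win counts in the table above.
counts = {"Gabriel Syme": (434, 466), "Mary Crawford": (492, 408)}
for cluster, (wins_a, wins_b) in counts.items():
    total = wins_a + wins_b  # 900 comparisons per cluster
    print(f"{cluster}: {wins_a / total:.2%} vs. {wins_b / total:.2%}")
# Gabriel Syme: 48.22% vs. 51.78%
# Mary Crawford: 54.67% vs. 45.33%
```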
| TopK | #Win of (Total) | #Win of (Gabriel Syme) | #Win of (Mary Crawford) | #Win of (Total) | #Win of (Gabriel Syme) | #Win of (Mary Crawford) |
|---|---|---|---|---|---|---|
| 10 | 49 | 20 | 29 | 51 | 30 | 21 |
| 30 | 52 | 24 | 28 | 48 | 26 | 22 |
| 50 | 53 | 24 | 29 | 47 | 26 | 21 |
| 73 | 47 | 25 | 22 | 53 | 25 | 28 |