Exploring the Use and Misuse of Large Language Models
Abstract
1. Introduction
- Provide background on what constitutes an LLM, how LLMs evolved, and how their efficacy is measured.
- Provide examples of existing LLM-based GenAI models and their applications, both benevolent and malicious.
- Demonstrate how threat actors can leverage easily accessible GenAI tools to develop complex social engineering attacks.
2. What Is a Large Language Model
2.1. Benchmarks
2.2. Prompt Engineering
3. Existing Models
3.1. Open Models
3.2. Closed Models
4. Beneficial Applications of Chatbots
4.1. Health
4.2. Sales
4.3. Education
4.4. Programming
4.5. Cybersecurity
5. Chatbot Security Concerns
5.1. How to Abuse AI Models
5.2. Cases of AI Abuse
5.3. Defense Against AI Crime
6. Methods
Text-to-Speech Systems
7. Results
7.1. Single Voice Similarity
- Ultra-Fast—Autoregressive Samples: 16, Diffusion Iterations: 30.
- Fast—Autoregressive Samples: 96, Diffusion Iterations: 80.
- Standard—Autoregressive Samples: 256, Diffusion Iterations: 200.
- High-Quality—Autoregressive Samples: 256, Diffusion Iterations: 400.
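For reference, these four presets correspond to generation settings in the Tortoise-TTS library. The mapping below is a hypothetical restatement using the preset names and parameter keys from the neonbjb/tortoise-tts repository; exact keys may differ between library versions.

```python
# Hypothetical mapping of the four quality presets evaluated above to the
# tortoise-tts settings they control (names follow the neonbjb/tortoise-tts
# repository; verify against the installed version).
TORTOISE_PRESETS = {
    "ultra_fast":   {"num_autoregressive_samples": 16,  "diffusion_iterations": 30},
    "fast":         {"num_autoregressive_samples": 96,  "diffusion_iterations": 80},
    "standard":     {"num_autoregressive_samples": 256, "diffusion_iterations": 200},
    "high_quality": {"num_autoregressive_samples": 256, "diffusion_iterations": 400},
}
```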
7.2. Multiple Voice Similarity
7.3. Findings
7.4. Impact on Society
8. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
LLM | Large Language Model |
TTS | Text-To-Speech |
AI | Artificial Intelligence |
GenAI | Generative AI |
NLP | Natural Language Processing |
RNN | Recurrent Neural Network |
LSTM | Long Short-Term Memory |
LAMBADA | Language Modeling Broadened to Account for Discourse Aspects |
VLM | Visual Language Model |
RLHF | Reinforcement Learning from Human Feedback |
GPT | Generative Pre-trained Transformer |
CRM | Customer Relationship Management |
NLU | Natural Language Understanding |
AFF | Advance Fee Fraud |
BiLSTM | Bidirectional Long Short-Term Memory |
CNN | Convolutional Neural Network |
NER | Named Entity Recognition |
CRF | Conditional Random Field |
SSO | Single Sign-On |
NIST | National Institute of Standards and Technology |
SV2TTS | Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis |
KNN | K-Nearest Neighbor |
GCN | Graph Convolutional Network |
References
Parameter | Value | Notes |
---|---|---|
language | en | Model was pre-trained on English TTS |
emotion | happy | Parameter was set but not used by the model |
speed | 2.0 | Setting the speed to 2.0 improved the prosody of the generated audio |
device | CUDA | All generations were performed on CUDA-compatible devices |
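The configuration above can be expressed through the Coqui TTS Python API. The sketch below is a minimal, assumed reconstruction rather than the authors' exact script; the model identifier and argument names follow the coqui-ai/TTS documentation and may differ across library versions, and the file paths are placeholders.

```python
# Minimal sketch, assuming the Coqui TTS API for XTTS-v2; argument names and
# the model identifier follow the coqui-ai/TTS documentation and may differ
# across versions. File paths are placeholders.
import torch
from TTS.api import TTS

device = "cuda" if torch.cuda.is_available() else "cpu"  # table: CUDA device
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

tts.tts_to_file(
    text="Sample sentence used for the similarity evaluation.",
    speaker_wav="reference_speaker.wav",  # placeholder reference recording
    language="en",      # model was pre-trained on English TTS
    emotion="happy",    # accepted by the API but not used by XTTS-v2 (see table)
    speed=2.0,          # 2.0 improved the prosody of the generated audio
    file_path="xtts_v2_output.wav",
)
```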
Parameter | Value |
---|---|
temperature | 0.8 |
length_penalty | 1.0 |
repetition_penalty | 2.0 |
top_p | 0.8 |
cond_free_k | 2.0 |
diffusion_temperature | 1.0 |
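These values correspond to keyword arguments of the Tortoise-TTS generation call. The sketch below is an assumed reconstruction of such a call, not the authors' exact script; the voice folder name and output path are placeholders, and keyword names follow the neonbjb/tortoise-tts repository and may vary by version.

```python
# Minimal sketch (assumed): generating a cloned utterance with Tortoise-TTS,
# passing the sampling hyperparameters from the table above. API names follow
# the neonbjb/tortoise-tts repository; verify against the installed version.
import torchaudio
from tortoise.api import TextToSpeech
from tortoise.utils.audio import load_voices

tts = TextToSpeech()  # uses CUDA automatically when available

# "my_speaker" is a placeholder voice folder containing a few clean reference clips.
voice_samples, conditioning_latents = load_voices(["my_speaker"])

gen = tts.tts_with_preset(
    "Sample sentence used for the similarity evaluation.",
    voice_samples=voice_samples,
    conditioning_latents=conditioning_latents,
    preset="high_quality",          # one of the four presets listed in Section 7.1
    temperature=0.8,
    length_penalty=1.0,
    repetition_penalty=2.0,
    top_p=0.8,
    cond_free_k=2.0,
    diffusion_temperature=1.0,
)
torchaudio.save("tortoise_output.wav", gen.squeeze(0).cpu(), 24000)
```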
Model | Setting | Average Similarity Score | Highest Similarity Score | Human Evaluation |
---|---|---|---|---|
XTTS-v2 | 1.0 Speed | 0.64 | 0.71 | Voice is very mechanical and spaced out. Distortions are very pronounced. However, original voice can be heard. |
XTTS-v2 | 2.0 Speed | 0.65 | 0.71 | Voice is very mechanical. Distortion is less pronounced at this speed but still present. Accurate tone and accent are much more pronounced, prosody has improved, and speaker voice can clearly be heard. |
Tortoise | Ultra-Fast | 0.67 | 0.77 | Speaker voice is clear. However, prosody and tone are flat, which makes the speech very monotone. Some distortion is present. |
Tortoise | Fast | 0.71 | 0.8 | Speaker voice is clearer. Some distortion present on short sentences. Prosody is great with natural pauses for breath. Short sentences have abnormal ending inflections that resemble a questioning tone. |
Tortoise | Standard | 0.71 | 0.79 | Minimal distortions. Distortions now sound like they could be mistakes or slips of the tongue. Prosody is good even on short sentences. Speaker voice is clearly heard. |
Tortoise | High-Quality | 0.69 | 0.81 | Prosody is very natural. Distortions are present but very minimal. Voice is very similar to original speaker in terms of tone, pitch, and clarity. |
Speaker | Setting | Average Similarity Score | Highest Similarity Score | Human Evaluation |
---|---|---|---|---|
Male #1 (native) | High-Quality | 0.69 | 0.81 | Voice is very similar to the original (same tone, pitch, accent) and is almost indistinguishable. Suffers from slight distortion and monotony. The prosody is natural but slower than the speaker’s normal cadence. |
Male #2 (native) | High-Quality | 0.69 | 0.8 | Voice is very similar to the original (same prosody, tone, pitch, accent) and is almost indistinguishable. Suffers from very slight distortion and monotony, primarily on shorter sentences. |
Male #3 (non-native) | High-Quality | 0.7 | 0.77 | Voice is very different from the original, with the prosody, tone, and accent resembling a native English-speaking male. Only the pitch resembles the original speaker. Slight distortion and monotony are also present. |
Female #1 (non-native) | High-Quality | 0.58 | 0.69 | Voice is very different from the original, with the prosody, tone, pitch, and accent resembling a native English-speaking female. Slight distortion is present. |
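The similarity scores in the two tables above compare speaker embeddings of the original and synthesized recordings. The sketch below shows one common way to compute such scores, assuming the Resemblyzer speaker-verification encoder; because its utterance embeddings are L2-normalized, their dot product is a cosine similarity. This is an illustrative pipeline, not necessarily the authors' exact scoring code, and the file names are placeholders.

```python
# Minimal sketch (assumed scoring pipeline): speaker similarity between an
# original recording and a synthesized clone using Resemblyzer embeddings.
# VoiceEncoder embeddings are L2-normalized, so the dot product below is a
# cosine similarity comparable to the scores reported in the tables above.
import numpy as np
from resemblyzer import VoiceEncoder, preprocess_wav

encoder = VoiceEncoder()

def similarity(original_path: str, generated_path: str) -> float:
    """Cosine similarity between the speaker embeddings of two audio files."""
    original = encoder.embed_utterance(preprocess_wav(original_path))
    generated = encoder.embed_utterance(preprocess_wav(generated_path))
    return float(np.dot(original, generated))

# Placeholder file names for illustration.
print(similarity("original_speaker.wav", "tortoise_output.wav"))
```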