Can We Trust AI Content Detection Tools for Critical Decision-Making?
Abstract
1. Introduction
2. Research Methodology
3. Results and Discussion
3.1. General Overview of Detection Performance
3.2. Misclassification of Human-Written University Content
3.3. Published Papers and Journal Websites
3.4. Governmental and Historical Texts
3.5. Media Outlets
3.6. Misclassification of AI-Generated Content
3.7. Quantitative Evaluation of Tool Performance
3.8. Sensitivity to Minor Textual Changes
3.9. Broader Implications
4. Recommendations for Future Development and Usage of AI Detection Tools
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- OpenAI. Introducing ChatGPT. 2022. Available online: https://www.openai.com (accessed on 19 August 2025).
- Google. Gemini. 2024. Available online: https://gemini.google.com/ (accessed on 19 August 2025).
- Microsoft Copilot. Available online: https://copilot.microsoft.com/ (accessed on 19 August 2025).
- Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training; OpenAI: San Francisco, CA, USA, 2018.
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language Models Are Unsupervised Multitask Learners; OpenAI Blog: San Francisco, CA, USA, 2019.
- Brown, T.B.; Krueger, G.; Mann, B.; Askell, A.; Herbert-Voss, A.; Winter, C.; Ziegler, D.M.; Radford, A.; McCandlish, S. Language Models Are Few-Shot Learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901.
- OpenAI; Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; et al. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774.
- OpenAI; Brown, A. GPT-4 is OpenAI’s most advanced system, producing safer and more useful responses. J. Archit. Comput. 2024, 22, 275–276.
- OpenAI. Hello GPT-4o. 2024. Available online: https://openai.com/index/hello-gpt-4o/ (accessed on 19 August 2025).
- OpenAI. GPT-4o mini: Advancing cost-efficient intelligence. 2024. Available online: https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/ (accessed on 19 August 2025).
- OpenAI. Learning to Reason with LLMs. 2024. Available online: https://openai.com/index/learning-to-reason-with-llms/ (accessed on 19 August 2025).
- Wu, J.; Yang, S.; Zhan, R.; Yuan, Y.; Chao, L.S.; Wong, D.F. A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions. Comput. Linguist. 2025, 51, 275–338.
- Beresneva, D. Computer-generated text detection using machine learning: A systematic review. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); 2016; Volume 9612. Available online: https://link.springer.com/chapter/10.1007/978-3-319-41754-7_43 (accessed on 19 August 2025).
- Hanley, H.W.A.; Durumeric, Z. Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites. Proc. Int. AAAI Conf. Web Soc. Media 2023, 18, 542–556.
- Mirsky, Y.; Demontis, A.; Kotak, J.; Shankar, R.; Gelei, D.; Yang, L.; Zhang, X.; Pintor, M.; Lee, W.; Elovici, Y.; et al. The Threat of Offensive AI to Organizations. Comput. Secur. 2023, 124, 103006.
- Weidinger, L.; Mellor, J.; Rauh, M.; Griffin, C.; Uesato, J.; Huang, P.-S.; Cheng, M.; Glaese, M.; Balle, B.; Kasirzadeh, A.; et al. Ethical and social risks of harm from Language Models. arXiv 2021, arXiv:2112.04359v1.
- Massachusetts Institute of Technology. About MIT. Available online: https://www.mit.edu/about/ (accessed on 5 October 2024).
- University of Cambridge. About the university: History. Retrieved 5 October 2024. Available online: https://www.cam.ac.uk/about-the-university/history (accessed on 19 August 2025).
- Harvard University. About: Mission, Vision, History. Available online: https://college.harvard.edu/about/mission-vision-history (accessed on 5 October 2024).
- University of British Columbia. UBC Programs: Master of Management Dual Degree—Okanagan. Available online: https://you.ubc.ca/ubc_programs/master-management-dual-degree-okanagan/ (accessed on 5 October 2024).
- Imperial College London. Imperial Startups Showcase Innovative Ideas in London. 2021. Available online: https://www.imperial.ac.uk/news/256539/imperial-startups-showcase-innovative-ideas-london/ (accessed on 5 October 2024).
- University of Illinois at Urbana-Champaign. About: Overview of ACES. Available online: https://www.aces.illinois.edu/about/overview-aces (accessed on 5 October 2024).
- University of Illinois at Urbana-Champaign. About: Diversity, Equity, and Inclusion. Available online: https://international.illinois.edu/about/dei/index.html (accessed on 5 October 2024).
- Stanford University. About Stanford. Available online: https://www.stanford.edu/about/ (accessed on 5 October 2024).
- University of California, Berkeley. About: Senate Leadership. Available online: https://academic-senate.berkeley.edu/about/senate-leadership (accessed on 5 October 2024).
- University of California, Berkeley. Two Centuries Later, Performance Spaces Still Struggle with ‘Soft Censorship’. 2024. Available online: https://news.berkeley.edu/2024/09/24/two-centuries-later-performance-spaces-still-struggle-with-soft-censorship/ (accessed on 5 October 2024).
- Idaho State University. About ISU. Available online: https://coursecat.isu.edu/aboutisu/ (accessed on 5 October 2024).
- Jordan, M.I.; Mitchell, T.M. Machine learning: Trends, perspectives, and prospects. Science 2015, 349, 255–260.
- Tsukada, Y.I.; Fang, J.; Erdjument-Bromage, H.; Warren, M.E.; Borchers, C.H.; Tempst, P.; Zhang, Y. Histone demethylation by a family of JmjC domain-containing proteins. Nature 2006, 439, 811–816.
- Barber, D.L.; Wherry, E.J.; Masopust, D.; Zhu, B.; Allison, J.P.; Sharpe, A.H.; Freeman, G.J.; Ahmed, R. Restoring function in exhausted CD8 T cells during chronic viral infection. Nature 2006, 439, 682–687.
- Jung, M.; Reichstein, M.; Ciais, P.; Seneviratne, S.I.; Sheffield, J.; Goulden, M.L.; Bonan, G.; Cescatti, A.; Chen, J.; De Jeu, R.; et al. Recent decline in the global land evapotranspiration trend due to limited moisture supply. Nature 2010, 467, 951–954.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444.
- Chen, L.; Chetkovich, D.M.; Petralia, R.S.; Sweeney, N.T.; Kawasaki, Y.; Wenthold, R.J.; Bredt, D.S.; Nicoll, R.A. Stargazin regulates synaptic targeting of AMPA receptors by two distinct mechanisms. Nature 2000, 408, 936–943.
- Srianand, R.; Petitjean, P.; Ledoux, C. The cosmic microwave background radiation temperature at a redshift of 2.34. Nature 2000, 408, 931–935.
- Savage, N. The race to the top among the world’s leaders in artificial intelligence. Nature 2020, 588, S102–S104.
- Al-Zahrani, A.M.; Alasmari, T.M. Exploring the impact of artificial intelligence on higher education: The dynamics of ethical, social, and educational implications. Humanit. Soc. Sci. Commun. 2024, 11, 912.
- Guo, Y.; Li, Y.; Zhang, M.; Ma, R.; Wang, Y.; Weng, X.; Zhang, J.; Zhang, Z.; Chen, X.; Yang, W. Polymeric nanocarrier via metabolism regulation mediates immunogenic cell death with spatiotemporal orchestration for cancer immunotherapy. Nat. Commun. 2024, 15, 8586.
- Peng, Z.; Tong, L.; Shi, W.; Xu, L.; Huang, X.; Li, Z.; Yu, X.; Meng, X.; He, X.; Lv, S.; et al. Multifunctional human visual pathway-replicated hardware based on 2D materials. Nat. Commun. 2024, 15, 8650.
- Nishiie, N.; Kawatani, R.; Tezuka, S.; Mizuma, M.; Hayashi, M.; Kohsaka, Y. Vitrimer-like elastomers with rapid stress-relaxation by high-speed carboxy exchange through conjugate substitution reaction. Nat. Commun. 2024, 15, 8657.
- Nature. npj Artificial Intelligence: Aims & Scope. Available online: https://www.nature.com/npjai/aims (accessed on 19 August 2025).
- Nature. Nature Portfolio. Available online: https://www.nature.com/nature-portfolio (accessed on 19 August 2025).
- Kozlov, M. Mpox vaccine roll-out begins in Africa: What will success look like? Nature 2024. preprint. Available online: https://www.nature.com/articles/d41586-024-03243-2 (accessed on 19 August 2025).
- National Archives and Records Administration. Transcript of March (Part 3 of 3). Available online: https://www.archives.gov/files/social-media/transcripts/transcript-march-pt3-of-3-2602934.pdf (accessed on 19 August 2025).
- Government of Canada. Federal Research Funding Agencies Release Tri-Agency Research Training Strategy. Available online: https://www.canada.ca/en/research-coordinating-committee/news/updates/2024/09/federal-research-funding-agencies-release-tri-agency-research-training-strategy.html (accessed on 19 August 2025).
- Stanford News. Stanford Launches Center Focused on Human and Planetary Health. Stanford Report, 31 October 2024.
- Naser, M.Z.; Alavi, A.H. Error metrics and performance fitness indicators for artificial intelligence and machine learning in engineering and sciences. Archit. Struct. Constr. 2023, 3, 499–517.
- Kutty, A.A.; Wakjira, T.G.; Kucukvar, M.; Abdella, G.M.; Onat, N.C. Urban resilience and livability performance of European smart cities: A novel machine learning approach. J. Clean. Prod. 2022, 378, 134203.
Tool | Accessibility | Output Type | Interpretation |
---|---|---|---|
Undetectable AI (https://undetectable.ai/) | Paid and limited free | Human score (%) | Higher values indicate more human-like text. |
ZeroGPT.com (https://www.zerogpt.com/) | Free | AI probability (%) | Higher percentages denote greater likelihood of AI authorship. |
ZeroGPT.net (https://zerogpt.net/) | Free | Human/AI score (%) | Higher AI score indicates higher probability of AI authorship. |
Brandwell.ai (https://brandwell.ai/) | Free | Human probability (labels) | Outputs “Reads like AI,” “Passes as Human,” or “Hard to Tell.” |
Winston AI (https://gowinston.ai/) | Paid | Human score (%) | Lower human scores (higher AI scores) imply AI authorship. |
Crossplag (https://crossplag.com/) | Paid | AI Content Index (%) | Higher percentages suggest potential AI authorship. |
Originality.AI (https://originality.ai/) | Paid | Probability (%) | Higher percentages suggest AI authorship likelihood. |
Copyleaks (https://copyleaks.com/) | Paid | Probability (%) | Higher percentages indicate AI authorship likelihood. |
GPTZero (https://gptzero.me/) | Free | Score (%) | Higher scores correspond to increased probability of AI authorship. |
Scribbr AI Detector (https://www.scribbr.com/ai-detector/) | Free and paid | Probability (%) | Higher percentages indicate AI authorship likelihood. |
Smodin AI Content Detector (https://smodin.io/ai-content-detector) | Free and paid | Probability (%) | Higher percentages indicate AI authorship likelihood. |
Turnitin AI Detection | Paid | Probability (%) | Higher percentages indicate AI authorship likelihood. |
Passed.AI (https://passed.ai/) | Paid | Probability (%) | Higher percentages suggest AI authorship likelihood. |
Category | Source | Details of the Text (Where Applicable) |
---|---|---|
University | Massachusetts Institute of Technology [17] | About |
University | University of Cambridge [18] | About the University |
University | Harvard University [19] | Mission statement |
University | Imperial College London [21] | News |
University | University of California, Berkeley [25] | About senate leadership |
University | University of California, Berkeley [26] | News, Letters and Science |
University | Stanford University [24] | About |
University | University of Illinois Urbana-Champaign [22] | About, History |
University | University of Illinois Urbana-Champaign [23] | About Diversity, Equity, and Inclusion |
University | University of British Columbia [20] | UBC programs |
University | Idaho State University [27] | About |
Published papers | Srianand et al. [34] | Abstract |
Published papers | Chen et al. [33] | Abstract |
Published papers | LeCun et al. [32] | Abstract |
Published papers | Jung et al. [31] | Abstract |
Published papers | Barber et al. [30] | Abstract |
Published papers | Tsukada et al. [29] | Abstract |
Published papers | Al-Zahrani & Alasmari [36] | Abstract |
Published papers | Jordan & Mitchell [28] | Abstract |
Published papers | Nishiie et al. [39] | Results and discussion |
Published papers | Peng et al. [38] | Abstract |
Published papers | Guo et al. [37] | Abstract |
Published papers | Guo et al. [37] | Results |
Journal website | Nature npj Artificial Intelligence [40] | Aims and Scope |
Journal website | Nature Index [35] | |
Journal website | Nature News Q&A [42] | |
Journal website | Nature Portfolio [41] | |
Government website | Martin Luther King Jr.’s speech “I Have a Dream” [43] | |
Government website | Government of Canada [44] | |
Media outlets | BBC Sport | |
Media outlets | BBC News | |
Media outlets | US News | |
AI tools | ChatGPT 4o asked to generate random text that does not make sense | |
AI tools | ChatGPT 4o asked to use the wrong grammar intentionally | |
AI tools | ChatGPT 4o asked to rewrite human-written text | |
Percentage of AI-generated text reported by each tool (Tools A to F; Tool D reports a label instead of a percentage).

Category | Source | Details of the Text (Where Applicable) | Tool A (%) | Tool B (%) | Tool C (%) | Tool D | Tool E (%) | Tool F (%) |
---|---|---|---|---|---|---|---|---|
University | Massachusetts Institute of Technology [17] | About | 100 | 100 | 100 | “Reads like AI” | 100 | 0 |
University | University of Cambridge [18] | About the University | 100 | 100 | 0 | “Reads like AI” | 100 | 60 |
University | Harvard University [19] | Mission statement | 100 | 100 | 100 | “Hard to Tell” | 100 | 80 |
University | Imperial College London [21] | News | 100 | 100 | 100 | “Hard to Tell” | 0 | 100 |
University | University of California, Berkeley [25] | About senate leadership | 0 | 78.72 | 53.84 | “Passes as Human” | 39 | 100 |
University | University of California, Berkeley [26] | News, Letters and Science | 0 | 100 | 87.56 | “Passes as Human” | 9 | 100 |
University | Stanford University [24] | About | 100 | 67.5 | 67.2 | “Reads like AI” | 100 | 100 |
University | University of Illinois Urbana-Champaign [22] | About, History | 0 | 95.88 | 95.58 | “Hard to Tell” | 47 | 0 |
University | University of Illinois Urbana-Champaign [23] | About Diversity, Equity, and Inclusion | 100 | 0 | 70.04 | “Hard to Tell” | 99 | 0 |
University | University of British Columbia [20] | UBC programs | 100 | 100 | 100 | “Reads like AI” | 100 | 66 |
University | Idaho State University [27] | About | 0 | 100 | 100 | “Passes as Human” | 4 | 85 |
Published papers | Srianand et al. [34] | Abstract | 0 | 39.46 | 0 | “Passes as Human” | 91 | 0 |
Published papers | Chen et al. [33] | Abstract | 100 | 100 | 100 | “Passes as Human” | 99 | 100 |
Published papers | LeCun et al. [32] | Abstract | 100 | 100 | 100 | “Reads like AI” | 100 | 0 |
Published papers | Jung et al. [31] | Abstract | 0 | 41.2 | 0 | “Passes as Human” | 40 | 100 |
Published papers | Barber et al. [30] | Abstract | 0 | 100 | 0 | “Passes as Human” | 0 | 100 |
Published papers | Tsukada et al. [29] | Abstract | 0 | 100 | 0 | “Passes as Human” | 56 | 71 |
Published papers | Al-Zahrani & Alasmari [36] | Abstract | 0 | 100 | 100 | “Reads like AI” | 100 | 0 |
Published papers | Jordan & Mitchell [28] | Abstract | 100 | 100 | 100 | “Reads like AI” | 100 | 100 |
Published papers | Nishiie et al. [39] | Results and discussion | 0 | 60.24 | 59.94 | “Passes as Human” | 0 | 66 |
Published papers | Peng et al. [38] | Abstract | 0 | 0 | 0 | “Passes as Human” | 66 | 0 |
Published papers | Guo et al. [37] | Abstract | 0 | 49.47 | 49.17 | “Passes as Human” | 0 | 0 |
Published papers | Guo et al. [37] | Results | 0 | 100 | 100 | “Passes as Human” | 0 | 71 |
Journal website | Nature npj Artificial Intelligence [40] | Aims and Scope | 0 | 77.31 | 86.96 | “Reads like AI” | 100 | 90 |
Journal website | Nature Index [35] | | 0 | 100 | 14.7 | “Reads like AI” | 100 | 71 |
Journal website | Nature News Q&A [42] | | 0 | 66.31 | 41.15 | “Passes as Human” | 0 | 100 |
Journal website | Nature Portfolio [41] | | 0 | 76.8 | 0 | “Passes as Human” | 0 | 0 |
Government website | Martin Luther King Jr.’s speech “I Have a Dream” [43] | | 0 | 87.7 | 84.4 | “Passes as Human” | 0 | 0 |
Government website | Government of Canada [44] | | 0 | 87.57 | 0 | “Hard to Tell” | 0 | 60 |
Media outlets | BBC Sport | | 0 | 100 | 100 | “Passes as Human” | 0 | 100 |
Media outlets | BBC News | | 100 | 81.62 | 0.7 | “Hard to Tell” | 1 | 100 |
Media outlets | US News | | 0 | 72.15 | 0.7 | “Passes as Human” | 100 | 16 |
AI tools | ChatGPT 4o asked to generate random text that does not make sense | | 0 | 0 | 34.7 | “Passes as Human” | 4 | 0 |
AI tools | ChatGPT 4o asked to use the wrong grammar intentionally | | 0 | 0 | 34.7 | “Passes as Human” | 2 | 0 |
AI tools | ChatGPT 4o asked to rewrite human-written text | | 100 | 0 | 14.7 | “Reads like AI” | 100 | 0 |
Metrics | Tool A (Undetectable AI) | Tool B (ZeroGPT.com) | Tool C (ZeroGPT.net) | Tool D (Brandwell.ai) | Tool E (Winston AI) | Tool F (Crossplag) |
---|---|---|---|---|---|---|
Accuracy (%) | 62.9 | 14.3 | 37.1 | 71.4 | 48.6 | 31.4 |
Precision (%) | 8.3 | 0.0 | 0.0 | 11.1 | 5.9 | 0.0 |
Recall (%) | 33.3 | 0.0 | 0.0 | 33.3 | 33.3 | 0.0 |
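
The accuracy, precision, and recall figures can be related back to the per-text scores with a short calculation. The following is a minimal sketch in Python, not the authors' evaluation code. It assumes that "AI-generated" is the positive class, that a text counts as flagged when its reported AI percentage is at least 50% (the `THRESHOLD` value is an assumption, not stated in the tables), and that the three ChatGPT 4o texts are the only AI-written samples. The function name `evaluate` and the list names are hypothetical. The scores are Tool A's column from the results table above.

```python
# Tool A's reported AI percentage for each human-written text,
# copied row by row from the results table.
HUMAN_SCORES_TOOL_A = [
    100, 100, 100, 100, 0, 0, 100, 0, 100, 100, 0,  # university pages (11)
    0, 100, 100, 0, 0, 0, 0, 100, 0, 0, 0, 0,        # published papers (12)
    0, 0, 0, 0,                                      # journal website (4)
    0, 0,                                            # government website (2)
    0, 100, 0,                                       # media outlets (3)
]

# Tool A's reported AI percentage for the three ChatGPT 4o texts.
AI_SCORES_TOOL_A = [0, 0, 100]

# Assumed cut-off: a score at or above this value counts as "flagged as AI".
THRESHOLD = 50.0


def evaluate(human_scores, ai_scores, threshold=THRESHOLD):
    """Return accuracy, precision, and recall (in %) with AI as the positive class."""
    tp = sum(score >= threshold for score in ai_scores)     # AI text flagged as AI
    fn = len(ai_scores) - tp                                # AI text missed
    fp = sum(score >= threshold for score in human_scores)  # human text flagged as AI
    tn = len(human_scores) - fp                             # human text passed
    total = tp + tn + fp + fn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        # Guard against division by zero when nothing is flagged.
        "precision": 100.0 * tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": 100.0 * tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    for name, value in evaluate(HUMAN_SCORES_TOOL_A, AI_SCORES_TOOL_A).items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions Tool A flags 11 of the 32 human-written texts and 1 of the 3 AI-written texts, which gives the same accuracy, precision, and recall as the Tool A column of the metrics table. Precision is low mainly because false positives on human-written text far outnumber the single true positive.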
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wakjira, T.G.; Tijani, I.A.; Alam, M.S.; Mashal, M.; Hasan, M.K. Can We Trust AI Content Detection Tools for Critical Decision-Making? Information 2025, 16, 904. https://doi.org/10.3390/info16100904