MDPI - Publisher of Open Access Journals

25 pages, 4657 KB

Open Access Article

Identifying Methodological Language in Psychology Abstracts: A Machine Learning Approach Using NLP and Embedding-Based Clustering

by Konstantinos G. Stathakis, George Papageorgiou and Christos Tjortjis

Big Data Cogn. Comput. 2025, 9(9), 224; https://doi.org/10.3390/bdcc9090224 - 29 Aug 2025

Viewed by 349

Abstract

Research articles are valuable resources for Information Retrieval and Natural Language Processing (NLP) tasks, offering opportunities to analyze key components of scholarly content. This study investigates the presence of methodological terminology in psychology research over the past 30 years (1995–2024) by applying a novel NLP and Machine Learning pipeline to a large corpus of 85,452 abstracts, as well as the extent to which this terminology forms distinct thematic groupings. Combining glossary-based extraction, contextualized language model embeddings, and dual-mode clustering, this study offers a scalable framework for the exploration of methodological transparency in scientific text via deep semantic structures. A curated glossary of 365 method-related keywords served as a gold-standard reference for term identification, using direct and fuzzy string matching. Retrieved terms were encoded with SciBERT, averaging embeddings across contextual occurrences to produce unified vectors. These vectors were clustered using unsupervised and weighted unsupervised approaches, yielding six and ten clusters, respectively. Cluster composition was analyzed using weighted statistical measures to assess term importance within and across groups. A total of 78.16% of the examined abstracts contained glossary terms, with an average of 1.8 terms per abstract, highlighting an increasing presence of methodological terminology in psychology and reflecting a shift toward greater transparency in research reporting. This work goes beyond the use of static vectors by incorporating contextual understanding in the examination of methodological terminology, while offering a scalable and generalizable approach to semantic analysis in scientific texts, with implications for meta-research, domain-specific lexicon development, and automated scientific knowledge discovery.
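
The direct and fuzzy glossary matching mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `find_glossary_terms`, the 0.85 similarity threshold, and the three-term sample glossary are assumptions made for illustration, and the standard-library `difflib.SequenceMatcher` stands in for whichever fuzzy matcher the study actually used.

```python
import re
from difflib import SequenceMatcher


def find_glossary_terms(abstract, glossary, threshold=0.85):
    """Return the glossary terms found in an abstract.

    A term counts as found if it appears verbatim (direct match) or if some
    window of consecutive words is at least `threshold` similar to it
    (fuzzy match). The threshold value is an illustrative assumption.
    """
    text = abstract.lower()
    tokens = re.findall(r"[a-z0-9\-]+", text)
    found = set()
    for term in glossary:
        term_lower = term.lower()
        # Direct match: the whole term on word boundaries.
        if re.search(r"\b" + re.escape(term_lower) + r"\b", text):
            found.add(term)
            continue
        # Fuzzy match: compare the term with every window of the same word count.
        n_words = len(term_lower.split())
        for i in range(len(tokens) - n_words + 1):
            window = " ".join(tokens[i:i + n_words])
            if SequenceMatcher(None, term_lower, window).ratio() >= threshold:
                found.add(term)
                break
    return found


# Hypothetical glossary and abstract, for illustration only.
sample_glossary = ["regression", "factor analysis", "anova"]
sample_abstract = "We ran a multiple regresion and a factor analysis on survey data."
print(sorted(find_glossary_terms(sample_abstract, sample_glossary)))
```

Here the misspelled "regresion" is caught only by the fuzzy branch, which is the kind of case a fuzzy pass is meant to recover; a stricter or looser threshold trades missed variants against false positives.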

(This article belongs to the Special Issue Machine Learning Applications in Natural Language Processing)

36 pages, 23263 KB

Open Access Article

RL-TweetGen: A Socio-Technical Framework for Engagement-Optimized Short Text Generation in Digital Commerce Using Large Language Models and Reinforcement Learning

by Chitrakala S and Pavithra S S

J. Theor. Appl. Electron. Commer. Res. 2025, 20(3), 218; https://doi.org/10.3390/jtaer20030218 - 26 Aug 2025

Viewed by 833

Abstract

In the rapidly evolving landscape of digital marketing and electronic commerce, short-form content—particularly on platforms like Twitter (now X)—has become pivotal for real-time branding, community engagement, and product promotion. The rise of Non-Fungible Tokens (NFTs) and Web3 ecosystems further underscores the need for domain-specific, engagement-oriented social media content. However, automating the generation of such content while balancing linguistic quality, semantic relevance, and audience engagement remains a substantial challenge. To address this, we propose RL-TweetGen, a socio-technical framework that integrates instruction-tuned large language models (LLMs) with reinforcement learning (RL) to generate concise, impactful, and engagement-optimized tweets. The framework incorporates a structured pipeline comprising domain-specific data curation, semantic classification, and intent-aware prompt engineering, and leverages Parameter-Efficient Fine-Tuning (PEFT) with LoRA for scalable model adaptation. We fine-tuned and evaluated three LLMs—LLaMA-3.1-8B, Mistral-7B Instruct, and DeepSeek 7B Chat—guided by a hybrid reward function that blends XGBoost-predicted engagement scores with expert-in-the-loop feedback. To enhance lexical diversity and contextual alignment, we implemented advanced decoding strategies, including Tailored Beam Search, Enhanced Top-p Sampling, and Contextual Temperature Scaling. A case study focused on NFT-related tweet generation demonstrated the practical effectiveness of RL-TweetGen. Experimental results showed that Mistral-7B achieved the highest lexical fluency (BLEU: 0.2285), LLaMA-3.1 exhibited superior semantic precision (BERT-F1: 0.8155), while DeepSeek 7B provided balanced performance. Overall, RL-TweetGen presents a scalable and adaptive solution for marketers, content strategists, and Web3 platforms seeking to automate and optimize social media engagement. 
The framework advances the role of generative AI in digital commerce by aligning content generation with platform dynamics, user preferences, and marketing goals.
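
The hybrid reward mentioned in this abstract, which blends a predicted engagement score with expert feedback, could take many forms. The sketch below assumes the simplest one, a weighted average of two scores already scaled to [0, 1]. The function name `hybrid_reward` and the `weight` parameter are hypothetical; the abstract does not specify how the two signals are combined.

```python
def hybrid_reward(engagement_score, expert_score, weight=0.5):
    """Blend a model-predicted engagement score with an expert rating.

    Both scores are assumed to lie in [0, 1]. `weight` is the share given to
    the predicted engagement score (an illustrative assumption, not a value
    reported in the abstract).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * engagement_score + (1.0 - weight) * expert_score


# Hypothetical scores for one generated tweet.
print(hybrid_reward(engagement_score=0.8, expert_score=0.4, weight=0.5))
```

A linear blend keeps the reward interpretable: setting `weight` to 1.0 recovers a purely model-driven reward, and 0.0 a purely expert-driven one.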

20 pages, 3407 KB

Open Access Review

Application of Digital Twin Technology in Smart Agriculture: A Bibliometric Review

by Rajesh Gund, Chetan M. Badgujar, Sathishkumar Samiappan and Sindhu Jagadamma

Agriculture 2025, 15(17), 1799; https://doi.org/10.3390/agriculture15171799 - 22 Aug 2025

Viewed by 797

Abstract

Digital twin technology is reshaping modern agriculture. Digital twins are virtual replicas of real-world farming systems that are continuously updated with real-time data, and they are revolutionizing the monitoring, simulation, and optimization of agricultural processes. The literature on agricultural digital twins is multidisciplinary, growing rapidly, and often fragmented across disciplines, and it lacks well-curated documentation. A bibliometric analysis, which includes thematic content analysis and science mapping, reveals research trends, gaps, the thematic landscape, and key contributors in this continuously evolving and emerging field. Therefore, in this study, we conducted a bibliometric review that included collecting bibliometric data via keyword search strategies on popular scientific databases. The data was further screened, processed, analyzed, and visualized using bibliometric tools to map research trends, landscapes, collaborations, and themes. Key findings show that publications have grown exponentially since 2018, with an annual growth rate of 27.2%. The major contributing countries were China, the USA, the Netherlands, Germany, and India. We observed a collaboration network with distinct geographic clusters, with strong intra-European ties and more localized efforts in China and the USA. The analysis identified seven major research theme clusters revolving around precision farming, Internet of Things integration, artificial intelligence, cyber–physical systems, controlled-environment agriculture, sustainability, and food system applications. We observed that core technologies, such as sensors, artificial intelligence, and data analytics, have been extensively explored, and we identified gaps in research areas. The emerging interests include climate resilience, renewable-energy integration, and supply-chain optimization.
The observed transition from task-specific tools to integrated, system-level approaches underlines the growing need for adaptive, data-driven decision support. By outlining research trends and identifying strategic research gaps, this review offers insights into leveraging digital twins to improve productivity, sustainability, and resilience in global agriculture.
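
The abstract reports an annual growth rate of 27.2% but does not say how it was computed. A common convention in bibliometrics is the compound annual growth rate, and the sketch below assumes that definition. The function names and the publication counts are hypothetical and serve only to show the arithmetic.

```python
import math


def annual_growth_rate(first_count, last_count, n_years):
    """Compound annual growth rate between two yearly publication counts."""
    if first_count <= 0 or n_years <= 0:
        raise ValueError("first_count and n_years must be positive")
    return (last_count / first_count) ** (1.0 / n_years) - 1.0


def doubling_time(rate):
    """Years needed for output to double at a constant compound rate."""
    return math.log(2.0) / math.log(1.0 + rate)


# Hypothetical counts: 100 papers in the first year, growing 27.2% per year
# over six years.
first, years = 100, 6
last = first * 1.272 ** years
print(round(annual_growth_rate(first, last, years), 3))
print(round(doubling_time(0.272), 2))
```

Under this assumed definition, a 27.2% rate implies that yearly output doubles in just under three years.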

(This article belongs to the Special Issue Intelligent Sensing and Edge AI-Driven Systems for Precision Agriculture)

19 pages, 6354 KB

Open Access Article

Extract Nutritional Information from Bilingual Food Labels Using Large Language Models

by Fatmah Y. Assiri, Mohammad D. Alahmadi, Mohammed A. Almuashi and Ayidh M. Almansour

J. Imaging 2025, 11(8), 271; https://doi.org/10.3390/jimaging11080271 - 13 Aug 2025

Viewed by 617

Abstract

Food product labels serve as a critical source of information, providing details about nutritional content, ingredients, and health implications. These labels enable Food and Drug Authorities (FDA) to ensure compliance and take necessary health-related and logistics actions. Additionally, product labels are essential for online grocery stores to offer reliable nutrition facts and empower customers to make informed dietary decisions. Unfortunately, product labels are typically available in image formats, requiring organizations and online stores to manually transcribe them, a process that is not only time-consuming but also highly prone to human error, especially with multilingual labels that add complexity to the task. Our study investigates the challenges and effectiveness of leveraging large language models (LLMs) to extract nutritional elements and values from multilingual food product labels, with a specific focus on Arabic and English. A comprehensive empirical analysis was conducted using a manually curated dataset of 294 food product labels, comprising 588 transcribed nutritional elements and values in both languages, which served as the ground truth for evaluation. The findings reveal that while LLMs performed better at extracting English elements and values than Arabic ones, our post-processing techniques significantly enhanced their accuracy, with GPT-4o outperforming GPT-4V and Gemini.

(This article belongs to the Special Issue Computer Vision for Food Data Analysis: Methods, Challenges, and Applications)

11 pages, 1020 KB

Open Access Commentary

Disconnected in a Connected World: Improving Digital Literacies Instruction to Reconnect with Each Other, Ideas, and Texts

by Joseph Marangell and Régine Randall

Educ. Sci. 2025, 15(8), 1026; https://doi.org/10.3390/educsci15081026 - 11 Aug 2025

Viewed by 351

Abstract

This commentary addresses a problem of practice related to student disengagement in technology-rich classrooms, where learners are digitally connected but socially and academically disconnected. Although not an empirical study, the commentary draws on instructional examples from secondary- and graduate-level teaching. The authors examine how digital literacy instruction can strengthen engagement, reading comprehension, and ethical participation in online environments. The article highlights strategies such as the workshop model, multimodal composition, digital content curation, and the use of mentor texts to support critical thinking and collaborative learning. These practices aim to develop students’ analytical skills, awareness of audience, and recognition of their own positionality in digital spaces. Across courses, the authors reflected on increased student engagement when digital tools were used not simply for task completion but to support inquiry, discourse, and authentic creation for real audiences.

(This article belongs to the Special Issue Digital Literacy Environments and Reading Comprehension)

22 pages, 538 KB

Open Access Article

Meaning in the Algorithmic Museum: Towards a Dialectical Modelling Nexus of Virtual Curation

by Huining Guan and Pengbo Chen

Heritage 2025, 8(7), 284; https://doi.org/10.3390/heritage8070284 - 17 Jul 2025

Viewed by 585

Abstract

The rise of algorithm-driven virtual museums presents a philosophical challenge for how cultural meaning is constructed and critiqued in digital curation. Prevailing approaches highlight important but partial aspects: the loss of aura and authenticity in digital reproductions, efforts to maintain semiotic continuity with physical exhibits, optimistic narratives of technological democratisation, and critical technopessimist warnings about commodification and bias. Yet none provides a unified theoretical model of meaning-making under algorithmic curation. This paper proposes a dialectical-semiotic framework to synthesise and transcend these positions. The Dialectical Modelling Nexus (DMN) is a new conceptual structure that views meaning in virtual museums as emerging from the dynamic interplay of original and reproduced contexts, human and algorithmic sign systems, personal interpretation, and ideological framing. Through a critique of prior theories and a synthesis of their insights, the DMN offers a comprehensive model to diagnose how algorithms mediate museum content and to guide critical curatorial practice. The framework illuminates the dialectical tensions at the heart of algorithmic cultural mediation and suggests principles for preserving authentic, multi-layered meaning in the digital museum milieu.

(This article belongs to the Special Issue Digital Museology and Emerging Technologies in Cultural Heritage)

28 pages, 319 KB

Open Access Article

Mediated Mothering: Exploring Maternal and Adolescent Social Media Use and Social Comparison During and Beyond COVID-19

by Amanda L. Sams, Marquita S. Smith, Bitt Moon and Leslie J. Ray

Journal. Media 2025, 6(3), 103; https://doi.org/10.3390/journalmedia6030103 - 15 Jul 2025

Viewed by 1319

Abstract

This study aimed to explore how social media usage influenced both parent and adolescent mental health and social identity during and after the COVID-19 pandemic through the lens of social comparison theory. In-depth interviews with 24 mothers of adolescent children (ages 10–19) were conducted to address the research questions. Qualitative thematic analysis of the interview transcripts revealed eight emerging themes: (1) learning and entertainment, (2) maternal fears related to content binging and cyberbullying, (3) finding connection and comfort through social media during the pandemic, (4) ongoing digital care work as lasting maternal labor, (5) iterative dialogue: platform restrictions and content curation boundaries, (6) upward and downward social comparison, (7) fear of missing out (FoMO), and (8) third-person perception (TPP). The findings show that mothers perceive social media usage as either beneficial or harmful among adolescents (their children); upward and downward social comparison via social media exhibits more dynamic mechanisms. Moreover, this study enhances our theoretical understanding by linking social media usage to social identity, social comparison, and mental health during a global health crisis.

(This article belongs to the Collection Role of Media and Journalism during COVID-19 Pandemic: Lessons Learned and Future Challenges)

17 pages, 265 KB

Open Access Article

News Curation in Digital Media: Analysis of Four Newspapers’ Front Pages

by Javier Guallar, Jesús Cascón-Katchadourian, Carlos Lopezosa and Juan-José Boté-Vericad

Journal. Media 2025, 6(3), 97; https://doi.org/10.3390/journalmedia6030097 - 4 Jul 2025

Viewed by 566

Abstract

This study examines content curation on the front pages of four Spanish digital newspapers: two legacy media outlets, El País and La Vanguardia, and two purely digital, elDiario.es and El Español. This exploratory and unprecedented research aims to understand the characteristics of front-page news curation in digital press in terms of themes, authorship and genre, origin and temporal range of curated content, types of sources used, the function of curated links, and techniques employed to add value to the presented information. The CAS (Curation Analysis System) methodology is used for content analysis of a selection of news items from different time slots and days of the week, following the constructed week method. The results provide a better understanding of the characteristics of curation on newspaper front pages, comparing them with other curation studies in journalistic products, such as media newsletters and live news.

23 pages, 8902 KB

Open Access Article

2D Prediction of the Nutritional Composition of Dishes from Food Images: Deep Learning Algorithm Selection and Data Curation Beyond the Nutrition5k Project

by Rachele Bianco, Sergio Coluccia, Michela Marinoni, Alex Falcon, Federica Fiori, Giuseppe Serra, Monica Ferraroni, Valeria Edefonti and Maria Parpinel

Nutrients 2025, 17(13), 2196; https://doi.org/10.3390/nu17132196 - 30 Jun 2025

Viewed by 960

Abstract

Background/Objectives: Deep learning (DL) has shown strong potential in analyzing food images, but few studies have directly predicted mass, energy, and macronutrient content from images. In addition to the importance of high-quality data, differences in country-specific food composition databases (FCDBs) can hinder model generalization. Methods: We assessed the performance of several standard DL models using four ground truth datasets derived from Nutrition5k, the largest image–nutrition dataset with ~5000 complex US cafeteria dishes. With a view to developing an Italian dietary assessment tool, these datasets varied by FCDB alignment (Italian vs. US) and data curation (ingredient–mass correction and frame filtering on the test set). We evaluated combinations of four feature extractors [ResNet-50 (R50), ResNet-101 (R101), InceptionV3 (IncV3), and Vision Transformer-B-16 (ViT-B-16)] with two regression networks (2+1 and 2+2), using IncV3_2+2 as the benchmark. Descriptive statistics (percentages of agreement, unweighted Cohen’s kappa, and Bland–Altman plots) and standard regression metrics were used to compare predicted and ground truth nutritional composition. Dishes mispredicted by ≥7 algorithms were analyzed separately. Results: R50, R101, and ViT-B-16 consistently outperformed the benchmark across all datasets. Specifically, when replacing it with these top algorithms, reductions in median Mean Absolute Percentage Errors were 6.2% for mass, 6.4% for energy, 12.3% for fat, and 33.1% and 40.2% for protein and carbohydrates. Ingredient–mass correction substantially improved prediction metrics (6–42% when considering the top algorithms), while frame filtering had a more limited effect (<3%). Performance was consistently poor across most models for complex salads, chicken-based or egg-based dishes, and Western-inspired breakfasts.
Conclusions: The R101 and ViT-B-16 architectures will be prioritized in future analyses, where ingredient–mass correction and automated frame filtering methods will be considered.
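
Two of the comparison tools named in this abstract, the Mean Absolute Percentage Error and Bland–Altman agreement limits, follow standard formulas that can be sketched in a few lines of Python. The function names and the dish masses below are hypothetical; the sketch shows the textbook definitions, not the authors' evaluation code.

```python
from statistics import mean, stdev


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent (true values must be non-zero)."""
    errors = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return 100.0 * mean(errors)


def bland_altman_limits(y_true, y_pred):
    """Return (bias, lower limit, upper limit) of agreement.

    Bias is the mean of predicted minus true values; the limits are
    bias -/+ 1.96 sample standard deviations of those differences.
    """
    diffs = [p - t for t, p in zip(y_true, y_pred)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical dish masses in grams: ground truth versus model prediction.
true_mass = [100.0, 200.0, 300.0]
pred_mass = [110.0, 190.0, 310.0]
print(mape(true_mass, pred_mass))
print(bland_altman_limits(true_mass, pred_mass))
```

MAPE summarizes relative error in a single figure, while the Bland–Altman limits show whether a model systematically over- or under-predicts and how widely individual predictions scatter around that bias.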

(This article belongs to the Special Issue Artificial Intelligence in Nutrition Research: Current and Future Perspectives and Applications)

22 pages, 2732 KB

Open Access Article

AI-Based Learning Recommendations: Use in Higher Education

by Prabin Dahal, Saptadi Nugroho, Claudia Schmidt and Volker Sänger

Future Internet 2025, 17(7), 285; https://doi.org/10.3390/fi17070285 - 26 Jun 2025

Viewed by 649

Abstract

We propose an extension for Artificial Intelligence (AI)-supported learning recommendations within higher education, focusing on enhancing the widely used Moodle Learning Management System (LMS) and extending it to a Learning eXperience Platform (LXP). The proposed LXP is an enhancement of Moodle, with an emphasis on learning support and learner motivation, incorporating various recommendation types such as content-based, collaborative, and session-based recommendations to suggest the next learning resources, which are given by lecturers and retrieved through the content curation of Open Educational Resources (OER). In addition, we integrated a chatbot using Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) with AI-based recommendations to provide an effective learning experience.

(This article belongs to the Special Issue Deep Learning in Recommender Systems)

29 pages, 1602 KB

Open Access Article

A Recommender System Model for Presentation Advisor Application Based on Multi-Tower Neural Network and Utility-Based Scoring

by Maria Vlahova-Takova and Milena Lazarova

Electronics 2025, 14(13), 2528; https://doi.org/10.3390/electronics14132528 - 22 Jun 2025

Viewed by 1029

Abstract

Delivering compelling presentations is a critical skill across academic, professional, and public domains, yet many presenters struggle with structuring content, maintaining visual consistency, and engaging their audience effectively. Existing tools offer isolated support for design or delivery but fail to promote long-term skill development. This paper presents a novel intelligent application, the Presentation Advisor application, powered by a personalized recommendation engine that goes beyond fixing slide content and visualization, enabling users to build presentation competence. The recommendation engine leverages a model based on hybrid multi-tower neural network architecture enhanced with temporal encoding, problem sequence modeling, and utility-based scoring to deliver adaptive context-aware feedback. Unlike current tools, the presented system analyzes user-submitted presentations to detect common issues and delivers curated educational content tailored to user preferences, presentation types, and audiences. The system also incorporates strategic cold-start mitigation, ensuring high-quality recommendations even for new users or unseen content. Comprehensive experimental evaluations demonstrate that the suggested model significantly outperforms content-based filtering, collaborative filtering, autoencoders, and reinforcement learning approaches across both accuracy and personalization metrics. By combining cutting-edge recommendation techniques with a pedagogical framework, the Presentation Advisor application enables users not only to improve individual presentations but to become consistently better presenters over time.

(This article belongs to the Section Computer Science & Engineering)

18 pages, 3597 KB

Open Access Article

Advancing Image Spam Detection: Evaluating Machine Learning Models Through Comparative Analysis

by Mahnoor Jamil, Hristina Mihajloska Trpcheska, Aleksandra Popovska-Mitrovikj, Vesna Dimitrova and Reiner Creutzburg

Appl. Sci. 2025, 15(11), 6158; https://doi.org/10.3390/app15116158 - 30 May 2025

Viewed by 1077

Abstract

Image-based spam poses a significant challenge for traditional text-based filters, as malicious content is often embedded within images to bypass keyword detection techniques. This study investigates and compares the performance of six machine learning models, namely ResNet50, XGBoost, Logistic Regression, LightGBM, Support Vector Machine (SVM), and VGG16, using a curated dataset containing 678 legitimate (ham) and 520 spam images. The novelty of this research lies in its comprehensive side-by-side evaluation of diverse models on the same dataset, using standardized dataset preprocessing, balanced data splits, and validation techniques. Model performance was assessed using evaluation metrics such as accuracy, receiver operating characteristic (ROC) curve, precision, recall, and area under the curve (AUC). The results indicate that ResNet50 achieved the highest classification performance, followed closely by XGBoost and Logistic Regression. This work provides practical insights into the strengths and limitations of traditional, ensemble-based, and deep learning models for image-based spam detection. The findings can support the development of more effective and generalizable spam filtering solutions in multimedia-rich communication platforms.
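
Accuracy, precision, and recall, three of the metrics listed in this abstract, can be computed directly from predicted and true labels. The sketch below uses the standard definitions; the function name `classification_metrics` and the five example labels are hypothetical, and only the class counts (678 ham, 520 spam) come from the abstract.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for a binary task (spam = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when nothing is predicted or labelled positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Class counts reported in the abstract; always predicting "ham" sets the
# accuracy floor that any useful model has to beat.
HAM, SPAM = 678, 520
majority_baseline = max(HAM, SPAM) / (HAM + SPAM)

# Hypothetical labels (1 = spam, 0 = ham), for illustration only.
print(classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
print(round(majority_baseline, 3))
```

Because the two classes are not equally sized, the majority-class baseline of roughly 57% is a useful reference point when reading reported accuracy figures.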

(This article belongs to the Special Issue New Advances in Computer Security and Cybersecurity)

26 pages, 364 KB

Open Access Essay

Viral Leadership: Algorithmic Amplification and the Rise of Leadership Fashions

by Dag Øivind Madsen and Kåre Slåtten

Adm. Sci. 2025, 15(6), 202; https://doi.org/10.3390/admsci15060202 - 26 May 2025

Cited by 1 | Viewed by 1991

Abstract

This essay examines how AI-driven content curation reshapes leadership fashions through algorithmic amplification on social media platforms. Algorithms designed to maximize engagement selectively elevate certain leadership styles, such as authentic, servant, and transformational leadership, while marginalizing others, including transactional or directive approaches. Drawing on leadership fashion theory, an extension of management fashion theory, this essay analyzes how viral content, influencer dynamics, and algorithmic prioritization collectively construct contemporary leadership ideals. It highlights the central role of leadership gurus such as Simon Sinek, Brené Brown, and Gary Vaynerchuk, and critiques the risks of oversimplification and performative authenticity in algorithmically mediated leadership discourse. Using recent empirical findings and real-world examples, the analysis shows how emotionally resonant and morally charged content gains disproportionate visibility, potentially distorting leadership development and practice. This essay concludes by discussing implications for organizations, leadership education, and research, and calls for a renewed commitment to evidence-based leadership theory and practice in the face of algorithmic influence.

(This article belongs to the Section Leadership)

26 pages, 28790 KB

Open Access Article

Understanding Social Biases in Large Language Models

by Ojasvi Gupta, Stefano Marrone, Francesco Gargiulo, Rajesh Jaiswal and Lidia Marassi

AI 2025, 6(5), 106; https://doi.org/10.3390/ai6050106 - 20 May 2025

Cited by 1 | Viewed by 3826

Abstract

Background/Objectives: Large Language Models (LLMs) like ChatGPT, LLAMA, and Mistral are widely used for automating tasks such as content creation and data analysis. However, due to their training on publicly available internet data, they may inherit social biases. We aimed to investigate the social biases (i.e., ethnic, gender, and disability biases) in these models and evaluate how different model versions handle them. Methods: We instruction-tuned popular models (such as Mistral, LLAMA, and Gemma) on a curated dataset constructed by collecting and modifying diverse data from various public datasets. Prompts were run through a controlled pipeline, and responses were categorized (e.g., biased, confused, repeated, or accurate) and analyzed. Results: We found that models responded differently to bias prompts depending on their version. Fine-tuned models showed fewer overt biases but more confusion or censorship. Disability-related prompts triggered the most consistent biases across models. Conclusions: Bias persists in LLMs despite instruction tuning. Differences between model versions may lead to inconsistent user experiences and hidden harms in downstream applications. Greater transparency and robust fairness testing are essential.

(This article belongs to the Special Issue AI Bias in the Media and Beyond)

28 pages, 13728 KB

Open Access Article

Molecular Recognition of SARS-CoV-2 Mpro Inhibitors: Insights from Cheminformatics and Quantum Chemistry

by Adedapo Olosunde and Xiche Hu

Molecules 2025, 30(10), 2174; https://doi.org/10.3390/molecules30102174 - 15 May 2025

Viewed by 758

Abstract

The SARS-CoV-2 main protease (Mpro), essential for viral replication, remains a prime target for antiviral drug design against COVID-19 and related coronaviruses. In this study, we present a systematic investigation into the molecular determinants of Mpro inhibition using an integrated approach combining large-scale data mining, cheminformatics, and quantum chemical calculations. A curated dataset comprising 963 high-resolution structures of Mpro–ligand complexes—348 covalent and 615 non-covalent inhibitors—was mined from the Protein Data Bank. Cheminformatics analysis revealed distinct physicochemical profiles for each inhibitor class: covalent inhibitors tend to exhibit higher hydrogen bonding capacity and sp³ character, while non-covalent inhibitors are enriched in aromatic rings and exhibit greater aromaticity and lipophilicity. A novel descriptor, Weighted Hydrogen Bond Count (WHBC), normalized for molecular size, revealed a notable inverse correlation with aromatic ring count, suggesting a compensatory relationship between hydrogen bonding and π-mediated interactions. To elucidate the energetic underpinnings of molecular recognition, 40 representative inhibitors (20 covalent, 20 non-covalent) were selected based on principal component analysis and aromatic ring content. Quantum mechanical calculations at the double-hybrid B2PLYP/def2-QZVP level quantified non-bonded interaction energies, revealing that covalent inhibitors derive binding strength primarily through hydrogen bonding (~63.8%), whereas non-covalent inhibitors depend predominantly on π–π stacking and CH–π interactions (~62.8%). Representative binding pocket analyses further substantiate these findings: the covalent inhibitor F2F-2020198-00X exhibited strong hydrogen bonds with residues such as Glu166 and His163, while the non-covalent inhibitor EDG-MED-10fcb19e-1 engaged in extensive π-mediated interactions with residues like His41, Met49, and Met165. 
The distinct interaction patterns led to the establishment of pharmacophore models, highlighting key recognition motifs for both covalent and non-covalent inhibitors. Our findings underscore the critical role of aromaticity and non-bonded π interactions in driving binding affinity, complementing or, in some cases, substituting for hydrogen bonding, and offer a robust framework for the rational design of next-generation Mpro inhibitors with improved selectivity and resistance profiles.

(This article belongs to the Special Issue Fundamental Aspects of Chemical Bonding—2nd Edition)

Search Results (109)
