1. Introduction
The digital landscape of the 21st century has been irrevocably shaped by the rise of automated actors. “Social bots”—algorithms designed to generate content and mimic human interaction—have evolved from simple, script-based novelties into sophisticated entities capable of influencing global discourse [1]. While early research primarily viewed these bots as deterministic tools for automated customer service or news aggregation, their capabilities have expanded alongside their potential for misuse. Today, bots deployed by malicious actors are implicated in amplifying low-credibility content, manipulating financial markets, and exacerbating political polarization by infiltrating echo chambers [2,3].
Recent years have witnessed a critical inflection point: the transition from “scripted automation” to “cognitive autonomy”. With the advent of Large Language Models (LLMs) and multimodal foundation models, the distinction between human and machine behavior has blurred significantly. Reports indicate that automated traffic surpassed human traffic for the first time in 2024, a shift driven largely by AI-powered agents [4]. Unlike their predecessors, these modern actors possess “emergent abilities”—such as reasoning, planning, and emotional mimicry—allowing them to navigate complex social dynamics with unprecedented fluidity [5,6].
This evolution has fundamentally altered the nature of online interaction. On an individual level, modern agents can now engage in long-term strategic planning and exhibit persuasive capabilities that rival human interlocutors. Studies suggest that LLM-driven agents can tailor rhetoric to specific user demographics, rendering them potent tools for computational propaganda [7]. On a collective level, these agents are increasingly deployed in “social sandboxes” to simulate human community dynamics. While such sandboxes offer a new lens for computational sociology, they also raise concerns about the scalability of synthetic misinformation [8].
Consequently, authenticating these interactions is becoming increasingly difficult as the “detection boundary” shifts. Traditional models often fail to capture the long-range dependencies inherent in multi-party, multi-turn discussions, a challenge exacerbated by the scarcity of realistic conversational datasets [9]. Furthermore, the ability of agents to utilize external tools (e.g., search engines, APIs) allows them to ground responses in real-time data. This capability enables them to bypass detection methods that rely on identifying factual hallucinations or static knowledge cutoffs [10].
We are thus witnessing a paradigm shift from disembodied software scripts to “Social Agents”—autonomous systems capable of constructing internal world models, maintaining long-term memory, and interacting with the physical world through Embodied AI [11]. However, realizing this vision requires a holistic approach. To build agents that can effectively persuade without hallucinating, or navigate physical spaces without failing, we must address fundamental challenges in intent understanding, environmental grounding, and resilient connectivity. This Special Issue, “Advances in Social Bots,” was curated to bridge this gap, providing the architectural blueprints that span from cognitive algorithms to the necessary physical infrastructure.
2. Thematic Overview: From Cognition to Infrastructure
The articles in this Special Issue illustrate that the evolution from “scripted bots” to “autonomous agents” is not a single leap, but a layered evolution. To support the high-level emergent behaviors described in the Introduction—such as strategic persuasion and adaptive social interaction—advancements are required across three fundamental layers: cognitive processing, perceptual adaptation, and physical infrastructure.
2.1. Cognitive Intelligence: Understanding Stance and Sentiment
For a social agent to interact meaningfully, it must understand not just what is said, but the stance and sentiment behind it.
Xie et al. (Contribution 1) address the challenge of zero-shot stance detection. They propose the PAMR (Pragmatic-Aware Multi-Agent Reasoning) framework, which utilizes LLMs in a multi-agent architecture. By explicitly modeling pragmatic cues like sarcasm, their work demonstrates how agents can “think” collaboratively to decipher implicit stances without task-specific training. Similarly focusing on linguistic nuance, Xu et al. (Contribution 2) explore the granularity of emotion. Their research on aspect-level sentiment analysis allows machines to disentangle complex sentence structures, a critical capability for agents engaging in nuanced human–machine dialogue.
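As an illustrative sketch only, and not the PAMR architecture itself, the fragment below shows the general pattern behind multi-agent stance reasoning: several specialist prompts (one attending to pragmatic cues such as sarcasm) are answered by a pluggable `ask` function standing in for an LLM call, and the answers are aggregated by majority vote. All names and prompts here are hypothetical.

```python
from collections import Counter
from typing import Callable

def multi_agent_stance(text: str, target: str, ask: Callable[[str], str]) -> str:
    """Illustrative multi-agent stance pipeline: specialist prompts plus a vote.

    `ask` is any function mapping a prompt to a one-word stance label
    ("favor", "against", or "neutral"); in practice it would wrap an LLM call.
    """
    prompts = [
        # Pragmatics specialist: attends to sarcasm and implicature.
        f"Considering sarcasm or irony, what stance does this text take on '{target}'? Text: {text}",
        # Literal-content specialist: ignores tone, reads the claims.
        f"Based only on the literal claims, what stance does this text take on '{target}'? Text: {text}",
        # Holistic reader: overall judgment.
        f"Overall, what stance does this text take on '{target}'? Text: {text}",
    ]
    votes = Counter(ask(p).strip().lower() for p in prompts)
    label, _ = votes.most_common(1)[0]
    return label
```

The vote is a deliberately simple aggregation; published frameworks typically add structured debate or verification rounds between the agents.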
Furthermore, Zeng et al. (Contribution 3) tackle the temporal dimension. They propose the DASR framework for joint event detection, utilizing incremental learning to mitigate “catastrophic forgetting.” This ensures that agents can continuously adapt to emerging hot topics, mitigating the risk of knowledge obsolescence in long-term social simulations.
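The DASR mechanism itself is not reproduced in this editorial; purely as a generic illustration of how replay mitigates catastrophic forgetting in incremental learning, the sketch below keeps a bounded reservoir sample of past examples and mixes it into each update. `ReplayBuffer` and `incremental_update` are illustrative names, not components of the paper.

```python
import random

class ReplayBuffer:
    """Reservoir-sampling buffer retaining a bounded, uniform sample of past items."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = []
        self.seen = 0

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Reservoir sampling: each item survives with probability capacity/seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, k: int):
        return random.sample(self.items, min(k, len(self.items)))

def incremental_update(model_update_fn, new_batch, buffer, replay_ratio=1):
    """Mix replayed past examples into each update to counter forgetting."""
    replayed = buffer.sample(len(new_batch) * replay_ratio)
    model_update_fn(new_batch + replayed)
    for ex in new_batch:
        buffer.add(ex)
```

Because the reservoir is uniform over everything seen so far, old topics keep appearing in updates even as new hot topics dominate the stream.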
2.2. Perceptual Robustness: Vision and Adaptation
Modern agents operate in a multimodal world. As the boundary between real and synthetic content blurs, perceptual robustness becomes critical—both for agents to ground themselves in reality and for systems to distinguish authentic interactions from fabricated ones.
Chen et al. (Contribution 4) explore the intersection of vision and geography. Their FLsM model utilizes large-scale visual models for the fuzzy localization of image scenes. This capability is vital for verifying the authenticity of user-generated content and distinguishing between real-world activity and fabricated bot personas. Addressing the stability of AI models in changing environments, Song et al. (Contribution 5) introduce a “Dual Constraints” method for Continual Test-Time Adaptation (CTTA). This research ensures that detection algorithms remain robust even as data distributions shift over time, providing a defense against bots that constantly alter their behavioral patterns to evade detection.
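The specific dual constraints of Contribution 5 are not detailed here; as a hedged, minimal illustration of *constrained* test-time adaptation, the sketch below updates one-dimensional normalization statistics with an exponential moving average while clamping them to stay near the source statistics, so a drifting test stream cannot drag the model arbitrarily far from its training distribution. The function name and the clamping rule are assumptions for illustration, not the paper's method.

```python
import math

def constrained_tta_update(run_mean, run_var, batch, src_mean, src_var,
                           momentum=0.1, max_drift=2.0):
    """One constrained test-time update of 1-D normalization statistics.

    The running mean/variance track the shifting test distribution via an
    exponential moving average, but are clamped to within `max_drift` source
    standard deviations (and a bounded variance ratio) of the source
    statistics, limiting cumulative drift over long test streams.
    """
    b_mean = sum(batch) / len(batch)
    b_var = sum((x - b_mean) ** 2 for x in batch) / len(batch)
    new_mean = (1 - momentum) * run_mean + momentum * b_mean
    new_var = (1 - momentum) * run_var + momentum * b_var
    # Constraint: bound the drift relative to the source statistics.
    src_std = math.sqrt(src_var)
    lo, hi = src_mean - max_drift * src_std, src_mean + max_drift * src_std
    new_mean = min(max(new_mean, lo), hi)
    new_var = min(max(new_var, src_var / max_drift), src_var * max_drift)
    return new_mean, new_var
```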
2.3. Embodied Intelligence and Network Infrastructure
Finally, as agents transition from digital chatbots to embodied robots interacting with the physical world, they require fine-grained physical perception and resilient communication networks to maintain autonomy.
Sharma et al. (Contribution 6) provide a glimpse into embodied interaction. Their work on hardness classification using cost-effective tactile sensors mimics human mechanoreceptors, paving the way for service robots that can socially and physically interact with humans.
Supporting these distributed agents requires robust connectivity. Qu et al. contribute two pivotal studies on the network layer. In their first paper (Contribution 7), they optimize the geometry of Stratospheric Pseudolite Networks (SPNs). In their second paper (Contribution 8), they propose an Intelligent Pseudolite Constellation (IPCB) based on high-altitude balloons. These studies lay the foundation for a resilient, wide-coverage communication infrastructure. Such networks are essential for coordinating swarms of autonomous agents in remote areas where terrestrial networks fail, ensuring that the “social” connection remains unbroken.
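Constellation geometry for navigation is commonly assessed through the geometric dilution of precision (GDOP). As a self-contained sketch using the standard textbook formulation, not the specific objective optimized in Contribution 7, the code below builds the geometry matrix of unit line-of-sight vectors (plus a clock-bias column) and evaluates GDOP = sqrt(trace((HᵀH)⁻¹)), illustrating that a well-spread constellation scores lower (better) than a clustered one.

```python
import math

def _invert(m):
    """Gauss-Jordan inverse of a small square matrix (list of lists)."""
    n = len(m)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def gdop(receiver, pseudolites):
    """GDOP for a receiver position and a list of pseudolite positions.

    Each row of the geometry matrix H is the unit line-of-sight vector from
    the receiver to one pseudolite, plus a clock-bias column of ones.
    """
    H = []
    for p in pseudolites:
        d = [pi - ri for pi, ri in zip(p, receiver)]
        norm = math.sqrt(sum(x * x for x in d))
        H.append([x / norm for x in d] + [1.0])
    hth = [[sum(row[i] * row[j] for row in H) for j in range(4)]
           for i in range(4)]
    inv = _invert(hth)
    return math.sqrt(sum(inv[i][i] for i in range(4)))
```

Geometry optimization then amounts to placing the platforms (subject to altitude and coverage constraints) so that GDOP stays low across the service area.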
3. Future Outlook
The research presented in this Special Issue highlights an inevitable trajectory: the convergence of Generative AI, robotics, and social computing. We are witnessing a paradigm shift from simple “Social Bots” to sophisticated “Social Agents”—entities that can think, see, and touch. Unlike their text-centric predecessors, these future agents will operate across three interconnected dimensions, as evidenced by the contributions in this volume.
- Cognitive Autonomy: Future agents must move beyond script adherence to exhibit emergent behaviors and reasoning. They will simulate complex human societal dynamics in computational sandboxes, a direction supported by recent surveys on LLM-based multi-agent systems [12].
- Multimodal Perception: Agents will seamlessly transition between digital platforms and physical forms. They will require visual adaptability to verify reality and tactile sensitivity to interact with the physical world.
- Resilient Infrastructure: The sustainability of these agents will depend on robust network architectures capable of supporting distributed, autonomous swarms in even the most remote environments.
As these technologies mature, the challenge for the scientific community shifts from merely detecting these actors to understanding their complex interactions with human society. The boundary between human and machine is blurring, necessitating new governance frameworks to ensure that these powerful agents align with human values.
4. Conclusions
We extend our gratitude to all the authors, reviewers, and the editorial team who made this Special Issue possible. “Advances in Social Bots” stands as a testament to the field’s diversity, successfully bridging the gap between abstract software algorithms and tangible hardware engineering.
By integrating cognitive intelligence, perceptual robustness, and physical infrastructure, this collection provides the foundational blueprints for the next generation of autonomous systems. We hope this Special Issue inspires further inquiry into the symbiotic future of human and machine intelligence.
Funding
This work was partly supported by the National Natural Science Foundation of China (U25B2042).
Conflicts of Interest
The authors declare no conflicts of interest.
List of Contributions
- Xie, Z.; Niu, F.; Dai, G.; Zhang, B. From Claims to Stance: Zero-Shot Detection with Pragmatic-Aware Multi-Agent Reasoning. Electronics 2025, 14, 4298. https://doi.org/10.3390/electronics14214298.
- Xu, E.; Zhu, J.; Zhang, L.; Wang, Y.; Lin, W. Research on Aspect-Level Sentiment Analysis Based on Adversarial Training and Dependency Parsing. Electronics 2024, 13, 1993. https://doi.org/10.3390/electronics13101993.
- Zeng, X.; Luo, G.; Qin, K. Joint Event Detection with Dynamic Adaptation and Semantic Relevance. Electronics 2025, 14, 234. https://doi.org/10.3390/electronics14020234.
- Chen, W.; Miao, L.; Gui, J.; Wang, Y.; Li, Y. FLsM: Fuzzy Localization of Image Scenes Based on Large Models. Electronics 2024, 13, 2106. https://doi.org/10.3390/electronics13112106.
- Song, Y.; Liu, P.; Wu, Y. Enhancing the Sustained Capability of Continual Test-Time Adaptation with Dual Constraints. Electronics 2025, 14, 3891. https://doi.org/10.3390/electronics14193891.
- Sharma, Y.; Ferreira, P.; Justham, L. Hardness Classification Using Cost-Effective Off-the-Shelf Tactile Sensors Inspired by Mechanoreceptors. Electronics 2024, 13, 2450. https://doi.org/10.3390/electronics13132450.
- Qu, Y.; Wang, S.; Feng, H.; Liu, Q. Geometry Optimization of Stratospheric Pseudolite Network for Navigation Applications. Electronics 2024, 13, 2397. https://doi.org/10.3390/electronics13122397.
- Qu, Y.; Wang, S.; Pan, T.; Feng, H. IPCB: Intelligent Pseudolite Constellation Based on High-Altitude Balloons. Electronics 2024, 13, 2095. https://doi.org/10.3390/electronics13112095.
References
- Ferrara, E.; Varol, O.; Davis, C.; Menczer, F.; Flammini, A. The rise of social bots. Commun. ACM 2016, 59, 96–104. [Google Scholar] [CrossRef]
- Shao, C.; Ciampaglia, G.L.; Varol, O.; Yang, K.C.; Flammini, A.; Menczer, F. The spread of low-credibility content by social bots. Nat. Commun. 2018, 9, 4787. [Google Scholar] [CrossRef] [PubMed]
- Cresci, S. A decade of social bot detection. Commun. ACM 2020, 63, 72–83. [Google Scholar] [CrossRef]
- Thales Group. 2025 Bad Bot Report. 2025. Available online: https://www.imperva.com/resources/resource-library/reports/2025-bad-bot-report/ (accessed on 24 November 2025).
- Park, J.S.; O’Brien, J.; Cai, C.J.; Morris, M.R.; Liang, P.; Bernstein, M.S. Generative Agents: Interactive Simulacra of Human Behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST’23), New York, NY, USA, 29 October–1 November 2023. [Google Scholar] [CrossRef]
- Xi, Z.; Chen, W.; Guo, X.; He, W.; Ding, Y.; Hong, B.; Zhang, M.; Wang, J.; Jin, S.; Zhou, E.; et al. The rise and potential of large language model based agents: A survey. Sci. China Inf. Sci. 2025, 68, 121101. [Google Scholar] [CrossRef]
- Salvi, F.; Ribeiro, M.H.; Gallotti, R.; West, R. On the conversational persuasiveness of large language models: A randomized controlled trial. arXiv 2024, arXiv:2403.14380. [Google Scholar] [CrossRef]
- Gao, C.; Lan, X.; Lu, Z.; Mao, J.; Piao, J.; Wang, H.; Jin, D.; Li, Y. S3: Social-network simulation system with large language model-empowered agents. arXiv 2023, arXiv:2307.14984. [Google Scholar] [CrossRef]
- Niu, F.; Yang, M.; Li, A.; Zhang, B.; Peng, X.; Zhang, B. A Challenge Dataset and Effective Models for Conversational Stance Detection. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italy, 20–25 May 2024. [Google Scholar] [CrossRef]
- Wang, G.; Xie, Y.; Jiang, Y.; Mandlekar, A.; Xiao, C.; Zhu, Y.; Fan, L.; Anandkumar, A. Voyager: An Open-Ended Embodied Agent with Large Language Models. arXiv 2023, arXiv:2305.16291. [Google Scholar] [CrossRef]
- Fung, P.; Bachrach, Y.; Celikyilmaz, A.; Chaudhuri, K.; Chen, D.; Chung, W.; Dupoux, E.; Gong, H.; Jégou, H.; Lazaric, A.; et al. Embodied AI agents: Modeling the world. arXiv 2025, arXiv:2506.22355. [Google Scholar] [CrossRef]
- Guo, T.; Chen, X.; Wang, Y.; Chang, R.; Pei, S.; Chawla, N.V.; Wiest, O.; Zhang, X. Large Language Model Based Multi-agents: A Survey of Progress and Challenges. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, Jeju, Republic of Korea, 3–9 August 2024; pp. 8048–8057. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).