Article

Prevention Is Better than Cure: Exposing the Vulnerabilities of Social Bot Detectors with Realistic Simulations

by Rui Jin and Yong Liao *
School of Cyber Science and Technology, University of Science and Technology of China, Hefei 230026, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(11), 6230; https://doi.org/10.3390/app15116230
Submission received: 24 April 2025 / Revised: 25 May 2025 / Accepted: 29 May 2025 / Published: 1 June 2025
(This article belongs to the Special Issue Artificial Neural Network and Deep Learning in Cybersecurity)

Abstract
The evolution of social bots, i.e., accounts on social media platforms controlled by malicious software, is making them increasingly challenging to discover. A practical solution is to explore the adversarial nature of novel bots and find the vulnerabilities of bot detectors in simulations in advance. However, current studies fail to realistically simulate the environment and the bots' actions, thus not effectively representing the competition between novel bots and bot detectors. Hence, we propose a new method for modeling the impact of bot actions and develop a new bot strategy to simulate various evolved bots within a large social network. Specifically, a bot influence model and a user engagement model are introduced to simulate the growth of followers, retweets, and mentions. Additionally, a profile editor and a target preselection mechanism are proposed to more accurately simulate the behavior of evolved bots. The effectiveness of the bots and two representative bot detectors is verified using adversarial simulations and a real-world dataset. In simulated adversarial scenarios against both RF-based and GNN-based detection models, the proposed approach achieves survival rates of 99.7% and 85.9%, respectively. The simulation results indicate that, despite utilizing the bots' profile data, user-generated content, and graph information, the detectors failed to identify all variations of the bots and mitigate their impact. More importantly, for the first time, it is found that certain types of bots outperform those usually deemed more advanced in ablation experiments, demonstrating that such "penetration testing" can indeed reveal vulnerabilities in the detectors.

1. Introduction

Social bots are automated accounts on social media platforms. Malicious social bots have become a growing concern due to their ability to deceive and manipulate social networks [1,2,3]. Bot detection methods have evolved from traditional feature engineering to more advanced text-based and graph-based deep learning techniques. Despite these advancements, researchers are struggling to keep up with the rapid development of new bots [4]. This detection lag stems from the reactive nature of current research, where efforts focus on retrospectively adapting to bot advancements rather than proactively countering them. Meanwhile, bots continue to evolve adversarially, systematically bypassing existing detection mechanisms.
Adversarial examples generated through adding noise have demonstrated effectiveness in evading deep neural networks, which have been applied in most state-of-the-art bot detectors [5]. Studies have shown that modifying certain features or adjusting the bots’ neighborhood can significantly reduce the likelihood of being detected by bot detectors [6,7]. However, the bots’ states are subject to change over time. As a result, existing approaches based on static bot states and environments may not accurately reflect the impact of the bots’ behavior. While evaluating bot detector vulnerabilities on real social platforms could be a promising solution, ethical concerns need to be considered [8]. A practical approach is to draw inspiration from security engineers. Specifically, this involves proactively identifying vulnerabilities and exploring new tactics to help bots avoid detection, which deepens our understanding of possible bot adversarial behavior. Efforts have been made to simulate bot behavior and create virtual environments [9,10,11], resulting in the development of bots capable of evading detection and enhancing their influence.
To address these issues, this paper presents a more realistic virtual environment and novel action strategies, trained with reinforcement learning, that bots might use to evade detection while expanding their influence. By analyzing the growth of followers, retweets, and mentions of bots in a real-world dataset, a common relationship with user-generated content was identified. This enabled us to develop models that simulate the influence of bots and user engagement, helping to better simulate the impact of bot actions. Furthermore, an adversarially trained profile editor and a target preselection mechanism for bots are introduced, allowing for the simulation of novel bots within a large social network. The main contributions of this work are summarized as follows:
  • The heuristic used in previous studies was falsified, and the first bot influence and user engagement models were proposed by analyzing a large real-world dataset.
  • An adversarially trained profile editor and a preselection mechanism based on user clustering were proposed to enable the simulation of various novel bots that have evolved in different aspects within a large social network.
  • Novel bots were trained to function as fake accounts and influencers, and both RF-based and GNN-based detectors that utilized the bots’ profile data, user-generated content, and graph information were deployed to identify them. The results demonstrated that the improvements helped the bots to evade detection, rendering the adversary detectors ineffective.
  • We found that certain types of bots outperform others in the ablation experiments, which proves the existence of the detectors’ vulnerabilities.

2. Related Works

In this section, previous social bot detection studies are broadly divided into three categories and briefly introduced in Section 2.1, Section 2.2, and Section 2.3. Additionally, existing studies that have attempted to simulate evolving bots are introduced in Section 2.4.

2.1. Feature-Based Social Bot Detection

It is intuitive to review the account profile in order to differentiate between a malicious bot and a regular user. Researchers have created specific features, such as the number of tweets posted per day, and used them to train machine learning models. The random forest (RF) model is a widely used traditional machine learning model for identifying social bots. In 2014, Davis et al. introduced BotOrNot, a Twitter social bot classification service based on RF [12]. BotOrNot applied more than 1000 features, which covered an account's profile, generated content, and social network. Later, the researchers renamed BotOrNot to Botometer. Since its release, more training datasets and tailored features have been used to improve its performance [13,14]. These handcrafted features are also a common part of the input of deep-learning-based detectors [15,16,17].

2.2. Text-Based Social Bot Detection

Although it might seem straightforward to judge whether an account is a bot according to what it has said, it is challenging to handcraft features that offer enough information for classification. Botometer leveraged earlier studies on language to score the emotion and attitude represented by texts. Other complicated features were proposed to describe the writing style or contrast patterns [18,19]. The application of deep learning to natural language processing was a breakthrough. Wei et al. built a Bidirectional Long Short-Term Memory (BiLSTM) model, trained it with only text, and achieved an F1-score of 0.963 without feature engineering [20]. Heidari et al. used a pre-trained BiLSTM model to obtain representations of texts and combined them with user profiles [16]. Dukić et al. developed their detector using a pre-trained language model (BERT) [21]. Feng et al. proposed SATAR, which adopted the attention mechanism [17].

2.3. Graph-Based Social Bot Detection

Digging into an account’s friends/followers is another way to identify whether it is a bot. Early graph-based models applied either random walk or loopy belief propagation. Random walk models assume that starting from regular users, other normal users can be reached within fewer steps than bots, thus using short random walks to propagate labels/scores from known accounts to unknown accounts [22]. Models that apply loopy belief propagation instead propagate probability, which can better represent our prior knowledge than just classifying the accounts into known normal users and unknown accounts [23].
Since 2020, Graph Neural Networks (GNNs) have been widely applied in social bot detection and have become the mainstream social bot detection method. GNNs enable the simultaneous utilization of handcrafted features, language-model-based text representations, account profiles, and social network information [24,25]. State-of-the-art detectors have further emphasized the importance of capturing the structure of the local neighborhood/community of user nodes [26,27,28].

2.4. Social Bot Imitation

Social bots are evolving. In contrast to other studies investigating socialbots, Cresci et al. attempted to simulate the evolving bots’ action sequences using a genetic algorithm to explore their potential future developments [9]. The generated bots can evade a behavior-based detector [29]. In a study by Le et al., a virtual social network environment was created, and reinforcement learning was utilized to simulate a bot that aims to maximize its influence within the network while avoiding detection by a random forest trained on ten specific features [10]. Zeng et al. later enhanced the bot’s performance by incorporating structural entropy to filter out uninfluential users [11]. LLMs have also been introduced in state-of-the-art works to generate human-like behaviors. Gao et al. simulated the users’ reaction towards certain content using LLM-based agents [30]. Fent et al. explored the usage of LLMs in both LLM-based bot detectors and LLM-guided bot manipulation [31].

3. Materials and Methods

This section is organized as follows. First, Section 3.1 delineates the threat model, specifying the capabilities and objectives of both attackers and defenders. Section 3.2 presents the proposed framework, while Section 3.3, Section 3.4, Section 3.5, and Section 3.6 detail its individual components: the virtual environment, the target preselection mechanism, the profile editor, and the bot strategy model. Additionally, Section 3.7 analyzes the computational complexity of the proposed framework.

3.1. Threat Model

The detector developers have access to all data contained in the dataset and need to train a bot detection model $D$ based on it. $D$ is then used to predict the bot score of the controlled bot given its current state $s_t$. An active detection is performed every 10 steps. If the output of the detector $D(s_t)$, also known as the bot score, is higher than 0.5, the bot is considered discovered and suspended.
The bot developers have access to all data contained in the dataset and to $D(s_t)$, but $D$ is a black box from their view. They need to create a new bot account at the beginning and control this bot to perform proper actions to gain influence and evade detection simultaneously. This process is modeled as a Markov decision process presented in Figure 1. Given the state $s_t$ of the bot, both the action selection agent and the target selection agent determine the bot's next move. If the bot decides to update its profile, a new profile is created based on $s_t$ by a profile editor. If the bot opts to post a tweet or interact with other users, the impact of these actions is simulated using our bot influence model and user engagement model. Finally, the state is updated from $s_t$ to $s_{t+1}$, and whether to continue the simulation is assessed. The loop terminates if the bot is detected or if the bot detection model fails to identify the bot within $t_m$ steps.
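To make the threat model concrete, the following is a minimal sketch of one simulation episode under the rules above (detection every 10 steps, a 0.5 suspension threshold, and a horizon of $t_m$ steps). The env, action_agent, target_agent, profile_editor, and detector objects are hypothetical placeholders rather than the released implementation.

```python
# Minimal sketch of one episode of the Markov decision process in Figure 1.
# All objects are hypothetical placeholders; only the loop structure follows the text.
T_MAX = 500          # t_m: maximum number of steps per episode
DETECT_EVERY = 10    # an active detection is performed every 10 steps
THRESHOLD = 0.5      # bot-score threshold above which the bot is suspended

def run_episode(env, action_agent, target_agent, profile_editor, detector):
    state = env.reset()                          # s_0: features of the fresh bot account
    for t in range(1, T_MAX + 1):
        action = action_agent.act(state)         # e.g., post, follow, update profile, wait
        if action == "update_profile":
            state = profile_editor.edit(state)   # profile editor rewrites profile features
        else:
            target = target_agent.act(state)     # which user cluster to interact with
            state = env.step(action, target)     # influence/engagement models update s_t
        if t % DETECT_EVERY == 0 and detector.score(state) > THRESHOLD:
            return t                             # bot discovered and suspended at step t
    return T_MAX                                 # bot survived the whole simulation
```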

3.2. Framework

The framework is shown in Figure 2. To begin, a virtual environment is created utilizing real-world data. This involves analyzing bot influence and user engagement, reconstructing social networks, and training adversary detectors. Then, a profile editor is developed and preselection is carried out. The profile editor contains a substitute detector to match the results produced by the black-box adversary detector and a generator trained adversarially. Using the real-world data, user clustering and preselection are performed to narrow down potential targets. This sets the stage for training the action selection agent and the target selection agent, enabling us to train the bot strategy model subsequently. Throughout the training process, the bot strategy model will guide a new account to behave as a fake account or an influencer, interacting with the virtual environment and simulating the creation of a new social bot.

3.3. Virtual Environment

To simulate the development of a bot account within a reasonable time and avoid troubling regular users, a virtual environment was created. This environment simulates Twitter via three directed graphs $\{G_F, G_R, G_M\}$ and a user matrix $A$. The nodes in the graphs $\{V_F, V_R, V_M\}$ represent the users that have followed/been followed by, retweeted, and mentioned other users, and the edges $\{E_F, E_R, E_M\}$ represent the following, retweeting, and mentioning relationships. The $N \times n_f$ user matrix $A$ represents each user as a set of features, where $N$ is the total number of users and $n_f$ is the number of features. The social network is restored from a real-world dataset. To maintain computational efficiency, accounts that have no direct interactions with the controlled bot are frozen. The graph topology is updated when the bot performs network-altering actions, such as following, retweeting, or mentioning other users, or when it receives engagements from other accounts within the simulation.
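As an illustration, the three relation graphs and the user feature matrix could be reconstructed from tabular dumps roughly as follows; the file layout and column names here are assumptions made for the sketch, not the TwiBot-22 release format.

```python
# Illustrative reconstruction of the virtual environment's graphs and user matrix.
# File layout and column names ("source", "target", "relation") are assumptions.
import networkx as nx
import numpy as np
import pandas as pd

def build_environment(edge_csv, feature_csv):
    edges = pd.read_csv(edge_csv)
    graphs = {}
    for rel in ("follow", "retweet", "mention"):             # G_F, G_R, G_M
        g = nx.DiGraph()
        sub = edges[edges["relation"] == rel]
        g.add_edges_from(zip(sub["source"], sub["target"]))
        graphs[rel] = g
    features = pd.read_csv(feature_csv, index_col="user_id")
    A = features.to_numpy(dtype=np.float32)                  # N x n_f user matrix
    return graphs, A
```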

3.3.1. Influence Simulation

The primary objective of most social bots is to influence the real world. One way they do this is by gaining the trust of other users and persuading them to adopt their perspectives. Previous studies have used the independent cascade model (ICM) to simulate how the bot's influence spreads. In this model, the social network is represented as a directed graph. An edge from user $u$ to user $v$ indicates that information can spread from $u$ to $v$, i.e., $v$ follows $u$. It is assumed that all users are inactive at the beginning. The bot needs to interact (retweet, mention, or follow) with a user several times to activate, i.e., to influence, that user. The number of interactions required depends on the number of followers of the bot and the user.
However, our analysis results do not support this assumption. The following network $G_F$ restored from TwiBot-22 (see Section 4.1.1 for more information) contains 612,329 normal users and 81,432 bots. The normal users have followed the bots 153,763 times in total. In 96,180 of these cases, the bot had never interacted with the follower, indicating that the majority of the bots' influence does not come from direct interactions with other users. We think that the primary reason these followers follow the bots is the bots' tweets. To demonstrate this, $k$ seed bots with the slowest follower growth rate were chosen, and the similarity between their tweets and those of other bots was compared. The $k$ seed bots can be represented as
$$ Seed_F \subseteq V_F' \quad \mathrm{s.t.} \quad |Seed_F| = k \;\&\; \forall u \in V_F' \setminus Seed_F, \forall v \in Seed_F, \; \frac{N_{Follower}(u)}{N_{Tweet}(u)} \geq \frac{N_{Follower}(v)}{N_{Tweet}(v)} \qquad (1) $$
where $V_F'$ are the bots in $V_F$, and $N_{Follower}(u)$ and $N_{Tweet}(u)$ are the numbers of followers and tweets of user $u$. With $k$ set to 1000, the similarity is assessed by calculating the Euclidean distance between the tweet embeddings generated using the RoBERTa model:
$$ S_F(u) = \mathrm{dist}\left( E(u), \; \frac{\sum_{v \in Seed_F} E(v)}{|Seed_F|} \right) \qquad (2) $$
where $E(u)$ is the tweet embedding of $u$. The tweet information of 64,741 bots is provided. Based on their similarity, the bots are grouped into 13 clusters of 5000 bots. Figure 3a presents the relationship between the mean follower growth rate and the similarity. It is evident that bots that are less similar to the seed bots generally have higher follower growth rates.
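A small sketch of how Equations (1) and (2) could be computed is given below, assuming per-bot tweet embeddings (e.g., averaged RoBERTa outputs) are already available as vectors; the function and variable names are ours.

```python
# Sketch of the seed selection (Equation (1)) and similarity score S_F (Equation (2)).
import numpy as np

def seed_centroid(embeddings, follower_counts, tweet_counts, k=1000):
    """Centroid of the tweet embeddings of the k bots with the slowest follower growth."""
    growth = follower_counts / np.maximum(tweet_counts, 1)   # followers per tweet
    seed_idx = np.argsort(growth)[:k]                        # Seed_F: slowest growth rates
    return embeddings[seed_idx].mean(axis=0)

def s_f(user_embedding, centroid):
    """Euclidean distance between a bot's tweet embedding and the seed centroid."""
    return float(np.linalg.norm(user_embedding - centroid))
```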
The assumption of the probability of a user following back after interaction is also doubtful. Existing studies believe that this probability should be proportional to the number of the bot's followers $N_{Follower}(b)$ and be inversely proportional to the number of the user's followers $N_{Follower}(u)$:
$$ q \times \left( \frac{1 + N_{Follower}(b)}{1 + N_{Follower}(u)} \right) \qquad (3) $$
where $q$ is a hyperparameter set to 0.3 [10]. However, the probability of the target user following back differs across interaction types. After excluding the users who have been interacted with in multiple ways, statistics show that 15.4% of the followed users followed back, 3.3% of the retweeted users followed back, and 2.8% of the mentioned users followed back. Note that only 25 retweeted users and 989 mentioned users followed back, whereas 49,835 followed users followed back, demonstrating that following other users is a more effective and common way to gain followers.
Hence, the influence of the controlled bot is simulated as follows. When the controlled bot posts a new tweet, there is a probability that a normal user will follow it. The average follower growth rate of the bots with the closest $S_F$ is used as the prediction:
$$ \frac{\sum_{u \in U} \frac{1 + N_{Follower}(u)}{1 + N_{Tweet}(u)}}{|U|}, \quad U \subseteq V_F' \;\; \mathrm{s.t.} \;\; |U| = k \;\&\; \forall u \in V_F' \setminus U, \forall v \in U, \; |S_F(u) - S_F(b)| \geq |S_F(v) - S_F(b)| \qquad (4) $$
where $U$ is the set of bots with the $k$ closest $S_F$ values. This new follower is sampled from the normal users that follow bots using weighted random sampling, where the weights are the numbers of bots the users follow. To rule out users that might have followed the bots for other reasons, only the users who have followed bots that have not interacted with them and have posted at least one tweet are considered. When it comes to the influence gained by interaction, it is worth noting that the statistics indicate most followers are gained through mutual following. Therefore, we can assume that when our controlled bot follows a user who already follows other bots, there is a possibility that the user will engage in a "follow-for-follow" practice. To model the likelihood of a regular user reciprocating a follow, we heuristically use the user's numbers of friends and followers:
$$ q \times \left( \frac{1 + N_{Friend}(u)}{1 + N_{Follower}(u)} \right) \qquad (5) $$
where q is a hyperparameter. To minimize the mean absolute error (MAE) in predicting follower growth rates, q is determined by dividing the observed average follower growth rate by the ratio of friends to followers.
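A sketch of the resulting influence simulation is shown below: the per-tweet follower gain probability follows Equation (4), and the follow-back decision follows Equation (5). The names and the clipping of the probability to 1 are our assumptions.

```python
# Sketch of the influence simulation in Section 3.3.1 (Equations (4) and (5)).
import numpy as np

rng = np.random.default_rng(0)

def follower_gain_probability(s_f_bot, s_f_all, follower_counts, tweet_counts, k=1000):
    """Average follower growth rate of the k bots whose S_F is closest to the controlled bot's."""
    nearest = np.argsort(np.abs(s_f_all - s_f_bot))[:k]
    rates = (1 + follower_counts[nearest]) / (1 + tweet_counts[nearest])
    return rates.mean()

def follows_back(n_friends, n_followers, q=0.25):
    """Sample whether a followed user reciprocates, per q * (1 + friends) / (1 + followers)."""
    p = min(1.0, q * (1 + n_friends) / (1 + n_followers))
    return rng.random() < p
```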

3.3.2. User Engagement Simulation

Other users may interact with the bot in an active way, such as by retweeting or mentioning it. The retweeting network $G_R$ and the mentioning network $G_M$ are also commonly utilized for bot detection. While previous studies have incorporated retweeting and mentioning into their analysis, they fail to account for instances where the bot is passively retweeted or mentioned by other users. This oversight results in incomplete graph information. To create a more realistic simulation, user engagement is modeled as follows.
The content of the tweet clearly influences the act of retweeting. Therefore, an analysis similar to the one presented earlier is performed. As mentioned above, the first step is to choose seed bots $Seed_R$. Interestingly, it is difficult to establish a connection between the retweet growth rate and tweet similarity when $Seed_R$ bots with a low retweet growth rate are selected. However, when the bots with the highest retweet growth rates are chosen to act as $Seed_R$, a clear relationship emerges between retweet growth rates and tweet similarity. We believe this is caused by social desirability: while users have diverse interests, they often share content that they believe will be positively received by others, resulting in a narrower range of shared content [32]. The subsequent steps of the analysis remain consistent. The bots are grouped into 14 clusters of 1000 bots. Figure 3b presents the relationship between the mean retweet growth rate and the tweet similarity. It is clear that bots that are more similar to $Seed_R$ tend to have higher rates of retweet growth.
Normal users mentioned the bots a total of 157,125 times. Out of these mentions, 152,865 were not mutual, indicating that the vast majority of bot mentions did not arise from conversations with other users. Hence, it is again assumed that mentioning is affected by the tweet content, and the same method is used to verify it. Based on Figure 3c, it can be concluded that bots that are less similar to the seed bots with the lowest mention growth rate ($Seed_M$) tend to have higher mention growth rates.
The same method used to model influence is applied to estimate user engagement. Specifically, when the controlled bot tweets, there is a probability that a regular user will retweet or mention it. Our prediction is based on the average retweet and mention growth rates of bots that closely match the corresponding metrics, $S_R$ and $S_M$.

3.3.3. Bot Detection Model

The effectiveness of the bot detection model is essential to the simulation. Two representative bot detection models are adopted as the adversary. One is a random-forest-based model. This detector is trained with the Botometer features that are available for the dataset [33]. The calculated features include
  • Profile features, such as user name length, account age, and follower count.
  • Friend features, such as the mean account age of friends and the maximum number of followers of friends.
  • Network features, such as the number of nodes and edges of the user’s retweet graph.
The number of features reaches 145. These features can also be used to represent the bot's state $s_t$. Manually engineered text-based features are not adopted because many are only available for texts written in English. Instead, BotRGCN, a representative GNN-based bot detection model, is adopted as the other detection model [24]. BotRGCN utilizes the users' tweets, descriptions, profiles, and the following network to embed and classify them. During the training of the RF detector, weights of 1 and 3 are assigned to normal users and bots, respectively, to account for class imbalance. In a 5-fold cross-validation on the TwiBot-22 dataset, the RF detector achieves an accuracy of 84.9% and an F1-score of 50.1%, while BotRGCN attains 79.7% accuracy and a 57.5% F1-score [4]. For our simulation, both detectors are trained on the entire TwiBot-22 user base.
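A minimal sketch of the RF adversary is shown below, assuming the 145 Botometer-style features have already been extracted into a matrix X with labels y; the number of trees is our assumption, while the class weights of 1 and 3 follow the text.

```python
# Sketch of the RF-based adversary detector with class weights for imbalance.
from sklearn.ensemble import RandomForestClassifier

rf_detector = RandomForestClassifier(
    n_estimators=100,              # assumption; the text does not specify the forest size
    class_weight={0: 1, 1: 3},     # weight 1 for normal users, 3 for bots
    n_jobs=-1,
    random_state=42,
)
# rf_detector.fit(X, y)                                    # X: N x 145 features, y: 0/1 labels
# bot_score = rf_detector.predict_proba(x_bot.reshape(1, -1))[0, 1]
```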

3.4. User Clustering

Social bots not only post tweets but also need to engage with other users to mimic regular users and build trust and influence. When they decide to follow, mention, or retweet other users, they need to choose a target user. A user's number of followers is seen as an indication of their reputation and influence. Users vary in terms of how difficult it is to gain their trust and the impact of interacting with them. In previous studies, the selection of users was carried out by a specialized agent with a discrete action space of size $N$, which is typically around 1000 for the environments used in the literature. Each user corresponds to an action. However, as $N$ increases, this approach results in a high-dimensional discrete action space, meaning the agent needs to explore and learn the best actions from a larger set. This leads to higher training costs.
To reduce the dimensionality of the action space for target user selection, user clustering is proposed to be performed before action space construction. Specifically, the user selection process is heuristically decomposed into two sequential steps: interest-based selection followed by reputation-based selection. Initially, tweet embeddings are utilized to capture user interests. Due to the high dimensionality of tweet embeddings, principal component analysis is applied to reduce data sparsity and computational overhead. The number of principal components is set to 15, where the explained variance ratio is just over 0.95. Subsequently, K-Means clustering is employed to group users with similar interests, where each cluster is treated as a distinct action. The number of clusters is fixed at 1000 to maintain consistency with established methodologies in prior research [10]. Subsequently, the bot detection model is utilized to predict bot scores for all users, with the user exhibiting the lowest bot score being identified as possessing the highest reputation. This user is then selected as the target user for the chosen cluster. However, because very few of these users have followed any bots, it would be extremely challenging for the controlled bot to gain influence from them under our assumptions. To mitigate this limitation, an alternative selection criterion is implemented, wherein for 50% of the clusters the target user is chosen from among those who have followed at least one bot account while still maintaining the lowest bot score requirement.
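The preselection step can be sketched as follows: PCA reduces the tweet embeddings to 15 components, K-Means forms 1000 interest clusters, and the user with the lowest bot score in each cluster becomes that cluster's target. The alternative criterion applied to half of the clusters is omitted for brevity, and the function names are ours.

```python
# Sketch of the target preselection in Section 3.4 (interest clustering + reputation filter).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def preselect_targets(tweet_embeddings, bot_scores, n_clusters=1000, n_components=15):
    reduced = PCA(n_components=n_components).fit_transform(tweet_embeddings)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(reduced)
    targets = np.empty(n_clusters, dtype=int)
    for c in range(n_clusters):
        members = np.where(labels == c)[0]
        targets[c] = members[np.argmin(bot_scores[members])]   # lowest bot score = most reputable
    return targets   # one target user index per discrete action of the target selection agent
```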

3.5. Profile Editing

Profile editing, such as altering usernames or profile pictures, can further enhance evasion capabilities. However, existing RL-based approaches have not implemented this functionality due to the challenge of mapping such actions into the action space. To resolve this limitation, a dedicated profile editor module is proposed, eliminating the dependency on supplementary agent-based approaches.
The profile editor's input is the controlled bot's current state. The goal is to adjust the profile-relevant features and minimize the bot score. Since editing some features, such as whether the account is verified or protected, might incur higher costs or hinder influence propagation, only a subset of features is editable. The accessible features are summarized as follows:
  • Username length: Considering Twitter's rules (a username must be more than 4 characters long and at most 15 characters; it can contain only letters, numbers, and underscores, with no spaces) and the need for flexibility in the number of digits, we assume that only letters and underscores are allowed, resulting in 27 options for each character. Given that usernames cannot be duplicated and Twitter has over 300 million users, the range of the username length is set to [7, 15].
  • Number of digits in the username: The maximum value is the bot's username length.
  • Display name length: The display name can be up to 50 characters long and can be duplicated. Thus, the range is [1, 50].
  • Whether the default profile picture is used.
  • Description length: The bot’s description is randomly copied from the dataset. The length range is [0, 274].
  • Whether the description includes URLs.
The structure of the profile editor is similar to a generative adversarial network. It consists of two three-layer MLPs, each followed by a sigmoid layer. Since the bot detector $D$ is inaccessible to the bot, a substitute model $D'$ is used to fit its outputs and enable adversarial training. The other MLP, acting as the generator $G$, takes $s_t$ as input and produces a 6-dimensional vector, which is mapped into the ranges of the six features mentioned above. During the training phase, the bot detection model is first employed to predict bot scores for all users in the dataset. Subsequently, the substitute model is trained to predict these bot scores based on each user's features $s_u$. Finally, the generator is optimized to minimize user bot scores through the following loss function:
$$ \mathcal{L} = D'(G(s_u)) - D'(s_u) \qquad (6) $$
The substitute model and the generator are trained on the real-world dataset for 100 epochs using an Adam optimizer, with the learning rate set to 0.003.
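The adversarial training of the profile editor can be sketched with PyTorch as follows. The layer widths, the assumption that the six editable features occupy the first columns of the state, and the exact form of the generator loss (lowering the substitute's score of the edited profile relative to the original) are ours; the three-layer MLPs, sigmoid outputs, Adam optimizer, and learning rate of 0.003 follow the text.

```python
# Sketch of the profile editor (Section 3.5): substitute detector D' and generator G.
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=64):
    # three-layer MLP followed by a sigmoid layer, as described in the text
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim), nn.Sigmoid())

n_f = 145
substitute = mlp(n_f, 1)      # D': user state -> bot score in [0, 1]
generator = mlp(n_f, 6)       # G: user state -> 6 editable profile features in [0, 1]
opt_d = torch.optim.Adam(substitute.parameters(), lr=0.003)
opt_g = torch.optim.Adam(generator.parameters(), lr=0.003)

def train_step(states, detector_scores):
    """states: (batch, n_f); detector_scores: (batch, 1) black-box bot scores."""
    # 1) fit the substitute to the black-box detector's outputs
    opt_d.zero_grad()
    loss_d = nn.functional.mse_loss(substitute(states), detector_scores)
    loss_d.backward()
    opt_d.step()
    # 2) train the generator to lower the substitute's score of the edited profile
    opt_g.zero_grad()
    edited = states.clone()
    edited[:, :6] = generator(states)       # assumption: editable features are the first six
    loss_g = (substitute(edited) - substitute(states).detach()).mean()
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()
```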

3.6. Bot Strategy Model

Reinforcement Learning (RL) has been widely applied to decision making in virtual adversarial environments. Proximal Policy Optimization (PPO), one of the most widely used RL algorithms, has been applied to control a single bot, successfully increasing its influence while evading a detector [10]. Thus, PPO with an MLP policy is applied to train our bot strategy model. The bot's actions are controlled by two agents responsible for action selection and target selection, respectively.
Observation space: Due to the scale of the experiment, the whole environment cannot be used as input; the bot strategy model's input is therefore limited to the bot's current state, i.e., the features used to train the RF-based bot detection model. The two agents share the same observation space of size $n_f$.
Action space: To enable bot interactions within the virtual environment, a set of actions mimicking typical Twitter user behavior is defined. The bot’s possible actions include
  • Posting a tweet;
  • Updating the profile;
  • Retweeting, mentioning, or following an account;
  • Creating or deleting a list;
  • Waiting for a day.
Thus, the action selection agent's action space is a discrete space of 8 actions. Since we did not generate new texts and focused on decision making, tweets posted by the 10,000 bots that are most dissimilar from $Seed_F$ were selected randomly to build tweet pools of 1000 for posting, retweeting, and mentioning, respectively. The "wait for a day" action is added to simulate the flow of time. This allows for the measurement of important features of an account, such as posting frequency and account age, which are commonly used in existing bot detection models. Moreover, this adds to the realism of the simulation and improves the practicality of the bot strategy model. As introduced in Section 3.4, the action space of the target selection agent is a discrete space consisting of 1000 actions. Each action corresponds to the selected user from one user cluster.
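For illustration, the two agents' observation and action spaces can be declared with Gymnasium space objects (the same space API used by PettingZoo environments); the sizes follow the text.

```python
# Sketch of the observation and action spaces of the two agents (Section 3.6).
import numpy as np
from gymnasium import spaces

N_FEATURES = 145    # n_f: features describing the bot's current state
N_CLUSTERS = 1000   # one preselected target user per interest cluster

observation_space = spaces.Box(low=-np.inf, high=np.inf,
                               shape=(N_FEATURES,), dtype=np.float32)
# post, update profile, retweet, mention, follow, create list, delete list, wait
action_space_selection = spaces.Discrete(8)
action_space_target = spaces.Discrete(N_CLUSTERS)
```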
Reward Formulation: The bots are categorized into two classes: fake accounts and influencer bots. A fake account is simply a bot that evades bot detection. Conversely, an influencer bot needs to both escape detection and increase its influence. Overall, we believe that fake accounts and influencer bots represent the majority of bot types. For instance, bots designed to manipulate votes and fake followers are essentially fake accounts, since their primary goal is to evade detection. Similarly, influencer bots embody the objectives of spam and political bots, as they aim to gain influence while avoiding detection. Two distinct rewards are developed to train the bot strategy model accordingly.
The profit of a fake account is proportional to its survival time:
$$ P_f = l, \quad D(s_l) > 0.5 \;\&\; D(s_{l-1}) < 0.5 \qquad (7) $$
where l represents the time when the account is detected and suspended. Note that RF-based detectors’ direct outputs and those from deep learning-based detectors typically range from 0 to 1, which are considered the probability of the input account being a bot, i.e., the bot score. A higher bot score indicates that the detector can classify the account as a bot with greater confidence.
The profit of an influencer is quantified by counting the number of views of its posts, which is related to the number of followers and tweets. Let $|N(u,t)|$ be the number of followers at time $t$ and $Post_t \in \{0, 1\}$ represent whether the bot posts a tweet at time $t$. Assuming that all followers view all tweets posted after they follow the bot, the total number of views of the bot is
$$ P_i = \sum_{t=0}^{l} |N(u,t)| \times Post_t \qquad (8) $$
To mitigate the sparse reward problem, the reward function is designed by incorporating both the immediate profit and the dynamically monitored predictions from the bot detection model.
$$ R_t^f = \begin{cases} -\left( D(s_t) - D(s_{t-1}) \right), & t < l \\ \dfrac{t}{t_m}, & t = l \end{cases} \qquad (9) $$
$$ R_t^i = \begin{cases} -\left( D(s_t) - D(s_{t-1}) \right) + \dfrac{|N(u,t)| \times Post_t \times \theta}{t_m}, & t < l \\ \dfrac{t + |N(u,t)| \times Post_t \times \theta}{t_m}, & t = l \end{cases} \qquad (10) $$
where θ is a discount factor to balance the reward of hiding and influencing.
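The two reward functions can be sketched as follows, using the sign convention reconstructed in Equations (9) and (10) (the step reward rises when the bot score falls); the function and variable names are ours.

```python
# Sketch of the fake-account and influencer rewards (Equations (9) and (10)).
T_MAX = 500   # t_m

def reward_fake(score_t, score_prev, t, detected):
    """R^f_t for the fake account."""
    if detected:                            # t == l: the bot was just suspended
        return t / T_MAX                    # terminal reward: fraction of time survived
    return -(score_t - score_prev)          # step reward: decrease in bot score

def reward_influencer(score_t, score_prev, t, detected, n_followers, posted, theta=1.0):
    """R^i_t for the influencer, adding the discounted view gain."""
    views = n_followers * (1 if posted else 0) * theta / T_MAX
    if detected:
        return t / T_MAX + views
    return -(score_t - score_prev) + views
```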

3.7. Complexity Analysis

In this section, the computational complexity of the proposed components is analyzed. The proposed framework is divided into two parts: the preprocessing procedure and the simulation. We assume that the dataset is sufficiently large; therefore, $|V|$ and $|E|$ are much larger than the other variables. In Section 3.3.1 and Section 3.3.2, preprocessing involves identifying seed bots and accounts that may follow, retweet, or mention, which takes $O(|V|)$ time. The corresponding simulation in every step is $O(1)$. The user clustering presented in Section 3.4 incurs a time complexity of $O(p^2|V| + p^3 + k|V|T) = O(|V|)$, where $p$ is the number of features, $k$ is the number of clusters, and $T$ is the number of iterations. Training the profile editor, which consists of two MLPs, has a time complexity of $O(n^2 E |V|) = O(|V|)$, where $n$ is the number of neurons and $E$ is the number of epochs. Using the profile editor to generate a profile takes $O(n^2)$ time. Therefore, the preprocessing, which includes building the virtual environment and the action space, has a time complexity of $O(|V| + |E|)$. During the simulation, at each step, an action is generated ($O(n^2)$), followed by the potential influence/user engagement simulation and profile editing mentioned previously ($O(1 + n^2) = O(n^2)$) and concluded by the detector predicting the bot score. Hence, the time complexity of a simulation is $O(t_m(n^2 + f))$, where $f$ is the time complexity of the detector. Finally, the space complexity of the proposed framework is $O(|V| + |E|)$, as constructing the virtual environment dominates memory consumption compared to other operations.

4. Results

This section presents the following research contents. First, Section 4.1 details the experimental setup, including the dataset description, evaluation metrics, and implementation specifics. Section 4.2 evaluates the proposed influence and engagement model. Subsequently, Section 4.3 and Section 4.4 present the adversarial simulation results against various detectors when employing different reward functions to simulate distinct bot types (fake accounts and influencers, respectively).

4.1. Experimental Setup

4.1.1. Dataset

TwiBot-22 was used to initiate our virtual environment and train the social bot detection models since it provides large-scale detailed data and reliable annotations [4]. TwiBot-22 includes profiles, tweets, and social network data collected from 1,000,000 Twitter accounts using breadth-first search (BFS), which starts from "seed users" and expands along following relationships. The authors invited bot detection experts to annotate 1000 Twitter users in TwiBot-22 and then generated noisy labels with the help of 15 competitive bot detection methods, which were finally used to produce high-quality annotations for TwiBot-22 with Snorkel [34].

4.1.2. Evaluation Metrics

To assess the effectiveness of the proposed influence and engagement models, the mean absolute error (MAE) of the predicted follower and engagement growth rates was calculated. The following metrics were adopted to evaluate the influence and the ability of the bot to evade detection. $N(u)$ represents the number of users who will receive tweets from the bot and reflects its influence. The network influence ratio, i.e., the ratio of $N(u)$ to the total number of users $|V|$, was used in previous studies to evaluate influence [10,11]. Since the scalability of the proposed bot strategy allows us to use a large social network reconstructed from a real-world dataset, rather than further dividing it into subgraphs, $|V|$ remains consistent throughout the experiments. Therefore, $N(u)$ was used as one of the metrics for evaluating influence. Additionally, $P_i$, i.e., the number of views, also takes the number of tweets into account and assesses the impact caused throughout the simulation. The survival rate reflects the bot's ability to evade detection [7]. Considering that the passing threshold can be adjusted, $D(u)$, known as the bot score, was used to evaluate the difficulty of discovering the bots by lowering the passing threshold. The standard deviations of $D(u)$, $N(u)$, and $P_i$ of the surviving bots may also be presented for the assessment of robustness.

4.1.3. Implementation

To assess the bot’s effectiveness in various roles, similar experiments were conducted in which the bot was trained to simulate either fake accounts or influencers. These experiments utilized the same real-world dataset, but additional metrics were employed to evaluate the bot’s impact in influencer scenarios. The experiments were conducted in four different settings: fake account or influencer versus RF-based or BotRGCN. To test the effectiveness of the proposed enhancements, the following ablation experiments were conducted.
  • Baseline: Since this study is based on the work of Le et al., the baseline is based on ACORN (SISAM uses an adjacency matrix to represent the relations between users, leading to a space complexity of $O(n^2)$; since this would cause severe scaling problems when applied to TwiBot-22, SISAM was not used as a baseline) [10]. The bot strategy model is similar to the proposed model but with a different action selection agent and target selection agent. The possible actions are posting, retweeting, mentioning, and following. The discrete target selection action space is mapped to randomly selected users.
  • Additional actions: Except for the actions in the baseline, the action selection agent’s action space includes three new actions: creating/deleting a list and waiting for a day.
  • Additional actions + profile editor: The action selection agent’s action space further includes updating the profile with the proposed profile editor.
  • Additional actions + preselection: The target selection agent’s action space is mapped to the preselected users.
  • Combined: All proposed enhancements are applied.
The virtual environment and reinforcement learning are based on PettingZoo and Stable Baselines3 [35,36]. In the following experiments, the bot strategy models were trained for 100,000/200,000 steps against the RF-based detector/BotRGCN and tested in 1000 simulations, with all other hyperparameters set to their default values. $t_m$ was set to 500, since most bots were discovered before 500 steps.
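A minimal sketch of training one role against one detector with Stable Baselines3 is given below; BotEnv is a hypothetical single-agent wrapper of the virtual environment (the actual setup uses PettingZoo with separate action and target selection agents), and all PPO hyperparameters are left at their defaults as in the text.

```python
# Sketch of training the bot strategy model with PPO (Stable Baselines3, MLP policy).
from stable_baselines3 import PPO

env = BotEnv(detector="rf", role="fake_account")   # hypothetical Gymnasium-compatible wrapper
model = PPO("MlpPolicy", env, verbose=0)           # default hyperparameters
model.learn(total_timesteps=100_000)               # 200_000 steps when facing BotRGCN

obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True) # one decision of the trained policy
```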

4.2. Influence and Engagement Model

To assess the effectiveness of the proposed influence and engagement model, the MAE of the predicted follower growth rate was calculated, using the average follower/engagement growth rate as the baseline prediction. The MAE for the proposed model is 0.082, while the baseline has an MAE of 0.061. As for the engagement model, the MAEs for our predicted retweet and mention growth rates are 0.019 and 0.186, respectively, while the MAEs for the average retweet and mention growth rates are 0.026 and 0.270. Additionally, to evaluate the probability of a user following back after interaction, the MAE of the predicted number of followers was assessed. q was set to 0.25. The MAE using expression (5) was 0.515, whereas the MAE using expression (3) was 0.684. In summary, the proposed influence and engagement model contribute to a more realistic simulation.

4.3. Fake Account Simulation

The bots acted as fake accounts in these experiments and focused on evading detection. The final survival rate and average bot score are presented in Table 1. The RF detector was unable to effectively detect all variations of the bot strategy model, with the lowest survival rate being 81.8%. The proposed improvements for the bots were overall beneficial, either increasing the survival rate or lowering the average bot score. The best results were achieved using the combined bot strategy, resulting in a survival rate of 99.0% and an average bot score of 0.354. These results are 12.3% and 0.056 better than the baseline, respectively.
On the other hand, BotRGCN achieved more promising results than the RF detector. The bots' survival rates were significantly lower overall. Still, the additional actions bot strategy model managed to evade its detection with a survival rate of 85.9%. The additional actions model achieved an average bot score of 0.115, indicating that it is challenging for the detector to identify these bots by reducing the threshold. Surprisingly, introducing the profile editor and the preselection primarily decreased the survival rate. During the initial stages of the simulations, frequent profile updates were observed when employing the bot strategy with the profile editor. Similarly, a more significant rise in the number of followers was noticed in the simulations utilizing preselection. The observation that abstaining from certain actions might have yielded better outcomes (yet the bot strategy model failed to recognize this) implies that while such actions offer short-term rewards, they could be detrimental in the long run, disturbing the training of the bot strategy model. Moreover, this demonstrates that bots do not need to be "advanced" to avoid detection. By experimenting with different bot strategies, one can uncover vulnerabilities in the detectors.
The number of undiscovered bots and the average bot score throughout the experiments are presented in Figure 4. Figure 4a,b show that all bots discovered by the RF detector were detected in the first detection. BotRGCN also discovered more bots at the early stage of the simulation but kept detecting bots throughout the experiments. The reason for this can be found in Figure 4c,d. Despite the RF detector being a discontinuous function, the average bot scores throughout the simulation were much more steady than those given by BotRGCN, demonstrating that BotRGCN can perceive slighter changes in bot states.
The latest studies increasingly focus on deep learning and on using GNNs to create social bot detectors. Our findings above also support that GNN-based bot detectors are more advanced than traditional RF detectors. However, the average bot scores tell a different story. The results indicate that the highest and lowest average bot scores achieved with the RF detector were 0.410 and 0.352, respectively, while those achieved with BotRGCN were significantly lower at 0.115 and 0.028. Figure 5a,b display the distribution of bot scores for human accounts given by the RF detector and BotRGCN. When using the RF detector, 10.0% of the human accounts were misclassified as bots with a threshold of 0.5. This misclassification rate would increase to 16.0% and 21.5% with threshold values of 0.410 and 0.352, respectively. If using BotRGCN, the misclassification rate would jump from 9.2% to 42.7% and then to 79.7% as the threshold was lowered from 0.5 to 0.115 and 0.028, respectively. Therefore, we argue that we cannot jump to the conclusion that GNN-based detectors are more advanced. Traditional RF-based detectors might have an advantage when attempting to increase recall by adjusting the passing threshold.

4.4. Influencer Simulation

In the following experiments, the bots acted as influencers and had to strike a balance between avoiding detection and expanding their influence. Similar experiments were conducted to assess the performance of different variations of the bot strategy model. The only difference is that the fake account reward function was replaced by the influencer reward function, introducing θ . The bot strategy models are very sensitive to θ , which controls the balance. To prevent the bots from focusing on just one goal, θ could not be fixed. Instead, θ was adjusted to different values to suit different models, as shown in Table 2.
The ability of the bots to avoid detection was reduced overall. When using the RF detector, the survival rates of the baseline model, the additional actions model, and the combined model decreased by 21.3%, 22.3%, and 14.8%, respectively. The other variations of the bot strategy models did not suffer significant decreases in survival rate, but their average bot scores increased by around 0.06. Larger decreases in survival rate were observed with BotRGCN. The survival rates of the baseline model, the additional actions model, the additional actions + profile editor model, and the combined model decreased by 24.4%, 67.1%, 22.8%, and 14.9%, respectively. However, the additional actions + preselection model achieved a remarkably strong performance. Not only did it increase the survival rate by 12.2% and decrease the average bot score by 0.026, but it also achieved the best performance in propagating influence. Expanding influence by interacting with the preselected users and spamming posts was surprisingly beneficial for concealing the bots in this situation. It is important to note that our previous findings indicate that introducing preselection significantly reduces the survival rate when the bot operates as a fake account. This highlights the interconnection of different bot characteristics, which collectively affect the performance of detection systems, making it more difficult for bot developers to refine their strategies. From the detectors' perspective, this poses a risk as well; the additional actions + preselection model exploited a vulnerability of BotRGCN.
The impact of the bots is concerning. The experiments demonstrate that if the RF detector is deployed, the additional actions + profile editor model attracts 36,320 followers, resulting in a total of 7,878,764 views over 1000 simulations. Although BotRGCN was able to identify and suspend many bots before the simulation concluded, a significant impact remained. With the additional actions + preselection model, 39,542 followers were attracted and 8,932,088 views were generated, which is even higher than the follower and view counts achieved against the RF detector. The numbers of followers and views throughout the experiments are shown in Figure 6. The RF detector still failed to detect bots after the first detection, leading to steady growth in the number of followers and an increasing growth rate in the number of views. On the other hand, BotRGCN slowed the growth of followers and views overall by reducing the number of active bots. In summary, both the RF-based detector and BotRGCN were unable to accurately and quickly identify the proposed bots, allowing them to continue spreading their influence.

4.5. Discovery of Unrecognized Bots: An Example

In this section, an example of identifying unrecognized bots by analyzing the detector and the generated bots is provided. For the sake of explainability, the RF detector was analyzed to highlight the most significant and distinctive features of the generated bots. The additional actions + profile editor model was employed as the bot strategy model because of its overall performance. To capture the characteristics of the bot, its state was recorded after every step for 100 simulations.
Follower count, description length, tweet count, friend count, and list count are the top five most important features evaluated based on the decrease in impurity. A heatmap is used to present the density of accounts for these features (Figure 7). The corresponding mean values of the bot samples recorded in the simulations were calculated and are presented for comparison. Clearly, the most distinctive characteristic of the bot is its long description. Inspired by this characteristic, we looked into the real accounts in the dataset that have long descriptions. It turns out that among the accounts with the 10 longest descriptions, all of the descriptions were filled with URLs, and 8 out of the 10 accounts were domain sellers who display domains in their descriptions for advertisement. However, only two of them were recognized as bots. This proves that our work can actually benefit the detection of social bots.

5. Limitation and Future Work

Designing and verifying this framework is the first step to applying it against real-world social bots. The remaining problems are summarized as follows:
  • The virtual environment is still much simplified. More actions can be introduced. The recommender engines of social media platforms are not considered. With more realistic simulation come more valuable samples.
  • The bots would be more realistic if they cooperated. Only one bot was active during simulations. Thus, the bot strategy model was not optimized to control multiple bots simultaneously. However, there is increasing evidence that novel bots act in coordination.
  • The bot strategy models presented in this study successfully bypassed the adversary detectors, but there is still room for improvement, such as in the observation space and the reinforcement learning algorithm.

6. Conclusions

Accounts on social media platforms behind which stand nobody but malicious software (commonly known as social bots) have been playing hide-and-seek with researchers in the past decade. Studies have shown that bot detectors’ performance on novel social bots is not promising. We believe the crux of the issue lies in the fact that researchers have been constantly playing catch-up with ever-evolving social bots, leading to the development of bot detectors lagging behind. To tackle this, one practical method is to train our own social bots using a virtual environment and reinforcement learning to discover vulnerabilities in advance.
By analyzing a large real-world dataset, the heuristic used in previous studies was falsified, and the first bot influence and user engagement models were proposed. The action space for bots was expanded, and a profile editor module and a user preselection mechanism were introduced to simulate advanced bot behaviors. Adversarial experiments were conducted to evaluate whether the proposed bots could breach RF-based and GNN-based detectors while acting as fake accounts and influencers, respectively. The results are worrying: the RF-based detector could neither detect them nor avoid their influence propagation. The GNN-based detector, despite outperforming the RF-based detector overall, could still be breached by some bots. The following conclusions can be drawn from the experimental results.
  • Aside from the structure of the detector, other information, such as the utilized account data and the output bot score of given accounts, can also be used against it. Social media platforms should think twice before releasing any information relevant to their bot detector.
  • The effects of a bot's different characteristics on its evaluation are interconnected. A highly developed bot may not necessarily deliver the best performance; conversely, a bot detector may fail to detect certain types of bots even if it can detect bots that are generally considered more advanced. It is advisable to create and test various types of bots to identify vulnerabilities before they can be exploited for malicious purposes.
  • RF-based detectors may have a higher recall for identifying new social bots than GNN-based detectors, although this may come at the expense of precision. It could be worthwhile to conduct further research on using an RF-based detector with a low passing threshold to screen users before additional analysis or for ensemble learning.
  • The detectors should undergo separate evaluations using bots with varying account ages, numbers of followers, and numbers of views. Bots that do not attempt to increase their influence are usually more challenging to identify. Also, it is crucial to discover and suspend influencer bots as soon as possible. This way, they will have fewer followers or views, helping to reduce their impact.

Author Contributions

Conceptualization, R.J.; methodology, R.J.; software, R.J.; validation, R.J.; formal analysis, R.J.; investigation, R.J.; resources, Y.L.; data curation, R.J.; writing—original draft preparation, R.J.; writing—review and editing, R.J.; visualization, R.J.; supervision, Y.L.; project administration, R.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in TwiBot-22 at https://github.com/LuoUndergradXJTU/TwiBot-22 (accessed on 31 May 2025).

Acknowledgments

This work is supported by the Provincial Key Research and Development Program of Anhui (202423l10050033) and the Ministry of Public Security Science and Technology Plan (Project Number 2023JSZ01).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cresci, S.; Di Pietro, R.; Petrocchi, M.; Spognardi, A.; Tesconi, M. The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race. In Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, 3–7 April 2017; pp. 963–972. [Google Scholar]
  2. Hajli, N.; Saeed, U.; Tajvidi, M.; Shirazi, F. Social bots and the spread of disinformation in social media: The challenges of artificial intelligence. Br. J. Manag. 2022, 33, 1238–1253. [Google Scholar] [CrossRef]
  3. Hagen, L.; Neely, S.; Keller, T.E.; Scharf, R.; Vasquez, F.E. Rise of the machines? Examining the influence of social bots on a political discussion network. Soc. Sci. Comput. Rev. 2022, 40, 264–287. [Google Scholar] [CrossRef]
  4. Feng, S.; Tan, Z.; Wan, H.; Wang, N.; Chen, Z.; Zhang, B.; Zheng, Q.; Zhang, W.; Lei, Z.; Yang, S.; et al. TwiBot-22: Towards graph-based Twitter bot detection. arXiv 2022, arXiv:2206.04564. [Google Scholar]
  5. Kwon, H. AudioGuard: Speech Recognition System Robust against Optimized Audio Adversarial Examples. Multimed. Tools Appl. 2024, 83, 57943–57962. [Google Scholar] [CrossRef]
  6. Van Der Walt, E.; Eloff, J. Using machine learning to detect fake identities: Bots vs humans. IEEE Access 2018, 6, 6540–6549. [Google Scholar] [CrossRef]
  7. Wang, L.; Qiao, X.; Xie, Y.; Nie, W.; Zhang, Y.; Liu, A. My brother helps me: Node injection based adversarial attack on social bot detection. In Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October–3 November 2023; pp. 6705–6714. [Google Scholar]
  8. Abreu, J.V.F.; Fernandes, J.H.C.; Gondim, J.J.C.; Ralha, C.G. Bot development for social engineering attacks on Twitter. arXiv 2020, arXiv:2007.11778. [Google Scholar]
  9. Cresci, S.; Petrocchi, M.; Spognardi, A.; Tognazzi, S. Better safe than sorry: An adversarial approach to improve social bot detection. In Proceedings of the 10th ACM Conference on Web Science, Boston, MA, USA, 30 June–3 July 2019; pp. 47–56. [Google Scholar]
  10. Le, T.; Tran-Thanh, L.; Lee, D. Socialbots on fire: Modeling adversarial behaviors of socialbots via multi-agent hierarchical reinforcement learning. In Proceedings of the ACM Web Conference 2022, Lyon, France, 25–29 April 2022; pp. 545–554. [Google Scholar]
  11. Zeng, X.; Peng, H.; Li, A. Adversarial socialbots modeling based on structural information principles. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 392–400. [Google Scholar]
  12. Davis, C.A.; Varol, O.; Ferrara, E.; Flammini, A.; Menczer, F. Botornot: A system to evaluate social bots. In Proceedings of the 25th International Conference Companion on World Wide Web, Montreal, QC, Canada, 11–15 April 2016; pp. 273–274. [Google Scholar]
  13. Varol, O.; Ferrara, E.; Davis, C.; Menczer, F.; Flammini, A. Online human-bot interactions: Detection, estimation, and characterization. In Proceedings of the International AAAI Conference on Web and Social Media, Montreal, QC, Canada, 15–18 May 2017; Volume 11, pp. 280–289. [Google Scholar]
  14. Yang, K.C.; Varol, O.; Davis, C.A.; Ferrara, E.; Flammini, A.; Menczer, F. Arming the public with artificial intelligence to counter social bots. Hum. Behav. Emerg. Technol. 2019, 1, 48–61. [Google Scholar] [CrossRef]
  15. Kudugunta, S.; Ferrara, E. Deep neural networks for bot detection. Inf. Sci. 2018, 467, 312–322. [Google Scholar] [CrossRef]
  16. Heidari, M.; Jones, J.H.; Uzuner, O. Deep contextualized word embedding for text-based online user profiling to detect social bots on twitter. In Proceedings of the 2020 International Conference on Data Mining Workshops (ICDMW), Sorrento, Italy, 17–20 November 2020; pp. 480–487. [Google Scholar]
  17. Feng, S.; Wan, H.; Wang, N.; Li, J.; Luo, M. Satar: A self-supervised approach to twitter account representation learning and its application in bot detection. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Virtual, 1–5 November 2021; pp. 3808–3817. [Google Scholar]
  18. Cardaioli, M.; Conti, M.; Di Sorbo, A.; Fabrizio, E.; Laudanna, S.; Visaggio, C.A. It’s a matter of style: Detecting social bots through writing style consistency. In Proceedings of the 2021 International Conference on Computer Communications and Networks (ICCCN), Athens, Greece, 19–22 July 2021; pp. 1–9. [Google Scholar]
  19. Loyola-González, O.; Monroy, R.; Rodríguez, J.; López-Cuevas, A.; Mata-Sánchez, J.I. Contrast pattern-based classification for bot detection on twitter. IEEE Access 2019, 7, 45800–45817. [Google Scholar] [CrossRef]
  20. Wei, F.; Nguyen, U.T. Twitter bot detection using bidirectional long short-term memory neural networks and word embeddings. In Proceedings of the 2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA), Los Angeles, CA, USA, 12–14 December 2019; pp. 101–109. [Google Scholar]
  21. Dukić, D.; Keča, D.; Stipić, D. Are you human? Detecting bots on Twitter using BERT. In Proceedings of the 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA), Sydney, Australia, 6–9 October 2020; pp. 631–636. [Google Scholar]
  22. Jia, J.; Wang, B.; Gong, N.Z. Random walk based fake account detection in online social networks. In Proceedings of the 2017 47th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Denver, CO, USA, 26–29 June 2017; pp. 273–284. [Google Scholar]
  23. Breuer, A.; Eilat, R.; Weinsberg, U. Friend or faux: Graph-based early detection of fake accounts on social networks. In Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; pp. 1287–1297. [Google Scholar]
  24. Feng, S.; Wan, H.; Wang, N.; Luo, M. BotRGCN: Twitter bot detection with relational graph convolutional networks. In Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Virtual, 8–11 November 2021; pp. 236–239. [Google Scholar]
  25. Feng, S.; Tan, Z.; Li, R.; Luo, M. Heterogeneity-aware twitter bot detection with relational graph transformers. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 22 February–1 March 2022; Volume 36, pp. 3977–3985. [Google Scholar]
  26. Liu, Y.; Tan, Z.; Wang, H.; Feng, S.; Zheng, Q.; Luo, M. BotMoE: Twitter bot detection with community-aware mixtures of modal-specific experts. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Taipei, Taiwan, 23–27 July 2023; pp. 485–495. [Google Scholar]
  27. Pham, P.; Nguyen, L.T.; Vo, B.; Yun, U. Bot2Vec: A general approach of intra-community oriented representation learning for bot detection in different types of social networks. Inf. Syst. 2022, 103, 101771. [Google Scholar] [CrossRef]
  28. Dehghan, A.; Siuta, K.; Skorupka, A.; Dubey, A.; Betlen, A.; Miller, D.; Xu, W.; Kamiński, B.; Prałat, P. Detecting bots in social-networks using node and structural embeddings. J. Big Data 2023, 10, 119. [Google Scholar] [CrossRef] [PubMed]
  29. Cresci, S.; Di Pietro, R.; Petrocchi, M.; Spognardi, A.; Tesconi, M. Social fingerprinting: Detection of spambot groups through DNA-inspired behavioral modeling. IEEE Trans. Dependable Secur. Comput. 2017, 15, 561–576. [Google Scholar] [CrossRef]
  30. Gao, C.; Lan, X.; Lu, Z.; Mao, J.; Piao, J.; Wang, H.; Jin, D.; Li, Y. S3: Social-network Simulation System with Large Language Model-Empowered Agents. arXiv 2023, arXiv:2307.14984. [Google Scholar] [CrossRef]
  31. Feng, S.; Wan, H.; Wang, N.; Tan, Z.; Luo, M.; Tsvetkov, Y. What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection. arXiv 2024, arXiv:2402.00371. [Google Scholar]
  32. Edwards, A.L. The relationship between the judged desirability of a trait and the probability that the trait will be endorsed. J. Appl. Psychol. 1953, 37, 90. [Google Scholar] [CrossRef]
  33. Varol, O.; Davis, C.A.; Menczer, F.; Flammini, A. Feature engineering for social bot detection. In Feature Engineering for Machine Learning and Data Analytics; CRC Press: Boca Raton, FL, USA, 2018; p. 311. [Google Scholar]
  34. Ratner, A.; Bach, S.H.; Ehrenberg, H.; Fries, J.; Wu, S.; Ré, C. Snorkel: Rapid training data creation with weak supervision. In Proceedings of the VLDB Endowment. International Conference on Very Large Data Bases, Munich, Germany, 28 August–1 September 2017; Volume 11, p. 269. [Google Scholar]
  35. Terry, J.; Black, B.; Grammel, N.; Jayakumar, M.; Hari, A.; Sullivan, R.; Santos, L.S.; Dieffendahl, C.; Horsch, C.; Perez-Vicente, R.; et al. PettingZoo: Gym for multi-agent reinforcement learning. Adv. Neural Inf. Process. Syst. 2021, 34, 15032–15043. [Google Scholar]
  36. Raffin, A.; Hill, A.; Gleave, A.; Kanervisto, A.; Ernestus, M.; Dormann, N. Stable-Baselines3: Reliable Reinforcement Learning Implementations. J. Mach. Learn. Res. 2021, 22, 1–8. [Google Scholar]
Figure 1. Markov decision process in our simulation. The bot's next move is determined by its current state s_t through action and target selection. If the bot updates its profile, a new profile is generated based on s_t. If it posts a tweet or interacts with users, the effects are simulated using our influence and engagement models. Finally, the state is updated from s_t to s_{t+1}, which determines whether the simulation continues: if the bot score given by the detector, D(s_{t+1}), is less than 0.5 and the simulation has not reached the step limit t_m, the simulation continues; otherwise, it stops.
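For readers who prefer pseudocode, the step-and-terminate logic described in Figure 1 can be summarized as a short loop. The sketch below is illustrative only; the names env, detector, action_agent, target_agent, and step_limit are hypothetical stand-ins, not the implementation used in the paper.

```python
# Minimal sketch of the episode loop in Figure 1 (all helper names are hypothetical).

def run_episode(env, detector, action_agent, target_agent, step_limit):
    """Run one bot episode until detection or the step limit t_m is reached."""
    state = env.reset()                      # initial bot state s_0
    for t in range(step_limit):
        action = action_agent.select(state)  # e.g., edit profile, tweet, or interact
        target = target_agent.select(state)  # which user(s) the action addresses
        state = env.step(action, target)     # influence/engagement models update s_t -> s_{t+1}
        if detector.score(state) >= 0.5:     # bot score D(s_{t+1}); detected, so stop
            return {"survived": False, "steps": t + 1}
    return {"survived": True, "steps": step_limit}
```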
Figure 2. Framework. The construction of the virtual environment starts from real-world data, analyzing bot influence and user engagement and reconstructing the social network. A profile editor is then developed, consisting of a substitute detector aligned with the black-box adversary detector and an adversarially trained generator. Users are clustered to preselect potential targets. The action selection and target selection agents are trained subsequently; they guide a new account to act as either a fake account or an influencer within the virtual environment, simulating the creation of a social bot. The image representing the detector comes from BotRGCN [24].
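As a rough illustration of the target-preselection step in Figure 2, one plausible realization is to cluster user representations and keep a few representative candidates per cluster. The sketch assumes k-means over precomputed user feature vectors; the clustering method and features actually used may differ.

```python
# Illustrative sketch of clustering users to preselect candidate targets
# (assumed approach: k-means over user feature vectors; not the paper's exact method).
import numpy as np
from sklearn.cluster import KMeans

def preselect_targets(user_embeddings, user_ids, n_clusters=10, per_cluster=50):
    """Group users by embedding similarity and keep a few candidates per cluster."""
    user_embeddings = np.asarray(user_embeddings, dtype=float)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = km.fit_predict(user_embeddings)
    candidates = []
    for c in range(n_clusters):
        members = np.where(labels == c)[0]
        # keep the members closest to the cluster centre as representative targets
        dists = np.linalg.norm(user_embeddings[members] - km.cluster_centers_[c], axis=1)
        keep = members[np.argsort(dists)[:per_cluster]]
        candidates.extend(user_ids[i] for i in keep)
    return candidates
```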
Figure 3. Relationship between the average growth rate of followers (a), retweets (b), and mentions (c) and the cluster number (based on tweet similarity, ordered from similar to dissimilar).
Figure 4. The number of undiscovered bots and average bot score throughout the fake account simulation experiments. (a) Number of undiscovered bots against the RF detector. (b) Number of undiscovered bots against BotRGCN. (c) Average bot score against the RF detector. (d) Average bot score against BotRGCN.
Figure 5. Distribution of bot scores for human accounts (a) when using the RF detector and (b) when using BotRGCN.
Figure 6. The number of followers and views throughout the influencer simulation experiments. (a) The number of followers against the RF detector. (b) The number of followers against BotRGCN. (c) The number of views against the RF detector. (d) The number of views against BotRGCN.
Figure 7. Account density for the five most important features (log-transformed and min–max normalized). The green lines mark the corresponding mean values of the bot samples recorded in the simulations.
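The preprocessing named in the Figure 7 caption, a log transform followed by column-wise min–max normalization, can be expressed compactly. The sketch below uses log1p for the log transform, which is an assumption; the exact variant is not stated here.

```python
# Sketch of the feature preprocessing in the Figure 7 caption:
# log transform followed by column-wise min-max normalization.
import numpy as np

def log_minmax(features):
    """Apply log1p and scale each feature column to [0, 1]."""
    x = np.log1p(np.asarray(features, dtype=float))   # log(1 + x) avoids log(0)
    mins, maxs = x.min(axis=0), x.max(axis=0)
    span = np.where(maxs > mins, maxs - mins, 1.0)    # guard against constant columns
    return (x - mins) / span
```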
Table 1. Results of fake account simulation experiments.
Scene | Bot Strategy | Survival Rate | Avg. Bot Score (SD)
Fake account and RF detector | Baseline | 86.7% | 0.410 (0.014)
Fake account and RF detector | Additional actions | 81.8% | 0.377 (0.011)
Fake account and RF detector | Additional actions + profile editor | 94.0% | 0.352 (0.014)
Fake account and RF detector | Additional actions + preselection | 99.7% | 0.403 (0.006)
Fake account and RF detector | Combined | 99.0% | 0.354 (0.013)
Fake account and BotRGCN | Baseline | 58.5% | 0.082 (0.084)
Fake account and BotRGCN | Additional actions | 85.9% | 0.115 (0.063)
Fake account and BotRGCN | Additional actions + profile editor | 60.0% | 0.086 (0.066)
Fake account and BotRGCN | Additional actions + preselection | 48.8% | 0.028 (0.128)
Fake account and BotRGCN | Combined | 49.0% | 0.083 (0.079)
Table 2. Results of influencer simulation experiments.
Scene | θ | Bot Strategy | Survival Rate | Avg. Bot Score (SD) | N(u) (SD) | P_i (SD)
Influencer & RF detector | 0.003 | Baseline | 65.4% | 0.382 (0.015) | 29,942 (8.057) | 7,595,250 (1845.121)
Influencer & RF detector | 0.003 | Additional actions | 59.5% | 0.401 (0.018) | 26,426 (7.706) | 6,292,190 (1950.827)
Influencer & RF detector | 0.0025 | Additional actions + profile editor | 97.9% | 0.359 (0.014) | 36,320 (8.042) | 7,878,764 (2016.605)
Influencer & RF detector | 0.0025 | Additional actions + preselection | 98.6% | 0.408 (0.014) | 34,392 (8.782) | 7,188,792 (2302.556)
Influencer & RF detector | 0.0025 | Combined | 84.2% | 0.389 (0.017) | 30,584 (7.687) | 6,416,062 (1990.133)
Influencer & BotRGCN | 0.003 | Baseline | 34.1% | 0.009 (0.054) | 20,874 (7.366) | 4,335,142 (2374.298)
Influencer & BotRGCN | 0.003 | Additional actions | 18.8% | 0.011 (0.103) | 13,470 (6.789) | 3,119,952 (1957.740)
Influencer & BotRGCN | 0.003 | Additional actions + profile editor | 37.2% | 0.011 (0.057) | 22,884 (5.877) | 4,408,418 (1721.843)
Influencer & BotRGCN | 0.004 | Additional actions + preselection | 60.0% | 0.002 (0.041) | 39,542 (8.030) | 8,932,088 (2629.436)
Influencer & BotRGCN | 0.003 | Combined | 34.1% | 0.002 (0.043) | 31,620 (6.562) | 6,610,018 (1997.801)
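For context, the per-strategy metrics reported in Tables 1 and 2 can be aggregated from per-run simulation records along the following lines. The record fields "survived" and "final_score" are hypothetical and only illustrate how a survival rate and a mean (SD) bot score would be computed; they are not taken from the paper's code.

```python
# Hedged sketch of aggregating per-run records into table-style metrics
# (field names are hypothetical illustrations).
import statistics

def summarize(runs):
    """Compute survival rate and the mean (SD) of final bot scores over runs."""
    survival_rate = sum(r["survived"] for r in runs) / len(runs)
    scores = [r["final_score"] for r in runs]
    return {
        "survival_rate": survival_rate,
        "avg_bot_score": statistics.mean(scores),
        "bot_score_sd": statistics.stdev(scores),
    }
```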
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
