Identification of Bots and Cyborgs in the #FeesMustFall Campaign

Khan, Yaseen; Thakur, Surendra; Obiyemi, Obiseye; Adetiba, Emmanuel

doi:10.3390/informatics9010021

¹ KZN e-Skills CoLab, Durban University of Technology, Durban 4001, South Africa

² Department of Electrical and Electronic Engineering, Osun State University, Osogbo 4494, Nigeria

³ Department of Electrical and Information Engineering, Covenant University, Ota 1023, Nigeria

⁴ Covenant Applied Informatics and Communication Africa Center of Excellence, Covenant University, Ota 1023, Nigeria

⁵ HRA, Institute for Systems Science, Durban University of Technology, Durban 4001, South Africa

* Author to whom correspondence should be addressed.

Informatics 2022, 9(1), 21; https://doi.org/10.3390/informatics9010021

Submission received: 5 January 2022 / Revised: 25 February 2022 / Accepted: 28 February 2022 / Published: 4 March 2022

(This article belongs to the Section Human-Computer Interaction)

Abstract

Bots (social robots) are computer programs that replicate human behavior in online social networks. They are either fully automated or semi-automated, and their use makes online activism vulnerable to manipulation. This study examines the existence of social robots in the #FeesMustFall movement by conducting a scientific investigation into whether social bots were present in the form of Twitter bots and cyborgs. A total of 576,823 tweets posted between 15 October 2015 and 10 April 2017 were cleaned, with 490,449 tweets analyzed for 90,783 unique persons. Three separate approaches were used to screen out suspicious bot and cyborg activity, supplemented by the DeBot team’s methodology. User 1 and User 2, two of the 90,783 individuals, were recognized as bots or cyborgs in the study and contributed 22,413 (4.57 percent) of the 490,449 tweets. This confirms the existence of bots throughout the campaign, which aided in the #FeesMustFall’s amplification on Twitter, complicating sentiment analysis and invariably making it the most popular and lengthiest hashtag campaign in Africa, particularly at the time of data collection.

Keywords: #FeesMustFall; software robots; social robots; bots; cyborgs

1. Introduction

As recently estimated [1], Twitter's social networking service has grown phenomenally, with more than 211 million monetizable daily active users (mDAU) worldwide. With an open Application Programming Interface (API) in place since Twitter's inception in July 2006, various autonomous posting applications have actively made use of this access. It has enabled applications ranging from the mild automation of content from reliable sources, such as weather services or blogs, to automated spam, including posts containing links to harmful content [2]. The growing Twitter user base and the platform's open nature attract a significant number of automated programs, conventionally referred to as bots. Although bots can be either beneficial or damaging [3], their widespread use has led to a rise in the spread of false, inaccurate, or misleading information on social media [4].

Cyborgs, chatbots, internet bots, spambots, and social bots are examples of bots available today. A cyborg exists in the space between humans and machines and is perceived either as a human assisted by a robot or as a robot supported by a human [5]. According to [6], a software robot may be described as a computer program that operates on a host rather than as a stand-alone device. A chatbot is an AI-driven program that simulates human-like conversations with users via text messages or speech, while software that performs automated tasks (scripts) over the Internet is known as an internet bot. Spambots are computer programs designed to assist in distributing spam, and they generally establish accounts to deliver spam messages together [7,8,9]. A social bot is an algorithm that mimics people, posts content, or engages with people on social media [10,11]. Trolls are another striking phenomenon. They are members of the online community who use fictional identities to support or oppose a cause or endeavor, and identifying them is not always straightforward [12]. Trolls, or sockpuppets, are regularly employed to communicate with regular users on social media and may pose as someone else to propagate harmful links or generate ads.

A typical bot exhibits repeated and frequent behavior, with high output. Twitter bots may tweet, retweet, like, follow, unfollow, or send direct messages to other accounts. Under Twitter's automation policies, many bot profiles are already being terminated because of their extreme or violent conduct [13,14]. However, auto-tweeting software such as TweetDeck, Hootsuite, and Buffer can publish and retweet tweets seamlessly, making it easy for anyone to push an agenda by retweeting the same message repeatedly. Despite growing concern about the use of bots for political purposes in viral Twitter-based campaigns, identifying and isolating them to reduce their impact on public discussions can be difficult and time-consuming. This problem has drawn considerable attention, with several documented efforts to identify bots in popular campaigns.

The study by [5] documented one of the earliest attempts to identify bots on Twitter, roughly four (4) years after the platform's public launch, and described an approach for distinguishing between bots, cyborgs, and humans. The study aimed to improve the detection of malicious Twitter bots and estimated the human:cyborg:bot ratio to be about 5:4:1. In a similar study based on a similar dataset, additional results were presented from a more advanced analysis [14], in which a bot was identified that had been registered since March 2007. Similar estimates have been presented on the proportion of bots to humans on Twitter. For example, ref. [15] inferred that about 8.5% of all Twitter users were bots, based on Twitter data from the US Securities and Exchange Commission; on the other hand, ref. [16] established that about 16% of Twitter users exhibited highly automated behavior. While surveying the growing number of Twitter services and users spreading across underdeveloped, developing, and developed nations, the researchers of [17] and [15] reported that bots were becoming increasingly sophisticated and therefore more difficult to detect.

Botometer is a frequently used tool for detecting bots on Twitter; it can be accessed through a variety of APIs priced according to usage [18,19]. Botometer identifies Twitter bots using a Random Forest classification algorithm based on about 1000 characteristics gleaned from the user's metadata and tweet chronology. Several studies [2,11,20,21,22,23] have used this technique for bot-based investigation. In recent research by Aldayel and Magdy [20], Botometer was used to analyze the online interactions of 4000 Twitter users that were predictive of their stances and to identify the bots within those interactions. Similarly, Rizoiu et al. [21] used the Botometer API to examine the impact of bots on the 2016 US Presidential Debate. Broniatowski et al. [23] looked at the bot ratings of Twitter profiles that spread information about vaccines on social media.

DeBot is another prominent bot detection API, developed as the first unsupervised bot detection method on social media and described as "a near real-time system" [24,25,26]. It was created using a warped correlation finder to detect correlated user accounts on social media networks such as Twitter. Since its deployment, it has been widely used for detecting bots in several documented studies [27,28,29]. Using the open-source DeBot API, Rofrío et al. [27] assessed the impact of bots in Ecuador's 2017 presidential elections, demonstrating how presidential candidates utilized bots to promote their candidacies on social media platforms. By analyzing a retweet network related to MMR vaccination during the 2015 measles outbreak at Disneyland, California, Yuan et al. [28] investigated the communication patterns of anti- and pro-vaccine Twitter users as well as the impact of bots. Similarly, Kušen and Strembeck [29] used the DeBot API to evaluate 1.3 million Twitter accounts that created 4.4 million tweets on 24 randomly picked real-world events.

DeBot works on activity correlation, and it also has an archive of detected bots. Due to the size of the dataset, resource cost limitations, and the ability to search via topics, DeBot was the preferred choice, ahead of Botometer. Therefore, it was employed in this study to assess the involvement of bots or cyborgs in the famous #FeesMustFall movement in South Africa.

The South African #FeesMustFall and Other Related Movements

A distinguishing component of #FeesMustFall was the campaign's use of social media platforms and social networks to organize activities, educate and persuade students and activists, and gather support via continuous media and community attention. The authors of [30] examined protest activities associated with the #FeesMustFall hashtag, on the internet and in person, between 14 October and 23 October 2015. Their study examined the history, type, and extent of Twitter usage; the popularity of protest activities across higher education institutions; leadership; the web-based social hub structure; and the structure of the campaign. Matrose [31] looked into the Port Elizabeth Herald and the #FeesMustFall student protest movement in 2015. The study's purpose was to determine why students used social media during the #FeesMustFall student protest campaign. Students utilized their #FeesMustFall social media outlets to promote and legitimize themselves and the campaign. The findings also showed that during the #FeesMustFall protest, Nelson Mandela Metropolitan University students used social media to gain mainstream media coverage and alter the narrative in the coverage.

Olagunju [32] used the #FeesMustFall hashtag as a case study and examined the media–audience connection by assessing how audience involvement through Facebook and Twitter was gradually altering young South African students' "news" habits. The data revealed that students utilized social media to express their involvement and participation in social issues impacting youth, particularly during the height of the #FeesMustFall campaign. There was also a shift in news reporting as a result of social media's audience participation. A related study [33] assessed the role of social media in the facilitation of effective student online activism. The analysis looked at 567,533 Twitter posts from the early movements on free education, observed in 2015 and 2016, using a mixed research approach that preferred trend lines above headlines. The results showed that enhanced cooperation was essential to guarantee the long-term viability of a microblogging resource governance value chain integrated within the higher education institutions' ecosystem.

Bolton [34] conducted a recent study to evaluate the various identities, communities, and discourses on Twitter during student protests, specifically between the 2015 #RhodesMustFall and the #FeesMustFall protests. This study used 1000 tweets per hashtag, including #OpenStellenbosch, #RhodesMustFall, #UCTFeesMustFall, and #WitsFeesMustFall. AntConc, a corpus linguistics software, was used to run the data, and the results were analyzed using Critical Discourse Analysis (CDA). Twitter was found to be a useful tool for sharing information, obtaining resources, coordinating, and strategizing. It also determined that Twitter’s major role was to integrate ideologies such as free tertiary education into the national consciousness as well as to allow individuals to convey their personal views and perspectives in the public spotlight, thus establishing new forms of subject posts and common grounds.

Youth advocacy and counter-memory through Twitter were also influenced by the #RMF movement, denoting Rhodes Must Fall, which was led by South African students [35]. The #RMF movement at the University of Cape Town (UCT) advocated for the removal of the statue of British colonialist Cecil John Rhodes, claiming that it supported a culture of exclusion and the institutionalization of racism, and was especially popular among black South African students. Qualitatively, tweets were analyzed and a network analysis was carried out using NodeXL (Social Media Research Foundation, Redwood, CA, USA), with findings showing how social media debates could influence mainstream news topics and should not be separated from conventional media outlets. The essay also argued that adolescents were more reliant on social media, fostering the distinct biography of a citizenry marked by more personalized kinds of action.

Considering this research’s focus on Twitter, this study aims to determine whether bots or cyborgs were involved in the popular #FeesMustFall movement. The influence of bots and bot systems on social networking and society, along with bot identification algorithms and strategies, are discussed in this article. This study employed various strategies, including temporal, content, and source detection techniques, to investigate the presence of Twitter bots and cyborgs during an effective activist campaign called #FeesMustFall. Unlike previous research on #FeesMustFall [30,32,34], this study uncovered significant new information about the campaign and established that Twitter bots and cyborgs were indeed present. These findings add value to the information on stakeholders in South Africa by providing a unique perspective on the types of social media users who maintained and possibly influenced the campaign.

2. Data and Methodology

A growing number of studies use Twitter (Twitter, San Francisco, CA, USA) as a data source. This study used text data only (tweets, including any emojis and emoticons they contained); images and videos were not part of the collected data. Only tweets with the hashtag #FeesMustFall were included in the dataset. Although the movement itself began on 15 October 2015, the Twitter data collection timeline was set from the very first occurrence of the hashtag #FeesMustFall on 21 March 2015 up to 10 April 2017 and includes 576,583 data points. After removing duplicates and other data that were deemed irregular or unclear, the dataset contained 490,449 tweets. Each data point, i.e., each tweet analyzed, included the date, timestamp, tweet text, username, tweet source, favorited and retweeted fields, and user language. The Data Analytics Lifecycle proposed by Erl, Khattak, and Buhler [36] was used to manage the data in this investigation. It was chosen for its methodical nine-stage framework, which is equally applicable to big data analytics. The data extraction process involved converting the acquired #FeesMustFall tweets from an ASCII-delimited file to Microsoft Excel format for further analysis. The dataset, containing the tweets, tweet source, and datetime, was safely archived in Google Drive and may be accessed via a link, which can be provided on request.

The 576,583 data points or tweets about the campaign were obtained from a professional data service provider, Podargos, as a cost-effective alternative to Twitter. Because Podargos was explicitly asked to obtain tweets on #FeesMustFall from Twitter, the dataset did not capture other FeesMustFall-related campaign content, particularly content that did not include the #FeesMustFall hashtag. Table 1 presents a sample of the data, with tweets and usernames anonymized for privacy reasons. "Eng." is an abbreviation for the English language, while "Und." denotes the undefined language class of the dataset.

The pre-processed dataset was subjected to several procedures to detect social bots, including a linguistic analysis of the data, with the rest relying on the DeBot API, as shown in Figure 1. Method 1 detected users who posted many times within a single timestamp. While this is difficult for traditional users because each tweet must be manually composed, it is achievable for bots and cyborg users due to the automated support. Equations (1)–(4) represent the mathematical procedure employed for method 1.

The total number of tweets is given as:

$$\sum_{i} (tw)_{i} \tag{1}$$

where $(tw)_{i}$ is a tweet and $i \in N$.

The total number of users is given as:

$$\sum_{j} u_{j} \tag{2}$$

where $u_{j}$ is a user for a tweet and $j \in N$.

The total number of timestamps is given as:

$$\sum_{k} t_{k} \tag{3}$$

where $t_{k}$ is a timestamp for a tweet and $k \in N$.

$$\sum_{i} (tw)_{ijk} > 1 \tag{4}$$

Moreover, it is believed that a user, $u_{j}$, is a bot or cyborg if the condition in Equation (4) is satisfied at timestamp $t_{k}$.
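As an illustration only, the condition in Equation (4) can be sketched in Python by counting tweets per (user, timestamp) pair. The record layout (dictionaries with `username` and `timestamp` keys) and the function name are assumptions made for this sketch; they are not taken from the study's actual tooling.

```python
from collections import Counter


def flag_same_timestamp_posters(tweets):
    """Return users with more than one tweet at an identical timestamp (Eq. 4).

    `tweets` is an iterable of dicts with hypothetical keys
    "username" and "timestamp".
    """
    # Count how many tweets each user posted at each exact timestamp.
    counts = Counter((t["username"], t["timestamp"]) for t in tweets)
    # A user is flagged as soon as any of their timestamps holds > 1 tweet.
    return {user for (user, _ts), n in counts.items() if n > 1}


# Hypothetical sample records, anonymized in the spirit of Table 1.
sample = [
    {"username": "user_a", "timestamp": "2015-10-21 10:53:01"},
    {"username": "user_a", "timestamp": "2015-10-21 10:53:01"},
    {"username": "user_b", "timestamp": "2015-10-21 10:53:01"},
    {"username": "user_b", "timestamp": "2015-10-21 10:53:02"},
]
flagged = flag_same_timestamp_posters(sample)
```

Here only `user_a` posts twice within the same timestamp, so only that account would be passed on for further inspection.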

Method two (2) was designed to identify Twitter users (tweeters) with replicated content, i.e., users with repeated tweets. It first excluded users who had posted fewer than 30 tweets. This number was selected to create a manageable dataset for further investigation and to allow the potential use of the Central Limit Theorem (CLT) in additional statistical analysis [37]. The CLT suggests that the means of samples with a minimum size of 30 tend to approximate a normal distribution regardless of the population distribution [38,39]. Furthermore, based on the data, this threshold filtered out low-volume tweeters in favor of high-volume tweeters and retained the top 3% of high-volume tweeters out of 90,820 total tweeters.

The sole purpose of method 2 was to capitalize on the spamming feature of some bots and cyborgs that triggers or schedules tweets. Because they use synchronous and recurrent tweeting mechanisms, bots and cyborgs are more likely than human tweeters to generate multiple instances of the same tweet content. Even though humans are capable of tweeting the same content several times, such duplication is unlikely to occur as frequently. Chu et al. [5] identified duplicated tweets and limited original content as vital indicators for identifying bots. Since cyborgs combine human and bot activity, a certain percentage of their tweets would be automated. Extending this notion, this study flagged a user as a bot or cyborg if 30% or more of its total volume of tweets consisted of duplicated content. This condition conforms with the behavioral characteristics of spam bots, whose content variation is significantly limited [5]. In other words, it is assumed that anyone who posts duplicated content for 30% or more of their tweets throughout a fixed period is likely to be a bot or a cyborg.

The total number of unique tweets quantifiable using this method is represented as follows:

$$\sum_{i} (\overset{=}{tw})_{i}, \tag{5}$$

where $(\overset{=}{tw})_{i}$ is each unique tweet and $i \in N$.

$$\sum_{i} (tw)_{ij} \geq \frac{10 \cdot \sum_{i} (\overset{=}{tw})_{ij}}{7}, \tag{6}$$

$$\sum_{i} (tw)_{ij} \geq 30 \tag{7}$$

If the conditions in Equations (6) and (7) are met, it is assumed that user $u_{j}$ is either a bot or a cyborg. Equation (6) is equivalent to requiring that unique tweets make up at most 70% of the user's tweets, i.e., that at least 30% are duplicates.
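The two conditions of method 2 can be sketched as follows. This is a minimal illustration, assuming that "unique tweets" means the number of distinct tweet texts per user and that records are dictionaries with hypothetical `username` and `text` keys; the study's own implementation is not described at this level of detail.

```python
from collections import defaultdict

MIN_TWEETS = 30  # volume threshold from Equation (7)


def flag_duplicate_posters(tweets, min_tweets=MIN_TWEETS):
    """Return users meeting Equations (6) and (7).

    Equation (6), total >= 10 * unique / 7, is checked in integer form
    (7 * total >= 10 * unique) to avoid floating-point rounding.
    """
    texts_by_user = defaultdict(list)
    for t in tweets:
        texts_by_user[t["username"]].append(t["text"])

    flagged = set()
    for user, texts in texts_by_user.items():
        total = len(texts)
        unique = len(set(texts))  # assumption: uniqueness = distinct text
        if total >= min_tweets and 7 * total >= 10 * unique:
            flagged.add(user)
    return flagged


def make_tweets(user, total, distinct):
    """Build `total` hypothetical tweets for `user` with `distinct` texts."""
    return [{"username": user, "text": f"text {i % distinct}"}
            for i in range(total)]


sample = (
    make_tweets("user_a", 30, 21)    # exactly 30% duplicates: flagged
    + make_tweets("user_b", 30, 22)  # fewer than 30% duplicates
    + make_tweets("user_c", 29, 1)   # all duplicates, but below 30 tweets
)
flagged = flag_duplicate_posters(sample)
```

In this toy sample only `user_a` satisfies both conditions; `user_c` repeats a single text but falls below the 30-tweet volume threshold.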

The third method (method 3) finds tweeters who mostly employ automated applications. Cyborgs and bots amplify tweet volume for their desired causes by using automation software, through which tweeting, retweeting, following, and responding to posts can all be automated. A criterion was therefore devised that searched for tweeters who had published 30 or more tweets, with at least 70% of those tweets generated by known automation software. This raised the likelihood of finding bots and cyborgs. Tweet sources were retrieved as part of the metadata, which made this analysis possible.

Among the available Twitter automation software, the researchers selected Hootsuite, "If This Then That" (IFTTT), TweetDeck, Buffer, and TweetCaster based on their popularity. Buffer and Hootsuite are social media managers with automation features that make it possible to schedule tweets [40,41]. IFTTT develops applications that automate Twitter operations such as tweeting and retweeting [42]. TweetDeck is a social media program that enables the management of several Twitter accounts and includes a function for scheduling tweets [43]. Additionally, TweetCaster controls a Twitter account and provides the ability to schedule and automate the publishing of tweets [44]. This method is denoted mathematically as follows:

Let A = {IFTTT; Hootsuite; TweetDeck; TweetCaster; Buffer},

where A is a set of automating tweet sources.

A user, $u_{j}$, is therefore considered to be either a bot or a cyborg if Equations (8) and (9) below are satisfied:

$$\frac{(ats)_{j}}{\sum_{i} (tw)_{ij}} \geq 0.7, \tag{8}$$

$$\sum_{i} (tw)_{ij} \geq 30, \tag{9}$$

where $(ats)_{j}$ represents the total number of times an automated tweet source occurs for a certain $j$th user, and $(ats)_{j} \in A$; $j \in N$.
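A minimal sketch of Equations (8) and (9) is given below. It assumes that each record carries a hypothetical `source` field whose value exactly matches one of the names in the set A; real tweet-source strings may be spelled differently and would need normalizing first.

```python
from collections import Counter

# The set A of automating tweet sources named in the text.
AUTOMATED_SOURCES = {"IFTTT", "Hootsuite", "TweetDeck", "TweetCaster", "Buffer"}


def flag_automated_posters(tweets, min_tweets=30):
    """Return users meeting Equations (8) and (9).

    Equation (8), ats / total >= 0.7, is checked in integer form
    (10 * ats >= 7 * total) to avoid floating-point rounding.
    """
    totals = Counter()
    automated = Counter()
    for t in tweets:
        totals[t["username"]] += 1
        if t["source"] in AUTOMATED_SOURCES:
            automated[t["username"]] += 1
    return {
        user
        for user, total in totals.items()
        if total >= min_tweets and 10 * automated[user] >= 7 * total
    }


def make_tweets(user, n_auto, n_manual, auto_source="Hootsuite"):
    """Build hypothetical records; the manual source label is a placeholder."""
    return (
        [{"username": user, "source": auto_source}] * n_auto
        + [{"username": user, "source": "manual client"}] * n_manual
    )


sample = (
    make_tweets("user_a", 21, 9)               # 70% automated, 30 tweets
    + make_tweets("user_b", 20, 10, "Buffer")  # about 67% automated
    + make_tweets("user_c", 10, 0, "IFTTT")    # fully automated, 10 tweets
)
flagged = flag_automated_posters(sample)
```

Only `user_a` reaches both the 70% share and the 30-tweet minimum in this sample.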

In Method 4, Twitter bots and cyborgs were identified using the popular DeBot API. After registering an account for this study on the DeBot website [25], the researchers received an API key. To find relevant bots, the keyword "#FeesMustFall" was queried in Python using the following code:

    import debot
    db = debot.DeBot('your_api_key')
    db.get_related_bots('#FeesMustFall')

3. Results

Podargos' algorithm detected a total of 27 different languages in the #FeesMustFall campaign over the observation period. English was clearly dominant, accounting for 432,942 tweets or 88.3% of all tweets, which shows that it was the primary language through which users engaged with the #FeesMustFall hashtag on Twitter. Table 2 shows the number of tweets identified for each language and the total number of tweets. A total of 9331 non-English tweets were detected, with Dutch being the second most used language. A further 48,176 tweets were classed as undefined because the algorithm could not assign them to any language it recognizes.

Researchers observed that March and April of 2015 had a single tweet in the sample, as shown in Table 3, which presents the distribution of tweets over the entire observation period. There were 289,458 tweets in October 2015, accounting for 59.02 percent of the 490,449 tweets examined. A further 82,712 tweets, or 16.8 percent of the total Twitter posts, were recorded when the #FeesMustFall movement marked its first anniversary in October 2016. The tweet distribution shown in Figure 2 follows a lognormal distribution across a 12-month period, as indicated by a lognormality test on the data from October 2015 to September 2016. Figure 3 shows the distribution of total tweets across the days of the week, with a significant drop in tweet output on Saturdays and Sundays. Overall, this suggests a notable trend in which tweeting grew every weekday, then decreased during the weekend.
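A day-of-week tally like the one summarized in Figure 3 could be produced along the following lines. This is a sketch only: the timestamp format string and the function name are assumptions, since the exact layout of the dataset's datetime field is not given.

```python
from collections import Counter
from datetime import datetime

# Explicit names avoid any dependence on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def tweets_per_weekday(timestamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Count tweets per day of the week from timestamp strings."""
    counts = Counter(
        DAY_NAMES[datetime.strptime(ts, fmt).weekday()] for ts in timestamps
    )
    return dict(counts)


# Hypothetical timestamps: two on a Wednesday, one on a Saturday.
sample = ["2015-10-21 10:53:00", "2015-10-21 11:00:00", "2015-10-24 09:00:00"]
weekday_counts = tweets_per_weekday(sample)
```

Applied to the full dataset, the resulting counts would show whether output drops on Saturdays and Sundays as the text describes.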

Using the methods outlined in Section 2, two hundred and eighty-three (283) bots and cyborgs were found based on method 1. Similarly, method 2 identified six (6) bots and cyborgs, while method 3 spotted one hundred and thirty-five (135) bots and cyborgs for further analysis. Method 4, based on the DeBot API, returned four (4) bot-prone accounts. Results from methods 1, 2, 3, and 4 were compared with the top tweeters identified in this study, as presented in Figure 4. It was observed that User 1 and User 2 were the only two top tweeters who appeared across the different groupings. Due to Twitter's privacy policy, the usernames have been anonymized and the actual names are available upon request.
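Comparing the groupings amounts to a set intersection over the account lists each method returns. The sketch below uses made-up account labels and is not a reproduction of the study's result sets.

```python
def common_accounts(*method_results):
    """Return the accounts that appear in every supplied result set."""
    sets = [set(result) for result in method_results]
    return set.intersection(*sets) if sets else set()


# Hypothetical outputs of three screening methods and a top-tweeter list.
method_1 = {"acct_1", "acct_2", "acct_3"}
method_2 = {"acct_1", "acct_2"}
method_3 = {"acct_2", "acct_1", "acct_4"}
top_tweeters = {"acct_1", "acct_2", "acct_5"}

shared = common_accounts(method_1, method_2, method_3, top_tweeters)
```

In this toy example `shared` contains `acct_1` and `acct_2`, the only labels present in all four groupings.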

Figure 4 shows that Users 1 and 2 are the highest-ranking tweeters, and they were among the users identified via methods one (1), two (2), and three (3). This prompted further investigation of the regularity and number of tweets sent by these accounts. Several tweets per second, as well as 21 tweets in 10 s, indicate that the user is automated, as shown in Figure 4. Of the 90,783 active accounts, User 1 and User 2 were categorized as bots or cyborgs, and together they generated roughly 4.57 percent of the 490,449 Twitter posts. Figure 5 depicts the tweeting habits of User 1 and shows a consistent pattern in its tweeting behavior over a 1-h period, with noticeable spikes at 5-min intervals. User 1 mainly used Hootsuite, a Twitter automation tool, to tweet various pieces of material at regular intervals. The tweeting behavior of User 2 on 21 October 2015 at 10:53 a.m., shown in Figure 6, comprises numerous tweets per second for eight consecutive seconds within a 10-s span. The horizontal axis shows time in the hour:minute:second format, while the vertical axis reflects the volume of tweets in that period. The results indicate a high activity level, particularly within the 10-s span, with a minimum of two tweets per second in this dataset. User 2 tweeted using IFTTT, a service frequently used for building and running bot applets.
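Burst behavior of the kind shown in Figure 6 can be quantified as the largest number of tweets falling within any 10-s window. The following sliding-window sketch assumes timestamps in the hour:minute:second format used on the figure's horizontal axis; the function name and the choice of a strictly-less-than-10-s window are illustrative assumptions.

```python
from datetime import datetime, timedelta


def max_tweets_in_window(timestamps, window_seconds=10, fmt="%H:%M:%S"):
    """Return the largest number of tweets spanning less than the window."""
    times = sorted(datetime.strptime(ts, fmt) for ts in timestamps)
    window = timedelta(seconds=window_seconds)
    best = 0
    start = 0
    for end, current in enumerate(times):
        # Advance the left edge until the span is shorter than the window.
        while current - times[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best


# Hypothetical timestamps for a single account.
sample = ["10:53:00", "10:53:00", "10:53:01", "10:53:09", "10:53:10"]
peak = max_tweets_in_window(sample)
```

For this sample the densest window holds four tweets (10:53:00 to 10:53:09); a value such as the 21 tweets in 10 s reported above would be far beyond what manual posting is likely to achieve.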

4. Discussion

Given the multilingual composition of South Africa's population, it is probable that users composed texts using alternate or unique languages, emoticons, jargon, emojis, URLs, or alphanumeric characters that are not generally associated with human languages. Based on the data presented in Table 2, the overwhelming majority of texts written by users were in English. The second most common defined Twitter language was Dutch. This is not surprising because Afrikaans, which is widely spoken in South Africa, developed from Dutch and remains closely related to it.

The distribution of #FeesMustFall tweets was found to be lognormal, as presented in Figure 2a. According to a study by Bild et al. [45], tweet distributions in the Twitter campaigns they studied exhibit a lognormal form over finite sampling intervals. Since the #FeesMustFall campaign on Twitter followed the expected lognormal distribution, this suggests that human tweeters participated actively in the campaign and that the drive was not exclusively automated. The data were collected from Podargos, a data service provider, and Table 4 and Figure 2a provide mathematical affirmation regarding this study's data distribution.

During the data processing phase, the researchers discovered an unexpected social pattern in the top tweeters' frequency, volume, and content that points to automated behavior. Figure 5 shows an example of messages tweeted at regular intervals, whereas Figure 6 shows a surge of tweeting activity lasting a few seconds. Twitter bots and cyborgs are known for tweeting with these features [14]. The findings and analyses established that Users 1 and 2 were social robots. Because tweets from this account were automated with Hootsuite and Buffer, User 1 was deemed to exhibit cyborg-like characteristics. Compared with a wholly human-controlled Twitter account, User 1's tweeting behavior demonstrated a high frequency of tweets. Additionally, the DeBot system of Chavoshi, Hamooni, and Mueen [25] had detected User 1 as a bot on 29 December 2016, as reported through its API.

The investigation uncovered evidence of the use of social robots on Twitter during the #FeesMustFall movement over this period. Significantly, this study particularly identified two bots that were used to, at a bare minimum, magnify the #FeesMustFall movement. The two most prolific tweeters were both classified as social robots. As a result of these social robots’ influence, the number of #FeesMustFall tweets significantly increased. Users could modify, disable, delete, or alter tweets and profiles. Such user adjustments could change the metadata, so the data obtained at one moment may not be precisely comparable to data obtained at a subsequent time for the same subject.

5. Conclusions

No such massive movement as #FeesMustFall has occurred in South Africa since the dawn of electronic communication. This research revealed unusual social behavior displayed by the top tweeting users, indicating automated behavior in the #FeesMustFall campaign on Twitter. This included tweeting at set intervals and tweeting numerous times per second for many seconds. These tweet traits significantly resemble Twitter bots or cyborgs, as described in [14], and are designed to spam or troll the social networks in order to magnify selected tweets or send out a bunch of tweets in burst or pre-determined interval modes. User 1 and User 2 are social robots. While User 1 resembles a cyborg due to the usage of Buffer and Hootsuite to automate tweets, its tweeting behavior was unlike a human-controlled Twitter account.

As a result of their use during the #FeesMustFall campaign, social robots may have helped to continue the discourse on Twitter during off-peak times such as during vacations and university breaks. This study’s discovery of Twitter bots and cyborgs is notable since it is the first known study to analyze and identify social robots in South Africa, especially in relation to the #FeesMustFall movement. When it comes to South Africa, there are no historical records of this kind of behavior throughout previous campaigns. Government agencies and stakeholders should be concerned about the possibility of manipulative or other sorts of harmful bots being generated to shape public opinion during movements such as #FeesMustFall, which might have an impact on both polls and other delicate activities.

Author Contributions

Conceptualization, Y.K. and S.T.; methodology, Y.K.; validation, Y.K., S.T., O.O. and E.A.; formal analysis, Y.K. and S.T.; investigation, Y.K., S.T. and E.A.; resources, Y.K. and S.T.; data curation, Y.K. and S.T.; writing—original draft preparation, O.O.; writing—review and editing, Y.K. and O.O.; visualization, O.O. and E.A.; supervision, E.A.; project administration, S.T.; funding acquisition, S.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Secured and private datasets were analyzed in this study.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. The data-driven detection process for the social bots.

Figure 2. Monthly tweets between October 2015 and September 2016: (a) lognormal distribution curve and (b) lognormal P-P plot.

Figure 3. Total tweet distribution according to days of the week.

Figure 4. The ten Twitter users with the most posts, together with their numbers of hashtags, URLs, and retweets.

Figure 5. User 1’s tweeting frequency: total tweets per minute over a one-hour period.

Figure 6. Volume of User 2’s Twitter posts in a ten-second period.

Table 1. A sample of the data with Tweets and usernames anonymized.

Tweet	Date and Time	Retweet	Favorite	Tweet Source	Tweet Language	User Name
This is a dummy tweet #FeesMustFall.	2017-04-09 17:55:35	1	4	Twitter for Android	Eng.	User 37345
This is a dummy tweet #FeesMustFall.	2017-04-09 17:55:11	0	2	TweetDeck	Eng.	User 17036
This is a dummy tweet #FeesMustFall.	2017-04-09 17:35:17	0	0	Twitter for Android	Eng.	User 44562
This is a dummy tweet #FeesMustFall.	2017-04-09 17:21:58	0	0	Twitter for iPhone	Eng.	User 21750
This is a dummy tweet #FeesMustFall.	2017-04-09 17:20:38	0	0	Twitter Web Client	Und.	User 77419

Table 2. Distribution of Tweets by language.

Language	No. of Tweets	Language	No. of Tweets	Language	No. of Tweets	Language	No. of Tweets	Sum Total
English	432,942	Polish	517	Norwegian	188	Chinese	11
Undefined	48,176	Romanian	398	Czech	153	Ukrainian	11
Dutch	3366	Turkish	349	Arabic	152	Vietnamese	6
Spanish	941	Hindi	245	Swedish	152	Urdu	5
German	791	Finnish	241	Hungarian	53	Greek	2
Portuguese	744	Danish	226	Japanese	26	Korean	1
French	523	Italian	217	Russian	12	Persian	1
Total	487,483	Total	2193	Total	736	Total	37	490,449

Table 3. Distribution of Tweets over the observation period.

Year	Month	Number of Tweets	Total Number of Tweets
2015	March	1	306,834
	April	1
	October	289,458
	November	13,452
	December	3922
2016	January	13,318	172,052
	February	7215
	March	3898
	April	2076
	May	900
	June	1551
	July	2238
	August	6541
	September	38,472
	October	82,712
	November	9505
	December	3626
2017	January	4113	11,563
	February	2244
	March	2843
	April	2363
Total No. of Tweets (2015–2017)			490,449
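
The yearly and overall totals in Table 3 can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study: the counts are transcribed from Table 3, and the variable names (`tweets_by_month`, `yearly_totals`, `peak_share`) are illustrative assumptions.

```python
# Monthly tweet counts transcribed from Table 3, grouped by year.
tweets_by_month = {
    2015: {"March": 1, "April": 1, "October": 289_458,
           "November": 13_452, "December": 3_922},
    2016: {"January": 13_318, "February": 7_215, "March": 3_898,
           "April": 2_076, "May": 900, "June": 1_551, "July": 2_238,
           "August": 6_541, "September": 38_472, "October": 82_712,
           "November": 9_505, "December": 3_626},
    2017: {"January": 4_113, "February": 2_244, "March": 2_843,
           "April": 2_363},
}

# Sum each year's months, then sum the yearly totals.
yearly_totals = {year: sum(months.values())
                 for year, months in tweets_by_month.items()}
grand_total = sum(yearly_totals.values())

# Find the busiest single month across the whole observation period.
peak_year, peak_month, peak_count = max(
    ((year, month, count)
     for year, months in tweets_by_month.items()
     for month, count in months.items()),
    key=lambda item: item[2],
)
peak_share = peak_count / grand_total

print(yearly_totals)
print(grand_total)
print(f"{peak_month} {peak_year}: {peak_share:.1%} of all tweets")
```

Under this transcription the yearly sums agree with the totals column of Table 3, and October 2015 alone accounts for roughly 59 percent of the analyzed tweets, which is consistent with the strong right skew of the monthly counts.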

Table 4. Descriptive statistics for total monthly #FeesMustFall tweets from October 2015 to September 2016.

Variable	Value
Count	12
Mean	31,920
St. Dev	81,762
Median	5231.50
Min	900
Max	289,458
Skew	3.370
Kurt	11.50
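
The descriptive statistics in Table 4 can be recomputed from the twelve monthly counts for October 2015 to September 2016 listed in Table 3. The sketch below is illustrative only. The paper does not state which estimators were used, so the sample (n − 1) standard deviation and the bias-adjusted skewness and excess kurtosis reported by common statistics packages are assumptions, and the function names are hypothetical.

```python
import statistics

# Monthly #FeesMustFall tweet counts, October 2015 to September 2016 (Table 3).
monthly_tweets = [
    289_458, 13_452, 3_922,            # Oct-Dec 2015
    13_318, 7_215, 3_898, 2_076, 900,  # Jan-May 2016
    1_551, 2_238, 6_541, 38_472,       # Jun-Sep 2016
]


def adjusted_skewness(values):
    """Bias-adjusted sample skewness (assumed estimator)."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    cubed = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed


def adjusted_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (assumed estimator)."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)
    fourth = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction


summary = {
    "count": len(monthly_tweets),
    "mean": statistics.fmean(monthly_tweets),
    "st_dev": statistics.stdev(monthly_tweets),
    "median": statistics.median(monthly_tweets),
    "min": min(monthly_tweets),
    "max": max(monthly_tweets),
    "skew": adjusted_skewness(monthly_tweets),
    "kurt": adjusted_excess_kurtosis(monthly_tweets),
}

for name, value in summary.items():
    print(f"{name}: {value:,.3f}")
```

With these estimators the skewness comes out at about 3.37 and the excess kurtosis at about 11.5, both driven almost entirely by the single October 2015 peak of 289,458 tweets.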

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Khan, Y.; Thakur, S.; Obiyemi, O.; Adetiba, E. Identification of Bots and Cyborgs in the #FeesMustFall Campaign. Informatics 2022, 9, 21. https://doi.org/10.3390/informatics9010021
