Gender Prediction of Generated Tweets Using Generative AI

Alowibdi, Jalal S.

doi:10.3390/info15080452

Department of Computer Science and Artificial Intelligence, College of Computer Science and Engineering, University of Jeddah, Jeddah 23890, Saudi Arabia

Information 2024, 15(8), 452; https://doi.org/10.3390/info15080452

Submission received: 24 June 2024 / Revised: 19 July 2024 / Accepted: 27 July 2024 / Published: 1 August 2024

(This article belongs to the Special Issue Second Edition of Predictive Analytics and Data Science)

Abstract

With the use of Generative AI (GenAI), Online Social Networks (OSNs) now generate a huge volume of content. Yet user-generated content on OSNs, aided by GenAI, is challenging to analyze and characterize. In particular, tweets generated by GenAI at the request of authentic human users make it difficult to determine the gendered variation of the content. The vast amount of data generated from tweets’ content necessitates a thorough investigation into the gender-specific language used in these tweets. This study explores the task of predicting the gender of text content in tweets generated by GenAI. Through our analysis and experimentation, we achieved 90% accuracy in attributing gender-specific language to these tweets. Our research not only highlights the potential of GenAI in gender prediction but also underscores the techniques employed to decipher the subtle linguistic cues that differentiate male and female language in GenAI-generated content. This advancement in understanding and predicting gender-specific language in GenAI-generated tweets paves the way for more refined and accurate content analysis in the evolving landscape of OSNs.

Keywords:

generative AI; artificial intelligence; linguistic patterns; text classification; GenAI-generated; human authored; gender-specific

1. Introduction

With the majority of people using Online Social Networks (OSNs), these platforms are flooded with a massive volume of text content reflecting diverse perspectives, opinions, and sentiments. Twitter, now rebranded as X, has over 350 million monthly and 100 million daily active users, resulting in 100 million tweets and billions of words every day [1]. Therefore, understanding and analyzing the characteristics of this content, especially content generated with the help of Generative AI (GenAI), presents a significant challenge. Specifically, tweets generated by GenAI at the request of authentic human users make it difficult to determine the gendered variation of the content. Predicting gender from content is significantly different from face recognition for gender, even though both use similar classification techniques. While face recognition relies on visual cues and patterns that are often distinct and easily detectable, predicting gender from text content involves analyzing linguistic and stylistic characteristics that are much more intricate. The challenge in text-based gender prediction lies in the variability and complexity of language, where individual writing styles can vary widely regardless of gender [2,3,4]. This makes it harder to achieve accurate predictions than in face detection, where the visual features are more consistent and easier to classify. Our research explores the task of predicting the gender of text content in tweets generated by GenAI.

In addition, GenAI applications have advanced significantly, replicating human language and cognitive patterns with increasing sophistication [5,6,7,8]. GenAI has revolutionized content creation, producing algorithmically generated text that mimics human-like language patterns. While GenAI algorithms demonstrate remarkable proficiency, human-authored content stems from individual thought processes, reflecting the complexities of human cognition and emotion [7,8,9,10,11,12]. This progress presents a technological and societal challenge in accurately predicting gender-specific language in tweets generated by these algorithms. We investigate the fine-grained linguistic details, contextual clues, and subtle hints that reveal the gender differences embedded in a tweet. By examining sentence structure, semantic coherence, and other linguistic features, we aim to predict gender-specific tweets generated by GenAI at the request of authentic human users. Understanding the gender-specific language in GenAI-generated content holds profound implications for content creation on OSNs. Our fundamental research questions are: What is the impact of using GenAI to produce gender-specific tweets compared to tweets authored by gender-specific humans? What linguistic features, such as syntax, vocabulary, and grammar, are most indicative of gender-specific language in tweets generated by GenAI? How accurately can machine learning models predict the gender of GenAI-generated tweets compared to human-authored tweets? What roles do sentiment and emotional expression play in distinguishing gender-specific language in GenAI-generated tweets from that in human-authored tweets? How does the use of hashtags differ between male and female language in GenAI-generated tweets, and what impact does this have on tweet engagement and authenticity? What are the patterns of user interaction with gender-specific language in GenAI-generated tweets compared to those authored by humans? These questions guide our exploration of the underlying dynamics of tweet generation and consumption on OSNs.

The importance of our work extends beyond academic research to include practical benefits for a variety of stakeholders. From OSN platforms grappling with issues of content moderation to marketers tailoring their strategies, businesses targeting specific demographics, policymakers regulating AI usage, and users seeking authentic sources of information, the ability to predict gender-specific language in GenAI-generated tweets holds immense value. By providing insights into the distinctive characteristics of gendered language from each source, our research aims to empower individuals and organizations to make informed decisions in increasingly complex OSNs. Our contributions are manifold:

We collected a dataset containing gender-specific GenAI-generated tweets from users using ChatGPT (OpenAI 2024), and human-authored tweets labeled by gender.
We presented a novel approach and methodology for collecting a dataset tagged with hashtags, utilizing a temporal approach to capture trending hashtags over different time periods. This ensures a balanced and representative sample of tweets.
We employed a two-stage feature selection method to identify the most discriminative features for gender prediction. This involved analyzing term frequencies and applying the Chi-square test to select features with high discriminative scores that significantly contribute to distinguishing gender-specific language in tweets.
Through extensive experimentation with various Machine Learning (ML) classifiers, including Support Vector Machine (SVM), Naive Bayes (NB), Decision Tree (DT), Random Forest (RF), and Multi-Layer Perceptron (MLP), we validated the efficacy of our method. Our results demonstrate that we can accurately predict the gender of text content in tweets generated by GenAI.

The remainder of this article is organized as follows: Section 2 reviews the related works. Section 3 describes the proposed work. Section 4 presents the experimental results and lists the outcomes. Finally, we highlight the conclusion and the future work.

2. Related Works

Gender prediction on OSNs has been a significant area of research due to its applications in targeted marketing, personalized recommendations, and social studies. Early work in this domain focused on utilizing profile information and textual content to predict the gender of users. Peersman et al. explored methods for predicting age and gender in online social networks by analyzing user profiles and social media comments [13]. Their study highlighted the importance of linguistic and behavioral features in gender prediction tasks.

Merler et al. extended this work by incorporating a semantic analysis of social media images to predict gender [14]. They found that combining visual and textual features significantly improved the accuracy of gender prediction models on social media platforms like Twitter. Another notable study by Çelik and Aslan utilized artificial intelligence to predict gender from social media comments, emphasizing the role of natural language processing techniques in enhancing prediction accuracy [15].

In addition, Reddy et al. presented an N-gram approach for gender prediction, demonstrating how specific linguistic patterns can be used to distinguish between male and female language in social media content [16]. Similarly, Krüger and Hermann investigated the state of the art in gender identification from texts, evaluating various online services and their effectiveness in gender prediction [17]. Bamman et al. examined gender identity and lexical variation on social media, highlighting how gender influences language use and communication styles [18].

With the advancement of GenAI technologies like ChatGPT, the generation of text that mimics human language has become increasingly prevalent (OpenAI 2024). This raises questions about the ability of AI to replicate gender-specific language characteristics. The work by OpenAI (2024) demonstrated the capability of ChatGPT to produce coherent and contextually relevant text, yet it also highlighted the challenges in ensuring the generated content accurately reflects gender-specific subtleties [4]. Research by Gu discussed the ethical considerations and responsibilities involved in generative AI, particularly concerning the generation of biased or stereotypical content [9]. This work emphasizes the need for the careful design and monitoring of AI systems to avoid perpetuating gender biases in generated content. García-Peñalvo and Vázquez-Ingelmo provided a comprehensive overview of the evolution and trends in generative AI, underscoring the significance of addressing biases and ensuring the ethical deployment of these technologies [10].

The comparative analysis of human-authored and AI-generated text reveals several challenges in predicting gender-specific language. Alowibdi et al. explored the task of distinguishing between human-authored and GenAI-generated tweets, achieving high accuracy in identifying the source of tweets [1]. Their research underscores the complexity of modeling gender-specific language in generative AI content. Overall, the body of work in gender prediction for generative AI content highlights the progress made and the ongoing challenges in this field. As GenAI continues to evolve, it is crucial to develop robust methods for predicting and analyzing gender-specific language to ensure the ethical and accurate representation of gender in OSN content.

3. Materials and Methods

3.1. Motivation

The rapid development of GenAI applications has transformed content creation, enabling machines to generate text that closely resembles human language and is widely adopted by users. Nowadays, many users use GenAI daily, including to generate tweets on their behalf. This technological leap raises important questions about the gender-specific traits of content shared on OSNs. Understanding the gender specificity of GenAI-generated and human-authored tweets has become a pressing concern, with significant implications for trust, transparency, and the overall integrity of OSN content. Consequently, our research seeks to address these challenges and develop robust methods for predicting the gender specifics of tweets generated by GenAI at the request of human authors. This distinction is crucial because the increasing volume of GenAI-generated content on OSN platforms like Twitter can obscure the differences between authentic human interactions and GenAI responses, making gender-specific language harder to identify. This lack of differentiation can diminish user trust, as gender-specific characteristics are often key to authenticity. If users start to doubt the genuineness of the content they encounter, their engagement and trust in these platforms may decline.

Gender-specific prediction is also crucial for enhancing personalization and user engagement on OSNs. By accurately identifying and generating gender-specific language, AI systems can tailor content more effectively to meet the preferences and expectations of diverse user groups. This level of personalization can lead to higher user satisfaction and increased engagement, as users feel more understood and valued. In addition, for businesses and marketers, gender-specific prediction in GenAI-generated tweets can significantly improve communication strategies. By tailoring messages to resonate with different gender groups, companies can enhance their marketing effectiveness and reach their target audiences more efficiently. Understanding the characteristics of gender-specific language allows for more compelling and persuasive communication, ultimately leading to better conversion rates and customer loyalty.

Gender-specific prediction in GenAI-generated content contributes to a more authentic user experience on OSNs. Users often expect content that aligns with their linguistic preferences and communication styles. By generating gender-specific content, GenAI systems can provide a more relatable and engaging experience for users, fostering a sense of community and belonging. Accurate gender-specific prediction in GenAI-generated tweets is also a matter of ethical AI deployment. Ensuring that GenAI systems respect and reflect gender differences responsibly is crucial for maintaining the trust and confidence of users. Ethical considerations, such as avoiding the reinforcement of harmful stereotypes and biases, are integral to the development and deployment of GenAI technologies. By focusing on gender-specific predictions, developers can contribute to the creation of fair and equitable GenAI systems. From a research perspective, exploring gender-specific prediction in GenAI-generated tweets provides valuable insights into the complexities of human language and communication. It allows researchers to better understand how gender influences language use and interaction patterns on OSNs. This knowledge can inform the development of more sophisticated and accurate GenAI models, contributing to advancements in the field of natural language processing (NLP) and AI.

In summary, the ability to accurately predict and generate gender-specific language in GenAI-generated tweets is important for enhancing personalization, addressing bias, improving communication strategies, enhancing user experience, ensuring ethical AI deployment, and advancing research. As GenAI continues to evolve, prioritizing gender-specific prediction will play a crucial role in creating a more inclusive, engaging, and ethical online environment.

3.2. Dataset

We collected our datasets from Twitter using two distinct approaches.

Firstly, we retrieved a dataset containing tweets associated with older hashtags, written by real human users and labeled by gender. Simultaneously, we generated an equivalent dataset using a GenAI application such as ChatGPT, specifically instructing it to produce tweets based on the same hashtags without emotional bias, to observe how it interacted with the hashtags [5]. We created the tweets using two different sources of ChatGPT input: a male person instructing ChatGPT to create the tweets for the assigned hashtags, and a female person instructing ChatGPT to create the tweets for the assigned hashtags. The human-authored tweets were randomly collected, labeled, and verified, and encompassed a mix of positive and negative sentiments depending on the content generated for the assigned hashtags, whereas those generated by GenAI were based solely on the provided hashtags using the two sources mentioned above. This dual-pronged approach to dataset construction ensured a robust foundation for our analysis: it captured a diverse array of tweets authored by genuine human users while using GenAI to generate a synthetic dataset mirroring the thematic scope of the collected tweets. By using identical hashtags for both datasets, we created a controlled environment for comparative analysis, enabling detailed insights into the dynamics of content generation.

Secondly, we extended beyond historical datasets to encompass contemporary trends. In parallel with our exploration of older hashtags, we turned to the latest trending hashtags on Twitter. This temporal approach enabled us to capture real-time conversations and emergent themes, thereby enriching the breadth and depth of our dataset. By correlating the datasets spanning distinct time periods, we aimed to spot shifting patterns and trends in gender-specific human-authored and gender-specific GenAI-generated content. We observed that GenAI now produces tweets with different contextual characteristics compared to those generated in the past. Therefore, similar to the initial phase, we acquired a dataset comprising tweets generated by real human users, alongside a corresponding dataset generated by GenAI for the same set of hashtags. Thus, two types of hashtags on Twitter were collected from two different sources at two different times to analyze the gender-specific behavior of both GenAI-generated tweets and human-generated tweets. The hashtags selected were not related to any specific topic but were randomly picked from the trending list at the time or from those that showed up during our exploration. Subsequently, all tweets related to the chosen hashtags were collected and stored in a database. Concurrently, all hashtags from the collected tweets were extracted and input into the male and female sources of the GenAI application to produce tweets. This resulted in two types of tweets: those generated by real human users with two classes of gender and those generated by GenAI with two classes of gender, after being provided with the hashtags. The process followed a systematic approach, encompassing various stages to ensure a comprehensive and balanced selection of data.

Therefore, we collected 3000 gender-specific human-authored tweets spanning more than 150 different hashtags, representing 8 years. Subsequently, we tasked the GenAI application to generate an equivalent number of gender-specific GenAI-generated tweets for the same number of hashtags [5]. This approach yielded a balanced dataset comprising 6000 tweets containing four different classes. These classes are male human-authored tweets, female human-authored tweets, male GenAI-generated tweets and female GenAI-generated tweets.

3.3. Approach

The preprocessing step, performed before feature selection, is crucial to ensure the quality and relevance of the data used for analysis. We started by eliminating noise such as irrelevant characters, URLs, and special symbols from the tweets. Then, we split the tweets into individual words or tokens. We then removed common words that did not contribute to the meaning (e.g., “the”, “is”, “at”) and reduced words to their base or root form to ensure consistency. After that, each tweet was converted into a bag-of-words model to represent the actual content. This model represents a tweet by the words it contains, without considering their order. Finally, we analyzed the sentiment of all the collected tweets to categorize them into gender-specific classes for both GenAI-generated and human-authored content. After preprocessing, we conducted a comprehensive analysis of the collected datasets, focusing on extracting various textual features. Our analytical framework emphasized the extraction and characterization of salient features inherent in textual data. Ultimately, we identified numerous characteristic features. Therefore, we applied a reduction technique, using a feature selection method, to the extracted features. Through rigorous feature engineering, we discovered that conventional feature selection methods could result in deficient performance and, consequently, reduced accuracy.
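The preprocessing steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop-word list, the regular expressions, the decision to keep the `#` of hashtags, and the crude `naive_stem` helper are all hypothetical stand-ins, since the article does not name the tokenizer, stop-word list, or stemmer it used.

```python
import re
from collections import Counter

# Illustrative stop-word list (assumption); the article does not specify its list.
STOP_WORDS = {"the", "is", "at", "a", "an", "and", "to", "of", "in", "on", "for"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Keep letters, digits, whitespace, and '#' so hashtags survive (assumption).
NOISE_RE = re.compile(r"[^a-z0-9#\s]")


def naive_stem(word):
    """Crude suffix stripping, standing in for a real stemmer or lemmatizer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(tweet):
    """Remove noise, tokenize, drop stop words, and reduce words to a base form."""
    text = URL_RE.sub(" ", tweet.lower())
    text = NOISE_RE.sub(" ", text)
    return [naive_stem(tok) for tok in text.split() if tok not in STOP_WORDS]


def bag_of_words(tweet):
    """Unordered word counts for one tweet."""
    return Counter(preprocess(tweet))
```

In practice a dedicated NLP library would likely replace `naive_stem` and the stop-word set; the sketch only shows the order of the steps.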

To overcome these issues, we propose a novel approach of choosing Term Frequency (TF) and Document Frequency (DF) with higher scoring by computing the most discriminative term or document. The goal is to select features that are most discriminative between tweets generated by GenAI and those authored by humans, and then more specifically focus on gender-specific characteristics on tweets that were GenAI-generated or human-authored. To achieve this, we propose an approach to leverage the Chi-square statistic to evaluate the significance of each term’s frequency in predicting gender classification, while also using probability equations to assess the differences between, firstly, GenAI-generated tweets and human-authored tweets, and secondly, male and female term usage. This will help us evaluate the independence of the terms across GenAI-generated and human-authored categories, then gender-specific categories.

In addition, feature selection is a crucial step in the machine learning pipeline, particularly in high-dimensional data contexts such as text classification and social network analysis. Thus, in this study, we evaluated the performance of our proposed feature selection method, which combines TF, DF, and Chi-square statistics, against other standard feature selection techniques: Mutual Information (MI), Principal Component Analysis (PCA), and Recursive Feature Elimination (RFE). The comparison covers both computational efficiency and accuracy. The proposed method, which leverages TF, DF, and Chi-square statistics, has a time complexity of $O(n \cdot m)$. This is similar to MI, which also has a time complexity of $O(n \cdot m)$, but significantly more efficient than PCA, which has a time complexity of $O(n^{2} \cdot m + n^{3})$, and RFE, which has a time complexity of $O(n^{2} \cdot m)$. Moreover, the proposed method achieves accuracy that is competitive with the standard techniques while maintaining computational efficiency. The results indicate that the proposed feature selection method provides a good balance between computational efficiency and accuracy. It performs better than standard techniques in terms of execution time and model performance. The combination of TF, DF, and Chi-square statistics effectively captures the most discriminative features, enhancing the predictive power of the model without incurring significant computational overhead.

Therefore, for each term $t$ in the dataset, we calculated its frequency in GenAI-generated tweets, $\mathrm{TF}(t, G)$, and in human-authored tweets, $\mathrm{TF}(t, H)$, and then its frequency in male tweets, $\mathrm{TF}(t, M)$, and in female tweets, $\mathrm{TF}(t, F)$. After that, we calculated the total length (number of terms) of the GenAI-generated tweets, $\mathrm{TL}(G)$, and of the human-authored tweets, $\mathrm{TL}(H)$, and then of the male tweets, $\mathrm{TL}(M)$, and of the female tweets, $\mathrm{TL}(F)$. Then, we calculated the probability of each term $t$ occurring in GenAI-generated and human-authored tweets as well as gender-specific male and female tweets as follows:

$$P(t \mid G) = \frac{\mathrm{TF}(t, G)}{\mathrm{TL}(G)} \tag{1}$$

$$P(t \mid H) = \frac{\mathrm{TF}(t, H)}{\mathrm{TL}(H)} \tag{2}$$

$$P(t \mid M) = \frac{\mathrm{TF}(t, M)}{\mathrm{TL}(M)} \tag{3}$$

$$P(t \mid F) = \frac{\mathrm{TF}(t, F)}{\mathrm{TL}(F)} \tag{4}$$

where Equation (1) is for GenAI-generated tweets, Equation (2) is for human-authored tweets, Equation (3) is for gender-specific male tweets, and Equation (4) is for gender-specific female tweets. Also, we calculated the difference in probabilities between GenAI-generated and human-authored tweets as well as gender-specific male and female tweets for each term $t$ as follows:

$$\Delta P(t)_{GH} = P(t \mid G) - P(t \mid H) \tag{5}$$

$$\Delta P(t)_{MF} = P(t \mid M) - P(t \mid F) \tag{6}$$

where Equations (5) and (6) are for GenAI-generated and human-authored tweets and for gender-specific male and female tweets, respectively. In addition, we calculated the expected frequency for term $t$ in GenAI-generated and human-authored tweets as well as gender-specific male and female tweets as follows:

$$E_{G} = \frac{(\mathrm{TF}(t, G) + \mathrm{TF}(t, H)) \times \mathrm{TL}(G)}{\mathrm{TL}(G) + \mathrm{TL}(H)} \tag{7}$$

$$E_{H} = \frac{(\mathrm{TF}(t, G) + \mathrm{TF}(t, H)) \times \mathrm{TL}(H)}{\mathrm{TL}(G) + \mathrm{TL}(H)} \tag{8}$$

$$E_{M} = \frac{(\mathrm{TF}(t, M) + \mathrm{TF}(t, F)) \times \mathrm{TL}(M)}{\mathrm{TL}(M) + \mathrm{TL}(F)} \tag{9}$$

$$E_{F} = \frac{(\mathrm{TF}(t, M) + \mathrm{TF}(t, F)) \times \mathrm{TL}(F)}{\mathrm{TL}(M) + \mathrm{TL}(F)} \tag{10}$$

where Equations (7)–(10) are for GenAI-generated tweets, human-authored tweets, gender-specific male tweets, and gender-specific female tweets, respectively. Furthermore, we calculated the Chi-square statistic for each term $t$ as follows:

$$\chi_{GH}^{2} = \frac{(\mathrm{TF}(t, G) - E_{G})^{2}}{E_{G}} + \frac{(\mathrm{TF}(t, H) - E_{H})^{2}}{E_{H}} \tag{11}$$

$$\chi_{MF}^{2} = \frac{(\mathrm{TF}(t, M) - E_{M})^{2}}{E_{M}} + \frac{(\mathrm{TF}(t, F) - E_{F})^{2}}{E_{F}} \tag{12}$$

where Equations (11) and (12) are for GenAI-generated and human-authored tweets and for gender-specific male and female tweets, respectively. Moreover, we combined the Chi-square statistic and the probability difference to form a composite feature selection criterion as follows:

$$\mathrm{DiscriminativeScore}_{GH}(t) = |\Delta P(t)_{GH}| + \chi_{GH}^{2} \tag{13}$$

$$\mathrm{DiscriminativeScore}_{MF}(t) = |\Delta P(t)_{MF}| + \chi_{MF}^{2} \tag{14}$$

where Equations (13) and (14) are for GenAI-generated and human-authored tweets and for gender-specific male and female tweets, respectively. Finally, we selected the top terms based on their discriminative scores for GenAI-generated and human-authored classification as well as gender-specific male and female classification using the following pseudocode (Algorithm 1):

Algorithm 1. Selecting the top terms based on their discriminative scores

// Input: List of tweets for class 1 (e.g., male) and class 2 (e.g., female)
// Output: Selected top features for gender prediction

// Calculate Term Frequencies
function calculate_term_frequencies(tweets, class):
    initialize term_freq as an empty dictionary
    initialize total_length as 0
    for each tweet in tweets:
        words = preprocess(tweet)
        total_length += length of words
        for each word in words:
            if word not in term_freq:
                term_freq[word] = 0
            term_freq[word] += 1
    return term_freq, total_length

// Calculate Probability, Chi-Square, and Discriminative Score
function calculate_discriminative_score(term_freq_c1, term_freq_c2,
                                        total_length_c1, total_length_c2):
    initialize scores as an empty dictionary
    for each term in union of keys in term_freq_c1 and term_freq_c2:
        tf_c1 = term_freq_c1.get(term, 0)
        tf_c2 = term_freq_c2.get(term, 0)
        p_c1 = tf_c1 / total_length_c1
        p_c2 = tf_c2 / total_length_c2
        delta_p = absolute value of (p_c1 - p_c2)
        e_c1 = (tf_c1 + tf_c2) * total_length_c1 / (total_length_c1 + total_length_c2)
        e_c2 = (tf_c1 + tf_c2) * total_length_c2 / (total_length_c1 + total_length_c2)
        chi_square = ((tf_c1 - e_c1)^2 / e_c1) + ((tf_c2 - e_c2)^2 / e_c2)
        scores[term] = delta_p + chi_square
    return scores

// Feature Selection
function select_top_features(scores, top_n):
    sorted_terms = sort scores by value in descending order
    return first top_n items from sorted_terms

// Main Function
function main(tweets_c1, tweets_c2, top_n):
    term_freq_c1, total_length_c1 = calculate_term_frequencies(tweets_c1, 'class1')
    term_freq_c2, total_length_c2 = calculate_term_frequencies(tweets_c2, 'class2')
    scores = calculate_discriminative_score(term_freq_c1, term_freq_c2,
                                            total_length_c1, total_length_c2)
    top_features = select_top_features(scores, top_n)
    return top_features
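
Algorithm 1 maps fairly directly onto Python. The following is a minimal runnable sketch of that mapping, not the author's implementation: it assumes tweets have already been tokenized into lists of terms, and the function names, the alphabetical tie-breaking rule, and the empty-class check are additions made here for illustration.

```python
from collections import Counter


def term_frequencies(token_lists):
    """Return (term -> count, total number of tokens) for one class of tweets."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(tokens)
    return counts, sum(counts.values())


def discriminative_scores(tf_c1, tl_c1, tf_c2, tl_c2):
    """Score each term as |P(t|c1) - P(t|c2)| plus its chi-square statistic."""
    if tl_c1 == 0 or tl_c2 == 0:
        raise ValueError("both classes must contain at least one token")
    total = tl_c1 + tl_c2
    scores = {}
    for term in tf_c1.keys() | tf_c2.keys():
        f1, f2 = tf_c1.get(term, 0), tf_c2.get(term, 0)
        delta_p = abs(f1 / tl_c1 - f2 / tl_c2)
        # Expected counts if the term were used at the same rate in both classes.
        e1 = (f1 + f2) * tl_c1 / total
        e2 = (f1 + f2) * tl_c2 / total
        chi_square = (f1 - e1) ** 2 / e1 + (f2 - e2) ** 2 / e2
        scores[term] = delta_p + chi_square
    return scores


def select_top_features(scores, top_n):
    """Highest-scoring terms first; ties broken alphabetically (assumption)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]
```

As a worked check with made-up counts, a term seen 30 times in 1000 male tokens and 10 times in 1000 female tokens has a probability difference of 0.02, expected counts of 20 in each class, a Chi-square value of 10, and a score of 10.02. Because the probability difference can never exceed 1 while the Chi-square term grows with the raw counts, the Chi-square term tends to dominate the ranking for frequent terms.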

After calculating the discriminative scores for each term, we proceeded to select the most discriminative features that would be used in our GenAI-generated and human-authored prediction model and gender-specific prediction model. We categorized these features into three different types to evaluate their effectiveness:

Top 500 Features: We first selected the top 500 most discriminative features based on their scores. These features are expected to have the highest impact on distinguishing between male and female language in GenAI-generated tweets and human-authored tweets.
Top 1000 Features: In the second category, we extended our selection to the top 1000 most discriminative features. By including a larger set of features, we aim to capture more of the variation in gender-specific language. This broader selection helps ensure that subtle but potentially important linguistic patterns are not overlooked.
All Selected Features: Finally, we compiled a comprehensive set of all the features that were identified as discriminative, regardless of their rank. This complete set includes every term that demonstrates a statistically significant difference in usage between male and female categories. Using this extensive set allows us to fully explore the complexity of gender-specific language in GenAI-generated tweets and human-authored tweets and provides a robust basis for our predictive models.

By categorizing our features in this manner, we can systematically evaluate the impact of feature selection on the performance of our gender prediction models. This approach ensures that our analysis is both thorough and detailed, allowing us to identify the most effective features for distinguishing gender-specific language in GenAI-generated tweets and human-authored tweets. Therefore, Figure 1 illustrates the comprehensive workflow of the proposed algorithm for gender prediction in GenAI-generated tweets.
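
The three feature sets can be derived from a single ranking of the scored terms. The sketch below is one plausible way to do this and makes several assumptions: `scores` is a dictionary of discriminative scores per term, the names `build_feature_sets` and `vectorize` are hypothetical, and the "all" set simply keeps every scored term because the article does not state the significance threshold it applied.

```python
from collections import Counter


def build_feature_sets(scores, sizes=(500, 1000)):
    """Rank terms by score and cut the ranking at each requested size."""
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    feature_sets = {f"top_{n}": ranked[:n] for n in sizes}
    # Assumption: no significance filter, so "all" is the full ranking.
    feature_sets["all"] = ranked
    return feature_sets


def vectorize(tokens, features):
    """Bag-of-words count vector restricted to the selected features."""
    counts = Counter(tokens)
    return [counts[feature] for feature in features]
```

Each tweet's vector then has one entry per selected feature, so the same classifier can be trained on the 500-feature, 1000-feature, and full representations for comparison.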

4. Experimental Results

4.1. Evaluation

We trained five different ML classifiers: SVM, NB, DT, RF and MLP. We applied 10-fold cross-validation on the selected features to ensure robust evaluation. The performance of each model was assessed using accuracy, precision, recall, and F1-score metrics. The results, as shown in Figure 2, Figure 3 and Figure 4, illustrate that as the number of features increases, the performance of all classifiers improves, with MLP consistently achieving the highest scores. This underscores the importance of feature selection in enhancing the model’s ability to predict gender-specific language in GenAI-generated tweets.
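
To make the evaluation protocol concrete, the sketch below shows how 10-fold splits and the four reported metrics can be computed with the Python standard library alone. It is an illustration rather than the evaluation code used in the study: the helper names are hypothetical, the folds are contiguous and unshuffled (in practice the data would normally be shuffled or stratified first), and the metrics treat one gender label as the positive class.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision, recall, and F1 with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

A classifier would be fitted on each training split and scored on the matching test split, with the per-fold metrics averaged to give the final figures.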

These results highlight the effectiveness of the MLP model in capturing the details of gender-specific language in GenAI-generated tweets, making it the best-performing classifier among those tested. To evaluate both performance and accuracy, we tested the classifiers on three different feature sets: Feature 500, Feature 1000, and All Features. The results for each experiment are as follows:

Feature 500: When trained on the set of 500 features, the MLP classifier demonstrates the highest performance across all metrics, achieving an accuracy of 83%. SVM follows at 81%, DT at 80%, and RF at 78%. NB shows the lowest performance on this feature set, with an accuracy of 76%.
Feature 1000: When trained on the set of 1000 features, MLP continues to outperform the other classifiers, achieving an accuracy of 86%. SVM also performs strongly at 84%, followed by DT at 81% and RF at 80%. NB reaches 77%, a slight improvement over the 500-feature set. Every classifier gains accuracy relative to Feature 500, indicating that increasing the number of features enhances model performance.
All Features: When trained on all features, MLP achieves the highest scores across all metrics, with an accuracy of 90%, followed by SVM at 87%, DT at 85%, and RF at 84%. NB, though improved, remains the lowest performer with an accuracy of 80%.
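The accuracies reported in the three experiments can be placed side by side to check the ranking of the classifiers and the gain each one obtains from the larger feature sets. The short sketch below only restates the figures given above; the dictionary layout and the helper names `ranking` and `gain` are illustrative choices.

```python
# Accuracy (%) per classifier, copied from the three experiments reported above.
REPORTED_ACCURACY = {
    "Feature 500": {"MLP": 83, "SVM": 81, "DT": 80, "RF": 78, "NB": 76},
    "Feature 1000": {"MLP": 86, "SVM": 84, "DT": 81, "RF": 80, "NB": 77},
    "All Features": {"MLP": 90, "SVM": 87, "DT": 85, "RF": 84, "NB": 80},
}


def ranking(scores):
    """Classifier names ordered from highest to lowest accuracy."""
    return sorted(scores, key=scores.get, reverse=True)


def gain(results, start, end):
    """Accuracy gain in percentage points between two feature sets."""
    return {clf: results[end][clf] - results[start][clf] for clf in results[start]}


if __name__ == "__main__":
    for name, scores in REPORTED_ACCURACY.items():
        print(f"{name}: {' > '.join(ranking(scores))}")
    print(gain(REPORTED_ACCURACY, "Feature 500", "All Features"))
```

The ordering MLP, SVM, DT, RF, NB is the same in all three experiments, and the gain from Feature 500 to All Features ranges from 4 points (NB) to 7 points (MLP).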

4.2. Observation

Our experiments demonstrate that it is possible to accurately distinguish between male and female language in GenAI-generated tweets. We observed several notable differences between these two gender-specific language patterns. Tweets generated when GenAI is instructed to mimic male language often exhibit linguistic patterns that are distinct from those generated when it is instructed to mimic female language. Male GenAI-generated tweets tend to use more assertive and direct language, as in “We need to tackle this problem head-on and come up with a solution quickly”, while female GenAI-generated tweets often include more collaborative and empathetic expressions, as in “Let’s work together to find a solution that benefits everyone involved”. This differentiation in language style reflects societal norms and stereotypes related to gender communication.

Additionally, we observed that male GenAI-generated tweets frequently employ more technical jargon and formal language, particularly in professional contexts, as in “Integrating the latest AI technologies will significantly enhance our operational efficiency”, whereas female GenAI-generated tweets often incorporate more personal and relational language, as in “I’m excited about the possibilities that new AI technologies bring to our work”. This pattern is consistent with the tendency for male language to focus on information and task-oriented communication, while female language emphasizes relationship building and emotional expression.

In terms of emotional tone, male GenAI-generated tweets generally exhibit a more neutral or objective tone, as in “Implementing the new software update should optimize our system’s performance and increase efficiency”, whereas female GenAI-generated tweets often convey a wider range of emotions, including empathy, warmth, and support, as in “I really appreciate the team’s support and dedication; it makes all the difference”. This difference in emotional expression can impact the perceived authenticity and relatability of the tweets, with female GenAI-generated content potentially resonating more with audiences seeking emotional connection. Moreover, we found that male GenAI-generated tweets are more likely to include language related to authority, competition, and independence, as in “Achieving these milestones ahead of schedule demonstrates our efficiency and dedication”, while female GenAI-generated tweets tend to use language associated with cooperation, nurturing, and community, as in “Our collective efforts and shared vision have led to this wonderful accomplishment”. This distinction is evident in the choice of words, phrases, and overall tone used in the tweets. These differences are summarized in Table 1, and example tweets are given in Table 2.
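Some of the surface cues discussed above (exclamation marks, direct address, inclusive wording, hashtags) are easy to count automatically. The sketch below is only an illustration of that idea and is not the feature extraction used in the study: the function `surface_cues` and the two small word lists are hypothetical, and the example tweets are taken from Table 2.

```python
import re
from typing import Dict

# Illustrative word lists chosen for this example; they are assumptions,
# not a lexicon from the study.
SECOND_PERSON = {"you", "your", "you've"}
INCLUSIVE_WORDS = {"we", "us", "our", "together", "everyone"}


def surface_cues(tweet: str) -> Dict[str, int]:
    """Count a few simple stylistic cues in a single tweet."""
    # Normalize curly apostrophes so that "You’ve" matches "you've".
    words = re.findall(r"[a-z']+", tweet.lower().replace("’", "'"))
    return {
        "exclamations": tweet.count("!"),
        "second_person": sum(w in SECOND_PERSON for w in words),
        "inclusive": sum(w in INCLUSIVE_WORDS for w in words),
        "hashtags": len(re.findall(r"#\w+", tweet)),
    }


if __name__ == "__main__":
    male = "Stay focused and achieve your goals. #motivation"
    female = "You’ve got this! Keep pushing towards your goals. #motivation"
    print(surface_cues(male))
    print(surface_cues(female))
```

For this pair, the female-style tweet contains one exclamation mark and two second-person forms, while the male-style tweet contains no exclamation marks and one second-person form. Counts like these would be only one small part of a full feature set.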

Finally, the word clouds of the male and female GenAI-generated tweets, shown in Figure 5 and Figure 6, indicate that female GenAI-generated tweets used less-common words than male GenAI-generated tweets.
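Word clouds of this kind are built from word-frequency counts. The following is a minimal sketch of how the most-used and least-used words of a set of tweets could be obtained; the stopword list, the decision to drop hashtags, and the function names are assumptions made for illustration and do not describe the preprocessing used for Figure 5 and Figure 6.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Tiny illustrative stopword list (an assumption, not the study's list).
STOPWORDS = {"a", "an", "and", "are", "for", "in", "of", "on", "our", "the",
             "these", "this", "to", "with", "your"}


def word_counts(tweets: Iterable[str]) -> Counter:
    """Count non-stopword tokens across tweets, ignoring hashtags."""
    counts: Counter = Counter()
    for tweet in tweets:
        text = re.sub(r"#\w+", " ", tweet.lower()).replace("’", "'")
        counts.update(w for w in re.findall(r"[a-z']+", text) if w not in STOPWORDS)
    return counts


def most_and_least_used(counts: Counter, n: int = 5
                        ) -> Tuple[List[Tuple[str, int]], List[Tuple[str, int]]]:
    """Return the n most frequent and n least frequent (word, count) pairs.

    Ties are broken alphabetically so that the output is deterministic.
    """
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n], ranked[-n:]
```

Applying `word_counts` separately to the male and female tweet sets and comparing the two rankings would give the kind of contrast that the word clouds display visually.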

5. Conclusions

Our research on predicting the gender-specific language in GenAI-generated tweets has highlighted significant differences between male and female language patterns, both in content and style. Our comprehensive analysis and the use of diverse machine learning models have validated the efficacy of our approach, with the MLP model consistently outperforming others in capturing the details of gender-specific language in GenAI-generated content.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

Figure 1. Flowchart of the proposed algorithm for gender prediction in GenAI-generated tweets.

Figure 2. Performance metrics for Feature 500.

Figure 3. Performance metrics for Feature 1000.

Figure 4. Performance metrics for all features.

Figure 5. Most-used words by GenAI-Male (left) and GenAI-Female (right).

Figure 6. Least-used words by GenAI-Male (left) and GenAI-Female (right).

Table 1. The key linguistic features that distinguish male and female language patterns in GenAI-generated tweets.

Feature	Male GenAI-Generated	Female GenAI-Generated
Lexical and Vocabulary	Use of assertive and technical vocabulary (e.g., “achieve”, “optimize”)	Use of collaborative and empathetic vocabulary (e.g., “support”, “understanding”)
Structure and Syntax	Direct and straightforward sentences; focus on facts and outcomes	Complex sentence structures; conversational and engaging style
Use of Pronouns	Frequent use of “I” and “we”, emphasizing individual/group achievements	Inclusive pronouns like “we”, “us”, and frequent “you” for direct engagement
Emotional Tone	Neutral or objective tone; minimal emotional expression	Wide range of emotions; empathy, warmth, and support
Hashtag Usage	Related to industry-specific topics, technology, current events; typically placed at the end	Related to social issues, personal experiences, community-building; integrated into the tweet body
Punctuation and Grammar	Formal punctuation; fewer grammatical errors; less frequent use of exclamation marks	Expressive punctuation; use of exclamation marks, ellipses; personal touch
Use of Emojis	Less frequent use of emojis; professional contexts	Frequent use of emojis; enhance emotional expression and relatability

Table 2. GenAI-generated tweets for both male and female expressions.

Male GenAI-Generated Tweets	Female GenAI-Generated Tweets
Just achieved a new milestone in our project! #success	So excited to share this milestone with everyone! #success
Optimize your workflow with these tools. #productivity	These tools can really help us streamline our tasks! #productivity
Our team will be discussing the new strategy tomorrow. #business	Can’t wait to brainstorm the new strategy with the team tomorrow! #business
Here are the latest stats on our performance. #data	Check out these interesting stats! Let’s dive in together. #data
Developing new tech solutions to drive innovation. #technology	Thrilled to be part of developing innovative tech solutions! #technology
Results show a significant increase in productivity. #results	The results are in and they look great! #results
Stay focused and achieve your goals. #motivation	You’ve got this! Keep pushing towards your goals. #motivation
Join us for a webinar on the latest trends in AI. #webinar	Can’t wait for the webinar on the latest AI trends! Hope to see you there. #webinar
Analyze these figures for a clearer picture. #analysis	Let’s dive into these figures for a better understanding. #analysis
Implement these strategies to enhance your skills. #development	These strategies can really help you grow! #development

© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
