Building Political Hashtag Communities: A Multiplex Network Analysis of U.S. Senators on Twitter during the 2022 Midterm Elections

Orhan, Yunus Emre; Pirim, Harun; Akbulut, Yusuf

doi:10.3390/computation11120238


by Yunus Emre Orhan ¹, Harun Pirim ²,* and Yusuf Akbulut ²

¹ Center for the Study of Digital Society, North Dakota State University, Fargo, ND 58102, USA

² Industrial and Manufacturing Engineering, North Dakota State University, Fargo, ND 58102, USA

* Author to whom correspondence should be addressed.

Computation 2023, 11(12), 238; https://doi.org/10.3390/computation11120238

Submission received: 20 October 2023 / Revised: 19 November 2023 / Accepted: 20 November 2023 / Published: 1 December 2023

(This article belongs to the Special Issue Computational Social Science and Complex Systems)


Abstract:

This study examines how U.S. senators strategically used hashtags to create political communities on Twitter during the 2022 Midterm Elections. We propose a way to model topic-based implicit interactions among Twitter users and introduce the concept of Building Political Hashtag Communities (BPHC). Using multiplex network analysis, we provide a comprehensive view of elites’ behavior. Through AI-driven topic modeling on real-world data, we observe that, at a general level, Democrats heavily rely on BPHC. Yet, when disaggregating the network across layers, this trend does not uniformly persist. Specifically, while Republicans engage more intensively in BPHC discussions related to immigration, Democrats heavily rely on BPHC in topics related to identity and women. However, only a select group of Democratic actors engage in BPHC for topics on labor and the environment—domains where Republicans scarcely, if at all, participate in BPHC efforts. This research contributes to the understanding of digital political communication, offering new insights into echo chamber dynamics and the role of politicians in polarization.

Keywords:

multiplex network; hashtag activism; US Senate; Chat GPT-4

1. Introduction

The U.S. Senate stands as a polarized institution, marked by a party divide that fosters an environment of divisive political rhetoric. Daily, Democratic and Republican senators engage in contentious debates across a broad spectrum of issues. Concurrently, senators also utilize their media presence and direct communications with constituents as powerful platforms to launch political attacks against their opponents. While press releases have conventionally served as a means to reach a broader audience, the advent of social media, particularly Twitter, has ushered in an alternative communication method that circumvents traditional media channels [1,2,3,4,5].

Amid the ongoing and widespread political conflicts on Twitter, politicians differ notably in their level of participation. Some political figures engage in multiple subjects concurrently, others concentrate their efforts on a single issue, and a few remain disengaged from the politically charged discourse altogether. These dynamics have garnered significant attention in recent times, and scholars have shifted their focus from traditional sources of information to social media platforms (e.g., [4,6,7,8,9,10,11]). In line with this empirical trend, we undertake a specific analysis of the public Twitter activity of members of the U.S. Senate in the context of the 2022 midterm election.

Our work provides a comprehensive analysis of interactions among U.S. Senators by proposing an innovative way to model communication strategies among elites. First, instead of examining only the direct interactions on Twitter (i.e., follows, retweets, replies, or mentions), we delve deeper into the patterns of reciprocal hashtag usage to uncover another potential communication strategy among political figures. Second, contrary to a common approach that models elite Twitter interactions via monoplex networks, our study harnesses the advances of multiplex networks, a technique that political scientists have explored less. In our multiplex network construction, layers are defined by the reciprocal use of specific hashtags between pairs of accounts. Here, we introduce the concept of Building Political Hashtag Communities (BPHC). In the BPHC framework, the creation and maintenance of political communities on social media are not merely based on overt interactions like mentions or retweets, but more subtly through reciprocal and strategic hashtag use. The underlying premise of BPHC is that reciprocal involvement in hashtags is not coincidental but a deliberate communication strategy. By consistently echoing similar hashtags, politicians not only assert their standpoints but also subtly cultivate virtual communities around shared political concerns or values. This strategic echo, we argue, can critically serve to limit the information variety in their followers’ feeds, thereby reinforcing specific narratives or viewpoints. The continuous reciprocal involvement in these hashtags strengthens this echo chamber effect, establishing more cohesive and ideologically aligned online communities.

Do politicians strategically use hashtags to build their own communities? To address this question, we test our models on a real-world dataset capturing the Twitter interactions of American Senators during the midterm elections of 2022. Adding another layer of innovation, we introduce a novel topic modeling method, leveraging the capabilities of OpenAI’s ChatGPT-4. This approach empowers us to categorize a vast array of hashtags using both supervised and unsupervised techniques, effectively overcoming constraints seen in previous studies, such as sample restrictions or biased categorizations. Aligning with the current literature on congressional tweets (e.g., [4,6,7,8,9,10,11]), we find that certain subgroups among senators (Democrats, incumbents, females, and representatives from the southern region, as opposed to Republicans, candidates, males, and those from the northern region) cultivate deeper connections and foster more dynamic lines of communication to varying degrees. However, a more nuanced examination using advanced multiplex community analysis offers fresh insights into elite behavior on Twitter. Our multiplex analysis of Democratic and Republican networks shows that, at a general level, Democrats heavily rely on BPHC. Yet, when disaggregating the network across layers, this trend does not uniformly persist. Specifically, while Republicans engage more intensively in BPHC discussions related to immigration, Democrats heavily rely on BPHC in topics related to identity and women. However, only a select group of Democratic actors engage in BPHC for topics on labor and the environment, domains where Republicans scarcely, if at all, participate in BPHC efforts. These nuances underscore the multifaceted nature of BPHC strategies employed by different political actors.

This article makes three contributions. First, the extant literature predominantly employs monoplex network models, offering a limited perspective on the multifaceted interactions occurring on these platforms. To the best of our knowledge, this is the first multiplex analysis of politicians’ strategic hashtag involvement. Second, relying on AI-driven hashtag modeling to construct network layers, our study provides a more nuanced and comprehensive understanding of digital political communication. Finally, our findings also speak to existing research on polarization. Conventional wisdom asserts that elites are the source of mass polarization [12]. Evidence of how they do so strategically, however, remains scarce. We show how elites contribute to the echo chamber by strategically engaging with, or staying out of, specific topics.

Following this introduction, the paper presents a literature review, delineating the theoretical framework and empirical background of the study. Subsequently, we elaborate on the methodology employed for data collection and analysis. The findings section provides a detailed account of the study’s results, followed by a discussion integrating these findings with the existing literature and theoretical framework. The paper concludes with remarks on the study’s implications and suggestions for future research.

2. Related Literature

Crafting and maintaining a positive public image has always been a critical component of a politician’s overall strategy. Prior to the digital age, political communication primarily relied on in-person interactions. However, social media has created new direct (arguably more intimate) channels of communication between politicians and the electorate. New communication strategies (e.g., sharing daily activities [13] or replicating messages across platforms [14]) have given politicians the ability to easily shape their public image [15] and clearly express their positions on diverse political issues [16].

The use of online platforms by politicians, however, has evolved significantly in the last decade (see [13,17,18,19,20]). Interactions between politicians, such as retweets, replies, mentions, and likes, are now more common than ever before, especially since the 2016 election cycle (for a review, see [21]). Today, these platforms are not just communication tools but also serve as informational platforms for local and global content [22]. Politicians commonly use these platforms to influence media coverage of political discussions (see [23]) and promote their policies and ideological stances, either by supporting their colleagues or attacking opposition [4].

These dynamics have gained substantial attention in recent years, as scholars interested in examining the congressional ecosystem shift their focus from traditional sources like roll call data [24], bill text [25], or dissemination of news on media [26] to social networks [27,28,29,30,31,32]. We follow this empirical trend by specifically analyzing the public Twitter activity of U.S. Senate members in the context of the 2022 midterm election, and we offer three additional contributions.

First, recent works show that analyzing only direct interactions (i.e., follows, retweets, or mentions) misses a significant part of how politicians interact on Twitter: hashtags [30,31,33]. The importance of hashtag analysis comes from its ability to reveal underlying patterns in social media use that are not immediately apparent. Hashtag analysis is a powerful tool in social media conversations, providing insights into user interests, behaviors, and community dynamics [34]. Many of Twitter’s interactions unfold in this manner, with conversations defined by hashtags that enable users to engage with each other on particular topics directly (for a comprehensive review on hashtag usage research, see Pilar et al. [35]).

On Twitter, however, the use of a specific hashtag in a tweet does more than merely increase the visibility of that message. Hashtags can go beyond their bookmark function and serve as an implicit form of communication with other Twitter users who are utilizing the same hashtag, even if there is no direct connection between them. Unlike traditional follower-based networks, these hashtag-driven spaces for discourse do not necessitate mutual follower relationships, setting them apart from existing social media dynamics [36].

Recent research convincingly shows how hashtags can serve as symbols of communities, functioning as rallying points and banners where individuals with shared interests and concerns unite. This phenomenon has been termed the “imagined audience” in academic literature [37]. Yang et al. [38], for example, articulate the dual role of hashtags and show how hashtag involvement can capture adoption behavior by users and predict future engagements. It is evident that hashtags could serve as tools that facilitate the formation of conversational platforms where people can engage in discussions and contribute to the cultural interpretation of various subjects like racial justice, gender equality, and health matters [39]. Recent research by Hemphill et al. [30], for instance, identifies distinct “polarizing hashtags” that predict an author’s political party with high accuracy.

Building on this literature, we investigate whether political figures strategically employ the latter function of hashtags to establish their own communities, rather than solely using them as an effective means to increase the visibility of certain figures or merely express support for the ideas encapsulated within the hashtags. This communication strategy, which we term “Building Political Hashtag Communities” (BPHC), not only provides politicians with a direct means to connect and engage with citizens but also facilitates the organization of their base into distinct online groups through deliberate reciprocal usage of hashtags. Coordination network analysis, for example, has proven helpful for learning about how disinformation campaigns operate (see, for example, [40]); however, the usefulness of these tools for political hashtag community building is still an open question. While politicians can coordinate along multiple dimensions, our study is the first to investigate the coordination of hashtags. In this article, instead of focusing on the detection of hashtags mainly used by politicians, we construct a coordination network by connecting senators who use the same hashtag and address coordinated activity classification, providing evidence about whether the activity of senators appears to be part of a BPHC.

Second, in studies examining the implications of politicians’ social networks, a common approach to model Twitter interactions among politicians is to construct a monoplex network based on following/follower relations [27], or networks based on either retweets [28,29] or explicit mentions indicated by the “@” character [32]. Following a smaller, but growing, literature (see [31,33]), we model multiplex networks, an area that political scientists have explored less. Recent progress in the field of multiplex networks has indicated that examining multiple monoplex connections together could reveal new insights that are not apparent when looking at each connection separately. Bonifazi et al. [33], for example, enhance this discussion by applying multilayer network analysis to investigate various social phenomena, including the spread of misinformation and public reaction during different phases of the COVID-19 pandemic. Their findings underscore the power of multilayer network representations in analyzing user interactions and the spread of content across social platforms. Similarly, Hanteer and Rossi [31] introduce a thematic multiplex model to capture the nuanced interactions among Danish politicians during the parliamentary elections of 2015. By using a multiplex network model, their study unveils distinct community dynamics based on shared hashtags that traditional interaction models do not reveal, highlighting the strategic use of hashtags in shaping political narratives. We build upon this literature and expand its scope to examine how politicians may strategically use hashtags to shape the information environment of their political base.

Finally, we also offer a new, relatively more effective layer-building strategy. Prior studies that used hashtags to create multiplex/multilayer networks mainly depended on qualitative analyses to identify politically polarizing hashtags (see [31,33]). This approach not only raised the potential for bias but also restricted the sample of hashtags that researchers could work with. Thanks to recent innovations in generative pre-trained transformers, we introduce a new topic modeling method. Utilizing OpenAI’s ChatGPT-4, this strategy enables the categorization of a vast array of hashtags using both supervised and unsupervised techniques, overcoming previous limitations. To our knowledge, there has been no prior research focused on using AI-driven topic modeling on social media platforms for the purpose of identifying interactions among politicians.

3. Multiplex Networks

Complicated systems do not emerge from isolated networks with connections of identical significance and implications. Instead, they involve layers of interactions that are difficult to comprehend all at once. For example, in transportation networks, diverse modes of transportation can be differentiated (bus, subway, train, etc.). Likewise, in molecular biology and neural networks, interactions can possess varying meanings, emphasizing the necessity of examining these systems using a comprehensive approach that acknowledges the distinctions among different types of interactions. Within social networks, various kinds of social connections can be identified (friends, coworkers, acquaintances, family connections, etc.). The concept of multilayer networks was initially introduced in the realm of social science to describe the various categories of social connections that exist among the nodes within a social network [41]. Multiplex networks, a special form of multilayer networks, are commonly employed in scenarios where a consistent group of nodes is linked by connections that represent distinct kinds of interactions. A multiplex network is denoted as $M = (N, L, V, E)$, where $N$ refers to a set of nodes, $L$ represents a set of layers, and $(V, E)$ constitutes a graph. This structure adheres to the condition that for any edge $(n_{1}, l_{1}, n_{2}, l_{2}) \in E$, the layers $l_{1}$ and $l_{2}$ must be identical. In this study, we adopt a multiplex network representation of U.S. senators where each network layer is formed by a distinct hashtag relationship. Such a representation enables us to obtain macro- to micro-level analysis, such as comparison of layers using network metrics informing about similarity of layers, most sustained relationships through layers, revealing hidden communities accounting for all layers, and analyzing central senators across all layers.

3.1. Network Construction

We construct two main multiplex networks, one in a supervised fashion and the other in an unsupervised fashion. The supervised multiplex network has seven layers and the unsupervised one has six. The supervised network rests on the hypothesis that some prominent hashtags will be more informative for distinguishing echo chambers, inter-party relationships, regional differences, and gender preferences. The unsupervised network is constructed using the topic modeling capability of GPT-4. Our motivation is to contrast and learn from the two strategies, as the latter provides a more universal and unbiased approach to topic selection. Within a layer, two accounts are connected by an edge if both have used at least one hashtag belonging to that layer. A hashtag used by one account but not by the other does not connect them. A single shared hashtag suffices to add an edge, and the number of hashtags a pair shares is not counted. Multiplex network constructions are illustrated in Figure 1.
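The edge rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code from the project repository: the function name `build_multiplex`, the account names, and the hashtag-to-layer mapping are hypothetical, and all hashtags other than `#AZSEN` are made up for the example.

```python
from itertools import combinations


def build_multiplex(user_hashtags, hashtag_layer):
    """Return {layer: set of (u, v) edges} from shared hashtag use.

    user_hashtags: {account: set of hashtags the account used}
    hashtag_layer: {hashtag: layer name}; hashtags missing from this
                   mapping belong to no layer and create no edges.
    """
    layers = {layer: set() for layer in set(hashtag_layer.values())}
    # Visit every unordered pair of accounts once, in a stable order.
    for u, v in combinations(sorted(user_hashtags), 2):
        for tag in user_hashtags[u] & user_hashtags[v]:
            layer = hashtag_layer.get(tag)
            if layer is not None:
                # Sets ignore duplicates, so multiplicity is not counted.
                layers[layer].add((u, v))
    return layers


# Hypothetical toy data.
user_hashtags = {
    "senator_a": {"#AZSEN", "#SecureTheBorder"},
    "senator_b": {"#SecureTheBorder", "#Inflation"},
    "senator_c": {"#AZSEN", "#Inflation"},
}
hashtag_layer = {"#SecureTheBorder": "immigration", "#Inflation": "economy"}

multiplex = build_multiplex(user_hashtags, hashtag_layer)
```

In this toy example `senator_a` and `senator_c` share `#AZSEN`, but because that hashtag is assigned to no layer, the pair stays unconnected. Storing edges in a set per layer mirrors the rule that one shared hashtag is enough and further shared hashtags add nothing.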

3.2. Sub-Networks

While senator-level networks offer distinct information, investigations also extend to party-level, gender-level, and region-level networks. To obtain the party-level networks, the nodes of the senator-level networks are subset to retain only Republican or only Democratic senators, preserving the existing edges among the retained nodes. A similar subsetting approach is employed to obtain regional, gender, and candidate networks.
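As a rough sketch of this subsetting step, the snippet below keeps only the nodes with a given metadata value and the edges among them, layer by layer, then flattens the result and computes its density. The helper names, node labels, and party codes are hypothetical, and the layer dictionary format (layer name mapped to a set of node pairs) is an assumption made for illustration.

```python
def induced_subnetwork(layers, node_attr, value):
    """Keep nodes whose attribute equals `value` and the edges among them."""
    keep = {node for node, attr in node_attr.items() if attr == value}
    sub = {
        layer: {(u, v) for (u, v) in edges if u in keep and v in keep}
        for layer, edges in layers.items()
    }
    return sub, keep


def flatten(layers):
    """Union of the edges of all layers (the flattened network)."""
    return set().union(*layers.values()) if layers else set()


def density(n_nodes, n_edges):
    """Share of possible undirected edges that are present."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


# Hypothetical toy data: four accounts, two layers.
layers = {
    "immigration": {("a", "b"), ("b", "c")},
    "economy": {("a", "b"), ("c", "d")},
}
party = {"a": "R", "b": "R", "c": "D", "d": "D"}

sub_r, nodes_r = induced_subnetwork(layers, party, "R")
flat_r = flatten(sub_r)
density_r = density(len(nodes_r), len(flat_r))
```

The cross-party edge `("b", "c")` disappears from the Republican sub-network because only one of its endpoints is retained. The same function would apply unchanged to region, gender, or candidate status by passing a different attribute dictionary.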

3.3. Network Metrics

We employ descriptive network analysis metrics for the flattened and multiplex networks. The metrics used for the flattened networks are density, number of connected components, and average shortest path length. For the multiplex networks, we use the Jeffrey degree and Pearson degree measures, which together allow a comprehensive examination of both dissimilarity and similarity across layers. Jeffrey degree is a statistical measure that compares the degree distributions of two layers of a multiplex network and quantifies the ‘dissimilarity’ between them. We can express the dissimilarity using the following formula [42]:

$$\sum_{k=1}^{K} \mathit{fr}_{k,l_{1}} \log \frac{\mathit{fr}_{k,l_{1}}}{\mathit{fr}_{k,l_{2}}} + \sum_{k=1}^{K} \mathit{fr}_{k,l_{2}} \log \frac{\mathit{fr}_{k,l_{2}}}{\mathit{fr}_{k,l_{1}}}$$

where $\mathit{fr}_{k,l_{1}}$ represents the relative frequency of degree value $k$ in layer $l_{1}$. Higher values of the Jeffrey degree indicate increased dissimilarity between the degrees of nodes in the two layers, offering valuable insights into the structural differences across multiplex network strata. In contrast, Pearson degree measures the ‘similarity’ between the degrees of nodes across each pair of layers, calculated as the Pearson correlation coefficient of the two degree vectors. We can express degree correlations using the following formula [42]:

$$\frac{[p_{l_{1}} - \mathrm{mean}(p_{l_{1}})]' \cdot [p_{l_{2}} - \mathrm{mean}(p_{l_{2}})]}{\lVert p_{l_{1}} - \mathrm{mean}(p_{l_{1}}) \rVert \cdot \lVert p_{l_{2}} - \mathrm{mean}(p_{l_{2}}) \rVert}$$

where $p_{l_{1}}$ represents the degree vector of layer $l_{1}$. Positive values of Pearson degree indicate that nodes with a high degree in one layer tend to have a high degree in the other (assortativity), while negative values suggest the opposite pattern (disassortativity). We used the R igraph [43] and multinet [44] libraries for computing these metrics.
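To make the two layer-comparison formulas concrete, here is a minimal pure-Python sketch. It is not the multinet implementation and is not guaranteed to reproduce its numbers: the function names are hypothetical, the sum runs over the degree values observed in either layer, and a small smoothing constant `eps` is an added assumption so that a degree value absent from one layer does not produce a division by zero.

```python
import math
from collections import Counter


def degree_vector(nodes, edges):
    """Degree of every node, in the order given by `nodes`."""
    deg = {n: 0 for n in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [deg[n] for n in nodes]


def relative_frequencies(degs, support, eps):
    """Smoothed relative frequency of each degree value in `support`."""
    counts = Counter(degs)
    total = len(degs) + eps * len(support)
    return [(counts[k] + eps) / total for k in support]


def jeffrey_degree(d1, d2, eps=1e-9):
    """Symmetric divergence between two degree distributions."""
    support = sorted(set(d1) | set(d2))
    f1 = relative_frequencies(d1, support, eps)
    f2 = relative_frequencies(d2, support, eps)
    return sum(a * math.log(a / b) + b * math.log(b / a)
               for a, b in zip(f1, f2))


def pearson_degree(d1, d2):
    """Pearson correlation between two degree vectors."""
    m1, m2 = sum(d1) / len(d1), sum(d2) / len(d2)
    c1 = [x - m1 for x in d1]
    c2 = [x - m2 for x in d2]
    num = sum(a * b for a, b in zip(c1, c2))
    den = math.sqrt(sum(a * a for a in c1)) * math.sqrt(sum(b * b for b in c2))
    return num / den if den else float("nan")


# Hypothetical toy layers on the same four nodes.
nodes = ["a", "b", "c", "d"]
d1 = degree_vector(nodes, {("a", "b"), ("a", "c"), ("a", "d")})  # [3, 1, 1, 1]
d2 = degree_vector(nodes, {("a", "b"), ("a", "c")})              # [2, 1, 1, 0]

jeffrey = jeffrey_degree(d1, d2)
pearson = pearson_degree(d1, d2)
```

In this toy case node `a` is the hub in both layers, so the Pearson degree is positive, while the Jeffrey degree is greater than zero because the two degree distributions differ. Comparing a layer with itself gives a Jeffrey degree of zero.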

4. Overview of the Dataset

4.1. Data Collection

Exploring the intricate networks of the U.S. Senate on Twitter, this article delves into the patterns and dynamics exhibited by senators. Data used to construct this network were gathered from a period spanning one month before and one month after the election. Although the roots of polarization might be more easily discerned in the U.S. House, the American politics literature indicates that the Senate has progressively become a chamber marked by polarization, mirroring the House [4,5,45,46].

Twitter serves as a particularly valuable arena for examining the networks of the U.S. Senate for two main reasons. First, it is an influential platform that facilitates the dissemination of new information by Senators. Second, the extensive quantity and diversity of tweets from politicians enable an in-depth analysis of their readiness to engage in partisan conflicts. Additionally, Twitter has become a leading channel for political communication among Senators and a stage where partisan battles frequently unfold.

In the context of the 2022 midterm elections, our study aims to assess the extent to which BPHC strategy is followed by different senator networks constructed by partisan alignment, region, incumbency, and gender. This study encompasses each senator’s Twitter activity from 1 October 2022, to 12 December 2022. Our sample comprises 137 observations, encompassing various senatorial types, which provides a comprehensive perspective on different electoral outcomes and statuses. All descriptive details are documented in Table 1. The majority of our sample, 65 senators, served in the preceding Senate term and did not participate in the recent election yet continue their senatorial roles. Overall, 24 Senators from our sample served in the prior term, entered the election, secured a victory, and retained their seats. A subset, constituting 5 Senators, were incumbents who contested the election but faced defeat and subsequently no longer serve in the Senate. Another 11 politicians in our data did not serve in the preceding Senate term, contested and won the recent election, and are now serving as Senators. Finally, the dataset includes 32 individuals who were not Senators in the prior term, contested the recent election, faced defeat, and thus remained outside the Senate. It is noteworthy that our dataset does not encompass a distinct group of 6 Senators who were incumbents but opted for resignation, and hence no longer hold office. Out of the 137 Twitter users in our sample, we were unable to obtain data for 9 of them for various reasons, such as not having a Twitter account or not tweeting within the specified time frame.

Tweet data were collected using Twitter’s official Academic API, which enabled researchers to access Twitter’s historical data and advanced search features through API v2 (this free API for academics was discontinued by Twitter on 27 March 2023). To interact with the API, we employed the Tweepy [47] package, a user-friendly Python library that streamlines data extraction and parsing. The Tweepy package also features capabilities to handle rate limits imposed by Twitter. We created a Python codebase that saves the API data, in JSON format, into an SQLite database, a lightweight and efficient database management system well suited for small- to medium-sized projects. This allowed us to query our API data, whether in table format or JSON format. We stored fields such as entities, where hashtag data is stored, in JSON columns, while fields like tweet or user IDs were kept in standard SQLite columns. The combination of the Academic API, Tweepy, and SQLite (see Figure 2) enabled us to efficiently collect, manage, query, and analyze the dataset for our research. All code used in this project is available in our GitHub repository (https://github.com/harunpirim/multilayer_polarization/ accessed on 21 November 2023).
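The storage layout described above (plain columns for IDs, a JSON column for entities) might look roughly like the following sketch, which uses only the Python standard library and an in-memory database. The table name, column names, and helper functions are hypothetical and are not taken from the project repository; the sample tweets assume the API v2 layout in which hashtags appear under `entities.hashtags` with a `tag` field.

```python
import json
import sqlite3

# In-memory database for illustration; a file path would persist the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets ("
    " tweet_id TEXT PRIMARY KEY,"
    " author_id TEXT NOT NULL,"
    " entities TEXT)"  # JSON stored as text
)


def save_tweet(conn, tweet):
    """Store IDs in regular columns and the entities object as JSON text."""
    conn.execute(
        "INSERT OR REPLACE INTO tweets VALUES (?, ?, ?)",
        (tweet["id"], tweet["author_id"], json.dumps(tweet.get("entities", {}))),
    )


def hashtags_by_author(conn):
    """Return {author_id: set of hashtags} parsed from the JSON column."""
    result = {}
    rows = conn.execute("SELECT author_id, entities FROM tweets")
    for author_id, entities in rows:
        tags = json.loads(entities).get("hashtags", [])
        result.setdefault(author_id, set()).update("#" + t["tag"] for t in tags)
    return result


# Hypothetical sample records shaped like API responses.
save_tweet(conn, {"id": "1", "author_id": "u1",
                  "entities": {"hashtags": [{"start": 0, "end": 6, "tag": "AZSEN"}]}})
save_tweet(conn, {"id": "2", "author_id": "u1"})  # tweet without entities
save_tweet(conn, {"id": "3", "author_id": "u2",
                  "entities": {"hashtags": [{"tag": "AZSen"}]}})

by_author = hashtags_by_author(conn)
```

Parsing the JSON in Python rather than inside SQL keeps the sketch independent of which SQLite extensions are compiled in. The resulting per-author hashtag sets are the natural input for building network layers.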

In our research, we formed a collaboration with the Digital Society Project (http://digitalsocietyproject.org/ accessed on 21 November 2023), renowned for its extensive collection of election-centric data both in the US and globally. This project provided us with datasets containing the Twitter accounts of senatorial candidates. These datasets meticulously catalog the official Twitter handles of the candidates. Notably, we observed that some candidates had multiple official accounts listed. To maintain consistency in our analysis, we consolidated these multiple accounts into a single representation for each candidate. This approach was justified as most candidates predominantly used one main Twitter account during our study’s timeframe.

For senators who were not identified as candidates in the Digital Society Project’s dataset, we sourced their Twitter data from an alternative repository, found at https://github.com/unitedstates/congress-legislators (accessed on 21 November 2023). Leveraging these two data sources allowed us to enrich our tweet dataset with critical metadata elements, including Party Identity, Incumbency, Region, and Sex. These metadata features were instrumental in creating network layers for our analysis. To manage and process this data effectively, we employed the Python Pandas package [48], which enabled us to efficiently process query results from our SQLite database and seamlessly integrate the metadata, facilitating the construction of comprehensive and layered networks based on the tweet data.

4.2. Hashtag Topic Modeling

The process of data construction in our study is illustrated in Figure 2. Our approach involved constructing a network data model grounded in the shared use of hashtags. In this model, a connection, or ‘edge’, was established between two users when they utilized the same hashtag at least once. To extract meaningful insights from these networks, we categorized hashtags into groups based on their semantic content. This categorization was facilitated by the use of ChatGPT-4, an advanced language model known for its proficiency in classical NLP tasks such as named entity recognition, topic modeling, and summarization. Studies indicate that ChatGPT-4 performs on par with or surpasses current state-of-the-art models in these domains [49,50,51,52]. Our application of ChatGPT-4 involved feeding it hashtags, to which it responded by identifying their inherent meanings, thus enabling us to understand the underlying thematic connections among the hashtags used by different political figures. This methodology is pivotal in revealing the nature of political discourse and alignments within the Twitter sphere, linking directly back to the core objectives of our research.

In the hashtag clustering task we are interested in, there is a twist that classical NLP models, like topic modeling, cannot handle. Our goal is to cluster these hashtags according to their semantic relation with the tweets, which requires context information about the hashtags themselves. Training or fine-tuning a specific model like RoBERTa, or other state-of-the-art language models, for this task requires extensive training data, labor-intensive labeling, and expertise in deep learning and in fine-tuning language models, which translates into significant budget and time concerns. Instead of taking this route, we crafted an innovative approach using a large language model (LLM), specifically ChatGPT-4, since it is reported to be the best-performing one [53]. Considering its vast training dataset, our hypothesis was that using an LLM would enable us to capture the inherent semantic meanings needed for clustering these hashtags into meaningful groups, which would then be used to create semantic networks. ChatGPT-4 excels when the task requires context awareness with tailored prompting specific to the case [49]. In line with this, we used various prompts to achieve the desired result, which was to cluster as many hashtags as possible into groups. Before determining the right prompts for our task, we tested several (see Table A3 in Appendix A). To ensure the accuracy of our results, we carefully reviewed all hashtag clusterings, drawing on our expertise in polarization.

During our analysis, we faced a significant challenge with hashtags that had multiple spelling variants, such as ‘#AZSen’, ‘#AZSEN’, and ‘#AZsen’. This variation in spellings was problematic for accurately constructing our network, as it prevented us from linking users who used different variants of the same hashtag. To resolve this, we employed OpenRefine [54], a powerful tool designed for cleaning and transforming messy data. Through OpenRefine, we applied eight different algorithms specifically tailored to identify and unify similar texts. This data cleaning step was crucial; it reduced the total number of distinct hashtags from 1964 to 1660, significantly enhancing the accuracy of our network analysis. For example, prior to cleaning, the three variants of ‘#AZSEN’ were treated as separate entities, hindering our ability to establish connections between users employing different versions of this hashtag. After unifying these variants into a single ‘#AZSEN’ tag using OpenRefine, we could accurately link users, reflecting a more authentic representation of the online interactions. This process not only corrected mismatches but also enriched our network, increasing the total number of usable hashtags from 376 to 406. This improvement in data quality allowed us to create richer and more precise networks, capturing a more accurate picture of the hashtag-driven interactions among users.
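The effect of unifying spelling variants can be illustrated with a deliberately simple stand-in. The sketch below only normalizes letter case and surrounding whitespace; it is not one of the OpenRefine clustering methods and would miss variants that differ in more than case. The function names and the `#Inflation` hashtag are hypothetical.

```python
def canonical_form(tag):
    """Map a hashtag to an upper-case canonical spelling."""
    return "#" + tag.strip().lstrip("#").upper()


def unify_hashtags(user_hashtags):
    """Replace every hashtag in {account: set of hashtags} by its canonical form."""
    return {
        account: {canonical_form(tag) for tag in tags}
        for account, tags in user_hashtags.items()
    }


# Hypothetical toy data: three accounts, each using a different variant.
raw = {
    "senator_a": {"#AZSen"},
    "senator_b": {"#AZSEN", "#Inflation"},
    "senator_c": {"#AZsen"},
}
clean = unify_hashtags(raw)

distinct_before = len(set().union(*raw.values()))
distinct_after = len(set().union(*clean.values()))
```

Before unification no two of the three toy accounts share a hashtag; afterwards all three share `#AZSEN`, so the number of distinct hashtags drops while the number of hashtags that can link accounts rises, the same direction of change reported above.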

We implemented two distinct methods for clustering hashtags: ‘loosely supervised’ and ‘unsupervised’. In the loosely supervised approach, we directed ChatGPT-4 to consider key issues that are central to polarization in US politics. These issues included health care, immigration, climate change and environmental policy, gun control, economic policy, racial and social justice, abortion, and LGBTQ+ rights. This method allowed us to observe how hashtags aligned with these specific political topics. Conversely, for the unsupervised method, we did not provide ChatGPT-4 with any predefined context. Instead, we allowed the model to cluster hashtags based purely on their inherent patterns and relationships, offering insights into natural groupings that emerge from the data itself.

After the initial clustering by ChatGPT-4 in both methods, we conducted a second refinement phase to make the clusters more relevant and coherent. For example, in this phase, we grouped similar themes like ‘Appreciation Veterans’, ‘Appreciation Youth’, and ‘Awareness Adoption’ into a broader category labeled ‘Awareness & Appreciation’. This process of refinement and categorization led to the creation of more meaningful networks. A detailed list of these refined clusters can be found in our GitHub repository (https://github.com/harunpirim/multilayer_polarization/tree/main/hashtags accessed on 21 November 2023). The results of our clustering efforts, encompassing both the supervised and unsupervised methods, are summarized in Table 2. This table provides an overview of the hashtag groups and their respective categorizations, illustrating the diverse ways in which political discourse manifests on Twitter.
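To make the link between hashtag categories and network layers concrete, the following sketch builds one edge set per category. It assumes that two senators are connected in a layer when they have both used the same hashtag and that hashtag belongs to the layer's category; the function name `build_layers`, the senator labels, and the toy hashtags are hypothetical and are not drawn from our data.

```python
from collections import defaultdict
from itertools import combinations

def build_layers(user_hashtags, hashtag_category):
    """Return {category: set of undirected (u, v) edges}.

    Assumption for this sketch: an edge links two users who share at
    least one hashtag, and it is placed in that hashtag's category layer.
    """
    users_by_tag = defaultdict(set)
    for user, tags in user_hashtags.items():
        for tag in tags:
            users_by_tag[tag].add(user)

    layers = defaultdict(set)
    for tag, users in users_by_tag.items():
        category = hashtag_category.get(tag)
        if category is None:            # hashtag was not assigned to any layer
            continue
        for u, v in combinations(sorted(users), 2):
            layers[category].add((u, v))
    return dict(layers)

# Toy data with hypothetical senators A, B, and C.
user_hashtags = {
    "A": {"#ACA", "#Jobs"},
    "B": {"#ACA"},
    "C": {"#Jobs", "#Border"},
}
hashtag_category = {"#ACA": "Health", "#Jobs": "Economy", "#Border": "Immigration"}
layers = build_layers(user_hashtags, hashtag_category)
```

Here the Health layer contains the edge A–B and the Economy layer the edge A–C, while no Immigration edge appears because only one senator used a hashtag from that category.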

5. Results

We begin by presenting the descriptive results, comparing networks that utilize different modeling strategies (i.e., supervised and unsupervised) across various analytical levels (i.e., Party ID, Incumbency, Sex, and Region).

Figure 3 immediately makes apparent the intricate dynamics of Senate collaborations. (For a detailed breakdown of edge distribution across layers, along with confidence intervals for proportions, see Table A1 and Table A2 in Appendix A.) It reveals that certain groups consistently demonstrate denser BPHC networks than their counterparts. Notably, Democrats, incumbents, females, and representatives from the southern region (as opposed to Republicans, candidates, males, and those from the northern region) display heightened collaboration and interconnectedness across hashtags. These patterns prevail in both supervised and unsupervised networks, with a more pronounced prominence in the former. However, a notable exception is the similarity observed between incumbent and candidate senators within unsupervised networks.
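For readers who want to see how such group comparisons can be computed, the sketch below calculates the density of the subnetwork induced by one group, that is, the share of possible within-group pairs that are actually connected. The helper `group_density`, the node labels, and the edges are illustrative assumptions; in particular, whether density should be taken over all group members or only over active accounts is a modeling choice that this example does not settle.

```python
def group_density(edges, group_of, group):
    """Density of the subgraph induced by one group of nodes.

    edges    : iterable of (u, v) pairs, treated as undirected
    group_of : dict mapping node -> group label (hypothetical labels below)
    group    : label of the group whose internal density is wanted
    """
    members = {node for node, label in group_of.items() if label == group}
    # Keep each undirected within-group edge once and ignore self-loops.
    within = {
        frozenset((u, v))
        for u, v in edges
        if u in members and v in members and u != v
    }
    possible = len(members) * (len(members) - 1) / 2
    return len(within) / possible if possible else 0.0

# Toy network: three nodes per group, one cross-group edge.
group_of = {"A": "Dem", "B": "Dem", "C": "Dem", "X": "Rep", "Y": "Rep", "Z": "Rep"}
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("X", "Y"), ("A", "X")]
dem_density = group_density(edges, group_of, "Dem")   # 3 of 3 possible pairs
rep_density = group_density(edges, group_of, "Rep")   # 1 of 3 possible pairs
```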

A deeper examination through distinct layers provides further insights. While some layers maintain consistent patterns, others reveal subtle variations that underscore the multidimensional nature of polarization in U.S. politics. Economy-related hashtags in supervised networks and awareness-related hashtags in unsupervised networks seem to bolster in-group communication. Another interesting trend is that topics related to social justice—such as identity, labor, women, and rights—emerge as the primary sources of division and polarization among politicians.

When examining differences across various analytical categories, it becomes evident that the underlying network structures could shape the dynamics within distinct layers. For instance, while Democrats generally have denser connections than Republicans, this pattern is somewhat weakened in the economy layer, reversed in the immigration and state politics layers, and unchanged in other areas. Notably, metrics like density and the number of edges demonstrate that by constructing a multiplex network using supervised topic modeling with a focus on reciprocal hashtag usage, we can predict senators’ party affiliation with a minimum accuracy of 82% in the immigration layer and with accuracy levels ranging from 93% (i.e., Health layer) to a flawless 100% (i.e., Labor layer) across all other topics, excluding the economy.

The pattern for other analytical categories also exhibits interesting shifts. The trend of female senators forming denser networks than their male counterparts is mostly consistent but weakens in the media layer and inverts in the federal/state politics and environment-associated hashtag groups. In a similar vein, while incumbents typically have denser connections than candidates, this trend diminishes in the identity and federal politics layers and reverses in areas such as campaign, state politics, immigration, and women-centric hashtags. Finally, the propensity for southern senators to have denser networks than their northern peers is evident in most layers but diminishes in the campaign layer and flips in the environment, identity, women, and rights layers. It is noteworthy that senators from the South show no involvement in labor-related issues.

Overall, both supervised and unsupervised networks provide valuable insights. They reinforce our initial hypothesis that elements such as gender, geography, incumbency, and party affiliation have a pronounced impact on the structure of politicians’ online networks. For the subsequent analysis, however, we focus on the supervised networks among Democrats and Republicans. Adopting an unsupervised topic modeling strategy leads to a greater number of edges, signifying a more densely connected network than what is observed in supervised networks. This outcome is not surprising, given that the hashtag sample size in unsupervised modeling is nearly twice as large as that in supervised modeling. However, the advantages of supervised networks are also clear: they offer a more detailed view of polarization dynamics among elites. For example, in networks utilizing supervised modeling, the density among Democrats is five times that of Republicans, whereas in networks based on unsupervised modeling this ratio is only 1.8. Another compelling observation is that while networks based on supervised modeling capture several layers that are otherwise omitted (e.g., labor, women), those relying on unsupervised structures fail to discern such nuanced divisions. We conclude that these qualitative advantages allow us to examine more closely how specific thematic areas contribute to the formation of echo chambers and polarization in US politics.

We begin our multiplex analysis using Jeffrey degree and Pearson degree correlations for pairwise layer comparisons across political parties. These layer comparison matrices, visualized as heatmaps in Figure 4, offer further insights into the nuanced landscape of political communication and polarization within the U.S. Senate.

The Jeffrey degree dissimilarity function calculates differences between degree distributions of layers, with higher values indicating greater differences. Among Democrats, Jeffrey degree values are noticeably elevated when comparing the identity layer with those of women, labor, immigration, and especially environment. This suggests a unique assembly of senators engaging with identity-related discussions, underlining the range of perspectives within Democratic dialogues on identity. For Republicans, our data reveal pronounced Jeffrey degree values between the economy layer and most other layers, which implies that discussions related to the economy involve a group of actors distinct from those engaged in other topics.

Examining Pearson degree scores, which provide correlations between actors’ degrees across different layers, reveals more nuances. For Democrats, there is a predominant positive correlation across various topics, suggesting aligned actor activity levels and possibly indicating coordinated communication strategies or shared priorities. For instance, there are notable positive correlations between economy and other layers like health, identity, and labor, implying actors active in economy discussions are similarly engaged in these areas. On the other hand, the Republican network presents varied Pearson degree correlations. Positive correlations are observed between economy and health, and economy and immigration, while a negative correlation is evident between identity and almost all other layers. This variation suggests that actors active in economy discussions within the Republican network may not be as engaged in identity-related dialogues, pointing to divergent engagement patterns and potentially different focal points or priorities among Republican actors.
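The two layer-comparison measures discussed above can be sketched in a few lines. The example below assumes that the Jeffrey measure is the symmetric Kullback-Leibler divergence between normalized degree histograms and that the Pearson measure is the ordinary correlation between the same actors' degrees in two layers; the smoothing constant `eps`, the helper names, and the toy degree vectors are illustrative choices, and a multilayer network library may bin or smooth the distributions differently.

```python
import math
from collections import Counter

def degree_histogram(degrees, max_degree):
    """Share of actors with degree 0, 1, ..., max_degree."""
    counts = Counter(degrees)
    return [counts.get(k, 0) / len(degrees) for k in range(max_degree + 1)]

def jeffrey_divergence(p, q, eps=1e-9):
    """Symmetric KL divergence; eps avoids log(0) (arbitrary illustrative value)."""
    p = [x + eps for x in p]
    q = [x + eps for x in q]
    p = [x / sum(p) for x in p]
    q = [x / sum(q) for x in q]
    return sum((a - b) * math.log(a / b) for a, b in zip(p, q))

def pearson(x, y):
    """Pearson correlation between two equally long degree vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")             # undefined for a constant vector
    return cov / math.sqrt(vx * vy)

# Degrees of the same five hypothetical actors in three layers.
economy = [4, 3, 2, 1, 0]
health = [8, 6, 4, 2, 0]
identity = [0, 1, 2, 3, 4]

max_deg = max(economy + health + identity)
j_econ_health = jeffrey_divergence(degree_histogram(economy, max_deg),
                                   degree_histogram(health, max_deg))
j_econ_identity = jeffrey_divergence(degree_histogram(economy, max_deg),
                                     degree_histogram(identity, max_deg))
r_econ_health = pearson(economy, health)
r_econ_identity = pearson(economy, identity)
```

The toy layers show why the two measures are complementary. The economy and identity layers have identical degree distributions, so their Jeffrey value is zero, yet the actors who are most connected in one are least connected in the other, which yields a strongly negative Pearson correlation.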

Figure 5 presents an overview of the flattened network that spans all layers, illustrating the distribution of party relations across diverse topic layers. Each layer’s relationships are drawn in a distinct color. Notably, the ‘Economy’ topic, represented by the purple edges, stands out as the predominant theme, making up 41.17% of the relationships. Following the economy, the ‘Identity’ and ‘Health’ topics emerge as the next most dominant, accounting for 25.89% and 22.53%, respectively. At the lower end, ‘Women’, ‘Environment’, and ‘Labor’ surface as the least addressed topics, each representing less than 3% of the relationships. This observation suggests that while these topics retain importance, they may not be at the core of the BPHC communication strategy as much as the more prevalent subjects. The node sizes, on the other hand, are scaled based on their degrees, indicating the number of connections. Democrats are represented by blue nodes, Republicans by red, and the gray nodes likely depict Independents. The figure clearly underscores that Democratic senators are considerably more engaged in BPHC across various topics compared to their Republican peers.

Figure 6 offers additional insights by detailing the relationships either between or within parties for each layer individually. In the Economy layer, the dense clustering of both blue and red nodes reaffirms that economic issues garner significant attention from both parties, positioning it as a central policy discussion area. In the Health layer, both parties seem notably engaged, though there is a modest prevalence of blue nodes. Intriguingly, in the Immigration layer, we observe two distinct clusters. This implies that even though both parties engage with BPHC, they might be approaching from unique viewpoints or potentially targeting their respective communities with specific hashtags. For the remaining layers, which include Women, Environment, Identity, and Labor, there is an implicit pattern of more pronounced Democratic involvement. It is particularly noteworthy that the ‘Identity’ layer, despite ranking as the second most significant in relationship density, is overwhelmingly dominated by blue nodes.

Figure 5 and Figure 6 shed light on the intricacies of the topics that feature prominently in the BPHC strategies across different political parties. They offer a multi-layered understanding of topic engagement but do not clearly indicate if parties pivot their BPHC strategies to break away from echo chambers, a phenomenon where ideas and beliefs are amplified by repeated internal communication. To obtain a clearer perspective on which topics serve as catalysts for inter-party dialogues, we further filtered the data to focus solely on inter-party edges, as presented in Figure 7. At first glance, it is evident that the Economy and Health layers act as significant platforms for fostering dialogue between the parties. These layers brim with interactions, underscoring their roles as central arenas for bipartisan discussions. In stark contrast, the Immigration and Labor layers tell a different story. They are notably devoid of any inter-party conversations, suggesting that, at least within the context of BPHC strategies, these topics might be approached with more insular perspectives by each party. The implications could be that discussions on these layers are more internally focused, potentially amplifying party-specific narratives.
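The filtering step described above amounts to keeping, in each layer, only those edges whose endpoints belong to different parties. The sketch below shows one way to do this; the function name `inter_party_edges`, the party labels, and the toy edges are hypothetical, and the handling of Independents (here, any pair with different labels counts as inter-party) is an assumption.

```python
def inter_party_edges(layers, party_of):
    """Keep only edges that connect nodes with different party labels.

    layers   : dict mapping layer name -> set of (u, v) edges
    party_of : dict mapping node -> party label
    """
    return {
        layer: {(u, v) for u, v in edges if party_of[u] != party_of[v]}
        for layer, edges in layers.items()
    }

# Toy multiplex network with two hypothetical senators per party.
party_of = {"A": "Dem", "B": "Dem", "X": "Rep", "Y": "Rep"}
layers = {
    "Economy": {("A", "B"), ("A", "X"), ("B", "Y")},
    "Immigration": {("A", "B"), ("X", "Y")},
}
cross = inter_party_edges(layers, party_of)
```

In this toy case the Economy layer retains two cross-party edges, whereas the Immigration layer is left empty because all of its edges stay within a party.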

6. Discussion

Hashtags play a crucial role on social media platforms, functioning not only as bookmarks for content but also as symbols representing communities. On the one hand, hashtags connect tweets with similar subjects, simplifying the process of identifying and categorizing discussions related to specific hashtags [55]. On the other hand, hashtags go beyond their functional utility—they serve as symbols of communities, functioning as rallying points and banners where individuals with shared interests and concerns unite [38].

This article investigates whether political figures strategically utilize the latter function of hashtags to establish their own communities, rather than solely using them as an effective means to increase the visibility of certain figures, launch attacks against opponents, or merely express support for the ideas encapsulated within the hashtags. This communication strategy, which we term “Building Political Hashtag Communities” (BPHC), not only provides politicians with a direct means to connect and engage with citizens but also facilitates the organization of their base into distinct online groups through deliberate reciprocal usage of hashtags.

Our research sheds light on the complex dynamics within the Senate, particularly regarding the use of BPHC as a communication strategy. Our results reveal that Democrats, incumbents, women, and representatives from the southern region are more actively engaged in this BPHC strategy, leading to denser collaboration networks compared to their counterparts (namely Republicans, candidates, men, and representatives from the northern region). These findings align with prior research (e.g., [4,6,7,8,9,10,11]) on congressional tweets, confirming that certain subgroups among politicians could cultivate deeper connections and foster more immediate and dynamic lines of communication to varying degrees.

However, a more nuanced examination using advanced multiplex community analysis offers fresh insights into elite behavior on Twitter. Our multiplex analysis of Democratic and Republican networks reveals areas within the political spectrum, particularly in economic policy discussions, where the conventional wisdom of diverging characteristics may not entirely hold. In these contexts, the internal dynamics of interaction appear to be consistent across party lines, indicating that both Republicans and Democrats employ BPHC as their primary communication strategy. Furthermore, while Republicans engage more intensively in BPHC discussions related to immigration, Democrats heavily rely on BPHC in topics related to identity and women. Only a select group of Democratic actors use BPHC for discussions on labor and the environment, areas where Republicans are rarely or never involved in BPHC efforts. More interestingly, using hashtag involvement detection algorithms, and without considering the content of the tweets, we are able to infer the political affiliation of each politician with up to 100.0% accuracy in the Senate. Considering the similar levels of Twitter activity (e.g., number of tweets sent, number of hashtags used) among party members, as shown in Table 1, these nuances underscore the multifaceted nature of the BPHC strategies employed by different political actors.

In conclusion, our study underscores the significance of BPHC as a communication strategy in the political arena. It reveals complex patterns of engagement and highlights the potential for cross-party collaboration within certain BPHC networks. As we continue to navigate the digital age, understanding the dynamics of BPHC and its implications for political discourse will be essential for both scholars and practitioners in the field of political communication.

The impact of BPHC on public opinion warrants further exploration. While our study has focused on the strategies employed by political actors, it remains an open question whether BPHC influences the beliefs and attitudes of regular social media users. Future research could examine the extent to which engagement with BPHC content shapes public discourse and perceptions. Does exposure to BPHC discussions lead to greater polarization, or does it encourage more nuanced and informed perspectives among social media users? Additionally, the longevity of BPHC communities and their adaptability over time present intriguing possibilities. How do these communities evolve in response to changing political landscapes, emerging issues, and shifts in public sentiment? Can BPHC strategies remain effective in an environment where the social media landscape is constantly evolving?

We also acknowledge a limitation of our study: the absence of a time-based analysis distinguishing pre- and post-election periods. Future research could compare behavior patterns before and after the election, which would show how BPHC strategies evolve over time, particularly in response to shifting political climates, emerging policy challenges, and the continuous evolution of social media as a tool for political communication and public engagement.

Exploring these questions can provide valuable insights into the evolving nature of political communication and online communities. This research lays the groundwork for an expanded understanding of political interactions on social media, providing a lens through which we can better comprehend the nuanced dynamics of political communication in the digital age. To optimize our network construction strategy, we also introduce a new topic modeling method that uses OpenAI’s ChatGPT-4. This strategy enables the categorization of a vast array of hashtags using both supervised and unsupervised techniques, overcoming previous limitations (e.g., sample restrictions or biased categorizations). Future studies could explore the opportunities provided by new LLMs, contributing to a global understanding of digital political communication and collaboration.

Author Contributions

Conceptualization, Y.E.O.; methodology, H.P.; software, H.P. and Y.A.; data collection, Y.A.; analysis, H.P. and Y.E.O.; writing, Y.E.O., H.P. and Y.A.; visualization, Y.E.O., H.P. and Y.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data and analysis code are available in our GitHub repository (https://github.com/harunpirim/multilayer_polarization/ accessed on 21 November 2023).

Acknowledgments

We thank the Digital Society Project for providing access to candidate data. We would also like to thank the reviewers and the editors of Computation, who provided extensive and constructive feedback to improve our manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1

Table A1. Supervised network: proportional scores.

Layer	Category	Edges	Total Edges	Proportion	Mean	Lower	Upper
Economy	Incumbent	338	813	0.416	0.416	0.382	0.451
Environment	Incumbent	25	813	0.031	0.031	0.020	0.046
Health	Incumbent	228	813	0.280	0.280	0.250	0.313
Immigration	Incumbent	17	813	0.021	0.021	0.013	0.034
Identity	Incumbent	179	813	0.220	0.220	0.192	0.251
Labor	Incumbent	20	813	0.025	0.025	0.015	0.038
Women	Incumbent	6	813	0.007	0.007	0.003	0.017
Economy	Candidate	177	438	0.404	0.404	0.358	0.452
Environment	Candidate	5	438	0.011	0.011	0.004	0.028
Health	Candidate	68	438	0.155	0.155	0.123	0.193
Immigration	Candidate	20	438	0.046	0.046	0.029	0.071
Identity	Candidate	136	438	0.311	0.311	0.268	0.356
Labor	Candidate	6	438	0.014	0.014	0.006	0.031
Women	Candidate	26	438	0.059	0.059	0.040	0.087
Economy	Democrat	319	1383	0.231	0.231	0.209	0.254
Environment	Democrat	43	1383	0.031	0.031	0.023	0.042
Health	Democrat	361	1383	0.261	0.261	0.238	0.285
Immigration	Democrat	15	1383	0.011	0.011	0.006	0.018
Identity	Democrat	557	1383	0.403	0.403	0.377	0.429
Labor	Democrat	50	1383	0.036	0.036	0.027	0.048
Women	Democrat	38	1383	0.027	0.027	0.020	0.038
Economy	Republican	265	368	0.720	0.720	0.671	0.765
Environment	Republican	2	368	0.005	0.005	0.001	0.022
Health	Republican	26	368	0.071	0.071	0.048	0.103
Immigration	Republican	72	368	0.196	0.196	0.157	0.241
Identity	Republican	1	368	0.003	0.003	0.000	0.017
Women	Republican	2	368	0.005	0.005	0.001	0.022
Economy	North	16	99	0.162	0.162	0.098	0.252
Environment	North	4	99	0.040	0.040	0.013	0.106
Health	North	19	99	0.192	0.192	0.122	0.286
Immigration	North	2	99	0.020	0.020	0.004	0.078
Identity	North	45	99	0.455	0.455	0.355	0.557
Labor	North	4	99	0.040	0.040	0.013	0.106
Women	North	9	99	0.091	0.091	0.045	0.170
Economy	South	147	210	0.700	0.700	0.632	0.760
Environment	South	2	210	0.010	0.010	0.002	0.038
Health	South	27	210	0.129	0.129	0.088	0.183
Immigration	South	29	210	0.138	0.138	0.096	0.194
Identity	South	4	210	0.019	0.019	0.006	0.051
Women	South	1	210	0.005	0.005	0.000	0.030
Economy	Female	85	284	0.299	0.299	0.247	0.357
Environment	Female	1	284	0.004	0.004	0.000	0.023
Health	Female	86	284	0.303	0.303	0.251	0.360
Immigration	Female	3	284	0.011	0.011	0.003	0.033
Identity	Female	85	284	0.299	0.299	0.247	0.357
Labor	Female	9	284	0.032	0.032	0.016	0.061
Women	Female	15	284	0.053	0.053	0.031	0.087
Economy	Male	72	146	0.493	0.493	0.410	0.577
Environment	Male	4	146	0.027	0.027	0.009	0.073
Health	Male	36	146	0.247	0.247	0.181	0.326
Immigration	Male	3	146	0.021	0.021	0.005	0.064
Identity	Male	27	146	0.185	0.185	0.127	0.259
Labor	Male	3	146	0.021	0.021	0.005	0.064
Women	Male	1	146	0.007	0.007	0.000	0.043
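As a worked illustration of how a row of this table can be read, the sketch below recomputes the proportion and an interval from the Edges and Total Edges columns of the first row (338 of 813). The interval method behind the Lower and Upper columns is not stated here, so the Wilson score interval at the 95% level is an assumption of this example, and its bounds may differ from the tabulated ones in the third decimal place. The function name `wilson_interval` is likewise illustrative.

```python
import math
from statistics import NormalDist

def wilson_interval(successes, total, confidence=0.95):
    """Proportion with a Wilson score interval (assumed method, see lead-in)."""
    if total <= 0:
        raise ValueError("total must be positive")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return p, center - half, center + half

# First row of the table: Economy edges among incumbents.
p, lower, upper = wilson_interval(338, 813)
```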

Table A2. Unsupervised network: proportional scores.

Layer	Category	Edges	Total Edges	Proportion	Mean	Lower	Upper
Awareness	Candidate	579	1291	0.448	0.448	0.421	0.476
Campaign	Candidate	422	1291	0.327	0.327	0.301	0.353
Federal Pol.	Candidate	131	1291	0.101	0.101	0.086	0.120
Media	Candidate	42	1291	0.033	0.033	0.024	0.044
Rights	Candidate	44	1291	0.034	0.034	0.025	0.046
State Politics	Candidate	73	1291	0.057	0.057	0.045	0.071
Awareness	Incumbent	784	1246	0.629	0.629	0.602	0.656
Campaign	Incumbent	88	1246	0.071	0.071	0.057	0.087
Federal Pol.	Incumbent	162	1246	0.130	0.130	0.112	0.150
Media	Incumbent	103	1246	0.083	0.083	0.068	0.100
Rights	Incumbent	105	1246	0.084	0.084	0.070	0.101
State Politics	Incumbent	4	1246	0.003	0.003	0.001	0.009
Awareness	North	103	190	0.542	0.542	0.469	0.614
Campaign	North	27	190	0.142	0.142	0.097	0.202
Federal Pol.	North	23	190	0.121	0.121	0.080	0.178
Media	North	6	190	0.032	0.032	0.013	0.071
Rights	North	26	190	0.137	0.137	0.093	0.196
State Politics	North	5	190	0.026	0.026	0.010	0.064
Awareness	South	257	447	0.575	0.575	0.528	0.621
Campaign	South	36	447	0.081	0.081	0.058	0.111
Federal Pol.	South	89	447	0.199	0.199	0.164	0.240
Media	South	44	447	0.098	0.098	0.073	0.131
Rights	South	10	447	0.022	0.022	0.011	0.042
State Politics	South	11	447	0.025	0.025	0.013	0.045
Awareness	Democrat	894	1817	0.492	0.492	0.469	0.515
Campaign	Democrat	286	1817	0.157	0.157	0.141	0.175
Federal Pol.	Democrat	296	1817	0.163	0.163	0.146	0.181
Media	Democrat	68	1817	0.037	0.037	0.029	0.047
Rights	Democrat	256	1817	0.141	0.141	0.125	0.158
State Politics	Democrat	17	1817	0.009	0.009	0.006	0.015
Awareness	Republican	513	1039	0.494	0.494	0.463	0.525
Campaign	Republican	199	1039	0.192	0.192	0.168	0.217
Federal Pol.	Republican	225	1039	0.217	0.217	0.192	0.243
Media	Republican	62	1039	0.060	0.060	0.046	0.076
Rights	Republican	14	1039	0.013	0.013	0.008	0.023
State Politics	Republican	26	1039	0.025	0.025	0.017	0.037
Awareness	Female	236	446	0.529	0.529	0.482	0.576
Campaign	Female	123	446	0.276	0.276	0.235	0.320
Federal Pol.	Female	25	446	0.056	0.056	0.037	0.083
Media	Female	18	446	0.040	0.040	0.025	0.064
Rights	Female	39	446	0.087	0.087	0.064	0.119
State Politics	Female	5	446	0.011	0.011	0.004	0.028
Awareness	Male	172	328	0.524	0.524	0.469	0.579
Campaign	Male	54	328	0.165	0.165	0.127	0.210
Federal Pol.	Male	52	328	0.159	0.159	0.122	0.204
Media	Male	13	328	0.040	0.040	0.022	0.068
Rights	Male	17	328	0.052	0.052	0.031	0.083
State Politics	Male	20	328	0.061	0.061	0.039	0.094

Table A3. ChatGPT Prompts.

Prompt Code	Prompt
unsup 01	This list consists of hashtags used by US politicians during the campaign period. Can you please categorize these hashtags into meaningful categories.
unsup 03	This list consists of hashtags used by US Senators during the campaign period. Categorize these hashtags into politically meaningful categories, considering the polarizing aspects of US politics.
unsup 04	This list consists of hashtags used by US Senators during the campaign period. Categorize these hashtags into politically meaningful categories, considering the polarizing aspects of US politics. Please ensure every hashtag is in only one category and be sure every hashtag is categorized.
unsup 05	I have a list of hashtags that were used by US politicians during the 2022 midterm election campaign period. I’d like you to categorize these into distinct meaningful categories. Please present the results in a markdown table format with columns titled ’Hashtag’ and ’Category’.
unsup 05 stage 2 (unsuccessful)	This is a request to recategorize hashtag groups you previously provided me. Recall, these hashtag categories are related to politicians’ use of Twitter during the campaign period. Please regroup all these categories into 10 groups maximum. You can also have an 11th group, labeled as “other”, for cases that you are not sure. The hashtag categories will be used to investigate the network among politicians using multilayer techniques. Each category will be considered as a network layer, and there will be a network between two nodes (politicians) if they both use any hashtag in the same category. Please provide the results in a table format with the following columns: previous category (the list I gave you above) and new category (the new list including 10 meaningful groups and other).
unsup 05_2	This is a request to categorize US politicians’ hashtags during the campaign period into meaningful categories. The hashtags will be used to investigate the network among politicians using multilayer techniques. Each category will be considered as a network layer, and there will be a network between two nodes (politicians) if they both use any hashtag in the same category. Please provide the results in a table format with the following columns: hashtag and category.
unsup 06	this list consists of hashtags and some example tweets used by us politicians during the campaign period. can you please categorize these hashtags into meaningful categories. give me the results as a table columns being: hashtag and category
sup 01-health	This list consists of hashtags used by US Senators during the campaign period. I want you to detect hashtags related to one of the most polarized issues, health care. Those hashtags might include debates over the Affordable Care Act (ACA), universal healthcare, Medicare for All, and how to best ensure affordable access to healthcare.
sup 01-immigration	This list consists of hashtags used by US Senators during the campaign period. I want you to detect hashtags related to one of the most polarized issues, immigration. Those hashtags might include debates over border security, the treatment of undocumented immigrants, Deferred Action for Childhood Arrivals (DACA), and family separations at the border.
sup 01-climate	This list consists of hashtags used by US Senators during the campaign period. I want you to detect hashtags related to one of the most polarized issues, climate change and environmental policy. Those hashtags might include debates over climate change, renewable energy policies, and the role of government regulation in environmental protection.
sup 01-gun control	This list consists of hashtags used by US Senators during the campaign period. I want you to detect hashtags related to one of the most polarized issues, gun control. Those hashtags might include debates over background checks, assault weapon bans, and concealed carry laws
sup 01-economic policy	This list consists of hashtags used by US Senators during the campaign period. I want you to detect hashtags related to one of the most polarized issues, economic policy. Those hashtags might include debates over taxation, government spending, the national debt, social welfare programs, and the role of government in regulating the economy.
sup 01-social justice	This list consists of hashtags used by US Senators during the campaign period. I want you to detect hashtags related to one of the most polarized issues, Racial and Social Justice. Those hashtags might include debates over systemic racism, police reform, affirmative action, and the Black Lives Matter movement that have sparked intense debate.
sup 01-abortion	This list consists of hashtags used by US Senators during the campaign period. I want you to detect hashtags related to one of the most polarized issues, abortion. Those hashtags might Include debates over abortion rights, with strong feelings on both pro-life and pro-choice sides.
sup 01-LGBTQ+ Rights	This list consists of hashtags used by US Senators during the campaign period. I want you to detect hashtags related to one of the most polarized issues, LGBTQ+ Rights. Those hashtags might include debates over same-sex marriage, transgender rights, and anti-discrimination laws that continue to be contentious.
unsup detailed	This is a request to categorize the hashtags used by US politicians during the campaign period into meaningful categories for the purpose of investigating the network among politicians. In the resulting multiplex network, the nodes will represent politician accounts, the edges will represent common hashtags, and the layers will represent the categories you provide. Each category will be considered a network layer, and there will be a network between two nodes (politicians) if they both use any hashtag in the same category. Please provide the results in a table format with the following columns: hashtag and category. When categorizing the hashtags, please ensure that the layers of the network (categories) are as uncorrelated as possible to maximize the network analysis effectiveness.

References

Grimmer, J. Representational Style in Congress: What Legislators Say and Why It Matters; Cambridge University Press: New York, NY, USA, 2013.
Prior, M. Media and Political Polarization. Annu. Rev. Political Sci. 2013, 16, 101–127.
Straus, J.R.; Williams, R.T.; Shogan, C.J.; Glassman, M.E. Congressional Social Media Communications: Evaluating Senate Twitter Usage. Online Inf. Rev. 2016, 40, 643–659.
Russell, A. US Senators on Twitter: Asymmetric Party Rhetoric in 140 Characters. Am. Politics Res. 2018, 46, 695–723.
Gelman, J. Partisan Intensity in Congress: Evidence from Brett Kavanaugh’s Supreme Court Nomination. Political Res. Q. 2021, 74, 450–463.
Golbeck, J.; Auxier, B.; Bickford, A.; Cabrera, L.; McHugh, M.C.; Moore, S.; Hart, J.; Resti, J.; Rogers, A.; Zimmerman, J. Congressional twitter use revisited on the platform’s 10-year anniversary. J. Assoc. Inf. Sci. Technol. 2018, 69, 1067–1070.
Hemphill, L.; Otterbacher, J.; Shapiro, M. What’s congress doing on twitter? In Proceedings of the 2013 Conference on Computer Supported Cooperative Work, San Antonio, TX, USA, 23–27 February 2013; pp. 877–886.
Hemphill, L.; Russell, A.; Schöpke-Gonzalez, A.M. What drives US congressional members’ policy attention on Twitter? Policy Internet 2021, 13, 233–256.
Evans, H.K.; Clark, J.H. “You Tweet Like a Girl!”: How Female Candidates Campaign on Twitter. Am. Politics Res. 2015, 44, 326–352.
Gainous, J.; Wagner, K.M. Tweeting to Power: The Social Media Revolution in American Politics; Oxford University Press: New York, NY, USA, 2014.
Atkinson, M.L.; Windett, J.H. Gender stereotypes and the policy priorities of women in Congress. Political Behav. 2019, 41, 769–789.
Orhan, Y.E. The Relationship between Affective Polarization and Democratic Backsliding: Comparative Evidence. Democratization 2022, 29, 714–735.
Evans, H.K.; Cordova, V.; Sipole, S. Twitter Style: An Analysis of How House Candidates Used Twitter in Their 2012 Campaigns. PS Political Sci. Politics 2014, 47, 454–462.
Larsson, A.O. The EU Parliament on Twitter: Assessing the Permanent Online Practices of Parliamentarians. J. Inf. Technol. Politics 2015, 12, 149–166.
Golbeck, J.; Grimes, J.M.; Rogers, A. Twitter use by the U.S. Congress. J. Am. Soc. Inf. Sci. Technol. 2010, 61, 1612–1621.
Kruikemeier, S. How political candidates use Twitter and the impact on votes. Comput. Hum. Behav. 2014, 34, 131–139.
Auter, Z.J.; Fine, J.A. Social Media Campaigning: Mobilization and Fundraising on Facebook. Soc. Sci. Q. 2018, 99, 185–200.
Gelman, J.; Wilson, S.L.; Petrarca, C.S. Mixing messages: How candidates vary in their use of Twitter. J. Inf. Technol. Politics 2021, 18, 101–115.
Mechkova, V.; Wilson, S. Does gender still matter for politics? The Case of the 2018 US Elections on Twitter. Digital Society Project Working Paper. 2019.
Stromer-Galley, J.; Rossini, P. Categorizing political campaign messages on social media using supervised machine learning. J. Inf. Technol. Politics 2023, 1–14.
Jungherr, A. Twitter use in election campaigns: A systematic literature review. J. Inf. Technol. Politics 2016, 13, 72–91.
Barberá, P. Birds of the Same Feather Tweet Together. Bayesian Ideal Point Estimation Using Twitter Data. Political Anal. 2015, 23, 76–91.
Hemsley, J. Followers Retweet! The Influence of Middle-Level Gatekeepers on the Spread of Political Information on Twitter. Policy Internet 2019, 11, 280–304.
Clinton, J.; Jackman, S.; Rivers, D. The Statistical Analysis of Roll Call Data. Am. Political Sci. Rev. 2004, 98, 355–370.
Poole, K.T.; Rosenthal, H. Patterns of Congressional Voting. Am. J. Political Sci. 1991, 35, 228–278.
Gerrish, S.; Blei, D. How They Vote: Issue-Adjusted Models of Legislative Behavior. In Advances in Neural Information Processing Systems; Pereira, F., Burges, C., Bottou, L., Weinberger, K., Eds.; Curran Associates: Red Hook, NY, USA, 2012; Volume 25.
Livne, A.; Simmons, M.; Adar, E.; Adamic, L. The Party Is Over Here: Structure and Content in the 2010 Election. ICWSM 2011, 5, 201–208.
Conover, M.D.; Ratkiewicz, J.; Francisco, M.; Gonçalves, B.; Flammini, A.; Menczer, F. Political Polarization on Twitter. ICWSM 2011, 133, 89–96.
Cherepnalkoski, D.; Mozetic, I. Retweet networks of the European Parliament: Evaluation of the community structure. Appl. Netw. Sci. 2016, 1, 1–20.
Hemphill, L.; Culotta, A.; Heston, M. #Polar Scores: Measuring partisanship using social media content. J. Inf. Technol. Politics 2016, 13, 365–377.
Hanteer, O.; Rossi, L. An Innovative Way to Model Twitter Topic-Driven Interactions Using Multiplex Networks. Front. Big Data 2019, 2, 463659.
Chamberlain, J.M.; Spezzano, F.; Kettler, J.J.; Dit, B. A Network Analysis of Twitter Interactions by Members of the U.S. Congress. ACM Trans. Soc. Comput. 2021, 4, 1–22.
Bonifazi, G.; Breve, B.; Cirillo, S.; Corradini, E.; Virgili, L. Investigating the COVID-19 vaccine discussions on Twitter through a multilayer network-based approach. Inf. Process. Manag. 2022, 59, 103095.
Bode, L.; Hanna, A.; Yang, J.; Shah, D.V. Candidate Networks, Citizen Clusters, and Political Expression: Strategic Hashtag Use in the 2010 Midterms. ANNALS Am. Acad. Political Soc. Sci. 2015, 659, 149–165.
Pilař, L.; Stanislavská, L.K.; Kvasnička, R.; Bouda, P.; Pitrová, J. Framework for Social Media Analysis Based on Hashtag Research. Appl. Sci. 2021, 11, 3697.
Bruns, A.; Burgess, J. The use of Twitter hashtags in the formation of ad hoc publics. In Proceedings of the 6th European Consortium for Political Research (ECPR) General Conference 2011, The European Consortium for Political Research (ECPR), Reykjavik, Iceland, 25–27 August 2011; pp. 1–9.
Litt, E. Knock, Knock. Who’s There? The Imagined Audience. J. Broadcast. Electron. Media 2012, 56, 330–345.
Yang, L.; Sun, T.; Zhang, M.; Mei, Q. We know what@ you# tag: Does the dual role affect hashtag adoption? In Proceedings of the 21st International Conference on World Wide Web, Lyon, France, 16–20 April 2012; pp. 261–270.
Kuo, R. Racial justice activist hashtags: Counterpublics and discourse circulation. New Media Soc. 2016, 20, 495–514.
Vargas, L.; Emami, P.; Traynor, P. On the detection of disinformation campaign activity with network analysis. In Proceedings of the 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop, Virtual Event, 9 November 2020; pp. 133–146.
Bianconi, G. Multilayer Networks: Structure and Function; Oxford University Press: Oxford, UK, 2018.
Bródka, P.; Chmiel, A.; Magnani, M.; Ragozini, G. Quantifying layer similarity in multiplex networks: A systematic study. R. Soc. Open Sci. 2018, 5, 171747.
Csárdi, G.; Nepusz, T.; Traag, V.; Horvát, S.; Zanini, F.; Noom, D.; Müller, K. igraph: Network Analysis and Visualization in R, R Package Version 1.5.1; Zenodo, 2023.
Magnani, M.; Rossi, L.; Vega, D. Analysis of Multiplex Social Networks with R. J. Stat. Softw. 2021, 98, 1–30.
Fleisher, R.; Bond, J.R. The Shrinking Middle in the US Congress. Br. J. Political Sci. 2004, 34, 429–451.
Theriault, S.M. The Gingrich Senators: The Roots of Partisan Warfare in Congress; Oxford University Press: New York, NY, USA, 2013.
Roesslein, J. Tweepy: Twitter for Python! 2020. Available online: https://github.com/tweepy/tweepy (accessed on 21 November 2023).
Pandas Development Team. Pandas-Dev/Pandas: Pandas. 2020. Available online: https://zenodo.org/records/10107975 (accessed on 21 November 2023).
Kocoń, J.; Cichecki, I.; Kaszyca, O.; Kochanek, M.; Szydło, D.; Baran, J.; Bielaniewicz, J.; Gruza, M.; Janz, A.; Kanclerz, K.; et al. ChatGPT: Jack of all trades, master of none. Inf. Fusion 2023, 99, 101861.
Qin, C.; Zhang, A.; Zhang, Z.; Chen, J.; Yasunaga, M.; Yang, D. Is ChatGPT a general-purpose natural language processing task solver? arXiv 2023, arXiv:2302.06476.
Törnberg, P. ChatGPT-4 outperforms experts and crowd workers in annotating political Twitter messages with zero-shot learning. arXiv 2023, arXiv:2304.06588.
Sun, X.; Dong, L.; Li, X.; Wan, Z.; Wang, S.; Zhang, T.; Li, J.; Cheng, F.; Lyu, L.; Wu, F.; et al. Pushing the Limits of ChatGPT on NLP Tasks. arXiv 2023, arXiv:2306.09719.
Chatbot Arena Leaderboard: A Hugging Face Space by Lmsys. Available online: https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard (accessed on 21 November 2023).
Huynh, D. OpenRefine/OpenRefine: OpenRefine Is a Free, Open Source Power Tool for Working with Messy Data and Improving It. Available online: https://github.com/OpenRefine/OpenRefine (accessed on 18 November 2023).
Huang, J.; Katherine, M.T.; Efthimiadis, N. Conversational tagging in Twitter. In Proceedings of the 21st ACM Conference on Hypertext and Hypermedia, Toronto, ON, Canada, 13–16 June 2010.

Figure 1. Illustration of network construction. In a topic layer (e.g., L1), if a hashtag (e.g., #H1) is used reciprocally, that is, by both A and B, an edge between A and B is added to that layer. However, if in a topic layer (e.g., L2) a hashtag is used by only A or only B, no edge between them is added to the layer.
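
The construction rule in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `build_layers`, the input layout (layer, then user, then the set of hashtags that user posted in that layer), and the toy users A and B are all assumptions made for the example. It also assumes that "reciprocal" use means both users posted the same hashtag at least once.

```python
from itertools import combinations

def build_layers(usage):
    """Build one undirected edge set per topic layer.

    usage: {layer: {user: set_of_hashtags}}  (hypothetical input layout)
    An edge (a, b) is added to a layer only if a and b share at least
    one hashtag within that layer.
    """
    layers = {}
    for layer, by_user in usage.items():
        edges = set()
        # sorted() makes each pair appear once, in a stable order
        for a, b in combinations(sorted(by_user), 2):
            if by_user[a] & by_user[b]:  # shared hashtag in this layer
                edges.add((a, b))
        layers[layer] = edges
    return layers

# Toy data mirroring Figure 1: #H1 is used by both A and B in L1,
# while in L2 each hashtag is used by only one of them.
usage = {
    "L1": {"A": {"#H1"}, "B": {"#H1"}},
    "L2": {"A": {"#H2"}, "B": {"#H3"}},
}
layers = build_layers(usage)
print(layers)
```

In this toy case, L1 ends up with the single edge between A and B and L2 stays empty, which is the situation the figure describes.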

Figure 2. Data pipeline.

Figure 3. This figure illustrates the level of involvement in BPHC among U.S. senators across various sub-groupings. The left panel displays the distribution of edges across networks derived from supervised modeling, whereas the right panel highlights the disparity among networks based on unsupervised modeling. The x-axis represents analytical network categories, and the y-axis denotes the number of edges. The colors within each bar represent the distribution of edges across various layers.

Figure 4. Heat maps of the Pearson degree (PD, upper panel) and Jeffrey degree (JD, lower panel) correlations across layers among Democrats and Republicans. Darker colors represent higher values, and empty rows or columns indicate that no reciprocal hashtag involvement was detected in that layer.
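
As a rough illustration of the Pearson degree comparison shown in Figure 4, the sketch below correlates the degree of each actor in one layer with the degree of the same actor in another layer. It is a plain-Python approximation under stated assumptions, not the routine used to produce the figure: the helper names `degrees`, `pearson`, and `pearson_degree`, the toy actors, and the choice to return `None` when a layer has no degree variation (for example, an empty layer) are all hypothetical. The Jeffrey degree measure is not sketched here.

```python
from math import sqrt

def degrees(edges, actors):
    """Degree of every actor in one layer, in the order given by `actors`."""
    deg = {a: 0 for a in actors}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return [deg[a] for a in actors]

def pearson(x, y):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:  # constant degrees, e.g. a layer with no edges
        return None
    return cov / (sx * sy)

def pearson_degree(edges_1, edges_2, actors):
    """Correlate actor degrees between two layers over a shared actor list."""
    return pearson(degrees(edges_1, actors), degrees(edges_2, actors))

# Toy layers over four hypothetical actors
actors = ["A", "B", "C", "D"]
layer_1 = {("A", "B"), ("A", "C"), ("A", "D")}
layer_2 = {("A", "B"), ("A", "C")}
r = pearson_degree(layer_1, layer_2, actors)
print(round(r, 4))
```

A value near 1 means that actors who are well connected in one topic layer tend to be well connected in the other. Returning `None` for a layer without edges is one way to represent the empty rows and columns mentioned in the caption.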

Figure 5. Node colors represent two parties and independents: blue, Democrats; red, Republicans. Edge colors represent the topic layers.

Figure 6. BPHC strategy at each layer.

Figure 7. Inter-party relationships at each layer.

Table 1. Data description.

Group	Sub-Group	# of Politicians	# of Tweets	# of Hashtags
Full Sample		128	37,361	1660
Party ID
	Democrats	65	19,195	929
	Republicans	60	17,911	886
	Independent	3	255	19
Incumbency
	Incumbent	60	12,009	672
	Candidate	68	25,352	1205
Region
	North	98	25,942	1270
	South	30	11,419	585
Sex
	Male	90	26,419	1292
	Female	38	10,942	585

Notes: The number of tweets per user is almost the same for Democrats and Republicans, at roughly 295 and 299, respectively.

Table 2. Hashtag Groups.

Modeling	Layer	# of Hashtags	# of Tweets	# of Ind. Users
Supervised
	Economy	35	890	77
	Environment	12	319	27
	Health	35	905	61
	Immigration	9	336	21
	Identity	28	464	49
	Women	22	436	27
	Labor	17	332	25
Unsupervised
	Appreciation	153	2785	103
	Campaign	86	7417	71
	Federal Politics	95	8297	85
	Media	83	1400	81
	Rights	54	1124	53
	State Politics	168	7015	62


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
