2.1. Concept of Online Deliberation and Past Measurement Efforts
Online deliberation research is an emerging strand of the deliberation literature that focuses on three aspects of internet-enabled deliberation [27]: input, throughput, and output. The input aspect sheds light on the preconditions of deliberation; institutional arrangements (e.g., participatory budgeting), platforms (e.g., government-run platforms), and socio-political elements (e.g., internet access rates and social strata) are examples of such preconditions. The output aspect relates to the outcomes of online deliberation, whether internal (e.g., knowledge gains and digital citizenship) or external (e.g., policy changes and side effects). The throughput aspect concerns the processes through which multiple stakeholders participate and build consensus democratically. This article aims to contribute to this last aspect, the process of online deliberation, by proposing automated quality indicators using social network and time-series analysis.
This section reviews current automated computational methods for assessing online deliberative quality. Theories of deliberation and deliberative democracy have long been influenced by Habermas’s notions of communicative rationality and the public sphere [49]. The central presumption is that social problems are increasingly “wicked,” that is, subjective and contextual; thus, instrumental rationality, which uses impersonal tools designed to attain measurable objectives, no longer captures the essence of social problems [53]. Since the problems faced by multiple stakeholders have no single optimal solution, stakeholders need to engage in communicative processes through which problems are identified, viewpoints exchanged, and collective action promoted [54]. Therefore, social problems need to be solved through inter-subjective communication rather than objective calculation.
Nevertheless, the deliberation literature has suffered from the lack of a standard definition of deliberation [33], leading to fragmented quality indicators [28]. For instance, Dahlberg [62] suggested six criteria: reasoned critique, reflexivity, ideal role-taking, sincerity, inclusion and discursive equality, and autonomy from state and economic power. Fishkin [60] proposed five criteria: information, substantive balance, diversity, conscientiousness, and equal consideration, grounded in three democratic values: deliberation, equality, and participation. Gastil and Black [61] developed the following criteria: create an information base, prioritize key values, identify solutions, weigh solutions, make the best decision, speaking opportunities, mutual comprehension, consideration, and respect. Steenbergen et al. [41] suggested five criteria: open participation, level of justification, content of justification, respect, and constructive politics. Although many of the listed indicators share similar traits, this fragmented landscape shows how little standardization deliberative quality indicators have achieved, even within the same theoretical root [28].
Traditionally, empirical studies select some of the theory-based criteria, build coding schemes, and hire trained coders to assess deliberative quality [35]. For instance, Esau et al. [35] developed eight quality measures, and five coders assessed the quality of textual content by reading a sample of user comments on several online platforms. This method is considered a “gold standard,” since human experts can extract sophisticated meanings from text [43]. However, Beauchamp [42] has pointed out that it requires intensive labor and may introduce bias regarding what counts as a deliberative criterion. We agree only in part, since content analysis offers well-established measures (e.g., Krippendorff’s alpha) for handling inter-coder reliability. Nevertheless, manual coding can still be problematic when there is significant disagreement among coders or a large amount of online discussion data, as is the case in this article.
Alternatively, there have recently been attempts to develop automated deliberative quality indicators. By automation, we mean measuring deliberative quality by computational methods rather than by human judgment. According to Beauchamp’s review [42], automated methods are still rare but growing, centered on natural language processing and social network analysis. Social network analysis is useful for studying intricate interaction patterns in deliberation processes [44]. For instance, Gonzalez-Bailon et al. [46] developed two dimensions of network topology, representativeness and argumentation, and compared the quality of deliberation across different topics in an online discussion community. Another automated method is machine learning-based natural language processing [48]. For instance, Fournier-Tombs and Marzo Serugendo [48] adopted Steenbergen et al.’s [41] well-known discourse quality index, manually coded online comments in a training dataset, and then applied the random forests algorithm (supervised learning) to automatically label deliberative quality in the testing set while computing training errors. Several studies have combined network analysis and content analysis to identify major discussion topics and support flows in the discussion network [44].
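The supervised-labeling pipeline described above can be sketched as follows. This is our minimal illustration, not the cited authors’ implementation: the comment texts, the binary labels, and the bag-of-words features are all hypothetical, and we assume scikit-learn is available.

```python
# Hypothetical sketch of supervised deliberative-quality labeling:
# hand-coded comments -> text features -> random forest -> predicted labels.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Toy hand-coded training data (labels are illustrative, not a real DQI).
train_texts = [
    "I disagree because the budget figures do not support this plan.",
    "This is nonsense.",
    "The proposal helps commuters, as the traffic study shows.",
    "Whatever.",
]
train_labels = [1, 0, 1, 0]  # 1 = justified argument, 0 = no justification

# Simple bag-of-words representation of each comment.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# Train the random forest on the hand-coded sample.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, train_labels)

# Automatically label unseen comments.
test_texts = ["I support this because the data back it up.", "Boring."]
preds = clf.predict(vectorizer.transform(test_texts))
print(list(preds))
```

In practice, the training set would contain thousands of coded comments, and the training/testing errors would be inspected before trusting the automatic labels.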
While these automated methods have opened up promising avenues for future research, they are in their infancy and far from complete. First, and most significantly, human experts can still interpret nuanced political debates and contexts better than machines at the current level of technical development. Beauchamp [42] found that studies employing automated methods tend “to focus on superficial, easily measured markers of argument and deliberation” (p. 324), “many of which measures are also disappointingly superficial when examined in detail” (p. 345). Second, he pointed out dozens of heterogeneous indicators in the field (p. 336). This is perhaps a consequence of unique online deliberative systems. For instance, online deliberation takes place on small online forums [38], social media platforms such as Twitter and Facebook [46], government-run platforms [65], parliament websites [69], and online newspapers [70]. These various digital platforms have unique deliberative systems, which influence data collection and analysis. For instance, while Campos-Domínguez et al. [70] used Twitter’s hashtags, Black et al. [38] used Wikipedia’s collaborative systems, and Gonzalez-Bailon et al. [46] used Slashdot’s moderation system to identify divergent topics under discussion and assess their quality quantitatively. Steenbergen et al. [41] observed that the lack of standard definitions and measures could lead to problems with validity. An attempt to develop a comprehensive set of indicators capturing the universal elements of online deliberation could achieve external validity (generalizability) at the risk of reducing internal validity (trustworthiness). Third, automated methods have paid relatively little attention to a crucial dimension of deliberation: time. Without the time dimension, online deliberation data consist of a chunk of texts and interactions captured in a single snapshot. Since deliberation is a communicative process, it is crucial to assess how its quality changes over time.
Overall, we note that automated methods for assessing online deliberative quality pose potential validity issues. This article addresses internal validity by developing new deliberative quality indicators using network data and time-series data collected from an online deliberative platform of interest, arguing that their combination reveals how online deliberation evolves. In terms of external validity, we argue that these two types of data are commonly available on many online platforms, which makes the resulting datasets comparable across platforms.
2.2. New Online Deliberative Quality Indicators
Against this backdrop, we select some criteria from the pool of existing indicators, or create new ones, that can be measured and analyzed using the two types of data. To this end, this article defines deliberation as “the process by which individuals sincerely weigh the merits of competing arguments in discussions together,” following James Fishkin [60] (p. 33), who builds on Habermas [72], and develops indicators based on his framework of three democratic dimensions: participation, deliberation, and equality [60]. These dimensions provide an analytical lens for examining online deliberation processes from various angles. We set participation and deliberation as the two dimensions of the proposed indicators, with equality as an overarching value.
First, participation means “behavior on the part of members of the mass public directed at influencing, directly or indirectly, the formulation, adoption, or implementation of governmental or policy choices” [60] (p. 45). When residents intend to influence local politics, they may engage in a wide range of activities, for instance, joining an association, visiting petition websites, and attending offline meetings. Fishkin [60] regards these activities as forms of mass political participation, arguing that such activities should be spread throughout the population and that people should reinforce their participatory activities over time: “Mass participation is a cornerstone of democracy” (p. 46). The first criterion is the participation rate: what percentage of the population participates in online deliberation? It measures how representative participation in online deliberation is. Next, activeness measures the number of active commentators and comments and their longitudinal trends. To what extent do residents actively participate in online deliberation? Does online deliberation show an upward, downward, or constant trend? Lastly, continuity measures whether engagement in deliberation is consistent, without significant gaps, especially during the operational periods. Overall, we consider participation rate, activeness, and continuity essential indicators of the extent to which residents participate in online deliberation over time.
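To make the three participation indicators concrete, the sketch below computes each from a table of timestamped comments. This is our illustration only, not the article’s implementation; the comment records and the population figure are hypothetical.

```python
# Minimal sketch: the three participation indicators computed from
# hypothetical (user_id, date) comment records.
from datetime import date

comments = [
    ("a", date(2021, 1, 1)), ("b", date(2021, 1, 1)),
    ("a", date(2021, 1, 2)), ("c", date(2021, 1, 4)),
]
population = 100  # eligible residents (assumed known)

# Participation rate: share of the population that ever commented.
participants = {user for user, _ in comments}
participation_rate = len(participants) / population

# Activeness: daily comment counts, from which a longitudinal
# trend (upward, downward, constant) can be read.
daily_counts = {}
for _, day in comments:
    daily_counts[day] = daily_counts.get(day, 0) + 1

# Continuity: the longest gap (in days) with no comments at all
# during the operational period.
days = sorted(daily_counts)
longest_gap = max(
    (later - earlier).days - 1
    for earlier, later in zip(days, days[1:])
) if len(days) > 1 else 0

print(participation_rate, longest_gap)  # 0.03 1
```

With real platform data, the daily counts would feed a time-series analysis rather than a single summary number.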
The second dimension is deliberation. While the participation dimension focuses on the volume of deliberation, the deliberation dimension, in this article, focuses on interactions in deliberation. Fishkin [60] suggested five criteria of deliberation: information (accessibility of crucial information), substantive balance (reciprocal communications), diversity (multiple topics by multiple actors), conscientiousness (reasoned arguments), and equal consideration (equal opportunities to weigh the values offered by all actors). We found network analysis and time-series analysis useful for examining substantive balance and equal consideration, from which three indicators were developed. Responsiveness measures the degree to which online comments generate back-and-forth conversations, like a real discussion, that can help participants identify others’ viewpoints and clarify their own preferences [73]. Janssen and Kies [28] noted that previous studies mostly categorized comments into “initiate” (a message that initiates a new debate), “reply” (a message that replies to a previous message), and “monologue” (a message that is not part of a debate). This categorization requires qualitative interpretation of each comment, which does not conform to this article’s aim. Therefore, we applied an initiate-reply categorization and measured the proportion of replies: a simple yet useful indicator of the extent to which others respond to messages. Inter-linkedness examines structural patterns of who communicates with whom and how proposals are related to each other. Although online deliberative platforms provide free and accessible public space for the mass public, actors still select the partners and topics they expect to benefit from [46]. This intentionality in creating and maintaining interactions may form polarized subgroups when conflictual issues emerge, which can be analyzed through social network analysis. Lastly, commitment measures the variability of engagement in online deliberation. Many empirical studies have observed that a handful of people and topics often dominate deliberation processes while others remain silent [61]. We consider this political inactivity an essential issue of political equality [60], and examine the degree of engagement across actors. Overall, we consider responsiveness, inter-linkedness, and commitment essential indicators of interactions during deliberation processes.