Article

Content Analysis of E-Participation Platforms in Taiwan with Topic Modeling: How to Train and Evaluate Neural Topic Models? †

1 Department of Mechanical Engineering, National Taiwan University of Science and Technology, No. 43, Section 4, Keelung Rd, Da’an District, Taipei 106, Taiwan
2 Chair of Work Studies, Technology and Participation, Technische Universität Berlin, Marchstraße 23, 10587 Berlin, Germany
3 Department of Industrial Management, National Taiwan University of Science and Technology, No. 43, Section 4, Keelung Rd, Da’an District, Taipei 106, Taiwan
4 Center for Smart Manufacturing Innovation, National Taiwan University of Science and Technology, No. 43, Section 4, Keelung Rd, Da’an District, Taipei 106, Taiwan
* Authors to whom correspondence should be addressed.
This paper is an extended version of our paper published in Sontheimer, M.; Fahlbusch, J.; Korjakow, T.; Chou, S.-Y. Text-Mining of E-Participation Platforms: Applying Topic Modeling on Join and iVoting in Taiwan. In Proceedings of the TE2024: Engineering for Social Change, London, UK, 9–11 July 2024.
Appl. Sci. 2025, 15(5), 2263; https://doi.org/10.3390/app15052263
Submission received: 31 December 2024 / Revised: 6 February 2025 / Accepted: 18 February 2025 / Published: 20 February 2025
(This article belongs to the Topic Social Computing and Social Network Analysis)

Abstract
E-participation platforms, such as iVoting and Join in Taiwan, provide digital spaces for citizens to engage in deliberation, voting, and oversight. As a forerunner in Asia, Taiwan has implemented these platforms to enhance participatory democracy. However, there is still limited research on the specific content debated on these platforms. Utilizing recent advancements in Natural Language Processing, we explore the content of proposals that users submitted between 2015 and 2025. In this study, a pipeline for mining text corpora scraped from these platforms in the context of political analysis is proposed. The pipeline is applied to two datasets with different characteristics. A topic model for each of the two platforms is generated, evaluated with OCTIS (Optimizing and Comparing Topic Models Is Simple), and compared against several baselines. Our research highlights the trade-offs between model performance and processing time, emphasizing the balance between accuracy and meaningful topic creation. By integrating a Chinese-to-English translation pipeline within the text-mining process, our method also demonstrates a solid approach to overcoming language barriers. Consequently, our method is adaptable to e-participation platforms in various languages, providing decision-makers with a more comprehensive tool to understand citizens’ needs and enabling the formulation of more informed and effective policies.

1. Introduction

E-participation platforms are increasingly used worldwide to engage citizens in decision-making processes [1,2,3,4]. Understanding these digitally expressed opinions, identifying resonating topics, and pinpointing areas of public support are essential resources for policymakers navigating parliamentary obligations [5]. The “Ladder of Participation” (1969) by Sherry Arnstein [6] measures the extent of citizen power in influencing policies and decisions. With adaptations, it applies to the digital sphere, from manipulation (e.g., algorithmic control) and information to consultation (e.g., online polls), crowd-sourced law proposals, and e-voting. While digital participation can enhance time efficiency and accessibility [7,8], researchers also highlight that technical barriers, limited digital literacy, and a lack of emotional engagement can have an inhibiting effect, as examples from the COVID-19 pandemic have demonstrated [9].
In the United States, researchers who investigated the We the People e-petition platform found that it hosts short-lived policy debates after major sociopolitical events, fueled by media coverage and social media mobilization [1]. In Taiwan, Liao et al. [10] have examined the iVoting platform and its impact on digital democratic development, emphasizing that the challenge of informed voter participation, along with the growing online presence of political parties and their internet-savvy supporters, is reflected in petition outcomes. Equally critical is the ability to track users who abandon discussions, particularly in Taiwan, which has ranked first worldwide in exposure to disinformation [11].
In addition to the voting-based platform iVoting, our study also examines the proposals on Join, as this platform provides more space for deliberation and has become the central e-participation platform in recent years. On Arnstein’s ladder, Join spans consultation and partnership, as citizens can discuss and vote for proposals from other users and governmental bodies. Similarly, on both platforms, legislators are required to address proposals once they reach a certain number of votes. Meanwhile, iVoting has seen a significant decline in usage in recent years. The platform was primarily designed for Taipei City to vote on municipal affairs. In contrast, Join serves as a national platform managed by the central government.
To extract information easily and make it publicly available, more machine-aided tools and research are necessary [12]. This research employs a Natural Language Processing (NLP) method to assist policymakers and administrators in gaining a comprehensive understanding of uncategorized online proposals. While existing studies on Join and iVoting have focused primarily on petition dynamics [13], this study extends the analysis to thematic shifts and topical trends over a ten-year period within the framework of policy informatics.
Taiwan has pursued pioneering efforts to leverage the internet for transparency and inclusion, exemplified by the emergence of e-participation platforms like Join and iVoting and other activities such as civic hacking groups [14]. The 2014 Sunflower Movement served as an incubator for these initiatives, fostering a new phase of government-civil society cooperation and driving digital innovations that have improved public services and political engagement. The two platforms, although distinct in scope and functionality, share the common goal of enhancing informed decision making and civic engagement. This paper explores the content, impact, and dynamics of these platforms and contributes to:
1. Insight into Platforms: We provide a comprehensive analysis of two e-participation platforms, Join and iVoting, highlighting their thematic focuses, user engagement patterns, and platform-specific trends. This comparison offers valuable insights into how public participation varies across platforms and governance contexts.
2. Language-Independent Pipeline Evaluation: We propose a robust method for evaluating text-mining pipelines across languages using neural topic models. This approach ensures flexibility and applicability in multilingual e-participation settings, enabling a more inclusive analysis of public sentiment and thematic trends.
3. Evaluation Framework for Neural Topic Models: We introduce a systematic framework to assess the performance of neural topic models, combining metrics such as coherence evaluated with NPMI (Normalized Pointwise Mutual Information) and computational efficiency. This framework provides researchers and practitioners with practical tools to balance semantic quality and processing demands effectively.
The rest of this paper is organized as follows: Section 2 describes the state of the art, including details about the Join and iVoting platforms, the preprocessing steps, and the clustering methodology employed. Section 3 explains the experimental setup, including the neural topic modeling approaches, evaluation metrics, and computational frameworks used in the study. Section 4 presents the results, showcasing the thematic clusters, topic evolution trends, and comparative analyses of different embedding models. Section 5 discusses the findings, focusing on the trade-offs between coherence and computational efficiency, along with their implications for e-participation platforms. Finally, Section 6 concludes the paper, summarizing the key contributions, outlining limitations, and suggesting directions for future research.

2. Literature Review

2.1. E-Participation Platforms in Taiwan

The e-participation platforms Join and iVoting, while complementary in purpose, differ significantly in their scope, functionality, and deployment. Join, introduced in February 2015 by the National Development Council, was designed to foster whole-citizen participation by facilitating comprehensive civic engagement [13]. It allows users to propose, discuss, and vote on policy issues, as well as provide feedback on draft legislation. In contrast, iVoting—launched during the 2012 Legislative Yuan elections—focuses specifically on enhancing the electoral process through informed voting. As Taiwan’s first Voting Advice Application (VAA), iVoting helps users make more deliberate and informed decisions by offering issue position diagnostics.
Despite its initial success—drawing over 1400 members and 40,000 visitors between 2011 and 2012—iVoting has seen a significant decline in activity in recent years [10]. A critical factor contributing to this drop was the platform’s controversial use in 2016 for a local redevelopment project in Taipei’s Shezidao, where the city government offered only pre-selected options for public voting. This limited democratic choice led to widespread criticism, ultimately eroding public trust in the platform [15].
Prior research on e-petitions in Taiwan found that political participation has been largely influenced by party mobilization, creating a “mobilization of bias” that is difficult to eliminate [16]. However, our study suggests that e-participation, despite being limited by the digital divide, may counterbalance traditional political mobilization by enabling participation from individuals outside established party networks, potentially making deliberation and petition results more representative of public opinions. At the same time, the large volume of unstructured textual data generated by Join and iVoting presents both an opportunity and a challenge for policymakers. Understanding citizen preferences and identifying dominant themes within proposals and discussions is essential for effective governance.

2.2. Neural Topic Modeling in Policy Informatics

With the advance of Large Language Models (LLMs) and their availability on platforms like Hugging Face, neural topic modeling (NTM) has become accessible to a wider audience. In the context of e-participation platforms like Join and iVoting, NTMs offer powerful methods for analyzing citizen proposals and discussions by automatically identifying key topics, trends, and shifts in public opinion over time, similar to existing LLM-based text-analysis approaches [17]. Policymakers can leverage these insights to prioritize issues of public concern, identify emerging themes in civic discourse, and make more informed decisions. For instance, NTMs could be used to analyze the thousands of policy proposals submitted on the Join platform, extracting thematic clusters that indicate broader societal trends or areas of public interest [18]. Previously, methods such as Network Discourse Analysis [19], Latent Dirichlet Allocation [3], Market Basket Analysis [1], and Neural Political Statement Classification [20] have been proposed. These benefit from advances in the machine learning domain, such as the application of Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) in BERTopic [21], or hybrid approaches such as that of Malzer and Baum [22]. The focus has since narrowed to topic modeling and its derivatives, such as structural topic modeling [23], among many other topic modeling architectures [24]. Neural topic models are also encountered across domains, for example in studies of news articles and their influence on elections [25].
However, the evaluation of these neural topic models has proven difficult. Recent studies have evaluated NTMs for various applications, including policy analytics. Doan and Hoang [24] found that NTMs outperform traditional models in uncovering cohesive topics and document representation for classification, but lag in modeling input documents. Li et al. [26] demonstrated that the contextual neural topic model excels in interactive content analysis tasks, though LDA remains competitive. González-Pizarro and Carenini (2024) extended this evaluation to multimodal datasets, proposing novel models and metrics for documents containing both text and images. Terragni and Fersini [27] emphasized the importance of hyperparameter optimization in NTM evaluation, revealing relationships between hyperparameters, document length, and performance measures. These studies collectively highlight the need for comprehensive evaluation strategies that consider multiple metrics, task-based assessments, and domain-specific requirements when selecting topic models for policy analytics applications. Terragni et al. also proposed OCTIS [28], a framework targeted towards the evaluation of topic models.
While significant advances have been made in applying topic modeling techniques to analyze political discourse and e-participation platforms, gaps remain in both the evaluation of these models and their specific applications within policy informatics. Previous studies have focused primarily on evaluating coherence and document clustering performance, but there is a lack of comprehensive frameworks that take into account the unique requirements of policy-related datasets. This paper proposes new methodologies for evaluating neural topic models in the context of e-participation platforms, with an emphasis on their ability to handle diverse topics and changing themes over time. By focusing on platforms such as Join and iVoting, this research aims to bridge the gap between advanced text-analysis techniques and real-world policy applications.

3. Methods

3.1. Dataset

The dataset consists of 20,923 proposals from Join and 454 from iVoting, submitted between 2015 and 2025. The earliest recorded entry on Join dates back to 10 September 2015, while the latest entry was submitted on 7 January 2025. In contrast, the earliest available proposal on iVoting dates back to 13 April 2017, while the latest one in our dataset was submitted on 22 June 2022. The number of proposals on Join from 2015 to 2025 can be seen in Figure 1; the number of proposals on iVoting declined from 2017 to 2022, as shown in Figure 2, and the platform has since become effectively obsolete. For each proposal, the dataset includes information such as the title, content, submission date, proposer’s username, background, rationale, and vote count. Personal identifiers like gender, age, and real names were optional for submission and are not included in the dataset. Given the low volume and government-initiated nature of certain iVoting proposals—such as the Shezidao case—direct comparisons between the platforms present challenges.
On Join, proposals undergo a preliminary review by the public administration to assess their coherence and appropriateness before being opened for public endorsement. Proposals deemed offensive are filtered out. Approved proposals remain on the platform for 60 days, allowing time for public votes and comments. If a proposal garners at least 5000 votes within this deliberative period, the relevant government ministry responsible for the issue must take action. This includes initiating contact with the proposer to understand their demands, organizing a policy discussion meeting, and ensuring that the minutes of the meeting are made publicly available. The ministry is then required to provide a well-reasoned response within two months. Given this structured process, our research employs topic-modeling approaches to identify and categorize the main themes of the proposals using Natural Language Processing (NLP) techniques. While existing studies have primarily examined petition dynamics on Join, our analysis goes deeper by exploring thematic shifts and topical trends over a ten-year period within the broader framework of policy informatics.

3.2. Proposed Pipeline

Our proposed method for analyzing e-participation platforms involves a multi-stage process designed to extract, preprocess, and analyze citizen proposals from platforms such as Join and iVoting and is based on [21], as shown in Figure 3. Initially, data are collected via web scraping, extracting proposals submitted by users on these platforms. The raw text undergoes a series of preprocessing tasks, including data cleaning and stopword removal, resulting in a collection of processed documents. These documents are then transformed into embedded representations, which serve as inputs to our neural topic modeling framework. This model identifies patterns, themes, and trends within the proposals, providing a structured understanding of the topics discussed.
To refine the results, we apply techniques such as hyperparameter tuning, clustering, and dimensionality reduction, improving both the coherence of the topics and the quality of insights derived from the data. Additionally, other knowledge sources, including legal frameworks and media outlets, are integrated via a parsing routine and stored in a background knowledge base to provide further context. The output of the topic modeling, along with this enriched background knowledge, is then made accessible through a user interface that supports browsing functionality, enabling users to filter, query, visualize, and graph the results. Finally, policymakers are able to leverage these insights for more informed decision making, supported by a system that enhances both content analysis and trend identification across multiple years of data.
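To make the preprocessing stage concrete, the following minimal Python sketch illustrates the cleaning and stopword-removal steps described above; the record fields and the tiny stopword list are illustrative assumptions, not the exact configuration of our pipeline.

```python
# A minimal sketch of the cleaning and stopword-removal stage. The record
# fields ("title", "content") and the stopword list are illustrative
# assumptions rather than the study's actual configuration.
import re

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for"}

def clean_text(raw: str) -> str:
    """Strip HTML remnants, URLs, and redundant whitespace from scraped text."""
    text = re.sub(r"<[^>]+>", " ", raw)        # drop leftover HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    return re.sub(r"\s+", " ", text).strip()

def remove_stopwords(text: str) -> str:
    """Lowercase, tokenize on whitespace, and drop stopwords."""
    return " ".join(t for t in text.lower().split() if t not in STOPWORDS)

proposals = [{"title": "Road safety", "content": "<p>Improve crossings ...</p>"}]
docs = [remove_stopwords(clean_text(p["title"] + " " + p["content"]))
        for p in proposals]
print(docs)
```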

3.3. Mathematical Formulation of the Pipeline

In this study, we employed a topic-modeling approach based on BERTopic [21], which leverages pre-trained transformer embeddings, dimensionality reduction, and density-based clustering to extract topics from proposals submitted on e-participation platforms. Below, we detail the key steps in our methodology, including embedding generation, dimensionality reduction, clustering, and topic evaluation metrics.
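As a preview of the steps detailed below, the sketch that follows shows how such a BERTopic pipeline can be assembled from its three components using the bertopic, umap-learn, and hdbscan libraries; the hyperparameter values and the toy corpus are illustrative assumptions, not the tuned settings of our experiments.

```python
# Assembling a BERTopic pipeline from its components. Hyperparameters and
# the toy corpus are illustrative, not the settings tuned in this study.
from bertopic import BERTopic
from sentence_transformers import SentenceTransformer
from umap import UMAP
from hdbscan import HDBSCAN

docs = ["Improve pedestrian crossings near schools",
        "Extend renewable energy subsidies",
        "Lower the voting age to eighteen"] * 10  # toy corpus stand-in

embedding_model = SentenceTransformer("all-MiniLM-L6-v2")  # compared in Section 4
umap_model = UMAP(n_neighbors=15, n_components=5, metric="cosine",
                  random_state=42)
hdbscan_model = HDBSCAN(min_cluster_size=5)

topic_model = BERTopic(embedding_model=embedding_model,
                       umap_model=umap_model,
                       hdbscan_model=hdbscan_model)
topics, probs = topic_model.fit_transform(docs)
print(topic_model.get_topic_info())  # one row per extracted topic
```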

3.4. Document Embeddings

We begin by generating dense vector representations for each document in the corpus using a pre-trained transformer model. Let $D = \{d_1, d_2, \ldots, d_n\}$ represent the set of documents (proposals), where each document $d_i$ is embedded into a $d$-dimensional space using the embedding model $f_{\text{BERT}}$. The embedding of each document $d_i$ is denoted as:

$$v_i = f_{\text{BERT}}(d_i) \in \mathbb{R}^d$$

Here, $v_i$ represents the document embedding in the vector space, and $d$ is the dimensionality of the embedding, which depends on the chosen embedding model.
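As an illustration, this step can be realized with the sentence-transformers library; all-MiniLM-L6-v2 (one of the models evaluated in Section 4, with $d = 384$) serves as the example encoder, and the two sample documents are invented.

```python
# Computing v_i = f_BERT(d_i) with the sentence-transformers library.
# all-MiniLM-L6-v2 has d = 384; the two sample documents are invented.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
docs = ["Improve pedestrian crossings near schools",
        "Extend renewable energy subsidies"]
V = model.encode(docs)  # numpy array of shape (n, d) = (2, 384)
print(V.shape)
```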

3.5. Dimensionality Reduction

To improve computational efficiency and facilitate clustering, we apply Uniform Manifold Approximation and Projection (UMAP), a non-linear dimensionality reduction technique. UMAP reduces the high-dimensional embeddings $V = \{v_1, v_2, \ldots, v_n\}$ into a lower-dimensional space $Z = \{z_1, z_2, \ldots, z_n\}$, where:

$$z_i = f_{\text{UMAP}}(v_i) \in \mathbb{R}^k$$

Here, $k$ is the reduced dimensionality (typically $k \ll d$), and $f_{\text{UMAP}}$ denotes the UMAP function that transforms the embeddings.
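A minimal sketch of this reduction step follows, assuming $k = 5$ and a random stand-in for the embedding matrix $V$:

```python
# z_i = f_UMAP(v_i): reducing the embedding matrix V to k dimensions.
# k = 5 and the random stand-in for V are illustrative assumptions.
import numpy as np
from umap import UMAP

V = np.random.rand(100, 384).astype("float32")  # stand-in for real embeddings
umap_model = UMAP(n_neighbors=15, n_components=5, metric="cosine",
                  random_state=42)
Z = umap_model.fit_transform(V)
print(Z.shape)  # (100, 5)
```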

3.6. Clustering with HDBSCAN

The reduced embeddings are then clustered using Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), a density-based clustering algorithm that groups similar documents into topics. The clustering algorithm produces a set of clusters $C = \{c_1, c_2, \ldots, c_m\}$, where each cluster $c_j$ represents a topic. The clustering process can be expressed as:

$$c(z_i) = \arg\max_{j \in C} \, \text{Density}(z_i \mid c_j)$$

where $c(z_i)$ is the cluster assignment for the document embedding $z_i$, and the density function evaluates the density of $z_i$ relative to the cluster $c_j$.
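The clustering step can be sketched as follows; min_cluster_size is an assumed hyperparameter, and $Z$ is again a random stand-in for the reduced embeddings:

```python
# Density-based clustering of the reduced embeddings with HDBSCAN.
# min_cluster_size is an assumed hyperparameter; Z is a random stand-in.
import numpy as np
from hdbscan import HDBSCAN

Z = np.random.rand(100, 5)
clusterer = HDBSCAN(min_cluster_size=10)
labels = clusterer.fit_predict(Z)
# labels[i] = j assigns z_i to cluster c_j; the label -1 marks noise points
print(sorted(set(labels)))
```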

3.7. Topic Representation

For each cluster (topic), we extract the most representative words using their term frequency. The score of a word $w$ in a cluster $c_j$ is computed as:

$$\text{Score}(w \mid c_j) = \frac{\text{Count}(w \mid c_j)}{\sum_{w' \in c_j} \text{Count}(w' \mid c_j)}$$

Here, $\text{Count}(w \mid c_j)$ represents the frequency of word $w$ in cluster $c_j$, and the score reflects its importance relative to other words in the cluster.
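A direct transcription of this scoring rule, applied to an invented three-document cluster:

```python
# Score(w | c_j): relative term frequency of each word within one cluster.
# The three-document cluster is invented for illustration.
from collections import Counter

def top_words(cluster_docs, n=10):
    """Return the n highest-scoring (word, score) pairs for one cluster."""
    counts = Counter(tok for doc in cluster_docs for tok in doc.lower().split())
    total = sum(counts.values())
    return [(w, c / total) for w, c in counts.most_common(n)]

cluster = ["road safety near schools", "improve road crossings",
           "school zone safety"]
print(top_words(cluster, n=3))
```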

3.8. Topic Evaluation Metrics

To evaluate the quality of the generated topics, we utilize two key metrics: NPMI and diversity.

3.8.1. NPMI for Topic Coherence

NPMI is used to assess the coherence of the top words within each topic. For a given pair of words $w_i$ and $w_j$ within a topic $t_k$, NPMI is calculated as:

$$\text{NPMI}(w_i, w_j) = \frac{\log \frac{P(w_i, w_j)}{P(w_i)\,P(w_j)}}{-\log P(w_i, w_j)}$$

where $P(w_i, w_j)$ represents the probability of co-occurrence of the words $w_i$ and $w_j$, and $P(w_i)$ and $P(w_j)$ are the individual probabilities of occurrence. The overall NPMI score for a topic $t_k$ is the average NPMI over all pairs of top-$N$ words in the topic:

$$\text{NPMI}(t_k) = \frac{1}{\binom{N}{2}} \sum_{i < j} \text{NPMI}(w_i, w_j)$$

where $N$ is the number of top words in the topic.
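The following sketch computes pairwise and topic-level NPMI with probabilities estimated at the document level (the fraction of documents containing the word or pair); estimating them over sliding windows, as some coherence toolkits do, is a common variant. The toy corpus is invented.

```python
# Pairwise and topic-level NPMI from document-level co-occurrence counts.
import math

def npmi(wi, wj, docs):
    n = len(docs)
    p_i = sum(wi in d for d in docs) / n
    p_j = sum(wj in d for d in docs) / n
    p_ij = sum(wi in d and wj in d for d in docs) / n
    if p_ij == 0.0:
        return -1.0  # never co-occur: complete negative association
    if p_ij == 1.0:
        return 1.0   # co-occur in every document: complete positive association
    return math.log(p_ij / (p_i * p_j)) / -math.log(p_ij)

def topic_npmi(top_words, docs):
    """Average NPMI over all C(N, 2) pairs of a topic's top-N words."""
    pairs = [(a, b) for i, a in enumerate(top_words) for b in top_words[i + 1:]]
    return sum(npmi(a, b, docs) for a, b in pairs) / len(pairs)

docs = [set(t.split()) for t in ["traffic safety rules",
                                 "traffic safety campaign",
                                 "solar energy subsidy"]]
print(topic_npmi(["traffic", "safety"], docs))  # 1.0: the pair always co-occurs
```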

3.8.2. Topic Diversity

Diversity measures the distinctiveness of the top-$N$ words across all topics. Let $T = \{t_1, t_2, \ldots, t_m\}$ represent the set of topics, each consisting of the top-$N$ words. The diversity of the topics is defined as:

$$\text{Diversity}(T) = \frac{\text{Number of unique words in } T}{m \times N}$$

where $m$ is the number of topics, and $N$ is the number of top words per topic. A higher diversity score indicates that the topics are more distinct from each other.
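Topic diversity reduces to a one-line ratio of unique to total top words, sketched here on two invented topics:

```python
# Diversity(T): fraction of unique words among the m * N top words.
def diversity(topics_top_words):
    """topics_top_words: list of m lists, each one topic's top-N words."""
    all_words = [w for topic in topics_top_words for w in topic]
    return len(set(all_words)) / len(all_words)

# "road" appears in both invented topics, so diversity is 5/6.
print(diversity([["traffic", "safety", "road"], ["energy", "solar", "road"]]))
```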

3.9. Refinements

To further enhance the quality of the generated topics, we incorporate additional refinements, including the use of TF-IDF for weighting words and cosine similarity for assessing document similarity. The TF-IDF score for a word $w$ in a document $d$ is computed as:

$$\text{TF-IDF}(w, d) = \text{TF}(w, d) \times \log \frac{|D|}{|\{d' \in D : w \in d'\}|}$$

where $\text{TF}(w, d)$ is the term frequency of word $w$ in document $d$, $|D|$ is the total number of documents, and $|\{d' \in D : w \in d'\}|$ is the number of documents containing $w$.
In addition, we calculate the cosine similarity between document embeddings $v_i$ and $v_j$ as:

$$\text{CosineSim}(v_i, v_j) = \frac{v_i \cdot v_j}{\|v_i\| \, \|v_j\|}$$

This similarity measure allows us to assess the closeness of documents within the same topic and across different topics.
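Both refinements are available off the shelf; the sketch below uses scikit-learn, whose TfidfVectorizer applies a smoothed variant of the IDF formula above by default. Note that BERTopic itself uses a class-based TF-IDF, so this illustrates the general technique rather than our exact weighting, and the two documents are invented.

```python
# TF-IDF weighting and cosine similarity with scikit-learn (smoothed IDF
# by default, a slight variant of the formula above).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["improve road safety near schools",
        "road safety education campaign"]
tfidf = TfidfVectorizer().fit_transform(docs)
print(cosine_similarity(tfidf[0], tfidf[1]))  # closeness of the two documents
```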

3.10. Problem Definition

  • Concept Selection
In the context of our text-mining system for e-participation platforms, we define concept selection as the process of isolating documents (proposals) labeled by certain concepts (e.g., themes or categories in the proposal). Given a collection of documents $D$ and a set of concepts $K$, we define the subset of documents labeled by all the concepts in $K$ as $D/K$. When focusing on a single concept $k$, we use the notation $D/k$.
Definition 1 (Concept Selection).
$$D/K = \{d \in D : d \text{ is labeled with all concepts in } K\}$$
This operation is essential for identifying groups of proposals that belong to specific thematic areas on the e-participation platforms (e.g., proposals tagged as “transportation”, “environment”, etc.).
  • Concept Proportion
To analyze the proportion of documents labeled by a specific concept, we define the concept proportion. This allows us to calculate the fraction of documents within a collection that are associated with a given concept or set of concepts.
Definition 2 (Concept Proportion). The fraction $f(D, k)$ of documents in $D$ labeled by a concept $k$ is:
$$f(D, k) = \frac{|D/k|}{|D|}$$
where $|D/k|$ is the number of documents labeled with concept $k$, and $|D|$ is the total number of documents in the collection. For example, $f(D, \text{environment})$ could represent the proportion of proposals labeled with the “environment” concept.
  • Conditional Concept Proportion
We often need to compute the conditional concept proportion, which represents the fraction of documents labeled with one concept that are also labeled with another. This is particularly useful for understanding the co-occurrence of themes in proposals.
Definition 3 (Conditional Concept Proportion). If $K_1$ and $K_2$ are sets of concepts, the conditional proportion $f(D, K_1 \mid K_2)$ is the proportion of documents in $D$ labeled by $K_2$ that are also labeled by $K_1$:
$$f(D, K_1 \mid K_2) = \frac{|D/K_2 \cap D/K_1|}{|D/K_2|}$$
For instance, $f(D, \text{environment} \mid \text{transportation})$ represents the proportion of proposals tagged with “transportation” that are also labeled as “environment”.
  • Concept Proportion Distribution
When analyzing a set of thematic concepts across documents, we utilize the concept proportion distribution, which measures the proportion of documents labeled with each concept in a collection.
Definition 4 (Concept Proportion Distribution). For a collection $D$ and a set of concepts $K$, the concept proportion distribution $F_K(D, x)$ assigns a proportion to each concept $x \in K$ as:
$$F_K(D, x) = \frac{|D/x|}{|D|}$$
This distribution helps identify the spread of thematic topics across all documents. For instance, $F_{\text{topics}}(D, x)$ would represent the proportion of proposals associated with each thematic category under a concept hierarchy such as “topics”.
  • Average Concept Proportion
To compare thematic areas across different subcategories (e.g., analyzing a group of platforms or specific timeframes), we define the average concept proportion, which provides an unweighted average of concept proportions for a set of sibling nodes in a hierarchy.
Definition 5 (Average Concept Proportion). For a collection $D$, concept $k$, and an internal node $n$ in the hierarchy, the average concept proportion $a(D, k \mid n)$ is the average value of $f(D, k \mid k')$, where $k'$ varies over all immediate children of $n$:
$$a(D, k \mid n) = \operatorname{Avg}_{k' \text{ is a child of } n} f(D, k \mid k')$$
This provides a summary of the typical concept proportion for child nodes of $n$. For example, $a(D, \text{transportation} \mid \text{urban})$ might give the average concept proportion of transportation-related proposals within the urban-related nodes. A code sketch of the selection and proportion operations defined above is given below.
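The sketch models each document as a pair of an identifier and a set of concept labels and transcribes Definitions 1-3 directly; the labels are invented examples.

```python
# Definitions 1-3 as set operations over labeled documents.
def select(docs, concepts):
    """D/K: documents labeled with every concept in K (Definition 1)."""
    return [d for d, labels in docs if set(concepts) <= labels]

def proportion(docs, concept):
    """f(D, k): fraction of documents labeled with k (Definition 2)."""
    return len(select(docs, [concept])) / len(docs)

def conditional_proportion(docs, k1, k2):
    """f(D, K1 | K2): share of K2-labeled docs also labeled K1 (Definition 3)."""
    d_k2 = [(d, labels) for d, labels in docs if set(k2) <= labels]
    return len(select(d_k2, k1)) / len(d_k2)

docs = [("p1", {"transportation", "environment"}),
        ("p2", {"transportation"}),
        ("p3", {"education"})]
print(proportion(docs, "transportation"))                                # 2/3
print(conditional_proportion(docs, ["environment"], ["transportation"]))  # 1/2
```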

3.11. Evaluation

To evaluate the coherence of topics generated by embedding models, we employed Normalized Pointwise Mutual Information (NPMI), a widely used metric for measuring word associations within topics. NPMI captures the semantic coherence of a set of words that form a topic, based on how frequently these words co-occur in the corpus. It ranges from −1 (complete negative association) to 1 (complete positive association), where higher values indicate better coherence among words in a topic.
The formula for NPMI is given by:

$$\text{NPMI}(w_i, w_j) = \frac{\log \frac{P(w_i, w_j)}{P(w_i)\,P(w_j)}}{-\log P(w_i, w_j)}$$

where $P(w_i, w_j)$ is the probability of words $w_i$ and $w_j$ co-occurring within a specific window in the corpus, and $P(w_i)$ and $P(w_j)$ represent the individual probabilities of occurrence for words $w_i$ and $w_j$.
NPMI serves as an important measure in determining whether the words grouped under a topic make sense as a coherent, interpretable concept. This metric was crucial in our analysis, allowing us to assess the quality of topics generated by different embedding models on both the iVoting and Join platforms.
In addition to coherence, topic diversity is another essential factor that ensures the uniqueness of topics in a model. We evaluated diversity by examining the top 25 words of each topic cluster. To quantify diversity, we ensure that each cluster presents words that are not highly repetitive across different topics. The diversity score can be defined as the proportion of unique words across all topics, which can be calculated using:
$$\text{Diversity} = \frac{\text{Number of unique words across all topics}}{\text{Total number of words across all topics}}$$
A higher diversity score indicates that the topics generated by the model are more distinct from each other, ensuring that the top words in one topic do not overly overlap with those in other topics. This metric is particularly useful in ensuring that the clustering provides a rich, varied representation of the underlying textual data, which is essential for e-participation platforms to capture a wide range of public opinions and themes.
Both NPMI and diversity measures were applied to evaluate the quality of topics generated by the embedding models, providing insights into the balance between coherence and topic uniqueness in the results.
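In practice, NPMI coherence need not be hand-rolled; gensim's CoherenceModel provides a reference implementation (coherence="c_npmi"), sketched here on an invented toy corpus:

```python
# NPMI coherence via gensim's reference implementation; corpus and topics
# are invented toy data.
from gensim.corpora import Dictionary
from gensim.models.coherencemodel import CoherenceModel

texts = [["traffic", "safety", "rules"],
         ["traffic", "safety", "campaign"],
         ["solar", "energy", "subsidy"]]
topics = [["traffic", "safety"], ["solar", "energy"]]

cm = CoherenceModel(topics=topics, texts=texts,
                    dictionary=Dictionary(texts), coherence="c_npmi")
print(cm.get_coherence())  # mean NPMI over the two topics
```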

4. Results

This section presents the results of our analysis of text-mining techniques applied to e-participation platforms, specifically focusing on the iVoting and Join platforms. We employed various embedding models to analyze uncategorized online proposals and their thematic shifts. The performance of these embeddings was evaluated using two primary metrics: the Normalized Pointwise Mutual Information (NPMI) and computational time. Through this analysis, we aimed to identify the most effective embedding models for extracting meaningful insights from the text data on these platforms, thereby contributing to the understanding of public participation in decision-making processes.

4.1. Topic Models

The topic model of the Join platform can be seen in Figure 4. The resulting clusters reveal dominant themes in public discourse, highlighting major areas of concern and policy engagement over the years. The visualization in Figure 4 presents a scatter plot of the clustered topics, while Table 1 provides the corresponding topic distribution. The clustering results show that Government Policies and Regulations forms the largest category; this corresponds to the noise cluster, since proposals that do not fit a more specific theme fall into this general category. Education Policies in Taiwan and Traffic Safety Regulations also emerge as major areas of discussion, indicating public interest in transportation safety and educational reforms. Other notable themes include Energy Transition and Sustainable Development, Labor Rights and Regulations, and Electoral Reform, highlighting concerns related to governance, sustainability, and social policies. To complement the Join platform’s topic clusters, we analyzed discussions within the iVoting platform. The results demonstrate a parallel focus on governance, sustainability, and specific policy issues, with Governance and Sustainability in Taiwan, Taipei Public Transport Infrastructure, and Traffic Management being dominant themes.
The topic distribution suggests that public discourse on these platforms is highly focused on regulatory policies, governance, and sustainability, with certain issue areas, such as traffic safety and epidemic prevention, standing out as critical concerns. The differences between Join and iVoting clusters indicate that while both platforms capture governance-related themes, iVoting discussions tend to be more issue-specific, reflecting citizen engagement on more targeted topics.
To complement the topic clustering analysis, Figure 5 illustrates the temporal evolution of the five most prevalent topics on the Join platform. The trends indicate fluctuations in public interest, with notable peaks around significant events such as policy changes, elections, or crises. For instance, Reforms in Taiwanese Education and Taiwan’s Healthcare Response to Severe Epidemic exhibit marked surges during specific periods, reflecting public discourse dynamics. The trends also highlight the sustained relevance of issues like Sustainable Energy and Environmental Protection and Road Safety and Traffic Management, suggesting continued public concern and policy engagement over the years.
Figure 6 illustrates the comparison of different embeddings on the iVoting platform in terms of NPMI. The results indicate that the all-MiniLM-L6-v2 embedding consistently outperformed other models across varying topic counts, with the highest NPMI values observed at lower topic counts. Conversely, embeddings from the models published by the Beijing Academy of Artificial Intelligence (BAAI)—BAAI/bge-large-en-v1.5 and BAAI/bge-small-en-v1.5—exhibited lower NPMI scores, particularly as the number of topics increased. This suggests that these models may be less effective at capturing the underlying semantic structure. The shaded areas around the curves represent the confidence intervals, highlighting the stability of the different models across multiple trials.
In terms of computational efficiency, Figure 7 displays the average computation time required for each embedding model on the iVoting platform. The results reveal that the BAAI/bge-base-en-v1.5 and BAAI/bge-small-en-v1.5 embeddings maintained relatively low computation times, ranging between 4 and 5 s, regardless of the number of topics. In contrast, the all-MiniLM-L6-v2 embedding showed increased computation times, particularly at higher topic counts, although it remained below 7 s. This suggests a trade-off between embedding performance and computational efficiency.
Figure 9 further explores the computational time associated with different embeddings for the Join platform. Notably, all embeddings exhibited a consistent computation time that remained stable across various topic counts, with the all-mpnet-base-v2 and BAAI/bge-large-en-v1.5 embeddings exhibiting slightly higher times compared to the other models.
Figure 6 presents a comparison of the performance of different sentence embedding models in topic modeling for the iVoting platform, evaluated using the Normalized Pointwise Mutual Information (NPMI) metric. The embeddings tested include all-MiniLM-L6-v2, all-mpnet-base-v2, BAAI/bge-base-en-v1.5, BAAI/bge-large-en-v1.5, and BAAI/bge-small-en-v1.5, with the number of topics varying from 10 to 50. Among the models, BAAI/bge-base-en-v1.5 consistently achieves the highest NPMI scores, indicating superior topic coherence, particularly as the number of topics increases beyond 20. In contrast, all-mpnet-base-v2 and BAAI/bge-large-en-v1.5 exhibit lower coherence across most topic numbers, with notable drops in performance when the number of topics exceeds 30. The all-MiniLM-L6-v2 model performs relatively stably but remains slightly behind BAAI/bge-base-en-v1.5 in overall coherence. While BAAI/bge-small-en-v1.5 shows more variability in its performance, it demonstrates a recovery in NPMI as the number of topics increases beyond 30. These results suggest that BAAI/bge-base-en-v1.5 is the most effective embedding model for thematic analysis on the iVoting platform, providing the highest coherence across various topic configurations. Similarly, Figure 8 compares the different embeddings on the Join platform.
Figure 7 illustrates the computational time required by various embedding models during topic modeling on the iVoting platform, as a function of the number of topics. The models evaluated include all-MiniLM-L6-v2, all-mpnet-base-v2, BAAI/bge-base-en-v1.5, BAAI/bge-large-en-v1.5, and BAAI/bge-small-en-v1.5. Among these models, BAAI/bge-large-en-v1.5 consistently exhibits the longest computational times, with a noticeable peak when the number of topics is around 20, reaching approximately 7 s. In contrast, the BAAI/bge-small-en-v1.5 model demonstrates the lowest computational times, staying below 4 s across all topic configurations. The all-mpnet-base-v2 model starts with relatively high computation times around 5 s but stabilizes as the number of topics increases. The all-MiniLM-L6-v2 model shows a gradual decrease in computational time as the number of topics increases, performing similarly to BAAI/bge-small-en-v1.5 at higher topic counts. BAAI/bge-base-en-v1.5 remains stable, with moderate computation times slightly above 4 s. Overall, these results indicate a clear trade-off between computational efficiency and model size, where smaller models like BAAI/bge-small-en-v1.5 offer significantly faster processing times while larger models such as BAAI/bge-large-en-v1.5 demand more computational resources.
In summary, the computational time results on Join suggest that larger models like BAAI/bge-large-en-v1.5 incur significantly higher computational costs, while smaller models, particularly BAAI/bge-small-en-v1.5, offer much faster processing times, making them potentially more suitable for large-scale or time-sensitive e-participation tasks.
Figure 9 displays the computational time required by different embedding models for topic modeling on the Join platform, plotted against the number of topics. The models under comparison include all-MiniLM-L6-v2, all-mpnet-base-v2, BAAI/bge-base-en-v1.5, BAAI/bge-large-en-v1.5, and BAAI/bge-small-en-v1.5. In contrast to the iVoting dataset, the BAAI/bge-large-en-v1.5 model takes the longest computational time consistently across all topic numbers, exceeding 100 s. The BAAI/bge-base-en-v1.5 follows at a stable computational time of approximately 40 s, while BAAI/bge-small-en-v1.5 performs significantly better in terms of computational efficiency, taking less than 20 s. The all-mpnet-base-v2 model starts around 60 s for 10 topics and gradually decreases to about 55 s with an increasing number of topics, showing some efficiency improvements with larger topic numbers. The all-MiniLM-L6-v2 model, while relatively stable, sits close to the performance of BAAI/bge-small-en-v1.5, maintaining a time between 15 and 20 s.
Table 2 presents the evaluation metrics for different topic-modeling approaches across multiple datasets. Across all datasets, Top2Vec consistently achieves the highest NPMI scores, indicating strong topic coherence, while also maintaining high diversity, particularly in the iVoting dataset where it reaches the maximum diversity score of 1. The proposed approach based on BERTopic performs competitively, particularly on the BBC News and Trump datasets, achieving NPMI values of 0.167 and 0.066, respectively, with relatively high diversity. CTM_C also demonstrates strong diversity scores across most datasets but lags slightly behind Top2Vec in terms of coherence. LDA and NMF, while traditional topic-modeling methods, generally exhibit lower NPMI scores and more varied diversity results, suggesting weaker topic coherence. Notably, the Trump dataset shows negative NPMI values for LDA and Top2Vec, indicating potential challenges in extracting meaningful topics from this dataset. These results highlight the trade-offs between coherence and diversity across different models and datasets, emphasizing the advantages of transformer-based and deep learning-based approaches over traditional methods.

4.2. Human Validation of Topic Coherence

To further assess the correctness of the extracted topics, a human validation experiment was conducted with domain experts. A total of 110 topics were reviewed, and each was categorized as either correct, incorrect, or uncertain. The results indicate that 81.82% of the topics were deemed correct, 10.91% were identified as incorrect, and 7.27% fell into the uncertain category, as shown in Figure 10.
These findings suggest that the topic-modeling approach effectively captures meaningful themes in e-participation discussions. However, the presence of incorrect and uncertain topics highlights areas for potential improvement. The uncertain category suggests that some topics may lack clear semantic boundaries or require further refinement. Future work will extend this validation by incorporating a broader panel of domain experts and employing inter-annotator agreement measures to improve the robustness of the evaluation.
Further, the effectiveness of the topic-modeling pipeline was evaluated using the BBC dataset [29], which consists of 2225 articles labeled into five thematic groups: business, entertainment, politics, sport, and tech. The labels were encoded using LabelEncoder from the scikit-learn library, and the dataset was split into training and testing sets with an 80-20 ratio.
The pipeline was trained on the training set and evaluated on the held-out test set. The following hyperparameters were used: top_n_words = 10; UMAP with n_neighbors = 15, n_components = 2, and metric = 'cosine'. To assess the model’s performance, several evaluation metrics were computed. The Adjusted Rand Index (ARI) [30] measured the agreement between the predicted clusters and the true labels, yielding an ARI of 0.888. The Normalized Mutual Information (NMI) quantified the mutual information between the predicted clusters and the true labels, achieving an NMI of 0.874. The accuracy, representing the proportion of correctly classified documents, was 0.953. Additionally, precision, recall, and F1-score were calculated, which were 0.955, 0.953, and 0.952, respectively.
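The metric computations map directly onto scikit-learn; in the sketch below the documents, labels, and predictions are toy stand-ins (the prediction step, i.e., mapping BERTopic clusters to label ids, is omitted), so only the evaluation calls themselves are shown.

```python
# Evaluation calls with scikit-learn; data and predictions are toy stand-ins.
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import (adjusted_rand_score, normalized_mutual_info_score,
                             accuracy_score, precision_recall_fscore_support)

docs = ["stock markets rise", "election results", "cup final win",
        "new phone launch", "film festival opens", "tax policy vote"]
raw_labels = ["business", "politics", "sport", "tech", "entertainment",
              "politics"]

y = LabelEncoder().fit_transform(raw_labels)
X_train, X_test, y_train, y_test = train_test_split(docs, y, test_size=0.33,
                                                    random_state=42)
y_pred = y_test  # stand-in for predictions from the trained pipeline

print("ARI:", adjusted_rand_score(y_test, y_pred))
print("NMI:", normalized_mutual_info_score(y_test, y_pred))
print("Accuracy:", accuracy_score(y_test, y_pred))
p, r, f1, _ = precision_recall_fscore_support(y_test, y_pred,
                                              average="weighted")
print(f"Precision {p:.3f}, Recall {r:.3f}, F1 {f1:.3f}")
```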
The results were visualized using a UMAP projection of the test-set embeddings, as shown in Figure 11. The scatter plot highlights the true and predicted labels, providing insights into the model’s ability to separate the classes in the embedding space. Misclassified samples were also analyzed to identify potential limitations of the model. Some documents belonging to overlapping topics (e.g., business and politics) were harder to classify accurately. Additionally, a confusion matrix was visualized to show the model’s performance (Figure 12).

5. Discussion

The findings indicate that while the all-MiniLM-L6-v2 embedding provides superior performance in terms of NPMI, it may require additional computational resources, particularly as the number of topics increases. In contrast, the BAAI/bge-base-en-v1.5 and BAAI/bge-small-en-v1.5 embeddings, while less effective in capturing semantic relationships, offer improved efficiency, making them suitable for applications where computational resources are limited.
One of the key observations is the impact of embedding model size on computational efficiency and topic coherence. Larger models, such as BAAI/bge-large-en-v1.5, tend to achieve higher coherence in some cases but come at the cost of significantly longer computation times. This trade-off highlights the need for careful consideration of the specific requirements of e-participation platforms. For instance, platforms operating in resource-constrained environments may prioritize smaller, more efficient embeddings, even if it means sacrificing some degree of semantic accuracy.
An additional point of interest is the effect of multilingual datasets on topic coherence. While Huang et al. [13] highlight that little is known about the participation platform Join due to language barriers, our translation pipeline demonstrates that efficient NLP applications can help analyze large volumes of proposals in Chinese and synthesize their thematic content. The major strands of discourse on Join and iVoting were extracted, showing that both platforms share topics of common concern among their users. However, the translation of proposals into English before applying neural topic models may introduce noise, potentially lowering the coherence scores. This effect warrants further investigation, particularly in platforms with high linguistic diversity. Future studies could explore techniques such as fine-tuning multilingual embeddings to better handle translation artifacts and maintain semantic integrity across languages.
The temporal analysis of topics, as shown in Figure 5, also provides valuable insights into the evolving interests of users on e-participation platforms. For example, spikes in topics related to public health, such as COVID-19 responses, align with significant global events, underscoring the potential of such platforms to reflect and adapt to real-time societal needs. Understanding these shifts can help policymakers identify emerging concerns and allocate resources more effectively.
Moreover, the results emphasize the importance of confidence intervals in evaluating model performance. The variability observed in models like BAAI/bge-small-en-v1.5 suggests that embedding stability across multiple trials is a critical factor in determining their suitability for deployment. By incorporating robustness checks, practitioners can ensure more reliable and interpretable results in their analyses.
From a methodological perspective, this study also demonstrates the utility of combining coherence metrics like NPMI with computational efficiency measures to provide a holistic evaluation of embedding models. While coherence reflects the semantic quality of topics, computational efficiency is crucial for scaling these methods to larger datasets and real-time applications. Future work could integrate additional metrics, such as diversity or specificity of topics, to further enhance the evaluation framework.
The high ARI and NMI scores on the BBC dataset indicate the strong clustering performance of the proposed method, while the high accuracy and F1-score demonstrate the effective mapping of predicted topics to true labels. Future work will focus on hyperparameter tuning and refining topic representations to improve performance on overlapping topics. The confusion matrix analysis suggests that misclassifications tend to occur in topics that share thematic content, which can be addressed by further model refinement.
Finally, these findings contribute to the broader discourse on the use of artificial intelligence in public governance. By leveraging neural topic models, e-participation platforms can enable more nuanced analyses of citizen input, ultimately fostering greater transparency and inclusivity. However, as these platforms become more integral to democratic processes, ensuring the ethical and equitable application of AI techniques remains a critical challenge. Addressing issues such as algorithmic bias and accessibility will be essential to fully realize the potential of AI-enhanced public engagement.

6. Conclusions

In this study, we analyzed various embedding models applied to text mining within e-participation platforms, specifically focusing on iVoting and Join. Our findings reveal that the all-MiniLM-L6-v2 embedding demonstrated the highest efficacy in capturing meaningful semantic relationships through improved NPMI scores, despite incurring higher computational costs. Conversely, embeddings such as BAAI/bge-base-en-v1.5 and BAAI/bge-small-en-v1.5 exhibited lower performance metrics but offered a more efficient computational profile, making them viable alternatives for applications requiring speed and resource efficiency. Additionally, this study has shown that the proposed pipeline is applicable to other datasets, such as the BBC News dataset [29], and can yield valuable insights on e-participation platforms while overcoming the language barriers present in iVoting and Join.
This analysis underscores the necessity for practitioners and researchers in the field of e-participation to carefully consider their choice of embedding models based on the specific context and objectives of their text-mining efforts. As e-participation platforms increasingly play a role in democratic governance, the insights derived from effective text-mining techniques can significantly enhance the understanding of public sentiment and the thematic evolution of proposals over time. Future research should continue to explore advanced embedding techniques and hybrid models to refine the balance between performance and computational efficiency, ultimately contributing to more effective public engagement and decision-making processes.
Despite these contributions, several limitations remain. First, the presence of noise in the dataset poses challenges in accurately classifying proposals, and future work should explore methods to refine classifications with greater precision. Second, LLMs generate unfiltered text, which may introduce ambiguous or politically sensitive topic labels. Addressing this issue requires incorporating filtering mechanisms to improve the interpretability of generated topics. Third, proposals often exhibit semantic overlap, meaning they can belong to multiple categories simultaneously. This ambiguity affects both traditional clustering approaches and soft clustering techniques. Further research is needed to develop evaluation metrics that can effectively assess topic models in cases where soft clustering is necessary.

Author Contributions

M.S.: Investigation, Methodology, Software, Visualization, Writing—Original Draft. J.F.: Writing—Review and Editing, Investigation, Methodology, Conceptualization. S.-Y.C.: Conceptualization, Supervision, Resources, Funding Acquisition, Project Administration. Y.-L.K.: Supervision, Resources. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by two institutions: (1) The National Science and Technology Council of the Republic of China (Taiwan) under grant NSTC 110-2221-E-011-136-MY3 and the Smart Manufacturing Innovation Center as part of the Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Science and Technology (MOST) in Taiwan under grant 109-2221-E-011-069. (2) By the German Federal Ministry of Education and Research (BMBF) funded project “Worldmaking from a Global Perspective: A Dialogue with China”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Some of the data are publicly available; other data and the code for the experiments will be made available by the authors.

Acknowledgments

We would like to thank JOIN and iVoting for publishing the data and therefore enabling us to conduct this study. This article is a revised and expanded version of a paper entitled “Text-Mining of E-Participation Platforms: Applying Topic Modeling on Join and iVoting in Taiwan”, which was presented at TE2024: Engineering for Social Change, London, UK, 9–11 July 2024.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Dumas, C.L.; LaManna, D.; Harrison, T.M.; Ravi, S.S.; Kotfila, C.; Gervais, N.; Hagen, L.; Chen, F. Examining Political Mobilization of Online Communities through E-Petitioning Behavior in We the People. Big Data Soc. 2015, 2, 2053951715598170. [Google Scholar] [CrossRef]
  2. Park, C.H.; Johnston, E.W. An Event-Driven Lens for Bridging Formal Organizations and Informal Online Participation: How Policy Informatics Enables Just-in-Time Responses to Crises. In Policy Analytics, Modelling, and Informatics: Innovative Tools for Solving Complex Social Problems; Gil-Garcia, J.R., Pardo, T.A., Luna-Reyes, L.F., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 343–361. [Google Scholar] [CrossRef]
  3. Hagen, L. Content Analysis of E-Petitions with Topic Modeling: How to Train and Evaluate LDA Models? Inf. Process. Manag. 2018, 54, 1292–1307. [Google Scholar] [CrossRef]
  4. Chen, C.H.; Liu, C.L.; Hui, B.P.H.; Chung, M.L. Does Education Background Affect Digital Equal Opportunity and the Political Participation of Sustainable Digital Citizens? A Taiwan Case. Sustainability 2020, 12, 1359. [Google Scholar] [CrossRef]
  5. Rodríguez Bolívar, M.P. Policy Makers’ Perceptions About Social Media Platforms for Civic Engagement in Public Services. An Empirical Research in Spain. In Policy Analytics, Modelling, and Informatics: Innovative Tools for Solving Complex Social Problems; Gil-Garcia, J.R., Pardo, T.A., Luna-Reyes, L.F., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 267–288. [Google Scholar] [CrossRef]
  6. Arnstein, S.R. A Ladder Of Citizen Participation. J. Am. Inst. Planners 1969, 35, 216–224. [Google Scholar] [CrossRef]
  7. Huesmann, C.; Renkamp, A. Digitale Bürgerdialoge—Eine Chance für die lokale Demokratie. Available online: https://www.bertelsmann-stiftung.de/doi/10.11586/2021015 (accessed on 30 December 2024).
  8. Fischer, D.; Brändle, F.; Mertes, A.; Pleger, L.E.; Rhyner, A.; Wulf, B. Partizipation im digitalen Staat: Möglichkeiten und Bedeutung digitaler und analoger Partizipationsinstrumente im Vergleich. Swiss Yearb. Adm. Sci. 2020, 11, 129–144. [Google Scholar] [CrossRef]
  9. Dienel, H.L.; von Blanckenburg, C.; Bach, N. Mini Publics Online: Erfahrungen mit Online-Bürgerräten und Online-Planungszellen. In Handbuch Digitalisierung und Politische Beteiligung; Kersting, N., Radtke, J., Baringhorst, S., Eds.; Springer Fachmedien Wiesbaden: Wiesbaden, Germany, 2020; pp. 1–16. [Google Scholar] [CrossRef]
  10. Liao, D.c.; Chen, B. Strengthening Democracy: Development of the iVoter Website in Taiwan. In Political Behavior and Technology: Voting Advice Applications in East Asia; Liao, D.c., Chen, B., Jensen, M.J., Eds.; Palgrave Macmillan US: New York, NY, USA, 2016; pp. 67–89. [Google Scholar] [CrossRef]
  11. Rauchfleisch, A.; Tseng, T.H.; Kao, J.J.; Liu, Y.T. Taiwan’s Public Discourse About Disinformation: The Role of Journalism, Academia, and Politics. J. Pract. 2023, 17, 2197–2217. [Google Scholar] [CrossRef]
  12. Liebeck, M.; Esau, K.; Conrad, S. Text Mining Für Online-Partizipationsverfahren: Die Notwendigkeit Einer Maschinell Unterstützten Auswertung. HMD Prax. Wirtsch. 2017, 54, 544–562. [Google Scholar] [CrossRef]
  13. Huang, H.Y.; Kovacs, M.; Kryssanov, V.; Serdült, U. Towards a Model of Online Petition Signing Dynamics on the Join Platform in Taiwan. In Proceedings of the 2021 Eighth International Conference on eDemocracy & eGovernment (ICEDEG), Quito, Ecuador, 28–30 July 2021; pp. 199–204. [Google Scholar] [CrossRef]
  14. Tang, A. A Young Democracy Is a Strong Democracy: Civil Rights of Taiwan’s Children. 2020. Available online: https://freedomreport.5rightsfoundation.com/protecting-children-online-the-past-present-and-future (accessed on 17 February 2025).
  15. Hsiao, H. ICT-mixed community participation model for development planning in a vulnerable sandbank community: Case study of the Eco Shezi Island Plan in Taipei City, Taiwan. Int. J. Disaster Risk Reduct. 2021, 58, 102218. [Google Scholar] [CrossRef]
  16. Lee, C.p.; Chen, D.Y.; Huang, T.y. The Interplay Between Digital and Political Divides. Soc. Sci. Comput. Rev. 2014, 32, 37–55. [Google Scholar] [CrossRef]
  17. Törnberg, P. How to Use LLMs for Text Analysis. arXiv 2023, arXiv:2307.13106. [Google Scholar]
  18. Sontheimer, M.; Fahlbusch, J.; Korjakow, T.; Chou, S.Y. Text-Mining of E-Participation Platforms: Applying Topic Modeling on Join and iVoting in Taiwan. In Proceedings of the TE2024: Engineering for Social Change, London, UK, 9–11 July 2024; Cooper, A., Ed.; pp. 105–112. [Google Scholar] [CrossRef]
  19. Lapesa, G.; Blessing, A.; Blokker, N.; Dayanik, E.; Haunss, S.; Kuhn, J.; Padó, S. Analysis of Political Debates through Newspaper Reports: Methods and Outcomes. Datenbank-Spektrum 2020, 20, 143–153. [Google Scholar] [CrossRef]
  20. Dayanik, E.; Blessing, A.; Blokker, N.; Haunss, S.; Kuhn, J.; Lapesa, G.; Pado, S. Improving Neural Political Statement Classification with Class Hierarchical Information. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, 22–27 May 2022; pp. 2367–2382. [Google Scholar] [CrossRef]
  21. Grootendorst, M. BERTopic: Neural Topic Modeling with a Class-Based TF-IDF Procedure. arXiv 2022, arXiv:2203.05794. [Google Scholar]
  22. Malzer, C.; Baum, M. A Hybrid Approach To Hierarchical Density-based Cluster Selection. In Proceedings of the 2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), Karlsruhe, Germany, 14–16 September 2020; pp. 223–228. [Google Scholar] [CrossRef]
  23. Roberts, M.E.; Stewart, B.M.; Tingley, D.; Airoldi, E. The Structural Topic Model and Applied Social Science. In Proceedings of the International Conference on Neural Information Processing, Lake Tahoe, NV, USA, 5–10 December 2013. [Google Scholar]
  24. Doan, T.N.; Hoang, T.A. Benchmarking Neural Topic Models: An Empirical Study. In Proceedings of the Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Online Event, 1–6 August 2021; pp. 4363–4368. [Google Scholar]
  25. Moodley, A.; Marivate, V. Topic Modelling of News Articles for Two Consecutive Elections in South Africa. In Proceedings of the 2019 6th International Conference on Soft Computing & Machine Intelligence (ISCMI), Johannesburg, South Africa, 19–20 November 2019; pp. 131–136. [Google Scholar] [CrossRef]
  26. Li, Z.; Mao, A.; Stephens, D.; Goel, P.; Walpole, E.; Dima, A.; Fung, J.; Boyd-Graber, J. Improving the TENOR of Labeling: Re-evaluating Topic Models for Content Analysis. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), St. Julian’s, Malta, 17–22 March 2024; Graham, Y., Purver, M., Eds.; Association for Computational Linguistics: St. Julian’s, Malta, 2024; pp. 840–859. [Google Scholar]
  27. Terragni, S.; Fersini, E. An Empirical Analysis of Topic Models: Uncovering the Relationships between Hyperparameters, Document Length and Performance Measures. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), Online, 1–3 September 2021; pp. 1408–1416. [Google Scholar]
  28. Terragni, S.; Fersini, E.; Galuzzi, B.G.; Tropeano, P.; Candelieri, A. OCTIS: Comparing and Optimizing Topic Models Is Simple! In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, Online, 19–23 April 2021; pp. 263–270. [Google Scholar] [CrossRef]
  29. Bose, B. BBC News Classification; Kaggle: San Francisco, CA, USA, 2019; Available online: https://kaggle.com/competitions/learn-ai-bbc (accessed on 1 February 2025).
  30. Halkidi, M.; Batistakis, Y.; Vazirgiannis, M. Cluster validity methods: Part I. ACM Sigmod Rec. 2002, 31, 40–45. [Google Scholar] [CrossRef]
Figure 1. Number of proposals submitted on Join from 2015 to 2025.
Figure 2. Number of proposals submitted on iVoting from 2017 to 2022.
Figure 3. Proposed flow of our topic modeling pipeline.
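To make the flow in Figure 3 concrete, the sketch below assembles a minimal BERTopic-style workflow [21]. It is an illustrative sketch rather than this study’s exact configuration: the model names and hyperparameters are assumptions, and `docs` is assumed to already hold proposals machine-translated from Chinese to English by the translation step of the pipeline.

```python
# Minimal sketch of a Figure 3-style workflow (illustrative; not the
# paper's exact configuration). `docs` is assumed to hold proposal texts
# already machine-translated from Chinese to English.
from sentence_transformers import SentenceTransformer
from umap import UMAP
from hdbscan import HDBSCAN
from bertopic import BERTopic

docs = [
    "Extend the high-speed rail line to Pingtung.",
    "Lower the voting age in local referendums.",
    # ... one entry per translated proposal
]

embedding_model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model
umap_model = UMAP(n_neighbors=15, n_components=5, metric="cosine")
hdbscan_model = HDBSCAN(min_cluster_size=10, prediction_data=True)

topic_model = BERTopic(
    embedding_model=embedding_model,
    umap_model=umap_model,
    hdbscan_model=hdbscan_model,
)
topics, probs = topic_model.fit_transform(docs)
print(topic_model.get_topic_info())  # one row per discovered topic
```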
Figure 4. Topic model clustering of citizen participation on the Join platform from 2015 to 2025. Each color represents a distinct thematic cluster, capturing key areas of public discourse.
Figure 5. Evolution of key topics over time on the Join platform from 2015 to 2025. The y-axis represents the frequency of topic mentions, while the x-axis shows the timeline.
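Views like Figures 4 and 5 can be generated from a fitted BERTopic model; the snippet below is a sketch that reuses `topic_model` and `docs` from the pipeline sketch above and assumes one submission timestamp per proposal. The file names and bin count are illustrative.

```python
# Sketch for Figure 4/5-style views, assuming `topic_model` and `docs`
# from the previous sketch plus one submission timestamp per proposal.
topic_model.visualize_documents(docs).write_html("clusters.html")

timestamps = ["2015-09-01", "2016-03-17"]  # one per proposal, in docs order
topics_over_time = topic_model.topics_over_time(docs, timestamps, nr_bins=20)
fig = topic_model.visualize_topics_over_time(topics_over_time, top_n_topics=10)
fig.write_html("topics_over_time.html")
```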
Figure 6. Comparison of different embeddings on iVoting regarding their NPMI and the number of topics per topic model.
Figure 7. Comparison of different embeddings on iVoting regarding their computational time and the number of topics per topic model.
Figure 8. Comparison of different embeddings on Join regarding their NPMI and the number of topics per topic model.
Figure 9. Comparison of different embeddings on Join regarding their computational time and the number of topics per topic model.
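For readers who want to reproduce an NPMI score of the kind plotted in Figures 6–9, the toy example below uses gensim’s coherence implementation; the paper evaluates with OCTIS [28], which exposes the same measure, so this stand-alone version is only an approximation, and the corpus and topic words here are fabricated.

```python
# Toy NPMI computation (a stand-in for the OCTIS evaluation used in the
# paper). The corpus and topic words are fabricated for illustration.
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel

texts = [
    ["traffic", "road", "speed", "driver", "accident"],
    ["school", "education", "student", "teacher", "exam"],
    ["traffic", "driver", "road", "school", "student"],
]
dictionary = Dictionary(texts)
topic_words = [
    ["traffic", "road", "speed", "driver"],
    ["school", "education", "student", "teacher"],
]
npmi = CoherenceModel(
    topics=topic_words, texts=texts, dictionary=dictionary, coherence="c_npmi"
).get_coherence()
print(f"NPMI: {npmi:.3f}")  # -1 = words never co-occur, 1 = always co-occur
```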
Figure 10. The results of the human validation experiment. The majority of topics (81.82%) were classified as correct, while 10.91% were incorrect and 7.27% were uncertain.
Figure 11. UMAP projection of the test-set embeddings. The outer circle colors represent predicted labels, while the inner colors represent true labels.
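A two-dimensional projection like Figure 11 can be sketched as follows; the random arrays are placeholders for the test-set sentence embeddings and topic labels, which are not part of this excerpt.

```python
# Sketch of a Figure 11-style 2-D projection. Random arrays stand in for
# the real test-set embeddings and topic labels.
import numpy as np
import matplotlib.pyplot as plt
from umap import UMAP

rng = np.random.default_rng(42)
embeddings = rng.random((200, 384))  # placeholder sentence embeddings
labels = rng.integers(0, 5, 200)     # placeholder topic labels

points = UMAP(n_components=2, metric="cosine").fit_transform(embeddings)
plt.scatter(points[:, 0], points[:, 1], c=labels, s=8, cmap="tab10")
plt.title("UMAP projection of test-set embeddings")
plt.show()
```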
Figure 12. Confusion matrix showing the classification performance of the model across different topics. The matrix highlights true positives, false positives, true negatives, and false negatives for each class, providing insights into areas of misclassification, particularly in overlapping topics.
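The per-class view in Figure 12 corresponds to a standard confusion matrix; a minimal scikit-learn version, with fabricated label lists in place of the paper’s data, looks like this:

```python
# Minimal confusion-matrix sketch for a Figure 12-style evaluation;
# the label lists are fabricated examples, not the paper's data.
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

y_true = ["transport", "education", "transport", "health", "education"]
y_pred = ["transport", "education", "health", "health", "education"]
classes = ["education", "health", "transport"]

cm = confusion_matrix(y_true, y_pred, labels=classes)
ConfusionMatrixDisplay(cm, display_labels=classes).plot()
plt.show()
```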
Table 1. Cluster counts for Join and iVoting proposals.
| Join Cluster | Count | iVoting Cluster | Count |
|---|---|---|---|
| Government Policies and Regulations | 6710 | Governance and Sustainability in Taiwan | 247 |
| Education Policies in Taiwan | 2702 | Taipei Public Transport Infrastructure | 26 |
| Traffic Safety Regulations | 1947 | Traffic Management | 81 |
| Energy Transition and Sustainable Development | 1294 | Taiwan’s Epidemic Prevention Efforts | 25 |
| Government Response to COVID-19 Epidemic | 1149 | Education Technology and Reform | 25 |
| Labor Rights and Regulations in Taiwan | 997 | Public Servant Conduct and Accountability | 24 |
| Debate over the death penalty in Taiwan | 810 | Waste Management and Recycling in Taiwan | 18 |
| High-Speed Rail Extension in Pingtung, Taiwan | 743 | Taiwan’s Epidemic Prevention Efforts | 15 |
| Electoral Reform | 729 | Labor and Social Welfare in Taiwan | 10 |
| Gender and Military Service | 535 | Inappropriate Content | 8 |
| Nationality and Identity in China | 513 | — | — |
| Pet and Stray Animal Management | 461 | — | — |
| Drunk Driving Laws and Penalties | 427 | — | — |
| Tobacco Control and Smoking Regulations | 416 | — | — |
| Housing Market Regulation | 360 | — | — |
| Parenting and Fertility Support in Taiwan | 289 | — | — |
| Media Regulation and False Information Online | 236 | — | — |
| Lowering the age limit for Obtaining a Motorcycle | 223 | — | — |
| Marriage Laws and Regulations | 205 | — | — |
| Fiscal policies and taxation | 177 | — | — |
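If the clusters are produced by a BERTopic-style model, per-cluster sizes like those in Table 1 can be read directly off the fitted model; this is a hypothetical sketch, not the paper’s exact code, and note that the descriptive labels in Table 1 are curated, whereas the model’s default `Name` column concatenates top keywords.

```python
# Sketch: reading cluster sizes from a fitted BERTopic-style model
# (`topic_model` as in the pipeline sketch above). Table 1's descriptive
# labels are curated; the model's default names are keyword-based.
info = topic_model.get_topic_info()
info = info[info.Topic != -1]  # drop the outlier cluster (topic -1)
print(info[["Name", "Count"]].to_string(index=False))
```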
Table 2. Evaluation metrics for different topic models across various datasets.
| Model | 20 News Group (NPMI / Diversity) | BBC News (NPMI / Diversity) | Join (NPMI / Diversity) | Trump (NPMI / Diversity) | iVoting (NPMI / Diversity) |
|---|---|---|---|---|---|
| LDA | 0.058 / 0.749 | 0.014 / 0.577 | 0.001 / 0.572 | −0.011 / 0.502 | −0.04 / 0.124 |
| CTM_C | 0.096 / 0.886 | 0.094 / 0.819 | 0.033 / 0.862 | 0.009 / 0.855 | −0.24 / 0.746 |
| NMF | 0.089 / 0.663 | 0.012 / 0.549 | 0.061 / 0.322 | 0.009 / 0.379 | −0.04 / 0.343 |
| Top2Vec | 0.192 / 0.823 | 0.171 / 0.792 | 0.081 / 0.898 | −0.169 / 0.658 | −0.16 / 1.000 |
| Proposed Approach | 0.166 / 0.851 | 0.167 / 0.794 | 0.060 / 0.598 | 0.066 / 0.663 | −0.04 / 0.436 |
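The diversity scores in Table 2 are commonly computed as the fraction of unique words among the top-k words of all topics, which mirrors the TopicDiversity measure exposed by OCTIS [28]; a compact version under that assumption:

```python
# Topic diversity as the fraction of unique words across each topic's
# top-k words (1.0 = no word shared between topics, as Top2Vec scores
# on iVoting in Table 2). Assumes the common TopicDiversity definition.
def topic_diversity(topic_words, top_k=10):
    words = [w for topic in topic_words for w in topic[:top_k]]
    return len(set(words)) / len(words)

print(topic_diversity([["tax", "income", "budget"],
                       ["tax", "school", "fee"]], top_k=3))
# -> 0.833...  (5 unique words out of 6)
```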
