Decoding Digital Labor: A Topic Modeling Analysis of Platform Work Experiences

Ütük Bayılmış, Oya; Orhan, Serdar

doi:10.3390/systems13090819


by Oya Ütük Bayılmış ^1,2 and Serdar Orhan ^3,*

^1 Labour Economics and Industrial Relations, Institute of Social Sciences, Sakarya University, 54050 Sakarya, Türkiye
^2 Sakarya Vocational School, Sakarya University of Applied Sciences, 54290 Sakarya, Türkiye
^3 Labour Economics and Industrial Relations, Faculty of Political Sciences, Sakarya University, 54050 Sakarya, Türkiye
^* Author to whom correspondence should be addressed.

Systems 2025, 13(9), 819; https://doi.org/10.3390/systems13090819

Submission received: 13 August 2025 / Revised: 9 September 2025 / Accepted: 15 September 2025 / Published: 18 September 2025

(This article belongs to the Special Issue Business Model Innovation in the Digital Era)


Abstract

The growing prevalence of digital labor platforms has fundamentally transformed business models by creating interconnected value systems that redefine how work is organized, delivered, and monetized in today’s digital economy. This study examines platform-based business model innovation through the lens of value co-creation processes, analyzing user-generated content from digital work platforms including Reddit, FlexJobs, Toptal, and Deel. Using Latent Dirichlet Allocation (LDA) topic modeling on 342 semantically filtered reviews from platform workers, we identified six key themes characterizing stakeholder experiences: User Experience and Platform Evaluation (23.77%), Financial Concerns and Time Management (18.49%), Platform Satisfaction and Recommendation System (16.60%), Paid Services and Investment Strategies (15.09%), Job Search Processes and Remote Work Alternatives (13.96%), and Overall Platform Performance and Account Management (12.08%). These findings reveal how digital platforms create value through complex interactions between technology infrastructure, governance mechanisms, and stakeholder experiences within interconnected ecosystems. The dominance of user experience concerns over purely economic considerations challenges traditional labor economics frameworks and highlights the critical role of platform design in worker satisfaction. Our analysis demonstrates that successful platform business models depend on balancing technological capabilities with human-centered value propositions, requiring innovative approaches to ecosystem orchestration, stakeholder engagement, and value distribution. The study contributes to understanding how digital business models can leverage interconnected value systems to drive sustainable innovation, offering strategic insights for platform design, ecosystem governance, and business model optimization in the digital era.

Keywords: platform-based work; semantic topic modeling; data analytics; behavioral trends; digital business model; digital ecosystems

1. Introduction

In our age of accelerating digital transformation, business models are evolving beyond traditional value creation mechanisms to interconnected platform ecosystems called “platform work” [1]. In these digital business models, value creation and capture occur through digital interfaces, algorithmic ecosystem orchestration mechanisms and flexible stakeholder engagement models; organizations are provided with geographical independence, project-based value delivery opportunities and autonomy in resource allocation. However, this business model innovation simultaneously brings ecosystem challenges such as value distribution complexities, revenue volatility and governance uncertainties [2,3]. These contradictory dynamics between the economic, social, psychological and technical dimensions of platform-based business models require in-depth analysis based not only on numerical data but also on stakeholder value creation experiences.

Existing academic studies often emphasize the value propositions of platform-based work, such as flexibility and geographical independence, while ecosystem challenges such as governance uncertainties, value capture fluctuations and stakeholder protection gaps are often addressed through quantitative data or case studies [2,3,4,5,6,7,8,9]. However, user reviews shared on digital platforms provide a rich source of qualitative data reflecting stakeholder experiences, value expectations and ecosystem challenges in their own voices [10,11]. By analyzing these reviews with machine learning-based topic modeling, this study aims to systematically reveal the prominent themes in the context of platform business model innovation and the value creation opportunities and ecosystem challenges presented by these interconnected value systems.

In this study, platform business model dynamics are analyzed through a dataset of user reviews collected from widely used global online job platforms: FlexJobs, Deel, Reddit and Toptal. These sources were chosen for their complementary strengths: FlexJobs for its large pool of flexible and remote work postings; Deel for its global payroll and legal compliance support; Toptal for its top-tier pool of experts; and Reddit for its dynamic discussion environment where real stakeholder experiences are shared. Because the collected pool mixes texts from service providers and value recipients, semantic filtering was applied to isolate only the texts referring to value creators. The resulting subset was analyzed using Latent Dirichlet Allocation (LDA)-based topic modeling to identify prominent latent themes and examine the distribution of positive, negative and neutral opinions within each theme. This integrated methodology systematically reveals the value creation opportunities and ecosystem orchestration capabilities offered by platform business models, as well as critical challenges such as algorithmic governance mechanisms and value distribution instabilities.

The research questions guiding the study are as follows:

RQ1: What are the major themes that emerge from stakeholder value creation experiences in platform business models?
RQ2: How do these emerging themes describe the value creation opportunities and ecosystem challenges presented by platform business model innovation, and what guidance do they provide for ecosystem design or business model optimization recommendations?

The rest of this paper is organized as follows. Section 2 presents the literature review on platform work and topic modeling. Section 3 outlines the methodology, including data collection, pre-processing, semantic filtering, and LDA topic modeling. Section 4 presents the LDA analysis results, revealing six key themes characterizing platform workers’ experiences. Section 5 discusses the findings and their implications. Section 6 concludes with key contributions and future research directions.

2. Literature Review

This section reviews the literature on platform work and the analysis of user reviews through topic modeling.

2.1. Platform Work

Platform work is a form of employment and business model that matches labor supply and demand to find solutions to the problems of organizations or individuals through an online platform or to provide services for a fee. Platform work models vary according to criteria such as task type, skill requirements, scale of work and service delivery format. Initially limited to low-skill micro-tasks, this portfolio has now expanded to include more complex and highly specialized projects [1,2]. Also referred to as the gig economy, crowd work or on-demand economy, platform work is an increasingly common form of non-standard work [3,4,5,6].

Mandl and Codagnone [2] argue that digitalization, together with innovative technologies such as the Internet of Things, artificial intelligence and blockchain, is a comprehensive process of sociotechnical innovation that not only transforms the structure and processes of organizations but also creates significant differences in platform-based forms of work and employment conditions. Gundert and Leschke [6], noting the heterogeneous nature of platform work relationships and the uncertainties created by algorithmic governance mechanisms, proposed adapting the indicators and sub-dimensions of traditional multidimensional work quality frameworks so that they can be applied to platform work, with the aim of enabling a systematic and comparative assessment of platform work conditions. Cook and Rani [7] argue that the potential of digital labor platforms in emerging economies is constrained by limited digital infrastructure and skills mismatches, and that government intervention is needed for an inclusive transformation. Samant and Malik [8] found that algorithmic management increases physical and psychosocial health risks among app-based workers and needs to be addressed through transparent policy reforms. Berg et al. [9], in a comprehensive field study prepared for the ILO, identified both the opportunities and risks of digital work platforms by comparing working conditions, wage levels, access to social protection and work intensity across five micro-task platforms.

Taken together, these studies establish a theoretical foundation for analyzing platform work through the lens of value co-creation. Workers’ lived experiences can thus be interpreted in terms of trust mechanisms, governance structures, workload and autonomy management, payment architectures, service quality frameworks, and professional identity formation. This framework not only informs our empirical approach but also connects directly to Table 1, where inductively derived themes from our LDA analysis are systematically aligned with these established theoretical dimensions. Building on prior research, the robustness of these six dimensions is further supported by established platform economy literature [1,2,3,4,5,6,7,8,9].

2.2. Studies on Topic Modeling

The literature review in Section 2.1 reveals the role of platform work in the digital transformation process, the evolution of work types, and the opportunities and risks posed by algorithmic management. In this context, the analysis of platform workers’ reviews on their experiences, expectations and challenges through topic modeling provides a critical resource for understanding current work dynamics and developing effective policy recommendations.

The recent growth in the variety and volume of information sources has led researchers to turn to more advanced computational methods to discover latent themes in large text collections. In this field, Latent Dirichlet Allocation (LDA) has been successfully applied in a wide range of fields, from environmental communication to health discourse, from service quality improvement to the detection of thematic trends in public policy texts, highlighting its simplicity, scalability and interpretability [10,11,12,13,14,15,16,17,18,19,20,21].

Topic modeling studies of new employment types such as ICT-based work, platform work and the gig economy remain limited. Bayılmış et al. [10] applied LDA-based topic modeling to a real-time, large-scale dataset obtained from the X platform and identified seven main themes in the context of the gig economy, including flexibility, precariousness and algorithmic control. Won et al. [11], in the context of direct platform work, used LDA to model posts containing the keywords “AI” and “algorithm” written by food delivery platform workers in online communities during the COVID-19 period, and described experiences under algorithmic management along the themes of “AI matching”, “driver behavior” and “platform system”. Rojas Rincón et al. [21] conducted a sentiment analysis of opinions shared on the X platform about remote work and teleworking in the post-COVID-19 period, revealing the positive and negative effects of this way of working and offering suggestions to improve worker welfare. A recent study by Li et al. [22], using topic modeling to explore visitor experiences in conservation centers shaped by social identity, further demonstrates the adaptability of this method in analyzing user-generated narratives in diverse sociotechnical contexts.

Taken together, these studies demonstrate the methodological flexibility of topic modeling in revealing hidden thematic structures across different domains. For the purposes of this study, LDA was selected because of its proven ability to handle heterogeneous, short, and user-generated texts, while maintaining transparency and interpretability.

Compared to the studies mentioned above, this paper applies LDA-based topic modeling to user reviews collected from global digital work platforms (FlexJobs, Deel, Reddit, and Toptal). By using semantic filtering to isolate service provider reviews, the study systematically uncovers the opportunities and challenges of platform work from a scalable and contemporary perspective. This methodological framing also connects back to Table 1, where inductively derived themes are aligned with theoretically established dimensions, reinforcing the link between empirical evidence and conceptual frameworks.

3. Materials and Methods

This section describes the analysis methods and research process used to uncover the perceptions of platform workers by analyzing user reviews that reflect the experiences of individuals who freelance on digital platforms. Figure 1 shows the methodological process followed in the research. The analysis was implemented in Python 3.11.13 using the libraries named in the subsections below.

3.1. Data Acquisition

The dataset of platform work user reviews was obtained from the FlexJobs, Deel and Toptal applications and from the Reddit platform. We used Python modules including app_store_scraper for App Store data, google_play_scraper for Play Store reviews and praw for Reddit posts. In total, 1407 reviews posted between 20 June 2024 and 20 June 2025 were collected, distributed across platforms as follows: FlexJobs 40%, Deel 35%, Reddit 17% and Toptal 8%. To mitigate risks of commercialized or manipulated content, semantic filtering and duplicate removal were applied during pre-processing, and a subset of reviews was manually inspected to ensure that the data reflected authentic worker experiences rather than promotional or spam-like material.

These four sources were selected to capture different segments of the platform economy. While FlexJobs, Deel, and Toptal represent professional job-matching and freelancing contexts, Reddit provides a community-driven discussion space. Combining these datasets therefore allows broader coverage of worker experiences across both structured and informal platform environments.

3.2. Data Pre-Processing

User reviews in the dataset were subjected to various text cleaning processes in the preprocessing stage. In this process, texts were converted to lower case, punctuation marks and numbers were removed, and unnecessary (stop) words of the language were eliminated. Empty and duplicate reviews were also removed from the dataset. The data collection and preprocessing steps were implemented using Python libraries such as pandas, nltk and re for data cleaning and text normalization.
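The cleaning steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the tiny stop-word set (standing in for the nltk English list) and the sample reviews are assumptions, and the letter-only regular expression assumes English text.

```python
import re

# Tiny illustrative stop-word set (assumption); the study reports using nltk.
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "to", "of", "for",
              "i", "in", "this", "was"}

def clean_review(text: str) -> str:
    """Lower-case, drop punctuation and digits, then remove stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only ASCII letters and whitespace
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)

def preprocess(reviews):
    """Clean each review, then drop empty results and exact duplicates (order kept)."""
    seen = set()
    cleaned = []
    for raw in reviews:
        doc = clean_review(raw)
        if doc and doc not in seen:
            seen.add(doc)
            cleaned.append(doc)
    return cleaned

# Hypothetical sample reviews
sample = [
    "Deel is EASY to use!!! Paid in 2 days.",
    "deel is easy to use, paid in 2 days",
    "   ",
    "This subscription was a waste of money.",
]
print(preprocess(sample))
# ['deel easy use paid days', 'subscription waste money']
```

In this sketch duplicates are detected after cleaning, so reviews that differ only in case or punctuation collapse into one; whether the study removed duplicates before or after normalization is not stated.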

3.3. Semantic Filtering

The dataset was obtained from FlexJobs, Deel, Toptal, and Reddit, four widely used platforms representing different segments of digital labor. FlexJobs focuses on flexible and remote work, Deel provides payroll and job management, Toptal serves as an elite freelance talent pool, and Reddit offers a community-driven discussion space.

Reviews came from both service providers (freelancers) and service recipients (clients), covering worker profiles, customer experiences, technical issues, and overall satisfaction. A semantic filtering method was used to isolate reviews specifically related to platform-based work (e.g., freelancing, gig economy, digital labor). Unlike traditional keyword filtering, which relies on surface-level matches, semantic filtering evaluates contextual meaning, providing more precise differentiation.

Semantic filtering was implemented using the SentenceTransformer library and the “paraphrase-multilingual-MiniLM-L12-v2” model. Reviews were transformed into multidimensional vector representations and cosine similarity was calculated with predefined target sentences (e.g., “gig economy”, “courier satisfaction”, “platform labor”). Reviews above the threshold value were included in the analysis.

The cosine similarity threshold for semantic filtering was set at 0.40, which offered a balanced trade-off between coverage and thematic relevance. Threshold values between 0.35 and 0.45 were additionally tested, and the resulting thematic structure remained stable, confirming the robustness of the chosen parameter. Lower thresholds (e.g., 0.30) included excessive irrelevant reviews, thereby increasing noise, while higher thresholds (e.g., above 0.50) excessively reduced the dataset and risked excluding relevant perspectives. This evaluation indicates that the selected value of 0.40 minimizes exclusion bias while maintaining thematic coherence.
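The thresholding logic can be illustrated with a short sketch. The three-dimensional vectors below are hypothetical stand-ins for sentence embeddings, and keeping a review when its best match against any target sentence reaches the threshold is an assumption; the paper does not state how similarities to multiple targets were combined.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def semantic_filter(review_vecs, target_vecs, threshold=0.40):
    """Return indices of reviews whose best similarity to any target meets the threshold."""
    kept = []
    for i, review in enumerate(review_vecs):
        best = max(cosine_similarity(review, target) for target in target_vecs)
        if best >= threshold:
            kept.append(i)
    return kept

# Toy vectors (hypothetical values, not real embeddings)
targets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
reviews = [
    [0.9, 0.1, 0.0],  # very close to the first target
    [0.0, 0.1, 1.0],  # mostly unrelated
    [0.3, 0.3, 0.8],  # borderline, similarity of about 0.33
]

print(semantic_filter(reviews, targets))                  # [0]
print(semantic_filter(reviews, targets, threshold=0.30))  # [0, 2]
```

The borderline review shows the trade-off described above: it is excluded at 0.40 but admitted at 0.30, which is how a lower threshold lets less relevant material into the corpus.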

After semantic filtering, the dataset was reduced to 342 reviews, distributed as follows: Deel 47%, FlexJobs 33%, Toptal 11% and Reddit 9%.

3.4. Topic Modeling with LDA

Topic modeling is a text mining method used to discover latent thematic structures in large text corpora. In this study, Latent Dirichlet Allocation (LDA) is used as a topic modeling method. Developed by Blei et al. [12,13], LDA is a generative probabilistic model for document collections that assumes that each document consists of a mixture of randomly selected latent topics and that each topic has a given probability distribution over words. By discovering probabilistic relationships between words and topics, the model reveals the semantic structure of the text.

LDA was chosen over more recent approaches, such as BERTopic, primarily for its transparent probabilistic framework, which was essential to our research methodology. This structure facilitates the direct mapping of emergent themes as probability distributions onto our established theoretical dimensions (outlined in Table 1), a critical step in connecting our inductive findings to the pre-existing theory. Furthermore, within the context of our corpus of short and heterogeneous user-generated reviews, LDA provided a level of thematic interpretability that was most aligned with the analytical goals of this study.

The number of topics (K) was set to 6, after testing alternative values between 5 and 10. The main thematic structure—covering user experience, financial concerns, subscription services, job search processes, and overall platform performance—remained stable across this range. Among the tested models, K = 6 provided the most balanced combination of coherence and interpretability, ensuring sufficient granularity while avoiding fragmentation of overlapping themes.

Figure 2 schematically illustrates the generative structure of the LDA model from documents to latent topics. In this structure, user reviews are used as input data. For each document, the model generates a topic distribution (θ_d) and for each word, a latent topic (z_d,n) is sampled from this distribution. Then, the observed word (w_d,n) is generated using the word distribution (β_k) of this selected topic.

The basic variables used in the model are as follows:

- α: Prior hyperparameter of the topic distribution per document.
- η: Prior hyperparameter of the word distributions for each topic.
- θ_d: The topic distribution at the document level (one vector per document).
- β_k: The word distribution for each topic.
- z_d,n: The latent topic to which word n in document d belongs.
- w_d,n: The observed word n in document d (the only observable variable of the model).

Arrows in the model represent conditional dependencies between variables, while rectangular boxes (plates) represent repeated structures:

- M: Total number of documents;
- N: Number of words in each document;
- K: The total number of topics identified.

Through this generative structure, hidden thematic patterns in unstructured text collections are discovered. As a result of the LDA model process, topic clusters, word frequencies of these topics and topic distributions in documents can be obtained as output [10,11,12,13,14,15,16].
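For reference, the generative structure described above corresponds to the standard joint distribution of smoothed LDA, written here in the notation of the variable list. This is the textbook form rather than an equation taken from the study, with θ_d and β_k drawn from Dirichlet priors and z_d,n and w_d,n drawn from the corresponding categorical distributions.

```latex
\theta_d \sim \mathrm{Dir}(\alpha), \qquad
\beta_k \sim \mathrm{Dir}(\eta), \qquad
z_{d,n} \mid \theta_d \sim \mathrm{Cat}(\theta_d), \qquad
w_{d,n} \mid z_{d,n}, \beta \sim \mathrm{Cat}(\beta_{z_{d,n}})

p(\beta_{1:K}, \theta_{1:M}, z, w \mid \alpha, \eta)
  = \prod_{k=1}^{K} p(\beta_k \mid \eta)
    \prod_{d=1}^{M} \Bigl[ p(\theta_d \mid \alpha)
    \prod_{n=1}^{N} p(z_{d,n} \mid \theta_d)\,
    p(w_{d,n} \mid \beta_{z_{d,n}}) \Bigr]
```

Only w_d,n is observed, so fitting the model means inferring the posterior over β, θ and z given the words.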

4. Results

Before presenting the detailed LDA results, we first establish the theoretical–empirical alignment by mapping our data-driven themes to established platform value system dimensions identified in the literature (Table 1). This mapping illustrates how our inductive topic modeling outcomes connect to existing conceptual frameworks while also revealing the relative prominence of different value dimensions in actual worker experiences.

As shown in Table 1, the strongest correspondences emerge between trust mechanisms and user experience (23.77%) and between service quality frameworks and platform satisfaction (16.60%). The comprehensive keyword analysis confirms that platform workers’ narratives naturally cluster around theoretically anticipated value dimensions, thereby validating both our methodological approach and the applicability of existing platform economy theories to lived worker experiences.

It should be noted that these findings are based on merged datasets collected from different platforms (FlexJobs, Deel, Toptal, and Reddit). While the combination of distinct sources may introduce representativeness constraints, the persistence of core themes across all four platforms suggests that the integrated dataset still provides a valid and meaningful basis for analyzing both commonalities and platform-specific variations in worker experiences.

The remainder of this section presents the results of the analysis of platform workers’ reviews from FlexJobs, Deel, Toptal and Reddit.

The word cloud in Figure 3 provides a visual representation of the most frequently used terms in the set of texts reflecting the experiences of platform workers. As can be seen from the figure, core concepts such as “job”, “money”, “work”, “platform” and “service” are central, while experiential and transactional terms such as “subscription”, “payment”, “time”, “experience” and “easy” are on the periphery. Word sizes show that platform workers primarily focus on finding a job (“job”), earning income (“money”) and working experience (“work”, “experience”).

In addition, the presence of evaluative terms such as “like”, “good”, “best” and “easy” indicates that platform workers actively express opinions about service quality and user experience. The terms “subscription”, “payment” and “reliable” reveal the presence of economic sustainability and financial security concerns. These visualization results are consistent with the thematic structures revealed in the LDA topic modeling analysis and confirm the main areas of concern of individuals working in the platform economy.

To answer RQ1, the analysis of the set of texts collected from FlexJobs, Deel, Toptal, and Reddit reveals that freelancers’ experiences in platform business models are shaped around six key themes. Key terms such as “job”, “money”, “work”, “platform” and “experience” appear consistently, suggesting that workers in the platform economy are primarily focused on job opportunities, economic returns, and work experience.

Table 2 lists the six topics identified by LDA together with their key terms.

Analyzing the reviews allocated according to the six different topic categories identified by LDA, the following explanations can be derived:

Topic 1—User Experience and Platform Evaluation (23.77%): This topic reflects debates around ease of use, speed of recruitment processes and reliability of payment systems, which are central to the working experiences of platform workers. Particular emphasis is placed on the user-friendly interfaces of platforms such as Deel and the quality of interactions with companies. This theme reveals the key components of a successful experience in the platform economy.

Topic 2—Financial Concerns and Time Management (18.49%): This category addresses freelancers’ concerns about economic sustainability and time efficiency. The difficulty platform workers face in reaching their monthly income targets, the efficiency of time spent in the job search process, and inefficiencies in the application process are discussed under this theme. It shows that financial insecurity is a central concern of platform workers.

Topic 3—Platform Satisfaction and Recommendation System (16.60%): This topic covers platform workers’ recommendation mechanisms for each other and their opinions on the most reliable platform choices. The abundance of positive evaluation terms such as “best”, “amazing”, “reliable” emphasizes the importance of knowledge sharing and experience transfer within the community. This theme highlights the critical role of peer-to-peer recommendations in the platform economy.

Topic 4—Paid Services and Investment Strategies (15.09%): This category reflects discussions around premium memberships, subscription models and the evaluation of paid services. Platform workers’ strategic thinking and opportunity cost analyses on which investments will bring them returns are discussed under this theme. Cost–benefit evaluation of paid features on platforms is prominent.

Topic 5—Job Search Processes and Remote Work Alternatives (13.96%): This topic focuses on active job search behaviors, exploring free platform options and remote work opportunities. The terms “remote”, “free”, “help” indicate that platform workers are looking for cost-effective solutions and prefer flexible working models. This theme emphasizes the importance of accessibility and geographic flexibility in the platform economy.

Topic 6—Overall Platform Performance and Account Management (12.08%): This category includes comprehensive assessments of the overall usability of platforms, account management processes and the functioning of payment systems. The quality of the platforms’ technical infrastructure and the holistic evaluation of the user experience are covered under this theme. The terms “overall” and “account” indicate the presence of a systemic perspective.

Overall, these six themes reveal the multidimensional nature of the experiences of freelancers working in the platform economy. As seen in Figure 4, the top two themes with the highest percentage (42.26%) indicate that user experience and financial security concerns are the primary areas of concern for platform workers. The visual representation of the topic distribution clearly reveals the hierarchical structure of platform workers’ experiences.

Figure 5 shows the “Intertopic Distance Map” produced by the LDA model, which visually presents the similarities, differences and relationships between topics. The figure was generated using the LDAvis method developed by Sievert and Shirley [23] and provides a multidimensional analysis of the thematic structure of platform economy experiences. The x and y axes represent coordinates obtained from multidimensional scaling (MDS) based on the Jensen-Shannon divergence between topic distributions. Each circle represents a topic, and the size of the circle indicates the marginal prevalence of that topic across the text set.
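The Jensen-Shannon measure underlying the inter-topic distances, usually called the Jensen-Shannon divergence, can be sketched in a few lines. The two topic-word distributions below are hypothetical toy values, and the subsequent MDS projection onto two axes is not shown.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q), skipping zero-probability terms of p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: the mean KL divergence of p and q from their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical topic-word distributions over a four-word vocabulary
topic_a = [0.5, 0.3, 0.2, 0.0]
topic_b = [0.0, 0.2, 0.3, 0.5]

print(js_divergence(topic_a, topic_a))  # 0.0 for identical topics
print(js_divergence(topic_a, topic_b))  # larger for topics with different vocabularies
```

With natural logarithms the value lies between 0 (identical distributions) and ln 2 (no shared vocabulary), and it is symmetric in its arguments, which makes it suitable as input to a distance-based layout.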

In the visualization, the central and dominant position of Topic 1 (23.77%) reveals that the theme of “User Experience and Platform Evaluation” is the most critical element in the experiences of platform workers. The positioning of Topic 2 and Topic 3 close to Topic 1 indicates the existence of strong thematic links between financial concerns and platform satisfaction.

The more distant positioning of Topics 4, 5 and 6 suggests that these topics (paid services, job search processes and overall platform performance) have relatively different content profiles from the other themes. This distribution suggests that the experiences of platform workers are organized around two main clusters: core concerns (Topics 1–3) and specific operational issues (Topics 4–6).

The bar chart on the right shows the 30 most relevant terms for Topic 1. The red bars reflect the estimated term frequency within the selected topic, while the blue bars reflect the overall corpus frequency. The predominant red distribution of terms such as “deel”, “company”, “easy” and “user” indicates that these words are characteristic terms specific to Topic 1, while the predominant blue distribution of terms such as “work” and “service” indicates that these terms are widely used in the general corpus.

Figure 6 shows a word co-occurrence graph visualizing the semantic relationships between words that occur together in the text set on which the model was trained. Words that frequently occur together in the same document are connected by denser edges; these connections between word nodes reveal which concepts are mentioned together in user reviews. A modularity-based algorithm was used for community detection, and semantically close words were grouped with the same color. The graph shows that users frequently combine positive attributes such as “easy”, “fast”, “friendly” and “reliable” (blue cluster) when describing their application or platform experiences, forming a positive theme. In contrast, the yellow/purple region, where words such as “money”, “don’t”, “subscription”, “waste” and “paid” are clustered, reflects the thematic structure in which users express negative experiences. This structure indicates that the topics the LDA model separates statistically are also separated at the semantic level; that is, the model is supported not only numerically but also in terms of semantic validity.
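The edge weights of such a graph are document-level co-occurrence counts, which can be sketched as below. The four toy documents are hypothetical, the function name is an assumption, and the modularity-based community detection step is not reproduced here.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(docs):
    """Count, for each unordered word pair, the number of documents containing both words."""
    counts = Counter()
    for doc in docs:
        vocab = sorted(set(doc.split()))       # unique words, sorted so pairs are canonical
        counts.update(combinations(vocab, 2))  # every unordered pair within the document
    return counts

# Hypothetical pre-processed reviews
docs = [
    "easy fast reliable",
    "easy fast friendly",
    "money subscription waste",
    "money waste paid",
]

edges = cooccurrence_counts(docs)
print(edges[("easy", "fast")])    # 2
print(edges[("money", "waste")])  # 2
print(edges[("easy", "money")])   # 0
```

Pairs are stored in alphabetical order, so lookups should use the same ordering. In this toy corpus the positive and negative vocabularies never share a document, which is the kind of separation a community detection algorithm would pick up as two clusters.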

Perplexity is a measure of how well the model can predict words in the test data. This value is calculated from the probabilities that the model assigns to words. Generally, perplexity decreases as the number of topics increases, but this does not necessarily mean a better model. Low perplexity can sometimes indicate that the model is overfitting the data.
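As an illustration of the definition, perplexity is the exponential of the negative average log-probability per held-out token. The sketch below computes it exactly for toy values of θ_d and β_k; all numbers and function names are hypothetical, and topic modeling libraries typically report an approximation based on a variational bound rather than this exact quantity.

```python
import math

def token_probability(theta_d, beta, word_id):
    """p(w | d) = sum over topics k of theta_d[k] * beta[k][w]."""
    return sum(theta_d[k] * beta[k][word_id] for k in range(len(theta_d)))

def perplexity(token_probs):
    """exp of the negative mean log-probability over held-out tokens."""
    log_lik = sum(math.log(p) for p in token_probs)
    return math.exp(-log_lik / len(token_probs))

# Toy model: two topics over a three-word vocabulary (hypothetical values)
theta_d = [0.5, 0.5]
beta = [
    [0.7, 0.2, 0.1],  # topic 0 word distribution
    [0.1, 0.2, 0.7],  # topic 1 word distribution
]
held_out = [0, 2, 1]  # word ids of one held-out document

probs = [token_probability(theta_d, beta, w) for w in held_out]
print(perplexity(probs))
```

A model that guesses uniformly over a vocabulary of V words has perplexity V, so the value can be read as the effective number of equally likely words the model is choosing between at each position.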

Therefore, models need to be evaluated not only on their predictive performance, but also on the meaningfulness of the topics. This is where the coherence (UMass) metric comes into play. This metric measures the semantic relationship between words in a topic. High coherence values indicate that topics are more coherent and easier to interpret [25].
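A minimal sketch of the UMass variant named above follows, using the commonly cited document co-occurrence form: for each pair of top words, the log of the smoothed co-document frequency divided by the document frequency of the higher-ranked word. The toy corpus is hypothetical, the unnormalized sum is an assumption (libraries often average over pairs), and other coherence measures such as c_v are computed differently.

```python
import math

def umass_coherence(top_words, docs):
    """Sum of log((D(w_m, w_l) + 1) / D(w_l)) over ordered pairs of a topic's top words."""
    doc_sets = [set(d.split()) for d in docs]

    def doc_freq(*words):
        # Number of documents containing all the given words
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                continue  # skip words that never occur in the corpus
            score += math.log((doc_freq(top_words[m], top_words[l]) + 1) / d_l)
    return score

# Hypothetical pre-processed reviews
docs = [
    "easy fast reliable",
    "easy fast friendly",
    "money subscription waste",
    "money waste paid",
]

coherent = umass_coherence(["easy", "fast"], docs)     # words that co-occur
incoherent = umass_coherence(["easy", "money"], docs)  # words that never co-occur
print(coherent > incoherent)  # True
```

Word lists whose members appear in the same documents score higher than lists mixing unrelated vocabularies, which is the sense in which higher coherence signals a more interpretable topic.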

Table 3 evaluates the overall fit and interpretability of our LDA model. The obtained perplexity score of 35.53 indicates that the model predicts held-out data satisfactorily (lower values are better). The c_v coherence value of 0.312, in turn, indicates that the extracted topics are sufficiently coherent semantically (higher values are better).

These results show that the set of texts on the experiences of platform workers is successfully decomposed into six meaningful and interpretable topics and that the model performs acceptably overall.

5. Discussion

The platform economy can be said to be one of the great paradoxes of the digital age: on the one hand it offers liberating technological opportunities, and on the other it creates new kinds of economic uncertainty. The results of this research’s LDA analysis reveal how this paradox is experienced through the voices of users themselves.

To answer RQ2, the six emerging themes reveal the complex opportunity-challenge balance of the platform work phenomenon. The theme “User Experience and Platform Evaluation”, with the highest percentage (23.77%), highlights the main opportunity offered by platform work: technological accessibility. The prevalence of the terms “easy”, “fast” and “friendly” indicates low barriers to entry and ease of use compared to traditional employment channels, while the theme “Job Search Processes and Remote Work Alternatives” highlights the opportunity for geographical flexibility and the theme “Platform Satisfaction and Recommendation System” reveals the potential for community-based learning and peer-to-peer support.

However, alongside these opportunities, significant structural challenges also emerge. The high share of the theme “Financial Concerns and Time Management” (18.49%) points to the most critical challenge of platform work: income uncertainty and financial insecurity. Concerns around the terms “money”, “don’t”, “waste”, and “paid” reflect the lack of traditional employment guarantees in the platform economy, while the theme “Paid Services and Investment Strategies” reveals the problem of unequal access on platforms. The fact that platform workers must pay extra for premium services shows that pay-to-play dynamics create social justice issues.

These findings provide important directions for platform design. Hybrid evaluation systems should address user experience and financial security concerns together. Transparent pricing models are critical for the fair and understandable delivery of premium services, while integrating community-driven features such as peer-to-peer learning and mentorship mechanisms will increase the platforms’ potential to create social value. Furthermore, revenue predictability tools and workflow projection systems are essential for platform workers to engage in financial planning.

In terms of policy development, the findings suggest that platform work arrangements need to be improved along three main axes. First, social security mechanisms for platform workers need to be redesigned, and a “portable benefits” model should provide continuous protection when moving between projects. Second, the development of transparency and accountability standards in algorithmic decision-making processes will ensure fairness in work distribution and remuneration. Finally, ensuring equal access guarantees to basic platform services will prevent the digital divide from deepening.

These six themes show that platform work is neither a utopian instrument of liberation nor a dystopian mechanism of exploitation, but rather a complex socio-technical system that can realize its potential through careful design and regulation. Future success lies in balancing technological capabilities with human-centered policies. For the sustainable development of the platform economy, holistic approaches that address structural inequalities while prioritizing the user experience are needed.

We also recognize that combining reviews from distinct platforms (FlexJobs, Deel, Toptal, and Reddit) may introduce representativeness constraints, since each platform has different affordances and review structures. Nevertheless, the persistence of core themes across all four sources, alongside identifiable platform-specific emphases, suggests that the integrated dataset remains robust and theoretically meaningful for analyzing both commonalities and differences in platform work experiences.

In addition, the study did not employ methodological triangulation. While LDA-based topic modeling provided robust insights into cross-cutting themes, the absence of complementary methods (e.g., surveys, interviews, ethnographic observation) limits external validation and reduces the interpretive richness of the findings. Future research should therefore integrate multiple methods to strengthen both validity and depth.

A further limitation is the reliance on publicly available user reviews, which may include commercialized or manipulated content. Although semantic filtering and manual inspection were used to mitigate this risk, complete authentication of reviews was not possible within the scope of this study. Future research should therefore complement API-based data with advanced authenticity checks or triangulation methods such as surveys or interviews.

In addition, platform-specific factors such as moderation practices, governance rules, and local regulatory contexts may influence the content and sentiment of user reviews. While our analysis focused on identifying cross-cutting themes across platforms, we acknowledge that these contextual differences could act as confounding factors. Future research should therefore incorporate platform-level governance frameworks and regulatory environments to better disentangle platform-specific effects from broader trends.

Furthermore, semantic filtering relies on pretrained transformer models, which may introduce semantic drift without external human validation. Similarly, while LDA provides interpretable topic clusters, it may not fully capture nuanced sentiment or complex intertopic relationships. Moreover, no alternative model triangulation or manual coding was employed, which limits external validation. Nevertheless, robustness is supported by stability checks across multiple semantic thresholds (0.35–0.45) and topic numbers (K = 5–10), which consistently yielded similar thematic structures.
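The text does not specify how similarity between thematic structures was assessed across settings. One simple way to quantify it, sketched here purely as an assumption, is to match each topic from one run to its closest topic in another run using the Jaccard overlap of their top keywords. The function names and both keyword lists below are hypothetical and do not reproduce the study's runs.

```python
def jaccard(a, b):
    """Jaccard overlap between two keyword collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def best_match_stability(run_a, run_b):
    """Average, over topics in run_a, of the best Jaccard overlap with any topic in run_b."""
    if not run_a or not run_b:
        raise ValueError("both runs must contain at least one topic")
    scores = [max(jaccard(topic, other) for other in run_b) for topic in run_a]
    return sum(scores) / len(scores)


# Hypothetical top keywords from two model runs with different topic numbers.
run_small = [
    ["money", "waste", "paid", "time"],
    ["easy", "fast", "friendly", "payment"],
]
run_large = [
    ["easy", "fast", "friendly", "user"],
    ["money", "paid", "waste", "month"],
    ["job", "remote", "free", "find"],
]

print(best_match_stability(run_small, run_large))
```

Values close to 1 would indicate that the topics of one configuration reappear almost unchanged in the other, while values close to 0 would indicate that the thematic structure is sensitive to the chosen setting.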

Another limitation of this study is the lack of sectoral granularity. By analyzing platforms in aggregate, certain industry-specific dynamics (e.g., IT freelancing, delivery services, or remote employment) may be overlooked. While the present study aimed to identify cross-cutting themes across platform work, future research should conduct sector-specific analyses to refine our understanding of how different industries shape workers’ experiences.

Our findings resonate with and extend prior research on platform work. For instance, Gundert and Leschke (2024) [6] emphasized the challenges of evaluating platform work through conventional job-quality frameworks, noting the prevalence of algorithmic governance uncertainties. This aligns with our Theme 5 (Job Search Processes and Remote Work Alternatives), where governance and accessibility issues emerge as critical concerns. Similarly, Cook and Rani (2025) [7] highlighted the persistent risks of income instability in developing economies; our Theme 2 (Financial Concerns and Time Management) provides empirical confirmation of financial insecurity as a universal characteristic across platforms. Berg et al. (2018) [9], in a comprehensive ILO study, also identified wage volatility and limited access to social protection, which parallel our findings on the pay-to-play dynamics in Theme 4 (Paid Services and Investment Strategies). However, unlike these prior studies, our results capture additional nuances such as the strategic evaluation of subscription-based services, a dimension less visible in earlier large-scale surveys.

Theoretically, this study contributes by bridging inductive data-driven insights with established conceptual dimensions of platform value systems. While previous works often treated trust, governance, and payment structures as separate analytical categories, our alignment table (Table 1) demonstrates how workers’ lived experiences naturally map onto these theoretical constructs. Moreover, by introducing “investment strategies in paid services” as a novel thematic dimension, the analysis extends existing frameworks to account for emerging monetization logics in platform design. This dual contribution—confirming the universality of financial insecurity and uncovering platform-specific strategies—strengthens the explanatory power of value system perspectives in digital labor research.

6. Conclusions

The identified themes—User Experience and Platform Evaluation, Financial Concerns and Time Management, Platform Satisfaction and Recommendation System, Paid Services and Investment Strategies, Job Search Processes and Remote Work Alternatives, and Overall Platform Performance and Account Management—clearly illustrate the dual character of platform work between opportunities such as technological accessibility and geographical flexibility and challenges such as income uncertainty and financial insecurity.

Among these, three themes stand out as particularly significant: User Experience and Platform Evaluation (23.77%), Financial Concerns and Time Management (18.49%), and Platform Satisfaction and Recommendation System (16.60%). Presenting these themes upfront highlights the central issues faced by platform workers and establishes a clearer foundation for theoretical interpretation.

These findings can be explicitly mapped to the value co-creation framework: user experience concerns reflect trust and reliability mechanisms; financial insecurity relates to workload and autonomy management; and satisfaction and recommendation processes align with service quality frameworks. This mapping demonstrates how platform workers’ lived experiences correspond to established theoretical dimensions of value systems.

By explicitly mapping these empirically derived themes onto six theoretical dimensions of platform value systems, this study demonstrates how workers’ lived experiences correspond to, and at times extend, established conceptual frameworks. In doing so, it strengthens the theoretical bridge between inductive topic modeling approaches and the platform economy literature. Compared to prior research (e.g., Gundert and Leschke [6], Cook and Rani [7], Berg et al. [9]), our findings both confirm the universality of financial insecurity and introduce subscription-based investment strategies as a novel dimension, thereby extending existing frameworks and advancing theoretical debates on value creation and capture in digital labor markets.

In practical terms, the findings suggest that platform designers should prioritize transparent pricing models, revenue predictability tools, and community-driven features such as peer-to-peer learning and mentorship. For policymakers, the results highlight the importance of portable benefits schemes, algorithmic accountability standards, and ensuring equal access to basic platform services as a safeguard against deepening the digital divide.

In methodological terms, the study demonstrates that the subjective experiences of platform workers can be systematically examined through large-scale textual data analysis. The integration of semantic filtering and LDA modeling provides an innovative framework that enhances the reliability and interpretability of topic modeling in the context of platform work.

Limitations of the study include its restriction to English-language platform reviews and the absence of objective data on working conditions, which follows from its text-based analysis. Additionally, combining datasets from different platforms introduces representativeness constraints, though the persistence of core themes across sources indicates robustness. Another limitation is the lack of sectoral granularity: by analyzing platforms in aggregate, certain industry-specific dynamics (e.g., IT freelancing, delivery services, or remote employment) may be obscured. Furthermore, platform-specific policies, moderation practices, and regulatory contexts may also influence the tone and content of user reviews, which could act as confounding factors in interpreting findings.

Future research could conduct cross-cultural comparisons with multilingual datasets, examine changes in platform workers’ experiences over time through longitudinal studies to capture temporal dynamics and platform strategy shifts, and incorporate platform-level governance and regulatory frameworks into the analysis.

Complementary techniques such as sentiment analysis, manual coding, or alternative topic models (e.g., Contextualized Topic Models, BERTopic) could also be employed to capture more nuanced perspectives and relationships across topics. In addition, methodological triangulation should be applied by combining LDA-based text mining with complementary approaches such as interviews, surveys, or ethnographic fieldwork, which would enhance both the robustness and the interpretive depth of future investigations.

The methodological framework and findings of this study have the potential to guide future steps for the human-centered development of the platform economy.

Author Contributions

Conceptualization, O.Ü.B. and S.O.; Methodology, O.Ü.B. and S.O.; Formal Analysis, O.Ü.B. and S.O.; Investigation, O.Ü.B. and S.O.; Validation, O.Ü.B. and S.O.; Writing—Original Draft, O.Ü.B. and S.O.; Writing—Review and Editing O.Ü.B. and S.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

API	Application Programming Interface
BERT	Bidirectional Encoder Representations from Transformers
ICT	Information and Communication Technology
LDA	Latent Dirichlet Allocation Algorithm
MDS	MultiDimensional Scaling

References

1. Eurofound. Overview of New Forms of Employment—2020 Update; Publications Office of the European Union: Luxembourg, 2020; pp. 1–72.
2. Mandl, I.; Codagnone, S. The Diversity of Platform Work—Variations in Employment and Working Conditions. In Digital Innovation and the Future of Work, 1st ed.; Schaffers, H., Vartiainen, M., Bus, V., Eds.; River Publishers: Gistrup, Denmark, 2020; pp. 177–196.
3. Fulker, Z.; Riedl, C. Cooperation in the Gig Economy: Insights from Upwork Freelancers. Proc. ACM Hum. Comput. Interact. 2024, 8, 1–20.
4. Pilatti, G.R.; Pinheiro, F.L.; Montini, A.A. Systematic literature review on gig economy: Power dynamics, worker autonomy, and the role of social networks. Adm. Sci. 2024, 14, 267.
5. De Stefano, V. The rise of the “just-in-time workforce”: On-demand work, crowdwork, and labor protection in the gig-economy. Comp. Labor Law Policy J. 2016, 37, 471–504.
6. Gundert, S.; Leschke, J. Challenges and potentials of evaluating platform work against established job-quality measures. Econ. Ind. Democr. 2024, 45, 696–718.
7. Cook, S.; Rani, U. Platform Work in Developing Economies: Can Digitalisation Drive Transformation? Indian J. Labour Econ. 2025, 68, 395–416.
8. Samant, Y.; Naz Malik, R. App jobs, algorithms, and risks: Hidden hazards of platform work. Occup. Med. 2025, 75, 6–8.
9. Berg, J.M.; Furrer, M.; Harmon, E.; Rani, U.; Silberman, M.S. Digital Labour Platforms and the Future of Work: Towards Decent Work in the Online World; ILO: Geneva, Switzerland, 2018; pp. 95–109.
10. Bayılmış, O.Ü.; Orhan, S.; Bayılmış, C. Unveiling Gig Economy Trends via Topic Modeling and Big Data. Systems 2025, 13, 553.
11. Won, J.; Lee, D.; Lee, J. Understanding experiences of food-delivery-platform workers under algorithmic management using topic modeling. Technol. Forecast. Soc. Chang. 2023, 190, 122369.
12. Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent Dirichlet Allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
13. Blei, D.M.; Carin, L.; Dunson, D. Probabilistic topic models. IEEE Signal Process. Mag. 2011, 23, 1–7.
14. Jacobi, C.; Atteveldt, W.; Welbers, K. Quantitative analysis of large amounts of journalistic texts using topic modelling. Digit. J. 2016, 4, 89–106.
15. Calli, L.; Calli, F. Understanding airline passengers during COVID-19 outbreak to improve service quality: Topic modeling approach to complaints with Latent Dirichlet Allocation algorithm. Transp. Res. Rec. J. Transp. Res. Board 2022, 2677, 656–673.
16. Calli, L. Exploring mobile banking adoption and service quality features through user-generated content: The application of a topic modeling approach to Google Play Store reviews. Int. J. Bank Mark. 2023, 41, 428–454.
17. Calli, L.; Alma Calli, B. Value-centric analysis of user adoption for sustainable urban micro-mobility transportation through shared e-scooter services. Sustain. Dev. 2024, 32, 6408–6433.
18. Montes-Escobar, K.; De la Hoz-M, J.; Barreiro-Linzán, M.D.; Fonseca-Restrepo, C.; Lapo-Palacios, M.Á.; Verduga-Alcívar, D.A.; Salas-Macias, C.A. Trends in Agroforestry Research from 1993 to 2022: A Topic Model Using Latent Dirichlet Allocation and HJ-Biplot. Mathematics 2023, 11, 2250.
19. Pilacuan-Bonete, L.; Galindo-Villardón, P.; Delgado-Álvarez, F. HJ-Biplot as a Tool to Give an Extra Analytical Boost for the Latent Dirichlet Assignment (LDA) Model: With an Application to Digital News Analysis about COVID-19. Mathematics 2022, 10, 2529.
20. Isoaho, K.; Gritsenko, D.; Makela, E. Topic Modeling and Text Analysis for Qualitative Policy Research. Policy Stud. J. 2019, 49, 300–324.
21. Rojas Rincón, J.S.; Riveros Tarazona, A.R.; Mejía Martínez, A.M.; Acosta-Prado, J.C. Sentiment Analysis on Twitter-Based Teleworking in a Post-Pandemic COVID-19 Context. Soc. Sci. 2023, 12, 623.
22. Li, Z.; Chen, P.; Luo, J.M. Attributes Influencing Visitors’ Experiences in Conservation Centers with Different Social Identities: A Topic Modeling Approach. Systems 2025, 13, 442.
23. Sievert, C.; Shirley, K. LDAvis: A method for visualizing and interpreting topics. In Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, Baltimore, MD, USA, 27 June 2014; pp. 63–70.
24. Chuang, J.; Manning, C.D.; Heer, J. Termite: Visualization techniques for assessing textual topic models. In Proceedings of the International Working Conference on Advanced Visual Interfaces, Capri Island, Italy, 21–25 May 2012; pp. 74–77.
25. McCallum, A. Topic Model Diagnostics. UMASS. 2018. Available online: http://mallet.cs.umass.edu/diagnostics.php (accessed on 1 June 2025).

Figure 1. Methodological process followed in the research.

Figure 2. The generative process of Latent Dirichlet Allocation (LDA).

Figure 3. Most frequently used terms in platform workers’ reviews: word cloud analysis.

Figure 4. Distribution ratios of LDA topics.

Figure 5. Distance map between topics and the most relevant terms of Topic 1 in platform worker experiences [23,24].

Figure 6. Word co-occurrence network based on user reviews. Each node represents a word, and edges indicate frequent co-occurrence within the same document. Colored clusters represent semantically related word groups. (Threshold = 1, Resolution = 1.5; singletons filtered.)

Table 1. Mapping of empirical themes to theoretical dimensions of platform value systems.

Theoretical Dimension	Empirical Theme	Theme Weight (%)	Key Keywords	Theoretical Alignment
Trust and reliability mechanisms	User Experience and Platform Evaluation	23.77%	deel, company, easy, user, experience, work, fast, hiring, payment, friendly	Strong alignment: User experience directly reflects trust-building mechanisms
Workload and autonomy management	Financial Concerns and Time Management	18.49%	money, don’t, make, people, time, month, waste, looking, paid, application	Moderate alignment: Financial pressures indicate autonomy challenges
Service quality frameworks	Platform Satisfaction and Recommendation System	16.60%	best, ever, platform, offer, amazing, company, reliable, apps, working, like	Strong alignment: Satisfaction metrics reflect service quality perceptions
Payment system architectures	Paid Services and Investment Strategies	15.09%	even, subscription, work, search, give, much, paying, look, opportunity, make	Direct alignment: Payment models shape investment decisions
Platform governance structures	Job Search Processes and Remote Work Alternatives	13.96%	find, free, job, like, would, looking, help, remote, money, option	Moderate alignment: Governance affects accessibility and work arrangements
Professional identity formation	Overall Platform Performance and Account Management	12.08%	good, great, time, would, using, enough, overall, payment, platform, account	Emerging alignment: Account management relates to professional presence

Note: Theoretical dimensions are grounded in established platform economy literature [1,2,3,4,5,6,7,8,9], which collectively provide the conceptual foundation for mapping empirical themes to value system dimensions, while empirical themes themselves emerged inductively from LDA analysis.

Table 2. LDA topic modeling results: topics and keywords.

Topic No.	Labels	Keywords
1	User Experience and Platform Evaluation	deel, company, easy, user, experience, work, fast, hiring, payment, friendly
2	Financial Concerns and Time Management	money, don’t, make, people, time, month, waste, looking, application, paid
3	Platform Satisfaction and Recommendation System	best, ever, platform, offer, amazing, company, reliable, apps, working, like
4	Paid Services and Investment Strategies	even, subscription, work, search, give, much, paying, look, opportunity, make
5	Job Search Processes and Remote Work Alternatives	find, free, job, like, would, looking, help, remote, money, option
6	Overall Platform Performance and Account Management	good, great, time, would, using, enough, overall, payment, platform, account

Table 3. LDA model evaluation metrics.

Metric	Value
Perplexity	35.53
Coherence (c_v)	0.312

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
