Exploring Marshall–Olkin Models Through Bibliometric and Topic Modeling Approaches Using Latent Dirichlet Allocation (1981–2025): A Study Based on Scopus Data

Llinás, Humberto; Llinás, Brian; López, Carlos; Nuñez, Daniela

doi:10.3390/math14071215

¹ Department of Mathematics and Statistics, Universidad del Norte, Barranquilla 080001, Colombia

² Computer Science Department, Old Dominion University, Norfolk, VA 23508, USA

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(7), 1215; https://doi.org/10.3390/math14071215

Submission received: 14 February 2026 / Revised: 30 March 2026 / Accepted: 1 April 2026 / Published: 4 April 2026

Abstract

The Marshall–Olkin family of distributions has gained increasing attention in fields such as reliability engineering, survival analysis, financial risk modeling, and actuarial science because of its flexibility in modeling dependence among events and its wide range of extensions. Despite its growing relevance, a systematic understanding of how research on Marshall–Olkin models has evolved over time is still limited. This study addresses this gap by combining bibliometric techniques with topic modeling to analyze the structure and evolution of the scientific literature on Marshall–Olkin models. The analysis includes all 266 peer-reviewed publications on Marshall–Olkin models indexed in Scopus between 1981 and 2025. Bibliometric techniques (including heatmaps, clustering analyses, and temporal visualizations) are used to characterize publication patterns, source relationships, and thematic evolution. In addition, Latent Dirichlet Allocation (LDA) is used to uncover 27 topics and to examine their prevalence across journals and time periods. The results reveal five main clusters of publication sources and three temporal groupings derived from hierarchical clustering of topic distributions, reflecting the thematic progression of the field. Overall, the findings highlight both the persistence of core research themes and the emergence of new applications, particularly in areas such as Bayesian competing risks, censoring models, and parameter estimation in Weibull-based frameworks. This study provides a systematic and data-driven perspective on the intellectual evolution of Marshall–Olkin research, helping scholars identify emerging trends and potential directions for future work.

Keywords: Marshall–Olkin distributions; probabilistic modeling; bibliometric analysis; topic modeling; latent Dirichlet allocation; scientific mapping; temporal trend analysis; clustering visualization

MSC: 62-08; 62H30; 62F10; 68T50

1. Introduction

The Marshall–Olkin bivariate distribution (BMO) has emerged as a foundational statistical tool for modeling joint dependence between random variables. Since its introduction by Marshall and Olkin in 1967, the distribution has been widely adopted because of its ability to explicitly represent dependence generated by common shocks affecting multiple components [1]. Classical lifetime distributions such as the exponential or Weibull models typically assume independence between variables and therefore cannot adequately capture such dependence structures in reliability and survival data. More general multivariate models, such as the bivariate normal distribution, impose symmetric dependence structures and may fail to capture asymmetric or tail-dependent behavior commonly observed in reliability and financial data. Likewise, although copula-based approaches (particularly Archimedean copulas [2]) are widely used to model dependence, they generally lack an explicit shock-based interpretation, which is often desirable in reliability systems. The Marshall–Olkin framework addresses these limitations by incorporating shared shock mechanisms, allowing the joint modeling of dependent failure events. As a result, its applications span diverse fields, including reliability engineering, survival analysis, finance, and actuarial science [3].
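The common-shock mechanism described above can be illustrated with a short simulation. The sketch below assumes the classical bivariate exponential form of the model, in which three independent exponential shocks act on two components; the function names, rate values, and sample size are illustrative choices, not part of the study. It estimates the share of simultaneous failures, which in this form of the model equals λ12 / (λ1 + λ2 + λ12).

```python
import random


def sample_marshall_olkin(lam1, lam2, lam12, rng):
    """Draw one (X, Y) pair from a Marshall-Olkin bivariate exponential model.

    Three independent exponential shocks are assumed: one hitting only
    component 1 (rate lam1), one hitting only component 2 (rate lam2),
    and a common shock hitting both components (rate lam12).
    """
    z1 = rng.expovariate(lam1)
    z2 = rng.expovariate(lam2)
    z12 = rng.expovariate(lam12)
    # Each component fails at its first shock, individual or common.
    return min(z1, z12), min(z2, z12)


def simultaneous_failure_share(lam1, lam2, lam12, n=100_000, seed=1):
    """Estimate P(X == Y), the probability that the common shock arrives first."""
    rng = random.Random(seed)
    ties = 0
    for _ in range(n):
        x, y = sample_marshall_olkin(lam1, lam2, lam12, rng)
        if x == y:
            ties += 1
    return ties / n


# Illustrative rates (hypothetical): theory gives 1 / (1 + 2 + 1) = 0.25.
share = simultaneous_failure_share(1.0, 2.0, 1.0)
```

The positive probability of exact ties is what distinguishes this construction from models with a purely continuous joint density.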

In recent decades, scholarly interest in the BMO distribution has increased considerably. Its ability to accommodate various dependence structures and adapt to real-world phenomena has led to applications in modeling the lifetimes of interdependent systems in reliability, analyzing censored data and risk in survival contexts, and assessing the co-movement of financial assets under extreme conditions [4]. A notable recent advancement is the modified bivariate Marshall–Olkin Weibull model (MOBW-μ) proposed in [5], which introduces theoretical innovations in correlation structures relevant to engineering and actuarial domains. Despite its broad applicability, the growth of BMO-related literature has been somewhat fragmented, lacking a unifying framework to trace its thematic and methodological development. Existing reviews predominantly focus on theoretical models or on specific domains of application. Bibliometric or topic modeling studies that systematically explore the evolution and structure of research on the Marshall–Olkin distribution remain scarce.

A notable exception is the bibliometric study by [6], which provided a comprehensive quantitative overview of Marshall–Olkin-based distributions from 1997 to 2021. Their analysis identified key contributors, journals, and collaboration networks, and emphasized the increasing complexity and flexibility of MO-derived models. However, their approach did not incorporate probabilistic topic modeling techniques, such as Latent Dirichlet Allocation (LDA) [7], to detect latent thematic structures or temporal topic shifts. More generally, although the Marshall–Olkin framework provides a natural mechanism for modeling dependence through common shocks [1], alternative dependence models are frequently used in practice across a wide range of applications. For instance, bivariate normal models and Gaussian copulas are often preferred due to their analytical tractability, while Archimedean copulas provide flexible closed-form dependence structures that are relatively straightforward to estimate. In comparison, the Marshall–Olkin distribution may involve additional inferential and computational challenges, particularly due to the presence of singular components and the complexity of parameter estimation under censoring. These factors may partly explain why MO-based models tend to be concentrated in specialized domains such as reliability engineering, actuarial science, and survival analysis. The absence of topic modeling in existing bibliometric work on Marshall–Olkin models therefore leaves a methodological gap that our study aims to address.

Meanwhile, bibliometric and topic modeling approaches, particularly Latent Dirichlet Allocation (LDA), have gained traction in recent years for mapping research landscapes in statistical modeling and related disciplines [8,9]. LDA remains one of the most widely used machine learning algorithms for topic discovery in large document collections. As a probabilistic model based on the bag-of-words representation, it assumes that documents are mixtures of latent topics characterized by distributions of words. Although this representation is semantically agnostic and relies on probabilistic assumptions, it has proved effective in uncovering latent thematic structures and tracing the intellectual evolution of research fields, as demonstrated in studies on the Weibull distribution [10,11] and copula-based reliability models [12,13,14]. At the same time, LDA assumes document exchangeability and does not explicitly model temporal dynamics, which limits its ability to capture time-varying thematic structures without additional post hoc analysis. Extensions such as Dynamic Topic Models (DTM) or Structural Topic Models (STM) provide more explicit frameworks for modeling such temporal evolution [15,16]. However, no such integrated approach has yet been applied to the domain of Marshall–Olkin distributions.

Recent studies have highlighted the need for more flexible probabilistic models to capture non-monotonic hazard behavior, such as bathtub or upside-down bathtub failure rates, commonly observed in real-world reliability applications [17,18,19]. These works emphasize that classical models such as the Weibull or exponential distributions may fall short in such contexts, prompting the development of new families such as the Alpha Power Marshall–Olkin-G, extended Lindley, and bivariate Lindley distributions. Their contributions further justify the need to map the expanding landscape of MO-based generalizations from both a bibliometric and thematic perspective.

In this context, topic modeling techniques such as Latent Dirichlet Allocation (LDA) provide a useful exploratory framework for identifying thematic structures in the literature. However, it is important to note that LDA is based on Dirichlet priors and the assumption of document exchangeability, meaning that it does not explicitly model sequential or temporal dependencies among documents. As a result, time-related topic evolution must typically be analyzed through post hoc methods or through extensions such as Dynamic Topic Models. Furthermore, because LDA relies on an unsupervised bag-of-words representation, newly proposed distribution families may not always be clearly assigned to distinct topics. Recent approaches such as KeyATM [20] allow the incorporation of domain-specific keywords to guide topic identification, which may help isolate specific model families more explicitly. Exploring such keyword-guided topic models therefore represents a promising direction for future research.

This study addresses this methodological gap by applying a combined bibliometric and topic modeling framework to explore the scientific development of research related to the Marshall–Olkin distribution. Although the Marshall–Olkin framework has applications in several domains involving dependent events, its use is most prominent in specialized areas such as reliability engineering, survival analysis, actuarial science, and risk modeling, where shock-based dependence structures arise naturally. Using LDA, we identify core thematic areas, detect trends over time, and analyze patterns of scholarly collaboration. The findings provide a structured synthesis of how BMO-related research has evolved, highlighting both well-established domains and emerging areas of inquiry. By integrating bibliometric analysis with probabilistic topic modeling, this study offers a comprehensive mapping of the Marshall–Olkin research landscape. The proposed framework enables the identification of influential contributors, publication venues, and collaboration patterns while simultaneously uncovering latent thematic structures and emerging research directions within the field. This dual perspective provides valuable insights for researchers, methodologists, and practitioners engaged with the Marshall–Olkin framework.

2. Materials and Methods

2.1. Data Collection

To ensure methodological rigor, we adhered to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines for document selection and screening [21] (see Figure 1). The search was conducted in the Scopus database, one of the largest curated repositories of the peer-reviewed literature and a widely used source for bibliometric studies due to its broad journal coverage and structured metadata.

The search strategy was designed to capture publications explicitly referring to Marshall–Olkin models and their most common extensions. The query was applied to the Title, Abstract, and Keywords fields (TITLE-ABS-KEY) as follows:

The asterisk (*) represents a wildcard operator that allows the retrieval of all word variants sharing the same root (e.g., “copula*” includes “copula” and “copulas”). Given the relatively specialized and limited vocabulary associated with the Marshall–Olkin framework, the search space is comparatively focused, which facilitates both the construction of the query and the subsequent topic modeling analysis. The search string included both “Marshall-Olkin” and “Marshall Olkin” to account for spelling variations across publications and indexing formats. The additional terms “bivariate”, “shock model”, and “copula” were added to identify the main statistical contexts in which the Marshall–Olkin framework is typically applied, including bivariate distributions, shock models, and copula-based dependence structures.

Scopus was selected as the primary data source because it offers extensive coverage of peer-reviewed journals in statistics, reliability engineering, and applied probability, which are the main disciplines where Marshall–Olkin models are studied [22,23]. Several comparative studies have reported that Scopus offers broader journal coverage and more extensive citation indexing than the Web of Science in many scientific fields, making it particularly suitable for bibliometric analyses (e.g., [22,23]). However, both databases exhibit similar temporal trends in publication activity, supporting the robustness of the patterns observed in this study. In addition, its structured metadata facilitates large-scale scientometric analysis and topic modeling workflows. To maintain database consistency and avoid duplication across indexing systems, the analysis was conducted using Scopus as the single source of bibliographic records.

The initial query yielded 268 documents, from which one duplicate was removed. No documents were excluded for lacking abstracts. One document was discarded due to missing affiliation information. In the final eligibility stage, two documents were excluded because they did not meet the minimum criteria for full-text analysis, resulting in a total of 266 documents included for bibliometric and topic modeling analysis.

The resulting corpus size reflects the specialized nature of the Marshall–Olkin family of distributions within statistical research. Unlike broader statistical distribution families, Marshall–Olkin models constitute a focused research stream primarily associated with reliability theory, survival analysis, and multivariate dependence modeling. Therefore, the final set of 266 publications represents the core body of literature explicitly addressing this framework, ensuring thematic consistency for the bibliometric and topic modeling analyses.

2.2. Bibliometric Analysis

In the landscape of scientific knowledge production, bibliometric analysis plays a vital role in mapping how research fields evolve over time. Beyond quantitative metrics, its value lies in uncovering collaboration patterns, citation networks, and thematic trajectories that shape scholarly influence across disciplines [21]. We applied bibliometric analysis to examine the development of the literature on Marshall–Olkin models and their extensions. The analysis follows the classical bibliometric framework, which distinguishes three complementary dimensions of scientific development:

Social structure: examining co-authorship networks and country-level collaborations to identify patterns of scholarly interaction.
Intellectual structure: assessing the citation impact of highly cited documents, authors, and sources to detect influential contributions in the field.
Conceptual structure: mapping emerging themes and clusters of keywords to understand how theoretical developments and applications have evolved over time.

In the present study, the conceptual structure is further examined using topic modeling techniques applied to the document corpus. From a methodological perspective, these dimensions may also be interpreted as document-level metadata that could be incorporated directly into more advanced topic modeling frameworks, such as Structural Topic Models, enabling a more granular analysis of topic prevalence across sources, time periods, or collaboration patterns. However, in this work they are analyzed through complementary bibliometric indicators and standard topic modeling methods. To operationalize these dimensions, we followed two complementary methodological approaches commonly used in the bibliometric literature [24]:

Performance analysis, which evaluates the productivity and impact metrics of authors, journals, and countries.
Science mapping, which is implemented through probabilistic topic modeling and semantic clustering and reveals knowledge structures and thematic patterns within the field.

2.3. Topic Modeling Analysis

To identify latent themes in the corpus, we employed Latent Dirichlet Allocation (LDA), a hierarchical Bayesian generative model in which each document is assumed to be a mixture of latent topics, and each topic is characterized by a distribution over words [7]. Although several advanced topic modeling approaches have been proposed in recent years, LDA remains one of the most widely adopted probabilistic models for large-scale textual analysis due to its interpretability and robustness in exploratory bibliometric studies. This model is particularly well suited for large text collections, allowing the discovery of underlying thematic structures that are not easily detectable through manual review [8]. In this study, LDA was used to identify the main topics and emerging trends in research involving the Marshall–Olkin bivariate distribution.

Formally, LDA defines a generative process in which the topic distribution of each document is drawn from a Dirichlet distribution with parameter α, while the topic-word distributions are drawn from a Dirichlet distribution with parameter β. For each token in a document, a topic is assigned according to the document-specific topic distribution, and the observed word is generated from the corresponding topic-word distribution. Exact posterior inference in LDA is intractable because of the marginalization over latent variables, so approximate Bayesian inference methods are typically employed. In this study, we used collapsed Gibbs sampling, which iteratively updates topic assignments based on their conditional posterior probabilities while integrating out the document-topic and topic-word distributions. This procedure allows efficient estimation, and after convergence the sampled assignments yield the document-topic and topic-word distributions used for downstream analysis.
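To make the sampling step concrete, the following is a minimal Python sketch of a collapsed Gibbs sampler for LDA with symmetric priors. It is not the implementation used in this study, which relied on R packages; the function name, default hyperparameter values, and iteration count are assumptions chosen for illustration. Each token's topic is resampled with probability proportional to (n_dk + α)(n_kw + β) / (n_k + Vβ), where the counts exclude the token being updated.

```python
import random


def collapsed_gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
                        n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (theta, phi): document-topic and topic-word probability estimates.
    """
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]                 # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(n_topics)]    # word counts per topic
    n_k = [0] * n_topics                                  # total tokens per topic
    z = []                                                # topic of every token

    # Random initial assignment of a topic to each token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from all counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Conditional posterior over topics, up to a constant.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Record the new assignment.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Point estimates from the final state of the chain.
    theta = [
        [(n_dk[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta)
         for w in range(vocab_size)]
        for t in range(n_topics)
    ]
    return theta, phi
```

The sketch reads its estimates from the last sweep only; practical implementations usually discard a burn-in period and may average over several retained samples.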

The LDA model requires the specification of several parameters, including the number of topics K and the hyperparameters α and β. In this study, multiple candidate models were estimated with values of K ranging from 4 to 30. The optimal number of topics was selected by evaluating model coherence across these candidate values and considering interpretability criteria. In particular, topic coherence was computed using the C_v coherence measure implemented in the textmineR package, which evaluates the semantic similarity of the most representative terms within each topic and is widely used to assess topic model interpretability. The hyperparameters α and β were set using standard symmetric priors, which allow flexible document-topic and topic-word distributions while maintaining computational stability during Gibbs sampling.
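The selection of K by coherence can be sketched as follows. For brevity, the example substitutes the simpler UMass coherence, which is based on document co-occurrence counts, for the C_v measure named above, so it illustrates the selection logic only and does not reproduce the scores used in this study. The function names and the toy scores are hypothetical.

```python
import math


def umass_coherence(top_words, docs):
    """UMass coherence of one topic (a stand-in here, not the C_v measure).

    top_words: topic terms ordered from most to least probable.
    docs: list of tokenized documents (lists of strings).
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                continue  # skip terms that never occur in the corpus
            d_ml = doc_freq(top_words[m], top_words[l])
            score += math.log((d_ml + 1) / d_l)
    return score


def select_k(coherence_by_k):
    """Return the candidate K whose topics have the highest mean coherence.

    coherence_by_k: dict mapping K to the list of per-topic coherence scores.
    """
    return max(
        coherence_by_k,
        key=lambda k: sum(coherence_by_k[k]) / len(coherence_by_k[k]),
    )


# Hypothetical scores for three candidate models; the study scanned K = 4..30.
best_k = select_k({4: [-1.0, -2.0], 5: [-0.5, -0.7], 6: [-3.0]})
```

As described above, the final choice of K also weighed interpretability, so a purely numerical rule such as `select_k` would only provide a starting point.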

While the exchangeability and bag-of-words assumptions of LDA may limit its ability to capture complex temporal interactions in historical corpora, the model remains widely used for exploratory thematic mapping due to its interpretability and robustness in large text collections. In the present study, LDA is therefore employed primarily as a baseline topic discovery tool to identify the latent thematic structure of the Marshall–Olkin research corpus. Modeling temporal topic evolution through sequential frameworks, such as Dynamic Topic Models or covariate-based topic models, is left for future research.

The implementation involved three main phases. First, corpus preprocessing was conducted through tokenization, stopword removal, and stemming. Second, the model was estimated and evaluated using topic coherence, a widely used measure that assesses the degree of semantic similarity among the most representative words within each topic [25]. Higher coherence indicates that the top terms in a topic tend to co-occur more frequently in the corpus, yielding more interpretable structures than alternatives such as perplexity, which focus on predictive likelihood but often lead to less coherent results. Finally, the topics were manually labeled by the authors based on the most representative terms and on the semantic interpretation of the topic-word distributions. In addition, to analyze their evolution over time, we computed topic prevalence across the publication years of the documents in the corpus. This integrated approach provided both a structural overview of scholarly activity and a semantic mapping of topic evolution in the field of Marshall–Olkin distribution models.
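One common way to compute topic prevalence by year is to average the document-topic proportions over the documents published in each year. The sketch below assumes that definition, which the text does not state explicitly; the function name and the toy inputs are hypothetical.

```python
from collections import defaultdict


def topic_prevalence_by_year(theta, years):
    """Average document-topic proportions within each publication year.

    theta: list of per-document topic proportion lists (each summing to 1).
    years: list of publication years, aligned with theta.
    Returns a dict mapping year to the mean topic proportions for that year.
    """
    sums = {}
    counts = defaultdict(int)
    for props, year in zip(theta, years):
        if year not in sums:
            sums[year] = [0.0] * len(props)
        for k, p in enumerate(props):
            sums[year][k] += p
        counts[year] += 1
    return {
        year: [s / counts[year] for s in sums[year]]
        for year in sorted(sums)
    }


# Toy input: three documents, two topics, two publication years.
prevalence = topic_prevalence_by_year(
    [[0.8, 0.2], [0.4, 0.6], [0.1, 0.9]],
    [2015, 2015, 2020],
)
```

Because each document's proportions sum to one, the yearly averages also sum to one, which makes years with different publication counts directly comparable.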

To analyze the distribution of topics across publication sources and years, we constructed matrices representing the proportion of each topic within sources and publication years. These matrices were used to identify structural similarities among sources and temporal patterns in the literature. Clustering was performed on the source-topic proportion matrix, where each row represents a publication source and each column corresponds to the proportion of documents associated with a given topic. The k-means algorithm [26] was applied to partition the sources into groups based on similarity in their topic proportion profiles. Although LDA is based on a Dirichlet generative framework and k-means relies on distance-based partitioning, their combination is commonly used in exploratory analyses, as the topic proportion vectors provide a suitable representation for identifying similarity patterns across sources. The optimal number of clusters was determined using the elbow method, which evaluates the within-cluster sum of squares (WSS) across multiple candidate values of k. Based on this criterion, five clusters were selected for the analysis of publication sources. To visualize the resulting structures, heatmaps combined with hierarchical clustering (complete linkage) were generated. These visualizations allow the identification of groups of sources and years that share similar topic distributions, providing an interpretable representation of thematic concentration patterns across the literature.
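The clustering step and the elbow criterion can be illustrated with a compact Python sketch of Lloyd's k-means algorithm applied to rows of topic proportions. It is a simplified stand-in for the R routines used in this study; the function names, the random initialization, and the toy matrix are assumptions. The returned WSS values are the quantities one would plot against k to look for an elbow.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(row, centroids):
    """Index of the centroid closest to the given row."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(row, centroids[c]))


def kmeans(rows, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    # Initialize centroids with k distinct rows chosen at random.
    centroids = [list(row) for row in rng.sample(rows, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [nearest(row, centroids) for row in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    wss = sum(squared_distance(row, centroids[lab])
              for row, lab in zip(rows, labels))
    return labels, wss


def elbow_curve(rows, k_values, seed=0):
    """WSS for each candidate number of clusters."""
    return {k: kmeans(rows, k, seed=seed)[1] for k in k_values}


# Toy source-topic matrix: four hypothetical sources, two topics.
toy_rows = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
toy_curve = elbow_curve(toy_rows, [1, 2])
```

In this toy example the WSS drops sharply from k = 1 to k = 2, which is the kind of bend the elbow method looks for; with real data the curve is usually smoother and the choice involves more judgment.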

Several extensions of LDA have been proposed to address some of its limitations. For example, Dynamic Topic Models (DTM) introduce temporal dependencies between topics, enabling the analysis of topic evolution over time. Structural Topic Models (STM) incorporate document-level metadata to explain variation in topic prevalence across documents. Other approaches, such as keyword-guided models like KeyATM [20] or Seeded LDA [27], allow the integration of prior knowledge through domain-specific keywords to guide topic identification. While these models offer valuable methodological extensions, the present study adopts standard LDA as a widely used baseline method for exploratory thematic mapping of the Marshall–Olkin literature.

2.4. Software and Computational Tools

All computational analyses were conducted using the R statistical environment (version 4.4.1). Bibliometric indicators and performance analysis were implemented using the bibliometrix package (version 4.5.0) [24], which provides an integrated framework for bibliometric and scientometric analysis. Scientific mapping and network visualizations were generated using VOSviewer (version 1.6.20), a widely used tool for constructing and visualizing bibliometric networks. Text preprocessing and topic modeling procedures were implemented in R using the packages quanteda, tm, and textmineR, which provide tools for corpus processing, document-term matrix construction, and probabilistic topic modeling. The LDAvis package [28] was used to facilitate the visualization and interpretation of the resulting topic structures.

3. Results

3.1. General Information

Over the 44-year period from 1981 to 2025, a total of 266 publications from 119 sources were retrieved and analyzed (Table 1). This table outlines the bibliometric characteristics of research related to the Marshall–Olkin family of models, highlighting both productivity and collaboration trends. The dataset shows an annual publication growth rate of 3.73%, with documents averaging 10.4 years in age and receiving approximately 12 citations per publication. Most of the contributions are research articles (n = 250), followed by a smaller number of conference proceedings and book chapters, whereas no monographs were recorded.
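The annual growth rate reported above is presumably a compound annual growth rate computed from the publication counts of the first and last years, as is common in bibliometric summaries; this reading, the function name, and the counts used below are assumptions, since the yearly counts are not listed in this section.

```python
def compound_annual_growth_rate(first_count, last_count, n_intervals):
    """Compound annual growth rate in percent.

    first_count, last_count: publication counts in the first and last year.
    n_intervals: number of yearly intervals between those two years.
    """
    return ((last_count / first_count) ** (1.0 / n_intervals) - 1.0) * 100.0


# Purely hypothetical counts: 1 publication in the first year and 5 in the
# last, 44 intervals apart, give a rate of roughly 3.73 percent.
rate = compound_annual_growth_rate(1, 5, 44)
```

Because only the two end-point counts enter the formula, the indicator is sensitive to unusually low or high output in the first or last year of the window.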

The corpus includes 404 distinct authors, with 29 producing single-authored works, which account for 41 documents. The average number of co-authors per publication is 2.42, and the rate of international collaboration reaches 30.08%, reflecting a moderate-to-high level of global engagement. In terms of conceptual diversity, the dataset contains 845 author-provided keywords and 655 Keywords Plus, which refer to terms automatically generated by the database based on the titles of cited references, rather than the authors’ original keywords. This distinction helps capture additional latent thematic structures in the literature.

Taken together, these indicators suggest that the field of Marshall–Olkin modeling is both active and relatively collaborative, with a growing international presence and increasing thematic diversification. This profile points to a research area that, while grounded in mathematical theory, is progressively branching into applications where dependence modeling, reliability analysis, and copula structures play a key role. Table 1 provides a valuable snapshot of the field’s development and scholarly ecosystem, setting the stage for a deeper topic modeling analysis in the sections that follow.

Figure 2 depicts the temporal distribution of publications related to Marshall–Olkin models, along with the average number of citations per year. Although early contributions were sporadic before 2000, the field experienced a notable rise in output beginning in the mid-2000s, with a clear acceleration from 2015 onward. This upward trend suggests increasing scholarly engagement with the topic, particularly in the past decade.

The citation trend, represented by the dark line, has remained relatively stable over time, even as the volume of publications has grown. This pattern indicates a sustained level of academic interest and relevance for work in the field, reinforcing its conceptual and applied significance. The pronounced growth after 2015 may reflect both internal drivers, such as expanding applications in reliability theory, risk analysis, and dependence modeling, and external factors, including greater access to research tools, open access platforms, and global collaborations. Importantly, while part of this increase aligns with the general global surge in scientific publishing, the consistent citation rates and focused thematic development suggest that the Marshall–Olkin research community is not simply following publication trends but advancing a specialized and evolving research agenda. These findings are in line with the bibliometric indicators presented in Table 1 and underscore the dynamic yet stable trajectory of the field.

3.2. Sources

Table 2 presents the top 30 journals that have published research on the Marshall–Olkin family of models between 1981 and 2025. These journals are ranked by number of publications, while total citations and h-index are also reported, providing a comprehensive overview of the field’s core dissemination outlets.
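For reference, the h-index reported for sources and authors is the largest number h such that h publications have each received at least h citations. A minimal Python sketch, with a hypothetical function name and illustrative citation counts, is:

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Illustrative citation counts for five publications: the h-index is 4.
example_h = h_index([10, 8, 5, 4, 3])
```

Note that the index computed from the corpus reflects only the Marshall–Olkin publications retrieved, not the full output of a journal or author.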

In addition, Table 2 reports the publisher of each journal to provide further context regarding the editorial structure of the field. The distribution of outlets reveals a strong presence of major academic publishers such as Elsevier, Taylor & Francis, Springer, and MDPI. This concentration highlights the central role of established statistical and applied mathematics publishers in disseminating research on Marshall–Olkin models. At the same time, the presence of engineering-oriented publishers such as IEEE and Cambridge University Press reflects the interdisciplinary nature of the field, particularly its connections with reliability engineering, risk analysis, and applied probability.

At the forefront is the Communications in Statistics—Theory and Methods (rank 1), followed by highly productive and influential outlets such as the Journal of Multivariate Analysis (rank 2) and Computational Statistics and Data Analysis (rank 3). These journals serve as central venues for foundational and applied work in dependence modeling and multivariate analysis. Several other specialized journals, including Communications in Statistics: Simulation and Computation (rank 4), Methodology and Computing in Applied Probability (rank 5), and the IEEE Transactions on Reliability (rank 6), reflect the methodological breadth and application-oriented focus of research in this domain. The presence of the Journal of Statistical Computation and Simulation (rank 7) and the Journal of Applied Statistics (rank 8) further reinforces the statistical orientation of the field. In addition, multidisciplinary and emerging platforms such as Mathematics (rank 9), Symmetry (rank 16), and Springer Proceedings in Mathematics and Statistics (rank 19) have contributed to the field’s dissemination, highlighting both its theoretical foundation and expansion into modern applied contexts.

The diversity of publication venues, from theoretical outlets like the Journal of Statistical Planning and Inference (rank 10) and Metrika (rank 12) to application-driven journals like Reliability Engineering and System Safety (rank 25), suggests a broad and evolving scholarly ecosystem. Journals such as Fuzzy Sets and Systems (rank 14) and Stochastic Environmental Research and Risk Assessment (rank 22) also underscore the growing intersection between Marshall–Olkin modeling and fields such as artificial intelligence, risk management, and environmental statistics. The consistent presence of specialized journals like the Journal of Statistical Theory and Practice (rank 15) and Model Assisted Statistics and Applications (rank 30) illustrates the steady development of a dedicated research community. Meanwhile, the inclusion of international journals such as the Brazilian Journal of Probability and Statistics (rank 21) and the Pakistan Journal of Statistics and Operation Research (rank 23) reflects geographically distributed engagement with the topic.

Taken together, the distribution of publication outlets demonstrates both the theoretical depth and interdisciplinary potential of Marshall–Olkin models. This variety of journals illustrates the field’s capacity to bridge fundamental statistical theory with applied domains ranging from reliability and engineering to decision sciences and stochastic systems.

3.3. Authors

Table 3 identifies the 30 most prolific contributors to Marshall–Olkin research during the period 1981–2025. These authors are ranked by number of publications, with citations and h-index reported alongside, reflecting both research output and academic influence.

In addition, Table 3 reports the primary affiliation country of each author to provide further context regarding the geographic distribution of leading contributors in Marshall–Olkin research. The results reveal a diverse international landscape, with prominent contributions from authors affiliated with institutions in India, China, Egypt, Germany, Italy, Brazil, Canada, the United States, France, and Slovenia. This geographic diversity highlights the global diffusion of Marshall–Olkin research and confirms that the field has developed through contributions from multiple regional and disciplinary traditions.

Leading the list is Kundu D. (rank 1), whose 18 publications and 361 citations attest to a significant and lasting presence in the field. Other major contributors include Hanagal D.D. (rank 2) and Scherer M. (rank 3), both of whom have established strong citation records and h-index values, indicating consistent scholarly engagement over time. Scholars such as Shi Y. (rank 4) and Eliwa M.S. (rank 5) have also emerged as influential voices in the last decade, contributing to both theoretical and applied aspects of Marshall–Olkin modeling. The diversity of contributors further reflects the global nature of the field. For instance, authors such as Kolev N. (rank 6), Mai J.-F. (rank 7), and Durante F. (rank 9) have worked extensively on copula theory and multivariate dependence modeling, which constitute central themes within the Marshall–Olkin framework. Meanwhile, Balakrishnan N. (rank 17), a well-established figure in reliability analysis, underscores the methodological depth of the literature.

Recent contributors such as Gui W. (rank 13), Wang L. (rank 16), and Abuelamayem O.A. (rank 26) illustrate the ongoing expansion of research in this area, supported by international collaborations and emerging applications in data science, actuarial science, and risk modeling. Overall, the list portrays a research community that balances foundational scholarship with practical relevance. Authors such as Li H. (rank 20) and Xu A. (rank 22) have achieved notable impact with relatively few publications, suggesting the presence of highly cited seminal contributions. The appearance of scholars such as Omladič M. (rank 14) and Rubino G. (rank 24) also highlights interdisciplinary connections with fields such as applied mathematics and operations research. Together, these results emphasize the intellectual architecture of the Marshall–Olkin research domain, characterized by methodological sophistication, international reach, and thematic evolution. The bibliometric profile captured here provides a valuable foundation for understanding the structure and development of the author community within this field.

3.4. Countries

Figure 3 shows the worldwide distribution of research output on Marshall–Olkin models across 30 countries between 1981 and 2025, based on logarithmic scaling of total scientific production. This spatial representation reveals clear geographic clustering, with notable concentrations in Asia, North America, and parts of Europe.

Table 4 reports bibliometric indicators for the top 30 contributing countries in Marshall–Olkin research, ranked by number of publications. Citations and average article impact are also included to illustrate the relative scientific influence of each country.

Among the top contributors to Marshall–Olkin research, China (rank 1) leads in publication count with 94 documents and 327 total citations, corresponding to 3.5 citations per publication. Close behind are India (rank 2) with 88 publications and 452 citations (5.1 citations per publication), and Egypt (rank 3) with 68 papers and 296 citations (4.4 per publication). These figures underscore the growing scientific influence of these countries in the development and application of statistical modeling techniques. The United States (rank 4) and Canada (rank 5) also make significant contributions, with 67 and 34 publications, respectively, and citation rates of 4.0 and 6.4 per publication, highlighting their continued presence in high-impact research supported by long-standing academic networks. In Europe, countries such as Germany, Italy, and France maintain consistent output and moderate citation performance, with citation rates generally ranging between 3 and 5 citations per publication.
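The average article impact quoted above is simply total citations divided by the number of publications. As a minimal arithmetic check, the sketch below recomputes the ratio for the three leading countries from the totals given in the text; the function and variable names are illustrative and not part of the original analysis, which was not carried out in Python.

```python
# Publication and citation totals as quoted in the text: (publications, citations).
country_totals = {
    "China": (94, 327),
    "India": (88, 452),
    "Egypt": (68, 296),
}


def citations_per_publication(publications: int, citations: int) -> float:
    """Average article impact: total citations divided by number of publications."""
    if publications <= 0:
        raise ValueError("publications must be positive")
    return citations / publications


# Round to one decimal place, matching the precision reported in the text.
impact = {
    country: round(citations_per_publication(pubs, cites), 1)
    for country, (pubs, cites) in country_totals.items()
}

for country, value in impact.items():
    print(f"{country}: {value} citations per publication")
```

The rounded ratios agree with the values reported for China, India, and Egypt.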

Smaller but highly impactful contributors include Austria (rank 21) and the United Kingdom (rank 20), which, despite producing only 4 and 6 publications, exhibit high citation rates of 25.0 and 4.5 per publication, respectively. These values suggest the presence of influential or highly cited contributions. Emerging contributors such as Pakistan (rank 18, 5.9 citations per publication), Spain (rank 16, 3.3), and Chile (rank 15, 1.8) also demonstrate notable citation-to-publication ratios. Their presence indicates that impactful research is not confined solely to high-output countries but also emerges from smaller academic communities. Meanwhile, countries such as Colombia (rank 23), Iraq (rank 24), and the United Arab Emirates (rank 30) appear with low or zero citation counts, suggesting recent or still-developing participation. Nevertheless, their inclusion in the ranking reflects an expanding geographic interest in Marshall–Olkin models across regions previously underrepresented in the literature. In summary, the bibliometric indicators (publication volume, total citations, and citations per publication) reveal a dual dynamic: the consolidation of established research hubs alongside the emergence of new contributors. This pattern reinforces the view of Marshall–Olkin model research as a globally distributed and increasingly interdisciplinary area within applied probability and reliability theory.

3.5. Topics Identification

Table 5 provides an overview of the 27 latent topics inferred from the corpus of articles on Marshall–Olkin models using Latent Dirichlet Allocation (LDA). The topics are sorted by their estimated prevalence within the dataset. The optimal number of topics (K = 27) was determined using the coherence score, which reached its maximum value (0.145) for this specification, indicating the most semantically consistent and interpretable solution. Topic coherence evaluates the semantic similarity among the top terms within each topic and is widely used to assess topic-model interpretability. Although several alternative criteria exist for selecting the number of topics (such as perplexity, held-out likelihood, or stability-based approaches), coherence measures are often preferred in exploratory thematic analyses because they tend to produce more interpretable topic structures. Note that topics in standard LDA are not strictly independent in practice: documents are modeled as mixtures of topics, and some vocabulary may appear across multiple topics. Consequently, certain topics may share common terms, reflecting thematic overlap within the research corpus rather than strict probabilistic independence.
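To make the selection rule concrete, the following sketch scores candidate models by mean topic coherence and keeps the number of topics with the highest score. The text does not state which coherence measure was used, so the UMass document co-occurrence measure is swapped in here purely for illustration. The toy corpus, the candidate top-term lists, and all function names are hypothetical, and Python is used instead of the R workflow mentioned in the paper.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic.

    Sums log((D(w_m, w_l) + 1) / D(w_l)) over ordered pairs of top words,
    where D counts the documents containing all of the given words.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(all(w in d for w in words) for d in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            denom = doc_freq(top_words[l])
            if denom == 0:
                continue  # word absent from the corpus: skip the pair
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / denom)
    return score


# Hypothetical stemmed corpus (one token list per document).
docs = [
    ["bayesian", "censor", "compet", "risk"],
    ["bayesian", "censor", "posterior"],
    ["copula", "tail", "depend"],
    ["copula", "depend", "multivari"],
    ["shock", "system", "reliabl"],
    ["compet", "risk", "censor"],
]

# Hypothetical top-term lists for two candidate values of K.
candidates = {
    2: [["censor", "copula"], ["shock", "depend"]],
    3: [["censor", "bayesian"], ["copula", "depend"], ["shock", "system"]],
}

# Mean coherence per candidate K; the preferred K maximizes this score.
mean_coherence = {
    k: sum(umass_coherence(t, docs) for t in topics) / len(topics)
    for k, topics in candidates.items()
}
best_k = max(mean_coherence, key=mean_coherence.get)
print(mean_coherence, best_k)
```

In a full workflow, an LDA model would be refitted for each candidate K and the top terms of each fitted topic would replace the hand-written lists above.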

After fitting the LDA model and identifying 27 distinct topics, each topic was characterized by a ranked set of top terms based on frequency and semantic relevance. To enhance interpretability and support thematic synthesis, topic labels were manually assigned by the authors, who have domain expertise in reliability modeling and Marshall–Olkin distributions. This involved reviewing the most salient terms associated with each topic and identifying recurring concepts, domain-specific terminology, and contextual patterns. For example, topics including terms such as “competing risk”, “censor”, and “Bayesian” were labeled as Bayesian competing risks and censoring, while those dominated by terms like “shock”, “system”, and “reliability” were categorized as system reliability and shock models. This expert-driven labeling approach ensures semantic coherence and aligns each topic with recognized subfields in the literature on Marshall–Olkin modeling and its extensions. The topics are presented in Table 5, ordered by their estimated prevalence in the corpus, together with their labels, prevalence values, publication counts, and top associated terms.

The terms associated with each topic (e.g., compet, estim, copula, failur, reliabl) result from text preprocessing techniques commonly used in traditional topic modeling workflows, including tokenization, stopword removal, and stemming [7]. Stemming standardizes word forms to their lexical roots. This procedure, implemented using the Porter stemming algorithm in R, reduces lexical variation (e.g., estimate, estimated, and estimating are reduced to estim), thereby enhancing topic coherence and reducing dimensionality in bag-of-words representations. Each topic is labeled with an internal identifier (e.g., t₄, t₂₀) and is characterized by its relative prevalence, number of publications, and a set of top representative terms. Importantly, topic prevalence and the number of publications are related but not identical indicators. High prevalence with relatively few publications indicates that a topic is strongly concentrated within a small set of documents, meaning those articles are heavily dominated by the corresponding theme. Conversely, lower prevalence with higher publication counts suggests broader but more diffuse coverage across the literature. This distinction is crucial for interpreting the results, as it highlights the difference between thematic intensity within documents and thematic dispersion across the corpus.
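The distinction between prevalence and publication count can be illustrated with a small document-topic matrix. In this sketch, prevalence is taken as the mean topic proportion (gamma) across documents, and the publication count as the number of documents whose dominant topic is the one in question. The counting rule is an assumption, since the text does not specify how publications were assigned to topics, and the matrix values are toy numbers rather than results from the corpus.

```python
# Toy document-topic matrix (rows: documents, columns: topics); each row sums to 1.
gamma = [
    [0.90, 0.05, 0.05],
    [0.90, 0.05, 0.05],
    [0.10, 0.50, 0.40],
    [0.10, 0.50, 0.40],
    [0.10, 0.50, 0.40],
]


def topic_prevalence(gamma):
    """Mean proportion of each topic across all documents."""
    n_docs = len(gamma)
    n_topics = len(gamma[0])
    return [sum(row[t] for row in gamma) / n_docs for t in range(n_topics)]


def dominant_topic_counts(gamma):
    """Number of documents whose largest gamma value falls on each topic (assumed rule)."""
    counts = [0] * len(gamma[0])
    for row in gamma:
        counts[row.index(max(row))] += 1
    return counts


prevalence = topic_prevalence(gamma)
counts = dominant_topic_counts(gamma)

for t, (p, c) in enumerate(zip(prevalence, counts)):
    print(f"topic {t}: prevalence={p:.2f}, publications={c}")
```

Here the first topic has the highest prevalence but fewer assigned publications than the second, because it dominates a small number of documents very strongly.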

The most prevalent topic, t₄ (6.10%), corresponds to Bayesian competing risks and censoring, encompassing themes such as failure times, censoring mechanisms, and posterior inference. Other frequently occurring topics include t₂₀ (5.38%), labeled Estimation of Unknown Parameters in Weibull models, and t₂₇ (4.53%), titled system reliability and shock models. These capture core areas such as parametric estimation, lifetime analysis, and component-based reliability. Some topics address foundational modeling concerns, such as t₈ on hazard rates and failure modeling, t₂₅ on stress-strength models, and t₂₄ on tail dependence in multivariate settings. Others reflect more specialized or advanced themes, including copula theory (t₂₃, t₁₄), Markov chain simulation (t₆), and stochastic ordering (t₃). Additionally, several moderately prevalent topics (t₂₆, t₅, and t₁₃) point toward practical applications, including system optimization, reliability engineering, and limit theorems in discrete contexts. Less frequent yet potentially emerging areas include t₂₁ (Bayesian Procedures and Statistical Frameworks) and t₁₈ (Systemic Risk in Banking Systems), which are characterized by terms such as bank_system, distort, and qualiti. Taken together, the topic modeling results offer a structured synthesis of the thematic landscape in the Marshall–Olkin literature. The presence of both well-established domains and emergent research threads illustrates the LDA model’s capacity to capture the field’s intellectual evolution and thematic diversity.

To validate the thematic coherence of the extracted topics, we examined representative papers from the corpus that exhibit high posterior probabilities (gamma values) for specific topics. For instance, topic t₅ (Marshall–Olkin Copulas and Reliability) is exemplified by the article “Calibrating a dependent failure model for components under Marshall-Olkin copulas” (2016), published in the Winter Simulation Conference Proceedings, which directly addresses reliability modeling within the MO framework. Similarly, topic t₂₇ (system reliability and shock models) is reflected in the paper “Reliability of a k-out-of-n: G System Subjected to Multiple Failure Criteria under Marshall-Olkin Models” (2023), offering practical insights into system-level reliability assessment. These connections between high-gamma documents and their respective topics reinforce the interpretability of the LDA results and support their alignment with established research themes in the Marshall–Olkin literature.

3.6. Topic Trends

To assess how scholarly attention has evolved over time, we analyzed the year-by-year distribution of topic proportions across all documents in the corpus. To provide a comprehensive view of the thematic dynamics of the field, Figure 4 displays the temporal prevalence of all 27 topics inferred from the LDA model. The plots reflect smoothed trends derived from LDA-generated topic probabilities, highlighting thematic shifts in Marshall–Olkin model research from 1981 to 2025. The horizontal axis corresponds to publication years, while the vertical axis indicates the average annual contribution of each topic. Color coding distinguishes the nature of each trend: red for increasing topics, blue for declining ones (none observed), and black for relatively stable trends.
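A simple way to reproduce this kind of trend labeling is to average a topic's proportions by publication year and examine the direction of the resulting series. The sketch below uses an ordinary least-squares slope with a tolerance band in place of the smoothing applied in Figure 4. The classification rule, the tolerance value, and the toy records are all assumptions, as the text does not state the criterion used to color the trends.

```python
from collections import defaultdict
from statistics import mean


def annual_means(records):
    """Average topic proportion per publication year from (year, proportion) pairs."""
    by_year = defaultdict(list)
    for year, proportion in records:
        by_year[year].append(proportion)
    return {year: mean(values) for year, values in sorted(by_year.items())}


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def classify_trend(records, tolerance=0.001):
    """Label a topic as increasing, declining, or stable (hypothetical rule)."""
    yearly = annual_means(records)
    if len(yearly) < 2:
        raise ValueError("at least two distinct years are required")
    slope = ols_slope(list(yearly.keys()), list(yearly.values()))
    if slope > tolerance:
        return "increasing"
    if slope < -tolerance:
        return "declining"
    return "stable"


# Toy (year, topic proportion) records for two hypothetical topics.
rising = [(2018, 0.02), (2019, 0.03), (2020, 0.04), (2020, 0.06), (2021, 0.07)]
flat = [(2018, 0.04), (2019, 0.05), (2020, 0.05), (2021, 0.04)]

print(classify_trend(rising), classify_trend(flat))
```

The tolerance controls how much year-to-year drift is still treated as stable; a smoothed or nonparametric trend test could be substituted without changing the overall structure.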

Based on the smoothed temporal trends displayed in Figure 4, several topics exhibit clear growth patterns over time. In particular, topics such as t₄ (Bayesian competing risks and censoring), t₂₀ (Estimation of Unknown Parameters in Weibull models), and t₂₃ (Marshall–Olkin Copulas and functional extensions) show noticeable upward trajectories in their temporal prevalence, reflecting increasing scholarly engagement in these areas. Specifically, t₄ has gained considerable momentum over the past decade, underscoring a growing interest in Bayesian frameworks for modeling competing risks and censored data. In parallel, the rising trajectory of t₂₀ highlights intensified research on parameter estimation techniques in Weibull and Marshall–Olkin settings, often involving maximum likelihood and Bayesian approaches. Likewise, t₂₃ shows sustained growth driven by the development of copula-based methods and functional extensions that enhance the modeling of dependence structures within the Marshall–Olkin framework. Focusing on the most recent period (2020–2025), these topics represent the subject areas with the most pronounced growth in the literature: Bayesian survival and censoring methods, parametric inference for Weibull-based Marshall–Olkin models, and copula-based dependence modeling and functional extensions. Their recent expansion suggests that the most dynamic frontier of Marshall–Olkin research lies in advanced inference under censoring, flexible lifetime modeling, and modern dependence structures.

Conversely, several topics demonstrate relative temporal stability, suggesting a foundational role in the field. For example, t₁ (Residual Life and Entropy Measures), t₅ (Marshall–Olkin Copulas and Reliability), and t₉ (Marginal and Joint Distributions) display fluctuations over time but no significant directional shifts, indicating sustained attention from the research community. Similarly, t₁₄ (Discrete Lévy Copulas and Hierarchical Structures) and t₁₇ (Lifetime Models and Likelihood Inference) maintain moderate and steady prevalence across decades, underscoring their importance as conceptual anchors in both theoretical exploration and applied modeling. These patterns, revealed through topic prevalence over time, confirm the dual nature of the Marshall–Olkin research landscape: a set of well-established, structurally stable topics coexists with emerging themes that are actively reshaping the field. This dynamic interplay between stable foundational themes and emerging research directions reinforces the interpretability of the LDA-derived topic structure and provides evidence of the gradual evolution of Marshall–Olkin research from classical reliability formulations toward more flexible inferential frameworks and modern dependence modeling approaches.

3.7. Topic Distributions Across Sources and Years

Based on the elbow method, five clusters were selected for the source variable and three clusters for the year variable, as these cutoffs marked the point at which reductions in within-cluster variance began to stabilize. This choice provides a balance between interpretability and model fit, enabling the detection of meaningful structure without overfitting. Figure 5 presents heatmaps showing the distribution of topic proportions across the top 30 journals (Figure 5a) and publication years (Figure 5b) within the domain of Marshall–Olkin models and their statistical applications. Both heatmaps employ hierarchical clustering to uncover patterns of thematic concentration and temporal evolution. To facilitate the interpretation of the statistical clustering results, we provide a qualitative explanation of the thematic patterns represented by each cluster of publication sources and publication years.
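The elbow criterion described above can be sketched as follows: clusters are merged step by step, the total within-cluster sum of squares (WCSS) is recorded for each number of clusters, and the cutoff is placed where the curve bends most sharply. The linkage rule (greedy Ward merging) and the bend detector (largest second difference) are assumptions made for this illustration, since the text does not report the linkage or distance settings, and the points are toy data rather than topic distributions from the corpus.

```python
def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)


def ward_wcss_curve(points):
    """Return {k: total WCSS} for k = n, ..., 1 under greedy Ward merging."""
    clusters = [[p] for p in points]
    curve = {len(clusters): 0.0}
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Ward criterion: increase in WCSS caused by merging clusters i and j.
                increase = (sse(clusters[i] + clusters[j])
                            - sse(clusters[i]) - sse(clusters[j]))
                if best is None or increase < best[0]:
                    best = (increase, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
        curve[len(clusters)] = sum(sse(c) for c in clusters)
    return curve


def elbow(curve):
    """Number of clusters with the largest second difference of WCSS."""
    ks = sorted(curve)
    bends = {k: curve[k - 1] - 2 * curve[k] + curve[k + 1] for k in ks[1:-1]}
    return max(bends, key=bends.get)


# Toy data: three well-separated groups of three points each.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]

curve = ward_wcss_curve(points)
print({k: round(v, 2) for k, v in sorted(curve.items())}, elbow(curve))
```

On real topic distributions the bend is usually less pronounced than in this toy example, which is why the elbow is often confirmed by visual inspection of the WCSS curve.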

In Figure 5a, warmer shades (orange to red) indicate stronger associations between journals and topics. The clustering of journals yields five distinct groups characterized by different thematic profiles in the Marshall–Olkin literature:

Cluster 5 (upper band). This cluster gathers broadly theoretical and general statistical journals such as Journal of Multivariate Analysis, Statistics and Probability Letters, Journal of Mathematical Analysis and Applications, Insurance: Mathematics and Economics, and Fuzzy Sets and Systems. The strongest intensities are observed for t₁₄ (Discrete Lévy Copulas and Hierarchical Structures), t₂₃ (Marshall–Olkin copulas and functional extensions), and t₂₄ (Tail dependence in multivariate models). These patterns indicate that, while these outlets cover a broad thematic range, they play a key role in advancing the theoretical underpinnings of dependence modeling and copula-based approaches.
Cluster 4 (second band). Includes applied statistics and reliability outlets such as IEEE Transactions on Reliability, Reliability Engineering and System Safety, Journal of Statistical Planning and Inference, Probability in the Engineering and Informational Sciences, Symmetry, and Journal of Computational and Applied Mathematics. This group shows concentrations in t₂₀ (Estimation of Unknown Parameters in Weibull models), t₂₅ (Bivariate Exponential Stress-Strength Models), and t₂₇ (system reliability and shock models), reflecting applied contexts of Marshall–Olkin models in lifetime data, stress-strength analysis, and system reliability.
Cluster 3 (third band). This cluster consists mainly of computation-oriented outlets such as Communications in Statistics: Simulation and Computation, Communications in Statistics—Theory and Methods, Journal of Statistical Computation and Simulation, Journal of Applied Statistics, Computational Statistics and Data Analysis, and Mathematics. The dominant intensities are observed in t₄ (Bayesian competing risks and censoring), t₂₀ (Weibull parameter estimation), and t₁₉ (hazard functions and reversed distributions), which appear across most journals in this cluster. In addition, t₂₅ (Bivariate Exponential Stress-Strength Models) shows relevance in selected outlets, while t₂₆ (generalizations of Weibull and special distributions) is more prominent in Mathematics. Altogether, this cluster reflects a focus on methodological and computational developments supporting Bayesian inference, parametric estimation, and hazard-based modeling in the Marshall–Olkin framework.
Clusters 1 and 2 (bottom). These clusters include journals such as Stochastic Environmental Research and Risk Assessment and the Journal of Systems Engineering and Electronics, and they reflect emerging or interdisciplinary outlets where Marshall–Olkin models intersect with environmental, engineering, or financial domains. Although their distributions are more scattered, they show localized concentrations in t₈ (Hazard and Failure Rate Modeling) and t₂₂ (Frailty and Shared Survival Models), with additional links to t₇ (Multivariate Dependence in Insurance Models) and t₁₈ (Systemic Risk in Banking Systems). These patterns suggest that Clusters 1 and 2 capture niche but diverse applications of the models, spanning risk assessment, reliability, and financial systems.

Taken together, the clustering structure suggests the presence of three broad research orientations within the Marshall–Olkin literature. First, a reliability engineering and survival analysis stream, represented by journals such as IEEE Transactions on Reliability and Reliability Engineering and System Safety, focuses on lifetime modeling, shock models, and stress-strength reliability frameworks. Second, a theoretical probability and statistical distribution stream, concentrated in outlets such as Journal of Multivariate Analysis and Statistics and Probability Letters, emphasizes the development of new distributional families, dependence structures, and mathematical properties of Marshall–Olkin models. Third, an applied risk and actuarial modeling stream, visible in journals related to insurance and financial statistics, explores applications in credit risk, systemic risk, and portfolio dependence modeling. This disciplinary differentiation highlights how the Marshall–Olkin framework simultaneously supports theoretical development, engineering reliability modeling, and financial risk applications.

In Figure 5b, topic proportions are displayed across publication years, revealing three temporal clusters:

Cluster 1 (bottom) captures a very specific period in the late 1990s (1996 and 1999), with concentrated contributions in t₂₅ (Bivariate Exponential Stress-Strength Models) and, to a lesser extent, t₂₇ (system reliability and shock models). These early applications highlight the initial role of Marshall–Olkin models in classical reliability frameworks, particularly in stress-strength analysis and system-level failure modeling. Unlike other clusters, the thematic scope here is narrow, suggesting that the adoption of Marshall–Olkin approaches during this phase was focused on extending reliability methods rather than exploring broader distributional generalizations.
Cluster 2 (upper band) groups primarily early contributions (1980s–1990s), covering years such as 1981, 1986, 1988, 1991, 1992, 1994, and 1997. The most consistent intensities are observed in t₂₅ (Bivariate Exponential Stress-Strength Models) and t₂₇ (system reliability and shock models), reflecting the consolidation of Marshall–Olkin models within the reliability and stress-strength literature during this period. This cluster shows a thematic pattern very similar to Cluster 1, but spread across a wider temporal window, suggesting that the early adoption of MO models was largely driven by their ability to generalize and operationalize classical reliability concepts.
Cluster 3 (middle band) spans the 2000s through the mid-2020s and marks a transition from the early reliability-oriented applications to a more diversified set of research themes. Unlike Clusters 1 and 2, where t₂₅ and t₂₇ dominated, their influence here is diluted, giving way to emerging directions. Notable intensities include t₃ (Stochastic Ordering and Likelihood Ratios) in 2006; t₅ (MO Copulas and Reliability), t₁₃ (Limit Laws and Geometric Distributions), t₁₅ (EM Algorithm for Bivariate Exponential Models), and t₂₄ (Tail Dependence in Multivariate Models) in 2008; and sustained activity in t₄ (Bayesian competing risks and censoring) across the 2020s. Topics such as t₂₀ (Weibull estimation) and t₂₃ (MO Copulas and functional extensions) also appear with moderate prevalence, pointing to methodological developments in parametric inference and dependence modeling. Overall, Cluster 3 reflects the broadening of Marshall–Olkin research into Bayesian methods, copula theory, and stochastic modeling, aligning with the diversification of applied probability and interdisciplinary uptake in finance, biostatistics, and engineering.

Taken together, the clustering analysis of sources and years underscores both the relevance and adaptability of Marshall–Olkin models. The concentration of certain topics in specialized journals, alongside their gradual diffusion into broader outlets, reflects a balance between consolidation and diversification within the field. Likewise, the temporal patterns highlight a clear shift from foundational reliability-oriented modeling toward more complex, interdisciplinary, and application-driven research trajectories.

4. Discussion

This study integrates Latent Dirichlet Allocation (LDA) with bibliometric analysis to map the thematic structure, temporal dynamics, and geographic footprint of research on the Marshall–Olkin (MO) family. These findings can be interpreted in relation to previous studies examining the development of Marshall–Olkin models and related reliability distributions. In particular, a bibliometric study [6] documented the growing diversification of Marshall–Olkin-based distributions and highlighted the increasing complexity of methodological extensions within the field. Our results extend this perspective by incorporating probabilistic topic modeling, which allows the latent thematic structure of the literature to be identified and its temporal dynamics to be explored. The topics detected through LDA (especially those associated with Bayesian inference, Weibull-based extensions, and copula-based dependence modeling) are consistent with the methodological directions emphasized in recent reviews of the Marshall–Olkin framework [4]. Moreover, the prominence of themes related to reliability modeling, stress-shock systems, and flexible lifetime distributions aligns with recent developments in reliability theory and generalized lifetime models reported in the literature [17,18,19]. Taken together, these connections suggest that the evolution of Marshall–Olkin research reflects a broader shift toward more flexible dependence structures and advanced inference methods within modern reliability and applied probability research. Using 266 documents from 119 sources over 1981–2025 (annual growth approx. 3.73%, mean citations approx. 12), the analysis reveals a field that has broadened from classical reliability toward diversified applications in finance, insurance, and biomedicine, while continuing to generate methodological generalizations.

Temporal dynamics and volume. Publication activity was relatively limited during the early years of the literature. Notably, no records were retrieved for the 1960s and 1970s using the specified search strategy in Scopus. The earliest documents in our dataset appear in the early 1980s, and only 27 records were identified for the period 1981–1999, indicating a low level of publication activity during the initial stage of the field. This pattern can be attributed to two main factors. First, the Marshall–Olkin framework gained broader recognition only after the development of subsequent extensions and applications in reliability and dependence modeling during the 1980s and 1990s. Second, bibliographic databases such as Scopus have more limited indexing coverage for earlier decades, which may result in fewer retrievable records for the early period of the literature [22,23]. This limitation has been widely documented in bibliometric studies, which highlight the expansion and evolving coverage of major scientific databases over time (e.g., [29,30]). A complementary check using the Web of Science Core Collection reveals a similar pattern of low publication activity during these early decades, supporting the robustness of this temporal trend across databases.

Publication activity remained modest before 2005, followed by sustained growth and multiple surges after 2010, with a marked rise after 2020. This expansion coincides with growing interest in dependence modeling and copulas, together with the diffusion of Bayesian and EM-based estimation pipelines. It should also be noted that part of the observed increase in publication counts may reflect the continued expansion of bibliographic databases such as Scopus and Web of Science, which have broadened their journal coverage and indexing policies over time [29,30]. Consequently, some of the apparent growth in the literature may be partially attributable to improved database indexing rather than solely to an increase in research activity.

Topic-over-time patterns highlight recent momentum for t₄ (Bayesian competing risks and censoring), t₂₀ (Estimation of Unknown Parameters in (bi)variate Weibull models), and t₂₃ (Marshall–Olkin Copulas and functional extensions), signaling a shift toward inference under censoring, heavy-tailed lifetimes, and flexible dependence structures. Clustering by publication year further reinforces this transition: the early decades (1980s–1990s) were dominated by stress-strength and reliability themes (t₂₅, t₂₇), while the 2000s onward introduced more diverse themes such as stochastic ordering (t₃), Bayesian censoring (t₄), MO copulas (t₅, t₂₃), and tail dependence (t₂₄), reflecting a broadening scope and interdisciplinary uptake.

Although the corpus contains 266 publications over the 1981–2025 period (corresponding to an average of approximately six publications per year), the trend shown in Figure 4 should be interpreted with caution. The annual average alone does not demonstrate a distinctive growth pattern attributable to a specific subgroup of researchers or disciplinary community. Rather, it reflects the gradual accumulation of contributions within a specialized area of reliability, dependence modeling, and applied probability. Accordingly, the observed pattern is better understood as evidence of the steady consolidation of the Marshall–Olkin literature rather than proof of a uniquely expanding contributor base.

Thematic cores and applications. LDA uncovered a dual backbone of (i) methodological work and (ii) applied problem-solving:

Reliability and survival (engineering/biomedicine). Topics t₂₂ (Frailty and Shared Survival Models), t₁₇ (Lifetime Models and Likelihood Inference), t₁₅ (EM Algorithm for Bivariate Exponential Models), t₂₇ (system reliability and shock models), t₂₅ (Bivariate Exponential Stress-Strength Models), and t₈ (Hazard and Failure Rate Modeling) reflect the classical MO use case: modeling dependent lifetimes, competing risks, and stress-strength reliability. Illustrative applications include degradation and shock models for series/parallel systems (t₂₇), censored device lifetimes and clinical survival with shared frailty (t₂₂), and EM-based estimation for bivariate exponential MO variants under incomplete data (t₁₅). The prominence of IEEE Transactions on Reliability and Reliability Engineering & System Safety among sources aligns with this thematic pillar.
Finance and insurance (risk/capital/contagion). Topics t₁₂ (Credit Risk and Financial Defaults), t₁₈ (Systemic Risk in Banking Systems), t₂₄ (Tail Dependence in Multivariate Models), and t₅ (MO Copulas and Reliability) document the adoption of MO-type copulas for credit portfolio losses, CDO tranching, and systemic risk measures. The focus on tail co-movements (t₂₄) and asymmetric dependence is consistent with post-crisis risk management. Journals such as Insurance: Mathematics and Economics, Computational Statistics & Data Analysis, and Journal of Multivariate Analysis cluster strongly around these topics.
Generalizations and multivariate structure. Topics t₁₆ (Gamma and Multivariate MO Extensions), t₁₁ (Classes of Bivariate Distributions), t₂₆ (Generalizations of Weibull and Special Distributions), t₃ (Stochastic Ordering and Likelihood Ratios), and t₁ (Residual Life and Entropy Measures) trace theoretical extensions: new distributional families, identifiability/ordering results, and residual-life/entropy characterizations, often coupled to EM, Bayesian, and Monte Carlo (t₆) toolkits. The rise in t₂₃ (MO Copulas and functional extensions) signals an active frontier linking MO shocks with transform-based copula constructions, hierarchical/dependent Lévy structures (t₁₄), and high-dimensional dependence modeling.

Emerging gaps and underexplored directions. Beyond identifying the dominant research streams, the LDA results also reveal several underexplored thematic areas within the Marshall–Olkin literature. These gaps emerge from topics with relatively low prevalence and limited publication counts in Table 5. First, systemic risk modeling in banking systems (topic t₁₈) appears as one of the least represented themes in the corpus, despite the growing relevance of dependence modeling in financial stability analysis. This suggests that Marshall–Olkin frameworks remain underutilized for modeling interconnected financial networks and contagion dynamics. Second, topics related to Bayesian statistical frameworks and advanced inference procedures (t₂₁) also exhibit relatively low prevalence, indicating opportunities to expand Bayesian estimation approaches beyond traditional competing-risk and reliability settings. Third, the analysis reveals limited research connecting Monte Carlo and Markov chain simulation techniques (t₆) with Marshall–Olkin dependence models, suggesting potential for methodological developments in computational inference for complex stochastic systems. Together, these low-density areas highlight opportunities for future research integrating Marshall–Olkin models with modern computational statistics, financial risk analytics, and high-dimensional stochastic modeling.

Outlets, authors, and geography. The top venues combine methodological statistics with application-oriented journals (e.g., Communications in Statistics—Theory and Methods, Journal of Multivariate Analysis, Computational Statistics and Data Analysis, IEEE Transactions on Reliability). The source-topic heatmap reveals five clusters, which range from concentrated MO-copula/finance niches and reliability/survival hubs to mixed-method clusters in general statistics journals. Leading contributors (e.g., Kundu, Hanagal, Scherer, Mai, Durante, Hofert) span reliability, survival, and copula theory, reinforcing the methodological-applied bridge. Country outputs are led by China, India, Egypt, and the USA, with high average citations for Canada and Austria and approximately 30% international co-authorship, which is evidence of a global collaborative ecosystem.

Methodological and practical implications. Three main points stand out: (i) MO models remain particularly well-suited where simultaneous shocks, shared risks, or asymmetric/tail dependence are substantively justified (engineering lifetimes, credit contagion, reinsurance portfolios, hydrological extremes); (ii) Bayesian and EM pipelines facilitate estimation under censoring and incomplete data (t₄, t₁₅), routine in survival and reliability; (iii) copula-based MO generalizations (t₂₃, t₂₄) provide interpretable dependence parameters for stress testing, capital allocation, and system-level risk metrics. From a practitioner perspective, the thematic patterns identified in this study provide guidance for selecting appropriate Marshall–Olkin variants in applied risk modeling. In banking and insurance contexts, MO copula-based models (topics related to tail dependence and multivariate risk) are particularly suitable for modeling joint default events, portfolio losses, and systemic risk propagation, where asymmetric dependence and extreme co-movements are relevant. In reliability-oriented insurance applications, such as operational risk or infrastructure insurance, shock-based Marshall–Olkin models and stress-strength formulations provide natural mechanisms for representing simultaneous failure events. Finally, Bayesian and EM-based estimation frameworks identified in recent literature offer practical tools for parameter estimation under censoring and incomplete data, which are common in insurance claims and financial loss datasets. These insights help practitioners align the choice of MO variants with the structural characteristics of the risk processes being modeled.

Limitations and future work. The results depend on Scopus coverage and author-provided keywords; topic labeling involves informed judgment; and alternative preprocessing or parameter choices could slightly shift topic boundaries. Although Scopus is one of the largest curated bibliographic databases, previous studies have documented potential limitations related to metadata inconsistencies, indexing errors, and the presence of “homeless” publications not properly linked to their source records (e.g., [31,32]). Consequently, bibliometric analyses based exclusively on this database may not capture all relevant publications or may contain minor metadata inaccuracies. Future extensions include (i) multi-database triangulation (WoS/MathSciNet), (ii) dynamic topic modeling to capture regime shifts, and (iii) embedding-based models (BERTopic/Top2Vec) to improve semantic cohesion, particularly across the finance-reliability interface and emerging biomedical device applications.

In addition, the thematic map highlights several low-density areas that suggest promising directions for future research. First, the integration of Marshall–Olkin models with machine learning and AI-based risk modeling frameworks remains largely unexplored, despite the growing use of probabilistic dependence models in data-driven risk analytics. Second, further development of Marshall–Olkin models in financial systemic risk and portfolio dependence modeling represents an important opportunity, particularly for capturing extreme co-movements and contagion effects in complex financial networks. Third, the literature still lacks scalable high-dimensional extensions of Marshall–Olkin models, which would enable their application in modern settings involving large interconnected systems such as financial markets, infrastructure networks, or high-dimensional survival data.

Another promising direction for future research involves the use of alternative topic modeling frameworks that incorporate prior knowledge or covariate information. Methods such as KeyATM [20] or Seeded LDA [27] allow the integration of domain-informed keywords to guide topic discovery and improve interpretability. In addition, covariate-based topic models could incorporate variables such as publication year, country of origin, or field of application to better explain the temporal dynamics and contextual drivers of Marshall–Olkin research topics. Such approaches would enable a more refined analysis of how thematic structures evolve across time and disciplinary contexts.
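As a rough illustration of how domain-informed keywords might guide topic discovery, the sketch below builds an asymmetric topic-word prior in which seed terms receive extra pseudo-counts. This is a simple prior-boosting approximation, not the Seeded LDA model of [27] or KeyATM [20]. The function name, the seed lists (stemmed terms borrowed from Table 5), and the `base` and `boost` values are all assumptions made for the example.

```python
def build_seeded_prior(vocab, seed_lists, num_topics, base=0.01, boost=1.0):
    """Build a num_topics x len(vocab) matrix of Dirichlet pseudo-counts.

    Row k starts from a flat prior `base`; every seed term listed for
    topic k gets an extra `boost`. Topics without a seed list stay flat.
    """
    if len(seed_lists) > num_topics:
        raise ValueError("more seed lists than topics")
    column = {term: j for j, term in enumerate(vocab)}
    prior = [[base] * len(vocab) for _ in range(num_topics)]
    for k, seeds in enumerate(seed_lists):
        for term in seeds:
            j = column.get(term)
            if j is not None:  # seeds missing from the vocabulary are skipped
                prior[k][j] += boost
    return prior


# Hypothetical toy vocabulary and seed lists (stemmed, as in Table 5).
vocab = ["copula", "tail_depend", "censor", "compet_risk", "weibul", "shock"]
seeds = [
    ["copula", "tail_depend"],   # nudges topic 0 toward dependence modeling
    ["censor", "compet_risk"],   # nudges topic 1 toward competing risks
]
prior = build_seeded_prior(vocab, seeds, num_topics=3)

for row in prior:
    print(row)
```

A matrix of this shape could then be supplied as the topic-word prior of an LDA implementation that accepts one, for instance the `eta` argument of gensim's `LdaModel`. The third topic in the example keeps the flat base prior, so it remains free to absorb unseeded themes.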

5. Conclusions

Drawing on 266 MO-related publications (1981–2025) from 119 sources, this study offers a comprehensive LDA-based map of the field. Classical reliability and survival analysis remain central, including stress-strength models, shock models, frailty, and hazard modeling, while finance and insurance have become a second pillar through MO copulas and multivariate generalizations applied to credit defaults, systemic risk, and tail dependence. Recent momentum is particularly evident in inference under censoring and MO-Weibull families. The strongest recent growth is observed for t₄ (Bayesian competing risks/censoring), t₂₀ (estimation for MO/Weibull variants), and t₂₃ (MO copulas/extensions), reflecting both methodological innovation and data realities in engineering, biomedicine, and risk management.

The results also show that the field has developed within a global and collaborative ecosystem. Output is geographically diverse, notably in China, India, Egypt, and the USA, with approximately 30% international co-authorship and high-impact contributions from Europe and North America, consistent with a mature and collaborative research network. The field’s dissemination channels span general statistics, multivariate and copula modeling, and reliability and engineering outlets, with leading authors bridging theory and applications (an architecture that sustains both methodological depth and real-world uptake).

From a practical perspective, MO models are particularly suitable when domain knowledge supports common-shock mechanisms or asymmetric and tail dependence, and when censoring is prevalent. For methodologists, promising directions include multivariate and high-dimensional MO copulas, robust and Bayesian estimation under complex censoring, and comparative evaluations against alternative dependence models in finance and biomedicine.

Overall, the MO framework has evolved from a specialized reliability tool into a versatile modeling family for dependent risks. The combination of interpretable mechanisms (shocks), flexible dependence (copulas), and practical inference (EM/Bayesian) explains its persistence and recent expansion. Extending this analysis with Dynamic Topic Models and cross-database coverage will sharpen our understanding of how MO research continues to adapt to modern challenges in engineering reliability, health analytics, and quantitative risk management.

Author Contributions

Conceptualization, H.L. and B.L.; Investigation, B.L. and C.L.; Methodology, D.N. and C.L.; Project administration, H.L., C.L. and B.L.; Supervision, H.L.; Visualization, D.N. and C.L.; Writing—original draft, B.L.; Writing—review and editing, H.L. and B.L. All authors have read and agreed to the published version of the manuscript.

Funding

Although the project did not benefit from external funding, Universidad del Norte provided institutional support to cover the publication expenses.

Data Availability Statement

This study relied on a dataset extracted from Scopus, which has been made publicly accessible at: https://drive.google.com/file/d/1Ts0YDFCV9XonFArK3Vao1ELMgKFVS1Oj/view?usp=sharing (accessed on 14 February 2026). The dataset may be used by other scholars wishing to reproduce the findings or conduct complementary investigations.

Acknowledgments

We express our appreciation for the institutional collaboration extended by Universidad del Norte and Old Dominion University.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Marshall, A.W.; Olkin, I. A multivariate exponential distribution. J. Am. Stat. Assoc. 1967, 62, 30–44.
2. Nelsen, R.B. An Introduction to Copulas; Springer: New York, NY, USA, 2006.
3. Sarhan, A.M.; Gomaa, R.S.; Magar, A.M.; Alsadat, N. Bivariate exponentiated generalized inverted exponential distribution with applications on dependent competing risks data. AIMS Math. 2024, 9, 29439–29473.
4. Bayramoglu, I.; Ozkut, M. Recent Developments About Marshall–Olkin Bivariate Distribution. J. Stat. Theory Pract. 2022, 16, 58.
5. Brango, H.; Guerrero, A.; Llinás, H. Marshall–Olkin Bivariate Weibull Model with Modified Singularity (MOBW-μ): A Study of Its Properties and Correlation Structure. Mathematics 2024, 12, 2183.
6. González-Hernández, I.J.; Granillo-Macías, R.; Rondero-Guerrero, C.; Simón-Marmolejo, I. Marshall–Olkin distributions: A bibliometric study. Scientometrics 2021, 126, 9005–9029.
7. Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent Dirichlet Allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
8. Rejeb, A.; Rejeb, K.; Zrelli, I. Exploring the state-of-the-art of halal food research using latent Dirichlet allocation. Discov. Food 2025, 5, 24.
9. Tekin, Y. Initialization in Gibbs Sampling Implementation of LDA. In 2024 32nd Signal Processing and Communications Applications Conference (SIU); IEEE: Mersin, Turkey, 2024; pp. 1–4.
10. Brown, C.K.; Cameron, B.G. Assessing changes in reliability methods over time: An unsupervised text mining approach. Qual. Reliab. Eng. Int. 2024, 40, 3597–3619.
11. Zhang, H.; Chen, B.; Guo, D.; Zhou, M. WHAI: Weibull hybrid autoencoding inference for deep topic modeling. arXiv 2018, arXiv:1803.01328.
12. Amoualian, H. Modeling and Learning Dependencies with Copulas in Latent Topic Models; Université Grenoble Alpes: Saint-Martin-d’Hères, France, 2017.
13. Amoualian, H.; Clausel, M.; Gaussier, E.; Amini, M.R. Streaming-LDA: A copula-based approach to modeling topic dependencies in document streams. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 695–704.
14. Lin, L.; Jiang, H.; Rao, Y. Copula guided neural topic modelling for short texts. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Online, 25–30 July 2020; ACM: New York, NY, USA, 2020; pp. 1773–1776.
15. Blei, D.M.; Lafferty, J.D. Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; ACM: New York, NY, USA, 2006; pp. 113–120.
16. Roberts, M.E.; Stewart, B.M.; Tingley, D.; Lucas, C.; Leder-Luis, J.; Gadarian, S.K.; Albertson, B.; Rand, D.G. Structural topic models for open-ended survey responses. Am. J. Political Sci. 2014, 58, 1064–1082.
17. Oliveira, R.P.; Achcar, J.A.; Mazucheli, J.; Bertoli, W. A new class of bivariate Lindley distributions based on stress and shock models and some of their reliability properties. Reliab. Eng. Syst. Saf. 2021, 211, 107533.
18. Algarni, A. On a new generalized Lindley distribution: Properties, estimation and applications. PLoS ONE 2021, 16, e0246468.
19. Eghwerido, J.T.; Oguntunde, P.E.; Agu, F.I. The Alpha Power Marshall-Olkin-G Distribution: Properties and Applications. Sankhya A Indian J. Stat. 2023, 85, 172–197.
20. Eshima, S.; Imai, K.; Sasaki, T. Keyword-assisted topic models. Am. J. Political Sci. 2024, 68, 730–750.
21. Donthu, N.; Kumar, S.; Mukherjee, D.; Pandey, N.; Lim, W.M. How to conduct a bibliometric analysis: An overview and guidelines. J. Bus. Res. 2021, 133, 285–296.
22. Mongeon, P.; Paul-Hus, A. The journal coverage of Web of Science and Scopus: A comparative analysis. Scientometrics 2016, 106, 213–228.
23. Falagas, M.E.; Pitsouni, E.I.; Malietzis, G.A.; Pappas, G.; Kouranos, V.D.; Arencibia-Jorge, R.; Karageorgopoulos, D.E.; Reagan-Shaw, S.; Nihal, M.; Ahmad, N.; et al. Comparison of PubMed, Scopus, web of science, and Google scholar: Strengths and weaknesses. FASEB J. 2008, 22, 338–342.
24. Aria, M.; Cuccurullo, C. bibliometrix: An R-tool for comprehensive science mapping analysis. J. Informetr. 2017, 11, 959–975.
25. Röder, M.; Both, A.; Hinneburg, A. Exploring the Space of Topic Coherence Measures. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, Shanghai, China, 2–6 February 2015; ACM: New York, NY, USA, 2015; pp. 399–408.
26. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: New York, NY, USA, 2009.
27. Jagarlamudi, J.; Daumé, H., III; Udupa, R. Incorporating lexical priors into topic models. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, Avignon, France, 23–27 April 2012; ACL: Avignon, France, 2012; pp. 204–213.
28. Sievert, C.; Shirley, K. LDAvis: A method for visualizing and interpreting topics. In Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, Baltimore, MD, USA, 27 June 2014; ACL: Avignon, France, 2014; pp. 63–70.
29. Haupka, N.; Culbert, J.H.; Schniedermann, A.; Jahn, N.; Mayr, P. Analysis of the publication and document types in OpenAlex, Web of Science, Scopus, PubMed and Semantic Scholar. Quant. Sci. Stud. 2026, 7, 179–194.
30. Kim, E. In-depth examination of coverage duration: Analyzing years covered and skipped in journal indexing. Publications 2024, 12, 10.
31. Franceschini, F.; Maisano, D.; Mastrogiacomo, L. The museum of errors/horrors in Scopus. J. Informetr. 2016, 10, 174–182.
32. Liu, W.; Wang, H. Red alert: Millions of “homeless” publications in Scopus should be resettled. J. Assoc. Inf. Sci. Technol. 2025, 76, 1283–1291.

Figure 1. PRISMA flowchart: procedures and results by stage.

Figure 2. Annual distribution of publications and average citations in Marshall–Olkin model research (1981–2025).

Figure 3. Geographic spread of Marshall–Olkin model publications (log scale, 1981–2025).

Figure 4. Temporal evolution of topic prevalence in the Marshall–Olkin research corpus (1981–2025). Each panel corresponds to one of the 27 topics inferred from the LDA model. Trends are highlighted using regression-based smoothing (red = increasing, black = stable). Axis labels were optimized to improve readability across panels.
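The caption does not spell out how a panel is labeled increasing or stable. One plausible reading, sketched below under that assumption, is to fit an ordinary least-squares line to a topic's yearly prevalence and compare the slope with a small threshold. The function names, the `min_slope` value, and the yearly series are hypothetical and are not taken from the study.

```python
def ols_slope(years, values):
    """Slope of the least-squares line of values regressed on years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


def classify_trend(years, values, min_slope=0.01):
    """Label a series 'increasing' if its slope exceeds min_slope, else 'stable'."""
    return "increasing" if ols_slope(years, values) > min_slope else "stable"


# Hypothetical yearly prevalence values (%), for illustration only.
years = [2019, 2020, 2021, 2022, 2023]
rising_topic = [2.0, 2.5, 3.0, 3.5, 4.0]
flat_topic = [3.0, 3.0, 3.0, 3.0, 3.0]

print(classify_trend(years, rising_topic))
print(classify_trend(years, flat_topic))
```

A significance test on the slope, or a smoother such as LOESS, would be a natural refinement; the fixed threshold is used here only to keep the example short.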

Figure 5. Heatmap depicting topic proportions across sources and years. Columns (t1–t27) represent the extracted LDA topics, while rows correspond to journals (a) and publication years (b). Color intensity indicates the relative proportion of each topic within each source or year. The numbers (1–5) indicate clusters obtained through hierarchical clustering and are used solely as identifiers of groups with similar thematic profiles.

Table 1. Overview of the bibliographic dataset on Marshall–Olkin models (1981–2025).

Description	Result
Main information about data:
Timespan	1981:2025
Sources (journals, books, etc.)	119
Documents	266
Annual growth rate %	3.73
Document average age	10.4
Average citations per doc	11.99
Document contents:
Keywords Plus	655
Author’s keywords	845
Authors:
Authors	404
Authors of single-authored docs	29
Authors collaboration:
Single-authored docs	41
Co-authors per doc	2.42
International co-authorships %	30.08
Document types:
Article	250
Conference paper	11
Other (book, editorial, note, review, etc.)	5

Table 2. Top 30 scientific journals publishing research on Marshall–Olkin models based on the 266-document dataset (1981–2025).

Rank	Source	Publisher	Pub.	Cit.	h-Index	Year
1	Communications in Statistics—Theory and Methods	Taylor & Francis	19	149	9	1992
2	Journal of Multivariate Analysis	Elsevier	15	399	10	1989
3	Computational Statistics and Data Analysis	Elsevier	8	360	7	2009
4	Communications in Statistics—Simulation and Computation	Taylor & Francis	8	36	4	2017
5	Methodology and Computing in Applied Prob.	Springer	8	93	4	2008
6	IEEE Transactions on Reliability	IEEE	7	182	6	1981
7	Journal of Statistical Computation and Simulation	Taylor & Francis	7	64	4	2011
8	Journal of Applied Statistics	Taylor & Francis	6	43	3	2006
9	Mathematics	MDPI	6	76	3	2020
10	Journal of Statistical Planning and Inference	Elsevier	5	213	5	1989
11	Journal of Computational and Applied Mathematics	Elsevier	5	58	4	2014
12	Metrika	Springer	5	50	4	1998
13	Statistics and Probability Letters	Elsevier	5	39	4	1991
14	Fuzzy Sets and Systems	Elsevier	5	31	3	2016
15	Journal of Statistical Theory and Practice	Taylor & Francis	5	28	3	2015
16	Symmetry	MDPI	5	62	3	2020
17	Journal of Statistics Applications and Probability	Natural Sciences Publishing	5	32	2	2018
18	Insurance: Mathematics and Economics	Elsevier	4	43	4	2007
19	Springer Proceedings in Mathematics and Stat.	Springer	4	37	3	2015
20	Statistical Papers	Springer	4	52	3	1997
21	Brazilian Journal of Probability and Statistics	Taylor & Francis/ASA	4	27	2	2017
22	Stoch. Environmental Research—Risk Assessment	Springer	4	12	2	2010
23	Pakistan Journal of Statistics—Operation Research	Univ. Punjab	3	38	3	2021
24	Prob. in the Engineering—Informational Sciences	Cambridge Univ. Press	3	18	3	2013
25	Reliability Engineering and System Safety	Elsevier	3	61	3	2020
26	Statistics	Taylor & Francis	3	7	2	1996
27	Stochastics and Quality Control	De Gruyter	3	14	2	2008
28	Journal of Mathematical Analysis and Applications	Elsevier	2	9	2	1991
29	Journal of Systems Engineering and Electronics	IEEE/Chinese Society of Electronics	2	8	2	2019
30	Model Assisted Statistics and Applications	IOS Press	2	8	2	2013

Cit., total citations; Pub., number of publications; Year, year of the source's first publication in the dataset; Publisher, publishing house responsible for the journal.

Table 3. Top 30 most prolific authors in Marshall–Olkin distribution research (1981–2025).

Rank	Author	Country	Pub.	Cit.	h-Index	Year
1	Kundu, Debasis	India	18	361	8	2009
2	Hanagal, David D.	India	12	107	6	1991
3	Scherer, Matthias	Germany	9	137	6	2009
4	Shi, Yimin	China	9	66	5	2017
5	Eliwa, Mohamed S.	Egypt	8	125	6	2016
6	Kolev, Nikolai	Brazil	8	55	4	2015
7	Mai, Jan-Frederik	Germany	7	127	6	2009
8	El-Morshedy, Mahmoud	Egypt	7	104	5	2020
9	Durante, Fabrizio	Italy	6	158	6	2010
10	Mulinacci, Sabrina	Italy	6	49	4	2011
11	Yousof, Haitham M.	Egypt	5	132	4	2020
12	Cherubini, Umberto	Italy	5	45	3	2011
13	Gui, Wenhao	China	5	40	3	2021
14	Omladič, Matjaž	Slovenia	5	33	3	2016
15	Pinto, Jayme	Brazil	5	30	3	2015
16	Wang, Liang	China	5	19	3	2021
17	Balakrishnan, Narayanaswamy	Canada	5	97	2	2007
18	Bai, Xuchao	China	4	47	4	2019
19	Hofert, Marius	Canada	4	53	4	2010
20	Li, Haijun	United States	4	200	4	2008
21	Lu, Jye-Chyi	United States	4	95	4	1989
22	Xu, Ancha	China	4	92	4	2013
23	Li, Xiaohu	China	4	94	3	2011
24	Rubino, Gerardo	France	4	22	3	2016
25	Sarhan, Ammar M.	Egypt	4	131	3	2007
26	Abuelamayem, Ola A.	Egypt	4	11	2	2020
27	Aly, Hanan M.	Egypt	4	11	2	2020
28	Dey, Arabin Kumar	India	4	91	2	2009
29	Zhang, Cheng	China	4	12	2	2016
30	Dey, Sanku	India	3	70	3	2017

Cit., total citations; Pub., number of publications; Year, year of the author's first publication in the dataset; Country, primary affiliation country of the author based on the most frequent affiliation country recorded in the Scopus dataset.

Table 4. Top 30 countries in Marshall–Olkin research: publication output and citation indicators.

Rank	Country	Pub.	Cit.	Cit./Pub.
1	China	94	327	3.5
2	India	88	452	5.1
3	Egypt	68	296	4.4
4	USA	67	266	4.0
5	Canada	34	219	6.4
6	Germany	33	156	4.7
7	Italy	32	113	3.5
8	Iran	27	80	3
9	France	22	77	3.5
10	Brazil	21	46	2.2
11	Saudi Arabia	15	49	3.3
12	Turkey	15	42	2.8
13	South Korea	14	4	0.3
14	Slovenia	13	33	2.5
15	Chile	9	16	1.8
16	Spain	9	30	3.3
17	Japan	8	5	0.6
18	Pakistan	7	41	5.9
19	New Zealand	6	5	0.8
20	United Kingdom	6	27	4.5
21	Austria	4	100	25
22	Romania	4	0	0
23	Colombia	3	0	0
24	Iraq	3	0	0
25	Switzerland	3	6	2
26	Uzbekistan	3	0	0
27	Kuwait	2	8	4
28	Mexico	2	2	1
29	Poland	2	1	0.5
30	United Arab Emirates	2	0	0

Cit., total citations; Pub., number of publications.
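The Cit./Pub. column is the ratio of total citations to publications. The short check below recomputes it for four rows copied from Table 4 and shows why a country with few documents, such as Austria, can top the average-citation ranking. The helper name is an assumption for illustration.

```python
# (publications, citations) copied from Table 4.
ROWS = {
    "China": (94, 327),
    "India": (88, 452),
    "Canada": (34, 219),
    "Austria": (4, 100),
}


def citations_per_publication(publications, citations):
    """Average citations per document, rounded to one decimal place."""
    if publications == 0:
        return 0.0
    return round(citations / publications, 1)


ratios = {
    country: citations_per_publication(pubs, cits)
    for country, (pubs, cits) in ROWS.items()
}

# Rank countries by average citations, highest first.
ranking = sorted(ratios, key=ratios.get, reverse=True)
print(ratios)
print(ranking)
```

Austria's ratio of 25.0 rests on only four publications, so a single highly cited paper can dominate it, whereas the ratios for China and India average over many more documents.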

Table 5. List of 27 LDA topics extracted from Marshall–Olkin literature (1981–2025), ordered by topic prevalence.

Topic	Label	Prev.	Pub.	Top Terms
t₄	Bayesian Competing Risks and Censoring	6.098	30	compet, risk, compet_risk, censor, depend, bayesian, interv, depend_compet, base, estim, infer, posterior, failur, maximumlikelihood, illustr
t₂₀	Estimation of Unknown Parameters in Weibull Models	5.384	19	paramet, estim, unknown, weibul_distribut, weibul, unknown_paramet, maximumlikelihood, distribut, consid, bay, bivari_weibul, marshallolkin_bivari, bay_estim, prior, obtain
t₂₇	System Reliability and Shock Models	4.533	19	compon, system, shock, reliabl, independ, parallel, seri, parallel_system, system_compon, obtain, lifetim, bivari, magnitud, distribut, numer
t₂₅	Bivariate Exponential Stress-Strength Models	4.478	21	exponenti, bivari_exponenti, test, exponenti_distribut, bivari, marshallolkin, model, distribut, stress, freund, compon, strength, marshallolkin_bivari, estim, block
t₂₃	Marshall–Olkin Copulas and Functional Extensions	4.326	19	copula, class, function, transform, variabl, introduc, singular, depend, stochast, marshallolkin_copula, famili, induc, shock, gener, random_variabl
t₉	Marginal and Joint Distributions	4.276	12	distribut, bivari, joint, margin, moment, exponenti, introduc, gompertz, real, densiti, maximumlikelihood, hazard, hazard_rate, import, call
t₁₅	EM Algorithm for Bivariate Exponential Models	4.125	7	distribut, bivari, algorithm, em, em_algorithm, exponenti_distribut, marshallolkin, exponenti, estim, bivari_distribut, observ, bivari_exponenti, distribut_marshallolkin, paramet, maximumlikelihood
t₁₀	Marshall–Olkin Applications and Variants	3.809	11	marshal, olkin, marshal_olkin, olkin_bivari, correl, bivari, distribut, distribut_marshal, applic, olkin_copula, mobw, olkin_distribut, pareto, paramet, variat
t₁₁	Classes of Bivariate Distributions	3.671	5	bivari, distribut, model, bivari_distribut, continu, absolut, absolut_continu, class, develop, common, observ, set, exist, fit, class_bivari
t₂₄	Tail Dependence in Multivariate Models	3.668	9	depend, multivari, tail, tail_depend, marshallolkin, extrem, copula, distribut, exampl, examin, lower, upper, orthant, model, multivari_distribut
t₈	Hazard and Failure Rate Modeling	3.662	8	distribut, rate, increas, lifetim, failur_rate, shape, hazard_rate, constant, decreas, rate_distribut, bivari, failur, hazard, monoton, paramet
t₁₄	Discrete Lévy Copulas and Hierarchical Structures	3.618	15	lévy, archimedean, discret, structur, variabl, hierarch, copula, depend, introduc, distribut, depend_structur, correspond, memori, sampl, random_variabl
t₁₂	Credit Risk and Financial Defaults	3.516	13	default, risk, credit, price, structur, cdo, depend, analyt, standard, correl, deriv, model, time, depend_structur, contract
t₁₉	Hazard Functions and Reversed Distributions	3.497	7	distribut, hazard, effect, bivari, analyz, invers, observ, flexibl, singular, proport, singular_compon, base, revers_hazard, invers_weibul, revers
t₁₆	Gamma and Multivariate Marshall–Olkin Extensions	3.454	10	distribut, character, marshallolkin, extend, marshallolkin_distribut, multivari, gamma, bivari, applic, gamma_distribut, size, margin_distribut, laplac, possess, chen
t₃	Stochastic Ordering and Likelihood Ratios	3.428	6	order, stochast, condit, distribut, stochast_order, likelihood, posit, ratio, likelihood_ratio, weak, multivari, suffici, usual, bivari, random
t₂₂	Frailty and Shared Survival Model	3.367	6	surviv, distribut, bivari, frailti, group, share, joint, time, parametr, specif, covari, exponenti, fit, model, real
t₁₇	Lifetime Models and Likelihood Inference	3.268	4	distribut, bivari, lifetim, likelihood, lifetim_distribut, analyz, ratio, marshallolkin, bivari_lifetim, statist, distribut_bivari, likelihood_ratio, test, confid, engin
t₂₆	Generalizations of Weibull and Special Distributions	3.253	5	weibul, paramet, gener, bivari, distribut, univari, extens, special, unit, kumaraswami, altern, real, real_life, assess, illustr
t₅	Marshall–Olkin Copulas and Reliability	3.236	4	failur, mo, reliabl, simultan, link, calibr, methodologi, mo_copula, optim, copula, exponenti, correl, suggest, captur, small
t₁₃	Limit Laws and Geometric Distributions	3.118	9	distribut, law, limit, geometr, bivari, random, converg, marshallolkin, approxim, variabl, poisson, larg, bivari_geometr, compound, geometr_distribut
t₁	Residual Life and Entropy Measures	3.112	4	distribut, residu, bivari, extrem, ag, variabl, measur, random, life, residu_life, entropi, student, marshallolkin, import, random_variabl
t₆	Monte Carlo and Markov Chain Methods	3.077	4	markovchain, montecarlo_markovchain, montecarlo, compar, estim, effici, network, base, markov, distribut, markov_chain, chain, chain_montecarlo, independ, propos
t₂	Ranked Sampling and Control Charts	3.053	6	sampl, rank, distribut, modifi, rank_sampl, simpl, base, design, random, skew, chart, control, bivari, varianc, state
t₇	Multivariate Dependence in Insurance Models	3.024	4	model, random, base, combin, distribut, depend, field, bivari, margin, dimension, emploi, caus, copula, laplac, insur
t₂₁	Bayesian Procedures and Statistical Frameworks	2.979	5	procedur, situat, bivari, prior, parametr, distribut, approach, bayesian, statist, consist, model, identifi, likelihood, methodologi, framework
t₁₈	Systemic Risk in Banking Systems	2.969	4	risk, measur, system, propos, imag, bank, system_risk, qualiti, distribut, state, distort, depend, origin, countri, bank_system

Pub., number of publications; Prev., topic prevalence (%; values sum to 100 across the 27 topics). Top terms appear in stemmed form (e.g., “failur” for “failure”), a preprocessing step that groups word variants under a common root to enhance topic coherence.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

