Modeling Rank Distribution and the Relative Importance Factor Index in Discrete Power-Law Models: Application to Social Resilience Using the Scopus Database

Llinas, Brian; Padilla, Jose; Llinas, Humberto; Frydenlund, Erika; Palacio, Katherine

doi:10.3390/math14060966


by Brian Llinas ¹,²,*, Jose Padilla ², Humberto Llinas ³, Erika Frydenlund ² and Katherine Palacio ⁴

¹ Computer Science Department, Old Dominion University, Norfolk, VA 23529, USA
² Virginia Modeling, Analysis, and Simulation Center, Old Dominion University, Suffolk, VA 23435, USA
³ Department of Mathematics and Statistics, Universidad del Norte, Barranquilla 080001, Colombia
⁴ Department of Industrial Engineering, Universidad del Norte, Barranquilla 080001, Colombia
* Author to whom correspondence should be addressed.

Mathematics 2026, 14(6), 966; https://doi.org/10.3390/math14060966

Submission received: 7 January 2026 / Revised: 21 February 2026 / Accepted: 10 March 2026 / Published: 12 March 2026


Abstract

Prior research on power-law distributions has primarily focused on modeling frequency patterns, with less attention given to rank distributions and how ranked positions reflect relative importance among elements. In discrete power-law distributions, frequency-based metrics often provide limited discrimination in the tail, where elements may exhibit similar counts but differ in relative dominance. These patterns are especially evident, for instance, in academic publishing, where keywords, affiliations, and citations commonly exhibit power-law behavior. To address this limitation, we introduce the Relative Importance Factor (RIF) Index, a statistical measure derived from the estimated discrete power-law rank distribution rather than an additional independent parameter. The RIF Index compares the probability of an element at a given rank with its probabilities at lower ranks, enabling explicit pairwise statistical comparison, particularly within the tail. We formalize the mathematical framework for discrete rank modeling and apply RIF to synthetic data and a Scopus dataset on social resilience. Our results show that RIF clarifies dominance relationships among ranked elements, providing stronger discrimination in the tail than frequency-based measures alone. We further introduce the RIF matrix and RIF network to represent these pairwise relationships structurally, supporting interpretation of prominence patterns. Although demonstrated in academic publishing, the method generalizes to domains where categorical variables follow discrete power-law behavior under appropriate model-fit validation.

Keywords:

heavy-tailed distribution; keyword analysis; power-law distribution; rank distribution; Relative Importance Factor (RIF); social resilience

MSC:

62E20; 62G05; 91C99

1. Introduction

Heavy-tailed distributions are common in both natural and human-made phenomena. Unlike normal or exponential distributions, they decay more slowly in their tails and include forms such as the log-normal, stretched exponential, and the widely studied power-law distribution [1]. They model systems in which a small number of large occurrences account for most aggregate effects.

Given this versatility, power-law models have been widely used to analyze complex systems across diverse disciplines. They have been applied to understand ecological patterns [2], information dissemination in tourism studies [3], earthquake dynamics in geophysics [4], urban spatial structures in geography [5], and scaling laws in physical systems [6]. Notably, power-law behavior is also prevalent in academic publishing, where it characterizes the distribution of key components such as keyword frequencies [7,8], author productivity [9], and citation counts [10,11]. Such components are critical for assessing scholarly impact and often exhibit heavy-tailed patterns, where a small number of keywords, authors, or papers account for the majority of scientific influence.

This explanatory versatility makes the power-law model a valuable tool for exploring diverse natural, social, and scholarly systems. However, two key limitations remain. First, most studies focus exclusively on modeling frequency distributions, overlooking the mathematical relationship between rank and frequency, which can reveal how dominant elements relate to less frequent ones and provide deeper insights into relative importance. Second, widely used metrics in academic publishing, such as citation counts [12] and the h-index [13], are frequency-based and fail to account for the heavy-tailed nature of scholarly data, limiting their ability to capture hierarchical prominence.

This limitation is most pronounced in the tail of discrete power-law distributions, where many low-ranked elements exhibit similar or tied counts, making it increasingly difficult to distinguish relative dominance using frequency alone. Yet studies continue to show that elements like citations, keywords, and author productivity follow power-law distributions [14,15,16], underscoring the need for complementary metrics that incorporate rank-based structure. Addressing these gaps is essential for developing more accurate representations of complex systems, where relative importance (not just raw frequency) determines influence and systemic behavior.

To address these gaps, this study proposes: (1) a mathematical foundation for modeling rank-frequency relationships in heavy-tailed data; (2) the Relative Importance Factor (RIF) Index, a derived relational functional of the estimated rank distribution under a discrete power-law model, that quantifies how much more dominant an element is compared to those ranked below it; and (3) interpretive tools (RIF matrix and RIF network) that visualize both local and systemic patterns of conceptual prominence. Rather than introducing a new independent distributional parameter, the RIF transforms the global decay exponent into a structured pairwise dominance representation, enabling explicit comparison between ranked elements. Together, these contributions enable a nuanced, rank-aware analysis of importance, allowing for exploration of both localized and systemic thematic structures. We evaluated the RIF Index using synthetic data and applied it to Scopus records on social resilience, underscoring its versatility in uncovering thematic hierarchies and shifts across diverse contexts.

This paper is structured as follows: Section 2 reviews related work on power laws and their applications in academic publishing, establishing the conceptual foundation for the study. Section 3 introduces the modeling of rank distributions and formally defines the RIF Index. Section 4 explains how the RIF can be applied in different contexts, including synthetic data and a real-world case study on social resilience. Section 5 illustrates the RIF Index through synthetic scenarios, focusing on intra- and cross-group rank comparisons. Section 6 presents the empirical case study, detailing the dataset, keyword analysis, and thematic context. Section 7 discusses the main findings, limitations, and future research directions. Finally, Section 8 concludes the article. The appendices include supporting materials such as statistical derivations and pseudocode for implementing the RIF Index.

2. Background

Power-law distributions, commonly observed in natural and social phenomena, can be either discrete or continuous in form [14]. They exhibit two fundamental characteristics: (1) scale invariance, meaning the distribution’s shape remains consistent regardless of the scale at which it is observed [16]; and (2) a characteristic “L” shape on linear axes, which transforms into a straight line when plotted on log–log axes [16,17]. However, empirical validation of power-law behavior can be challenging due to data limitations that may obscure the distribution’s full tail behavior or lead to misidentification of the underlying pattern [18]. Moreover, power-law behavior may only hold above a minimum threshold and can be sensitive to parameter estimation choices, such as the selection of $x_{min}$ and the method used to estimate the scaling exponent, potentially leading to over- or under-identification of heavy-tailed structure.

In this study, we focus on discrete power-law distributions, which are particularly suitable for rank-frequency analysis in categorical data. This focus enables a deeper understanding of systems where items, such as terms, authors, or journals, are not just frequent but hierarchically ordered. Modeling the discrete rank distribution allows us to move beyond raw frequency and better capture the relational structure among elements, particularly in domains shaped by cumulative advantage and thematic centrality.

Beyond their statistical properties, power-law models offer conceptual value in understanding how influence and importance are distributed within complex systems. Their ability to capture hierarchical relationships makes them especially useful for analyzing ranked structures. In such systems, a few components appear very frequently, while many others are less common but still structurally relevant. Mid-ranked elements, in particular, often play important roles in linking or reinforcing broader patterns, even if they are not immediately visible through simple frequency counts. This makes power-law models powerful tools for revealing the underlying organization and influence embedded in real-world distributions.

Building on these foundations, power-law models have proven especially relevant for analyzing academic publishing, where they capture the extreme disparities inherent in key indicators used to assess scholarly influence. Datasets in academic publishing frequently exhibit heavy-tailed behaviors across variables reflecting critical dimensions of scientific activity, including keyword frequencies, author productivity, journal distributions, subject categories, affiliations, and citation networks.

For instance, analyses of keyword frequencies have revealed thematic concentration and topic dynamics [7,15,19], while author productivity and co-authorship networks demonstrate cumulative advantage processes, where a small number of authors account for a disproportionate share of output and collaborations [20]. Journal-level distributions highlight the uneven spread of publications across venues [21], and patterns in subject categories and institutional affiliations expose disparities in disciplinary and institutional research output [22,23]. Citation networks and counts capture extreme inequality in scientific impact, with studies showing that power-law properties persist over time even as networks evolve [10,11,24]. Furthermore, research funding has been shown to influence citation distributions, with funded papers exhibiting heavier tails than their unfunded counterparts [25]. Collectively, these findings demonstrate that power-law patterns (where the frequency of items at rank $x$ is proportional to $x^{- α}$, with the scaling exponent $α$ reflecting the degree of inequality) provide a powerful framework for quantifying disparities across academic publishing indicators and for understanding the structural dynamics of scholarly communication [14,15,16].

While scaling exponents provide valuable information about global concentration, existing studies have largely focused on estimating frequency-based parameters, such as scaling exponents, while overlooking the detailed examination of rank distributions (that is, how the hierarchical positions of elements within datasets can reveal deeper information about their relative importance). This omission limits our understanding of how dominant elements (e.g., highly cited papers, frequently used keywords, or prolific authors) relate to those that are less frequent, which is critical for mapping the structure of scientific knowledge.

This limitation becomes particularly evident when frequency-based rankings are used to order categorical elements, as they provide limited insight when frequency differences are small or identical. For example, while large disparities (e.g., 50 vs. 10) suggest clear dominance, it is far more difficult to determine relative importance when frequencies are equal (e.g., 40 vs. 40) or nearly so (e.g., 40 vs. 38). In such cases, frequency becomes an unreliable signal, offering no clear justification for ranking one element above another. This challenge is particularly salient in power-law distributed systems, where many mid- and low-ranked elements play meaningful roles despite having similar occurrence rates. Thus, relying solely on frequency can obscure structural relevance. These challenges point to the need for a principled, mathematically grounded approach to comparing elements not just by how often they occur, but by how their frequency relates to the overall distribution. A rank-aware perspective (rooted in power-law modeling) enables clearer differentiation, especially in cases of small frequency deltas, and strengthens our capacity to capture hierarchical importance within complex systems.

Many widely used metrics in academic publishing, such as total citation counts [12], the h-index [13], and the Journal Impact Factor (JIF) [26], primarily emphasize raw frequency without accounting for the heavy-tailed nature of academic publishing data, restricting their capacity to capture hierarchical prominence across elements. Other indicators, including the g-index [27], altmetric scores [28], and PageRank [29], offer complementary perspectives but remain focused on frequency or popularity, neglecting the insights afforded by the relative ranks of elements.

Crucially, none of these existing metrics explicitly model rank distributions across academic publishing elements, an essential aspect for understanding systems governed by heavy-tailed behavior. Rank distributions reveal how frequently and prominently elements appear relative to one another, providing insight into their structural positions within the broader discourse. These gaps underscore the need for metrics grounded in distributional logic that integrate rank-based significance into importance assessment, moving beyond approaches that rely solely on frequency-based measures.

3. Theoretical Foundations of the RIF Framework

This section establishes the theoretical foundations of the RIF framework. We begin by formalizing the discrete power-law model governing frequency distributions in Section 3.1. Next, in Section 3.2, we derive the induced rank distribution under the power-law assumption, providing the probabilistic structure necessary for rank-based analysis. The RIF Index is then formally introduced in Section 3.3 as a relational functional of the estimated rank distribution. Its mathematical properties are examined in Section 3.4, where we characterize its domain and range. Finally, Section 3.5 analyzes how the exponent $\hat{θ}$ determines the magnitude of $\hat{Φ} (s, r)$, establishing the monotonic relationship between dominance intensity and the scaling parameter.

3.1. Understanding the Discrete Power-Law Foundations

Given our focus on modeling element frequencies in categorical data, we adopt the discrete form of the power-law distribution, which is appropriate for variables that take on integer values. Let $V = \{V_{1}, V_{2}, \dots, V_{n}\}$ denote the set of all observed elements, and $x$ the frequency of occurrence of a given element $V_{i}$. The probability mass function of a discrete power-law is:

$$f (x) = P (X = x) = \frac{C}{x^{α}} \tag{1}$$

Here, $X$ is the frequency variable, $α > 0$ is the scaling exponent, and $C$ is a normalizing constant ensuring the distribution sums to one over the support $x \geq x_{min}$, where $x_{min}$ represents the minimum frequency from which the power-law behavior is assumed to hold. Both $α$ and $C$ depend on the distribution and can be found in Clauset et al. [14]. The constant $C$ is given by $C = {[ζ_{H} (α, x_{min})]}^{- 1}$ (finite only for $α > 1$), with $ζ_{H} (α, x_{min})$ denoting the Hurwitz zeta function, $ζ_{H} (α, x) := \sum_{i = 0}^{\infty} {(i + x)}^{- α}$ [30]. The cumulative distribution function, taken in the complementary sense $P (X \geq x)$ as in Clauset et al. [14], is $F (x) = C \cdot ζ_{H} (α, x)$. To estimate $α$ and $x_{min}$, we follow the methodology of Clauset et al. [14]. For $x \geq {\hat{x}}_{min}$, the maximum likelihood estimator is:

$$\hat{α} \approx 1 + n {\left[\sum_{i = 1}^{n} \ln \left(\frac{x_{i}}{{\hat{x}}_{min} - 0.5}\right)\right]}^{- 1} \tag{2}$$

where $x_{i}$ denotes the frequencies such that $x_{i} \geq {\hat{x}}_{min}$ and $n$ is the number of such observations. The lower bound ${\hat{x}}_{min}$ is selected by minimizing the Kolmogorov–Smirnov (KS) distance, $KS = \max_{x \geq x_{min}} | G (x) - F (x) |$ [31], where $G$ and $F$ denote the empirical and fitted distribution functions of the observations with $x \geq x_{min}$. Linearizing the model in log–log form provides an alternative estimation path: $\ln (f (x)) = \ln (C) - α \ln (x)$. Bootstrapping is used to evaluate the uncertainty of $\hat{α}$, and goodness-of-fit is tested via p-values. All computations were performed using the poweRlaw package in R (version 4.5.0) [32], with details of the application provided in Section 6.
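The estimator in Eq. (2) and the normalization in Eq. (1) can be illustrated with a minimal Python sketch. It is not the poweRlaw implementation used in this study: the function names, the truncation length `terms`, and the midpoint tail correction are illustrative assumptions, and the selection of ${\hat{x}}_{min}$ by KS minimization and the bootstrap step are omitted.

```python
import math


def alpha_hat(freqs, x_min):
    """Approximate discrete MLE of the scaling exponent, as in Eq. (2)."""
    tail = [x for x in freqs if x >= x_min]  # keep only x_i >= x_min
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / (x_min - 0.5)) for x in tail)
    return 1.0 + len(tail) / log_sum


def hurwitz_zeta(alpha, x, terms=10_000):
    """Hurwitz zeta for alpha > 1: partial sum plus an integral tail estimate.

    `terms` is an illustrative truncation choice, not a value from the paper.
    """
    if alpha <= 1:
        raise ValueError("the series converges only for alpha > 1")
    head = sum((i + x) ** (-alpha) for i in range(terms))
    # Midpoint-rule estimate of the omitted terms i >= terms.
    tail = (terms + x - 0.5) ** (1.0 - alpha) / (alpha - 1.0)
    return head + tail


def power_law_pmf(x, alpha, x_min):
    """P(X = x) = C * x**(-alpha) with C = 1 / zeta_H(alpha, x_min), Eq. (1)."""
    if x < x_min:
        return 0.0
    return x ** (-alpha) / hurwitz_zeta(alpha, x_min)
```

For example, `alpha_hat([1, 2, 2, 2], 2)` discards the observation below the threshold and evaluates Eq. (2) on the remaining three frequencies.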

To analyze rank distributions, we sort the frequencies $x (r)$ of the elements $V_{i}$ in descending order, such that $x (r) \geq x (r + 1)$ for all $r \geq 1$. Empirical patterns often follow Zipf’s law, where frequency decays as a power of rank [18]: $x (r) = δ \cdot r^{β}$. Taking logarithms, we obtain $\ln (x (r)) = \ln (δ) + β \cdot \ln (r)$. Parameters $δ$ and $β$ are estimated using either MLE or OLS, depending on sample size. Bootstrapping may also be used to construct confidence intervals. The final fitted model is $x (r) = \hat{δ} \cdot r^{\hat{β}}$.
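As an illustration of the OLS route for the log-linear form $\ln (x (r)) = \ln (δ) + β \cdot \ln (r)$, the following sketch fits $\hat{δ}$ and $\hat{β}$ by simple linear regression on the log-transformed ranks and frequencies. The function name `fit_zipf_ols` is hypothetical, and the MLE alternative and bootstrap confidence intervals mentioned in the text are not shown.

```python
import math


def fit_zipf_ols(freqs):
    """OLS fit of ln x(r) = ln(delta) + beta * ln(r) on ranked frequencies."""
    ranked = sorted(freqs, reverse=True)  # x(1) >= x(2) >= ...
    if len(ranked) < 2 or ranked[-1] <= 0:
        raise ValueError("need at least two strictly positive frequencies")
    log_r = [math.log(r) for r in range(1, len(ranked) + 1)]
    log_x = [math.log(x) for x in ranked]
    n = len(ranked)
    mean_r = sum(log_r) / n
    mean_x = sum(log_x) / n
    sxx = sum((lr - mean_r) ** 2 for lr in log_r)
    sxy = sum((lr - mean_r) * (lx - mean_x) for lr, lx in zip(log_r, log_x))
    beta = sxy / sxx                           # slope (negative for decay)
    delta = math.exp(mean_x - beta * mean_r)   # exp(intercept)
    return delta, beta
```

On frequencies that follow $x (r) = 100 \cdot r^{- 1}$ exactly, the fit recovers $\hat{δ} = 100$ and $\hat{β} = - 1$ up to floating-point error.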

3.2. Rank Distribution Under Power-Law Assumption

We first establish key properties of rank distributions under a power-law model, providing formal derivations that justify our approach. Theorem 1 presents the probability distribution of the rank for a variable whose frequency follows a power-law.

Theorem 1

(Rank distribution under power-law assumption). Let $V$ be a variable composed of several elements $V_{i}$, where $i = 1, 2, \dots, n$, with descending ordered frequencies of occurrence $x (r)$ for each element $V_{i}$ at rank $r$, where $r$ is the rank of the element within $V$. That is, $x (r) \geq x (r + 1)$, for all $r \geq 1$. Suppose that $V$ follows a power-law distribution with estimated scaling exponent $\hat{α}$, threshold ${\hat{x}}_{min}$, and normalization constant $\hat{C}$, ensuring that the probabilities sum to 1 for $V$. Then, for all ranks $r$ such that $x (r) \geq {\hat{x}}_{min}$, the estimated probability $\hat{P} (R = r)$ that an element $V_{i}$ occupies rank $r$ is given by:

$$\hat{P} (R = r) = \frac{\hat{B}}{\hat{P} (X = x (r))} = \hat{A} \cdot r^{- \hat{θ}}, \quad \text{for all } \hat{θ} > 0, \tag{3}$$

where $X$ represents the random variable corresponding to the frequency of occurrences, $\hat{B}$ is a proportionality constant, and $\hat{A}$ is the normalizing constant defined by:

$$\hat{A} := \frac{\hat{B} \cdot {\hat{δ}}^{\hat{α}}}{\hat{C}}, \quad \text{such that } \sum_{r} \hat{A} \cdot r^{- \hat{θ}} = 1, \tag{4}$$

where $\hat{δ}$ (a real number) and $\hat{β} < 0$ are estimates such that $x (r) = \hat{δ} \cdot r^{\hat{β}}$, and $\hat{θ} := - \hat{α} \cdot \hat{β} > 0$ is the power-law exponent for the rank $R$.

Proof of Theorem 1.

The derivation of the rank distribution under the power-law assumption is provided in Appendix A. □
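To make the statement of Theorem 1 concrete, the sketch below computes $\hat{θ} = - \hat{α} \cdot \hat{β}$ and the normalized rank probabilities $\hat{A} \cdot r^{- \hat{θ}}$. It assumes that the admissible ranks are $1, \dots, m$ (those with $x (r) \geq {\hat{x}}_{min}$), so that $\hat{A}$ is obtained by normalizing over this finite set; the function name and the parameter `n_ranks` are illustrative.

```python
def rank_distribution(alpha_hat, beta_hat, n_ranks):
    """Return (theta, A, probabilities) for ranks 1..n_ranks, Eqs. (3)-(4)."""
    theta = -alpha_hat * beta_hat          # rank exponent, must be positive
    if theta <= 0:
        raise ValueError("theta = -alpha * beta must be positive")
    weights = [r ** (-theta) for r in range(1, n_ranks + 1)]
    a_hat = 1.0 / sum(weights)             # chosen so the probabilities sum to 1
    return theta, a_hat, [a_hat * w for w in weights]
```

With hypothetical inputs $\hat{α} = 2$ and $\hat{β} = - 0.5$, the rank exponent is $\hat{θ} = 1$, and the probability at rank 1 equals $\hat{A}$.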

3.3. Definition of the RIF Index

We now introduce the RIF Index as a derived relational functional of the estimated rank distribution that allows for the comparison of the relative importance of elements across different ranks. This estimate supports both intra-group comparisons (assessing how an element at a given rank compares to those ranked below it within the same group) and cross-group comparisons, which evaluate corresponding ranks across different groups or systems. For example, the RIF Index can quantify how dominant the second-ranked element is relative to lower-ranked ones within a single dataset (intra-group), or compare the strength of the second-ranked element in Group 1 versus that in Group 2 (cross-group). Since it is explicitly derived from the estimated rank probabilities, RIF provides a consistent normalized scale, even when the groups differ in overall size, frequency distributions, or decay exponents.

Definition 1

(RIF Index). For all ranks $s \leq r$ such that $x (s) \geq x (r) \geq {\hat{x}}_{min}$, the estimated RIF Index of an element $V_{s}$ at rank $s$, with respect to another element $V_{r}$ at rank $r$, is defined as:

$$\hat{Φ} (s, r) := \frac{\hat{P} (R = s)}{\hat{P} (R = r)} = {\left(\frac{r}{s}\right)}^{\hat{θ}}, \quad \hat{θ} > 0, \tag{5}$$

where:

$\hat{P} (R = s)$ is the estimated probability of the element occupying rank s,
$\hat{P} (R = r)$ is the estimated probability of the element occupying rank r.

Importantly, the RIF Index does not introduce an additional independent parameter beyond the estimated decay exponent $\hat{θ}$; rather, it converts the fitted exponent into probability ratios across ranks, enabling explicit pairwise comparisons among elements. When $s = 1$, we denote it simply as $\hat{Φ} (r) := \hat{Φ} (1, r)$ and refer to it as the RIF Index with respect to element $V_{r}$. Since $\hat{P} (R = 1) = \hat{A}$, the RIF Index becomes $\hat{Φ} (r) = r^{\hat{θ}}$. When $r = 1$, $\hat{Φ} (1) = 1$, which holds trivially because it compares the first rank to itself. Now, suppose that $r > 1$. Then the probability of observing the element at rank 1 is $r^{\hat{θ}}$ times the probability of observing the element at rank $r$. More generally, the relationship between ranks can be expressed as:

$$\hat{Φ} (s, r) = \frac{\hat{Φ} (r)}{\hat{Φ} (s)} \quad \text{and} \quad \hat{P} (R = s) = {\left(\frac{r}{s}\right)}^{\hat{θ}} \cdot \hat{P} (R = r) \tag{6}$$
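Equations (5) and (6) translate directly into a few lines of code. The sketch below is illustrative only (the function name `rif` is an assumption); it evaluates $\hat{Φ} (s, r) = {(r / s)}^{\hat{θ}}$ and can be used to check the identity $\hat{Φ} (s, r) = \hat{Φ} (r) / \hat{Φ} (s)$ numerically.

```python
def rif(s, r, theta):
    """RIF Index of rank s with respect to rank r (s <= r), Eq. (5)."""
    if not 1 <= s <= r:
        raise ValueError("ranks must satisfy 1 <= s <= r")
    if theta <= 0:
        raise ValueError("theta must be positive")
    return (r / s) ** theta


# Identity from Eq. (6) for a hypothetical exponent theta = 0.8:
theta = 0.8
lhs = rif(2, 6, theta)                      # direct pairwise comparison
rhs = rif(1, 6, theta) / rif(1, 2, theta)   # ratio of top-rank indices
```

Here `lhs` and `rhs` agree up to floating-point error, and `rif(1, r, theta)` reduces to $r^{\hat{θ}}$.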

Figure 1 explores the relationship between rank $r$, the RIF Index $\hat{Φ} (r)$, and the probability $\hat{P} (R = r)$ under different values of the scaling parameter $\hat{θ}$.

Figure 1a,b show how the RIF Index and the probability change with respect to $r$ for various values of $\hat{θ}$. As $r$ increases and $\hat{θ}$ becomes larger, the RIF Index $\hat{Φ} (r) = r^{\hat{θ}}$ grows as a power of $r$ (and exponentially in $\hat{θ}$ for fixed $r > 1$), while the probability decreases rapidly. Steeper curves reflect systems with greater inequality in importance concentration at the top ranks. Figure 1c shows how $\hat{P} (R = r)$ affects the RIF Index for varying values of $\hat{P} (R = 1)$. This plot emphasizes how a higher baseline probability at the top rank amplifies the contrast in relative importance across ranks. Figure 1d reverses the dependency by showing how changes in $\hat{P} (R = 1)$ influence the RIF Index for fixed values of $\hat{P} (R = r)$. Here, the RIF Index increases linearly with $\hat{P} (R = 1)$, because $\hat{Φ} (r) = \hat{P} (R = 1) / \hat{P} (R = r)$, showing that differences in top-rank prominence strongly drive perceived importance gaps. Together, these visualizations provide a comprehensive perspective on how RIF and rank-based probabilities behave under different scaling assumptions and offer interpretability for both within- and across-system comparisons.

3.4. Properties of the RIF Index

The domain of the RIF Index $\hat{Φ} (s, r)$ is the set of all pairs of positive integers $(s, r)$ with $s \leq r$ such that $x (s) \geq x (r) \geq {\hat{x}}_{min}$. The range of $\hat{Φ} (s, r)$ is $[1, \infty)$. Since $\hat{P} (R = s) \geq \hat{P} (R = r)$ for any $s \leq r$, the value of $\hat{Φ} (s, r)$ will always be greater than or equal to 1. The range of values that the RIF Index can take enables different interpretations regarding the importance of an element at rank $s$ compared to rank $r$:

Near 1: When $\hat{Φ} (s, r) \approx 1$ , it indicates that the probability of the element being at rank r is nearly the same as being at rank s. This suggests the element maintains relatively stable importance across ranks.
High Values: When $\hat{Φ} (s, r) ≫ 1$ , the element is significantly more likely to be found at rank s than at r. This reflects a sharp decline in importance as rank increases.

To facilitate the interpretation and communication of element importance within our analysis, we introduce the following qualitative categorization of $\hat{Φ} (s, r)$ values. These thresholds are proposed as interpretive guidelines rather than inferential cutoffs, intended to facilitate qualitative communication of dominance intensity under power-law regimes.

Definition 2

(Interpretive Categories for the RIF Index). Given an element $V_{s}$ at rank $s$ and another element $V_{r}$ at rank $r \geq s$, with $\hat{Φ} (s, r) = {(\frac{r}{s})}^{\hat{θ}}$, the following interpretations are proposed:

Stable: $\hat{Φ} (s, r) \approx 1$
The element’s importance remains nearly constant across different ranks. It suggests a stable distribution of importance with minimal variation as the rank changes.
Moderate: $1 < \hat{Φ} (s, r) \leq 3$
The element has a slightly higher importance at rank s compared to a higher rank r. The difference is noticeable but not extreme, indicating a moderate decline in importance.
Significant: $3 < \hat{Φ} (s, r) \leq 6$
The element is significantly more important at rank s than at higher ranks. This represents a clearly elevated level of importance.
Critical: $6 < \hat{Φ} (s, r) \leq 9$
The element’s importance at rank s is substantially higher than at other ranks. This reflects a critical difference in importance with a steep decline as rank increases.
Dominant: $\hat{Φ} (s, r) > 9$
The element is overwhelmingly more important at rank s than at any higher rank. This indicates a peak level of importance concentrated at rank s.

These interpretive categories support both qualitative and quantitative evaluations, helping to contextualize the relative significance of a given element $V_{s}$ when compared with elements $V_{r}$ for $r \geq s$.
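The categories of Definition 2 can be expressed as a simple lookup, sketched below. Because the definition specifies "Stable" only as $\hat{Φ} (s, r) \approx 1$, the tolerance `tol` used here to decide what counts as approximately 1 is a hypothetical choice, not a value from this study; the remaining thresholds (3, 6, and 9) are those of Definition 2.

```python
def rif_category(phi, tol=0.05):
    """Map a RIF value to the interpretive categories of Definition 2.

    `tol` is an assumed tolerance for the 'approximately 1' condition.
    """
    if phi < 1:
        raise ValueError("the RIF Index is defined on [1, infinity)")
    if phi <= 1 + tol:
        return "Stable"
    if phi <= 3:
        return "Moderate"
    if phi <= 6:
        return "Significant"
    if phi <= 9:
        return "Critical"
    return "Dominant"
```

For instance, a value of 4.5 falls in the "Significant" band, while any value above 9 is labeled "Dominant".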

3.5. Values of $\hat{θ}$ for Each Range of $\hat{Φ} (s, r)$

The value of $\hat{θ}$ directly influences the values of $\hat{Φ} (s, r)$ for all $r \geq s$ such that $x (s) \geq x (r) \geq {\hat{x}}_{min}$, as shown in the following theorem.

Theorem 2

(Monotonic Relationship Between $\hat{θ}$ and $\hat{Φ} (s, r)$). Let $\hat{Φ} (s, r)$ be the estimated RIF Index of the element $V_{s}$ at rank $s$ with respect to another element $V_{r}$ at rank $r$, where $r \geq s$ and $x (s) \geq x (r) \geq {\hat{x}}_{min}$. Let $\hat{θ}$ be defined as in previous sections. Then, for a fixed ratio $p := r / s > 1$ and given estimations ${\hat{θ}}_{1}$ and ${\hat{θ}}_{2}$ such that $0 \leq {\hat{θ}}_{1} \leq {\hat{θ}}_{2}$, the following holds:

$${\hat{θ}}_{1} \leq \hat{θ} \leq {\hat{θ}}_{2} \quad \text{if and only if} \quad {\hat{Φ}}_{1} \leq \hat{Φ} (s, r) \leq {\hat{Φ}}_{2} \tag{7}$$

where ${\hat{Φ}}_{i} := {\hat{Φ}}_{i} (s, r) = p^{{\hat{θ}}_{i}}$. Observe that ${\hat{θ}}_{i} = \frac{\ln ({\hat{Φ}}_{i})}{\ln (p)}$.

Proof of Theorem 2.

See Appendix B. □

Table 1 presents the estimated minimum and maximum values of $\hat{θ}$ for different fixed ranks $r$ across various ranges of $\hat{Φ} (r)$. This estimation is based on the formula $\hat{Φ} (r) = r^{\hat{θ}}$, which provides insight into how $\hat{θ}$ varies with different values of $r$ and $\hat{Φ} (r)$.

For the range $1 < \hat{Φ} (r) \leq 3$, $\hat{θ}$ spans from 0.014 to 1.585 for $r = 2$, 0.009 to 1.000 for $r = 3$, and gradually decreases with increasing rank, reaching 0.004 to 0.477 for $r = 10$. This pattern reflects that for elements with the lowest RIF Index values, the scaling exponent $\hat{θ}$ tends to be smaller and diminishes consistently as rank increases. In the range $3 < \hat{Φ} (r) \leq 6$, $\hat{θ}$ shows moderate growth, ranging from 1.590 to 2.585 for $r = 2$ and narrowing to 0.479 to 0.778 for $r = 10$. This suggests that moderately prominent elements still exhibit variation in scaling, although the values become more compressed at higher ranks. For the range $6 < \hat{Φ} (r) \leq 9$, $\hat{θ}$ further increases, with values from 2.587 to 3.170 for $r = 2$ and 0.779 to 0.954 for $r = 10$, reflecting stronger relative importance among top-ranked elements and a continued downward trend as rank grows. Finally, in the range $\hat{Φ} (r) > 9$, $\hat{θ}$ values are the highest, particularly at the smallest values of $r$: from 3.172 to 5.322 for $r = 2$, decreasing to 0.955 to 1.602 for $r = 10$. This confirms that the most prominent elements (i.e., those with the largest $\hat{Φ} (r)$) exhibit the steepest scaling, while the corresponding exponents diminish markedly at larger $r$.

This analysis demonstrates the variability of $\hat{θ}$ and highlights the differences in the RIF Index across different ranks and ranges of $\hat{Φ} (r)$. This behavior is visualized in Figure 2.
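The bounds discussed for Table 1 follow from inverting $\hat{Φ} (r) = r^{\hat{θ}}$, that is, $\hat{θ} = \ln (\hat{Φ}) / \ln (r)$. The sketch below reproduces the upper bounds of each closed range. The reported lower bounds (for example, 0.014 for $r = 2$) are numerically consistent with evaluating the formula slightly above the open endpoint, at $\hat{Φ} = 1.01$, although that step size is an inference and is not stated in the text. The function name is illustrative.

```python
import math


def theta_for_rif(phi, r, s=1):
    """Exponent theta that yields RIF value phi for the rank ratio p = r / s."""
    p = r / s
    if p <= 1:
        raise ValueError("requires r > s so that ln(p) > 0")
    if phi < 1:
        raise ValueError("the RIF Index is at least 1")
    return math.log(phi) / math.log(p)


# Upper bounds of theta for the ranges ending at 3, 6 and 9, at r = 2 and r = 10.
upper_bounds = {
    (phi, r): round(theta_for_rif(phi, r), 3)
    for phi in (3, 6, 9)
    for r in (2, 10)
}
```

The resulting dictionary contains, for example, 1.585 for $\hat{Φ} = 3$ at $r = 2$ and 0.954 for $\hat{Φ} = 9$ at $r = 10$, matching the values quoted above.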

4. Methodology and General Framework

Figure 3 shows the general workflow for applying the RIF Index. The process is organized into three main steps: (1) Data Processing, (2) Power-law Estimation, and (3) RIF Estimation. This framework ensures that categorical data (particularly those following a power-law distribution) are properly validated, transformed, and analyzed using a mathematically grounded approach.

We begin by collecting the data and selecting a variable of interest, which must be categorical; otherwise, the RIF Index cannot be applied. We then estimate frequencies, calculate totals, compute relative frequencies, and sort the values to assign ranks. These ranked frequencies serve as the basis for fitting a discrete power-law model.

Next, we evaluate the statistical validity of the fitted model (e.g., p-value ≥ 0.05). Since the RIF Index is explicitly derived from the estimated discrete power-law rank model, if the power-law fit is not supported, the RIF Index is not applicable. Otherwise, we compute the RIF Index for all valid rank pairs. This enables both intra-group comparisons (assessing how dominant an element is within its group relative to lower-ranked elements) and cross-group comparisons (e.g., comparing the strength of the second-ranked element across two datasets).

Finally, we visualize the induced relational structure using RIF matrices and networks to interpret structural patterns and relationships. Each step of the workflow is formalized in the pseudocode provided in Appendix C (Algorithm A1), which outlines the complete implementation of the RIF Index (from raw frequency data to rank-based analysis).
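The data-processing and RIF-estimation steps of this workflow can be sketched as follows. This is not Algorithm A1 from Appendix C; it is a simplified illustration in which the exponent `theta` is assumed to come from a power-law fit that has already passed validation, ties in frequency are broken alphabetically (an assumption), and the RIF matrix is stored as a nested list with `None` below the diagonal (an assumed layout).

```python
from collections import Counter


def rank_table(observations):
    """Step 1: frequencies, relative frequencies, and ranks of a categorical variable."""
    counts = Counter(observations)
    total = sum(counts.values())
    # Sort by descending frequency; ties are broken alphabetically (assumption).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        (rank, label, freq, freq / total)
        for rank, (label, freq) in enumerate(ordered, start=1)
    ]


def rif_matrix(n_ranks, theta):
    """Step 3: pairwise RIF values (r / s) ** theta for s <= r, None otherwise."""
    return [
        [(r / s) ** theta if s <= r else None for r in range(1, n_ranks + 1)]
        for s in range(1, n_ranks + 1)
    ]


table = rank_table(["a", "b", "a", "c", "a", "b"])
matrix = rif_matrix(len(table), theta=1.0)  # theta = 1.0 is a placeholder value
```

Row $s$ of `matrix` lists how dominant the element at rank $s$ is relative to each element ranked at or below it, which is the information a RIF network would encode as weighted edges.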

5. Demonstration with Synthetic Data

To demonstrate the practical application of the RIF Index in domain-specific contexts, we construct two synthetic cases using simulated datasets labeled with conceptual codes (e.g., a, b, c…). These examples simulate realistic rank-frequency distributions and allow us to explore two core use cases: Case 1, which compares elements within a single group (Section 5.1); and Case 2, which compares corresponding ranks across two different groups (Section 5.2). Building on the theoretical constructs developed in Section 3.3, we show how the RIF Index can be visualized and interpreted using two complementary formats: a comparison matrix and a relational network.

5.1. Case 1: Intra-Group Rank Comparison

Figure 4 illustrates the RIF Index in the general context of transitioning from raw frequency data to a quantifiable measure of comparative importance between ranked elements within a single group.

Figure 4a displays the raw frequencies of several elements (e.g., a, b, c, …) in no particular order. While this view shows which elements appear more frequently, it does not facilitate meaningful comparisons of how much more important one element is over another. Figure 4b improves upon this by arranging the same elements in descending order by frequency. Although this helps visually identify the most prominent elements, the visual differences between adjacent bars diminish rapidly. As the ranking progresses, it becomes harder to tell how much more relevant one element is compared to its neighbors (leading to a reliance on perception rather than quantifiable distinction).

Figure 4c addresses this ambiguity. Applying a log–log transformation and fitting a power-law distribution to the frequency data reveals a consistent pattern in the rate of frequency decay. The RIF, denoted as $\hat{Φ}(s, r)$, leverages this distribution to provide a mathematically grounded way to compare any two ranks $s \leq r$. Rather than simply stating that one element is more frequent than another, RIF quantifies “how much more” in consistent, interpretable terms. In this way, the RIF transforms the frequency-rank relationship into a structured, scalable representation of relative importance.

This intra-group analysis demonstrates that RIF is particularly useful in long-tailed distributions, where traditional frequency comparisons tend to break down. It offers a scalable and interpretable measure of conceptual distance between ranks within a single system.

5.2. Case 2: Cross-Group Rank Comparison

While Case 1 focused on comparisons within a single system, Case 2 demonstrates the utility of RIF for comparing equivalent ranks across different groups. This is particularly valuable when analyzing parallel systems (such as distinct domains, time periods, or geographical regions) that share similar structural patterns but differ in overall scale or magnitude.

Figure 5 illustrates this application. Sub-plot a displays the raw frequencies of elements within two distinct groups (Group 1 and Group 2).

Sub-plot b presents the ranked frequency distributions for both groups. Although both exhibit a similar decay pattern consistent with a power-law, their absolute frequencies and scaling behaviors differ. Traditional rank or frequency-based comparison methods fall short in accounting for these disparities.

Sub-plot c demonstrates how the RIF Index can address this challenge. By comparing $\hat{Φ}(r)$ values at corresponding ranks across groups, we can evaluate how relatively “important” an element at rank r is in Group 1 compared to its counterpart in Group 2. This facilitates interpretable, scale-invariant comparisons between systems. For instance, even if both groups share a second-ranked element, the RIF may reveal that its relative prominence is significantly higher in one group than the other (enabling a fair and meaningful cross-group assessment).

In this second case, RIF functions as a normalization bridge, making it possible to compare groups that would otherwise be incomparable due to structural or scale-related differences. By anchoring each group’s rank distribution to its own estimated power-law exponent $\hat{θ}$, the RIF offers a unified framework for interpreting relative prominence across distinct systems. Nevertheless, this approach introduces a critical caveat: cross-group comparisons are only valid if the estimated exponents are statistically sound. Inaccurate estimation of $\hat{θ}$ may distort the comparative scale, highlighting the need for rigorous model validation before applying RIF in heterogeneous contexts.

5.3. Visualizing the RIF Index: Matrices and Networks

To further illustrate the flexibility of the RIF Index, we present two complementary visualizations (matrix-based and network-based) using the top six concepts (e.g., a, b, c…) from each synthetic group. These representations demonstrate how the RIF Index can be applied not only to compare individual rank gaps (e.g., from rank 1 to rank 2), but also to examine the overall relational structure between any pair of ranks (whether adjacent or not) within a group or between groups (e.g., rank 2 from group 1 vs. rank 2 from group 2).

5.3.1. Matrix-Based Comparison

Figure 6 displays the RIF Index $\hat{Φ} := \hat{Φ}(s, r)$, $s \leq r$, for all pairwise combinations of the top six ranked concepts in each group.

Each cell indicates how many times more important the concept at rank s is compared to the concept at rank r. This structure allows us to answer specific questions, such as “Who is the better second?” or “Which concept is closer to the top?”, even when absolute frequencies may mislead.

To aid interpretation, each $\hat{Φ}$ value is categorized into five qualitative tiers: Stable ($\hat{Φ} \approx 1$), Moderate ($1 < \hat{Φ} \leq 3$), Significant ($3 < \hat{Φ} \leq 6$), Critical ($6 < \hat{Φ} \leq 9$), and Dominant ($\hat{Φ} > 9$). These categories are color-coded, with pale blue tones representing stronger conceptual proximity (lower RIF values), and progressively deeper orange to dark red tones indicating weaker thematic connections (higher RIF values). For example, a Stable relationship (light blue) implies strong thematic overlap between two concepts, while a Dominant relationship (dark red) signals a large thematic distance. This interpretive framework supports a more nuanced understanding of intra-group and cross-group thematic structures. Importantly, this matrix representation operationalizes the scalar decay exponent into a complete pairwise dominance structure, enabling a relational analysis that is not immediately accessible from inspection of the exponent alone.
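The tier boundaries listed above can be expressed as a small lookup. The following is a minimal sketch only: the function name `rif_tier` and the tolerance `tol` used to decide when a value counts as "approximately 1" are assumptions introduced for illustration, since the text does not state how close to 1 a Stable value must be.

```python
def rif_tier(phi, tol=0.05):
    """Map an RIF value to one of the five qualitative tiers.

    `tol` is a hypothetical tolerance for "approximately 1"; it is not
    specified in the text and is chosen here only for illustration.
    """
    if phi <= 1 + tol:
        return "Stable"
    if phi <= 3:
        return "Moderate"
    if phi <= 6:
        return "Significant"
    if phi <= 9:
        return "Critical"
    return "Dominant"


# A few illustrative values spanning the five tiers.
for value in (1.0, 2.6, 5.3, 7.1, 27.6):
    print(value, rif_tier(value))
```

Keeping the thresholds in one function makes it straightforward to apply the same labels to every cell of a comparison matrix.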

Examples from Case 1 (Intra-Group)

For illustrative purposes, we focus on Group 1. The interpretation for Group 2 is analogous. The RIF between concept b (rank 1) and j (rank 2) is approximately $5.3$, placing it in the Significant tier. This suggests that j is meaningfully less important than b, but still plays a strong secondary role in the system. In contrast, the gap between b and d (rank 4) is around $27.6$, a Dominant relationship, indicating that d is thematically distant from the system’s conceptual core. Moreover, the relationship between j and c (ranks 2 and 3) has an RIF near $2.6$, categorized as Moderate, revealing a relatively cohesive middle layer in the group. This value results from dividing the RIF of c relative to rank 1 (${\hat{Φ}}_{3} = 13.9$) by the RIF of j relative to rank 1 (${\hat{Φ}}_{2} = 5.3$), that is: $\hat{Φ}(2, 3) = {\hat{Φ}}_{3} / {\hat{Φ}}_{2} \approx 2.62$. Additional comparisons provide further insight. For example, comparing j (rank 2) and d (rank 4), we obtain an RIF of $\hat{Φ}(2, 4) \approx 5.21$, again in the Significant range. This suggests that j retains a strong role even when compared to lower-tier concepts beyond its immediate neighbor. Another case is the comparison between c (rank 3) and u (rank 5), with an RIF of approximately $3.4$, which also falls into the Significant category. This confirms that the drop in importance between these mid-ranked concepts is not as steep as the one observed between top and lower ranks. These values correspond to previously computed RIF indices, where each $\hat{Φ}(s, r)$ expresses how many times more relevant the concept at rank s is compared to the concept at rank r based on their rank positions.
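The ratio rule used in these examples can be checked with a few lines of code. This is a sketch under the assumption, taken from the worked example above, that any pairwise value follows from the values relative to rank 1 as $\hat{Φ}(s, r) = {\hat{Φ}}_{r} / {\hat{Φ}}_{s}$; the helper name `pairwise_rif` is hypothetical, and only the Group 1 values quoted in the text are used.

```python
def pairwise_rif(first_row):
    """Build all pairwise RIF values from the values relative to rank 1.

    first_row[r - 1] holds Phi(1, r). The result maps (s, r), with
    s <= r, to Phi(s, r) = Phi(1, r) / Phi(1, s).
    """
    n = len(first_row)
    return {
        (s, r): first_row[r - 1] / first_row[s - 1]
        for s in range(1, n + 1)
        for r in range(s, n + 1)
    }


# Group 1 values quoted in the text for ranks 1 to 4 (b, j, c, d).
group1 = [1.0, 5.3, 13.9, 27.6]
matrix = pairwise_rif(group1)

print(round(matrix[(2, 3)], 2))  # j vs. c -> 2.62
print(round(matrix[(2, 4)], 2))  # j vs. d -> 5.21
```

Only the upper triangle ($s \leq r$) is produced, matching the convention used for the comparison matrix.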

Examples from Case 2 (Cross-Group)

Comparing the second-ranked concepts across groups, j in Group 1 and h in Group 2, we see that j is relatively closer to its first-ranked concept (b, with $\hat{Φ} = 5.3$) than h is to l in Group 2 ($\hat{Φ} = 7.1$). This suggests that j is the stronger second, even though h has a higher raw count. Similarly, while the third-ranked concepts in both groups (c in Group 1 and m in Group 2) show different absolute counts, their RIF values ($13.9$ vs. $22.3$) reveal that c is relatively more integrated within its group’s core than m.

5.3.2. Network-Based Representation

To complement the matrix view, Figure 7 shows a network representation of the RIF values.

Node size reflects concept frequency (i.e., rank), while edge thickness is inversely proportional to the RIF: thicker edges indicate stronger conceptual proximity. This layout offers a more intuitive view of centrality, cohesion, and isolation among concepts across and within groups. Taken together, the matrix and network representations illustrate how a scalar decay regime is re-expressed as a relational dominance topology, enabling not only simple pairwise comparisons, but also a structural interpretation of the internal dynamics within and across groups. This provides a robust foundation for exploring questions like Is the second-ranked concept in Group 1 stronger than the second-ranked concept in Group 2? or Which group has a more cohesive conceptual core? In both views, the RIF Index transforms rank-based frequency data into a relational framework capable of capturing subtle differences in importance and proximity among concepts.
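As an illustration of the edge-weighting rule described above, the sketch below turns RIF values into edge widths that are inversely proportional to the RIF. The function name `rif_edges` and the choice of width $= 1/\hat{Φ}$ (rather than some other inverse scaling) are assumptions made for this example; the values are the Group 1 figures quoted in Section 5.3.1.

```python
def rif_edges(labels, first_row):
    """Return (higher_ranked, lower_ranked, width) tuples.

    labels are ordered by rank; first_row[i] is the RIF of rank i + 1
    relative to rank 1. Width is taken as 1 / Phi(s, r), an assumed
    scaling that makes thicker edges correspond to smaller RIF values.
    """
    edges = []
    for s in range(len(labels)):
        for r in range(s + 1, len(labels)):
            phi = first_row[r] / first_row[s]
            edges.append((labels[s], labels[r], 1.0 / phi))
    return edges


edges = rif_edges(["b", "j", "c", "d"], [1.0, 5.3, 13.9, 27.6])
thickest = max(edges, key=lambda edge: edge[2])
print(thickest[:2])  # pair with the smallest RIF among the four concepts
```

The resulting tuples can be passed to any graph-drawing tool; the layout itself is left out because the text does not specify one.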

6. Case Study: Measuring Conceptual Prominence in Social Resilience Literature

Author and index keywords are essential for organizing and retrieving academic content, helping to identify trends, thematic structures, and disciplinary focuses in bibliometric studies [33,34]. Author keywords, chosen by researchers, emphasize study-specific topics but may lack interdisciplinary breadth. In contrast, index keywords assigned by databases like Scopus using controlled vocabularies (e.g., MeSH) standardize terminology across fields [35,36], enhancing cross-disciplinary retrieval but sometimes missing contextual nuance [34].

Previous studies have used keyword analysis to uncover intellectual patterns, identify topics by frequency and citations [37], establish disciplinary indicators [7], and create co-occurrence networks to map knowledge domains [19]. Here, we apply keyword analysis to explore conceptualizations of social resilience, aiming to reveal field structures, interdisciplinary dialogues, and emerging research directions.

Social resilience, conceptualized as communities’ ability to absorb shocks and maintain essential functions during crises, has become increasingly relevant in addressing global challenges like climate change, economic instability, and pandemics [38,39,40]. Research on social resilience helps institutions develop strategies by examining community responses to external stresses. Scholars have mapped the field’s intellectual structure, highlighting key themes such as community resilience (governance, networks, social capital), risk mitigation (exposure, vulnerability), and institutional adaptation (policies, planning) [41,42,43].

Analyzing author and index keywords can reveal how resilience is conceptualized and applied, exposing shifts in focus, disciplinary boundaries, and terminological inconsistencies [34]. This study uses bibliometric techniques, particularly the RIF Index, to identify dominant concepts and measure their prominence over time.

We apply the RIF Index to an empirical dataset on social resilience by analyzing both author and index keywords. Following the methodological workflow outlined in Section 4, we begin with data collection and preprocessing procedures (Section 6.1), where we construct and clean the bibliometric dataset. We then evaluate whether keyword frequencies exhibit power-law behavior using estimation and goodness-of-fit procedures (Section 6.2), providing the statistical foundation for applying the RIF Index.

Section 6.3 presents the RIF scores derived from the fitted distributions, enabling us to quantify conceptual prominence across keywords. Finally, scenario-based analyses in Section 6.4 and Section 6.5 demonstrate two core applications of the RIF framework: intra-group rank dynamics (Case 1) and cross-group rank comparisons (Case 2), each supported by matrix and network visualizations that reveal hierarchical patterns of conceptual dominance.

6.1. Data Collection and Processing

We examined key concepts in social resilience research using bibliometric data from Scopus-indexed publications up to 2023. Our objective was threefold: identify influential keywords, assess their relative prominence through a power-law lens, and explore their thematic evolution using the RIF Index.

We selected Scopus for its broad coverage of Science, Technology, and Medicine (STM; https://www.stmjournals.com/) journals and its robust citation capabilities [44]. An initial query, performed on 2 September 2024, returned 1402 publications (1993–2024) containing the term “social resilience” in the title, abstract, or keywords. After filtering (search query: “social resilience” AND NOT (“mental health” OR pharmaceutical OR aviation)), restricting to English-language journal articles in the social sciences, and validating metadata, we obtained a final dataset of 401 articles (1997–2023). Keyword extraction yielded 1410 unique author keywords and 1293 unique index keywords. As shown in Figure 8, author keywords typically ranged from 4 to 6 per article, while index keywords exhibited greater variability.

To prepare the keyword data for analysis, we standardized the dataset to enhance conceptual coherence, minimize redundancy, and reduce keyword fragmentation. We applied a custom-built thesaurus with over 50 rules to merge synonyms and unify related terms. For instance, we mapped variants like adaptation, psychological to the canonical form psychological adaptation, and consolidated pandemic-related terms such as SARS-CoV-2 and pandemics under COVID-19.

We also filtered out overly generic or irrelevant entries by programmatically removing country names, broad geographical regions (e.g., Europe, Asia), and publication types (e.g., article, review). Additionally, we excluded the term resilience itself to avoid circularity in the analysis [45,46]. This comprehensive cleaning and aggregation produced a refined dataset of 1344 unique author keywords and 1204 unique index keywords, which formed the basis for our power-law analysis and RIF Index estimation.
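The cleaning steps described above can be sketched as follows. This is an illustrative outline, not the authors' implementation: the dictionary `THESAURUS` and the set `STOP_TERMS` contain only the examples named in the text (the actual thesaurus has over 50 rules), and lower-casing terms and counting each keyword once per article are assumptions made for this example.

```python
from collections import Counter

# Hypothetical excerpt of the thesaurus: variant -> canonical form.
THESAURUS = {
    "adaptation, psychological": "psychological adaptation",
    "sars-cov-2": "covid-19",
    "pandemics": "covid-19",
}

# Examples of removed entries named in the text: regions, publication
# types, and the term "resilience" itself.
STOP_TERMS = {"europe", "asia", "article", "review", "resilience"}


def clean_keywords(records):
    """Count canonical keywords over a list of per-article keyword lists."""
    counts = Counter()
    for keywords in records:
        canonical = set()
        for keyword in keywords:
            term = keyword.strip().lower()       # assumed normalization
            term = THESAURUS.get(term, term)     # merge synonyms
            if term and term not in STOP_TERMS:  # drop generic entries
                canonical.add(term)
        counts.update(canonical)                 # once per article (assumed)
    return counts


records = [
    ["SARS-CoV-2", "Pandemics", "Europe"],
    ["COVID-19", "resilience", "Adaptation, psychological"],
]
print(clean_keywords(records).most_common())
```

The resulting frequency table is the input to the rank-ordering and power-law fitting steps.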

6.2. Modeling Keyword Distributions with Power Laws

We estimated the key parameters ($\hat{α}$, ${\hat{x}}_{min}$, and $\hat{C}$) to characterize keyword frequency in social resilience research. Figure 9 visually suggests that both author (a) and index (b) keywords may follow a power-law distribution, as indicated by the straight-line behavior on a log–log scale.

To confirm whether the observed patterns adhere to a power-law distribution, we performed a Kolmogorov–Smirnov (KS) test. The results presented in Table 2 indicate that both author and index keywords exhibit characteristics of heavy-tailed distributions, though with differing degrees of power-law adherence.

For author keywords (n = 1344), $\hat{α} = 2.460$, with a minimum cutoff of ${\hat{x}}_{min} = 2$, covering 11.8% of the data (158 keywords). The corresponding p-value is 0.972, indicating strong plausibility for the power-law model. For index keywords (n = 1204), $\hat{α} = 2.322$ with the same ${\hat{x}}_{min} = 2$, but a higher tail coverage of 28.9% (348 keywords). The p-value of 1.000 indicates that the power-law model cannot be rejected for this set either. These results support power-law behavior in both keyword sets, especially index keywords, consistent with [8]. In our dataset, Climate Change was the most frequent index term (40 occurrences), while Social Resilience led among author keywords (127). Even after excluding single-occurrence terms ($x_{min} = 2$), the pattern remains consistent.
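To make the estimation step concrete, the sketch below computes a maximum-likelihood estimate of the exponent of a discrete power law, $p(x) \propto x^{-α}$ for $x \geq x_{min}$, with the cutoff treated as given. It is a generic illustration and not necessarily the procedure used to produce Table 2: the normalizing constant is approximated by a partial sum plus an integral tail correction, the likelihood is minimized by golden-section search, and the function names, search bounds, and toy data are all assumptions. Selection of ${\hat{x}}_{min}$ and the KS-based p-value are not reproduced here.

```python
import math


def hurwitz_zeta(alpha, xmin, terms=1000):
    """Approximate sum_{k >= xmin} k^(-alpha) for alpha > 1."""
    n = xmin + terms
    head = sum(k ** -alpha for k in range(xmin, n))
    # Tail from n onward: integral plus half of the first omitted term.
    tail = n ** (1.0 - alpha) / (alpha - 1.0) + 0.5 * n ** -alpha
    return head + tail


def fit_alpha(data, xmin, lo=1.05, hi=6.0, iters=60):
    """Return (alpha_hat, n_tail) for the observations with x >= xmin."""
    tail_data = [x for x in data if x >= xmin]
    if not tail_data:
        raise ValueError("no observations at or above xmin")
    n_tail = len(tail_data)
    log_sum = sum(math.log(x) for x in tail_data)

    def neg_log_likelihood(alpha):
        return n_tail * math.log(hurwitz_zeta(alpha, xmin)) + alpha * log_sum

    # Golden-section search on [lo, hi]; the objective is convex in alpha.
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if neg_log_likelihood(c) < neg_log_likelihood(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0, n_tail


# Hypothetical keyword frequencies, for illustration only.
frequencies = [40, 30, 12, 8, 5, 4, 3, 3, 2, 2, 2, 1, 1, 1, 1]
alpha_hat, n_tail = fit_alpha(frequencies, xmin=2)
print(f"alpha_hat = {alpha_hat:.3f} using {n_tail} of {len(frequencies)} values")
```

The share `n_tail / len(frequencies)` plays the role of the tail coverage reported above (for example, 158 of 1344 author keywords is about 11.8%).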

6.3. Quantifying Conceptual Prominence Using the RIF Index

Table 3 presents the top nine concepts with the highest prominence based on the RIF Index for both index and author keywords. The “Probability $P(R = r)$” column represents the probability that a concept appears at rank r within the fitted rank distribution over the modeled support $x(r) \geq {\hat{x}}_{min}$ (i.e., normalized over the ranks included in the power-law fit). These probabilities are computed using the estimated power-law parameters from Table 2, particularly $\hat{α}$ and ${\hat{x}}_{min}$, and provide a mathematical grounding for interpreting each concept’s expected ranking under a heavy-tailed model. The RIF Index quantifies how dominant a concept is within a heavy-tailed distribution, where lower values indicate greater prominence relative to subsequent ranks. Specifically, the table reports $\hat{Φ}(1, r)$, the RIF value comparing each concept at rank r against the top-ranked concept at $s = 1$. For example, an RIF value of 5 means the top-ranked concept is five times more dominant than the compared concept under the modeled distribution.

Among author keywords, COVID-19 is the most prominent, with the highest frequency ($n = 30$), a probability of $P(R = 1) = 0.134$, and an RIF of 1.0, confirming its stable role as the field’s conceptual core. Vulnerability (RIF = 2.7) falls within the moderate category, while community resilience (RIF = 5.6) and climate change (RIF = 9.6) lie in the significant and critical ranges, respectively, indicating progressive declines in relative importance. Notably, urban resilience and social capital, positioned fifth and sixth, exhibit RIF values of 14.9 and 22.3, marking the transition toward the dominant range. Long-tail concepts such as adaptation and migration, with RIF values above 40 and probabilities near 0.003, reflect more specialized or emerging research areas.

Similarly, for index keywords, climate change ranks first with an RIF of 1.0 (stable) and $P(R = 1) = 0.704$, maintaining consistent prominence over lower-ranked terms. Human (RIF = 5.0) is classified as significant, while vulnerability (RIF ≈ 12.8) and decision making (RIF ≈ 25.1) fall into the critical range. Lower-ranked concepts such as COVID-19 (RIF = 41.4) and sustainability (RIF = 88.0) exhibit large dominance gaps, while adaptive management and disaster management reach the highest RIF values (above 170), indicating their position in the long tail of the distribution. These findings demonstrate how the RIF Index (when combined with probability estimates derived from power-law modeling) uncovers hierarchical structures in conceptual prominence and enables systematic comparisons across different keyword attribution strategies in academic publishing.

While Table 3 compares each concept against the top-ranked one ($s = 1$), the visualizations in Figure 10 and Figure 11 generalize the approach by enabling pairwise comparisons between any two ranks where $s \leq r$. This flexibility expands the analytical power of the RIF framework, allowing for more nuanced and context-sensitive assessments of relative dominance across the entire rank spectrum. For example, in the author keyword matrix, comparing vulnerability to COVID-19 yields $Φ(1, 2) = 2.7$ (moderate), meaning that although vulnerability ranks second, its dominance is moderately lower than that of the top concept. Comparing community resilience (rank 3) to vulnerability (rank 2) gives $Φ(2, 3) = 2.0$ (moderate), reflecting a smaller gap. However, comparing community resilience directly to COVID-19 yields $Φ(1, 3) = 5.6$ (significant), indicating a clear decline in prominence.

In the index keyword matrix, comparing decision making to climate change gives $Φ(1, 4) = 25.1$ (critical), while the comparison of human to climate change yields $Φ(1, 2) = 5.0$ (significant). These comparisons help reveal which concepts maintain dominance and which quickly fall into the long tail.

Figure 11 provides a network-based view of directional dominance patterns among keywords, where arrows point from lower-ranked concepts (r) to higher-ranked ones (s), and edge thickness decreases as the RIF value $\hat{Φ}(s, r)$ increases. Thicker edges therefore indicate smaller gaps in relative importance, meaning the lower-ranked concept more closely approximates the dominance of the higher-ranked one, while thinner edges denote sharper hierarchical differences. Node size corresponds to the rank position.

In the author keyword network, COVID-19 stands out as the top-ranked and most dominant concept. It receives multiple incoming connections, most of which are thin, such as from urban resilience ($Φ = 14.9$) and social capital ($Φ = 22.3$), highlighting steep declines in relative importance as measured by RIF. Concepts like vulnerability ($Φ = 2.7$) and community resilience ($Φ = 5.6$) exhibit thicker links, indicating narrower dominance gaps, though still asymmetrical. Despite these differences, COVID-19 also functions as a conceptual hub, likely to co-occur with many other terms due to its frequency and central role. A similar structure emerges in the index keyword network, where climate change is the most prominent node. It receives directional links from decision making ($Φ = 25.1$), COVID-19 ($Φ = 41.4$), and vulnerability ($Φ = 12.8$), most of which are relatively thin. In contrast, the connection from human ($Φ = 5.0$) is thicker, indicating a smaller dominance gap. Like COVID-19, climate change may frequently co-occur with other concepts despite being structurally distant in the RIF hierarchy.

This reveals a useful paradox: thinner RIF edges reflect larger dominance gaps, yet may correspond to more frequent co-mentions in practice, as dominant concepts tend to appear across a wide thematic landscape. Conversely, thicker RIF edges indicate similar levels of importance but do not necessarily imply joint usage. In this way, RIF-based networks offer a reverse interpretive lens to co-occurrence networks, capturing structural prominence rather than co-mention frequency, and thus enabling researchers to distinguish between thematic relevance and conceptual influence.

Rather than replacing co-occurrence analysis, the RIF approach can serve as a complementary tool for identifying conceptual hierarchies and asymmetries in prominence. When used together, RIF and co-occurrence networks offer a richer, multidimensional understanding of thematic landscapes: one grounded in rank-based dominance, and the other in observed linkage patterns.

6.4. Case 1: Intra-Group Rank Analysis of Index Keywords

We compare the relative prominence of concepts within the index keyword group by analyzing their ranked positions. The goal is to identify how lower-ranked concepts differ in importance from higher-ranked ones and whether certain terms maintain influence despite not being at the top. Although the focus is on index keywords, similar patterns are observable among author keywords, as shown in the matrix and network visualizations (Figure 10 and Figure 11, right panels).

While climate change holds the highest rank and shows expectedly smaller dominance gaps with concepts like human (Significant, $Φ = 5.0$) and vulnerability (Critical, $Φ = 12.8$), the analysis becomes more insightful when examining lower-ranked concepts. For instance, decision making (rank 4) shows a Moderate dominance gap with vulnerability ($Φ \approx 2.3$), suggesting thematic proximity in discussions of governance, risk assessment, and adaptive planning. Its gap with COVID-19 ($Φ \approx 1.6$) is narrower still, whereas its relationship with sustainability ($Φ \approx 3.5$) is classified as Significant, indicating an intersection in relative importance that remains substantial but more specialized.

Likewise, based on Table 3, the concept human (rank 2) reveals nuanced contrasts. While it shows a Significant dominance gap relative to climate change, its relationships with adaptive management ($Φ \approx 35$) and sustainability ($Φ \approx 17.6$) fall into the Dominant category, suggesting divergence in conceptual salience despite thematic relevance to resilience and decision-making frameworks.

A similar pattern is observed with sustainability (rank 6), which exhibits high RIF values in the Dominant range when compared to all higher-ranked concepts. Rather than implying marginality, this suggests a more focused or domain-specific presence within the discourse, perhaps associated with long-term policy and development goals. Additionally, lower-ranked keywords with sustained presence but weaker dominance, such as adaptive management or disaster management, may signal emergent or niche research areas gaining conceptual traction within the field.

These gradients in relative importance tend to follow the rank distance. This is visually reinforced in the matrix through the transition from blue to red tones and in the network diagram (Figure 11). Core concepts such as climate change, human, and decision making form a tightly connected hub, while others like vulnerability and sustainability appear more peripheral yet remain thematically relevant.

In sum, the intra-group analysis confirms that conceptual prominence in the index keyword set is shaped not only by frequency but also by relational positioning. The RIF Index captures this duality, revealing both central anchors and peripheral connectors. Moreover, it highlights how lower-ranked concepts may represent emerging topics whose growing presence warrants continued attention. This supports the notion of a rank-dependent, power-law-driven thematic organization.

6.5. Case 2: Cross-Group Rank Comparison Analysis of Author and Index Keywords

We compare RIF Index rankings across index and author keywords to examine how conceptual prominence and hierarchical structure vary depending on attribution type. Figure 10 and Figure 11 (left and right panels) support this analysis by visualizing rank-based dominance and directional relationships across both keyword sets. A key observation lies in how shared concepts shift rank across groups. For instance, COVID-19 is the top-ranked author keyword but appears fifth in the index keyword list. Conversely, climate change leads the index keywords but ranks fourth among author keywords. These positional shifts reflect distinct perspectives: author keywords capture researcher-driven framing, while index keywords result from systematic classification practices.

This contrast is interesting given that social resilience, originally part of the search query, was expected to remain central across both keyword types. However, the analysis reveals divergent patterns in how concepts are labeled: for authors, COVID-19 functions as the conceptual hub, whereas for indexers, climate change emerges as the main thematic core. This underscores important differences in how thematic relevance is interpreted and encoded across attribution systems.

Despite these differences, several core concepts, such as climate change, COVID-19, and vulnerability, appear in the top five of both lists, suggesting broad thematic overlap. However, their relative positions and associated RIF values reveal important structural contrasts. Author keywords foreground normative or emergent discourse (e.g., community-centered resilience), while index keywords emphasize consolidated themes (e.g., environmental and decision-making systems) embedded in indexed metadata.

A notable case is vulnerability, which stays near the top of both lists (rank 2 among author keywords and rank 3 among index keywords). This consistency suggests that vulnerability plays a bridging role (anchored in both authorial intent and indexing logic). Its stable position signals foundational relevance across the resilience literature. However, its relative prominence differs across attribution systems: in author keywords, vulnerability has $Φ(1, 2) = 2.7$, whereas in index keywords it has $Φ(1, 3) = 12.8$, reflecting a substantially larger dominance gap with respect to the top-ranked concept in the indexed structure. This difference indicates that vulnerability sits closer to the top-ranked concept under author-driven labeling than under metadata classification.

Further contrasts emerge when comparing keywords occupying the same rank across groups. At Rank 2, both vulnerability (author) and human (index) appear with comparable frequencies (20 and 30, respectively), yet their RIF values differ ($Φ = 2.7$ for vulnerability and $Φ = 5.0$ for human). This indicates that, despite similar frequencies, human exhibits a steeper dominance gap within the index keyword structure. At Rank 6, a similar contrast appears: while social capital in the author group has an RIF of 22.3, sustainability in the index group records a value of 88.0. Although both occupy the sixth position in their respective lists, the lower RIF of social capital implies a narrower drop-off in relative importance and greater structural integration than its counterpart.
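The same-rank comparison described above reduces to comparing, for a given rank, each group's RIF relative to its own top-ranked concept. The sketch below illustrates this with the values quoted in the text; the dictionary layout and the function name `closer_to_top` are assumptions made for the example.

```python
# RIF relative to rank 1, keyed by rank, for the two keyword groups.
author = {2: ("vulnerability", 2.7), 6: ("social capital", 22.3)}
index = {2: ("human", 5.0), 6: ("sustainability", 88.0)}


def closer_to_top(rank, group_a, group_b):
    """Return the concept with the smaller RIF at `rank`, and the ratio
    of the larger RIF to the smaller one."""
    label_a, phi_a = group_a[rank]
    label_b, phi_b = group_b[rank]
    if phi_a <= phi_b:
        return label_a, phi_b / phi_a
    return label_b, phi_a / phi_b


for rank in (2, 6):
    label, ratio = closer_to_top(rank, author, index)
    print(rank, label, round(ratio, 2))
```

As noted in Section 5.2, such a comparison is only meaningful when the power-law fit is supported in both groups.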

The RIF matrix (Figure 10) reinforces these patterns. In the author keyword matrix, RIF values show a wider spread, with high dominance gaps (e.g., $Φ > 40$) between top-ranked and lower-ranked terms. This broader gradient may reflect the diversity and specialization of author-driven contributions. In contrast, the index keyword matrix presents a more cohesive structure: top-ranked terms such as climate change, decision making, and human exhibit smaller dominance gaps, suggesting tighter conceptual integration within the indexed literature.

The network view (Figure 11) supports these distinctions. For author keywords, COVID-19 functions as a central hub, receiving links from closely related terms like vulnerability and community resilience. In the index keyword network, a dual-core structure emerges, dominated by climate change and decision making, with strong ties to human, vulnerability, and COVID-19. These configurations reflect differing organizing principles: author keywords emphasize narrative framing and emergent discourse, whereas index keywords exhibit more structured, hierarchical relationships.

Overall, the cross-group comparison reveals how similar thematic domains, such as resilience, climate, and vulnerability, are organized differently depending on the attribution source. Author keywords capture conceptual intent and field emergence, while index keywords reflect formalized classification. The RIF Index provides a unifying metric to evaluate these differences in relative importance, enabling systematic cross-group comparisons and revealing conceptual hierarchies embedded in both narrative and metadata structures.

7. Discussion and Limitations

The findings of this study offer several important implications. For researchers, the RIF Index serves as a valuable tool for identifying foundational and emerging concepts that shape academic discourse and guide future research. For policymakers, insights into dominant themes in social resilience research can inform more targeted interventions, especially in areas such as climate change, disaster preparedness, and community adaptation. Practitioners can similarly benefit by aligning resilience-building strategies with key themes like adaptive management and disaster management, which hold structural prominence in the literature.

Beyond the domain of social resilience, the RIF Index demonstrates strong potential for application in other fields. For example, analyzing media coverage on migration through this lens could help identify dominant frames and influential narratives, offering new insights into public discourse and agenda-setting dynamics [47]. This flexibility underscores the RIF Index’s utility as a generalizable method for thematic analysis across disciplines.

The results further indicated that keyword frequencies, particularly among index terms, follow a power-law distribution, where a few concepts dominate while most appear infrequently. This aligns with prior findings in academic publishing research [8]. Goodness-of-fit tests did not reject the power-law model, supporting the conclusion that discourse around social resilience is anchored in a core set of topics. The use of a controlled thesaurus to standardize terminology strengthened the analysis, revealing strong conceptual connections among central terms such as vulnerability, COVID-19, and sustainability, each playing a crucial role in resilience research [48].

Unlike citation-based metrics (e.g., the h-index or g-index), which assess scholarly impact at the author level [49], the RIF Index evaluates the relative prominence of concepts within a rank-based distribution. This makes it a powerful complement to existing tools for examining the structure of academic and public discourse. Future research can extend this approach to explore evolving thematic landscapes across fields such as policy analysis, media studies, and scientometrics. Importantly, while the scaling exponent summarizes global concentration within the distribution, the RIF Index operationalizes this global property into a full pairwise dominance structure. In other words, whereas the power-law exponent provides a single summary parameter of inequality, the RIF transforms this scalar decay regime into explicit relational comparisons between ranked elements. This relational lifting cannot be directly inferred from the exponent alone and constitutes a key methodological contribution of the present framework.

While the RIF Index is not a direct measure of co-occurrence or semantic similarity, it provides a complementary way to explore conceptual associations. In co-occurrence networks, edge thickness typically reflects the frequency of joint mentions (stronger connections imply more frequent co-appearance). In contrast, RIF networks represent hierarchical gaps in relative importance between pairs of concepts. Specifically, lower RIF values (thicker edges) indicate that the lower-ranked term closely approximates the prominence of the higher-ranked one. This structural proximity may suggest, but does not guarantee, a lower likelihood of co-mention, since similarly ranked terms are often topically distinct.

Conversely, higher RIF values (thinner edges) reflect steep dominance gaps, where the lower-ranked concept is much less prominent. These uneven relationships may still show frequent co-occurrence, especially when dominant concepts like COVID-19 or climate change appear across a wide thematic spectrum. In these cases, a term may exhibit thin RIF edges with many others due to its high centrality, yet co-occur frequently in practice.

From this exploratory perspective, RIF and co-occurrence networks may yield inverted interpretations of edge thickness. While RIF edges reflect structural proximity within a ranked distribution, co-occurrence edges capture empirical patterns of joint usage. A thick RIF edge may imply similar importance between two concepts without indicating frequent co-appearance, whereas a thin RIF edge may connect dominant terms that frequently co-occur. These contrasting logics are not contradictory but complementary, each offering insight into different dimensions of conceptual relationships.
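The edge logic of a RIF network described above can be sketched as follows. Edges run from the higher-ranked to the lower-ranked term, and lower RIF values receive thicker edges. The width rule 1/Φ, the function names, and the pairing of the Table 1 cut points (3, 6, 9) with the tier labels from the Figure 2 caption are assumptions made for illustration only.

```python
def rif_category(phi):
    """Map a RIF value to a qualitative tier.

    The cut points (3, 6, 9) are the interval limits of Table 1; pairing them
    with the labels from the Figure 2 caption is an assumption of this sketch.
    """
    if phi <= 3:
        return "Moderate"
    if phi <= 6:
        return "Significant"
    if phi <= 9:
        return "Critical"
    return "Dominant"


def rif_edges(ranked_terms, theta_hat):
    """Directed edges (higher-ranked -> lower-ranked) with an inverse-RIF width.

    ranked_terms must already be ordered from rank 1 downwards.
    """
    edges = []
    for s, source in enumerate(ranked_terms, start=1):
        for r in range(s + 1, len(ranked_terms) + 1):
            phi = (r / s) ** theta_hat
            edges.append({
                "source": source,
                "target": ranked_terms[r - 1],
                "rif": phi,
                "width": 1.0 / phi,  # hypothetical scaling: lower RIF -> thicker edge
                "category": rif_category(phi),
            })
    return edges
```

Because the width depends only on ranks and the fitted exponent, it carries no information about joint usage; a co-occurrence network built from the same terms would need document-level counts that this sketch never touches.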

Taken together, these insights position the RIF Index as a valuable complement to existing tools for understanding conceptual structure in research domains. RIF-based networks can serve as structural priors, mathematically grounded estimates of conceptual prominence between keyword pairs, that help guide or contextualize the analysis of co-mention patterns or semantic similarity. While co-occurrence networks highlight surface-level associations, RIF networks uncover latent thematic hierarchies and prominence tiers. Used together, they provide a more nuanced understanding of how concepts are organized and interconnected within a research field.

While the RIF Index provides a flexible and generalizable framework for analyzing hierarchical prominence in ranked distributions, several methodological limitations should be acknowledged. First, the RIF Index is designed for discrete rank-frequency data and is not directly applicable to continuous distributions or datasets without a clear rank structure. Second, RIF values range from 1 to infinity, which can make interpretation unintuitive without normalization. Although we proposed qualitative categories (e.g., Stable, Significant, Dominant) to aid interpretation, these thresholds are heuristically defined and not derived from a formal calibration method. Future research could explore statistically grounded calibration strategies, such as percentile-based thresholds or model-based confidence bands, to formalize these interpretive tiers.

Third, the RIF framework assumes that the underlying frequency distribution follows a power-law model. Because RIF values are algebraically derived from the estimated decay exponent, model misspecification or unstable parameter estimation may propagate directly into dominance estimates. While this assumption holds in many domains, it should be checked through model-fit validation before RIF values are interpreted. Fourth, RIF quantifies dominance gaps between ranked elements but does not account for semantic similarity or co-occurrence frequency. As such, concepts with similar RIF values may be thematically unrelated, and concepts with large RIF differences may frequently co-appear in practice. Finally, because the RIF is calculated relative to a chosen reference rank (typically the top-ranked item), all other values are dependent on this choice, and interpretation can vary if the point of reference shifts.
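Both the third and the final limitation can be read off the closed form $\hat{Φ}(s, r) = (r/s)^{\hat{θ}}$ used in Algorithm A1. The short derivation below is a first-order sketch; $\Delta\hat{θ}$ denotes a hypothetical estimation error in the exponent and is not a quantity reported in the paper.

```latex
\frac{\partial \ln \hat{Φ}(s, r)}{\partial \hat{θ}} = \ln\frac{r}{s}
\quad\Longrightarrow\quad
\frac{\Delta \hat{Φ}(s, r)}{\hat{Φ}(s, r)} \approx \ln\!\left(\frac{r}{s}\right) \Delta\hat{θ},
\qquad
\hat{Φ}(s, r) = \frac{\hat{Φ}(1, r)}{\hat{Φ}(1, s)} .
```

The first relation suggests that a given error in $\hat{θ}$ is amplified more strongly for widely separated ranks. The second shows that moving the reference from rank 1 to rank s divides every value by the constant $\hat{Φ}(1, s)$, so ratios between entries are preserved even though their absolute magnitudes, and therefore any fixed interpretive thresholds, change.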

In the context of this study, additional limitations arise. The analysis relied exclusively on English-language journal articles indexed in Scopus, potentially omitting relevant literature from other languages, disciplines, or repositories. The data were treated as static, capturing a snapshot of conceptual prominence without considering how keyword hierarchies shift over time. Moreover, the focus on a single disciplinary field (social resilience) means the results may not generalize to more interdisciplinary or fragmented research domains.

8. Conclusions and Future Research

This study introduces the RIF Index as a derived, rank-based functional of the fitted discrete power-law rank distribution for measuring the prominence of concepts in academic literature. Unlike traditional metrics that rely solely on frequency, RIF provides stronger discrimination in the tail, where frequency-based metrics often fail to separate concepts with similar counts but different relative dominance. By applying the RIF to both synthetic and real-world data on social resilience, the study highlighted how terms like COVID-19, climate change, and vulnerability can hold structural importance even if they are not always the most frequent.

This work makes three main contributions. First, it offers a formal approach to modeling rank-frequency relationships based on power-law behavior. Second, it proposes the RIF Index as a way to quantify how much more dominant one concept is than another. Unlike the scaling exponent alone, which summarizes inequality globally, the RIF enables explicit relational comparisons across the entire rank spectrum. Third, it provides two visual tools, the RIF matrix and the RIF network, that help researchers see both local and overall patterns of importance. Together, these elements support a structured and scalable way to study how research topics are organized, how they compare, and how they evolve.

The findings confirmed that index keywords followed a power-law distribution, supporting the use of the RIF model. Comparing author and index keywords revealed differences in how concepts are emphasized depending on who defines them. Importantly, the RIF Index helped highlight differences in relative importance even when two concepts shared similar frequencies or ranks, answering subtle questions like which “second-ranked” term is more prominent. Additionally, the method allowed for intra-group and cross-group comparisons, helping identify dominant concepts, groups of equally prominent lower-ranked terms, and potentially emerging topics in the long tail (such as social capital, sustainability, and adaptive management).

Future work should improve the RIF method, particularly by applying normalization strategies (e.g., rescaling to 0–1) to enhance interpretability and cross-study comparisons. Researchers should also test whether power-law assumptions hold across different fields, or if other models (such as log-normal or exponential) provide better fits. Tracking changes in ranks over time could help show how topic importance evolves, while combining RIF with other measures, like the h-index or citation networks, could offer a more complete view of scientific impact.
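One simple way to rescale RIF values to the unit interval, offered here only as an illustration and not as the normalization the authors intend, is the transform 1 − 1/Φ. It maps Φ = 1 to 0 and approaches 1 as Φ grows. The function name below is hypothetical.

```python
def normalize_rif(phi):
    """Rescale a RIF value from [1, inf) to [0, 1) via 1 - 1/phi.

    Illustrative choice only: 0 means equal prominence, and values near 1
    mean the higher-ranked element strongly dominates.
    """
    if phi < 1:
        raise ValueError("RIF values are expected to be >= 1")
    return 1.0 - 1.0 / phi


if __name__ == "__main__":
    # RIF values for ranks 1, 2 and 9 of the author keywords in Table 3.
    for phi in (1.000, 2.735, 44.667):
        print(phi, round(normalize_rif(phi), 3))
```

Since the tabulated RIF values are ratios of rank probabilities, 1/Φ is the probability of the lower-ranked element relative to the higher-ranked one, which gives this particular rescaling a direct reading. Other bounded transforms would serve equally well as candidates.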

Beyond refining the method itself, the RIF Index should be tested in other research areas and databases. Expanding to multilingual collections or platforms like Web of Science, Google Scholar, or PubMed would help assess its generalizability. Applying RIF to diverse domains such as public health, artificial intelligence, or environmental policy could also demonstrate its value in dynamic or interdisciplinary contexts. Case studies that apply the RIF to track thematic shifts or guide research planning could further show how it supports practical decision-making in science policy and bibliometric evaluation.

In summary, the RIF Index offers a flexible, interpretable, and scalable approach to analyzing conceptual importance in ranked data. By going beyond frequency-based methods, it provides deeper insight into how key topics shape and structure academic research. Through further development and broader application, the RIF framework can help scholars, analysts, and decision-makers better understand evolving knowledge systems across disciplines.

Author Contributions

Conceptualization, B.L. and J.P.; methodology, B.L.; software, B.L. and H.L.; validation, E.F., J.P. and K.P.; formal analysis, B.L., H.L. and K.P.; investigation, J.P. and E.F.; resources, B.L.; data curation, B.L.; writing—original draft preparation, B.L.; writing—review and editing, B.L. and H.L.; visualization, B.L. and H.L.; supervision, E.F., J.P. and K.P.; project administration, B.L.; funding acquisition, J.P. and E.F. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Office of Naval Research through the Minerva Research Initiative (Grant No. N000141912624) and the U.S. Department of Education (Grant No. P116S210003).

Data Availability Statement

The datasets generated and analyzed during the current study are available in the RIF Index repository at https://github.com/ODU-Storymodelers/RIF-Index/blob/main/data/papers.csv (accessed on 9 March 2026). The repository is regularly updated to include the most recent data used in the study.

Acknowledgments

The authors gratefully acknowledge the academic and technical support provided by colleagues from the Virginia Modeling, Analysis and Simulation Center (VMASC) at Old Dominion University and from the Universidad del Norte. Their insightful discussions, methodological feedback, and institutional support contributed to the refinement of the theoretical framework and the interpretation of the results presented in this study.

Conflicts of Interest

The authors declare that they have no competing interests.

Appendix A. Proof of Theorem 1

Proof of Theorem 1.

Given that $x(r) \geq x(r + 1)$ for all $r \geq 1$, and assuming $x(r)$ follows a power-law distribution, we have $x(r) = \hat{δ} \cdot r^{\hat{β}}$, where $\hat{δ} > 0$ is a constant of proportionality and $\hat{β} < 0$ is the power-law exponent.

To estimate the distribution of the rank R, we use the relationship between the probabilities $\hat{P}(R = r)$ and $\hat{P}(X = x(r))$. Since these probabilities are inversely proportional, we have:

$$\hat{P}(R = r) = \frac{\hat{B}}{\hat{P}(X = x(r))}, \tag{A1}$$

where $\hat{B}$ is a proportionality constant. Substituting $x(r) = \hat{δ} r^{\hat{β}}$ into $\hat{P}(X = x(r)) = \hat{C} \, x(r)^{-\hat{α}}$, where $\hat{C}$ is the normalization constant, we obtain:

$$\hat{P}(R = r) = \frac{\hat{B}}{\hat{C} (\hat{δ} r^{\hat{β}})^{-\hat{α}}} = \frac{\hat{B} \cdot \hat{δ}^{\hat{α}}}{\hat{C}} \cdot r^{\hat{α} \hat{β}}. \tag{A2}$$

Defining $\hat{θ} := -\hat{α} \cdot \hat{β} > 0$, we get $\hat{P}(R = r) = \hat{A} \cdot r^{-\hat{θ}}$, where $\hat{A}$ is the normalization constant defined in Theorem 1. This confirms that $\hat{P}(R = r)$ follows a power-law distribution with exponent $\hat{θ} > 0$, completing the proof. □
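The resulting form $\hat{P}(R = r) = \hat{A} \cdot r^{-\hat{θ}}$ can be checked numerically. The sketch below assumes a finite support of `n_ranks` ranks and takes the normalization constant as the reciprocal of the finite sum of $r^{-\hat{θ}}$; Theorem 1 may define $\hat{A}$ differently, so this is an illustration of the functional form rather than a reproduction of the theorem.

```python
def rank_pmf(theta_hat, n_ranks):
    """P(R = r) = A * r ** (-theta_hat) on the finite support r = 1..n_ranks.

    Assumption: A is 1 / sum(r ** -theta_hat) over this finite support; the
    paper's Theorem 1 may define the normalization constant differently.
    """
    weights = [r ** (-theta_hat) for r in range(1, n_ranks + 1)]
    a_hat = 1.0 / sum(weights)
    return [a_hat * w for w in weights]


if __name__ == "__main__":
    theta_hat = 1.5  # arbitrary illustrative exponent
    pmf = rank_pmf(theta_hat, 50)
    print("sum of probabilities:", sum(pmf))
    # The normalization constant cancels in the ratio P(R = s) / P(R = r).
    s, r = 2, 5
    print(pmf[s - 1] / pmf[r - 1], (r / s) ** theta_hat)
```

The cancellation of the normalization constant in the ratio is what allows the RIF Index to be written in terms of ranks and the exponent alone.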

Appendix B. Proof of Theorem 2

Proof of Theorem 2.

This proof involves showing that the bounds on $\hat{Φ}(s, r)$ translate directly to bounds on $\hat{θ}(r)$ through the relationship $\hat{Φ}(s, r) = p^{\hat{θ}}$. Taking natural logarithms yields $\ln \hat{Φ}(s, r) = \hat{θ} \ln p$, so inequalities on $\hat{Φ}$ correspond directly to inequalities on $\hat{θ}$, completing the proof. □
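The step from bounds on $\hat{Φ}$ to bounds on $\hat{θ}$ can be written out explicitly. The sketch below assumes $p > 1$ (for example the rank ratio $R_r / R_s$ used in Algorithm A1), so that $\ln p > 0$ and dividing by it preserves the direction of the inequalities. The bounds a and b are illustrative symbols, not notation from the theorem.

```latex
a < \hat{Φ}(s, r) \leq b
\;\Longleftrightarrow\;
\ln a < \hat{θ} \ln p \leq \ln b
\;\Longleftrightarrow\;
\frac{\ln a}{\ln p} < \hat{θ} \leq \frac{\ln b}{\ln p},
\qquad p > 1,\; 0 < a < b .
```

For instance, with $p = 2$ and the interval $3 < \hat{Φ} \leq 6$, this gives $\ln 3 / \ln 2 \approx 1.585 < \hat{θ} \leq \ln 6 / \ln 2 \approx 2.585$, which is consistent with the rank-2 row of Table 1 (1.590–2.585).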

Appendix C. Pseudocode and Implementation

Algorithm A1 summarizes the procedure for computing the RIF Index.

Algorithm A1 Computation of the RIF Index

Require: A set of elements $F = \{f_{1}, \dots, f_{n}\}$ with observed counts $\mathrm{count}(f_{i})$
Ensure: Ranks $R_{i}$, estimated slope $\hat{θ}$, p-value, and RIF Index $Φ(s, r)$

1: $T \leftarrow \sum_{i} \mathrm{count}(f_{i})$ ▹ Total count across all elements
2: for $i = 1, \dots, n$ do
3:     $\mathrm{rel\_freq}(f_{i}) \leftarrow \mathrm{count}(f_{i}) / T$ ▹ Compute relative frequencies
4: end for
5: Sort F in descending order of $\mathrm{rel\_freq}(f_{i})$; assign ranks $R_{i}$ accordingly
6: for $i = 1, \dots, n$ do
7:     $x_{i} \leftarrow \log_{10}(R_{i})$, $y_{i} \leftarrow \log_{10}(\mathrm{rel\_freq}(f_{i}))$
8: end for
9: Fit linear model $y \sim x$ (i.e., $\log_{10}(\mathrm{freq}) \sim \log_{10}(\mathrm{rank})$); estimate slope $\hat{θ}$; compute p-value for fit
10: if $p < 0.05$ then
11:     abort: data do not follow power-law; RIF Index is not computed
12: else
13:     for all pairs $(f_{s}, f_{r})$ such that $R_{s} < R_{r}$ do
14:         $Φ(s, r) \leftarrow (R_{r} / R_{s})^{\hat{θ}}$
15:     end for
16: end if

First, raw counts are normalized into relative frequencies (Lines 1–4), followed by sorting the elements in descending order and assigning ranks accordingly (Line 5). The ranks and their corresponding frequencies are then transformed into log–log space (Lines 6–8).

In Line 9, a linear model is fitted to these transformed data using ordinary least squares (OLS) to estimate the slope $\hat{θ}$, and a p-value is computed to test the goodness-of-fit of the power-law model. If the resulting p-value satisfies $p \geq 0.05$, we proceed to compute the RIF Index as $Φ(s, r) = (R_{r} / R_{s})^{\hat{θ}}$ for all valid rank pairs where $R_{s} < R_{r}$ (Lines 10–14).
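The steps of Algorithm A1 can be sketched in plain Python as follows. Two points are assumptions of this sketch rather than statements from the paper. First, the goodness-of-fit p-value is passed in by the caller, because the algorithm does not specify which test produces it. Second, the exponent is taken as the negative of the fitted log-log slope, since the slope of a decaying rank-frequency curve is negative while the framework requires $\hat{θ} > 0$. The names `ols_slope` and `rif_index` are hypothetical.

```python
import math


def ols_slope(xs, ys):
    """Closed-form ordinary least squares slope of y regressed on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def rif_index(counts, p_value, alpha=0.05):
    """Sketch of Algorithm A1.

    counts  : dict mapping element -> positive observed count
    p_value : goodness-of-fit p-value supplied by the caller (assumption:
              it comes from a separate power-law test, not computed here)
    Returns (ranks, theta_hat, phi); raises ValueError when p_value < alpha.
    """
    if len(counts) < 2 or min(counts.values()) <= 0:
        raise ValueError("need at least two elements with positive counts")
    total = sum(counts.values())                          # Line 1
    rel_freq = {f: c / total for f, c in counts.items()}  # Lines 2-4
    # Line 5: descending sort; tied elements keep their input order.
    ordered = sorted(rel_freq, key=rel_freq.get, reverse=True)
    ranks = {f: i + 1 for i, f in enumerate(ordered)}
    # Lines 6-8: transform to log-log space.
    xs = [math.log10(ranks[f]) for f in ordered]
    ys = [math.log10(rel_freq[f]) for f in ordered]
    # Line 9: negate the fitted slope so that theta_hat > 0 (assumption).
    theta_hat = -ols_slope(xs, ys)
    if p_value < alpha:                                   # Lines 10-11
        raise ValueError("power-law fit rejected; RIF Index is not computed")
    phi = {}
    for s in ordered:                                     # Lines 13-15
        for r in ordered:
            if ranks[s] < ranks[r]:
                phi[(s, r)] = (ranks[r] / ranks[s]) ** theta_hat
    return ranks, theta_hat, phi
```

For counts that decay exactly as 1/rank, such as `{"a": 1200, "b": 600, "c": 400, "d": 300, "e": 240}`, the fitted exponent is close to 1 and `phi[("a", "b")]` is close to 2. Tied counts receive distinct consecutive ranks, mirroring Table 3, where keywords with equal frequency occupy different ranks.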

References

1. Jiang, J.J.; Yamada, K.; Takayasu, H.; Takayasu, M. Scale-dependent power law properties in hashtag usage time series of Weibo. Sci. Rep. 2023, 13, 22298.
2. Blasius, B. Power-law distribution in the number of confirmed COVID-19 cases. Chaos Interdiscip. J. Nonlinear Sci. 2020, 30, 093123.
3. Chen, X.; Pan, Y.; Luo, B. Research on power-law distribution of long-tail data and its application to tourism recommendation. Ind. Manag. Data Syst. 2021, 121, 1268–1286.
4. Fogang, C.; Pelap, F.; Tanekou, G.B.; Kengne, R.; Kagho, L.Y.; Fozing, T.F.; Nbendjo, R.N.; Koumetio, F. Earthquake dynamic induced by the magma up flow with fractional power law and fractional-order friction. Ann. Geophys. 2021, 64, SE101.
5. Jiang, B.; de Rijke, C. A power-law-based approach to mapping COVID-19 cases in the United States. Geo-Spat. Inf. Sci. 2021, 24, 333–339.
6. Sethna, J.P. Power laws in physics. Nat. Rev. Phys. 2022, 4, 501–503.
7. Zhao, R.; Zhu, W.; Huang, H.; Chen, W. Social mediametrics: The mention laws and patterns of scientific literature. Libr. Hi Tech 2025, 43, 377–397.
8. Bray, N.J.; Gilstrap, D.L.; Scalfani, V.F. The Power Law and Emerging and Senior Scholar Publication Patterns. Innov. High. Educ. 2022, 47, 989–1005.
9. Ahmad, M.; Batcha, D.M.S.; Jahina, S.R. Testing Lotka’s Law and Pattern of Author Productivity in the Scholarly Publications of Artificial Intelligence. arXiv 2021, arXiv:2102.09182.
10. Nuermaimaiti, R.; Bogachev, L.V.; Voss, J. A generalized power law model of citations. In Proceedings of the 18th International Conference on Scientometrics and Informetrics, ISSI 2021; International Society for Scientometrics and Informetrics: Berlin, Germany, 2021.
11. Du, W.; Li, Z.; Xie, Z. A modified LSTM network to predict the citation counts of papers. J. Inf. Sci. 2024, 50, 894–909.
12. García-Villar, C.; García-Santos, J. Bibliometric indicators to evaluate scientific activity. Radiol. (Engl. Ed.) 2021, 63, 228–235.
13. Bihari, A.; Tripathi, S.; Deepak, A. A review on h-index and its alternative indices. J. Inf. Sci. 2023, 49, 624–665.
14. Clauset, A.; Shalizi, C.R.; Newman, M.E. Power-Law Distributions in Empirical Data. SIAM Rev. 2009, 51, 661–703.
15. Gupta, S.; Singh, V.K. Distributional characteristics of Dimensions concepts: An Empirical Analysis using Zipf’s law. Scientometrics 2024, 129, 1037–1053.
16. Banshal, S.K.; Gupta, S.; Lathabai, H.H.; Singh, V.K. Power Laws in altmetrics: An empirical analysis. J. Inf. 2022, 16, 101309.
17. Chen, B.; Ma, C.; Krajewski, W.F.; Wang, P.; Ren, F. Logarithmic transformation and peak-discharge power-law analysis. Hydrol. Res. 2020, 51, 65–76.
18. Navas-Portella, V.; González, A.; Serra, I.; Vives, E.; Corral, A. Universality of power-law exponents by means of maximum-likelihood estimation. Phys. Rev. E 2019, 100, 062106.
19. Lu, W.; Huang, S.; Yang, J.; Bu, Y.; Cheng, Q.; Huang, Y. Detecting research topic trends by author-defined keyword frequency. Inf. Process. Manag. 2021, 58, 102594.
20. Zhao, F.; Zhang, Y.; Lu, J.; Shai, O. Measuring academic influence using heterogeneous author-citation networks. Scientometrics 2019, 118, 1119–1140.
21. della Briotta Parolo, P.; Kujala, R.; Kaski, K.; Kivelä, M. Tracking the cumulative knowledge spreading in a comprehensive citation network. Phys. Rev. Res. 2020, 2, 013181.
22. Krawczyk, M.J.; Libirt, M.; Malarz, K. Analysis of scientific cooperation at the international and intercontinental level. Scientometrics 2024, 129, 4983–5002.
23. Chen, G.; Chen, S.; Chen, Z.; Xiao, L.; Hu, J. How much data is sufficient for reliable bibliometric domain analysis? A multi-scenario experimental approach. Scientometrics 2025, 130, 2923–2946.
24. Benatti, A.; de Arruda, H.F.; Silva, F.N.; Comin, C.H.; da Fontoura Costa, L. On the stability of citation networks. Phys. A Stat. Mech. Its Appl. 2023, 610, 128399.
25. Mosleh, M.; Roshani, S.; Coccia, M. Scientific laws of research funding to support citations and diffusion of knowledge in life science. Scientometrics 2022, 127, 1931–1951.
26. Ali, M.J. Questioning the Impact of the Impact Factor. A Brief Review and Future Directions. Semin. Ophthalmol. 2022, 37, 91–96.
27. Alter, N.; Daiem, M.; Pontell, M.E.; Galdyn, I.; Golinko, M.; Perdikis, G.; Lineaweaver, W. Limitations of Academic Bibliometric Indices: The Need for More Comprehensive Metrics. Ann. Plast. Surg. 2025, 95, 603–606.
28. Amiri, M.R.; Saberi, M.K.; Ouchi, A.; Mokhtari, H.; Barkhan, S. Publication Performance and Trends in Altmetrics: A Bibliometric Analysis and Visualization. Int. J. Inf. Sci. Manag. 2023, 21, 97–117.
29. Gupta, M.; Parvathy; Givi, J.; Dey, M.; Kent Baker, H.; Das, G. A bibliometric analysis on gift giving. Psychol. Mark. 2023, 40, 629–642.
30. Zaghloul, G. A note on the zeros of generalized Hurwitz zeta functions. J. Number Theory 2019, 196, 197–204.
31. Zeimbekakis, A.; Schifano, E.D.; Yan, J. On Misuses of the Kolmogorov–Smirnov Test for One-Sample Goodness-of-Fit. Am. Stat. 2024, 78, 481–487.
32. Gillespie, C.S. Fitting Heavy Tailed Distributions: The poweRlaw Package. J. Stat. Softw. 2015, 64, 1–16.
33. Corrin, L.; Thompson, K.; Hwang, G.J.; Lodge, J.M. The importance of choosing the right keywords for educational technology publications. Australas. J. Educ. Technol. 2022, 38, 1–8.
34. Sugimoto, C.R.; Work, S.; Larivière, V.; Haustein, S. Scholarly use of social media and altmetrics: A review of the literature. J. Assoc. Inf. Sci. Technol. 2017, 68, 2037–2062.
35. Kumpulainen, M.; Seppänen, M. Combining Web of Science and Scopus datasets in citation-based literature study. Scientometrics 2022, 127, 5613–5631.
36. Culbert, J.H.; Hobert, A.; Jahn, N.; Haupka, N.; Schmidt, M.; Donner, P.; Mayr, P. Reference coverage analysis of OpenAlex compared to Web of Science and Scopus. Scientometrics 2025, 130, 2475–2492.
37. Narong, D.K.; Hallinger, P. A Keyword Co-Occurrence Analysis of Research on Service Learning: Conceptual Foci and Emerging Research Trends. Educ. Sci. 2023, 13, 339.
38. Moya, J.; Goenechea, M. An Approach to the Unified Conceptualization, Definition, and Characterization of Social Resilience. Int. J. Environ. Res. Public Health 2022, 19, 5746.
39. Schweitzer, F.; Andres, G.; Casiraghi, G.; Gote, C.; Roller, R.; Scholtes, I.; Vaccario, G.; Zingg, C. Modeling social resilience: Questions, answers, open problems. Adv. Complex Syst. 2022, 25, 2250014.
40. Kuhlicke, C.; de Brito, M.M.; Bartkowski, B.; Botzen, W.; Doğulu, C.; Han, S.; Hudson, P.; Karanci, A.N.; Klassert, C.J.; Otto, D.; et al. Spinning in circles? A systematic review on the role of theory in social vulnerability, resilience and adaptation research. Glob. Environ. Change 2023, 80, 102672.
41. Liu, Y.; Cao, L.; Yang, D.; Anderson, B.C. How social capital influences community resilience management development. Environ. Sci. Policy 2022, 136, 642–651.
42. Cafer, A.; Green, J.; Goreham, G. A community resilience framework for community development practitioners building equity and adaptive capacity. In Community Development for Times of Crisis; Routledge: London, UK, 2022; pp. 56–74.
43. Bernita, B. Social Networks and Risk Management: The Role of Social Capital in Community Resilience in Indonesia. Int. J. Innov. Think. 2024, 1, 1–12.
44. bin Ali, N.; Tanveer, B. A Comparison of Citation Sources for Reference and Citation-Based Search in Systematic Literature Reviews. E-Inform. Softw. Eng. J. 2022, 16, 220106.
45. Gandasari, D.; Tjahjana, D.; Dwidienawati, D.; Sugiarto, M. Bibliometric and visualized analysis of social network analysis research on Scopus databases and VOSviewer. Cogent Bus. Manag. 2024, 11, 2376899.
46. Thelwall, M.; Pinfield, S. The accuracy of field classifications for journals in Scopus. Scientometrics 2024, 129, 1097–1117.
47. McCann, K.; Sienkiewicz, M.; Zard, M. The Role of Media Narratives in Shaping Public Opinion Toward Refugees: A Comparative Analysis; International Organization for Migration: Geneva, Switzerland, 2023.
48. Jumanboyevna, A.M. Thesaurus – A system of terminographic mechanisms. West. Eur. J. Linguist. Educ. 2025, 3, 24–28.
49. Ahmed, B.; Wang, L.; Mustafa, G.; Afzal, M.T.; Akhunzada, A. Evaluating the Effectiveness of Author-Count Based Metrics in Measuring Scientific Contributions. IEEE Access 2023, 11, 101710–101726.

Figure 1. Distribution of: (a) RIF Index and (b) Probability vs. Rank, for different values of $\hat{θ}$; RIF Index vs. (c) $\hat{P}(R = r)$; and (d) $\hat{P}(R = 1)$. These plots illustrate how the concentration of importance changes across ranks in power-law systems.

Figure 2. Visualization of the relationship between $\hat{θ}$ and $\hat{Φ}(r)$ for different rank values r. Curves are color-coded by rank to illustrate how the estimated parameter $\hat{θ}$ varies across RIF index regions (Moderate, Significant, Critical, and Dominant).

Figure 3. General workflow for applying the RIF Index. Green ellipses indicate the start and end of the process. Blue rectangles represent processing steps, yellow diamonds denote decision points, and red boxes indicate termination due to failed conditions.

Figure 4. Application Case 1: Intra-Group Rank Comparison. (a) Unordered frequencies, (b) ranked frequencies, (c) power-law fit with RIF-based comparisons. The letters (a–z) represent categorical elements used to illustrate the rank–frequency distribution. The RIF allows precise comparison between ranks, even when visual differences in frequency become less pronounced at lower ranks.

Figure 5. Application Case 2: Cross-Group Rank Comparison. (a) Raw frequencies for Group A and B, (b) ranked distributions with fitted power laws, (c) estimated RIF values for cross-group rank comparison. The letters (a–z) represent categorical elements used to illustrate the rank–frequency distributions in each group. RIF provides a normalized lens for comparing equivalent ranks across systems with different frequencies and scales.

Figure 6. Cross-Group RIF Matrix Comparison. Each cell shows the $Φ(s, r)$ index between pairs of concepts within the top 6 ranks. Blue shades indicate stronger relationships; red indicates weaker ones.

Figure 7. Cross-Group RIF Network Visualization. Thicker arrows indicate stronger conceptual proximity (lower RIF). Arrow color reflects RIF category, and node size denotes concept frequency or rank importance. Arrows point from higher to lower-ranked concepts.

Figure 8. Distribution of: (a) author keywords and (b) index keywords per paper.

Figure 9. Power-Law Distribution for Keywords: (a) Author keywords; (b) Index keywords. Both plots show frequency decay on a log–log scale.

Figure 10. Comparative RIF Matrix Analysis for Author and Index Keywords. Each cell shows the RIF value $\hat{Φ}(s, r)$ that compares a concept at rank r to one at rank $s \leq r$. Lower values indicate that the lower-ranked concept closely approximates the dominance of the higher-ranked one.

Figure 11. Comparative RIF Network Graph for Author and Index Keywords. Thicker arrows indicate stronger conceptual proximity (lower RIF), and node size reflects keyword prominence. Arrows point from higher to lower-ranked keywords.

Table 1. Estimated ranges of $\hat{θ}$ for different ranks across intervals of the RIF index $\hat{Φ}(r) = r^{\hat{θ}}$.

Rank	Min–Max of $\hat{θ}$
	$1 < \hat{Φ} (r) \leq 3$	$3 < \hat{Φ} (r) \leq 6$	$6 < \hat{Φ} (r) \leq 9$	$9 < \hat{Φ} (r)$
2	0.014–1.585	1.590–2.585	2.587–3.170	3.172–5.322
3	0.009–1.000	1.003–1.631	1.632–2.000	2.001–3.358
4	0.007–0.792	0.795–1.292	1.294–1.585	1.586–2.661
5	0.006–0.683	0.685–1.113	1.114–1.365	1.366–2.292
6	0.006–0.613	0.615–1.000	1.001–1.226	1.227–2.059
7	0.005–0.565	0.566–0.921	0.922–1.129	1.130–1.896
8	0.005–0.528	0.530–0.862	0.862–1.057	1.057–1.774
9	0.005–0.500	0.502–0.815	0.816–1.000	1.001–1.679
10	0.004–0.477	0.479–0.778	0.779–0.954	0.955–1.602

Note: $\hat{θ}$ is estimated from the expression $\hat{Φ}(r) = r^{\hat{θ}}$. Values are grouped into four RIF Index ranges to support interpretability across different ranks.
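The interval limits in Table 1 follow from inverting $\hat{Φ}(r) = r^{\hat{θ}}$, which gives $\hat{θ} = \ln \hat{Φ}(r) / \ln r$. The minimal sketch below, with the hypothetical helper name `theta_for_rif`, computes the exponent at the cut points 3, 6, and 9. The results agree with the upper limits of the first three columns of the table to three decimals.

```python
import math


def theta_for_rif(rank, phi):
    """Exponent theta that yields RIF value phi at a given rank (reference rank 1)."""
    if rank < 2 or phi <= 0:
        raise ValueError("rank must be >= 2 and phi must be positive")
    return math.log(phi) / math.log(rank)


if __name__ == "__main__":
    for rank in range(2, 11):
        bounds = [round(theta_for_rif(rank, phi), 3) for phi in (3, 6, 9)]
        print(rank, bounds)
```

The lower limit of each column and the upper limit of the last column depend on the range of $\hat{Φ}$ values the authors evaluated, which this sketch does not attempt to reproduce.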

Table 2. Results of the power-law estimations for keywords.

Keyword (n)	$\hat{α}$	${\hat{x}}_{min}$	CP % ( $n_{tail}$ )	KS	p-Value
Author (1344)	2.460	2	11.8 (158)	0.079	0.972
Index (1204)	2.322	2	28.9 (348)	0.025	1.000

Table 3. Top 9 RIF Index results for index and author keywords. RIF values reflect $\hat{Φ}(1, r)$ comparisons with the top-ranked concept ($s = 1$).

Rank (r)	Concept	Frequency	Probability $P (R = r)$	RIF $\hat{Φ} (1, r)$
Author Keywords
1	COVID-19	30	0.134	1.000
2	vulnerability	20	0.049	2.735
3	community resilience	20	0.024	5.583
4	climate change	19	0.014	9.571
5	urban resilience	15	0.009	14.889
6	social capital	15	0.006	22.333
7	adaptive capacity	10	0.004	33.500
8	migration	10	0.003	44.667
9	adaptation	9	0.003	44.667
Index Keywords
1	climate change	40	0.704	1.000
2	human	30	0.141	4.993
3	vulnerability	25	0.055	12.800
4	decision making	24	0.028	25.143
5	COVID-19	20	0.017	41.412
6	sustainability	17	0.008	88.000
7	sustainable development	17	0.006	117.333
8	adaptive management	17	0.004	176.000
9	disaster management	16	0.003	234.667
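The RIF column of Table 3 can be cross-checked against the Probability column: each tabulated value equals $P(R = 1) / P(R = r)$ computed from the rounded probabilities shown. The sketch below transcribes those probabilities and recomputes the ratios. The function name `rif_from_probs` is hypothetical, and because the inputs are rounded to three decimals, the ratios may differ from values computed with unrounded fitted probabilities.

```python
# Probabilities P(R = r) for ranks 1..9, transcribed from Table 3.
AUTHOR_PROBS = [0.134, 0.049, 0.024, 0.014, 0.009, 0.006, 0.004, 0.003, 0.003]
INDEX_PROBS = [0.704, 0.141, 0.055, 0.028, 0.017, 0.008, 0.006, 0.004, 0.003]


def rif_from_probs(probs, s=1):
    """Phi(s, r) = P(R = s) / P(R = r) for every rank r >= s (ranks are 1-based)."""
    reference = probs[s - 1]
    return [reference / p for p in probs[s - 1:]]


if __name__ == "__main__":
    print([round(v, 3) for v in rif_from_probs(AUTHOR_PROBS)])
    print([round(v, 3) for v in rif_from_probs(INDEX_PROBS)])
    # Shifting the reference to rank 2 rescales all values by a constant factor.
    print([round(v, 3) for v in rif_from_probs(AUTHOR_PROBS, s=2)])
```

Passing `s=2` illustrates the reference-rank dependence noted in the limitations: the ordering of concepts is unchanged, but every value shrinks by the factor $\hat{Φ}(1, 2)$.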
