# Statistical Indicators of the Scientific Publications Importance: A Stochastic Model and Critical Look

## Abstract

## 1. Introduction

- (i1)
- (i2)

## 2. Citation Model Construction

**Assumption 1.**

**Assumption 2.**

## 3. Distribution of Citation Number of a Paper

- The probabilities $p_n=p$ are constant. Then (2) becomes$$\mathbb{P}\{X=n\}=p\cdot(1-p)^{n-1},\quad \mathbb{P}\{X=\infty\}=0,$$and the tail is$$\mathbb{P}\{X\ge n\}=(1-p)^{n-1},\quad n=1,2,\dots$$Clearly, both the tail and the probabilities (3) decrease exponentially fast as $n$ tends to infinity.
- The probabilities are given by $p_n=p/n$, where $p$ is a number from the interval $(0,1)$. Equation (3) is transformed to$$\mathbb{P}\{X=n\}=\frac{p}{n}\cdot\prod_{k=1}^{n-1}\left(1-\frac{p}{k}\right).$$According to (4), $X$ is a proper random variable and has, in this case, the Sibuya distribution with parameter $p\in(0,1)$, with the following tail:$$\mathbb{P}\{X\ge n\}=\frac{\Gamma(n-p)}{\Gamma(n)\cdot\Gamma(1-p)}\sim\frac{1}{\Gamma(1-p)\cdot n^{p}},$$and$$\mathbb{P}\{X=n\}\sim \frac{p}{n^{p+1}\cdot\Gamma(1-p)},\quad n\to\infty.$$
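The two cases above can be checked numerically. The following is a minimal sketch, assuming the product form $\mathbb{P}\{X=n\}=p_n\prod_{k<n}(1-p_k)$ that appears in the formulas above; the function names `pmf_from_hazard` and `sibuya_tail` are illustrative and not taken from the paper.

```python
import math

def pmf_from_hazard(hazard, n_max):
    """Return [P{X=1}, ..., P{X=n_max}] for P{X=n} = p_n * prod_{k<n} (1 - p_k)."""
    pmf = []
    survival = 1.0  # P{X >= n}; equals 1 for n = 1
    for n in range(1, n_max + 1):
        p_n = hazard(n)
        pmf.append(p_n * survival)
        survival *= 1.0 - p_n
    return pmf

def sibuya_tail(n, p):
    """P{X >= n} = Gamma(n - p) / (Gamma(n) * Gamma(1 - p)), via log-gamma for stability."""
    return math.exp(math.lgamma(n - p) - math.lgamma(n) - math.lgamma(1.0 - p))

p = 0.5  # illustrative parameter value
geometric = pmf_from_hazard(lambda n: p, 50)        # constant p_n = p
sibuya = pmf_from_hazard(lambda n: p / n, 2000)     # p_n = p / n

n = 1000
tail_from_pmf = 1.0 - sum(sibuya[: n - 1])          # P{X >= n} from the recursion
tail_asymptotic = 1.0 / (math.gamma(1.0 - p) * n ** p)
print(tail_from_pmf, sibuya_tail(n, p), tail_asymptotic)
```

For this choice of $p$ the three printed values should nearly coincide, which illustrates how slowly the Sibuya tail decays compared with the geometric one.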

## 4. Main Result on Citation Number Distribution

- (A) $0<\gamma<1$;
- (B) $\gamma>1$.

- (i) Suppose that $1/\gamma$ is not an integer. Then $\gamma\cdot[1/\gamma]<1$ and$$\sum_{j=1}^{[1/\gamma]+1}\frac{p^{j}}{j}\sum_{k=1}^{n-1}\frac{1}{k^{\gamma j}}=\sum_{j=1}^{[1/\gamma]}\frac{n^{1-\gamma j}}{1-\gamma j}\frac{p^{j}}{j}+\sum_{j=1}^{[1/\gamma]}\zeta(\gamma j)\frac{p^{j}}{j}+\frac{p^{[1/\gamma]+1}}{[1/\gamma]+1}\sum_{k=1}^{n-1}\frac{1}{k^{\gamma([1/\gamma]+1)}}+o(1).$$However, $\gamma([1/\gamma]+1)>1$ and, therefore,$$\lim_{n\to\infty}\sum_{k=1}^{n-1}\frac{1}{k^{\gamma([1/\gamma]+1)}}=\sum_{k=1}^{\infty}\frac{1}{k^{\gamma([1/\gamma]+1)}}<\infty.$$From this and (10) it follows that$$\mathbb{P}\{X=n\}\sim C_{2}\cdot\frac{p}{n^{\gamma}}\cdot\exp\left\{-\sum_{j=1}^{[1/\gamma]}\frac{n^{1-\gamma j}}{1-\gamma j}\cdot\frac{p^{j}}{j}\right\}\quad\text{as}\quad n\to\infty.$$
- (ii) Suppose that $1/\gamma$ is a positive integer. Then $\gamma[1/\gamma]=1$, so that $\gamma([1/\gamma]+1)=1+\gamma>1$, and$$\sum_{j=1}^{[1/\gamma]+1}\frac{p^{j}}{j}\sum_{k=1}^{n-1}\frac{1}{k^{\gamma j}}=\sum_{j=1}^{[1/\gamma]-1}\frac{n^{1-\gamma j}}{1-\gamma j}\frac{p^{j}}{j}+\sum_{j=1}^{[1/\gamma]-1}\zeta(\gamma j)\frac{p^{j}}{j}+\frac{p^{[1/\gamma]}}{[1/\gamma]}\sum_{k=1}^{n-1}\frac{1}{k}+\frac{p^{[1/\gamma]+1}}{[1/\gamma]+1}\sum_{k=1}^{n-1}\frac{1}{k^{1+\gamma}}+o(1).$$It is known that$$\lim_{n\to\infty}\sum_{k=1}^{n-1}\frac{1}{k^{1+\gamma}}=\sum_{k=1}^{\infty}\frac{1}{k^{1+\gamma}}<\infty$$and$$\sum_{k=1}^{n-1}\frac{1}{k}=\log(n)+\gamma_{e}+o(1),$$where $\gamma_{e}$ is Euler's constant. Therefore,$$\mathbb{P}\{X=n\}\sim C_{3}\cdot\frac{p}{n^{\gamma+p^{[1/\gamma]}/[1/\gamma]}}\cdot\exp\left\{-\sum_{j=1}^{[1/\gamma]-1}\frac{n^{1-\gamma j}}{1-\gamma j}\cdot\frac{p^{j}}{j}\right\}\quad\text{as}\quad n\to\infty.$$
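The two partial-sum expansions used in cases (i) and (ii) can be illustrated numerically. This is only a sketch: the exponent $s=1/2$ is an arbitrary example, the constants `EULER_GAMMA` and `ZETA_HALF` are hard-coded known values of Euler's constant and $\zeta(1/2)$, and `partial_sum` is a hypothetical helper name.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant gamma_e
ZETA_HALF = -1.4603545088095868    # zeta(1/2), value of the analytic continuation

def partial_sum(s, n):
    """Sum_{k=1}^{n-1} k^(-s)."""
    return sum(k ** (-s) for k in range(1, n))

n = 10**5
# Case s = 1/2 < 1: the sum behaves like n^(1-s)/(1-s) + zeta(s) + o(1).
err_half = partial_sum(0.5, n) - (n ** 0.5 / 0.5 + ZETA_HALF)
# Case s = 1: the sum behaves like log(n) + gamma_e + o(1).
err_one = partial_sum(1.0, n) - (math.log(n) + EULER_GAMMA)
print(err_half, err_one)
```

Both differences should be small and shrink further as `n` grows, in line with the $o(1)$ remainder terms.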

**Theorem 1.**

- If $\gamma=0$ then $\mathbb{P}\{X=n\}=p(1-p)^{n-1}$, $n=1,2,\dots$.
- If $0<\gamma<1$ and $1/\gamma$ is not a positive integer then$$\mathbb{P}\{X=n\}\sim C_{2}\cdot\frac{p}{n^{\gamma}}\cdot\exp\left\{-\sum_{j=1}^{[1/\gamma]}\frac{n^{1-\gamma j}}{1-\gamma j}\cdot\frac{p^{j}}{j}\right\}\quad\text{as}\quad n\to\infty.$$If $0<\gamma<1$ and $1/\gamma$ is a positive integer then$$\mathbb{P}\{X=n\}\sim C_{3}\cdot\frac{p}{n^{\gamma+p^{[1/\gamma]}/[1/\gamma]}}\cdot\exp\left\{-\sum_{j=1}^{[1/\gamma]-1}\frac{n^{1-\gamma j}}{1-\gamma j}\cdot\frac{p^{j}}{j}\right\}\quad\text{as}\quad n\to\infty.$$
- If $\gamma=1$ then$$\mathbb{P}\{X=n\}\sim \frac{p}{n^{p+1}\,\Gamma(1-p)},\quad n\to\infty.$$
- If $\gamma>1$ then$$\mathbb{P}\{X=n\mid X<\infty\}\sim C_{4}\frac{p}{n^{\gamma}}\quad\text{as}\quad n\to\infty,$$and$$\mathbb{P}\{X=\infty\}\ge\exp\left\{-\sum_{k=1}^{\infty}\frac{p}{k^{\gamma}-p}\right\}>0.$$

All constants $C, C_{1},\dots,C_{6}$ depend only on the parameters $p$ and $\gamma$.
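The positivity of $\mathbb{P}\{X=\infty\}$ for $\gamma>1$ can be illustrated numerically. The sketch below assumes $p_k=p/k^{\gamma}$ (inferred from the special cases $\gamma=0$ and $\gamma=1$ of the theorem), so that $\mathbb{P}\{X=\infty\}=\prod_{k\ge 1}(1-p/k^{\gamma})$. It evaluates a truncated version of this product together with the exponential expression from the last item, which is positive and does not exceed the product. The function names and the values $p=0.5$, $\gamma=2$ are illustrative only.

```python
import math

def prob_never_stops(p, gamma, terms=100_000):
    """Truncated product prod_{k=1}^{terms} (1 - p / k^gamma), assuming p_k = p / k^gamma."""
    log_prod = sum(math.log1p(-p / k ** gamma) for k in range(1, terms + 1))
    return math.exp(log_prod)

def exponential_expression(p, gamma, terms=100_000):
    """exp{-sum_{k=1}^{terms} p / (k^gamma - p)}, truncated at `terms`."""
    return math.exp(-sum(p / (k ** gamma - p) for k in range(1, terms + 1)))

p, gamma = 0.5, 2.0  # illustrative parameter values
product = prob_never_stops(p, gamma)
lower = exponential_expression(p, gamma)

# For gamma = 2 the infinite product has a closed form (Euler's product for the sine):
# prod_{k>=1} (1 - x^2 / k^2) = sin(pi x) / (pi x), here with x^2 = p.
x = math.sqrt(p)
closed_form = math.sin(math.pi * x) / (math.pi * x)
print(product, lower, closed_form)
```

The closed form is available only for $\gamma=2$ and serves here as an independent check of the truncated product.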

## 5. Comments

## 6. The Case of Growing ${p}_{n}$

## 7. Back to the Distribution of Citation Number of One Author

#### 7.1. Analyzing Data from Google Scholar “Mathematics”

- The serial number of the author;
- The total number of citations of the author;
- The Hirsch index;
- The number of citations of the author's most popular work, that is, the work with the largest number of citations among all works of this author;
- The ratio of the total number of citations to the squared Hirsch index.
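The last quantity in the list is a simple arithmetic ratio. As a small sketch (the function name `citations_to_h_squared` is hypothetical), it can be recomputed from the second and third quantities; the two input pairs below are taken from rows of the data tables in this article.

```python
def citations_to_h_squared(total_citations, h_index):
    """Ratio of the total number of citations to the squared Hirsch index."""
    if h_index <= 0:
        raise ValueError("Hirsch index must be positive")
    return total_citations / h_index ** 2

# (total citations, Hirsch index) pairs taken from two table rows
ratio_a = round(citations_to_h_squared(448_557, 270), 2)
ratio_b = round(citations_to_h_squared(138_820, 64), 2)
print(ratio_a, ratio_b)  # 6.15 33.89
```

A large ratio indicates that the total is dominated by a few heavily cited works rather than by a broad set of works near the Hirsch threshold.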

#### 7.2. Analyzing Data from Google Scholar “Biostatistics”

#### 7.3. Final Model for the Distribution of Citations

#### 7.4. Remarks on the Model with $\gamma >1$

## 8. Hirsch Index

#### 8.1. Data in “Physics”

#### 8.2. Data Comparison

## 9. Distribution of the Hirsch Index

- (a) at least $h$ works were published;
- (b) $h$ of the published works are cited at least $h$ times each, and the rest are cited fewer than $h$ times each.
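Conditions (a) and (b) translate directly into a computation on an author's list of citation counts. The following sketch uses the common convention that the Hirsch index is the largest $h$ such that $h$ works have at least $h$ citations each; the function name `hirsch_index` and the sample data are illustrative.

```python
def hirsch_index(citations):
    """Largest h such that at least h works have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` works all have at least `rank` citations
        else:
            break      # counts only decrease from here, so h cannot grow
    return h

sample = [10, 8, 5, 4, 3]  # hypothetical citation counts of five works
print(hirsch_index(sample))  # 4
```

Sorting in decreasing order makes the stopping rule explicit: once a work has fewer citations than its rank, no larger $h$ is possible.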

## Author Contributions

## Funding

## Conflicts of Interest

| Author No. | Total citations | Hirsch index | Citations of most popular work | Citations / (Hirsch index)² |
|---|---|---|---|---|
| 1 | 448,557 | 270 | 28,303 | 6.15 |
| 2 | 162,457 | 98 | 44,406 | 16.92 |
| 3 | 159,123 | 147 | 26,929 | 7.36 |
| 4 | 138,820 | 64 | 110,393 | 33.89 |
| 5 | 101,662 | 59 | 35,640 | 29.20 |
| 6 | 99,206 | 78 | 41,647 | 16.31 |
| 7 | 85,288 | 59 | 55,293 | 24.50 |
| 8 | 84,918 | 48 | 18,901 | 36.86 |
| 9 | 77,319 | 98 | 11,715 | 8.05 |
| 10 | 73,989 | 72 | 17,153 | 14.27 |

| Author No. | Total citations | Hirsch index | Citations of most popular work | Citations / (Hirsch index)² |
|---|---|---|---|---|
| 1 | 478,691 | 227 | 66,611 | 9.29 |
| 2 | 301,786 | 132 | 59,613 | 17.32 |
| 3 | 253,221 | 208 | 26,127 | 5.85 |
| 4 | 223,038 | 218 | 10,184 | 4.69 |
| 5 | 199,143 | 169 | 23,447 | 6.97 |
| 6 | 178,855 | 117 | 39,271 | 13.07 |
| 7 | 150,695 | 105 | 42,485 | 13.67 |
| 8 | 119,199 | 111 | 20,666 | 9.67 |
| 9 | 108,648 | 140 | 20,842 | 5.54 |
| 10 | 100,491 | 111 | 30,315 | 8.16 |

| Author No. | Total citations | Hirsch index | Citations of most popular work | Citations / (Hirsch index)² |
|---|---|---|---|---|
| 1 | 326,718 | 206 | 25,605 | 7.70 |
| 2 | 259,321 | 223 | 7,275 | 5.21 |
| 3 | 240,376 | 200 | 15,651 | 6.01 |
| 4 | 232,057 | 206 | 26,535 | 5.47 |
| 5 | 231,746 | 218 | 15,589 | 4.88 |
| 6 | 227,530 | 206 | 15,684 | 5.36 |
| 7 | 217,495 | 144 | 35,746 | 10.49 |
| 8 | 200,565 | 191 | 11,807 | 5.50 |
| 9 | 198,735 | 190 | 7,497 | 5.50 |
| 10 | 197,679 | 198 | 25,649 | 5.04 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Klebanov, L.B.; Kuvaeva, Y.V.; Volkovich, Z.E.
Statistical Indicators of the Scientific Publications Importance: A Stochastic Model and Critical Look. *Mathematics* **2020**, *8*, 713.
https://doi.org/10.3390/math8050713
