Ranking Multi-Metric Scientific Achievements Using a Concept of Pareto Optimality

The ranking of multi-metric scientific achievements is a challenging task. For example, the scientific ranking of researchers utilizes two major types of indicators, namely the numbers of publications and citations. Existing approaches focus on how to select proper indicators, considering either a single indicator or a combination of them. The majority of ranking methods combine several indicators, but these methods face a challenging concern—the assignment of suitable/optimal weights to the targeted indicators. Pareto optimality is defined as a measure of efficiency in multi-objective optimization, which seeks optimal solutions by considering multiple criteria/objectives simultaneously. The performance of the basic Pareto dominance depth ranking strategy decreases as the number of criteria increases (generally speaking, beyond three criteria). In this paper, a new, modified Pareto dominance depth ranking strategy is proposed which uses dominance metrics obtained from the basic Pareto dominance depth ranking together with sorted statistical metrics to rank scientific achievements. It attempts to find clusters of the compared data by using all indicators simultaneously. Furthermore, we apply the proposed method to address the multi-source ranking resolution problem, which is very common these days; for example, several world-wide institutions rank the world's universities every year, but their rankings are not consistent. As our case studies, the proposed method was used to rank several scientific datasets (i.e., researchers, universities, and countries) as a proof of concept.


Introduction
Nowadays, the ranking of scientific impacts is a crucial task and a focus of research communities, universities, and governmental funding agencies. In this ranking, the target entities can be researchers, universities, countries, journals, or conferences. Performance analysis and benchmarking of scientific achievement serve a variety of substantial purposes. At the researcher level, research impact is an important measure used by academic institutions and universities in decisions on funding, hiring, and promotions [1][2][3]. From the university's viewpoint, university rankings are considered a source of strategic information for governments, funding agencies, and the media in order to compare universities, and students and their parents use university rankings as a selection criterion [4]. As the assessment of scientific achievement has gained a great deal of attention from various interested groups, such as students, parents, institutions, academicians, policy makers, political leaders, donors/funding agencies, and the news media, several assessment methods have been developed in the fields of bibliometry and scientometrics through the utilization of mathematical and/or statistical methods [1].
In order to measure a researcher's performance, many indicators have been proposed which can also be utilized in other scientific areas. Traditional research indicators include the numbers of publications and citations, the average number of citations per paper, and the average number of citations per year [5]. In 2005, Hirsch [6] proposed a new indicator, called the h-index, which revolutionized scientometrics (informetrics). The original definition of the h-index is: "A scientist has the index h if h of his/her N_p papers have received at least h citations each, and the other N_p − h papers have no more than h citations each." Later, other indicators were proposed to enhance the h-index. Additionally, the h-index was defined for other scientific aggregation levels [7]. Ranking methods at the researcher level tend to use only one indicator (the h-index or its improved versions), but at other scientific aggregation levels they prefer a more comprehensive set of indicators. Research works in scientometrics can be divided into two main categories: the first category includes methods which focus on introducing new indicators to enhance the performance of assessment metrics, and in the second category, methods attempt to develop enhanced ranking methods for obtaining ranks by using several different indicators.
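As a quick illustration of Hirsch's definition, the h-index can be computed from a list of per-paper citation counts (a generic sketch, independent of any particular bibliographic database):

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    cites = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(cites, start=1):
        if c >= rank:
            h = rank  # the rank-th most cited paper still has >= rank citations
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))  # 4: four papers have at least 4 citations each
```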
There are various kinds of ranking methods: first, methods which focus on only one indicator; and second, methods which combine several of them. Considering only a single indicator makes differences among the quality assessments of research outcomes very hard to reveal. On the other hand, there are a few challenges in considering several indicators simultaneously. For instance, the method needs to find proper weights for combining the indicators and also an efficient merging strategy to combine several different types of indicators.
In the field of optimization, an algorithm tries to find the best solution in a search space in terms of an objective function which should be minimized or maximized [8]. While in single-objective problems [9] there is only one objective to be optimized, in the multi-objective version the algorithm tries to find a set of solutions based on more than one objective [10]. In multi-objective optimization [11,12], non-dominated sorting [13,14] is defined and used as a measure of efficiency in metaheuristic-based methods [15,16]. In [17], the basic dominance ranking was used to identify excellent scientists according to all selected criteria. The authors selected all researchers in the first Pareto front as excellent scientists, but as the number of criteria increases (beyond three), most compared entities are placed in the first Pareto front [17]. In this paper, we propose a modified non-dominated sorting which utilizes two dominance metrics obtained from the basic dominance ranking together with two statistical metrics, the means and medians of the ranks obtained by sorting each criterion's value across all compared vectors. This ranking has several major advantages: (1) it performs very well at ranking all compared vectors, even with a large number of criteria; (2) each Pareto front obtained by the modified non-dominated sorting contains fewer vectors than in the basic non-dominated sorting approach; (3) it can consider the length of the academic career (called the research period) as an independent indicator, which makes it possible to compare junior and senior researchers; (4) it is indicator-independent and capable of accommodating new indicators; (5) there is no need to determine optimal weights to combine indicators.
The modified Pareto dominance ranking was used to rank two research datasets with many criteria: ranking universities (200 samples) and countries (231 samples). Additionally, the basic dominance ranking was applied to rank two research datasets with a low number of criteria: ranking computer science researchers based on h-index and period of publication (350 samples) and ranking universities based on three ranking resources (100 samples).
The remaining sections of this paper are organized as follows. Section 2 presents a background review which provides state-of-the-art scientific indicators and ranking methods. Section 3 describes the proposed ranking method in detail. Section 4 presents case studies and corresponding discussions. Finally, the paper is concluded in Section 5.

Background Review
In this section, we review several state-of-the-art scientific indicators and several recent ranking methods.

A Brief Description of State-of-the-Art Scientific Indicators
Several indicators have been proposed to measure scientific achievements. The pioneering studies introduced some basic indicators and described how these indicators can be combined to capture the general picture of researchers' scientific output [18,19]. These indicators can be categorized into the following three main groups [20,21]:
• Production-based indicators: developed to assess the quantity of production, such as the total number of published papers and the number of papers published during a limited time.
• Impact-based indicators: proposed to quantify the impact of the researchers' publications; e.g., the total number of citations, the average number of citations per paper, the number of high-impact papers (papers with more than a specific number of citations), and the number of citations of the high-impact papers.
• Indicators based on the impact of the journals: designed to take into account the journals where the papers are published; e.g., the median impact factor of the journals, relative citation rates (publication citations compared with the average citations of papers in the journal), and the normalized position of the journals (computed from the position of a journal in the list ordered by impact factor).
Some advantages and disadvantages of well-known indicators [6,22] are shown in Table 1.

Table 1. A summary of advantages and disadvantages for some commonly used indicators.

• Total number of published papers. Advantage: it is a proper measure of productivity. Disadvantage: it does not consider the impact of the publications.
• Total number of received citations. Advantage: it measures the total impact. Disadvantages: it may be inflated by a small number of "big hits" or by papers with many co-authors, and it gives excess weight to highly cited survey papers.
• Average number of citations per publication, without counting self-citations. Advantage: it can be applied to compare junior and senior scientists (though not perfectly, because senior researchers have had more time to build up this metric). Disadvantages: it is hard to compute, rewards low productivity, and penalizes high productivity.
• Number of "significant papers" (papers with more than y citations). Advantage: it eliminates the disadvantages of the three previously mentioned indicators (total number of published papers, total number of citations, and average number of citations per publication). Disadvantage: the value of y must be adjusted.
• Number of citations to each of the q most cited papers. Advantage: similar to the number of "significant papers," it overcomes many of the disadvantages mentioned above. Disadvantage: q has no single value, so it is difficult to compute and compare.
In 2005, Hirsch dramatically changed scientometrics (informetrics) by introducing the h-index measure. Several studies have discussed and extended the h-index [23] since its introduction. The h-index has some significant properties [24,25]. It considers two aspects at once: the number of publications and their research impact. It performs better than other basic indicators (total number of papers, total number of citations, number of significant papers, etc.) at evaluating scientific achievements. In [25], an empirical study was conducted to confirm the superiority of the h-index over other basic indicators. In addition, the h-index can effortlessly be computed using available resources such as the ISI Web of Science. Although it has been extensively utilized as a scientometric measure, it still suffers from the following drawbacks [1][26][27][28]:
• The h-index highly depends on the length of the academic career (the research period), because publications and citations are assumed to accumulate over time. The h-index of new researchers is very low, so it is not suitable for comparing scientists at different stages of their academic careers.
• It is field-dependent; therefore, it is only useful for comparing scientists within the same field of study.
• The h-index never decreases, and it may increase even if no new papers are published, because the number of citations a scientist receives can grow over time. Although the value of the h-index indicates the impact of the publications, it is strongly dependent on one aspect of the research, namely its age. To compare two scientists fairly based on their research achievements, the period of time over which they have been doing research matters in addition to quality: of two researchers with the same h-index, the one with the shorter research period is the more successful. Consequently, the h-index cannot be a standalone metric to assess the rank of a scientist in terms of different criteria.
• It is insensitive to performance changes: once the first h articles have received at least h citations each (h^2 citations in total), the further citations they receive are not taken into account.
• Additionally, the h-index suffers from the same issues as other indicators, such as self-citations and field dependence. Other issues include the difficulty of finding reference standards and of collecting all the data required to compute the h-index (for example, discriminating between scientists with the same names and initials is challenging).
Several variants of the h-index have been developed to overcome these drawbacks. The m-quotient [6] was proposed to account for the number of years since the first publication, and it is computed as follows:

m = h / n,

where n is the number of years since the scientist's first published paper. Batista et al. [29] introduced a complementary index, the h_I index, which is defined by:

h_I = h^2 / N_a^T,

where N_a^T is the total number of authors in the considered h papers. In [30], the A-index was suggested as the average number of citations of the publications included in the h (Hirsch) core, mathematically defined as:

A = (1/h) Σ_{j=1}^{h} cit_j.

The AR-index [31] was proposed as the square root of the sum of the average number of citations per year of the articles included in the h (Hirsch) core. The mathematical definition of the index is as below:

AR = sqrt( Σ_{j=1}^{h} cit_j / a_j ),

where a_j is the age of the j-th paper. Liang et al. [26] suggested a new index, the R-index, which is found by calculating the square root of the sum of the citations in the Hirsch core without dividing by h. This indicator is mathematically defined as:

R = sqrt( Σ_{j=1}^{h} cit_j ).

Egghe [28] introduced the g-index, which is defined as the highest number g of papers such that the top g papers together have received at least g^2 citations. Additionally, it has been proven that there is a unique g for any set of papers and that g ≥ h. Egghe and Rousseau [32] proposed the citation-weighted h-index (h_w-index) as follows:

h_w = sqrt( Σ_{j=1}^{r_0} cit_j ),

where cit_j is the number of citations of the j-th most cited paper and r_0 is the largest row index i such that r_w(i) = (1/h) Σ_{j=1}^{i} cit_j ≤ cit_i. In general, even the enhanced h-index variants combine several metrics into a single value instead of considering them simultaneously.
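Several of these variants can be computed from a list of per-paper citation counts. The helper below is our own illustrative sketch of the formulas above (the AR-index additionally needs per-paper ages), not the original authors' implementations:

```python
import math

def bibliometric_indices(citations, years_active, ages=None):
    """Sketch of the h-index and several variants (illustrative helper).

    citations: citation count per paper; ages: age in years per paper,
    aligned with citations (needed only for the AR-index).
    """
    order = sorted(range(len(citations)), key=lambda j: -citations[j])
    cites = [citations[j] for j in order]
    h = sum(1 for i, c in enumerate(cites, 1) if c >= i)
    core = cites[:h]                                      # the Hirsch core
    out = {
        "h": h,
        "m": h / years_active if years_active else 0.0,   # m-quotient = h / n
        "A": sum(core) / h if h else 0.0,                 # mean citations in the core
        "R": math.sqrt(sum(core)),                        # sqrt of core citations
    }
    # g-index: largest g such that the top g papers together have >= g^2 citations
    total, g = 0, 0
    for i, c in enumerate(cites, 1):
        total += c
        if total >= i * i:
            g = i
    out["g"] = g
    if ages is not None:  # AR-index = sqrt(sum over the core of cit_j / a_j)
        out["AR"] = math.sqrt(sum(citations[j] / ages[j] for j in order[:h]))
    return out

print(bibliometric_indices([25, 16, 9, 4, 1], years_active=10))
# h = 4, m = 0.4, A = 13.5, R = sqrt(54) ~ 7.35, g = 5 (note g >= h)
```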

A Brief Review of Ranking Methods
At the researcher level, all the mentioned indicators can be applied to measure researchers' achievements. Although other scientific applications, such as ranking scientific journals, research teams, research institutions, and countries, tend to include a more comprehensive set of indicators, it is possible to apply researcher-level scientific indicators in these comparative applications as well. For example, the h-index can be calculated for an institute: "The h-index of an institute would be h_2 if h_2 of its researchers have an h_1-index of at least h_2 each, and the other (N − h_2) researchers have h_1-indices lower than h_2 each" [7]. In the following, we briefly review some common ranking methods and indicators for universities. University rankings mainly use two general categories of methodologies [33][34][35][36][37][38][39]: the first category uses all indicators [40,41] to calculate a single score, while the second category focuses on a single dimension of university performance, such as the quality of research output [4], career outcomes of graduates [37], or the mean h-index [42]. Other indicators used for university rankings are publication and citation counts, the student/faculty ratio, the percentage of international students, Nobel and other prizes, the numbers of highly cited researchers and papers, articles published in Science and Nature, the h-index, and web visibility. First, some ranking methodologies of the first category are briefly described below.
Liu and Cheng [43] proposed a ranking strategy, called the Academic Ranking of World Universities (ARWU), which considers four measures: quality of education, quality of faculty, research output, and per capita performance. To compare universities on these four measures, the following six indicators are considered: (1) alumni of a university winning a Nobel Prize or a Fields Medal, (2) staff of a university winning a Nobel Prize or a Fields Medal, (3) highly cited researchers in 21 broad scientific fields, (4) publications in Nature and Science, (5) publications indexed in Web of Science, and (6) per capita academic performance of a university. It gives a score of 100 to the best performing university in each category, and this university is considered the benchmark against which the scores of all other universities are computed. Then, the total scores of universities are calculated as weighted averages of their individual category scores [44]. The THE-QS World University Ranking (THE-QS) (http://www.topuniversities.com) was published by the Quacquarelli Symonds Company and considers six distinct indicators: academic reputation according to a large survey (40%), employer reputation (10%), the student/faculty ratio (20%), citations per faculty based on the Scopus database (20%), the proportion of international professors (5%), and the proportion of international students (5%). The World University Ranking was developed by Times Higher Education (www.timeshighereducation.co.uk/world-universityrankings) [41], which considers 13 indicators to rank universities. These indicators are categorized into five areas: teaching (30%), research (30%), citations (30%), industry income (2.5%), and international outlook (7.5%). They normalize the citation impact indicator to account for different scientific output data.
Another global ranking is the Scimago Institutions Rankings (SIR), developed by the Scimago research group in Spain (www.scimagoir.com) [45]. SIR combines a quantity metric with various quality metrics. Its indicators are divided into three groups: research (research output, i.e., the total number of publications in the Scopus database; international collaboration; leader output; high-quality publications; excellence; scientific leadership; excellence with leadership; and the scientific talent pool), innovation (innovative knowledge and technological impact), and societal (web size and the number of incoming links). The Cybermetrics Lab developed the Ranking Web of World Universities, or Webometrics Ranking [46,47], which uses web data extracted from commercial search engines, including the number of webpages, documents in rich formats (pdf, doc, ppt, and ps), papers indexed by Google Scholar (an indicator added in 2006), and the number of external inlinks as a measure of link visibility or impact. The Higher Education Evaluation and Accreditation Council of Taiwan [48] conducts a university ranking which applies multiple indicators in three categories: research productivity (the number of articles published in the past 11 years (10%) and the number of articles published in the current year (15%)), research impact (the number of citations in the past 11 years (15%), the number of citations in the past 2 years (10%), and the average number of citations in the past 11 years (10%)), and research excellence (the h-index of the last 2 years (10%), the number of highly cited papers in the past 11 years (15%), and the number of articles of the current year in high-impact journals (15%)). These rankings combine multiple weighted indicators into a single aggregate score to rank all universities. Additionally, some university rankings [49,50] employed the I-distance method [51] to turn all indicators into a single score used as the rank.
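As a sketch of how such first-category rankings turn weighted indicators into one aggregate score (benchmark-to-100 normalization followed by a weighted average, in the spirit of ARWU-style composites; the indicator names, values, and weights below are purely illustrative, not any institution's actual data):

```python
def aggregate_scores(indicators, weights):
    """Score the best performer 100 on each indicator, scale the rest
    proportionally, then take the weighted average of the scaled scores."""
    names = list(weights)
    best = {n: max(entity[n] for entity in indicators.values()) for n in names}
    return {
        entity: sum(weights[n] * (100.0 * vals[n] / best[n] if best[n] else 0.0)
                    for n in names)
        for entity, vals in indicators.items()
    }

data = {"U1": {"pubs": 200, "cites": 5000},
        "U2": {"pubs": 100, "cites": 8000}}
print(aggregate_scores(data, {"pubs": 0.5, "cites": 0.5}))
# {'U1': 81.25, 'U2': 75.0}
```

Note that changing the weights can reorder the result; this weight-selection sensitivity is precisely the concern that motivates the Pareto-based approach proposed later in the paper.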
Besides its ability to calculate a single index (by considering several indicators) and consequently rank countries, the CIDI strategy utilizes Pearson correlation coefficients calculated with the I-distance method. In this way, the relevance of each input measure is preserved. The I-distance method identifies the most important indicators instead of calculating numerical weights; the indicators are ranked by ordering them according to these correlations. In the following, we mention some ranking methodologies of the second category. The Centre for Science and Technology Studies at Leiden University publishes the Leiden Ranking (http://www.cwts.nl/ranking/LeidenRankingWebsite) [4,52], which has two main categories of indicators: impact and collaboration. The impact group includes three indicators: mean citation score, mean normalized citation score, and the proportion of top 10% publications. The collaboration group includes four indicators: the proportion of inter-institutional collaborative publications, the proportion of international collaborative publications, the proportion of collaborative publications with industry, and the mean geographical collaboration distance. The Leiden Ranking considers scientific performance alone instead of combining multiple indicators of university performance into a single aggregate indicator. U-Multirank [53,54] embraces the variety of institutional missions and profiles and includes teaching- and learning-related indicators. Additionally, it takes a user-driven approach in which the stakeholders/users are asked to determine the indicators and their qualities for the ranking. The authors of [37] proposed a ranking methodology which considers only the career outcomes of university graduates; this ranking focuses on the impact of universities on industry through their graduates. The mean h-index was used in [42] as a ranking metric to rank the chemical engineering, chemistry, materials science, and physics departments in Greece.

Proposed Methodology
As mentioned in Section 2, several indicators and ranking methods have been proposed to measure scientific achievements. There are two main categories of ranking methods: in the first, the methods use all indicators (multi-metric), and in the second, the methods focus on only one indicator (single-metric). Ranking methods that focus on one indicator of scientific achievement cannot reveal significant differences among the compared entities. Ranking methods with several indicators first need to assign weights to the indicators, and these weights have a considerable impact on the results [55,56]. Finding proper weights according to the importance of the indicators is a challenging task [57]. These methods also face the problem of combining several different kinds of indicators into a single score. In this paper, we modify the dominance depth ranking proposed in [13,14] and utilized in multi-objective optimization to rank scientific achievements. In 1964, Pareto [58] proposed the Pareto optimality concept, which has been applied in a wide range of applications, such as economics, game theory, multi-objective optimization, and the social sciences [59]. Pareto optimality is mathematically defined as a measure of efficiency in multi-objective optimization [12,60]. We explain the Pareto optimality concepts, the proposed method, and how it can be applied to evaluate scientific achievements. Without loss of generality, it is assumed that the optimal value of each criterion is the minimal one. Maximization is analogous: if a criterion C_i is to be maximized, it is equivalent to minimize −C_i.
In the following, the Pareto optimality definitions are given under the assumption that the minimal value is the optimal one.
Definition 1 (Pareto Dominance [61]). A criterion vector u = (u_1, . . . , u_n) dominates another criterion vector v = (v_1, . . . , v_n) (denoted as u ≺ v) if and only if ∀i ∈ {1, . . . , n}, u_i ≤ v_i and u ≠ v. This type of dominance is called weak dominance, in which the two vectors can be equal in some objectives but must differ in at least one. In strict dominance, by contrast, u has to be better in all objectives; i.e., it cannot have the same value as v in any objective.
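A minimal sketch of the weak-dominance test of Definition 1 for minimization vectors (a generic helper, not tied to any dataset):

```python
def dominates(u, v):
    """True if u weakly Pareto-dominates v under minimization:
    u is no worse in every criterion and the vectors are not identical."""
    return all(ui <= vi for ui, vi in zip(u, v)) and u != v

print(dominates((1, 2), (2, 2)))  # True: better in the first criterion, equal in the second
print(dominates((1, 3), (2, 1)))  # False: the two vectors are mutually non-dominated
```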
The Pareto optimality concept is defined from the dominance concept as follows.

Definition 2 (Pareto Optimality [61]). A vector x ∈ S is Pareto optimal if there is no y ∈ S such that y ≺ x.

Definition 3 (Pareto front [61]). For a given set S, the Pareto front is defined as the set {x ∈ S | ∄ y ∈ S, y ≺ x}.

Dominance depth ranking in the non-dominated sorting genetic algorithm (NSGA-II) was proposed by Deb et al. [13] to partition a set of objective function vectors (criterion value vectors) into several clusters using the Pareto dominance concept. First, the non-dominated vectors in the set of criterion value vectors are assigned rank 1, form the first Pareto front (PF1), and are removed. Then, the non-dominated vectors of the remaining set are determined and form the second Pareto front (PF2). This process is repeated for the remaining criterion value vectors until no vector is left. Figure 3 illustrates an example of this ranking for a set of eight points (criterion value vectors), and Table 2 shows the coordinates of the points. First, points 1, 2, 3, and 4, as non-dominated solutions, are assigned rank 1. Then, among the remaining points (points 5, 6, 7, and 8), the non-dominated solutions are determined, so points 5 and 6 are assigned rank 2 and removed. In the last iteration, the remaining points 7 and 8 are assigned rank 3. The details of the non-dominated sorting algorithm are presented in Algorithm 1.
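The dominance depth ranking of Algorithm 1 can be sketched as repeated peeling of the non-dominated points (the eight sample coordinates below are our own illustration, not the values of Table 2):

```python
def non_dominated_sort(points):
    """Dominance depth ranking (minimization): repeatedly peel off the
    current set of non-dominated points as the next Pareto front."""
    def dominates(u, v):
        return all(a <= b for a, b in zip(u, v)) and u != v

    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(front)  # temporarily remove the front to compute the next ones
        remaining = [i for i in remaining if i not in front]
    return fronts

pts = [(1, 8), (2, 6), (4, 3), (7, 1), (3, 7), (5, 4), (6, 8), (8, 5)]
print(non_dominated_sort(pts))  # [[0, 1, 2, 3], [4, 5], [6, 7]]
```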
Table 2. A numerical example of the computed new metrics for the eight points shown in Figure 3. The four new statistical metrics are the mean-ranks, median-ranks, dominated number, and non-dominated number. Ranks-F1 and Ranks-F2 are the ranks for the two criterion vectors F1 and F2.

In [17], the dominance concept was used to identify excellent scientists whose performances cannot be surpassed by others with respect to all criteria. Their method can provide a short-list of distinguished researchers in the case of award nomination. It computes the sum of all criteria and sorts all researchers according to this calculated sum. The researcher with the maximum sum, r_max, is placed in the skyline set. The second-best researcher is compared with the researcher in the skyline set (r_skyline); if he/she is not dominated by r_max, he/she is added to the skyline set. This process is repeated for all remaining researchers to construct the skyline set: if a researcher is not dominated by any researcher in the skyline set, then he/she is added to the skyline set. In effect, they select all researchers in the first Pareto front using the dominance concept. There is a well-known problem with the first Pareto front created by the basic non-dominated sorting [17]. As the number of criteria increases (beyond three), a large number of the compared vectors become non-dominated and are placed in the first Pareto front, because the chance that a vector with only one better criterion value is placed in the first Pareto front grows with the number of criteria.
To demonstrate this problem, Table 3 shows three Pareto fronts produced by the non-dominated sorting for country data extracted from the site "http://www.scimagojr.com", including five indicators: citable documents (CI-DO), citations, self-citations (SC), citations per document (CPD), and h-index. As can be seen from Table 3, three countries, Panama, Gambia, and Bermuda, are in the first Pareto front because they have higher values for only one criterion (CPD) while their other criteria values are low. Additionally, Montserrat has rank 2 because it has a high value only for the CPD indicator. In this paper, we propose a modified non-dominated sorting (described in Algorithm 2) to rank the scientific data. First, we apply the dominance depth ranking to all vectors; after that, two new metrics are calculated for each criterion value vector: the dominated number and the non-dominated number, which are the number of vectors dominated by this vector and the number of vectors which dominate it, respectively. Additionally, we use two other statistical measures proposed in [62], which are computed after sorting the criterion value vectors. In [62], for each criterion C_i, all vectors are first sorted in ascending order of their value of C_i and ranks are assigned based on the sorting order; then, for each criterion value vector, statistical measures such as the minimum or the sum of its ranks are used to build Pareto fronts.

We also sort all vectors according to each criterion value and calculate the ranks of the vectors corresponding to this sorting; we then compute the mean and median of the ranks of each vector as two new metrics. Table 2 shows an example of the computed new metrics for the eight points in Figure 3. F_1 and F_2 are the values of the sample points in Figure 3, which are considered just as a numerical example for a two-objective problem. For each point, the ranks (the two columns Ranks-F1 and Ranks-F2) for the two criterion vectors (F_1, F_2) are computed according to their sorting order. Thus, we have four new statistical metrics (the mean and median of the ranks, plus the dominated number and the non-dominated number), which we use as criteria (objectives) to measure the various levels of scientific achievement by applying the dominance depth ranking again to build all Pareto fronts. We used the basic non-dominated sorting for data with two and three criteria and the modified non-dominated sorting for data with more than three criteria. The proposed method has major advantages, described below in detail. In this method, vectors with only one criterion value better than the others cannot move toward the first front.
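A sketch of this modified procedure, under our own assumption about sign conventions (all four derived metrics are oriented for minimization, so the dominated number is negated because dominating more vectors is better):

```python
import statistics

def dominates(u, v):
    """Weak Pareto dominance for minimization vectors."""
    return all(a <= b for a, b in zip(u, v)) and u != v

def non_dominated_sort(points):
    """Basic dominance depth ranking: peel off successive Pareto fronts."""
    remaining, fronts = list(range(len(points))), []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def modified_sort(points):
    """Derive four statistical metrics per vector, then rank the metric
    vectors with the dominance depth ranking again."""
    n, d = len(points), len(points[0])
    ranks = [[0] * d for _ in range(n)]  # rank of each vector on each criterion
    for c in range(d):
        for r, i in enumerate(sorted(range(n), key=lambda j: points[j][c]), 1):
            ranks[i][c] = r
    metrics = []
    for i in range(n):
        dominated = sum(dominates(points[i], points[j]) for j in range(n))
        dominating = sum(dominates(points[j], points[i]) for j in range(n))
        metrics.append((statistics.mean(ranks[i]), statistics.median(ranks[i]),
                        -dominated, dominating))
    return non_dominated_sort(metrics)

pts = [(1, 8), (2, 6), (4, 3), (7, 1), (3, 7), (5, 4), (6, 8), (8, 5)]
print(non_dominated_sort(pts))  # 3 fronts: [[0, 1, 2, 3], [4, 5], [6, 7]]
print(modified_sort(pts))       # 6 smaller fronts: [[2], [1], [0, 3, 5], [4], [7], [6]]
```

On this toy data the modified sorting splits the same eight points into six fronts instead of three, illustrating the smaller-fronts property claimed above.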
Additionally, increasing the number of criteria does not negatively influence the obtained ranks (no large portion of the entities ends up in the first front, as before); each rank, corresponding to a Pareto front, contains a smaller number of vectors, so in total more ranks are assigned to the criterion vectors. To demonstrate the performance of this modified non-dominated sorting, Table 4 shows four Pareto fronts produced by the modified non-dominated sorting for the extracted country data. Because the considered criteria have different scales, they are normalized in all experiments before the proposed method is applied. As can be seen, only the United States is placed in the first Pareto front, and Panama is in the fourth Pareto front. Additionally, the other countries with only one high criterion value, Gambia and Bermuda, which are in the first Pareto front under the basic non-dominated sorting (as can be seen in Table 3), are not placed in any of the four Pareto fronts obtained by the modified non-dominated sorting. It can also be seen that the number of countries in each Pareto front is smaller with the modified non-dominated sorting than with the basic non-dominated sorting.
In addition, we consider the research period as a new criterion value. Using the Pareto dominance ranking makes it possible to treat the research period as an independent indicator when ranking scientific data. Considering the research period as an indicator also provides a means of prediction in some research scenarios. For example, suppose that, to compare authors, the criterion values are the h-index and the research period, A_i = (h-index, time); two authors A_1 = (80, 40) and A_2 = (20, 10) would be in the same Pareto front because, based on the Pareto optimality concept, they do not dominate each other. We can therefore predict that author A_2 will probably be able to reach the same performance as author A_1 (or even better) after some years. According to the observed indicator values for universities, authors, and countries, this method can be utilized to predict their future performance. Additionally, the time-length indicator makes the ranking method traceable: by collecting data over time, we can observe how the performances of universities or researchers change and whether they improve their Pareto front ranks. In addition, this method can be applied to compute ranks from the ranks obtained by other ranking methods (ranking from multiple resources). In this case, each indicator is a rank obtained from a ranking method, and the non-dominated vectors in the first Pareto front are expected to contain the vectors with the minimum/maximum values of the indicators, for the Min-Min and Max-Max cases, respectively. The Pareto dominance ranking can take any new kind of indicator into account as a new criterion value.
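The two-author example above can be checked mechanically; maximized criteria are negated so that everything is minimized (a small sketch reusing the weak-dominance rule from Definition 1):

```python
def dominates_min(u, v):
    """Weak Pareto dominance for minimization vectors."""
    return all(a <= b for a, b in zip(u, v)) and u != v

# author = (-h_index, research_period): h-index maximized, period minimized
a1 = (-80, 40)   # A1: h-index 80 over a 40-year research period
a2 = (-20, 10)   # A2: h-index 20 over a 10-year research period
print(dominates_min(a1, a2), dominates_min(a2, a1))  # False False -> same Pareto front
```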

Experimental Case Studies and Discussion
We run the basic Pareto dominance ranking on the following scientific data with two and three criteria, and the modified Pareto dominance ranking on the data with more than three criteria. The first dataset includes 350 top computer science researchers (http://web.cs.ucla.edu/~palsberg/h-number.html); it is a partial list of computer science researchers who each have an h-index of 40 or higher according to Google Scholar. This dataset has two indicators: research period (a low value is better) and h-index (a high value is better). The h-index values were collected from Google Scholar for the year 2016, and the research period values were calculated from the year of an author's first publication to the present. The second dataset includes the 200 top universities ranked by URAP (a nonprofit organization (http://www.urapcenter.org)). This dataset has six indicators: article, citation, total document (TD), article impact total (AIT), citation impact total (CIT), and international collaboration (IC). The third dataset has 231 top countries (for the year 2015) extracted from the SJR site (http://www.scimagojr.com), including six indicators: documents, citable documents (CI-DO), citations, self-citations (SC), citations per document (CPD), and h-index. We do not consider the SC indicator because it is not clear whether its maximum or minimum value is desirable. The fourth dataset consists of the three ranks of the 100 top common universities collected from three resources: the QS World University Rankings (https://www.topuniversities.com), URAP (http://www.urapcenter.org), and the CWUR Rankings (http://cwur.org). In the following, we report all results of the mentioned approaches on the four datasets in detail. Figure 4 shows the ranks in terms of Pareto fronts for all researchers.
It can be seen from Figure 4 how much improvement a researcher A i needs to change his/her Pareto front rank, by looking at the researchers which dominate A i and are located in better Pareto fronts. To gain a better understanding of the Pareto ranking with respect to each indicator, we plot the obtained Pareto ranks, from the first rank to the thirty-fifth, versus each indicator in Figure 5.

The Second Case Study: Ranking of Universities
Six indicators of the university dataset and their ranks obtained by the modified Pareto dominance ranking are summarized in Table A2. As mentioned in Section 3, for a fair comparison, we add the time period of academic research (the research period (RP)) mentioned in Table A2 as an indicator, calculated as the time from the university's establishment year to the present. Based on the proposed method, the first Pareto front has six universities, including top universities such as Harvard University, the University of Toronto, and Stanford University. In the basic Pareto dominance ranking, the first Pareto front has twenty universities. Additionally, the proposed ranking clusters this data into twenty-three Pareto fronts, whereas the basic Pareto dominance ranking has only eight. As mentioned in Section 3, the proposed method can assign more ranks to the criterion vectors even as the number of criteria increases (many-metric cases).
To gain a deeper understanding of the relation between the obtained Pareto ranks and the indicators, we plot the maximum, minimum, and average values of all indicators versus the Pareto ranks in Figures 6-8, as mentioned before. It can be seen from these figures (the plots for all six indicators) that the maximum, minimum, and average values decrease from the first Pareto front to the last one. In addition, Figure 9 visualizes the universities in the four top-ranked Pareto fronts. Each line illustrates one university (a five-dimensional vector) in which the values of the five indicators are presented on the vertical axes; i.e., as coordinate values.

The Third Case Study: Ranking of Countries
Table A3 shows the countries, the values of five indicators (documents, CI-DO, citations, CPD, and h-index), and the Pareto ranks obtained from the proposed method (Pareto ranking). The United States is located in the first Pareto front because it has the maximum values of four indicators: documents, CI-DO, citations, and h-index. The United States is assigned rank 1, and Switzerland and the United Kingdom are placed in the second Pareto front. The proposed method ranks these countries into forty-six ranks, while the basic Pareto dominance ranking produces thirty Pareto fronts.
Additionally, for this data, we plot the maximum, minimum, and average values of all indicators versus the Pareto ranks in Figures 10 and 11. The figures show a falling tendency of the average values from the first Pareto front to the last. We also compute the percentage of countries from the different continents (Asia, Europe, Latin America, Middle East, North America, and Pacific region) in each Pareto front. Figure 12 shows these percentages; the largest and second-largest shares of the first Pareto front belong to North America and Europe. In addition, Figure 13 visualizes the values of the indicators for the countries in the four top-ranked Pareto fronts using the parallel coordinates visualization technique. Each line illustrates one country (a five-dimensional vector) in which the values of the five indicators are presented on the vertical axes; i.e., as coordinate values. For instance, the CI-DO indicator lies in the interval [1, 10^7] for the countries on the first four Pareto fronts.

The Fourth Case Study: Resolution for Multi-Rankings of Universities
This case study uses the three ranks of the 100 top common universities collected from the three mentioned resources; here, the criterion vectors with lower values for all three ranks are better (i.e., Min-Min-Min). Table A4 shows the universities, the values of the three ranks, and the Pareto ranks obtained from Pareto dominance ranking (Pareto ranking). As we can see, three universities, "Massachusetts Institute of Technology," "Stanford University," and "Harvard University," are located in the first Pareto front, whose elements have the values 1 and 2 as the ranks obtained from the other ranking resources. Figure 14 shows the numbers of Pareto fronts for all data. Additionally, the maximum, minimum, and average values of the three rankings versus the Pareto ranks are plotted in Figure 15. It can be seen from Figure 15 that the average values of the three ranks increase from the first Pareto front to the 13th Pareto front.
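The Min-Min-Min case can be sketched directly: each university becomes a vector of its three source ranks, and dominance is checked with "lower is better" in every coordinate. The rank triples below are hypothetical and for illustration only, not values from Table A4.

```python
# Hypothetical (QS, URAP, CWUR) rank triples; lower is better everywhere.
ranks = {
    "Univ A": (1, 2, 1),
    "Univ B": (2, 1, 2),
    "Univ C": (3, 3, 3),
}

def dominates_min(a, b):
    """a dominates b when a is no worse (<=) in every rank and better (<) in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

# First Pareto front: universities dominated by no other university.
first_front = [u for u, r in ranks.items()
               if not any(dominates_min(s, r) for v, s in ranks.items() if v != u)]
print(sorted(first_front))  # ['Univ A', 'Univ B']
```

Univ A and Univ B each beat the other in at least one source, so they are incomparable and share the first front, while Univ C is dominated by both; this is exactly how mutually inconsistent source rankings are reconciled without choosing weights.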
To close this section, several points regarding the performance of the method and its differences from other ranking strategies are worth mentioning. First of all, a multi-criteria indicator is proposed for ranking researchers, universities, and countries. Considering two or more objectives simultaneously can provide a fairer ranking; for instance, using the research period along with other important criteria provides a fair comparison between senior and junior researchers and helps discover more talented researchers. Secondly, since the criteria considered to assess the entities differ from the indicators used in other ranking strategies, the resultant rankings are completely different; in fact, they evaluate the universities in terms of different metrics. As a result, a comparison between the results of the ranking strategies does not lead to a precise and meaningful conclusion. On the other hand, the proposed method clusters the entities, based on multiple criteria, into different levels. Accordingly, all universities in the same Pareto front are ranked equally; for instance, from this perspective, all universities in the first Pareto front are the top-ranked universities. Finally, this method does not actually define an evaluation measure; it gives a strategy to rank not only the case studies in this paper, but any multi-criteria data entities. In addition, this general platform provides the chance to utilize any metric to assess the related entities without modifying other parts of the algorithm.

Conclusions and Future Directions
In this paper, a modified Pareto-front-based ranking was suggested as a new method for measuring scientific achievements or, in general, for multi- and many-metric rankings. By using dominance metrics obtained from the basic Pareto dominance depth ranking together with some sorted statistical metrics over the compared criteria, the proposed method is able to find distinct groups (clusters) for the entities of a dataset with a large number of criteria. It simultaneously provides multiple comparisons, consideration of the time period of academic research, and the use of ranks from other ranking methods. We selected different kinds of scientific datasets, namely computer science researchers, top universities, countries, and multiple rankings of universities, to rank using Pareto ranking. In the future, we plan to develop ranking strategies based on other dominance-based rankings; for example, the dominance rank [61,63], which counts the number of data entries in the set that dominate the considered point. Finally, we are interested in considering other types of domination definitions, such as the concepts of weak dominance, strict dominance, and ε-dominance. Additionally, many (more than three) metrics and various resources will be studied in the future.
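The dominance rank mentioned as a future direction differs from dominance depth in that it counts dominating entries per point rather than peeling fronts. A minimal sketch under the same maximization assumption as before (the function name is ours):

```python
def dominance_rank(vectors):
    """Dominance rank: 0 means non-dominated; larger values mean the point is
    dominated by more entries (every criterion assumed to be maximized)."""
    def dom(a, b):
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))
    return [sum(dom(v, u) for v in vectors) for u in vectors]

# (3,3) dominates everything; (1,1) is dominated by the other three points.
print(dominance_rank([(3, 3), (2, 1), (1, 2), (1, 1)]))  # [0, 1, 1, 3]
```

Unlike dominance depth, two points can share depth yet receive different dominance ranks, which may yield a finer ordering.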

Conflicts of Interest:
The authors declare no conflict of interest.