Science Evaluation in the Czech Republic: The Case of Universities

In this paper, we review the current official methodology of scientific research output evaluation in the Czech Republic and present a case study on twenty-one Czech public universities. We analyze the results of four successive official research assessment reports from 2008 to 2011 and draw the following main conclusions: (a) the overall research production of the universities more than doubled in the period under investigation, with virtually all universities increasing their absolute research output each year, (b) the total research production growth is slowing down and (c) Charles University in Prague is still the top research university in the Czech Republic in both absolute and relative terms, but its relative share in the total research performance is decreasing in favor of some smaller universities. We also show that the rankings of universities based on the current methodology are quite strongly correlated with established indicators of scientific productivity. This is the first time ever that the official present-day Czech science policy and evaluation methodology along with the results for the Czech university system has been communicated to the international public.


Introduction and Related Work
The evaluation of scientific research output has become crucial in recent years, as the budgets of science funding bodies (governments, foundations, etc.) have become tight, while the need for research and innovation has persisted or even grown. It has therefore become clear that it is necessary to identify high-quality research that should be prioritized in receiving funding, as well as poor-quality research whose funding is no longer effective. The key concept here is to promote the advancement of science as efficiently as possible, i.e., to maximize the reward-to-effort ratio from the point of view of financing science. This is why many countries have introduced various research performance evaluation systems (especially for institutions), among them the well-known Research Assessment Exercise (RAE) in the United Kingdom and Excellence in Research for Australia (ERA). Science evaluation has also been a hot topic in the Czech Republic in recent years. The Czech government (or, more precisely, the Research, Development and Innovation Council, an advisory body to the government) published an official methodology of research output evaluation that has since changed several times within a few years. We will review the current methodology (from May 2011) in the following sections and show the results of the last four official research evaluation reports based on this methodology in the context of twenty-one Czech public universities. Although the official methodology should only serve as an input into the process of research budget creation, its application inevitably leads to university rankings, which are part of this paper's results section. (There are no official university rankings in the Czech Republic.)
The Czech Republic is little covered in the science and technology literature. Some of the few studies devoted exclusively to the Czech Republic include bibliometric analyses of Czech research publications [1], patents [2] or European framework program results [3]. Other scientometric studies usually observe the Czech Republic in the context of a larger group of (Central) European countries, e.g., [4] or [5]. As far as the official evaluation of scientific research output in the Czech Republic is concerned, it seems that the Czech research evaluation system is (almost) unknown to the rest of the world: neither [6] nor, more recently, [7] make an explicit mention of the Czech Republic in their comprehensive overviews of university research evaluation and funding systems in different countries. Country-specific research evaluation at the university level is currently a lively topic for scientometricians, as is well documented by recent studies on Colombian [8], Spanish [9], Chinese [10], South African [11] or Taiwanese [12] universities. Many papers (e.g., [13][14][15][16]) are also concerned with the use of peer review and bibliometric indicators in national university research evaluation and funding systems and argue why the former or the latter approach is better, but this is not the intent of this article. We solely present the currently used research evaluation methodology and a case study for universities.

Data and Methods
In this study, we concentrated on a set of twenty-one public universities (see Table 1) run by the Ministry of Education, Youth and Sports of the Czech Republic and, in one case, by the Ministry of Defense of the Czech Republic (University of Defense). These universities are also the most highly ranked in the 2011 Research Evaluation Report (the most recent evaluation). Other public universities in the Czech Republic do not conduct research in the fields of science and technology (such as colleges of arts or police academies) and were therefore excluded from this study.

Scores
The official methodology for the evaluation of research output has been slightly modified a few times since 2008, the first year in a series of successive comparable research evaluation reports. (There were research evaluation methodologies and reports before 2008, but they differed from the current methodology to such an extent that it would make no sense to compare those evaluations to the current ones. For instance, the earlier reports only considered research results related to completed grant projects; in contrast, the current methodology considers all results.) In the following sections, we will present a short summary of the current methodology (available in Czech [17]) defined by the Czech government in May 2011.

In general, the methodology is based on assessing scientific production, i.e., it counts publications and other research results produced, and only indirectly (in some cases) on assessing the quality of research output. No citations are counted, but, in the case of journal articles, the journal impact factor is taken into account, which is a de facto cheap estimate of potential citation counts. In this methodology, all research results yielded in the five years preceding the evaluation year are assigned the scores shown in Table 2. For instance, all journal articles indexed in the Web of Science (WoS) database by Thomson Reuters that were published in journals with a nonzero impact factor in the Journal Citation Reports (JCR, edited in the publication year) from 2006 to 2010 will be assigned a score between ten and 305 in the 2011 Evaluation. The score is computed according to the following formula:

J_imp = 10 + 295 × (1 − N) / (1 + N/0.057),

where N is the normalized journal rank obtained from JCR when the journals in its category are sorted by their impact factor (IF) in descending order: N = (P − 1)/(P_max − 1), where P is the journal rank and P_max is the number of journals in the category. If the journal belongs to two or more categories, N is the average normalized rank over all categories. However, there are two cases in which this formula is not needed: if an article is published in one of the prestigious multidisciplinary journals, Nature or Science, it is assigned a score of 500 without any computation. Articles published in refereed journals without an IF (J_noimp) can also receive scores, provided they are indexed in the well-known databases Scopus and/or ERIH (European Reference Index for the Humanities, categories A, B, C). For Scopus, there is a unique score of twelve, whereas for ERIH, there is a distinct score for each journal category; in addition, articles in journals on "nation-specific" topics, such as history or linguistics, have more weight than articles in other journals. There is also a category for articles that appear in Czech refereed journals (J_ref), whose list is predefined and which can also be classified into a "national" fields subcategory and an other fields subcategory. If a journal article happens to belong to two or more categories (or subcategories), the highest possible score is assigned to that article.

Books (B) are rewarded with scores of forty or twenty, depending on the publication language (English, Chinese, French, German, Russian and Spanish are considered "world" languages) and the scientific field. Book chapters receive scores proportional to the score of the entire book, based on the chapter's extent within the book. The last result category in basic research is conference proceedings papers (D) indexed in WoS, which score eight points each. In addition, any of the above results whose presence in WoS is required must be one of the following document types: article, review, proceedings paper or letter.

The other result categories in Table 2 comprise applied research results, such as patents (P); pilot plants, certified technologies, varieties and breeds (Z); utility models and industrial designs (F); prototypes and functional samples (G); results implemented by the funding body (H, e.g., results implemented in legal documents); certified methodologies and procedures and specialized maps (N); software (R); and research reports with confidential information (V). The highest score here (500) can be assigned to a patent granted by the European Patent Office or by the US or Japanese patent offices. The second highest score (200) is achieved by a national patent (granted by a patent office other than the three above), provided the patent is commercially exploited under a valid license. All other patents receive a unified score of forty. The other applied research results equally obtain forty points each, except for categories Z (100 points) and V (fifty points). The result categories H and N are further split into subcategories (with the same score) whose descriptions are not shown in Table 2.
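The journal-article scoring can be sketched in a few lines of code. The closed form below, J_imp = 10 + 295 × (1 − N)/(1 + N/0.057), is our reconstruction, chosen to be consistent with the stated bounds of ten points (worst normalized rank, N = 1) and 305 points (best rank, N = 0); the function names are ours:

```python
def normalized_rank(P, P_max):
    """N = (P - 1) / (P_max - 1): 0 for the top-ranked journal, 1 for the last one."""
    return (P - 1) / (P_max - 1)

def jimp_score(N):
    """Journal-article score, assuming J_imp = 10 + 295*(1 - N)/(1 + N/0.057)."""
    return 10 + 295 * (1 - N) / (1 + N / 0.057)

# The best-ranked journal in a 200-journal JCR category scores the maximum:
print(jimp_score(normalized_rank(1, 200)))    # 305.0
# ... and the worst-ranked journal scores the minimum of ten points:
print(jimp_score(normalized_rank(200, 200)))  # 10.0
```

A journal listed in several JCR categories would first average its normalized ranks over those categories before the score is computed.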

Renormalization
The scores in Table 2 are given for a full research result; they are further distributed to individual universities (or, more generally, to research institutes) according to their share in the result. In principle, outputs are fractionally allocated to universities based on their share of authors. However, domestic and foreign affiliations are weighted differently. Finally, the current methodology employs a score renormalization process, whose goals are the following: (a) to prevent excessive growth of results whose existence and quality are difficult to verify, (b) to retain the funding proportion between basic and applied research and (c) to retain the funding proportion among various disciplinary research areas. The renormalization steps must be taken exactly in the following order:

(a) Reduction of excessive growth (above 15% year over year) of results of a certain type. Let X_2009 be the total score of results of type X yielded in 2009 and X_2010 the total score of results of type X yielded in 2010. If X_2010/X_2009 > 1.15, then the scores of all results of type X from 2010 shall be multiplied by the factor c_x = 1.15 (X_2009/X_2010). This step does not concern J_imp results.

(b) Correction of the proportion between basic and applied research results to eighty-five to fifteen. Let SB = J + B + D be the total score of basic research results and SA = P + Z + F + G + H + N + R + V the total score of applied research results. (Previous methodologies also included the result categories C in basic research and L, S and T in applied research.) Let a_85 = 0.85 (SB + SA)/SB be the correction factor for basic research results and a_15 = 0.15 (SB + SA)/SA the correction factor for applied research results. Then, all results of categories J, B and D shall be multiplied by the factor a_85, and all results of categories P, Z, F, G, H, N, R and V by the factor a_15.

(c) Setting of the proportion among the various disciplinary research areas. Let a_x = p_x (SB + SA)/X be the correction factor of research area X, where SB and SA are defined above, X is the total score of results in research area X after the corrections described in the two previous steps, and p_x is the (desired) research area share from Table 3. The results in each research area shall be multiplied by the corresponding correction factor.
The final scores achieved by universities after renormalization are used by the Czech government in the creation of the budget for the support of research institutions.Officially, the scores are not used to rank research institutions in any way.
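The three renormalization steps can be sketched as follows; all score totals below are invented for illustration only, and the variable names mirror the symbols of the methodology:

```python
# (a) Cap year-over-year growth of a result type at 15% (J_imp is exempt):
x_2009, x_2010 = 100.0, 140.0  # hypothetical totals of one result type
c_x = 1.15 * x_2009 / x_2010 if x_2010 / x_2009 > 1.15 else 1.0
capped_2010 = x_2010 * c_x     # the 2010 total is cut back to 15% growth

# (b) Restore the 85:15 proportion between basic and applied research:
SB, SA = 700.0, 300.0          # hypothetical basic (J+B+D) and applied totals
a_85 = 0.85 * (SB + SA) / SB   # correction factor for categories J, B, D
a_15 = 0.15 * (SB + SA) / SA   # correction factor for P, Z, F, G, H, N, R, V
# a_85*SB : a_15*SA is now exactly 85:15, and the grand total is preserved.

# (c) Scale every research area X to its desired share p_x from Table 3:
areas = {"X": 400.0, "Y": 600.0}   # hypothetical per-area totals after (a), (b)
shares = {"X": 0.5, "Y": 0.5}      # hypothetical desired shares p_x
total = sum(areas.values())
a_x = {k: shares[k] * total / areas[k] for k in areas}  # a_x = p_x(SB+SA)/X
corrected = {k: areas[k] * a_x[k] for k in areas}       # each area -> p_x*total
```

Note that each step rescales scores multiplicatively, so the order of the steps matters: step (c) operates on the totals already corrected by (a) and (b).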

Results and Discussion
From 2008 to 2011, the universities under investigation more than doubled their overall research output, achieving total scores of 0.73, 1.20, 1.56 and 1.75 million points in the respective years (see Table 4). Thus, there is an increase of 140% in scientific productivity between 2008 and 2011. This can be documented by the year-by-year growth in 2009, 2010 and 2011, which is 65%, 30% and 12%, respectively. Therefore, research productivity is still growing, but the growth is slowing down. As far as the absolute scores of the individual universities are concerned, all but two of the universities managed to increase their research output compared to the previous year, sometimes quite remarkably, e.g., Hradec by 131% in 2009 and by 114% in 2010 or Ostrava by 101% in 2009, and other times only modestly, e.g., MU by 3% in 2011, VŠCHT by 5% in 2010 or Charles University (UK) by 5% in 2011. The only exceptions to the "ever-growing" research productivity are VŠE, dropping by 6% in 2011, and the University of Defense (UO), which declined by 2% in 2011. Note, however, that because of some methodological changes in the research assessment between 2008 and 2011, a 100% score growth does not necessarily mean a twofold increase in productivity. Now, let us have a look at how the relative shares of the universities in the overall research output (produced by the twenty-one public science and technology universities) changed between 2008 and 2011.
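As a quick check, the growth percentages above follow directly from the total scores in Table 4 (computing from the rounded million-point totals gives 64% rather than the reported 65% for 2009, which is presumably based on unrounded scores):

```python
# Total university scores in million points, 2008-2011 (Table 4).
totals = [0.73, 1.20, 1.56, 1.75]

# Overall growth between 2008 and 2011:
overall = (totals[-1] / totals[0] - 1) * 100
print(round(overall))  # 140

# Year-over-year growth for 2009, 2010 and 2011:
yoy = [round((b / a - 1) * 100) for a, b in zip(totals, totals[1:])]
print(yoy)  # [64, 30, 12]
```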
In Figure 1, we can see that Charles University (UK) was the leading institute, with 34% in 2008, followed by ČVUT and MU (the other "big" universities), with 12% and 11%, respectively. In 2011, the top three universities remained the same, but UK's share dropped by five percentage points (see the bottom chart in Figure 1). On the other hand, some "small" universities managed to raise their shares, e.g., Olomouc, Budějovice or Plzeň. In Figure 2, the pie charts are quite similar, even though they are based on the number of publications indexed in Web of Science in 2003-2007 (for 2008) and in 2006-2010 (for 2011) that were affiliated with the Czech universities under study. (The publication counts were retrieved in April 2013 using the "Organization-Enhanced" advanced search feature, including all document types from the five main citation databases of the Web of Science by Thomson Reuters.)

The difference between the absolute and relative research output can be seen by comparing the two charts in Figure 3. In the top chart, all universities improve their absolute research performance (except VŠE and UO in 2011), but in the bottom chart, only some of them increase their relative research output, while that of the others declines. In relative terms, Charles University (UK) is still the top research university, but its lead is diminishing; the other big universities (ČVUT and MU) stagnate, and the small universities are catching up (the trend is definitely positive for Olomouc and Budějovice). As for the rankings themselves, they are very highly correlated, with Spearman's rho varying from 0.961 between 2008 and 2011 to 0.992 between 2008 and 2009 (both statistically significant at the 0.01 level, two-tailed). However, let us underline again that the scores we are comparing here are not officially meant to be used to create university rankings; they are merely an input into the process of research budget creation in the Czech Republic.

As for the scientific production of Czech universities as measured by their publication counts in Web of Science in the five years preceding the census years, let us have a look at Figure 4. The growth of absolute publication output is still quite evident (see the top chart), and so is (to a smaller extent) the relative production increase of some smaller universities (see the bottom chart). Furthermore, the relative decline of Charles University (UK) is less steep. Nevertheless, the rankings of universities based on the methodology described in this paper and those grounded in the productivity indicators from Web of Science in a particular year are very highly positively correlated, with Spearman's correlation coefficients between 0.884 in 2008 and 0.935 in 2011 (always significant at the 0.01 level, two-tailed). For complete information on WoS-indexed publication output, see Table 5, which shows that productivity increased by about 49% between 2008 and 2011 and grew by only 13% in the last year.
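The Spearman rank correlations quoted above can be reproduced without any statistics library, since for rankings without ties rho = 1 − 6 Σd² / (n(n² − 1)). A minimal sketch with hypothetical five-university rankings (not the actual data):

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for two rankings without ties:
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical ranks of five universities in two evaluation years:
print(spearman_rho([1, 2, 3, 4, 5], [1, 3, 2, 4, 5]))  # 0.9
```

With tied ranks (which can occur with equal scores), the general formula based on rank covariances, e.g., `scipy.stats.spearmanr`, should be used instead.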

Conclusions and Future Work
The evaluation of scientific research output at the level of institutions has become extremely important in recent years, due to the increasing effort of national governments (and other research funding bodies) to support research, development and innovation as efficiently as possible. In this study, we concentrate on the science evaluation policy in the Czech Republic (which is hardly known in the science and technology literature) and present the results of the most recent official assessments (2008-2011) of the research output of twenty-one Czech public universities. The key findings are the following:
− The overall research output of the universities under study more than doubled from 2008 to 2011, with virtually all universities increasing their absolute research production each year.
− The production growth seems to be slowing down.
− Charles University in Prague is still the leading research university in both absolute and relative terms, but its relative share in the total research production is decreasing in favor of smaller universities.
In addition, we have shown that although the current evaluation methodology places some emphasis on applied research, the rankings of universities that can be generated from these assessment reports are very strongly correlated with rankings based on publication counts from Web of Science. Even though the total production increase between 2008 and 2011 was 140% based on the official methodology and only 49% based on Web of Science publication data, the trends in university research output remained similar. The difference in the overall production growth may be caused by the official methodology also taking into account non-WoS publications and applied research results, such as patents or prototypes, as well as by the way the points for research results are normalized and distributed to individual institutions in the national assessment. In spite of this, university rankings grounded in Web of Science publication data seem to be a good approximation of the national assessment results. However, there are no official university rankings in the Czech Republic, and even the results of the annual research evaluations are only used to help allocate research funds. Therefore, the rankings presented in this article should be considered "unofficial", even though they are based on an analysis of official and publicly available data. In our future work, we would like to focus on the updates and modifications of the official science assessment methodology, on other types of research institutions, such as the institutes of the Academy of Sciences of the Czech Republic, and on a comparison of the research evaluation systems and university performance in Central European countries.

Figure 2. Relative university publication output in 2008 and 2011 by Web of Science (WoS).

Figure 4. Absolute and relative university publication output in 2008-2011 by WoS.

Table 2. Research result categories and their scores. ERIH, European Reference Index for the Humanities. EPO, European Patent Office.

Table 3. Disciplinary areas and their desired shares.

Table 4. Absolute and relative university scores in 2008-2011.