Article

Differences in Citation Patterns across Areas, Article Types and Age Groups of Researchers

Department of Informatics, Universidad Técnica Federico Santa María, Valparaíso 2340000, Chile
Publications 2021, 9(4), 47; https://doi.org/10.3390/publications9040047
Submission received: 16 August 2021 / Revised: 28 September 2021 / Accepted: 12 October 2021 / Published: 19 October 2021

Abstract

The evaluation of research proposals and academic careers is subject to indicators of scientific productivity. Citations are critical signs of impact for researchers, and many indicators are based on these data. The literature shows that there are differences in citation patterns between areas. The scope and depth of these differences motivate the extension of these studies to consider types of articles and age groups of researchers. In this work, we conducted an exploratory study to elucidate what evidence there is about the existence of these differences in citation patterns. To perform this study, we collected historical data from Scopus. Analyzing these data, we evaluated whether there are measurable differences in citation patterns. This study shows that there are evident differences in citation patterns between areas, types of publications, and age groups of researchers that may be relevant when carrying out researchers' academic evaluation.

1. Introduction

A seminal contribution to the measurement of scientific impact is the one proposed by Eugene Garfield, who noted that the best way to follow the life cycle of a scientific article is through its citations. He claimed that an article relevant to a community is frequently cited, so by tracking its citations we can measure its impact [1]. Garfield extended the notion of the impact of an article to the impact of a journal, introducing the journal impact factor (IF). Despite their wide use in research evaluation, citation-based metrics have drawn criticism and have limitations. Moed highlighted the importance of using more complex measures that consider differences in citation patterns and dynamics between disciplines [2]. Using a single index that does not consider the specificities of each discipline produces distortions at different levels of research evaluation, affecting projects, people, and even institutions [3].
When evaluating individuals, citation-based indices at the researcher level take on an even greater role [4]. Hirsch [5] introduced the H-index: an H-index equal to h indicates that the researcher authored h papers with at least h citations each. The H-index avoids the evaluation of long lists of uncited articles, determining which papers have had an impact on the scientific community. However, since citation patterns differ between disciplines, in some disciplines the evolution of citations is much slower than in others [6]. Accordingly, to make a fair comparison, citation-based indices such as the H-index [5] should be observed over very long time windows. These delays in citation patterns discourage their use in the evaluation of young researchers.
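To make the definition concrete, the H-index can be computed directly from an author's per-paper citation counts. The following is a minimal sketch (the function name `h_index` is ours, not from the cited work):

```python
def h_index(citations):
    """H-index: the largest h such that the author has h papers
    with at least h citations each (Hirsch's definition)."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the rank-th paper still has >= rank citations
        else:
            break
    return h

# An author with papers cited [10, 8, 5, 4, 3] times has H-index 4:
# four papers have at least four citations each.
print(h_index([10, 8, 5, 4, 3]))  # 4
```

Note how the fifth paper (3 citations) does not raise the index, which is exactly why long lists of rarely cited papers do not inflate it.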
One of the crucial points in the criticism of scientific impact indicators is whether they have predictive capacity [7]. Robien [8] showed that several of the most relevant works in Chemistry did not come from the journals with the highest impact factor. The same observation was made in entrepreneurship journals [9]. Finardi [10] also observed something similar in a study that covered more disciplines. Vanclay [11] showed that the IF is based on a sample that may represent half the whole-of-life citations to some journals and only a small fraction of the citations accruing to others. Since citation patterns across disciplines have different dynamics, the IF negatively impacts disciplines with slow dynamics. Other factors, such as self-citations, have also been shown to produce abrupt changes in the IF of some journals [12]. In addition, journals that publish reviews tend to be highly cited; in contrast, journals that publish on very specific topics often have low citation impact [13].
Variants of the IF or H-index have been proposed to overcome their limitations. For example, Braun et al. [14] proposed weighting the effect of citations in the H-index computation by incorporating the journal's IF, giving high weight to citations that come from journals with greater impact. Balaban [15] proposed the opposite, giving high weight to citations that come from low-IF journals, since there would be greater merit in receiving such citations. Bergstrom [16] introduced the eigenfactor score, which equates prestige with the stationary probability of a random walk on a citation graph. An extension of the eigenfactor is the Article Influence Score (AIS), which calculates the average impact of the articles in a given journal, measuring impact in terms of eigenfactors. To infer prestige agnostically of the scale of a community, González-Pereira et al. [17] introduced the SCImago Journal Rank (SJR) indicator, which ranks journals based on citation weighting schemes and eigenvector centrality. Moed [18] introduced SNIP (Source Normalized Impact per Paper), the ratio of a journal's citation impact to the citation potential of its subject field. Liu and Fang [19] examined differences between citation-based indicators, comparing the IF computed over a 2-year window (IF2) with the IF computed over five years (IF5). Their research shows that IF2 exhibits much larger changes in terms of quartiles than IF5. IF5 was also found to correlate strongly with AIS, having better predictive ability of paper influence than IF2.
Whatever the metric used for evaluation, it is essential to obtain citations [20]. A recognized track record is a crucial element for accessing competitive funds and promoting scholars [21]. The scientific environment is becoming increasingly competitive. In this context, delays in publication, measured as the time elapsed between review, acceptance, and definitive publication, affect research evaluation [22].
We introduce a citation-based study that elucidates differences between areas, types of publications, and researchers' age groups. This article expands the scope of previous studies, measuring the cross effect of citation patterns between areas and types of publications. In addition, we study differences in citation patterns in terms of age groups of researchers. By extending the domain of analysis, we find new citation patterns that have not been clearly evidenced in related work. Our primary focus is to study the evolution of citation patterns over a wide observation window on a massive number of citations recorded in different areas. We pay attention to differences between areas and publication types, focusing on finding evidence about the relative importance of journals and conference proceedings between classical areas of knowledge and newer areas, such as Computer Science. As we will show in Section 2, related work shows the existence of these differences, but in collections that compare few areas with each other or in studies with smaller observation windows. Our study covers a wide range of knowledge areas, analyzing the evolution of citation patterns with data spanning almost five decades. We also pay attention to differences across age groups of researchers, which are less explored in related work. To achieve this goal, we conducted a data collection process in Scopus, including different types of publications by researchers from a broad spectrum of areas. The study elucidates the existence of differences in citation patterns between areas, conditioned by the type of media considered for publication. We show that these patterns differ clearly according to the age of the researchers. The conclusions of the study indicate the need for indicators that consider different publication media and incorporate the age factor when evaluating the scientific outputs of a researcher.
The contributions of this article are:
We study differences in citation patterns considering a large volume of data that covers different areas, types of publications, and age groups of researchers.
We detect differences in citation patterns conditioned on areas and types of publications.
We detect differences in citation patterns between areas conditioned on age groups of researchers.
The article is organized as follows. Related work is discussed in Section 2. In Section 3, we present the materials and methods used in this study. Section 4 presents differences in citation patterns between areas and types of publications. In Section 5, we study differences in citation patterns across areas and age groups of researchers. In Section 6, we discuss the results, implications, and limitations of this study. Finally, Section 7 highlights findings and future work.

2. Related Work

Lariviere et al. [23] evaluated the concentration of papers and funding resources at the researcher level, showing the existence of differences between areas. The study highlights a higher concentration of research funds, publications, and citations in the Social Sciences and Humanities than in other fields. Nederhof et al. [24] found that publications in Space-Life Physical Sciences are initially cited much less than similar ground-based research, showing that the use of a short citation window (1–4 years) is detrimental in impact assessment. The study states that bibliometric indicators should be calibrated to provide optimal monitoring of impact. Prins et al. [25] found that some areas have a greater diversity of media in which they produce scientific outputs. In these areas, the use of Google Scholar could be beneficial to improve the coverage of publications, offering advantages over databases such as WoS, which cover a lower diversity of publication media.
Studies of differences in citation patterns according to type of publication have strongly focused on measuring the impact of conferences in Computer Science. Freyne et al. [26] show that Computer Science's emphasis on publishing in conference proceedings hurts research evaluation when this evaluation is based on WoS indices. Vrettas and Sanderson [27] corroborate that computer scientists value conferences as a publication venue more than any other area. Their analysis also shows that only a few conferences attract many citations, while the mean citation rate per paper is higher in journal articles than in conference papers. Glänzel et al. [28] show that conference proceedings are a valuable source of information for research evaluation in Computer Science. Furthermore, the study shows interesting international collaboration patterns triggered by these types of events that are not observed in other areas. Meho [29] assesses the quality of Computer Science conferences by matching the CiteScore ranges of top-quartile journals. The study shows that top conferences make up 30% of top-quartile publications. Thelwall [30] compares Scopus citations with Mendeley reader counts for conference papers and journal articles in eleven computing subfields, showing high correlations between both counts. Kochetkov et al. [31] apply a methodology similar to that of SJR to rank conferences in Computer Science. The study shows that the top conferences have SJR values similar to those reached by the top journals in the area. Li et al. [32] study the impact of conference rankings in Computer Science using regression analysis. The study determines how these rankings have generated changes in the citation patterns of Australia and China, showing that researchers align themselves with the rankings used in their country, modifying their publication patterns to fit these rankings better. However, Qian et al. [33] show that the relative status of journals and conferences varies significantly between different areas of Computer Science. Recently, Yang and Qi [34] showed that book publications are more relevant than conference publications in most areas. They also show that productivity indicators based on citations fail to capture the relevance of these types of publications.
The skewness of citation distributions between areas has also received much attention from the community. Ruiz-Castillo and Costas [35] study the skewness of citations per area, comparing 27 different areas. The study shows that skewness is very similar across areas and that only a small fraction of the authors in each area are responsible for most of its production; the most prolific authors dominate citation skewness. Radicchi et al. [36] study how the probability of an article being cited c times varies between areas. The study shows that by rescaling the distributions by the average number of citations per article, a universal, highly skewed distribution of citations emerges. Based on this result, the study introduces a variant of the H-index that allows scientists from different areas to be compared. Albarrán et al. [37] show that citation distributions based on 5-year observation windows are highly skewed. After rescaling these distributions, the level of skewness across areas is quite similar, showing that the 10% most cited articles concentrate more than 40% of the total citations. Reproducing the study at the level of subfields, Crespo et al. [38] show that the differences between areas increase: at a finer granularity, more differences are detected between areas in terms of citation distributions. Bensman et al. [39] study the mean citation rate per article in Mathematics. The study shows that the citation rate is highly skewed, and that many of the publications explaining this phenomenon owe their citations to reviews. The study also analyzes the bibliometric structure of the area, detecting a weak central core of journals, and hypothesizes that these differences from other areas would explain why citation-based methods do not apply well to this particular area.

3. Materials and Methods

To build the data repository used for this study, we accessed information provided by Scopus. Scopus provides access to researcher profiles, in which lists of publications are displayed. To obtain a significant number of researcher profiles, we built a profile crawler using the rscopus library. This library provides functionality for using the Scopus API, from which the data can be accessed. Our crawler was built on top of rscopus, which invokes the Scopus API to get the data. The crawler uses a set of seed profiles from which it retrieves their collaborators, which in turn are used as new seeds. Iteratively, the crawler builds a repository of researcher profiles along with their publication lists.
Our crawler stores the profiles in a relational database, which it queries to check whether a profile is already stored; if not, it indexes it. It does the same with publications, avoiding duplicates. Scopus tags publishing media with knowledge areas, and an imputation method propagates this classification of areas to the researcher level. Our crawler downloaded all of these data.
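The crawling strategy described above amounts to a breadth-first traversal of the co-authorship graph. A minimal sketch of the idea (our actual crawler uses rscopus against the Scopus API; `fetch_coauthors` below is a hypothetical stand-in for that API call, and the deduplication against the relational database is reduced here to an in-memory set):

```python
from collections import deque

def crawl(seeds, fetch_coauthors, max_profiles=1000):
    """Breadth-first crawl of researcher profiles starting from seeds.

    fetch_coauthors(author_id) -> list of collaborator ids
    (hypothetical stand-in for the Scopus API call).
    """
    seen = set(seeds)          # stands in for the "already stored?" DB check
    queue = deque(seeds)
    profiles = []
    while queue and len(profiles) < max_profiles:
        author = queue.popleft()
        profiles.append(author)  # here the real crawler also stores publications
        for coauthor in fetch_coauthors(author):
            if coauthor not in seen:
                seen.add(coauthor)
                queue.append(coauthor)
    return profiles
```

Because collaborators of each retrieved profile become new seeds, coverage expands outward from the initial highly cited authors of each area.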
Each author’s publication list includes descriptive fields such as the title, author names, venue name, DOI, publication date, and some descriptive keywords. An author ID provided by Scopus is stored for each author. The identifier of each publication is its Scopus ID. A key method used in this study allows us to retrieve the references of each publication; it establishes the relationship between the Scopus ID of the publication and that of each of its references. We use these data records to count citations between the papers in the repository.
We conducted the data collection process from a set of profiles for each of the disciplines considered in the taxonomy of areas maintained by Scopus. This taxonomy considers 27 areas of knowledge. For each of them, we fed the crawler with a list of top-cited authors. To determine the seeds of each area, we looked for the most reputable journals in the field according to the impact factor and, within them, for the most cited papers. From these works, we selected the ten most-cited authors. The crawler ran for about three months, collecting the profiles of 111,813 researchers. We stored a total of 4,504,779 citable document records in our repository. Our collection spans the years 1970 to 2018. A summary of our collection’s basic statistics is shown in Appendix A.
Each publication medium declares a list of related areas, which Scopus maps to its 27 areas of knowledge. Scopus then classifies authors into knowledge areas according to the journals and conferences in which they publish. These areas are attributed to researchers by accounting for the proportion of publications in each field; accordingly, each researcher is classified into the area in which they record the most publications. We discarded areas with too few authors, keeping those with at least 3000 authors in our repository. Accordingly, the dataset used for this study comprises data records for ten areas from 95,668 different authors, as shown in Table 1.
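This author-level classification reduces to a majority vote over the areas of an author's publications. A minimal sketch (the function name is ours, for illustration only):

```python
from collections import Counter

def classify_author(publication_areas):
    """Assign an author to the area in which they record the most
    publications (ties broken by first occurrence, as Counter does)."""
    return Counter(publication_areas).most_common(1)[0][0]

# An author with three Medicine papers, one Biochemistry paper, and
# one Chemistry paper is classified as a Medicine researcher.
areas = ["Medicine", "Biochemistry", "Medicine", "Medicine", "Chemistry"]
print(classify_author(areas))  # Medicine
```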
Scopus classifies the documents into 15 different types of publications. Four publication types (Conference Review, Report, Abstract Report, and Business Article) have too few documents in our dataset, totaling less than 600 documents. We discarded these publication types from the study. Accordingly, the resulting dataset has 4,102,168 documents classified into 11 types of publications. We show the number of documents per publication type in Table 2.
Citations are verified reference relationships between citable documents in our collection. This track of citations is a sample of the total citations in Scopus, since it corresponds to citations between documents in our repository. Our collection totals more than eight and a half million citations among citable documents. To focus on the main areas covered in our dataset, we discarded citations from or to areas not considered in this study. In total, we discarded 976,349 citations, focusing the analysis on the 7,534,371 citations recorded between documents of the ten most salient areas of our repository.

4. Differences in Citation Patterns across Areas and Publication Types

An essential factor in this study is to conduct an analysis that distinguishes between the types of media in which publications appear. We consider eleven types of publications, among which articles and conference papers are the ones with the most documents (see Table 2). Several of the other types are related to journals: review, letter, editorial, note, short survey, and erratum are publication types included within journals. We report them separately to distinguish the differences between these specific types and regular articles. Something similar happens with chapter and book, two related types, since a chapter is a publication type included within a book; we also report these separately.
The number of documents according to each type of publication varies between areas. Table 3 shows that Reviews and Letters are more frequent in Medicine than in other areas, while Chapters and Books are more frequent in Social Sciences. A significant difference is observed in Computer Science, where conference papers are much more frequent than in other areas, totaling more than 60% of the publications in that area.
Table 3 shows significant differences between areas. Articles are the majority in all areas except Computer Science, in which they cover 31% of the sample. On the other hand, conference papers cover 61% of Computer Science production, surpassing the coverage shown in other areas. Conference papers are also relevant in other areas, with 38% in Engineering and 19% in Physics. However, in the remaining areas their incidence is very low, at around 3% or even less.
Citations according to each type of publication vary between areas. Table 4 shows that Reviews, Letters, and Editorials receive more citations in Medicine than in other areas, while Chapters and Books are more cited in Social Sciences. This is explained by the fact that these types of publications are more frequent in those areas. Articles receive more citations than the other types of publications, which holds in all areas, even in Computer Science, where articles are not the majority.
Articles capture the majority of the citations recorded in our dataset. For example, in Mathematics, articles capture 95% of citations, while in Physics, they capture 92%. Something similar occurs in Biochemistry, and Agricultural and Biological Sciences. Psychology shows a similar pattern, with 89% of citations captured in articles. In Engineering, the incidence of citations in articles is 70%, showing that 25% of citations go to conference papers. This difference is most important in Computer Science. In this area, 51% of citations go to articles and 43% to conference papers.
Table 5 shows citations across areas. The quantities are shown as a fraction of the total number of citations produced by a given area. For instance, the entry 0.12 from Medicine to Biochemistry indicates that 12% of the citations produced by Medicine go to Biochemistry. Table 5 shows that citations within the same area are the majority. We show a high prevalence of within-area citations in red and a smaller preponderance of this pattern in blue. Second majorities, indicated with underlines, are essential to understanding which areas are strongly related to others. Medicine and Biochemistry are firmly related, with the volume of citations from Biochemistry to Medicine being higher than the other way around. Other strong relationships occur between Psychology and Medicine and between Agricultural and Biological Sciences and Psychology. The most substantial connection is from Social Sciences to Psychology: Social Sciences delivers 35% of its citations to publications in Psychology. Curiously, this relationship is not symmetric, since only 2% of Psychology citations go to Social Sciences.
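The fractions in Table 5 are obtained by normalizing each row of the raw area-to-area citation count matrix by its row total. A short sketch of this normalization (the counts below are invented for illustration, not the paper's data):

```python
import numpy as np

# Hypothetical raw citation counts between three areas
# (rows = citing area, columns = cited area).
areas = ["Medicine", "Biochemistry", "Psychology"]
counts = np.array([[80, 12,  8],
                   [30, 60, 10],
                   [25,  5, 70]], dtype=float)

# Dividing each row by its total gives the fraction of an area's
# citations that go to each target area, as reported in Table 5.
fractions = counts / counts.sum(axis=1, keepdims=True)
print(fractions.round(2))
```

With this invented matrix, the first row reads 0.80/0.12/0.08: 80% of Medicine's citations stay within Medicine and 12% go to Biochemistry, mirroring the way the table's entries are read.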
Table 6 shows the average number of citations per paper according to the type of publication and area. These results show that the expected number of citations per paper varies across areas. While the largest value is observed for Reviews in Physics, other types of publications show great relevance in other areas. For example, Letters have an expected number of citations per paper close to 4 in Chemistry, followed by Physics with 3.4; this type of publication shows much lower values in other areas. Other publication types with a high number of citations per document are Books in Chemistry and Engineering. The number of publications of this type is very low, as shown in Table 3; despite this, they capture a great deal of attention in these areas, reflected in a high average number of citations. The average number of citations per article varies between areas, reaching its highest value in Physics, with 4.2 average citations, followed by Biochemistry with 3.4. These numbers are very low in Agricultural and Biological Sciences, with only 0.29 citations per paper; in this area, the expected number of citations is higher for reviews. In general, the expected number of citations per paper in Agricultural and Biological Sciences is low, which is explained by the fact that many citations go to other areas.
Interestingly, in Computer Science, the expected number of citations for conference papers reaches only 0.72, less than half of the value for articles. This finding suggests that many conference papers receive very few citations, a fact that can be corroborated from the Zipfian distributions of citations per paper, as illustrated by Brzezinski [40] and Moreira et al. [41]. In this area, reviews also have a high average number of citations, with 3.32 citations per review. In general, the results show that the attention given to each type of publication strongly depends on the area: reviews receive more citations than other types of publications, and both conference papers and articles have very different indicators across areas. It is also striking that in Biochemistry the expected number of citations for conference papers is very high, at 1.65 per paper, the highest in the repository. This occurs even though there are very few conference papers in Biochemistry (just over 1% of the total number of Biochemistry papers; see Table 3), which indicates that although conference paper production is small, these papers attract much attention from the community.
Our repository records documents from 1970 to 2018. In this period, the volume of citations to articles has grown in all areas, as shown in Figure 1. However, the volume of citations differs greatly between areas: while Physics, Medicine, and Biochemistry show explosive growth in citations since the 1980s, growth in Social Sciences and in Agricultural and Biological Sciences has been slower. The differences in volume are noticeable; the gap between the most cited areas and the rest is one to two orders of magnitude (note that the figure uses a logarithmic scale). Figure 1 also shows a decrease in citation growth from 2013 onwards. This is due to the delay produced by citation dynamics: as the sample ended in 2018, citation growth is captured only up to five years earlier. This observation indicates that the life span of many papers is longer than five years on average, corroborating the recent finding of Liu and Fang [19].
Figure 2 shows that citations to conference papers have grown since the 1980s. This growth begins in Engineering and Computer Science. Other areas show slow growth in the volume of citations to conference papers. For example, Medicine started growth in the late 1980s, as did Biochemistry and Physics. Even later is the growth in this type of citation in Chemistry (the 1990s). In the other areas, this growth has been prolonged or, for all practical purposes, it is non-existent.
The growth of the curves in Figure 3 represents different phenomena. While in Engineering and Computer Science these dynamics represent a significant volume of the total citations, in the other areas these quantities are much lower. We observe this fact in Figure 3, where we show the fraction of citations to conference papers over the total number of citations per area and year. The figure clearly shows that the dynamics of citations to conference papers are part of a phenomenon specific to Engineering and Computer Science; in the other areas, this phenomenon is practically non-existent. The curves show slower growth in Engineering than in Computer Science, where citations to conference papers reached almost half of the total citations in 2010.
We analyze citation patterns in the top 1000 most cited papers per area. Figure 4 shows that most of these papers are articles. The figure also shows that some of these papers belong to other types of publications. For example, more than 200 highly cited papers in Medicine belong to other publication types. Something similar appears in areas such as Physics and Chemistry, explained by the high prevalence in these areas of citations to reviews and letters. Agricultural and Biological Sciences and Social Sciences also show highly cited papers in other types of publications, explained by the high prevalence of citations to letters, books, and chapters. The area with the fewest highly cited papers in other publication types is Mathematics, where more than 900 of the highly cited papers are articles. On the other hand, in Computer Science several highly cited papers are conference papers, reaching almost 400 of the top papers, the largest number in this category in the entire repository. Curiously, very few of the highly cited Engineering papers are conference papers, indicating that even though the number of citations to conference papers is growing, the most cited papers in Engineering continue to belong to journals.
The top 1000 papers per area span many years of the timeline. Figure 5 and Figure 6 show that in Medicine, many of the top-cited articles were published in the 1990s and 2000s, with a distribution very similar to that exhibited by Physics, Biochemistry, and Chemistry. Mathematics and Psychology papers span more years, ranging from the 1970s to the 2000s. Computer Science is the area in which the top papers are located within the shortest timeline: the bulk of these papers were published from the late 1990s until the early 2010s. This distribution is similar to Engineering, even though Engineering's earliest most-cited papers are more recent. Psychology shows two periods with many highly cited papers, around 1996 and 2000. The distributions of Social Sciences and of Agricultural and Biological Sciences are similar. Highly cited papers of other types, such as conference papers, letters, and reviews, are relevant in Medicine and, to a lesser extent, in Physics and Biochemistry. In Engineering, the impact of these papers is much lower. It is striking that these papers acquired somewhat more relevance in Social Sciences in the mid-2000s and, to a lesser extent, around the same time in Chemistry and in Agricultural and Biological Sciences. The area where conference papers have had the most impact is Computer Science: their relevance from 2005 onwards is significant, surpassing even articles. This trend is only observed in Computer Science, illustrating that this reality is unique to this area.
We also analyze whether there are differences between areas regarding papers’ life spans, measured by their activity in terms of citations. For this analysis, we counted the number of citations received by each paper in our repository since its year of publication. Our repository spans almost 50 years. In Figure 7, we show the average life span of all the papers in our repository. These results show that the average life span of a paper begins in the year in which it is published and reaches its greatest relevance in its third year. Then, its impact progressively wanes. Very few papers have a life span of more than 20 years, and as time progresses, the citations that most papers receive diminish until they are almost entirely extinct.
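The average life-span curve can be computed by aligning every paper's citation track on its publication year and averaging per age. A simplified sketch under our own data-layout assumptions (it averages only over papers that have a recorded entry at a given age, which is a simplification of the actual procedure):

```python
from collections import defaultdict

def mean_life_span(papers):
    """papers: list of (pub_year, {year: citations}) tuples.
    Returns {age: mean citations}, where age 0 is the publication year."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for pub_year, track in papers:
        for year, cites in track.items():
            age = year - pub_year  # align every paper on its publication year
            totals[age] += cites
            counts[age] += 1
    return {age: totals[age] / counts[age] for age in sorted(totals)}

# Two toy papers: one published in 2000, one in 2005.
papers = [(2000, {2000: 1, 2001: 3, 2002: 2}),
          (2005, {2005: 0, 2006: 5})]
print(mean_life_span(papers))  # {0: 0.5, 1: 4.0, 2: 2.0}
```

Aligning tracks by age rather than by calendar year is what makes papers published decades apart comparable in a single curve.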
To study the differences in life span according to citations between areas, we split the timeline into six five-year periods to compare papers published in close years in each plot. These results are shown in Figure 8, which shows the 20-year life spans of papers published in 6 different five-year periods, from the 1980s to the end of the 2000s.
Figure 8 shows some interesting dynamics. While various areas in the 1980s followed a life span similar to Figure 7, showing a performance peak in the third year, Psychology and Social Sciences show a much slower growth curve. This effect becomes more noticeable in the 1990s, where the curves of these two areas grow throughout the 20-year life span while the curves of the remaining areas begin to decline after the third year. This finding suggests that cited papers in Psychology and Social Sciences have a longer life span than in other areas. The difference is also noticeable in the 2000s: most areas show their citation peak in the third year, followed by a substantial decline, while Psychology and Social Sciences show slower growth but a longer life.

5. Differences in Citation Patterns across Areas and Age Groups of Researchers

We analyze the differences in citation patterns by comparing groups of researchers of different ages. For this purpose, we compute each researcher's academic birth, defined as the year in which the author published their first work. Authors span many years of the timeline, beginning in 1970. Figure 9 shows that many authors recorded their academic birth from the 1990s onwards, with most academic births located in the decade of the 2000s.
We separate the authors into three groups according to their academic birth: seniors, whose academic birth is before 2000; mid-age researchers, born between 2000 and 2009; and young researchers, born after 2009. The sizes of these groups do not differ greatly, which facilitates comparison between them: the senior group has 42,417 researchers (44%), the mid-age group 29,923 (31%), and the young group 23,321 (25%).
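A minimal sketch of this grouping, assuming each author is represented by their list of publication years (the paper does not specify its actual data structures):

```python
def academic_birth(pub_years):
    """Academic birth: the year of an author's first publication."""
    return min(pub_years)

def age_group(birth_year):
    """Study cohorts: senior (born before 2000),
    mid-age (2000-2009), young (after 2009)."""
    if birth_year < 2000:
        return "senior"
    if birth_year <= 2009:
        return "mid-age"
    return "young"
```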
For each of these groups, we compute authors’ life span, defined as the track of citations recorded over time. Each citation track records the total number of citations received by an author per year. We compute a life span of ten years for young researchers since these authors were academically born after 2009. The other two groups are analyzed using 20-year life spans.
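The citation track can be sketched as a fixed-horizon vector of yearly citation totals counted from the academic birth (a horizon of 10 years for young researchers, 20 for the other groups). The dict-based input layout is an assumption for illustration:

```python
def author_life_span(birth_year, citations_by_year, horizon):
    """Track of total citations per year of academic life.

    `citations_by_year`: {calendar_year: n_citations} for one author
    (hypothetical layout). Returns a list of length `horizon`, where
    index t holds the citations received in year t of academic life."""
    return [citations_by_year.get(birth_year + t, 0)
            for t in range(horizon)]
```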
Figure 10 and Figure 11 show the life span per area and age group. The solid line shows the mean life span, while the colored area indicates the standard deviation around the mean. The axes use the same scales within each age group to facilitate comparison between areas. The only area that could not be plotted at this scale is Physics, whose authors receive many more citations than those of the other areas.
Authors' life spans vary widely between areas and age groups. While the young and mid-age life spans decrease towards the end of the timeline, the seniors show growth towards year 20; this is attributable to the cut-off year in data acquisition. Some comparisons between groups are very interesting. Physics shows rapid growth in authors' life spans, with young authors averaging more than 20 citations by their third year of life. This effect is observed in the other age groups as well, with a peak in mean citations in the tenth year of life for mid-age researchers and, for seniors, sustained growth exceeding 50 citations on average by year 20. Physics is out of scale in this comparison because its citation volumes are much higher than those of the other areas. The remaining areas record comparable volumes of citations, which allows comparison on the same scale; the axes used in these figures are the same for each age group (except Physics).
Mathematics shows a strong fit to the mean, with a low standard deviation (see Figure 10). This effect is maintained in the three age groups, indicating that very few authors show rapid growth in citations in this area. It is also noteworthy that the mean life span in Mathematics is very low, which indicates that most authors accumulate few citations. This is due to two factors: first, the volume of citations in Mathematics is low (see Table 4); second, the average number of citations per paper is also very low (see Table 6). Figure 10 shows that the life spans of Medicine and Biochemistry are similar, which may be because these two areas have a high volume of cross-citations. The Computer Science standard deviation indicates that some authors receive many citations while others receive very few. In this area, the fit to the mean is stronger than in Medicine and Biochemistry, indicating that authors with rapid citation growth are scarcer than in those areas.
Figure 11 shows that Agricultural and Biological Sciences and Social Sciences also show a strong fit to the mean, with many authors receiving very few citations. The authors' low mean life span is explained by the low volume of citations in these areas (see Table 4) and by the fact that many of their citations go to other areas (see Table 5). In Agricultural and Biological Sciences, only 0.3868 of citations remain in the area, while in Social Sciences the fraction of within-area citations is 0.429. The other areas show a weaker fit to the mean. Engineering, Psychology, and Chemistry all show author tracks that separate from the mean. Chemistry shows more careers with a high volume of citations, consolidating in mid-age researchers around year 10 of the life span; in young researchers, the citation peak is observed at year 4. The mean number of citations in Engineering grows more slowly, indicating that many authors in this area receive few citations. Psychology has a steeper growth curve than Engineering, with a citation peak around year 10 for mid-age researchers.
A key question in citation-based research is whether citations indeed have predictive power. The premise underlying citation-based evaluation is that citations received in the past are a good indicator of future citations. A question for this study is whether citation data have the same predictive capacity in different areas. One possible scenario is that citation data have better predictive capability in some areas than in others. If so, we may conclude that evaluation based on citations is more informative in some areas than in others.
We analyze the correlation of citations in each age group. A strong correlation indicates that the volume of past citations is a good indicator of future citations. Following the guidelines discussed by Liu and Fang [19], we sum each author's citations over five-year periods. In the group of young researchers, we compute the correlation between the first and second five-year periods. For mid-age researchers, we measure the correlation between the second and third five-year periods of the life span. Finally, in the senior group, we measure the correlation between the third and fourth five-year periods. Spearman's correlation coefficients are shown in Table 7.
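A self-contained sketch of this computation: sum each author's track over five-year periods and rank-correlate consecutive periods. Spearman's coefficient is implemented directly here to keep the example dependency-free; in practice one would likely use `scipy.stats.spearmanr`.

```python
def five_year_sum(track, period):
    """Total citations in five-year period `period` (0-indexed)."""
    return sum(track[5 * period: 5 * (period + 1)])

def ranks(values):
    """Average ranks, with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

For the young group, for example, one would collect `five_year_sum(track, 0)` and `five_year_sum(track, 1)` across all authors in an area and correlate the two lists.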
Table 7 shows that the correlation increases as the authors become more experienced. For the senior group, all areas show a correlation coefficient close to 0.8. The strong correlation in this group indicates that these researchers' careers are already consolidated: authors with low productivity will have low productivity in the future, and authors with high productivity will maintain it. Academic assessment based on citations in this segment is thus very informative. The correlation decreases in the mid-age group in all areas. Strikingly, this is the segment with the lowest correlations, even lower than the young researchers. A plausible explanation is that many authors stop researching between the second and third five-year periods of the life span, which causes the correlation between past and future citations to decline. In young researchers, the correlation between the first and second five-year periods of the life span is higher than in the mid-age group. In some areas, the correlations are much lower than in others. For example, for young researchers in Mathematics, the correlation is only 0.31, the lowest in the entire study. This indicates that the first five years of a mathematician's academic life are very uninformative. In other areas, this period of the life span provides much more information: in Social Sciences and Chemistry, for example, the correlation exceeds 0.7. The reasons these areas reach this value differ. In Social Sciences, young researchers show slow growth (a cold-start phenomenon), which explains the high correlation; in Chemistry, the first years of activity show rapid growth, with a citation peak between years 4 and 6.
To analyze how sensitive the evaluation is to the year of the life span in which it is carried out, we repeated the previous analysis using sliding quinquenniums. If the evaluation is carried out in year i, the sliding five-year periods are set five years before and five years after i. We then measure the correlation between the sums of citations in the two five-year periods. In this experiment, we slide the evaluation year through the first ten years of the life span. The results are disaggregated by age group in Figure 12.
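The sliding-window variant only changes how the two sums are taken, anchored at an evaluation year counted from the academic birth. A minimal sketch, with names of our own choosing:

```python
def sliding_sums(track, eval_year, window=5):
    """Citation totals in the `window` years before and after
    `eval_year` (year indices counted from academic birth)."""
    past = sum(track[max(0, eval_year - window):eval_year])
    future = sum(track[eval_year:eval_year + window])
    return past, future
```

Correlating the resulting past/future pairs over all authors, for each evaluation year from 1 to 10, yields curves like those in Figure 12.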
Figure 12 shows that there are differences between areas. For young researchers, Physics, Biochemistry, Engineering, Psychology, and Medicine show that the correlation between past and future citations reaches its maximum between years 3 and 4. In Computer Science, the first two years of the life span are very uninformative for an evaluation, but in the third year the correlation rises to 0.6. Both Social Sciences and Agricultural and Biological Sciences need a longer time for a fair evaluation, requiring an observation window of at least four years for a young researcher; in Social Sciences, the fourth year is very informative, with a correlation of 0.8. Chemistry is the area in which evaluation times could be shortest, since high correlations between past and future citations are already observed from the second year. Mathematics is the most challenging area for young researchers, with the lowest correlation values between citations. For the mid-age and senior groups, all areas show an improvement in the correlation between past and future citations; in fact, in most areas, the correlation increases with the year of evaluation.

6. Discussion of Results

This study corroborates the intuition that there are notable differences in citation patterns between areas of knowledge. For example, the differences in citations to conference proceedings between Computer Science and the other areas are evident, as we showed in Section 4. Since conference proceedings are essential to Computer Science, representing a singularity with respect to other fields, we need to discuss why this occurs in Computer Science and not in other areas. Each area of knowledge has a different rationale for evaluating results. While the areas related to Social Sciences value background knowledge, grounding research in a solid connection with the area's theories and classical approaches, Computer Science places much more value on the novelty and relevance of a contribution in a given context. The contrast between these two valuation rationales, one anchored in theories tested and studied over decades and the other with a vital component of technological innovation, makes each peer-review model more suitable for certain purposes than others. While a conference proceeding offers fast responses and wide dissemination of results, the review process of a journal article can be slower, delaying the dissemination of results and findings. Although many journals have reduced their response times, this study shows that there is still high interest in Computer Science in publishing in conference proceedings. Furthermore, this study shows no signs of diminishing interest in these Computer Science publications.
While much of the effort in bibliometrics aims to establish indexes of scientific productivity with predictive capacity, we hypothesize that in Computer Science the problem is not the productivity index used but the indexed collection. As journal citation reports are based only on journal citations, the real impact of publications in Computer Science is highly distorted by the exclusion of publications in conference proceedings. Top-tier conference proceedings show high impact in Computer Science, and many citations are ruled out when researchers' careers are evaluated using journal impact indices alone. This observation indicates that the effort to develop a single bibliometric index governing all areas has been unsuccessful, introducing distortions in the analysis of scientific productivity. It seems to us that this approach should also be revised to incorporate the new challenges raised by the advancement of scientific research, which is strongly shaped by inter- and trans-disciplinary work.
This study covers a long observation period, spanning almost five decades, in which many publishing practices have changed. These almost five decades reveal some essential facts: the invariance of certain publication patterns, but also the evident changes others have undergone. The growing volume of publications registered in the last two decades shows a more active and numerous scientific community, with a growing interest in peer-reviewed publications. However, the sheer volume of publications poses new challenges to young researchers, who must review many papers to identify related work. The bibliometric attention given to some journals based on their impact indexes increases the visibility of high-impact journals, revealing a rich-get-richer phenomenon. Often, this phenomenon is reinforced by community practices or even by journal editors, who encourage citing journal articles as a primary source. As a consequence, a few journals concentrate the majority of citations, to the detriment of a large number of journals with poor visibility. This effect can harm new subspecialties, which publish their findings in new journals and therefore have less visibility. The increase in subspecialties has been evident mainly in Medicine and its sub-areas, leading to a massive volume of publications. Under the pressure to publish or perish, many researchers seek to publish quickly, often in emerging journals of their sub-disciplines that their peers do not consistently recognize. The low visibility of these specialized journals makes many of these publications less likely to be cited.
This study also shows evidence that the life span of some publications exceeds five years. As many bibliometric impact indicators use observation windows of less than five years to capture citations, the actual impact of a publication in a journal may be underestimated. This finding reinforces the idea that current citation windows are too short, underestimating the impact of publications in journals from areas with slower publishing dynamics. This study suggests that windows of at least five years are advisable to address this deficiency.
Limitations of this study. A study of this type, covering almost five decades of observation, has some limitations worth discussing. First, over nearly 50 years of observation, the study relies on citation-based measures that can be affected by the passage of time. For example, younger disciplines, such as Computer Science, would show a lower volume of citations when citation dynamics are studied over time, since their research communities are more incipient. Second, the areas are compared at different stages of maturity; some classic areas, such as the basic sciences, are more mature than Computer Science. This may affect the comparison between areas, so a study adjusted for maturity level would be more suitable for a longitudinal comparison like this one. The same observation applies to how researchers' life spans were calculated, since more consolidated areas could have a more stable career model than emerging areas. A comparison of groups of similar areas by maturity level might be more appropriate in this regard, which is left as future work. Third, as we discarded citations that came from areas not covered by the study, our study may underestimate the volume of citations in some areas. One way to address this limitation is to work with the complete Scopus data to characterize the total number of publications and citations recorded in a given observation period. Finally, constructing the dataset from established researchers can introduce a bias, since the snowball crawling conducted from these authors is mediated by the citation patterns of this type of author. Less established researchers are likely to follow different citation patterns, affecting the type of accounts retrieved by our Scopus account crawler.
Despite these limitations and criticisms, we believe that our sample represents a relevant and highly visible part of the Scopus data, which makes the study's conclusions valid.

7. Conclusions

We have introduced a citation-based study that elucidates differences between areas, types of publications, and researchers’ age groups. The main findings of this study are:
Articles are the majority in all areas except Computer Science, corroborating the finding of Vrettas and Sanderson [27]. Conference papers account for 61% of Computer Science production, surpassing the coverage shown in other areas. Articles capture the majority of citations in all areas.
Citations between areas show strong symmetric relationships between some specific areas (e.g., Biochemistry and Medicine) and robust asymmetric citation patterns from some areas to others (e.g., Social Sciences to Psychology). Asymmetric relationships reduce the citations that articles receive within their own area.
The average number of citations per paper varies significantly between areas and types of publications, corroborating previous studies [35]. The lowest values are seen in Social Sciences and Agricultural and Biological Sciences, contradicting the findings of Lariviere et al. [23]. Reviews attract many citations, with notable performance in Physics. Computer Science conference papers have a lower average number of citations than journal articles in the same area. Citations to conference papers are relevant only in Computer Science, and only in this area does an important fraction of the top-cited papers belong to this category.
The life span of a paper, measured from the track of citations it receives over time, reaches its maximum in the third year after publication. The life spans of Psychology and Social Sciences grow more slowly than those of the other areas, suggesting that these papers have a longer useful life.
Authors' life spans differ widely between areas. While most areas show a weak fit to the mean number of citations, this indicator is more informative in some specific areas (e.g., Mathematics). The predictive capacity of citations, measured from the productivity correlation between consecutive five-year periods, is very informative for seniors and less informative for mid-age researchers.
The evaluation of researchers based on citations has different informative levels depending on the year in which it is carried out. If the evaluation is done in the early years of academic life, it is more informative in some areas than in others. As researchers gain experience, citation-based evaluation becomes more informative in all areas.
This study has allowed us to identify critical differences between areas, publication types, and age groups of researchers that may be relevant when carrying out researchers' academic evaluations. Each area has its specificities, which must be taken into account when comparing scientific productivity. We expect the findings of this article to motivate further research and the development of more and better methodologies and metrics for evaluating scientific productivity, incorporating the differences in citation patterns between areas, publication types, and age groups of researchers.

Funding

Marcelo Mendoza acknowledges funding support from the Millennium Institute for Foundational Research on Data. Marcelo Mendoza was funded by the National Agency of Research and Development (ANID) grants Programa de Investigación Asociativa (PIA) AFB180002 and Fondo Nacional de Desarrollo Científico y Tecnológico (FONDECYT) 1200211.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

Table A1. Basic statistics of our repository. Under articles, we group regular papers, articles in press, surveys, reviews, letters, errata, reports, business articles, abstract reports, editorials, and notes included in any issue of an indexed journal. Under conference papers, we group conference reviews and papers included in indexed proceedings.
Statistics | Value
# citable documents | 4,504,779
# articles | 3,849,858
# conference papers | 595,122
# books and chapters | 59,799
# authors | 111,813
# citations | 8,510,720
# subjects | 27
Table A2. Distributions of authors, documents, and citations across Scopus subjects. Scopus covers 27 subject areas.
Scopus Subject | # Authors | # Documents | # Citations
Medicine | 34,974 | 1,931,976 | 3,367,647
Physics and Astronomy | 11,308 | 517,723 | 1,830,612
Computer Science | 9052 | 389,411 | 398,904
Biochemistry, Genetics and Molecular Biology | 8418 | 278,396 | 980,208
Mathematics | 7520 | 202,670 | 190,678
Engineering | 7367 | 326,294 | 256,990
Psychology | 6145 | 132,199 | 320,362
Agricultural and Biological Sciences | 3886 | 107,716 | 39,276
Social Sciences | 3691 | 36,614 | 30,053
Chemistry | 3307 | 179,808 | 440,192
Neuroscience | 2375 | 53,251 | 243,522
Materials Science | 2331 | 122,090 | 73,575
Pharmacology and Toxicology | 2076 | 45,877 | 80,618
Earth and Planetary Sciences | 1778 | 39,962 | 21,390
Business and Management | 1380 | 25,396 | 65,818
Economics | 1286 | 32,753 | 120,277
Environmental Science | 1132 | 32,638 | 26,989
Energy | 969 | 8129 | 1961
Immunology and Microbiology | 744 | 3195 | 2324
Nursing5425676860
Chemical Engineering | 465 | 18,366 | 10,682
Arts and Humanities | 373 | 6573 | 5873
Veterinary | 257 | 6875 | 1834
Multidisciplinary241359
Decision Sciences8029590
Health professions76142
Dentistry4062074

References

1. Garfield, E. Citation index for science: A new dimension in documentation through association of ideas. Science 1955, 122, 108–111.
2. Moed, H.F. The impact-factors debate: The ISI's uses and limits. Nature 2002, 415, 731–732.
3. Glanzel, W.; Moed, H.F. Journal impact measures in bibliometric research. Scientometrics 2002, 53, 171–193.
4. Waltman, L. A review of the literature on citation impact indicators. J. Inf. 2016, 10, 365–391.
5. Hirsch, J. An index to quantify an individual's scientific research output. Proc. Natl. Acad. Sci. USA 2005, 102, 16569–16572.
6. Finardi, U. On the time evolution of received citations in different scientific fields: An empirical study. J. Inf. 2014, 8, 13–24.
7. Smith, D.R. Impact factors, scientometrics and the history of citation-based research. Scientometrics 2012, 92, 419–427.
8. Robien, W. Do high-quality 13C-NMR spectral data really come from journals with high impact factors? TrAC Trends Anal. Chem. 2009, 28, 914–922.
9. Stewart, A.; Cotton, J. Making sense of entrepreneurship journals: Journal rankings and policy choices. Int. J. Entrep. Behav. Res. 2009, 19, 303–323.
10. Finardi, U. Correlation between journal impact factor and citation performance: An experimental study. J. Inf. 2013, 7, 357–370.
11. Vanclay, J.K. Bias in the journal impact factor. Scientometrics 2009, 78, 3–12.
12. Campanario, J.M. Large increases and decreases in journal impact factors in only one year: The effect of journal self-citations. J. Assoc. Inf. Sci. Technol. 2011, 62, 230–235.
13. Walters, W.H. Citation-based journal rankings: Key questions, metrics, and data sources. IEEE Access 2017, 5, 22036–22053.
14. Braun, T.; Glanzel, W.; Schubert, A. A Hirsch-type index for journals. Scientometrics 2006, 69, 169–173.
15. Balaban, A.T. Positive and negative aspects of citation indices and journal impact factors. Scientometrics 2012, 92, 241–247.
16. Bergstrom, C. Eigenfactor: Measuring the value and prestige of scholarly journals. Coll. Res. Libr. News 2007, 68, 314–316.
17. González-Pereira, B.; Bote, V.P.G.; de Moya Anegón, F. A new approach to the metric of journals' scientific prestige: The SJR indicator. J. Inf. 2010, 4, 379–391.
18. Moed, H.F. Measuring contextual citation impact of scientific journals. J. Inf. 2010, 4, 265–277.
19. Liu, X.; Fang, H. A comparison among citation-based journal indicators and their relative changes with time. J. Inf. 2020, 14, 1.
20. Pyke, G.H. Struggling scientists: Please cite our papers! Curr. Sci. 2013, 105, 1061–1066.
21. Liu, M.; Hu, X.; Wang, Y.; Shi, D. Survive or perish: Investigating the life cycle of academic journals from 1950 to 2013 using survival analysis methods. J. Inf. 2018, 12, 344–364.
22. Shi, D.; Rousseau, R.; Yang, L.; Li, J. A journal's impact factor is influenced by changes in publication delays of citing journals. J. Assoc. Inf. Sci. Technol. 2017, 68, 780–789.
23. Lariviere, V.; Macaluso, B.; Archambault, E.; Gingras, Y. Which scientific elites? On the concentration of research funds, publications and citations. Res. Eval. 2010, 19, 45–53.
24. Nederhof, A.; Van Leeuwen, T.; Clancy, P. Calibration of bibliometric indicators in space exploration research: A comparison of citation impact measurement of the space and ground-based life and physical sciences. Res. Eval. 2012, 21, 79–85.
25. Prins, A.A.; Costas, R.; van Leeuwen, T.N.; Wouters, P.F. Using Google Scholar in research evaluation of humanities and social science programs: A comparison with Web of Science data. Res. Eval. 2016, 25, 264–270.
26. Freyne, J.; Coyle, L.; Smyth, B.; Cunningham, P. Relative status of journal and conference publications in computer science. Commun. ACM 2010, 53, 124–132.
27. Vrettas, G.; Sanderson, M. Conferences versus journals in computer science. J. Assoc. Inf. Sci. Technol. 2015, 66, 2674–2684.
28. Glänzel, W.; Schlemmer, B.; Schubert, A.; Thijs, B. Proceedings literature as additional data source for bibliometric analysis. Scientometrics 2006, 68, 457–473.
29. Meho, L.I. Using Scopus's CiteScore for assessing the quality of computer science conferences. J. Inf. 2019, 13, 419–433.
30. Thelwall, M. Mendeley reader counts for US computer science conference papers and journal articles. Quant. Sci. Stud. 2020, 1, 347–359.
31. Kochetkov, D.M.; Birukou, A.; Ermolayeva, A. The importance of conference proceedings in research evaluation: A methodology for assessing conference impact. arXiv 2020, arXiv:2010.01540.
32. Li, X.; Rong, W.; Shi, H.; Tang, J.; Xiong, Z. The impact of conference ranking systems in computer science: A comparative regression analysis. Scientometrics 2018, 116, 879–907.
33. Qian, Y.; Rong, W.; Jiang, N.; Tang, J.; Xiong, Z. Citation regression analysis of computer science publications in different ranking categories and subfields. Scientometrics 2017, 110, 1351–1374.
34. Yang, S.; Qi, F. Multidisciplinary comparison of proceedings papers and academic books based on altmetrics and citation. In Proceedings of the iConference 2020, Boras, Sweden, 23–26 March 2020.
35. Ruiz-Castillo, J.; Costas, R. Individual and field citation distributions in 29 broad scientific fields. J. Inf. 2018, 12, 868–892.
36. Radicchi, F.; Fortunato, S.; Castellano, C. Universality of citation distributions: Toward an objective measure of scientific impact. Proc. Natl. Acad. Sci. USA 2008, 105, 17268–17272.
37. Albarrán, P.; Crespo, J.A.; Ortuño, I.; Ruiz-Castillo, J. The skewness of science in 219 sub-fields and a number of aggregates. Scientometrics 2011, 88, 385–397.
38. Crespo, J.A.; Herranz, N.; Li, Y.; Ruiz-Castillo, J. The effect on citation inequality of differences in citation practices at the Web of Science subject category level. J. Assoc. Inf. Sci. Technol. 2014, 65, 1244–1256.
39. Bensman, S.J.; Smolinsky, L.J.; Pudovkin, A.I. Mean citation rate per article in mathematics journals: Differences from the scientific model. J. Assoc. Inf. Sci. Technol. 2010, 61, 1440–1463.
40. Brzezinski, M. Power laws in citation distributions: Evidence from Scopus. Scientometrics 2015, 103, 213–228.
41. Moreira, J.; Zeng, X.; Nunes Amaral, L. The distribution of the asymptotic number of citations to sets of publications by a researcher or from an academic department are consistent with a discrete lognormal model. PLoS ONE 2015, 10, e0143108.
Figure 1. Number of citations to articles per year and area. Each point represents the number of citations recorded by documents published in that year that belong to the indicated area.
Figure 2. Number of citations to conference papers per year and area. Each point represents the number of citations recorded by documents published in that year that belong to the indicated area.
Figure 3. Fraction of citations to conference papers per year and area. Each point represents the fraction of citations recorded by documents published in that year that belong to the indicated area.
Figure 4. Most cited papers per area. Even though many highly cited papers are articles, there are highly cited papers in the category others. Computer Science records nearly 400 highly cited papers in the category others.
Figure 5. Most cited papers per area along time (Medicine, Physics, Comp. Sci., Biochemistry, and Mathematics).
Figure 6. Most cited papers per area over time (Engineering, Psychology, Agr. and Bio., Social Sciences, and Chemistry).
Figure 7. Life span of papers in our repository.
Figure 8. Life span of papers across areas. The figure shows the 20-year life spans of papers published in six different five-year periods, from the 1980s to the end of the 2000s.
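The life-span curves in Figures 7 and 8 aggregate citations by the number of years elapsed since publication. A minimal sketch of that aggregation, assuming a hypothetical per-citation record format (one citing year per citation) rather than the paper's actual Scopus pipeline:

```python
def life_span_curve(citation_years, pub_year, horizon=20):
    """Count how many citations a paper receives in each of the
    `horizon` years following its publication year.

    citation_years: iterable with one citing year per citation
    (a hypothetical input format, for illustration only).
    Returns a list where index i holds the citation count i years
    after publication; citations outside the window are ignored.
    """
    counts = [0] * horizon
    for year in citation_years:
        offset = year - pub_year
        if 0 <= offset < horizon:
            counts[offset] += 1
    return counts
```

Summing such curves over all papers published in a given five-year period yields per-period life-span curves like those plotted in Figure 8.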
Figure 9. The academic birth of the authors studied in this paper. Academic births are firmly located in the 1990s and 2000s.
Figure 10. Mean and standard deviation of the life span of authors per area and age group. Physics records many more citations than the other areas. While most areas show a weak fit to the mean, Mathematics has a very low standard deviation, which indicates that most of its authors concentrate few citations.
Figure 11. Mean and standard deviation of the life span of authors per area and age group. Chemistry records more citations than the other areas. Agricultural and Biological Sciences and Social Sciences show a strong fit to the mean, which indicates that most authors concentrate few citations.
Figure 12. Correlation analysis of citations using sliding quinquenniums. Young researchers can only be evaluated up to year 5, since their life span lasts only ten years. The evaluation year for mid-age and senior researchers spans the first ten years of the timeline.
Table 1. Distribution of authors across knowledge areas covered in this study.

Area | # Authors
Medicine | 34,974
Physics and Astronomy | 11,308
Computer Science | 9052
Biochemistry, Genetics and Molecular Biology | 8418
Mathematics | 7520
Engineering | 7367
Psychology | 6145
Agricultural and Biological Sciences | 3886
Social Sciences | 3691
Chemistry | 3307
Total | 95,668
Table 2. Documents per publication type.

Publication Type | # Documents
Article | 2,978,819
Conference Paper | 561,048
Review | 237,186
Letter | 95,650
Editorial | 66,501
Chapter | 45,191
Note | 42,988
Article in Press | 23,912
Short Survey | 23,743
Erratum | 21,727
Book | 5403
Total | 4,102,168
Table 3. Types of publications by area. Some types of publications are much more frequent in some areas than in others: Reviews and Letters are frequent in Medicine, Books and Chapters in Social Sciences, and conference papers are very relevant in Computer Science. The quantities are shown as fractions of the number of documents in each area, to facilitate comparison between areas.

Area | Documents | Art. | Conf. | Rev. | Let. | Edit. | Chap. | Note | In Pr. | Surv. | Err. | Book
Medicine | 2,156,501 | 0.7536 | 0.0361 | 0.0941 | 0.0448 | 0.0220 | 0.0089 | 0.0178 | 0.0068 | 0.0094 | 0.0057 | 0.0008
Physics | 532,602 | 0.7549 | 0.1998 | 0.0165 | 0.0067 | 0.0032 | 0.0037 | 0.0028 | 0.0011 | 0.0030 | 0.0076 | 0.0007
Comp. Sci. | 411,796 | 0.3191 | 0.6123 | 0.0109 | 0.0010 | 0.0314 | 0.0151 | 0.0009 | 0.0059 | 0.0003 | 0.0008 | 0.0023
Biochemistry | 295,112 | 0.8508 | 0.0272 | 0.0619 | 0.0085 | 0.0070 | 0.0120 | 0.0079 | 0.0030 | 0.0102 | 0.0106 | 0.0008
Mathematics | 211,292 | 0.9150 | 0.0328 | 0.0097 | 0.0009 | 0.0087 | 0.0140 | 0.0034 | 0.0083 | 0.0004 | 0.0047 | 0.0022
Engineering | 332,754 | 0.5833 | 0.3842 | 0.0111 | 0.0014 | 0.0055 | 0.0062 | 0.0013 | 0.0043 | 0.0002 | 0.0010 | 0.0013
Psychology | 145,775 | 0.8326 | 0.0091 | 0.0556 | 0.0024 | 0.0152 | 0.0467 | 0.0130 | 0.0149 | 0.0019 | 0.0033 | 0.0052
Agr. and Bio. | 112,747 | 0.8812 | 0.0263 | 0.0457 | 0.0052 | 0.0059 | 0.0119 | 0.0072 | 0.0086 | 0.0025 | 0.0043 | 0.0011
Social Sci. | 43,080 | 0.7326 | 0.0157 | 0.0750 | 0.0042 | 0.0305 | 0.0934 | 0.0096 | 0.0166 | 0.0009 | 0.0024 | 0.0191
Chemistry | 185,624 | 0.9109 | 0.0307 | 0.0259 | 0.0062 | 0.0048 | 0.0059 | 0.0039 | 0.0027 | 0.0017 | 0.0064 | 0.0007
Table 4. Citations across types of publications per area. Articles receive more citations than the other types of publications in all areas. The quantities are shown as fractions of the number of citations in each area, to facilitate comparison between areas.

Area | Citations | Art. | Conf. | Rev. | Let. | Edit. | Chap. | Note | In Pr. | Surv. | Err. | Book
Medicine | 3,329,158 | 0.8348 | 0.0203 | 0.1177 | 0.0115 | 0.0048 | 0.0006 | 0.0046 | 0.0002 | 0.0050 | 0.0001 | 0.0003
Physics | 1,819,721 | 0.9203 | 0.0192 | 0.0492 | 0.0066 | 0.0005 | 0.0001 | 0.0007 | 0.0000 | 0.0011 | 0.0016 | 0.0005
Comp. Sci. | 396,829 | 0.5152 | 0.4377 | 0.0360 | 0.0011 | 0.0020 | 0.0034 | 0.0003 | 0.0005 | 0.0005 | 0.0000 | 0.0034
Biochemistry | 943,584 | 0.8816 | 0.0135 | 0.0857 | 0.0030 | 0.0006 | 0.0005 | 0.0027 | 0.0001 | 0.0118 | 0.0003 | 0.0002
Mathematics | 188,730 | 0.9567 | 0.0132 | 0.0229 | 0.0003 | 0.0006 | 0.0018 | 0.0012 | 0.0001 | 0.0003 | 0.0006 | 0.0024
Engineering | 252,274 | 0.7088 | 0.2578 | 0.0207 | 0.0017 | 0.0007 | 0.0013 | 0.0006 | 0.0003 | 0.0000 | 0.0001 | 0.0080
Psychology | 283,727 | 0.8991 | 0.0071 | 0.0749 | 0.0003 | 0.0021 | 0.0105 | 0.0025 | 0.0001 | 0.0006 | 0.0001 | 0.0029
Agr. and Bio. | 33,321 | 0.8383 | 0.0380 | 0.1137 | 0.0005 | 0.0014 | 0.0020 | 0.0028 | 0.0001 | 0.0016 | 0.0001 | 0.0016
Social Sci. | 26,305 | 0.7920 | 0.0174 | 0.1088 | 0.0002 | 0.0043 | 0.0288 | 0.0032 | 0.0004 | 0.0001 | 0.0000 | 0.0447
Chemistry | 410,929 | 0.9097 | 0.0130 | 0.0579 | 0.0107 | 0.0014 | 0.0013 | 0.0021 | 0.0000 | 0.0020 | 0.0004 | 0.0015
Table 5. Citations across areas. The quantities are shown as fractions of the total citations produced by each area, to facilitate comparison between areas. Entries marked with * lie on the diagonal, indicating the prevalence of within-area citations; entries marked with † are the largest off-diagonal value in each row, suggesting connections between areas.

From/To | Medicine | Physics | Comp. Sci. | Biochemistry | Mathematics | Engineering | Psychology | Agr. and Bio. | Social Sci. | Chemistry
Medicine | 0.8304* | 0.0245 | 0.0024 | 0.1200† | 0.0016 | 0.0008 | 0.0148 | 0.0023 | 0.0010 | 0.0022
Physics | 0.0081 | 0.9557* | 0.0036 | 0.0144† | 0.0057 | 0.0037 | 0.0002 | 0.0003 | 0.0001 | 0.0082
Comp. Sci. | 0.0386 | 0.0280 | 0.7999* | 0.0243 | 0.0242 | 0.0537† | 0.0225 | 0.0018 | 0.0051 | 0.0019
Biochemistry | 0.3163† | 0.0640 | 0.0031 | 0.5779* | 0.0026 | 0.0009 | 0.0032 | 0.0062 | 0.0003 | 0.0256
Mathematics | 0.0220 | 0.0636† | 0.0287 | 0.0176 | 0.8111* | 0.0443 | 0.0020 | 0.0072 | 0.0008 | 0.0026
Engineering | 0.0254 | 0.0401 | 0.0939† | 0.0137 | 0.0480 | 0.7610* | 0.0023 | 0.0019 | 0.0014 | 0.0123
Psychology | 0.1470† | 0.0076 | 0.0068 | 0.0284 | 0.0015 | 0.0007 | 0.7805* | 0.0078 | 0.0197 | 0.0000
Agr. and Bio. | 0.2096 | 0.0414 | 0.0111 | 0.2346† | 0.0237 | 0.0109 | 0.0472 | 0.3868* | 0.0044 | 0.0303
Social Sci. | 0.1242 | 0.0081 | 0.0349 | 0.0237 | 0.0105 | 0.0096 | 0.3499† | 0.0068 | 0.4290* | 0.0033
Chemistry | 0.0268 | 0.0666 | 0.0012 | 0.0737† | 0.0009 | 0.0032 | 0.0001 | 0.0027 | 0.0001 | 0.8247*
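Table 5 is a row-normalized citation matrix: each raw area-to-area citation count is divided by the total citations emitted by the source area. A minimal sketch of that normalization, assuming the raw counts are already available as a matrix (this is an illustration, not the paper's actual code):

```python
import numpy as np

def row_normalize(citation_counts):
    """Turn a raw area-to-area citation count matrix into row
    fractions, so each row sums to 1 (as in Table 5)."""
    counts = np.asarray(citation_counts, dtype=float)
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # guard areas with no outgoing citations
    return counts / row_sums
```

The diagonal of the resulting matrix gives the within-area citation fraction; the largest off-diagonal entry in each row is the "second majority" highlighted in the table.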
Table 6. Average number of citations per type of publication and area. Averages are computed over the number of documents of each type in each area, to facilitate comparison between areas.

Area | Art. | Conf. | Rev. | Let. | Edit. | Chap. | Note | In Pr. | Surv. | Err. | Book
Medicine | 1.8586 | 0.9456 | 2.0977 | 0.4290 | 0.3646 | 0.1112 | 0.4357 | 0.0534 | 0.8985 | 0.0386 | 0.6337
Physics | 4.2375 | 0.3340 | 10.3465 | 3.4167 | 0.5580 | 0.1401 | 0.8387 | 0.1394 | 1.3014 | 0.7485 | 2.7389
Comp. Sci. | 1.6274 | 0.7205 | 3.3217 | 1.0918 | 0.0651 | 0.2278 | 0.3439 | 0.0617 | 1.5210 | 0.0572 | 1.5045
Biochemistry | 3.4465 | 1.6526 | 4.6012 | 1.1762 | 0.3010 | 0.1393 | 1.1428 | 0.0852 | 3.8305 | 0.0911 | 0.6609
Mathematics | 0.9667 | 0.3723 | 2.1897 | 0.3005 | 0.0618 | 0.1199 | 0.3127 | 0.0108 | 0.5854 | 0.1238 | 1.0067
Engineering | 0.9291 | 0.5131 | 1.4225 | 0.8994 | 0.0915 | 0.1581 | 0.3670 | 0.0519 | 0.0395 | 0.0590 | 4.7176
Psychology | 2.2745 | 1.6489 | 2.8388 | 0.2663 | 0.2951 | 0.4715 | 0.4002 | 0.0091 | 0.6230 | 0.0320 | 1.1642
Agr. and Bio. | 0.2914 | 0.4431 | 0.7624 | 0.0267 | 0.0704 | 0.0505 | 0.1171 | 0.0043 | 0.1956 | 0.0043 | 0.4500
Social Sci. | 0.7179 | 0.7361 | 0.9632 | 0.0258 | 0.0941 | 0.2048 | 0.2210 | 0.0165 | 0.0625 | 0.0114 | 1.5587
Chemistry | 2.2390 | 0.9446 | 5.0129 | 3.8876 | 0.6759 | 0.4874 | 1.1848 | 0.0102 | 2.6484 | 0.1420 | 4.7200
Table 7. Spearman’s correlation between consecutive citation quinquenniums in each age group. The correlation for young researchers is measured between the first and second quinquennium; for mid-age researchers, between the second and third; and for seniors, between the third and fourth.

Age Group | Medicine | Physics | Comp. Sci. | Biochemistry | Mathematics | Engineering | Psychology | Agr. and Bio. | Social Sci. | Chemistry
young | 0.585 | 0.626 | 0.557 | 0.613 | 0.316 | 0.428 | 0.611 | 0.671 | 0.733 | 0.719
mid-age | 0.611 | 0.491 | 0.503 | 0.692 | 0.679 | 0.743 | 0.584 | 0.464 | 0.584 | 0.621
senior | 0.863 | 0.831 | 0.811 | 0.838 | 0.799 | 0.817 | 0.879 | 0.805 | 0.781 | 0.857
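The correlations in Table 7 compare how authors rank by citations accumulated in one five-year window against the next. A minimal sketch of that computation, implementing Spearman's correlation as the Pearson correlation of ranks (no tie correction; the window layout is inferred from the caption and is an assumption, not the paper's code):

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation via Pearson correlation of ranks
    (ties not corrected; adequate for a sketch)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra * rb).sum() / np.sqrt((ra ** 2).sum() * (rb ** 2).sum()))

def quinquennium_correlation(yearly_citations, first_q):
    """Correlate two consecutive five-year citation windows across
    authors. yearly_citations: one row per author, one column per
    career year; first_q = 0 (young), 1 (mid-age) or 2 (senior)."""
    data = np.asarray(yearly_citations)
    a = data[:, 5 * first_q:5 * (first_q + 1)].sum(axis=1)
    b = data[:, 5 * (first_q + 1):5 * (first_q + 2)].sum(axis=1)
    return spearman(a, b)
```

For ranking-based questions (do the most cited authors of one quinquennium stay the most cited in the next?), the rank correlation is preferable to Pearson's because citation counts are heavily skewed.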
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Mendoza, M. Differences in Citation Patterns across Areas, Article Types and Age Groups of Researchers. Publications 2021, 9, 47. https://doi.org/10.3390/publications9040047
