Web of Science (WoS) and Scopus: The Titans of Bibliographic Information in Today’s Academic World

Nowadays, the importance of bibliographic databases (DBs) has increased enormously, as they are the main providers of publication metadata and bibliometric indicators that are universally used both for research assessment and for performing daily tasks. Because the reliability of these tasks depends first of all on the data source, all users of the DBs should be able to choose the most suitable one. Web of Science (WoS) and Scopus are the two main bibliographic DBs. A comprehensive evaluation of the DBs’ coverage is practically impossible without extensive bibliometric analyses or literature reviews, but most DB users do not have bibliometric competence and/or are not willing to invest additional time in such evaluations. Apart from that, the convenience of the DB’s interface, its performance, the impact indicators provided, and additional tools may also influence the users’ choice. The main goal of this work is to provide all potential users with an all-inclusive description of the two main bibliographic DBs by gathering in one place the findings presented in the most recent literature and the information provided by the owners of the DBs. This overview should aid all stakeholders employing publication and citation data in selecting the most suitable DB.


Introduction
The initial purpose of scientific publishing was to enable the global sharing of scientific results, ideas, and discussions within the academic community for more efficient scientific progress [1,2]. However, over the years, the role of scientific publications has changed enormously. Nowadays, many of the most important decisions on industrial and economic growth priorities, the allocation of funding resources, education policies, the creation of collaboration opportunities, tenure, academic staff hiring, and so on are based on the evaluation of scientific output, and research quality, approximated as the impact of a publication, has become the most important criterion [1-8].
Because bibliographic databases (DBs) are the main sources of publication metadata and citation metrics, their importance has also increased considerably [9]. Web of Science (WoS) and Scopus are the two bibliographic DBs generally accepted as the most comprehensive data sources for various purposes [10]. WoS was the first broad-scope international bibliographic DB. Therefore, over time it became the most influential bibliographic data source, traditionally used for journal selection, research evaluation, bibliometric analyses, and other tasks [11]. WoS was the only source of bibliographic data for more than 40 years, until 2004, when Scopus was launched by Elsevier [12]. Over the years, Scopus has earned an equal place as a comprehensive bibliographic data source, and it has proven itself to be reliable and, in some respects, even better than WoS [10,13].
However, as both WoS and Scopus are commercial, subscription-based products, the worldwide recognition and use of these DBs has resulted in high prices, so that an institution can rarely afford to subscribe to both of them [14]. Consequently, institutions are often forced to choose between these data sources [15]. Usually, the institution's choice of the DB subscription is primarily determined by the metrics that are [...]

It is impossible to cover all the relevant literature because of the enormous number of studies performed on these two DBs and the metrics they provide, particularly given the increased use of publication metadata and impact indicators, which makes bibliographic DBs relevant for almost all knowledge domains in the academic community [11,26]. Additionally, due to the aforementioned obsolescence of information, only findings described in studies performed in the last five years will be discussed.
The study is divided into several major sections, organized as follows. The first part provides a brief background on bibliographic DBs. The second section presents an overview of the most recent studies comparing WoS and Scopus. The main findings and facts described in the literature are discussed in the third section and are grouped by the DBs' features (content coverage, quality, additional information and functionalities, errors and inconsistencies, search performance, and data accessibility limitations). A description of the official information provided by the DBs' owners is also included, in order to determine how easily and accurately a typical user can gather the information required to choose the data source best suited for a particular task. The fourth part contains a brief description of the most prevalent impact indicators provided in WoS and Scopus, as well as general guidelines for choosing the most suitable metrics. The fifth section discusses the major conceptual problems in bibliometric practices, highlighting the main concerns, application biases, and limitations. Although these topics are extensively discussed in the literature, the importance of these issues has not decreased but, on the contrary, has even increased; therefore, in the author's opinion, they bear repeating. The work is summarized in the discussion section, followed by brief conclusions. Additionally, the major limitations of the work and several ideas for future studies are provided.

Main Bibliographic Databases
Web of Science (formerly known as Web of Knowledge) was the first bibliographic DB. It was founded by Eugene Garfield in the 1960s as the Institute for Scientific Information (ISI), and, upon its acquisition by Thomson Reuters in 1992, ISI received its current name, Web of Science (abbreviated as WoS). In 2016, WoS was acquired by Clarivate Analytics (hereafter referred to as Clarivate), to which it now belongs [28].
WoS is a multidisciplinary and selective DB that is composed of a variety of specialized indexes, grouped according to the type of indexed content or by theme. The main part of the WoS platform is the Core Collection (WoS CC), which includes six main citation indexes: Science Citation Index Expanded (SCIE); Social Sciences Citation Index (SSCI); Arts & Humanities Citation Index (A&HCI); Conference Proceedings Citation Index (CPCI); Book Citation Index (BkCI); and the recently established Emerging Sources Citation Index (ESCI) [28]. Because WoS subscriptions can be tailored, institutions mostly subscribe only to WoS CC rather than to the whole WoS platform [29]. Therefore, in this work, only WoS CC will be evaluated.
Scopus is a similar multidisciplinary and selective DB, which was launched by Elsevier in November 2004 [12]. The main difference from WoS is that all Scopus content is accessible with a single subscription, which cannot be tailored. Thus, although Scopus also includes content from many specialized databases, such as Embase, Compendex, World Textile Index, Fluidex, Geobase, Biobase, and Medline [30], their content is integrated and equally accessible.
The Beta version of the Google Scholar (GS) DB also made its appearance in 2004 [31]. The main advantage of this DB is that it does not require a subscription and all of its content is freely available to all internet users. GS also offers considerably wider and deeper, yet not clearly defined, overall content coverage by disciplines, document types, countries, and languages. Although free access and all-inclusive coverage give GS a huge advantage over WoS and Scopus, they also make GS less reliable as a bibliographic data source than the subscription-based DBs. Thus, the main drawbacks of GS are the [...]

This overview of the most recent literature demonstrates that, although the WoS and Scopus DBs have been extensively studied and compared from various aspects, to the best of the author's knowledge, there is no single literature review that describes all important features and differences of the two DBs based on the most recent findings, adequately reflects the current situation, and thus allows the most reliable choice of a suitable data source for a particular task. Such a review would be especially useful for DB users who may not be aware of all the aspects that should be considered for their purposes.

Overall Content Coverage, Overlap and Variations between Disciplines
The fact that Scopus provides wider overall coverage than WoS CC was confirmed multiple times, both by early and by the most recent content coverage comparisons. Generally, the content indexed in WoS and Scopus was also shown to be highly overlapping, with Scopus indexing a greater number of unique sources not covered by WoS [44]. However, the extent of content overlap between WoS and Scopus was found to vary greatly across disciplines [40,47]. For specific subject fields, the variations were even more noticeable. For example, Scopus was shown to cover as much as 99% of the nursing journals covered by WoS [85]. Meanwhile, in another study focused on computer sciences, only 63% of the documents retrieved by WoS were also found by Scopus [57].
Nevertheless, practically all content comparison studies have highlighted the same biases in content coverage, which are characteristic of both DBs. One of them is that content coverage in the DBs varies greatly across disciplines, with some disciplines being covered more extensively than others. For instance, a large-scale comparison performed at the journal level has shown that WoS and Scopus are both biased toward Natural Sciences, Engineering, and Biomedical Research, with Scopus offering wider coverage of all investigated broad fields, especially of Biomedical Research. Meanwhile, Natural Sciences and Engineering appeared to be overrepresented to a greater extent in WoS [47]. These results were confirmed by later comparisons performed at the publication level, showing that both DBs offer the widest coverage of Natural Sciences, Medicine, Health Sciences, and Technology, while Social Sciences and Humanities (SSH) are underrepresented in both DBs [40,49,53,56]. Similar distributions by discipline were also found in investigations performed within the context of regional, national, or institutional scientific output [13,50-52,59]. Nevertheless, the vast majority of these studies have reported better Scopus coverage of all major disciplines when compared to WoS. Scopus was also shown to provide a better representation of particular subject fields, such as nursing [85], pharmacy [54], Library and Information Sciences (LIS) [64], and Computer Sciences [57]. Yet, the results on coverage of the Humanities did not always coincide. Although the majority of the aforementioned studies found better coverage of the Humanities in Scopus, opposite observations indicating slightly better coverage of this discipline in WoS have also been noted [49,53].
However, it should be noted that some of these studies were only focused on journal coverage, and/or not all WoS CC indexes were included, which might result in inaccurate assessment of coverage in certain disciplines.

Coverage by Source and Document Types
The coverage of literature in specific disciplines or subject fields in selected bibliographic DBs should be evaluated with additional precautions, as it is highly dependent on several other aspects of coverage. One of them is the inclusion of different source types, as, in certain disciplines and subject fields, a considerable portion of research results is published in sources other than journals. For example, it is well known that, in Computer Sciences, research findings tend to be published in conference papers [86,87], while books and textbooks are more important sources in the social sciences and even more so in the humanities [26,46]. Therefore, the coverage of source types other than journals should also be evaluated, as it can strongly influence the suitability of the DB for a particular task. However, WoS and Scopus are both concentrated on journal indexing. Therefore, it is not surprising that the coverage of books and conference proceedings in both WoS and Scopus was generally found to be insufficient for using the DBs as data sources for analyzing or evaluating disciplines where these source types are the most prevalent [26,44,88].
In response, both DBs' vendors have made considerable efforts to expand their coverage of these source types. In 2011, the Book Citation Index (BkCI) was added to WoS CC [28]. Meanwhile, Scopus has also expanded its coverage of books through the Book Expansion Project, which was carried out during the 2013-2015 period, with 210,000 titles from selected publishers being added to Scopus in addition to the existing book series [89]. However, even the most recent large-scale content analyses performed at the publication level and including all major source types did not indicate a significant improvement of book coverage in either DB. For instance, in a study where the representation of Norwegian scientific output was evaluated, WoS and Scopus were both shown to cover only 14% of book chapters [50]. Similar results were obtained when investigating the representation of Swiss scientific output [59]. Meanwhile, another study has explored the distribution of publishers providing content to WoS and Scopus, and it has shown that the content of BkCI comes from twice as many publishers as are on the book publisher list in Scopus [82], but, judging by the numbers of indexed items, Scopus provides better coverage of academic books when compared to BkCI [90]. These findings were confirmed by recent large-scale comparisons of three data sources (WoS, Scopus, and MA), where Scopus was shown to index more books, book chapters, reference books, and monographs than WoS CC [53,59].
The coverage of conference material was also improved in both DBs. In WoS, conference papers are indexed in a separate Conference Proceedings Citation Index (CPCI), which was included in WoS CC in 2011 [28]. In Scopus, conference material is indexed together with the rest of the content, and its coverage was expanded by the Conference Expansion Program, which ran from 2011 to 2014 [89]. Serial conference proceedings were shown to be highly overlapping between the DBs [40,52]. However, although the share of conference proceedings articles in Scopus and in WoS is statistically similar (approximately 10%) [52], the majority of studies found better coverage of conference material in Scopus when compared to WoS [26,39,53,55,57,59].
It was demonstrated that both DBs continuously increase their coverage of all source types, showing stable overall content growth [13,53]. Meanwhile, the coverage of Russian scientific literature was found to be growing even exponentially, especially in Scopus, although higher growth rates in the coverage of conference material were found in WoS than in Scopus [52]. Nevertheless, increased growth rates in both DBs were mostly observed within the last decade, and they were generally found to be higher in Scopus than in WoS, most likely as a result of the aforementioned content coverage expansion programs executed by Elsevier.
In addition, certain disciplines often favor less popular types of publication that may not be indexed in a given DB. For example, in SSH, a significant portion of scientific output is intended for the general public and, therefore, is often published as letters, reports, book reviews, or "trade" publications not usually included in the main DBs [5,26,91]. Yet, there are some exceptions. Unlike WoS, Scopus indexes trade publications, but it does not cover book reviews and meeting abstracts, which are indexed in WoS [28,40,89].

Coverage of Regional and Non-English Literature
Over the years, both Scopus and WoS were repeatedly shown to be biased toward sources published in the English language and to provide relatively poor coverage of regional literature. In response, their owners took action to expand the DBs' regional and language coverage. For instance, both DBs incorporate SciELO content: Scopus made an alliance with SciELO in 2007, and the SciELO Citation Index was integrated into the WoS platform in 2014, aiming to cover more research from Latin America and the Caribbean [51]. Three other regional indexes were also included in the WoS platform: the Korean Citation Index (KCI) [92], the Russian Science Citation Index (RSCI) [93], and, only recently, the Arabic Citation Index (ACI) [94]. In 2015, WoS also launched the Emerging Sources Citation Index (ESCI), aiming to expand the universe of WoS CC by including more publications of regional importance and representing emerging scientific fields [50].
Although the more recent studies that included ESCI in the evaluation have seen improved coverage of regional and non-English language documents in WoS, Scopus was identified as superior to WoS in the coverage of both non-English and regional literature of all types [49]. This was also confirmed by several studies comparing WoS and Scopus coverage of scientific output from particular countries and regions [50-52,87,95], or institutions [59]. However, all of the authors agree that both DBs still clearly favor English-language publications.
On the other hand, the proportions of literature in different languages vary drastically between the DBs. Scopus was shown to index over ten times more documents in Chinese than WoS. Additionally, Scopus indexes more documents in Danish, Japanese, Persian, and Swedish, which are barely present in WoS, and substantially more documents in Russian [49]. In fact, it was shown that during the 2006-2016 period the share of papers in Russian-language journals in Scopus increased from 4.8% to 14.8%, while, in WoS, this percentage declined from 6.5% to 3.0% during the same period [52]. Documents written in French are also much better covered by Scopus than by WoS [87].
When compared to Scopus, WoS indexes more documents in Spanish and Portuguese, as well as in Catalan, Croatian, Malay, Norwegian, and Turkish [49]. On the other hand, in a study of Norwegian scientific output coverage, Norwegian-language publications were shown to be better covered by Scopus than by WoS [50]. The different conclusions drawn from these studies may be related to differences in the applied time frames because, while the total number of publications in the DBs is growing steadily, the increase in non-English documents may be more volatile, especially due to the execution of content expansion programs that were focused on specific source types or regions [52]. Therefore, an assessment of content coverage by language between the DBs might produce different results if publications from different time periods are compared.
The differences in the distribution of non-English language documents between disciplines were determined to be even greater than in the case of English-language documents. The majority of non-English documents indexed in Scopus represented Life Sciences & Medicine and Technology, with Arts & Humanities being covered the least. Meanwhile, in WoS non-English documents were shown to be distributed between disciplines more equally, with the widest representation of Life Sciences & Medicine, Social Sciences, and Arts & Humanities [49].

Coverage of Citations
The coverage of citation data is extremely important for analyses focused on quality and impact. The number of citations retrieved by the DBs depends first of all on coverage width [95]. However, it also depends on coverage depth, which becomes more important when the intended citation analysis is not limited to a particular time frame but is rather aimed at an overall evaluation of citation counts [26,27].
Although the time frame of covered citations in Scopus is shorter than in WoS, Scopus was shown to provide higher citation counts than WoS, even in the studies performed before the execution of the Scopus Cited Reference Expansion Program, when citations in Scopus were only covered back to 1996 [26,44,60]. Several studies have reported lower citation counts in Scopus when compared to WoS [13,48], but these studies were performed before the completion of the program, which was carried out during 2015-2017. In fact, the differences in citation counts between WoS and Scopus were observed to have decreased by more than 40% just a year after the initiation of the expansion program in Scopus [13]. Nevertheless, even after this, WoS may appear to be more suitable for large-scale citation analyses, as it covers citations back to 1900, while, in Scopus, citation coverage was only extended back to 1970 [89]. However, in WoS, the extent of accessible citations depends not only on the overall citation data coverage, but also on the restrictions imposed by the subscription terms. Therefore, in certain cases the accessible time frame of citations in WoS CC is even shorter than in Scopus. This might explain the results of the most recent studies, which have indicated higher citation counts being retrieved from Scopus in all disciplines, with the degree of overlap and the differences between disciplines following patterns similar to those observed in content coverage. For instance, a very recent large-scale comparison of six data sources has shown that Scopus retrieved 57% of all combined citations to the sample documents, while WoS retrieved 52% of the total citations.
Regarding the overlap of citations, Scopus was shown to cover 93% of the citations covered by WoS, ranging from 80% in Humanities, Literature & Arts to 96% in Chemical & Material Sciences, while WoS covered 83% of Scopus citations, ranging from 68% in Humanities, Literature & Arts and exceeding 90% only in Chemical & Material Sciences [32]. Higher citation counts retrieved from Scopus compared to WoS were also found in other recent analyses [57,59,91], including ones where citation retrieval was not limited to a particular time frame [39,54,96].
These findings suggest that a shorter time frame of citation coverage in Scopus might be very well compensated by the wider overall coverage of content. This may be supported by the studies where citations or citation based ranking results between WoS and Scopus were shown to be highly correlated [48,54,55,59,87]. On the other hand, several studies have reported lower citation counts being obtained from Scopus as compared to WoS, but usually only in the cases of researchers and academics with very long scientific careers [13,85].
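The overlap percentages reported above are directional: the share of WoS citations that Scopus also covers is not the same quantity as the share of Scopus citations that WoS also covers. The following minimal sketch illustrates this with Python sets. The function name and the toy citation identifiers are hypothetical and serve only as an illustration; they are not data from the cited studies.

```python
def directional_overlap(covering: set[str], reference: set[str]) -> float:
    """Return the share of items in `reference` that also appear in `covering`."""
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(covering & reference) / len(reference)


# Hypothetical toy citation identifiers, for illustration only.
wos = {"c1", "c2", "c3", "c4"}
scopus = {"c2", "c3", "c4", "c5", "c6"}

# Share of WoS citations that are also found in Scopus: 3 of 4.
scopus_covers_wos = directional_overlap(scopus, wos)

# Share of Scopus citations that are also found in WoS: 3 of 5.
wos_covers_scopus = directional_overlap(wos, scopus)

print(f"Scopus covers {scopus_covers_wos:.0%} of WoS citations")
print(f"WoS covers {wos_covers_scopus:.0%} of Scopus citations")
```

Because both shares have the same numerator (the shared citations) but different denominators, the larger source always shows the lower share of its own content covered by the other source.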

Coverage of Patents
In the current technology-based era, patents are being increasingly acknowledged as an equally important and impactful part of scientific output, especially in technical and engineering sciences, as patents are practically the only form of publication that can be used as an indicator of technological impact. Therefore, patents can be used as a part of scientific output in the evaluations of institutions, countries, or research fields [5,44,97-99].
WoS and Scopus both index patents: WoS includes a separate index, the Derwent World Patents Index (DWPI), covering patents from 59 patent-issuing authorities and two literature sources worldwide [100]. Meanwhile, Scopus indexes patents of five major patent offices: the World Intellectual Property Organization (WIPO), the UK Intellectual Property Office (IPO), the European Patent Office (EPO), the United States Patent and Trademark Office (USPTO), and the Japan Patent Office (JPO) [89].
Although the overall number of patents indexed in Scopus is smaller than in WoS, they are equally accessible with a subscription as the rest of the content of Scopus DB. That is not the case in WoS, as DWPI is not a part of WoS CC and, therefore, patent information is not accessible through the basic WoS CC subscription [40].
On the other hand, the majority of patent-related studies have investigated citations between patents or between patents and scientific literature, aiming to determine the relations of patents with technological advancements and/or economic relevance (for an overview, see [99]), but not the patents themselves. Nevertheless, studies of patent citations require a bibliographic data source covering this type of citing relation. WoS and Scopus are both suitable and have been used for this purpose (e.g., [98,101,102]). Some studies have also used bibliographic DBs as a source of patent metadata (e.g., [102]). However, the comprehensiveness of patent coverage in these two DBs has not, to the author's knowledge, been extensively compared. It was only noted that Scopus data lack relevant offices, such as the China Trademark and Patent Office (CTPO), which can be disadvantageous for analyzing certain countries, as patent applicants usually apply first at the home country office [98].

Coverage of Funding Information
With the increased role of publication data in research evaluations and grant applications, the availability of comprehensive research funding information (FI) in the bibliographic DBs has also become more relevant. Accordingly, both DBs are making efforts to include FI in descriptions of the publications.
The WoS and Scopus DBs began to actively extract and add FI to bibliographic entries only relatively recently. Funding acknowledgments have been added to SCIE records since August 2008, to SSCI records since March 2015, and to AHCI records since 2017 [28]. Now, FI is captured for all WoS CC indexes [103]. Although the indexing of FI in WoS began in 2008, FI was also retrieved in up to 5% of earlier publications, and FI coverage was shown to be steadily growing over time across all indexes [77]. In WoS, FI is provided in separate entry fields for funding organization (FO), grant number (FG), and funding acknowledgment text (FT). FO and FG are extracted from the FT field [76]. Since 2016, WoS CC publication entries have been supplemented with FO and FG from Medline and Researchfish [103]. Meanwhile, Scopus started collecting FI in July 2013 [76]. Elsevier states that, in Scopus, the full-text funding acknowledgement sections are included for documents (where applicable), along with the grant number, the funding organization's name, and its acronym, going back to 2008. Funding data in Scopus are directly harvested from funder websites. FI is provided in a standardized format organized in the entry fields Funding Sponsor, Funding Acronym, Funding Number, and Funding All [89].
The availability of funding data in the bibliographic DBs sparked investigations of the relationships between funding inputs and the quality of research outputs (publication impact). Accordingly, the extent of FI coverage in the DBs has attracted the attention of the scientometric community. However, the majority of studies addressing this matter have only focused on the coverage and accuracy of funding data in WoS or compared WoS with other data sources, but not with Scopus. It was shown that the coverage of WoS FI varies across research domains. In the Natural Sciences, FI was found to be covered better than in the Social Sciences and Humanities, which coincides with the observations that WoS covers FI more systematically for publications indexed in SCIE than for those in other WoS CC indexes (SSCI, AHCI, and ESCI) [74,76]. These findings might be related to the fact that AHCI and ESCI cover larger numbers of non-English language publications and of document types other than articles, while, according to the WoS Bibliographic Policy, funding acknowledgements are only processed if they are published in English, regardless of the language of the publication [76]. This was also confirmed by several other studies, where the availability of FI in WoS was shown to be almost exclusive to papers in English and to those in Chinese with acknowledgements presented in English, and FI was mainly provided for original research articles [74,77]. On the other hand, the analysis of available funded articles (FAs) by document type revealed that, in WoS, FAs were distributed between articles (73%), editorials (12%), reviews (8%), and letters (6%), while all of the FAs in Scopus were classified as articles [80]. Although this study was limited to a specific set of journals and publishing years, similar observations regarding the distribution of WoS FAs by document type were also made in the most recent larger-scale study [77].
Meanwhile, the comprehensiveness of Scopus FI was only investigated in very few studies. The extent of FI coverage in Scopus was assessed by comparison to WoS and PubMed, which showed that the coverage of FAs differed significantly among the DBs in a sample of the same medical journals. Although Scopus indexed the highest number of articles in the investigated journals, it retrieved the lowest share of FAs (only 7.7%), while, in WoS, the share of FAs was 29.0% [80]. The low coverage of FAs in Scopus was also confirmed by a small case study where Scopus was able to retrieve only 17 of 25 FAs, while WoS retrieved 23 [79]. Regarding the accuracy of FI retrieval, in one study the values of recall and precision for WoS were reported to be well above 90% [75], while other authors found the completeness of WoS FI to be a little below 90% [78]. However, in many cases, the FI in WoS was shown to be incomplete, as usually it was only present in some of the FT, FO, and/or FG fields [75,77,78]. Meanwhile, the recall of FI in Scopus was estimated to be only approximately 67% [79].
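As a reminder of how such retrieval figures are obtained, the sketch below computes recall from the counts of the small case study mentioned above (17 and 23 of 25 funded articles retrieved). The function names are hypothetical, and the counts passed to the precision example are invented purely for illustration, since the text does not report them.

```python
def recall(relevant_retrieved: int, relevant_total: int) -> float:
    """Share of all relevant items (e.g., funded articles) that the DB retrieved."""
    if relevant_total <= 0:
        raise ValueError("relevant_total must be positive")
    return relevant_retrieved / relevant_total


def precision(relevant_retrieved: int, retrieved_total: int) -> float:
    """Share of the retrieved items that are actually relevant."""
    if retrieved_total <= 0:
        raise ValueError("retrieved_total must be positive")
    return relevant_retrieved / retrieved_total


# Counts taken from the case study described in the text: 25 funded articles in total.
scopus_recall = recall(17, 25)
wos_recall = recall(23, 25)

# Hypothetical counts, for illustration only: 20 records flagged as funded,
# of which 17 are truly funded.
example_precision = precision(17, 20)

print(f"Scopus recall: {scopus_recall:.2f}")
print(f"WoS recall: {wos_recall:.2f}")
print(f"Example precision: {example_precision:.2f}")
```

Recall answers how much of the existing funding information a DB captures, whereas precision answers how much of what it reports is correct, so a DB can score well on one and poorly on the other.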
On the other hand, accurate extraction of FI from the publication text is not an easy task for the DBs, and the coverage of FI in the DBs depends first on the extent to which FI is provided in the publications. Research in different disciplines varies greatly with respect to additional funding requirements and recognition practices. The reporting of funding may vary greatly across disciplines and countries due to different funding systems and policies. Different editorial rules for acknowledging funders may also hinder the correct extraction of FI [75,76,78]. Besides, the Acknowledgement section may be used to express gratitude not only for financial, but also for other types of support [74,104]. Apart from that, FI may not be correctly (if at all) declared in the publication by the authors themselves [75-78]. The presence of mistakes in FI may cause its inaccurate extraction in the DBs. Because FI is not standardized, it is difficult to disambiguate between funding organizations [76,78]. The language in which the name of the funder is reported can also be an issue [75].
However, because the extraction and improvement of FI in both DBs is an ongoing process, the situation can be expected to change. This is especially to be hoped for with regard to Scopus: although, according to its representative, the coverage of FI in Scopus has improved over the last several years [12], the most recent comparative studies between WoS and Scopus have shown that WoS performs much better than Scopus in both coverage and accuracy of FI [79,80]. In fact, in a recent study of WoS FI coverage, after comparing the obtained results with those from previous studies, the author concluded that there was a significant improvement of WoS FI coverage, especially in the case of AHCI [77]. This might be related to a WoS program for FO name unification, which started in 2019 [103]. However, the same biases of FI coverage in WoS in favor of SCIE, English-language publications (except for publications in Chinese), and journal articles still remain [74,76,77]. Therefore, the majority of authors agree that, for the time being, in both DBs (especially in Scopus) the coverage of FI may not be comprehensive enough for accurate and reliable analyses of funding trends in SSH and in research with a strong national orientation.

Publication and Source Information
WoS and Scopus both provide detailed metadata for indexed publications, including publication title, abstract in English, keywords, authors, affiliations, document type, source information, and citation counts [95]. Usage metrics are also indicated for every publication: WoS lists the publication's usage counts within WoS CC [105], while Scopus provides broader usage metrics (PlumX) [106,107]. The lists of all references (even ones that are not indexed in the DBs) and related documents can also be viewed. Where available, direct links to the publisher are also provided, where the full text documents can be viewed and/or downloaded if the user has the required access [105,108]. WoS additionally provides links pointing directly to the full text documents (for OA content) [109].
Regarding OA content, in July 2015, Scopus launched an OA indicator for easier identification of OA journals within Scopus content. However, the indicator was only available at the journal level, and only for fully OA journals [110]. In 2018, Scopus made OA indicators available at the article level for articles published in both full Gold OA and hybrid journals [111]. Meanwhile, WoS has included OA status for publications, using article-level information from oaDOI, since December 2017. Moreover, in addition to gold OA, the information is now also provided for content published as green and bronze OA [28,112,113].
Both DBs also provide detailed source information. For every journal, individual profiles are created, where the main source's information, including current title, previous titles, ISSN/ISBN number, publisher, country, and indexing period, is presented. Besides the general source information, both DBs also provide citation information, including source's ranking in the relevant subject fields, the most recent and earlier values of journal impact indicators, and the ability to review the data from which the main journal impact indicators (JIF in WoS and CiteScore in Scopus) were calculated [114,115].
In Scopus, full source profiles can be reached directly from document lists, source browsing results, author or institutional profiles, and a publication's entry. In WoS, source information can also be viewed directly from publication lists and from full publication entries. However, only very limited information appears in the pop-up box that opens on clicking the active journal title: the journal's title, the most recent JIF value (if the Journal Citation Reports (JCR) tool is subscribed to), the journal's quartiles (Q) in all of the assigned subject categories, publisher, and country [28]. More detailed information on WoS-indexed sources can be accessed through the JCR tool [114] and at the Master Journal List (MJL) webpage [116].

Author Information
An accurate disambiguation between authors is extremely important for performing bibliometric analyses, assessing research performance, evaluating research collaboration and mobility trends, or tracking personal careers [70,117]. Recognizing the importance of author information, both DBs aggregate the corresponding publications and related information for every author by employing automated algorithms. In Scopus, all of the authors listed in the indexed publications are automatically assigned individual Author Identifiers (AUIDs), and a personalized profile is created for every author. Documents are assigned to a particular AUID via an algorithm that matches several authorship criteria, such as the name and additional name spelling variants, affiliation, subject area, prior publication history, and co-authorship [70,89].
WoS also captures the main information (names, surnames, and initials) of all authors from the indexed publications. Since 2008, all author names have been associated with their affiliated institutions, and individual profiles, named Author Records, are created. WoS Author Records are generated in a similar way as in Scopus: by a proprietary algorithm identifying and weighting shared data elements, such as author names, institution names, and citing and cited author relationships [28,118]. However, as opposed to Scopus, individual author identifiers are not assigned to each author.
Besides the basic author information, such as name, surname and their variants, current and previous institutional affiliations, and publication lists, author profiles in both DBs also include citation data: the h-index, the total number of citations, and citation counts indicated separately for every publication, with the ability to view citing documents and preview the publications in the search results format. Additional tools for analyzing an author's output are also provided. However, the graph representing the h-index calculation is included directly in the Scopus author profile [108], while, in WoS, it can only be viewed by selecting "View full Citation Report", which cannot be opened in a separate window. Additionally, only publication details and citing documents can be reached directly from WoS author profiles. Meanwhile, most of the information that is provided in Scopus Author profiles is linked, which makes it easy to move to the corresponding information, such as publication details, citing and related documents, institutional profiles, and the profiles of co-authors and sources.
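For readers less familiar with the h-index reported in these author profiles, the calculation can be illustrated with a minimal sketch. It follows the standard definition (the largest h such that h publications have at least h citations each); the function name and the sample citation counts are hypothetical, and the sketch does not reproduce the internal implementation of either DB.

```python
def h_index(citation_counts):
    """Return the h-index for a list of per-publication citation counts."""
    # Rank publications from most to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` publications all have >= rank citations
        else:
            break
    return h


# Hypothetical citation counts for one author profile.
example_counts = [25, 8, 5, 3, 3]
example_h = h_index(example_counts)
```

Because the value depends entirely on which publications and citations are attached to the profile, the same author may have different h-index values in WoS and Scopus.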
Scopus Author profiles additionally provide information regarding contributed topics and co-authors, and a link to the author's Mendeley account (if present). Yet, one of their major advantages is that Scopus Author profiles may be freely viewed via Scopus Preview without a subscription to the DB. However, although the preview version also provides all of the basic author information, including a list of the 10 most recent publications, citation counts, and the h-index graph, the majority of additional functionalities are disabled [119].
Apart from that, both DBs offer the possibility to include ORCID (Open Researcher and Contributor ID) identifiers [120], which are helpful in distinguishing between authors with very common names. WoS also offers the ability to create a ResearcherID profile and integrates claimed Author Records with Publons profiles [118]. However, these profiles are optional and can only be created by the authors themselves. Accordingly, only a relatively small share of researchers have them [70].
However, the disambiguation between authors using the author profiles generated in the DBs may not be completely accurate. Apart from the fact that some surnames, and even combinations of surname and name, are very frequent, the accurate isolation of the exact author of interest is further complicated by the presence of misspelled names or of alternative, yet valid, name variants. Algorithms that are applied in the DBs often mistakenly recognize such variants as different authors, thus dividing the publications of one author into two or more separate author profiles with the same or slightly different names. However, the publications of one author may be split across multiple author profiles even when the author names are spelled and listed correctly. This is mainly caused by changes in the author information used as the main criteria by which publications are grouped under one profile, such as the author's institutional affiliation and/or research field [70,117].
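The effect of name variants on profile splitting can be illustrated with a deliberately simplified sketch. It groups author name strings by a crude key (surname without diacritics plus first initial); the functions `name_key` and `group_by_key`, the "Surname, Given" input format, and the sample names are all hypothetical, and the proprietary algorithms of WoS and Scopus rely on many more criteria than the name alone.

```python
import unicodedata
from collections import defaultdict


def name_key(name):
    """Return a crude (surname, first initial) key for a 'Surname, Given' string."""
    # Decompose accented characters and drop the combining marks (e.g., ü -> u).
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    surname, _, given = ascii_name.partition(",")
    surname = surname.strip().lower().replace("-", " ")
    given = given.strip().lower()
    initial = given[0] if given else ""
    return (surname, initial)


def group_by_key(names):
    """Group name strings that share the same crude key."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)


# Hypothetical variants of one author's name as they might appear in publications.
variants = ["Müller, Jan", "Muller, J.", "Mueller, Jan"]
groups = group_by_key(variants)
```

In this toy example, the first two variants fall under one key, while the valid transliteration "Mueller" produces a second key, mirroring a split profile. Conversely, any key this coarse would also group different authors who share a frequent surname and initial, which is why the DBs combine names with affiliation, subject area, and co-authorship data.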
In the case of the Scopus author disambiguation system, author profiles might also be split due to gaps in research activity, when the author has not published any papers for a certain period. Additionally, the absence of an e-mail address was determined to be a frequent cause of split profiles in Scopus. Even more incidents of split identities were identified in the assignment of an author's most recent publications. This is because the algorithm is calibrated so that, when a newly published publication lacks the information required for its reliable assignment, a new author ID is created rather than risking the incorrect inclusion of the publication in an existing author profile. On the other hand, these additional IDs should only be temporary because, according to Elsevier, when the number of publications in the additional ID becomes sufficient to prove that they belong to a particular author, the additional ID is merged with the dominant author ID [70]. However, it is not clear how long this process could take.
In some cases, the profiles of two (or more) separate authors with identical names may be merged by the DBs. This usually occurs when affiliations in co-authored publications are mixed up or only one affiliation is listed, but profile merges were determined to be very rare [70,121]. It should be noted that the international mobility of authors may also affect the accuracy of their publication assignment to AUIDs, since affiliation and co-authorship data are also important for the disambiguation systems of the DBs [122]. Therefore, only the authors themselves can ensure that all of their publications are present and correctly assigned to their profiles.
Despite that, Scopus AUIDs appear to be precise. When compared to the largest Japanese funding database, the recall and precision of Scopus AUIDs were over 98% and 99%, respectively, with the majority of authors having only one AUID [122]. In a study of 193 German scientists, it was shown that the majority of scientists had only one AUID in Scopus (68.4%). For the rest of the authors, for whom more than one AUID was identified, the absolute majority of their publications (around 97%) were assigned to the dominant ID. The precision of publication assignment to authors was determined to be very high (up to 100%) [70]. Meanwhile, in a more recent large-scale study (a sample of 100,000 authors), the accuracy of Scopus AUIDs was shown to be even higher and improving over time: in 2017, the reported precision was 98% with an average recall (ratio of correctly assigned publications) of 93.5%, while, in 2019, these percentages had increased to 99.9% and >94%, respectively [121].
However, the accuracy of author disambiguation systems applied in WoS and Scopus has not been extensively compared. This might be related to the fact that author information in WoS was only recently expanded from simple author groups into separate profiles (Author Records), and it is still in the testing stage (beta version) [118].
The owners of both DBs acknowledge the possibility of mistakes in author profiles and provide means to correct them. In WoS, as long as an author record is unclaimed, it can be corrected by any person, who can remove certain publications from the profile. Once the author claims the record as his/her own, only authorized corrections are allowed and additional missing publications can be added to the profile [118]. Meanwhile, in Scopus, missing publications of the author can be manually added using the Author Feedback Wizard tool (currently listed as "Edit profile") provided in the profile, which also allows for setting a preferred name, updating the affiliation, and merging potentially matching profiles [12,70].

Institution Information
WoS and Scopus both also employ disambiguation systems for institutions: WoS Organization Enhanced (OE) and Scopus Affiliation ID (AFID). Both DBs capture the main existing variations of an institution's name, which are presented along with the other institutional information. According to the information that was provided by the DBs' owners, institutional information is also curated manually [12,28]. However, it was noted that the explanatory information is insufficient for effectively utilizing the curated institutional information provided by the DBs for bibliometric purposes [72].
Clarivate states that all author affiliations are indexed in WoS CC, including all identified name variants of an institution and parent/child relationships, which are mapped and connected to a preferred institutional name. It is also noted that, although more than 11,000 institutions have undergone the unification process, not all organizations have been unified yet. Moreover, apart from the main institution's name and all additional name variants, only the address of the institution is provided on the details page of an individual institution in OE [28].
Meanwhile, unlike in WoS, all of the institutions in Scopus are assigned unique identifiers (AFIDs), similarly to the case of authors. Currently, more than 70,000 institutional profiles have been created and are searchable in Scopus. Scopus institutional profiles contain wide and detailed information, including the main and additional name variants, address, document count, and the distribution of documents by types, subjects, sources, authors, countries, and years [123]. Yet, only a part of the existing name variants is listed in the profile.
In both DBs, the algorithms that are responsible for assigning publications to institutions are not always capable of accurately identifying to which institution a publication belongs. Individual institutional profiles in Scopus are generated similarly to the case of authors; therefore, mistakes in a publication's affiliation data may also result in the creation of an additional institutional profile. Thus, a publication with an incorrect or missing affiliation may not be assigned to the main institutional profile, which, in turn, may also affect the correct assignment of the publication to the authors [70,72]. Although, in WoS, institutions are not assigned individual IDs, mistakes in affiliation information may also cause the incorrect assignment of publications. Thus, such publications may be overlooked when analyzing the scientific output of institutions in the DBs.
Accurate disambiguation of institutions may also be impaired by the presence of institutional branches and other units that may be left unassigned to the parent institution in the DBs. In this case, the WoS and Scopus institutional disambiguation systems act differently. In Scopus, university hospitals are not assigned to the universities and are regarded as separate institutions without any hierarchical links to the university. Meanwhile, the WoS OE system only covers the head organizations of non-university research organizations, but not their member institutes. Thus, WoS OE mostly covers universities. Such a bias towards certain types of organizations has not been demonstrated in Scopus [72].
The structure of large multi-unit institutions might change over time, with new branches being added, split, or merged. In both WoS and Scopus, predecessor units are grouped with their current main unit. The same is true for an institution's name, which could be changed to a variant seemingly completely unrelated to the previous one. Thus, these possible changes make it difficult to identify all of an institution's names and units without any knowledge of the institution's history [72]. Accordingly, Scopus allows subscribing institutions to appoint affiliation administrators [123].
Regarding the accuracy of the WoS OE and Scopus AFID systems, a recent large-scale study has shown that neither system is currently perfect and neither provides full coverage of disambiguated research organizations. It was shown that only 20% of the investigated institutions (445) were located in WoS OE, which was mainly attributed to the aforementioned fact that WoS OE does not cover non-university research institutes. Meanwhile, in the Scopus AFID system, 85% of the institutions were located [72].
A deeper insight into systems' precision (ratio of correctly assigned publications to all publications grouped under the institution in the DB) and recall (share of institution's publications accurately retrieved by the institutional disambiguation system) showed that both of the systems provide adequate levels of precision (0.95 for WoS OE and 0.96 for Scopus AFID), which means that less than 5% of publications assigned to the institution in the DBs do not actually belong to that institution. The recall of both systems was determined to be considerably lower (0.93 for WoS and 0.86 for Scopus), indicating that investigated systems, including search method, often do not retrieve full lists of institution's publications, although it occurs less frequently with larger organizations. Additionally, substantial variations both in precision and recall between organizations were determined in the cases of both WoS and Scopus [72].
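The two measures defined above can be made concrete with a minimal sketch. It assumes that, for one institution, the set of publication identifiers grouped under it by the DB and a manually verified "true" set are both available; the function name and the sample identifiers are hypothetical and are not taken from the cited study.

```python
def precision_recall(assigned_ids, true_ids):
    """Compute precision and recall of an institutional disambiguation result.

    assigned_ids: publications grouped under the institution by the DB.
    true_ids: publications that actually belong to the institution.
    """
    assigned = set(assigned_ids)
    true = set(true_ids)
    correct = assigned & true  # publications that are both assigned and genuine
    precision = len(correct) / len(assigned) if assigned else 0.0
    recall = len(correct) / len(true) if true else 0.0
    return precision, recall


# Hypothetical example: four publications assigned by the DB, five genuine ones.
assigned_by_db = {"p1", "p2", "p3", "p4"}
verified = {"p1", "p2", "p3", "p5", "p6"}
precision, recall = precision_recall(assigned_by_db, verified)
```

In this toy case, three of the four assigned publications are genuine (precision 0.75), while three of the five genuine publications are retrieved (recall 0.6). A system can therefore have high precision and still miss a substantial share of an institution's output.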
WoS and Scopus both provide opportunities to report observed inconsistencies via customer feedback services. However, although, in Scopus, additional matching institutions can be viewed and grouped on the institution's profile page in the same way as in the case of author profiles, the permanent merging of additional variants of the institution may only be requested by an affiliation administrator or librarian, and requests from researchers, for example, cannot be processed [124]. Meanwhile, the possibilities to correct affiliations in WoS are more limited: corrections are only possible at the institutional level, by requesting the full WoS institutional information and submitting a request for corrections via an online form [28].

Content Quality
Because of the widespread practice of evaluating research based on the quality of scientific output, the quality of the journal is becoming a leading criterion when choosing a journal for publication, since the quality of research is currently judged by the quality of the journal in which it was published. The question of journal quality has become even more important with the growing interest in Open Access (OA) publishing [125,126]. This publishing model is aimed at making scientific content freely available to the public without the requirement of a journal subscription. In fact, in alignment with Plan S initiated by European Science Foundation [127], a number of countries already require publicly funded research to be published under the OA model [126,128,129,130,131,132]. However, although OA publishing is very beneficial, as free access makes scientific research more visible, transparent, and reproducible, and also shortens the time required for a publication to become available online, the unintended, yet possible, effects of OA on research and journal quality have been called into question [133].
However, the quality of the content that is indexed in the DBs is defined not only by the quality of the indexed sources but, even more so, by the quality of the provided metadata, especially for bibliometric analyses. Like any other platforms accommodating huge data sets, bibliographic DBs are not free from errors. Many of them occur due to the automatic loading of data into the DBs, as machines do not always manage to recognize and transfer the data properly. On the other hand, some mistakes can be made by the authors or publishers even before the metadata is imported into the DBs. Nevertheless, all of the discrepancies in publication metadata, source information, or other inconsistencies occurring in the DBs not only complicate analysis and other tasks, but can also negatively affect the accuracy and reliability of the obtained results [4,61,95].

Content Indexing and Inclusion Policies
Generally, the inclusion of a journal in WoS and Scopus is perceived as a sign of its high quality, because these DBs allegedly only index the highest quality sources, carefully selected according to strict selection procedures [134]. Scopus content inclusion is determined by the Scopus Content Selection and Advisory Board (CSAB), an international group of scientists, researchers, and librarians representing the major scientific disciplines. In WoS, this task is performed by an internal Editorial Development team. Both DBs publicly declare their content indexing policies, which are provided separately for different source types [82] and are explained in several descriptive documents. However, Clarivate provides at least three separate documents describing journal selection procedures [135][136][137]. Apart from that, this information is also provided in other documents [28,138] and websites [135,139]. Moreover, the inclusion of books and conference material is described in separate additional documents and websites [140][141][142]. Meanwhile, Scopus descriptive information is more concentrated, as selection criteria and policies for all document types are presented in both general [89] and separate [143] descriptive documents, and at a single website [144].
In both DBs, sources are evaluated through several stages. When a journal is evaluated for inclusion in WoS CC, after an initial triage for gathering general information regarding the evaluated source, the journal is submitted to an editorial triage, where a set of 28 criteria is applied: 24 quality criteria and four impact criteria [139]. In Scopus, evaluation criteria are grouped under five categories: journal policy, content, journal standing, publishing regularity, and online availability [144]. However, the main criteria in both DBs are the same: the source must be peer-reviewed, be published in a timely manner, comply with editorial standards, and meet certain citation thresholds indicating its impact. Additionally, a linguistic criterion exists, requiring abstracts and titles to be written in English, and the importance of the content's international relevance, with references listed in Roman script, is clearly stated [12,82].
However, a recent study of WoS CC journal inclusion criteria has shown that WoS CC coverage depends not only on the general (universalistic) inclusion criteria listed above, but may also be influenced by specific (particularistic) characteristics of the journals, such as the represented discipline, publishing language, type of publishing institution, country of residence, and even the country's economic wealth. The majority of journals published in languages other than English and in smaller countries, and journals representing very specific research fields, especially ones published by universities, were shown to have a lower probability of being included in WoS CC. Meanwhile, from the perspective of universalistic criteria, it was observed that complying with the editorial standards does not guarantee inclusion in WoS, while the journal's impact may have a greater influence [81]. Although the study was confined to journals produced in Latin America, Spain, and Portugal, it is probable that the inclusion of all journals into WoS may be affected by the same biases.
Unfortunately, to the author's knowledge, no similar study has been performed regarding possible biases in Scopus source inclusion. However, as both DBs are selective, it can be assumed that Scopus source inclusion may also be affected by particularistic criteria. For instance, in a study where selected publishers providing content to the WoS and Scopus DBs were analyzed by their distribution across countries, the main part of the content indexed in both WoS and Scopus was found to be published in the United States, the United Kingdom, the Netherlands, and Germany [47], which is not surprising when considering that the major academic publishing companies are located in these countries. The fact is confirmed by the DBs' owners themselves, as they list the top publishers covered in the DBs as Elsevier (the Netherlands), Springer (UK), Taylor & Francis (UK), Wiley-Blackwell (USA), and SAGE (USA) [28,89]. This may suggest that both DBs could be biased against sources published by universities when considering their inclusion.
Although WoS was created aiming to collect all of the scientific resources of the best quality, the early studies have already shown that, in reality, this may not always be true, at least not in all subject fields [47]. However, the conclusions derived from the more recent studies do not always coincide. It was observed that the unique sources that were indexed by Scopus, but not by WoS, had lower citation impact values [57], but this may be attributed to the wider Scopus coverage. Meanwhile, other authors have not determined any significant differences in the quality of journals that were indexed in both DBs [58,85]. Moreover, it was also shown that a significant share of journals ranking in highest positions in Scopus DB were not even indexed by WoS [51,58].
Only a minor part of the evaluated sources is selected for inclusion in the DBs. In WoS, approximately 10-12% of evaluated journals become indexed [28], while, in Scopus, the exact percentage is not declared, but is most certainly higher and likely contributes to the wider Scopus content coverage. Additionally, it was noted that neither DB practices backwards indexing (except for the ESCI BackFile in WoS) [55]. However, according to Elsevier, "[...] if back file content for newly added journals is provided, Scopus may decide to cover the back files as well" [89].
On the other hand, there are some differences in the process of indexing new sources itself. Apart from the content being indexed under separate indexes in WoS, all new sources accepted for inclusion in WoS CC first fall under the ESCI index and can be moved into another index (SCIE or SSCI) after only three years if, during this period, the source still meets the required criteria [28]. Meanwhile, in Scopus, new sources are added a few weeks after confirmation of their acceptance [89] and are included in the Source lists after a threshold of 15 papers has been reached [115].
Sources that are already indexed are reevaluated annually. After reevaluation, sources may be discontinued (removed) or suppressed. These measures could be applied if the source fails to meet certain indexing criteria, e.g., when journal self-citation rates drastically increase and exceed the established threshold, irregularities in publishing are detected, or unethical publishing practices are suspected [138,143]. It has been estimated that, although the wider inclusion of journals into Scopus is also associated with a higher number of journals being removed, the overall proportion of "short-lived" journals that have been included and excluded from coverage in a relatively short period appears to be higher in WoS [58].
It was also noted that, in some cases (mostly in Scopus), the removal of a journal from the DB may result in the complete removal of previously indexed papers from this journal, while, in other cases (mostly in WoS), previously indexed papers are not necessarily removed [62]. However, according to a Scopus representative, upon the discontinuation of a title, previously indexed content is not removed from the DB [12]. This practice may be detrimental, especially when journals were discontinued due to unethical publishing practices, as the remaining indexed publications continue to be cited and may result in a misleading interpretation of the quality of the publications. On the other hand, the editorial ethics of a journal may change over time and previously published content may actually fulfill publishing criteria [145].
Nevertheless, the lists of discontinued and suppressed titles are available in both DBs. Scopus provides a separate list of discontinued titles with the indicated reason for the termination of journal indexing [146]. Meanwhile, WoS provides lists of suppressed titles [147]. These lists should be checked when choosing a journal for publishing, especially when the journal's indexing status is highly important, as it was shown that a considerable portion of the journals discontinued in Scopus have not changed their indexing status on their webpages (whether accidentally or intentionally, aiming to attract more authors), even for several years [148]. Meanwhile, more than 9% of the journals that were discontinued in Scopus were still indexed in WoS [145].

Open Access Content and "Predatory" Journals
Although generally peer-review practices, layout, indexing, and other main features of OA journals remain largely the same as for traditional subscription-based journals [149,150], there is often a concern that, in OA journals, the review standards may be lower, allowing for easier and faster publishing of less scientifically sound publications [130,133,151]. The detrimental effect on the quality of research resulting from the shift from publishing in subscription-based to OA journals was, in fact, demonstrated empirically [133].
Judging from the perspective of quality, in both WoS and Scopus, OA journals are selected for indexing by applying the same criteria and policies as for the rest of the indexed sources. Although OA journals indexed in the DBs are usually found to have lower citation impact indicator values for the time being [128,152], this could be at least partially explained by the fact that many OA journals are relatively new. However, the quality of indexed OA journals has not been compared between WoS and Scopus. On the other hand, as it is often highlighted that papers published under the OA model, and OA journals in general, are prone to receive more citations than ones published in traditional ways due to the wider and faster accessibility of OA content, the quality of OA journals estimated by citations and citation-based indicators may very quickly reach and even exceed the values of traditional journals [149,152]. Indeed, many studies have demonstrated that OA content attracts more citations [129,133,153,154,155], although others did not observe a significant OA advantage in citation impact [156,157]. Thus, there is no clear consensus on the OA effect on citation impact [126,129,131,152].
On the other hand, it can be argued that the OA citation advantage may be more related to the free access to content, and less to the actual quality of published research [153], since highly impactful research that is published in subscription-based sources is often not accessible [154]. Thus, the validity of measuring the quality of OA journals by citations and impact indicators may seem questionable [128]. Yet, because the investigations of OA publications and journals are currently gaining momentum, there is still a lack of credible empirical evidence to support or refute any of the aforementioned assumptions [149]. Moreover, it was also noted that the results of the related empirical studies strongly depend on the data source used [125,152,153].
OA journals are only published in electronic format, which shortens the delay between submission and publication. Thus, OA journals are particularly attractive to authors pressured for high publication productivity [158][159][160]. Apart from that, the major difference between OA and the traditional subscription-based model is that, in OA publishing, the coverage of publishing costs is shifted to authors in the form of article processing charges (APCs) [130,161], which opens the possibility to profit from scientific publishing without any actual contribution to science. Today, there are many publishers and journals that exploit the OA model and its attractiveness for authors in order to collect APCs, but they are highly questionable regarding peer-review practices and overall trustworthiness. These highly questionable publishers and journals are most often called "predatory", but they may also be referred to as "pseudo", "fake", or "hijacked" journals [150,158,162,163].
Predatory journals usually exploit the names and other details, even the webpages, of credible journals. Thus, it is often very hard to determine the credibility of a journal. On the other hand, there are several features that may signal a possible predatory nature of a journal. For instance, the APCs of predatory journals are usually much lower than those charged by credible OA journals. Additionally, authors are often requested to pay the APC only after the acceptance of their manuscript. Other features include the absence or a questionable location and composition of journal editorial boards, as their members often lack academic competencies. Predatory journals also often do not clearly declare manuscript submission, revision, acceptance, and licensing policies [131,150,157,158]. In 2012, Jeffrey Beall composed and published a list of nearly 50 criteria for identifying predatory publishing, and continuously updates these criteria and an index of publishers as well as individual journals fulfilling such criteria [162]. Although these criteria are often considered to be controversial [164], it was shown that predatory journals usually fulfilled several of the criteria outlined by Beall [150,165]. However, there is no common definition allowing for clearly distinguishing between predatory and credible OA publishing [164].
Nevertheless, aiming to become indexed in the DBs in order to create a reputable image and attract more authors, these artificial journals have evolved and even manage to pass the selection process of the major bibliographic, as well as OA, DBs [145,150,157,158,166], and the unethical nature of predatory journals may only be noticed after a long time. For instance, in January 2018, Elsevier discontinued indexing of 424 journals in Scopus, indicating "publication concerns" and, less often, "metrics" as the reason for discontinuation. A plausible assumption was made that these journals were potentially or actually predatory [52]. More recently, a more detailed examination of journals discontinued from Scopus for publication concerns was performed. The study, in fact, confirmed that the major part of discontinued titles may be regarded as predatory journals (77% of discontinued journals were included in Beall's black list) [145]. However, it was also determined that the problem of predatory OA seems to be largely confined to the USA and a few other, mainly developing, countries (e.g., India, Pakistan, Turkey, Nigeria) [145,150,158].
Having published in a predatory journal may negatively affect the reputation of an author [157]. Thus, although the OA publishing model provides researchers with faster dissemination and greater visibility of their work, OA journals, even those indexed in WoS and Scopus, should be evaluated with caution when selecting a journal for publishing.

Errors in Publication Metadata
Discrepancies in the publication metadata occurring in the DBs, such as incorrect titles, author surnames, or mistakes in affiliation and source data, should also be accounted for to ensure the accuracy and reliability of the results. This is especially important when citations are being analyzed, since the main consequence of all errors in publication metadata is missing or incorrect links between citing and cited papers in the DB, which leads to missing citations [61,62,65,66]. Consequently, data errors that result in omitted citations may significantly affect the results of citation analyses.
Errors that occur in the DBs may be divided into two main groups, depending on their origin: pre-existing errors, which are made by authors, publishers, or editors in the preparation and publication of manuscripts; and database errors, which are introduced by the DB due to inaccuracies occurring in data uploading processes, such as transcription errors or omitted data. The editorial style of some publishers can also favor database errors [62]. On the other hand, some pre-existing errors that are introduced by authors may be corrected by the publisher at the manuscript editing stage or by the DBs during the data import [65].
A considerable portion of mistakes in the publication metadata is caused by incorrect spelling due to the presence of language specific characters (e.g., letters with diacritics), which may be incorrectly identified by optical character recognition (OCR) systems [65,70,71]. Spelling mistakes mostly occur in the publication title, author names, and institutional affiliations. These mistakes can be caused either by the authors or publishers or by inaccurate publication data uploading into DBs.
Various errors can occur in author names. For instance, certain characters may be omitted or replaced by incorrect ones or by spaces, names may be truncated, diacritics may be presented incorrectly or missing, compound names can be listed in incorrect order, part of the first name may be attributed to the last name and vice versa, or even part of the last name may be omitted. A recent large-scale analysis has shown that, although these mistakes are present in both DBs, frequencies in Scopus are lower for almost all error types in author names. The exception is the case where part of the first name is mistakenly assigned to the last name, which appeared in fewer than 1% of Scopus entries but was not detected in WoS at all. On the other hand, the incorrect assignment of part of the last name as a first name was relatively common in both DBs (12.61% in WoS and 10.19% in Scopus). Meanwhile, although incomplete and mistyped last names were quite rare in both DBs (rarely exceeding 1%), approximately twice as many of these mistakes were identified in WoS as in Scopus, while the frequency of omitted apostrophes in WoS was almost fifteen times higher than in Scopus. However, in both DBs, diacritics were omitted in almost all last names containing them (approx. 95%; only approx. 1% of the investigated names contained this feature), but none of the diacritics were incorrectly imported [71].
Not all of the mistakes in author names are caused by the DBs, since authors themselves can introduce spelling mistakes or may decide to use different, yet valid, name variants in different publications. Additionally, the first and last names might be misplaced due to the editorial style of the journal (e.g., when first names are listed before last names). In the case of female researchers, additional surname variants may also appear due to a change in marital status [70,71]. Thus, it is not possible to know all surname variations of an author present in the DBs.
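Because omitted diacritics and apostrophes are among the most frequent name errors described above, analysts often compare names through a normalized matching key rather than the raw string. The following is a minimal sketch of that idea, not a method used by either DB; the function name `name_key` and the choice of "folded last name plus first initial" are illustrative assumptions.

```python
import unicodedata


def name_key(last_name: str, first_name: str) -> str:
    """Build a crude matching key: folded last name plus first initial.

    Hypothetical helper for illustration only.
    """
    def fold(text: str) -> str:
        # Decompose accented letters and drop the combining marks (e.g. "ė" -> "e").
        decomposed = unicodedata.normalize("NFKD", text)
        stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
        # Drop apostrophes, hyphens, dots and spaces, which are often lost on import.
        return "".join(ch for ch in stripped.casefold() if ch.isalnum())

    return f"{fold(last_name)}_{fold(first_name)[:1]}"


# Variants of the same name collapse onto one key.
print(name_key("O'Brien", "Seán"))   # obrien_s
print(name_key("OBrien", "S."))      # obrien_s
```

Such a key only absorbs typographic variation. Letters without a canonical decomposition (e.g., "ø" or "ł") are left unchanged, and misplaced name parts or surname changes still require manual or profile-based disambiguation.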
Mistakes in institutional affiliations also occur in the DBs. Although this information is very important in conducting bibliometric analyses and evaluating the scientific output of institutions and authors [4,72], the issue has not been extensively explored in the scientometric literature. Only recently was the missing author address information in WoS investigated. The study showed that more than one-fifth of the publications in WoS lacked address information, while full-text analyses revealed that about 40% of these articles actually had at least partial address information listed, and a part of the investigated publications had full address information, but, for some reason, it was not indexed by WoS. Meanwhile, for the remaining publications with missing address information in WoS, the information was also not declared in the publications themselves, which was probably due to the different editorial policies of journals [73].
In addition to the complete absence of a publication's address information, the information may be incomplete and/or incorrect. Apart from possible spelling mistakes, part of an address may be missing. In this case, the omission of the country name is of the highest importance, as this part of the address is often exploited in the extraction of data for bibliometric analyses or evaluations. The issue was investigated among publications that were (co)authored by USA researchers, where a significant number of publications only contained state information, but lacked the name of the country [73].
The frequencies of affiliation mistakes varied greatly among publication years, disciplines, document types, publication languages, and evaluated WoS indexes. The mistakes were determined to be more frequent in older publications (before 1970). As for recent publications, the problem remained more significant within Arts and Humanities. Additionally, the mistakes were significantly less frequent in original articles and reviews as compared to the other publication types, and the journals for which the omitted address information was more pronounced were determined to be low-impact ones. On the other hand, the study has also demonstrated that, for publications later than 1998, the situation in WoS has drastically improved. Unfortunately, the presence and accuracy of Scopus indexed address information was not investigated in the study, since there is no reliable way to identify and analyze publications with missing address information in Scopus online. Although the authors determined that address information may also be missing in Scopus, the extent of the issue could not be assessed reliably [73]. On the other hand, an earlier comparison of the DBs showed that errors in address information at the time were more frequent in Scopus, as compared to WoS [87].
Meanwhile, in publication-level analyses, an initial matching of publications from different data sources is mainly carried out using Digital Object Identifiers (DOIs). However, authors employing this method have usually reported a certain amount of missing or non-functional DOIs in both DBs [53,57,59]. The cases of the same DOI being assigned to two (or more) publications have also attracted attention, as the purpose of a DOI by definition is to enable the permanent and unambiguous identification of objects [167]. Such cases were first identified in the Scopus DB, with an estimated frequency of 1/1000 [67]. A similar small-scale study performed more recently has shown that more than one-third of DOIs from over 300 papers in WoS were incorrect, as they were not found in the DOI system. A further analysis of the incorrect DOIs has revealed that the majority of the respective publications in WoS had two DOIs (one correct and the other incorrect) [69]. Meanwhile, instances of missing DOIs were estimated to be slightly more frequent in Scopus (6.5%) when compared to WoS (4.7%) [168].
Apart from that, other mistakes might also occur in DOIs, for instance, illegal naming (e.g., not beginning with the prefix "10.") [68,169] or zeros being treated as the letter "O" [69]. The vast majority of illegal DOI naming errors in WoS were identified as prefix-type errors [68]. Meanwhile, in Scopus, prefix-type errors were even more apparent [169]. Additionally, cases of extra digits inserted in DOI strings were observed in both DBs [53]. On the other hand, as a rule, the DOI error rate was determined to be relatively low when compared to other types of errors in both DBs, and an even lower rate was estimated in Scopus, as compared to WoS [53,62,167]. Nevertheless, due to the aforementioned errors in DOI indexing, publication matching in both DBs often requires performing an additional matching by publication titles, sources, or publishing years. Yet, errors may also occur in these data fields.
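The DOI problems listed above (wrong prefix, the letter "O" in place of a zero) can be screened with a simple syntactic check before any matching is attempted. The sketch below is illustrative only: `clean_doi` is a hypothetical helper, and the pattern "10." followed by a 4 to 9 digit registrant code is a common heuristic assumed here, not an official validity test. A DOI that passes may still be absent from the DOI system.

```python
import re
from typing import Optional

# Heuristic pattern (assumption): "10.", 4-9 digits, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def clean_doi(raw: str) -> Optional[str]:
    """Return a normalized DOI string, or None if it looks malformed."""
    doi = raw.strip()
    # Remove resolver URLs and "doi:" labels that are sometimes stored with the DOI.
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", doi, flags=re.IGNORECASE)
    # The prefix should contain only digits and dots, so a letter "O" there
    # is treated as a mistyped zero.
    prefix, slash, suffix = doi.partition("/")
    doi = prefix.replace("O", "0").replace("o", "0") + slash + suffix
    # DOIs are case-insensitive, so lower-casing gives a stable matching key.
    return doi.lower() if DOI_PATTERN.match(doi) else None


print(clean_doi("https://doi.org/10.1000/ABC123"))  # 10.1000/abc123
print(clean_doi("1O.1000/xyz"))                     # 10.1000/xyz
print(clean_doi("11.1000/xyz"))                     # None
```

Records for which the function returns `None` would then fall back to matching by title, source, and publication year, with the caveats on those fields discussed in the text.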
Various mistakes may also occur in publication titles. Apart from misspelling, other mistakes are frequent in English-language publication titles: specific terms and names may not be capitalized, letters may be missing, or additional spacing may be introduced. Accordingly, more of these mistakes can be noted in publications of specific research fields, especially when chemical formulas and various abbreviations are used in the publication titles. However, most of the spelling mistakes occur in titles written in languages other than English, especially where language-specific characters are used, and are most probably caused by incorrect transcription during data uploading into the DB. It should be noted that approximately 20% of Scopus source titles are multilingual [95]. Apart from that, article titles in the reference lists may be omitted. However, errors occurring in publication titles and/or author names were empirically shown to be much less frequent in Scopus than in WoS, which suggests that Scopus deals better with language-specific letters [62].
Source titles may be presented in the DBs either incorrectly or correctly but differently. Thus, for bibliometric analyses at a source level, sources are mainly matched by International Standard Serial Number (ISSN) and/or International Standard Book Number (ISBN) [47,50]. However, the ISSNs/ISBNs may also be presented differently and, sometimes, incorrectly, which can further complicate the analysis [45]. In addition, source titles may change over time, but the previous titles might be left unmatched to the current title in the DBs, especially when the source's ISSN/ISBN changes along with the title [4,30,58]. Accordingly, mistakes and changes both in ISBNs/ISSNs and in source titles may lead to the misidentification of such sources as new ones, and might result in duplicate source profiles. However, duplicate source profiles can be created in the DBs even when the aforementioned data are correct. On the other hand, sources with the same title can be merged in the DBs. These discrepancies in source data were determined in both WoS and Scopus and are important, since they may strongly affect the ranking position of the source [58].
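Differently formatted or mistyped ISSNs can be partly handled before source matching, because the last character of an ISSN is a check digit computed from the first seven digits (weights 8 down to 2, modulo 11, with "X" standing for 10). The following sketch normalizes an ISSN to the usual NNNN-NNNC form and rejects strings whose check digit does not agree; the function name `normalize_issn` is an illustrative assumption.

```python
import re
from typing import Optional


def normalize_issn(raw: str) -> Optional[str]:
    """Return the ISSN as 'NNNN-NNNC', or None if it is malformed."""
    compact = re.sub(r"[\s-]", "", raw).upper()
    if not re.fullmatch(r"\d{7}[\dX]", compact):
        return None
    # Weighted sum of the first seven digits with weights 8, 7, ..., 2.
    total = sum(int(digit) * weight
                for digit, weight in zip(compact[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    if compact[7] != expected:
        return None
    return f"{compact[:4]}-{compact[4:]}"


print(normalize_issn("03785955"))   # 0378-5955
print(normalize_issn("0378-5954"))  # None (check digit does not match)
```

A valid check digit only shows that the string is internally consistent. It cannot reveal that a correct ISSN was attached to the wrong source, or that a title change was accompanied by a new ISSN.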
Publication years may also be indicated differently between the DBs, as they can be presented as the date of publishing online, or as the date of inclusion in a particular journal issue. Yet, there is no common rule indicating which date should be documented in publication metadata. Therefore, the publication dates may differ by a year or more between the data sources, as was shown in a recent large-scale analysis. The overall agreement on publication dates indicated in WoS and Scopus was estimated at 99.5%. The majority of disagreement cases were within a range of one year, but differences of two years were also noted and they were more frequent in Scopus when compared to WoS [53]. Although the frequency of discrepancies in publication dates in both DBs was determined to be relatively low, the possibility of such differences is worth considering, especially when performing longitudinal analyses. Apart from that, other types of mistakes may appear in the publication metadata. For instance, the volume and pagination of the source may also be incorrectly presented [40,65,66,170].
Because omitted citations are the main consequence of errors in publication metadata, multiple studies have investigated and compared the presence of cited references and the accuracy of citation links in WoS and/or Scopus. One large-scale study has taken a closer look at the omitted citations, which should theoretically overlap between the DBs [62]. It was determined that cited references are frequently omitted in both DBs, but to a lesser extent in Scopus than in WoS (more than 4% in Scopus and more than 6% in WoS) [62,65]. In some cases, the entire reference list might be missing. This error was determined to occur more often in Scopus as compared to WoS [40,66]. Meanwhile, the cases of missing individual or multiple references, as well as references being inaccurate, were more frequent in WoS [62,66,95].
Cited references might also be missing from the DBs due to another serious reason: a cited document may not be indexed in the DB [95]. This cause of missing citations was much more evident in Scopus (1.30% of missing citations) as compared to WoS (only 0.16%) [62]. However, citations more frequently do not appear in the reference lists due to incorrect linking, which, in turn, is mainly caused by the mistakes in the omitted references [62,65]. On the other hand, citation links might not be established at all in the DBs, even when both citing and cited documents are indexed properly [40,62], which might be attributed to the accuracy of the citation matching algorithms [65,66]. It was also noted that Scopus may be unable to match many citations with its indexed books [90].
Regarding the frequencies of DB errors that are present in omitted references, Scopus was determined to be more accurate than WoS (3.53% errors for Scopus, against 4.51% for WoS). A more detailed investigation of these errors showed that the majority of them were caused by the DBs (database errors), while pre-existing errors were identified less frequently. Moreover, the rate of pre-existing errors in Scopus was determined to be more than three times lower when compared to WoS (0.59% and 1.95%, respectively), which suggested that the Scopus citation matching algorithm is more robust than that of WoS when reference data with pre-existing mistakes are uploaded. This may also explain the higher occurrence of so-called "phantom" citations, i.e., citations from papers that did not actually cite the target paper, in WoS as compared to Scopus (0.46% and 0.10%, respectively) [62,65].
On the other hand, several years ago it was shown that omitted citations are gradually being corrected in both DBs, although it is not clear whether it is done systematically, or only in response to errors that were reported in the literature [66,170]. Scopus reports having improved its precision for its citation matching algorithm up to 99.9% and a recall of up to 98.3% [12]. Yet, judging by the findings of the more recent study, precision seemed to be lower than that reported by Scopus, although the high recall was confirmed [40]. Meanwhile, WoS precision in citation matching was estimated at 100%, but recall was considerably lower (93.81%) [65]. However, the overall performance of the WoS citation matching algorithm was evaluated to be slightly better when compared to Scopus [66].
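The precision and recall figures quoted above can be read as follows: precision is the share of citation links recorded by the DB that are genuine (it is lowered by "phantom" citations), while recall is the share of genuine citation links that the DB managed to record (it is lowered by omitted citations). A minimal sketch with invented data illustrates the calculation; the function name and the toy link sets are assumptions for illustration and do not reproduce any of the cited studies.

```python
from typing import Set, Tuple

Link = Tuple[str, str]  # (citing paper, cited paper)


def precision_recall(true_links: Set[Link], found_links: Set[Link]) -> Tuple[float, float]:
    """Precision and recall of recorded citation links against a reference set."""
    correct = true_links & found_links
    # By convention here, an empty denominator yields 0.0.
    precision = len(correct) / len(found_links) if found_links else 0.0
    recall = len(correct) / len(true_links) if true_links else 0.0
    return precision, recall


# Papers A-D really cite X; the DB recorded A and B, plus a phantom citation from E.
true_links = {("A", "X"), ("B", "X"), ("C", "X"), ("D", "X")}
found_links = {("A", "X"), ("B", "X"), ("E", "X")}
precision, recall = precision_recall(true_links, found_links)
print(round(precision, 3), recall)  # 0.667 0.5
```

In this toy case the two omitted citations (from C and D) reduce recall, and the single phantom citation (from E) reduces precision, which mirrors the trade-off between the two DBs described in the text.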
Another error noted in both DBs, but more frequent in Scopus, is the presence of duplicate entries [30,40,66], which may have several causes. Duplicates may appear as a consequence of indexing the so-called "Article-in-Press", also referred to as "online first", publications, which are not yet assigned to a certain volume and issue of a source, but are already available at the publisher's website [152]. When an in-press paper is uploaded into Scopus, a publication entry is created with the document type "Article-in-Press". After the paper is assigned to a certain issue of a journal, a new entry for the publication is created [89]. However, the first entry with the "Article-in-Press" status is not always instantly removed [30]. Because Scopus is more focused on high precision than on recall, these versions of the document may not be merged if inconsistencies in the metadata between the items are detected [171]. Nevertheless, this Scopus error is very important, as the citations obtained by the "Article-in-Press" version of the paper tend to be lost after merging with the relevant official version of the paper [62,170,172].
Not long ago (at the end of 2017), WoS also made "Early access" articles available in the DB [173]. However, in WoS, after a document is assigned to a certain volume and issue of the journal, additional source information is added to the "Early access" publication's entry [174], instead of creating a new entry, as is done in Scopus. Thus, "Early access" entries should not cause duplicates in WoS. Nevertheless, errors related to online-first articles were determined to be present at almost equal rates (around 0.7%) in both DBs [62].
Duplicate entries may also be caused by the changes in source titles or their variations, which may result in articles being mapped in the DB to different journals from the same publisher. This also might influence citation analyses, as duplicate entries may inflate both publication and citation counts if both versions of the entry are being counted. On the other hand, the citation count may decrease if only one version of duplicate entry is included in the analysis, since citations may be split to both entries, counted only for one of the entries or be equal for both entries. In addition, in Scopus, duplicate and/or incomplete entries may occur due to the inclusion of Medline and Embase records [30,87].
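The effect of duplicates on citation counts described above can be made concrete with a small sketch. Since citations may be split between duplicate entries, attached to only one of them, or repeated on both, the true count for a duplicated paper is only known to lie between the largest single count and the sum of all counts. The record fields (`title`, `first_author`, `citations`), the helper names, and the sample data below are assumptions made for illustration, and the ASCII-only title key would need extending for non-English titles.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Dict[str, object]


def title_key(title: str) -> str:
    """Lower-case the title and collapse punctuation and spacing differences."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def duplicate_groups(records: List[Record]) -> List[List[Record]]:
    """Group records that share a normalized title and first author."""
    groups = defaultdict(list)
    for rec in records:
        key = (title_key(str(rec["title"])), str(rec["first_author"]).lower())
        groups[key].append(rec)
    return [group for group in groups.values() if len(group) > 1]


def citation_bounds(group: List[Record]) -> Tuple[int, int]:
    """Lower bound: counts are repeated on each entry; upper bound: counts are split."""
    counts = [int(rec["citations"]) for rec in group]
    return max(counts), sum(counts)


records = [
    {"title": "Errors in DOI indexing", "first_author": "Smith", "citations": 7,
     "doc_type": "Article-in-Press"},
    {"title": "Errors in DOI  Indexing.", "first_author": "smith", "citations": 12,
     "doc_type": "Article"},
    {"title": "An unrelated paper", "first_author": "Jones", "citations": 3,
     "doc_type": "Article"},
]
for group in duplicate_groups(records):
    print(citation_bounds(group))  # (12, 19)
```

Reporting both bounds, rather than silently summing or dropping entries, keeps the uncertainty visible until the duplicate is resolved manually.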
Both owners of the DBs also declare that all sources are indexed "from cover-to-cover". However, several authors have reported certain publications [85] and/or entire volumes [57,63] of indexed sources being missing in both WoS and Scopus. Elsevier states that, over the recent couple of years, Scopus has markedly improved its coverage integrity [12]. Meanwhile, inconsistencies in the coverage of the most recent literature may also appear due to the lagging in publication metadata uploading into DBs [28,173,175].
The accuracy of funding information (FI) that was presented in WoS and Scopus was also investigated. Several types of FI errors were noticed in Scopus. For instance, the presentation of funding text in Scopus was inaccurate for a considerable share of funded articles (FAs). In some cases, this was due to the ascribing of the institutions mentioned in the main article text as funding agencies, but, at the same time, the actual funding organizations were not identified. It was also noticed that Scopus tends to extract institution names as funding agencies, although they were mentioned in the acknowledgment section for other reasons. In the other cases, the partial or full information of funding agencies or grant numbers was missing, or funding organizations were misidentified [79].
The same kinds of mistakes in the extraction and coding of FI were also reported in WoS. For instance, the funders acknowledged in a publication were shown to be listed incorrectly in approximately 32% of the cases in WoS. WoS was shown to have missed at least one funder in about 11% of the records, whereas, in about 22% of funded publications, at least one additional funder was included by WoS. The authors also noted that these errors were more frequent in less popular publication types, such as letters, editorials, and conference papers [75]. Similar findings were obtained in another study, where the frequency of full or partial FI omission in WoS was determined to be approximately 12% [78]. The higher rates of omitted information were directly related to the length of the acknowledgement text. Yet, overall, a given funder was almost always included in the FO subfield, while specific grant information appeared in only about half of the entries, which was mainly ascribed to the incomplete information that was provided by the authors. Apart from that, various other mistakes in WoS FI were determined, such as inconsistent registering of co-funded grants or multiple grants being assigned to the same funder [78].
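The two funder error types quantified above, funders missed by the DB and additional funders included by the DB, amount to a set comparison between the funders acknowledged in the paper and those indexed in the record. A minimal sketch under that reading follows; `compare_funders` and the sample names are hypothetical, and real funder names would need far more careful normalization (acronyms, translations, name variants) than the case and whitespace folding used here.

```python
from typing import Dict, Iterable, Set


def compare_funders(acknowledged: Iterable[str], indexed: Iterable[str]) -> Dict[str, Set[str]]:
    """Classify funders as correctly indexed, missed by the DB, or added by the DB."""
    def normalize(names: Iterable[str]) -> Set[str]:
        # Fold case and collapse repeated whitespace only.
        return {" ".join(name.casefold().split()) for name in names}

    ack, idx = normalize(acknowledged), normalize(indexed)
    return {"correct": ack & idx, "missed": ack - idx, "additional": idx - ack}


result = compare_funders(
    acknowledged=["European Research Council", "NSF"],
    indexed=["european research  council", "Some University"],
)
print(result["missed"])      # {'nsf'}
print(result["additional"])  # {'some university'}
```

The "additional" set in this example corresponds to the case described in the text where an institution mentioned for other reasons is extracted as a funding agency.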

Inconsistencies in Subject Classification Schemes and Document Types
All of the sources indexed in the DBs are sorted by disciplines and subject fields in order to aid in information retrieval when aiming to narrow or specify the context of literature or journal search. However, different bibliographic DBs sort their indexed content based on their own individual subject classification schemes. This becomes problematic when data from both sources are used or compared, since the large-scale analyses usually involve the aggregation of data from multiple disciplines and subject fields highly differing in publication and citation practices, which, in turn, requires classifying the data under a common classification scheme. Thus, making an accurate delineation of subject fields is crucial for reliable bibliometric analyses, for calculating field-normalized indicators, and for studying disciplinary relations of research activities [64,173,176,177,178].
Scopus uses classification of two levels. All of the sources are divided into four major disciplines: Life Sciences, Physical Sciences, Health Sciences, and Social Sciences and Humanities. These disciplines are subdivided into 27 categories, which are further broken down into more than 300 subject fields, based on All Science Journal Classification (ASJC) [64]. Nowadays, WoS offers two separate classification schemes: by categories and by research areas. Classification by categories is more detailed. It consists of 252 categories, based on tASCA (traditional ASCA-American School Counselor Association) categories with codes, covering the main disciplines of Arts & Humanities, Life Sciences & Biomedicine, and Physical Sciences, Social Sciences, and Technology [28,64]. Classification by research areas was introduced in 2011. It comprises over 150 categories, covering broad disciplines of Social Sciences, Arts & Humanities, and Science & Technology [28,177,179].
WoS subject classification by categories seems to be the most popular and most commonly used within the scientometric community [64,180]. Scopus classification has also been employed, but mainly due to the use of Scopus as the data source for the study (e.g., [58]). Nevertheless, both WoS and Scopus classification schemes are generally regarded as being far from perfect, although their empirical comparisons are very scarce. Moreover, the classification schemes applied in both WoS and Scopus lack documentation adequately describing the methodology used to construct them, thus making their use for bibliometric analyses problematic [64]. Accordingly, setbacks that arise due to the incompatibility and flaws of the WoS and/or Scopus classification schemes have been (and still are) widely discussed among the scientometric community [4,44,173,178,180,181,182], and attempts have been made to solve them by proposing alternative classification methods [177,183,184,185,186,187,188,189].
It should be noted that, when new scientific topics emerge, the categories and their number in classification systems may change over time. Accordingly, journals already indexed in the DBs may be reclassified [176], which might affect journal's citation relations and impact [177]. However, the reliability of the results that were obtained from different DBs may be affected not only by the incompatible classification schemes, but also by a questionable assignation of indexed sources to the subject fields within the DBs [64].
An incorrect journal assignment to subject fields can be suspected in two situations: when a journal is assigned to a category to which it is weakly related, and when a journal shows a strong relation to a category that it is not assigned to. These relations may be determined in several ways, but the methods requiring the least time and effort are based on the comparison of citation relations between the journal in question and other journals in the category. This method was applied in the large-scale comparison of WoS and Scopus subject classification schemes. The analysis showed that WoS performs significantly better than Scopus, especially regarding the cases of journals with weak relations to the categories that they are assigned to. Meanwhile, the situation of journals showing strong relations to the categories that they are not assigned to was less pronounced in both DBs, and the difference between WoS and Scopus was far less noticeable. When both situations were evaluated at the same time, journals that were weakly related to their assigned category but strongly related to another category were identified in both DBs. However, the number of such journals was higher in Scopus as compared to WoS. On the other hand, the study only investigated journals with more than 100 citations, which might result in the inaccurate evaluation of small subject categories covering less popular and more difficult to assign subject fields. Additionally, one can argue that the higher numbers of inaccuracies determined in Scopus might be related to the higher numbers of Scopus journals included in the study due to the wider Scopus coverage, but the differences in the DBs' coverage were not accounted for [64].
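The two suspicious situations described above can be expressed with a single quantity: the share of a journal's citation relations (citing and cited) that connect it to journals of a given category. The sketch below is a simplified illustration of this idea and not the procedure of the cited study [64]; the function name, the journal labels, and the citation counts are all invented for the example, and any thresholds for "weak" or "strong" relations would have to be chosen by the analyst.

```python
from typing import Dict, Set, Tuple


def category_share(journal: str,
                   category_members: Set[str],
                   citation_counts: Dict[Tuple[str, str], int]) -> float:
    """Share of a journal's citation relations that involve journals in a category.

    citation_counts maps (citing journal, cited journal) to a number of citations.
    Journal self-citations are ignored.
    """
    total = 0
    inside = 0
    for (citing, cited), count in citation_counts.items():
        if journal not in (citing, cited) or citing == cited:
            continue
        other = cited if citing == journal else citing
        total += count
        if other in category_members:
            inside += count
    return inside / total if total else 0.0


counts = {("J", "A"): 30, ("B", "J"): 10, ("J", "C"): 60, ("J", "J"): 5, ("A", "B"): 100}
print(category_share("J", {"A", "B"}, counts))  # 0.4
print(category_share("J", {"C"}, counts))       # 0.6
```

In this toy case, if journal J were assigned to the category containing A and B but not to the one containing C, the higher share for the unassigned category would mark it as a candidate for closer inspection.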
More recent studies also reported examples of misclassification cases or other inaccuracies in disciplinary classifications. However, contrary to the aforementioned conclusions, fewer errors in subject classification were indicated in Scopus compared to WoS [49]. Yet, a general observation is that, in both DBs, subject assignment errors mostly occur in the cases of multidisciplinary sources, which cannot be properly classified at the journal level [64,177]. Meanwhile, WoS and Scopus classification systems both work at the journal level. In both DBs, journals are often assigned to multidisciplinary categories or to more than one subject field, but these practices were much more frequent in Scopus than in WoS. Additionally, journals in Scopus are usually assigned to a higher number of categories, and there are significantly more "multidisciplinary" categories in the Scopus classification when compared to WoS [58,64]. Apart from that, it was also noticed that books in Scopus lack detailed classification, as they are often only assigned to broad categories, but not to particular subject fields [90]. Journal misclassifications and/or assignment to multiple subject fields, especially in the cases of multidisciplinary sources, can, in turn, cause incompatible journal ranking positions between the DBs, as the same journal may appear at the top in one subject category, but at the bottom in another. In fact, while the correlation of journal ranks between WoS and Scopus was generally observed to be high, considerable variations in rank for individual journals were also noticed [48,58,181]. Moreover, journals might be assigned to a certain category in one DB, but be absent from the corresponding category in another DB [40,85]. Thus, journal lists representing certain categories provided by different DBs will most likely be different, and thereby should not be directly compared.
The results of publication analysis involving various types of documents could also be affected by the inconsistencies in document types between the DBs [26,95]. These inconsistencies may occur due to the fact that DBs differ by the document types that they index and by the classification of the documents, therefore some documents from the same source are present in one DB, but not in the other [59,95]. Moreover, the amounts of certain type publications in WoS may be overestimated, as WoS assigns conference papers published in regular journals as both articles and conference papers [40,52]. Thus, in analyses by document types these publications would be counted twice, while citation counts per publication might decrease [26]. On the other hand, for such publications, WoS additionally provides a precedence code to be used when it is necessary to attribute only one document type [28]. Some cases of such double assignment were also noticed in Scopus, but the occurrence of this error was estimated to be very low [62]. It was also noticed that chapters of authored books in WoS were (mistakenly) indexed as individual publications [13]. Meanwhile, at least some of the Scopus categories include non-academic papers from magazines that are classified as articles [190]. However, to the author's knowledge, the exact frequencies of mistakes in publication assignment to document types in WoS and Scopus have not been estimated.

Comprehensiveness of Information Provided by the DBs' Owners
Clarivate and Elsevier both provide extensive information describing WoS and Scopus DBs, including not only content coverage, but also all additional tools, features, and indexing policies of DBs. However, descriptive information is scattered across a multitude of information resources, which makes it very difficult and time-consuming to gather all of the relevant facts and characteristics that are necessary to create an overview of DBs and assess their suitability for a particular task.
Factual data describing the content coverage and other features of WoS and Scopus DBs (or most of it) are available on the official websites of vendors, presented as textual information, as well as various lists, fact sheets, and guides. Some information can also be retrieved from the web-interfaces (Help sections) of the DBs. However, the amount of available descriptive information is overwhelming. This is particularly apparent in the case of WoS, as information may be found at several of the owner's websites [191][192][193][194]. Clarivate also provides various fact sheets and reference guides that are available for downloading from LibGuides (and other) sites [195]. Thus, the same information is often presented in multiple resources (for instance, the availability of content selection descriptive information mentioned earlier). The factual data do not always completely coincide, as separate information sources differ in their preparation and updating dates (if the dates are indicated at all). Yet, even when the dates of the last update are indicated to be very recent, the information provided is often clearly not the most recent one, especially regarding the numbers describing DBs coverage, which can confuse users in determining which information they can rely on [23]. More importantly, the full set of descriptive WoS CC information cannot be obtained from a single source. This is well illustrated in Table 1, which summarizes the main numbers describing Scopus and WoS CC coverage.

Table 1. Main numbers describing Scopus and WoS CC coverage.

| Feature | Scopus | WoS CC | Ref. |
|---|---|---|---|
| Source titles (overall number) | >25,100 (active) | >21,400 (journals only) | [196] |
| Journals with impact metrics | ~23,500 | >12,100 (JCR) | [197] |
| Full Open Access sources | >5500 | ~5000 (JCR-1658) | [197,198] |
| Hybrid sources | n.i. | JCR-7487 | [197] |
| Conferences (number of events) | >120,000 | ~220,000 | [196] |
| Conferences (number of entries) | >9.8 M | >10 M | [28] |
| Books (overall number of items) | >210,000 | >119,000 | [196] |
| Book series | >850 | >150 (BkCI) | [28] |
| Patents (number of entries) | >43.7 M | Not included in WoS CC | [196] |
| Trade publications | 294 | Not indexed in WoS | [28] |
| Publishers | >5000 | >3300 (JCR), >600 (BkCI) | [114,199] |
| Author profiles | >16 M | n.i. | |
| Cited references (overall number) | >1.7 B | >1.6 B | [196] |
| Cited references (covered time-frame) | since 1970 | since 1900 | [196] |
| Entries dating back to | 1788 | 1864 | [201] |

Scopus descriptive information is also available through several resources. General information regarding Scopus is available at the Elsevier website [202]. Yet, the information about Scopus is more concentrated, and practically all of the features are clearly explained at the Scopus Access and Use Support Center [203]. Moreover, the website offers a convenient information search system, which allows searching within a selected category and sorting information by intended purpose. Although Elsevier provides fewer descriptive documents, more detailed descriptions of newly implemented features, including visual examples, are also provided in the Scopus Blog [204]. Apart from that, both DBs' owners also offer visual training materials and organize webinars, explaining the most effective use of their products [205,206].
On the other hand, both vendors also offer several more detailed and more inclusive descriptive resources, freely available for downloading. Elsevier provides the "Scopus Content Coverage Guide" [89], describing not only the coverage aspects of the DB, but also journal inclusion and metadata indexing policies. The most recent version currently available for downloading at the Elsevier webpage [146] and at the Scopus web-interface was last updated in January 2020. A similar, but broader and much more detailed, WoS CC descriptive document is also provided by Clarivate [28]. However, the last available version was updated in July 2018. Similarly extensive information describing the indicators implemented in WoS and additional features is provided in the "Indicators Handbook" [207]. It should be mentioned that this resource is mainly focused on indicators that are available through the InCites Benchmarking & Analytics tool, which is not a part of WoS CC. Both documents are available at the LibGuides site [191]. It should also be kept in mind that, since both WoS and Scopus are commercial DBs and major competitors, their vendors are mostly interested in promoting their products. Thus, certain facts may be somewhat subjective or exaggerated.

Content Coverage Evaluation by Source Lists
The fastest and easiest way to compare source coverage between particular DBs is to compare the source lists provided by the owners of DBs [27,47]. However, the quality and comprehensiveness of information provided in the lists are not usually discussed in the literature.
An accurate evaluation of WoS CC content coverage is highly challenging, as WoS is composed of a variety of specialized indexes and the composition of WoS CC can be modulated by different subscription terms [15,42]. Clarivate annually provides a full list of journals included in JCR where full and abbreviated journal titles, country/region, and inclusion to SSCI or SCIE indexes are indicated. However, because JCR only includes journals indexed in SCIE and SSCI, this list does not represent even a full list of journal titles indexed in WoS CC. Lists of the journals indexed in all WoS CC SCIE, SSCI, A&HCI, and ESCI indexes and in specialty collections: Biological Abstracts, BIOSIS Previews, Zoological Record, and Current Contents Connect, as well as the Chemical Information products can be downloaded from Master Journal List (MJL) page [116]. Recently, a separate list of journals included in JCR has also been made available for downloading at the MJL website. It should be noted that a free MJL login is required to access the downloadable files.
The downloadable lists provided at the MJL website have only recently been improved: the earlier unnumbered journal tables in PDF format [47] were very inconvenient to analyze and compare, whereas all of the lists are currently provided in CSV format. However, the lists still contain very little additional information regarding the journals, as each list only provides the journal title, ISSN/eISSN, publisher name and address, and WoS subject categories, which might be insufficient for certain tasks. Additional information, including citation metrics, peer review details, open access information, and more, can be found on the journals' profile pages, but it can only be viewed and collected by searching for journals manually [27]. Additionally, MJL does not provide any explanatory information for the lists. For example, it is not clear to what extent the titles in these lists overlap between different indexes.
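Since the lists carry ISSN/eISSN values, the overlap between two downloaded lists can be estimated by matching on those identifiers. The following is a minimal sketch, not a vendor-provided procedure: the column names "ISSN" and "eISSN", the function names, and the sample rows are assumptions made for illustration, and real exports may label their columns differently.

```python
import csv
import io

def normalize_issn(raw):
    """Return an ISSN as 8 uppercase characters without a hyphen, or None if malformed."""
    cleaned = raw.replace("-", "").strip().upper()
    last_ok = len(cleaned) == 8 and (cleaned[7].isdigit() or cleaned[7] == "X")
    if last_ok and cleaned[:7].isdigit():
        return cleaned
    return None

def load_journals(csv_text, columns=("ISSN", "eISSN")):
    """Return one frozenset of normalized ISSNs per row (rows without any ISSN are skipped)."""
    journals = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        issns = {normalize_issn(row.get(col) or "") for col in columns} - {None}
        if issns:
            journals.append(frozenset(issns))
    return journals

def count_shared(csv_a, csv_b):
    """Count journals in list A that share at least one ISSN with any journal in list B."""
    journals_a = load_journals(csv_a)
    issns_b = set().union(*load_journals(csv_b))
    shared = sum(1 for issns in journals_a if issns & issns_b)
    return shared, len(journals_a)

# Hypothetical sample data standing in for two downloaded index lists.
LIST_A = "Journal title,ISSN,eISSN\nAlpha Journal,1234-5678,\nBeta Letters,2345-6789,2345-678X\n"
LIST_B = "Journal title,ISSN,eISSN\nBeta Letters,,2345678x\nGamma Review,3456-7890,\n"

shared, total = count_shared(LIST_A, LIST_B)
print(f"{shared} of {total} journals in list A also appear in list B")
```

Matching on either the print or the electronic ISSN avoids missing a journal that is recorded with only one of the two identifiers in one of the lists. Title matching is deliberately avoided here, since title variants are harder to normalize.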
However, evaluating the coverage of other source types in WoS CC is even more complicated for non-bibliometricians, especially in the cases of books and conference material, since JCR and MJL only include journal type sources (with minor exceptions). According to information provided by Clarivate Product Support, Customer Service, a small number of book series and conference proceedings, most of which were added to coverage in SCIE and/or SSCI many years ago, may also be included in JCR and have impact metrics calculated for them (N. Begum, personal communication, 10 September 2020), but this information is not mentioned in any provided JCR description.
Conference proceedings in WoS CC are grouped under separate Conference Proceedings Citation Index (CPCI), CPCI-Sciences (CPCI-S), and CPCI-Social Sciences & Humanities (CPCI-SSH). However, there are no lists of conference proceedings available for downloading or even for browsing. As for books, Clarivate offers separate Master Book List (MBL) [208] website for searching book titles. Books can be searched by title, series title, ISBN, or publisher. The link to the full list of publishers is also provided [199]. However, there is no possibility to download a list of indexed books. Additionally, according to Clarivate, books are integrated within other indexes [196]. Thus, the book search in the MBL may not be limited only to WoS CC indexes, and it is not clear whether content access restrictions that are defined by subscription terms are relevant to the search. The lack of exhaustive source lists describing the coverage of other sources besides journals was also noted in literature [62,170].
Regarding Chemical Information products, according to the official description of WoS CC, two specialized chemical indexes (Current Chemical Reactions (CCR) and Index Chemicus (IC)) are also a part of WoS CC [196], but apparently they are not always accessible with a basic WoS CC subscription. For instance, our university subscribes to WoS CC, but it has no access to IC and CCR. However, no explanations are provided regarding such cases.
Aiming to broaden geographical and language coverage, WoS has included several regional indexes (SciELO, KCI, RSCI, and ACI), which may be accessible with the basic WoS CC subscription, but the terms of their accessibility are not explained. Moreover, the content overlap with the main WoS CC indexes is unclear and difficult to determine, since the lists of source titles covered in the regional indexes are not provided in the MJL. A brief description of the indexes, including coverage extent and time frame, content providers, and the main subject fields covered (for most of the indexes), can be accessed from the WoS website [193], at additional sites [209,210], and via the WoS "Help" directory. On the other hand, WoS provides an extensive descriptive WoS CC document with its own analyses of content coverage by disciplines, languages, and countries, including overlap analyses between different indexes [28]. However, the set of sources included in the analyses differs with no clear explanation.
Apart from that, although ESCI is a part of WoS CC, the ESCI BackFile covering literature back to 2005 [211] is not included in the basic subscription and has to be purchased separately [55]. Thus, it is not always entirely clear whether certain citation indexes are a part of WoS CC or are accessible only with an additional subscription (as for the already mentioned chemical and regional indexes). Moreover, the accessibility of content might be further restricted by time-frame limitations (for instance, in the case of Lithuanian universities subscribing to WoS CC, the available data reach back only to 1990). Therefore, the composition of WoS CC and the extent of the content actually accessible to users of a particular institution may differ significantly from that described by the owner (Table 1) [15,29].
Meanwhile, in Scopus, the full lists of source titles (including both currently active and discontinued titles) are provided, and they can be downloaded in Excel file format at the Scopus Source page (requires user login) [115] and at the Elsevier website (registration is not required) [146]. Two separate source lists are available for downloading at the Source page: one list is named "source titles and metrics", and the second one "source titles only". However, both lists present much wider information than could be expected. The first list provides the full source title, Scopus Source ID, source type, publisher, Open Access status, ISSN/ISBN numbers, and classification by Scopus subject areas and by ASJC classification codes, along with various measures describing the quality of sources, such as citation counts, percent of cited documents, percentiles, ranks, and quartiles in every subject category to which the source was assigned, scholarly output, and Scopus metrics (CiteScore, SNIP, and SJR) of the last ten years. Titles with CiteScore Percentiles from the 99th to the 90th (Top 10%) are also marked separately.
The second list, although named as presenting only source titles, actually provides much more information regarding the sources. In addition to current, former, and other related titles, it also lists sources' Scopus IDs, ISSN/ISBN numbers, activity and Open Access statuses, coverage time frame, article languages, inclusion in Medline, "Article-in-Press" availability, source type, publisher's name and country, classification by disciplines and by ASJC codes, and the values of impact indicators (CiteScore, SNIP, and SJR) for the preceding three years. Moreover, titles that were included in the DB during the current year are also marked separately.
It should be noted that both Scopus source lists (except for the conference proceedings in the second "titles only" list) are formatted as pivot tables. Therefore, the source data provided in these spreadsheets can be easily sorted and filtered by all of the listed variables, which makes them very convenient to use. Moreover, an explanation of Scopus metrics, ASJC code classification, and/or information regarding the inclusion of Medline content is also provided in a separate sheet in both lists.
In addition to journal type sources, conference proceedings are also included into the aforementioned Scopus source lists. Proceedings are listed separately in the second ("titles only") list: in one sheet serial conference proceedings (with profiles) are listed, and in the separate sheet all indexed conference proceedings are listed. Apart from conference source titles, these lists also provide names of the conference events, Scopus source IDs, ISSN (if available), year of publishing, ASJC code classification, and values of the main metrics for the last three years (if applicable). For serial conference proceedings with profiles, coverage time frame is also listed and titles discontinued by Scopus due to quality issues are marked. Moreover, the lists also include book series and trade publications.
Apart from that, a separate, more detailed list of all books indexed in Scopus can be downloaded at the Elsevier website [146]. This list is also provided in Excel format. Books are divided into separate sheets: all of the non-serial books are listed in the first sheet, book series are listed in the second one, and, in the third sheet, book series are listed by individual volumes. The list contains information on book titles, ISBN/ISSN numbers, publication year, the publisher's imprint and its grouping under the main publisher, and general classification by subject areas.

Search and Online Analysis Capabilities
With the growing amount of scientific literature, the retrieval of the most relevant information becomes increasingly important, especially when performing systematic literature reviews and meta-analyses, as well as when searching for information for other tasks. Thus, the performance of the search engines implemented in the bibliographic DBs and the search options provided may strongly influence the suitability of a DB for a particular task [24,42,84,212].
Various search options are available in both DBs. WoS allows for performing the basic search, cited reference search, advanced search, and author search [105]. In Scopus, one can search for documents, authors, and affiliations, and perform an advanced search [108]. The basic search can be focused on various search fields, with the common ones between the DBs being: title, topic, author, language, funding information, source information, affiliation information, references, DOI number, ORCID identifiers, and more. Additional search field rows, allowing users to search by several different criteria at once, can be added in both DBs. A secondary search within the results can also be performed by entering additional terms in the "search within results" window. Additionally, for registered users, previous searches can be edited, combined, saved, and set as RSS feeds or alerts. However, neither of the DBs provides the possibility to search documents by publisher.
A separate option for author search is available in both DBs. In Scopus, the initial author search can be performed by the last name and by the first name or an initial. The search can be limited by an affiliation, and the ability to search for author by ORCID is also available [108]. In WoS, authors can only be searched by their last names and first names or initials. On the other hand, differently than in Scopus, WoS allows performing single search of an author, including additional name variants in the same search query. In addition to the search by ORCID, WoS allows for searching authors by their Researcher IDs [213].
Author searches in both DBs bring the list of matching authors, but the information provided in the result lists and refinement options are different. In Scopus, not only the name of the author, but also the current institutional affiliation of the author, city, country, number of documents, with the ability to view the most recent one's title, and h-index value are presented for every author in the list. Meanwhile, WoS additionally provides alternative name variants, publishing time frame, and the most popular source titles for every author in the list. The retrieved list of authors can be further refined by author name, organization, and subject categories in WoS, and by affiliation, city, country, and source titles in Scopus. Post query refinements are helpful (and often even necessary, especially in WoS) in the cases of authors with very common names [4,13].
Although author disambiguation is more straightforward in Scopus, as every author is assigned a unique identifier, in some cases, authors may still have multiple IDs. A merge of selected authors from the list can be requested, but there is no possibility for an instant preview of the combined results. On the other hand, the potential author matches can be viewed and grouped on the Scopus author's profile page using the "Potential author matches" link [108]. However, this function is only available to subscribed users. Meanwhile, in WoS, potentially matching authors can be temporarily grouped directly from the author search results page [213].
Searching for an institution by WoS basic search may be performed in two ways: by using the Organization-Enhanced or the Address field. In the first case, the institution of interest may be selected from the Index, which lists institutions disambiguated by the Organization-Enhanced (OE) system. However, it should be kept in mind that not all organizations have been unified yet [214]. Poor coverage of institutions in the WoS Organization-Enhanced list was confirmed by a recent study, where only 20% of institutions from a sample of 445 were retrieved by the OE tool [72]. According to the WoS owners, the search in the address field is more accurate, as it searches the complete author affiliation, including country, postal code, department, or organization abbreviation, while Organization-Enhanced only searches for the unified organization name [214]. The search query should be formulated very specifically towards the searched institution in order to minimize the retrieval of false positive results. However, although a full coverage of sample institutions was achieved by searching institutions with their normalized name variants using the address field, this method offered poor precision (0.61) and recall (0.74) and, therefore, may not be considered a viable alternative [72]. Meanwhile, in Scopus, the institution search option is listed separately among other search options as "Affiliation" search. The search can be conducted by using the institution's name or a part of it. Moreover, unlike in WoS, the affiliation search retrieves a list of institutions, not publications [215].
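For readers less familiar with the precision and recall values quoted above, the following minimal sketch shows how the two measures are conventionally computed from a set of retrieved records and a set of truly relevant records. The record identifiers are hypothetical, and the sketch does not reproduce the evaluation protocol of the cited study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set judged against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: share of retrieved records that are actually relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant records that were actually retrieved.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: four records retrieved by an address-field query,
# three records that truly belong to the institution.
retrieved_ids = {"rec1", "rec2", "rec3", "rec4"}
relevant_ids = {"rec1", "rec2", "rec5"}

precision, recall = precision_recall(retrieved_ids, relevant_ids)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

A precision of 0.61 thus means that roughly four in ten retrieved records are false positives, while a recall of 0.74 means that roughly a quarter of the institution's records are missed.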
Both DBs also provide the ability to search for sources. WoS sources can be searched in several ways. JCR is the preferable tool for journals indexed in SCIE and SSCI, also allowing sources to be compared [114]. However, JCR does not include journals indexed in other WoS CC indexes. Journals indexed in all major WoS CC indexes (SCIE, SSCI, A&HCI, and ESCI) can be browsed and searched at the MJL page [116], which is directly accessible from the DB's interface. Meanwhile, books can be searched at the MBL page [208], but a direct link is not provided. WoS does not seem to provide any additional tools for searching for conference proceedings. The only option to locate these source types is to perform a basic search using the publication name field within the CPCI indexes. However, in this case, the search will retrieve all of the documents published in sources with titles matching the search query used. Therefore, there is no guarantee that all of the retrieved documents are published in the same source. In particular, the fact that conference titles are usually long and may be presented very differently makes it difficult to formulate an accurate search query [57]. Meanwhile, in Scopus, all of the indexed periodical sources, including journals, book series, conference proceedings, and trade publications, can be searched, browsed, compared, and analyzed directly at the Source section of the DB's interface. The only exception applies to patents, as they are provided as a separate list in search results [53].
Both DBs provide opportunities to perform a more comprehensive search by using advanced search option. Advanced search can be performed by an exact phrase, truncated words, or using wildcards. The search with Boolean operators was also shown to work properly in both DBs [42]. The content of DBs can be searched by a large variety of field tags, including the ability to search for Medline records (by PubmedID), funding information, to perform cited reference search (including patent citations), and more [28,89]. The existence of institutional and author identifiers in Scopus allows for employing them using the advanced search option. Meanwhile, although using WoS advanced search the particular author may be selected from the Index, the search still automatically groups the publications of other matching authors together.
The precision and recall of the search engines of both DBs were estimated to be very high, although a stronger focus on recall than on precision was also observed. The precision of all searches can be increased by applying post query refinements [42]. Scopus was shown to perform better than WoS in regard to both precision and sensitivity. However, for both WoS and Scopus, cited-reference search was shown to be the most suitable when a maximal count of relevant results is required, since reference searches retrieved approximately three times more results than keyword searches [83]. Meanwhile, keyword searches were shown to be more efficient in WoS than in Scopus [168].
In any case, it should be noted that the recall rates may differ greatly between the DBs due to the different coverage of relevant content [23,27,43,57]. Additionally, the available search options in WoS CC are highly dependent on the underlying databases used. For instance, search options are narrowed when a search is performed within "All databases" [42]. It should also be kept in mind that all of the search results in WoS are only retrieved from the editions and years available through the institution's subscription package. The only exception is the Cited Reference search, which does not depend on the subscription type, since "Citing Articles" counts reflect citations from all years and all editions of the WoS CC, even those that are not subscribed [105]. Meanwhile, in Scopus there are no such limitations, as all content of the DB is subscribed as a whole product by a single type of subscription [53].
Apart from that, several useful observations pointed out by the authors of previous studies are worth mentioning. Firstly, since WoS only records acknowledgments if they contain funding information [74,76], WoS FI may not be suitable for broader explorations of other kinds of acknowledgements. For instance, when investigating acknowledgements to libraries in WoS indexed papers, a WoS FT search only retrieved 56% of the articles actually containing that information, and only if funding was mentioned [104].
FI in WoS may be incomplete, with certain data fields being absent, as was already discussed earlier. Thus, the search should be performed in all FI fields (FT, FO, and FG) separately in order to retrieve as complete a set of funded publications from WoS as possible [77]. Also, according to Clarivate, the process of unifying funding agencies' names is still ongoing. Thus, users have to search for all possible variants of funding agencies' names [103]. Meanwhile, Scopus FI has not been investigated as extensively as that of WoS, but it was shown that Scopus funding information suffers from the same and even greater deficiencies [79]. Thus, the same precautions should be taken when searching for FI in Scopus.
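The need to query each funding field separately, with every known name variant, can be handled with a small helper that assembles the query strings. This is only a sketch under stated assumptions: it assumes the `TAG=(term OR term)` form of WoS advanced search queries, and the helper name and the funder name variants are hypothetical. The FG field is left out of the default because it is assumed here to hold grant numbers rather than agency names; it can be passed explicitly with grant number values.

```python
def funding_queries(variants, tags=("FO", "FT")):
    """Build one advanced-search query string per funding field tag.

    variants: name variants of a funding agency (or grant numbers when tags=("FG",)).
    tags: field tags to query separately; the defaults are an assumption of this sketch.
    """
    if not variants:
        raise ValueError("at least one search term is required")
    # Quote each variant so that multi-word names are searched as exact phrases.
    joined = " OR ".join(f'"{term}"' for term in variants)
    return [f"{tag}=({joined})" for tag in tags]

# Hypothetical name variants of a single funder.
queries = funding_queries(["National Science Foundation", "NSF"])
for query in queries:
    print(query)
```

Running the queries separately and merging the result sets afterwards follows the recommendation above, since a record may carry the funder name in only one of the fields.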
A search by DOI identifiers is another important search option. However, the field tag "DO" in the WoS advanced search also searches in the article number field, which might be related to the fact that a separate field tag for a search by article number is not provided; this point has not been explained by the WoS owner and may confuse users. Users should manually add the prefix "ARTN" before the "article number" values in order to retrieve an article by the DOI search. Meanwhile, unlike WoS, Scopus allows searching by the article number. Nevertheless, DOI searches in Scopus were also shown to retrieve inadequate numbers of records, although in this case the problem was mainly ascribed to illegal DOI names [169].
The obtained search results or any other sets of documents can be further sorted, filtered, selected, excluded, and analyzed in both DBs. The DBs also provide tools for performing online data analysis at their web-interfaces: WoS offers the "Create Citation Report" and "Analyze Results" tools. The "Create Citation Report" tool provides citation information for the investigated publication set, including the h-index, which is presented both with and without self-citations, and its calculation graph. The "Analyze Results" option provides wider and more detailed information regarding the analyzed publication set, allowing it to be sorted by authors, co-authors, publication years, countries, institutions, source titles, funding information, and the main research areas. The refinement of search results by the aforementioned features may also be performed directly within the search results window by applying the respective filters [105]. Very similar online analysis opportunities are also provided in Scopus. Practically all lists of retrieved results in Scopus can be viewed in search results format, and then further analyzed using the "Analyze search results" and "View citation overview" tools [108]. The obtained document lists can also be refined by access types, years, authors, subject areas, funding sponsors, document types, affiliations, sources, publishers, countries/regions, and languages.
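As a reminder of what the h-index reported by such citation reports expresses, the following minimal sketch computes it from a list of per-publication citation counts using the standard definition (the largest h such that h publications have at least h citations each). The citation counts are hypothetical, and the sketch makes no claim about how either DB implements the calculation internally.

```python
def h_index(citation_counts):
    """Largest h such that at least h publications have h or more citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical publication set: total citations and citations after removing self-citations.
with_self_citations = [10, 8, 5, 4, 3]
without_self_citations = [8, 6, 3, 2, 1]

print(h_index(with_self_citations), h_index(without_self_citations))
```

The example illustrates why the two variants are reported side by side: removing self-citations can lower the index even when the ranking of publications is unchanged.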
The results of the performed searches, analyses, and citation reports in both DBs are presented as lists, tables, and/or graphs. The lists can be exported (with certain limitations) and graphs can be downloaded. Scopus also allows for exporting the selected refinement options using the "Export refine" tool. The lists of results or references can also be exported to the Mendeley or EndNote reference management tools or in other formats in both DBs [105,108]. Additionally, Scopus provides an opportunity to create a bibliography of the selected documents in more than ten of the most popular citation styles using the "QuikBib" tool [216].
It should be mentioned that, opposite to WoS, in Scopus the names of the authors, publication, and source titles, and even institutional affiliations almost everywhere (except the refinement panel), are presented as the active links redirecting to the particular profile or publication's details page. Meanwhile, in WoS, only publication titles and journal names are linked to the publication's and journal's details pages, respectively. Yet, in the case of source titles, this only applies for journals that are included in JCR [114].

Data Export Limitations
Large-scale bibliographic analyses are highly challenging due to the amount of required data, since large data sets usually cannot be directly analyzed at the DBs' web-interfaces due to the limited online analysis capabilities. Possibilities to export or download data from the web-interfaces for external analyses are also usually limited, and the number of data rows available to export in one batch differs between the DBs [9,26]. For example, WoS allows for creating data sets of up to 50,000 data rows in one session (for citation analyses, up to 10,000). Analysis data can be extracted as displayed or as all data rows (up to 200,000 rows), which can be downloaded as a tab-delimited text file. However, the export of citation analysis results is limited to 500 rows at a time [28,105]. In Scopus, the export limit reaches 2000 data rows, but Scopus also allows exporting up to 20,000 results (in "Citations only" (CSV) format) using e-mail services. Yet, export to reference management tools is limited to 500 document entries at a time [217].
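In practice, such per-batch limits mean that a larger result set has to be exported in consecutive record ranges. The following is a minimal sketch of how those ranges could be planned; the function name is hypothetical, and the limits passed in are simply the figures quoted above, which may change over time.

```python
def export_batches(total_records, batch_limit):
    """Yield 1-based, inclusive (start, end) record ranges that respect the export limit."""
    if batch_limit <= 0:
        raise ValueError("batch_limit must be positive")
    for start in range(1, total_records + 1, batch_limit):
        yield start, min(start + batch_limit - 1, total_records)

# Example: 4500 search results exported with a 2000-row limit per batch.
batches = list(export_batches(4500, 2000))
print(batches)
print(f"{len(batches)} separate exports are needed")
```

The same helper shows how quickly manual export becomes impractical: with a 500-row limit, a set of 50,000 records would require 100 separate exports.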
Therefore, in most cases, large data sets are extracted through the API interfaces of the DBs (if available). However, since both WoS and Scopus are commercial, their content is protected by restricting access to raw data [42], and access to their APIs usually requires additional payment [32]. According to Elsevier, however, Scopus provides free access to a basic API version of Scopus data for research purposes without a subscription requirement, allowing limited access to basic metadata for most citation records, while full API access is only granted to subscribed users, although it does not require additional payments. Moreover, Elsevier has recently established the International Center for the Study of Research as a "virtual laboratory" that provides free access to Scopus data for research purposes [12]. Meanwhile, WoS offers several different API types, but access to them has to be acquired separately, with additional conditions for use [28,218].
Raw data from both DBs can be further analyzed by various bibliometric software tools. However, this approach, as well as using APIs for data retrieval, requires some kind of programming skills [9,42]. Another possibility for analyzing larger data sets is to use additional online tools, such as InCites (for WoS data) and SciVal (for Scopus data) [26,219]. However, both of these tools are not a part of the DBs and they are only accessible with separate (and costly) subscription. On the other hand, almost all software tools and libraries currently used for bibliometric analyses can import data extracted from WoS and Scopus [9].

Citation Impact Indicators Implemented in WoS and Scopus
The number of journals indexed in the main bibliographic DBs has significantly increased and is still growing [220]. Therefore, a journal's inclusion in WoS or Scopus alone is no longer a sufficient criterion for evaluating its quality. The need for instantaneous quantitative indicators that are suitable for use in evaluating research and helping with daily tasks led to the constant development of impact indicators that are simple enough to calculate and interpret, but are also capable of correcting the main issues in citation analyses. Consequently, over twenty different bibliometric indicators are currently available, and new versions or alternative metrics are still being actively developed [26].
WoS and Scopus are the main sources of the most prevalent scientific impact indicators; therefore, a comparison of these DBs would not be comprehensive without mentioning the impact metrics provided by these data sources. In particular, because citation metrics are calculated using only data from the particular DB, they depend on the width and depth of the DB's coverage. Although comparisons and assessments of impact indicators are one of the main topics in scientometric literature, their main features, especially their differences and limitations, bear constant repeating, since impact indicators are one of the main factors determining the selection of the most appropriate data source. Yet, only a minority of stakeholders employing bibliometric indicators are adequately familiar with their meaning and appropriate use [2,4,5,7]. However, only the main features of WoS and Scopus journal impact indicators, including their meaning, purpose, main drawbacks, limitations, and usage precautions, will be discussed, as bibliometric indicators are not the main focus of this work. More detailed descriptions and comparisons of impact indicators are provided in a number of literature reviews [2,26,45,46,221,222].

Basic Journal Impact Indicators
WoS and Scopus offer different journal impact indicators. The most well-known, the Journal Impact Factor (JIF), and various other impact indicators, including Eigenfactor metrics (Eigenfactor Score (ES) and Article Influence Score (AIS)), are available in WoS [207]. Meanwhile, Scopus provides CiteScore (CS) metrics, along with other, more advanced indicators: SCImago Journal Rank (SJR) and Source Normalized Impact per Paper (SNIP) [175].
JIF was developed by Eugene Garfield more than half a century ago [223] and it was originally intended to define the quality of journals to provide high quality indexed content, and for publication or subscription purposes. However, today, JIF is often misused due to a misinterpretation of its meaning, as it has become a standard tool in virtually all evaluation practices at all levels, usually without proper considerations of the rationale for its use [2,44,222,[224][225][226][227].
The classical JIF is defined as all citations to the journal in the current JCR year to items published in the previous two years, divided by the total number of scholarly items that were published in the journal in the previous two years [228]. However, the JIF calculation method leads to several biases that limit its application and it is continuously discussed and criticized in scientometric literature concerning bibliometric indicators and their application in research evaluation practices [222,224].
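The verbal definition above can be written compactly as follows. The notation is introduced here for illustration only and is not taken from the JCR documentation: C_y(t) denotes the citations received in JCR year y by items the journal published in year t, and N_t denotes the number of scholarly items the journal published in year t.

```latex
\mathrm{JIF}_{y} \;=\; \frac{C_{y}(y-1) + C_{y}(y-2)}{N_{y-1} + N_{y-2}}
```

For example, a hypothetical journal that received 300 citations in year y to items from the two preceding years, and published 120 scholarly items in those two years, would have a JIF of 300/120 = 2.5.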
The most obvious limitation of the JIF is the short two-year citation time window. This issue is even more obvious when applying JIF in subject fields where citations mature slower or are delayed, as in Mathematics, Social Sciences and Humanities [26,46,224,229]. Thus, JCR also provides five-year JIF, which is basically the same JIF with the time window for cited documents extended to five years. WoS also calculates an Immediacy index, which is basically a one-year JIF [26,45].
Yet, probably the most debated feature of JIF is the inconsistency of document types included in its numerator and denominator. In the numerator, citations are counted to all document types, and only specific types of documents, so-called "citable items", namely articles, reviews, and proceedings papers are included in the denominator, which represents the total amount of cited papers [45,224]. It should be noted that, although ESCI journals are not included in the JCR and do not receive JIF (and other JCR metrics), the citations from ESCI will accrue to all articles in the WoS CC and contribute to the JIF numerator [28].
The inconsistency of document types that are included in JIF calculation opens the possibilities for manipulation. JIF values can be deliberately distorted by publishing more reviews, which are more highly cited than regular articles, or by publishing case reports, letters to editors, and book reviews, which are being cited, but not included in the total paper count in the JIF denominator. In this way, JIF also shows a certain level of discrimination against less popular publication types that, in certain disciplines, are as important as original research articles [2,26,45,46,224,225,229].
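The effect of this numerator and denominator mismatch can be seen in a toy calculation. The sketch below is illustrative only: the record layout, the function names, and the citation counts are assumptions, and the set of citable types simply follows the list given above.

```python
CITABLE_TYPES = {"article", "review", "proceedings paper"}


def jif_style_ratio(records):
    """Citations to ALL document types divided by the count of citable items only."""
    numerator = sum(cites for _, cites in records)
    denominator = sum(1 for doc_type, _ in records if doc_type in CITABLE_TYPES)
    return numerator / denominator


def consistent_ratio(records):
    """Citations to citable items divided by the count of citable items."""
    citable = [(t, c) for t, c in records if t in CITABLE_TYPES]
    return sum(c for _, c in citable) / len(citable)


# Hypothetical journal: (document type, citations received in the JCR year)
records = (
    [("article", 2)] * 4      # 8 citations
    + [("review", 6)]         # 6 citations
    + [("letter", 1)] * 5     # 5 citations, but not counted in the denominator
)

inflated = jif_style_ratio(records)     # 19 citations / 5 citable items
consistent = consistent_ratio(records)  # 14 citations / 5 citable items
```

In this made-up case, the five letters add citations to the numerator without adding items to the denominator, so the JIF-style ratio is higher than the ratio computed over citable items alone.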
JIF values can also be artificially inflated by other means, for example, by pressuring authors to reference articles from a particular journal, as journal self-citations are not excluded from the calculation of the classical and five-year JIF [224,225,230]. On the other hand, a JIF version without self-citations, which is calculated in the same way as the classical JIF but with journal self-citations excluded from the numerator, is also available in JCR [45].
In response to the aforementioned JIF drawbacks and limitations, Scopus launched the CiteScore (CS) impact indicator in December 2016. Because the basic principle of calculating CS is very similar to that of JIF, CS can be considered to be the Scopus version of JIF. However, there are some fundamental differences, mainly aimed at correcting the main limitations of JIF [175,231]. Firstly, the time window applied is larger than in the classical JIF and, more recently, it has been increased from three to four years. Currently, CS counts both citing and cited papers published during the previous four years, as compared to the two-year time frame applied in the classical JIF [232]. Citations are also counted over the four-year period, whereas in JIF citations are only counted from the current JCR year. Scopus also offers the CiteScore Tracker, which shows the trend of a source's upcoming CS value (similar to the WoS Immediacy Index). It is calculated for the current (citation) year, rather than for previous, complete years, and it is updated monthly [175].
The second and most important difference is that, in CS calculation, the publication types included in numerator and denominator coincide and include all document types indexed in Scopus. In the recently updated CS version, the following publication types are included: articles, reviews, conference papers, data papers, and book chapters [232]. It should be noted that, in both JIF and CS calculations, citations to "Early Access" and "Article-in-press" publications (respectively) are not included. However, due to the consistency between CS numerator and denominator, the CS values would often be lower than the JIF values for the same journals [175].
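The CS principle described above can be sketched as follows. This is a simplified illustration and not the Scopus implementation: the data layout and names are hypothetical, and the four-year window is assumed to end with the CiteScore year itself.

```python
CS_TYPES = {"article", "review", "conference paper", "data paper", "book chapter"}


def citescore(documents, cs_year, window=4):
    """Citations received within the window by documents published within the
    same window, divided by the number of those documents.

    Each document is a dict with keys 'year', 'type', and 'citations_by_year'.
    Assumption: the window covers cs_year and the (window - 1) years before it.
    """
    years = range(cs_year - window + 1, cs_year + 1)
    eligible = [d for d in documents if d["year"] in years and d["type"] in CS_TYPES]
    if not eligible:
        raise ValueError("no eligible documents in the window")
    citations = sum(d["citations_by_year"].get(y, 0) for d in eligible for y in years)
    return citations / len(eligible)


# Hypothetical source
documents = [
    {"year": 2017, "type": "article", "citations_by_year": {2017: 1, 2018: 3, 2021: 5}},
    {"year": 2019, "type": "review", "citations_by_year": {2019: 2, 2020: 6}},
    {"year": 2020, "type": "letter", "citations_by_year": {2020: 4}},   # type not counted
    {"year": 2015, "type": "article", "citations_by_year": {2018: 10}}, # outside window
]
cs_2020 = citescore(documents, cs_year=2020)  # (4 + 8) citations / 2 documents
```

Because the same type filter is applied to both the cited documents and the document count, the letter contributes neither citations nor an item, in contrast to the JIF numerator.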

Normalized Indicators
Different publishing and citing cultures between disciplines and even subject fields make traditional journal impact indicators not appropriate for direct comparisons of journals and analysis results that were obtained from different disciplinary context. Journal impact indicators that are normalized by subject field have been developed to address these limitations [178,233]. The main difference between normalized indicators lies in their normalization approach. The main two approaches used for subject normalization are: field (also referred as target or cited-side) normalization, and source (or citing-side) normalization [185,233].
Field normalization (or cited-side) is basically achieved by comparing the actual numbers of citations with the expected citation counts within a particular research field [233]. The main drawback of this approach is the dependence on the subject field classification schemes of WoS or other DBs. Generally, the WoS subject classification scheme is most commonly used for normalization purposes [26]. However, as was discussed earlier, both WoS and Scopus classifications are not perfect and they may introduce certain biases in normalizing impact indicators [221].
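A minimal sketch of the cited-side idea is given below, under the simplifying assumption that the expected citation count of a paper is the mean citation count of all papers assigned to the same field. The field labels and counts are invented, and real implementations typically also control for publication year and document type.

```python
from collections import defaultdict
from statistics import mean


def field_normalized_scores(papers):
    """papers: list of (paper_id, field, citations).
    Returns {paper_id: actual citations / mean citations of the paper's field}."""
    by_field = defaultdict(list)
    for _, field, cites in papers:
        by_field[field].append(cites)
    expected = {field: mean(counts) for field, counts in by_field.items()}
    return {
        pid: (cites / expected[field] if expected[field] else 0.0)
        for pid, field, cites in papers
    }


# Hypothetical papers from a low-density and a high-density field
papers = [
    ("m1", "Mathematics", 2),
    ("m2", "Mathematics", 4),
    ("b1", "Cell Biology", 20),
    ("b2", "Cell Biology", 40),
]
scores = field_normalized_scores(papers)
```

Here paper `m2` (4 citations) and paper `b2` (40 citations) receive the same normalized score, because each is cited one third more often than the average paper of its own field. The result depends entirely on how papers are assigned to fields, which is the drawback noted above.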
The underlying idea of citing-side or source normalization approach is that in the high density fields papers would likely have longer reference lists than in the low density fields [233]. Thus, the main difference of this approach from field (or cited-side) normalization is that the definition of subject field is not based on the WoS or any other predefined category classification schemes, but it is determined as the set of all papers or journals citing the target set of papers. In this way, the set of reference journals is unique for every journal and allows for avoiding using predefined category classification schemes [185,234]. On the other hand, it raises the question of validity for comparisons, as journals or other document sets are being evaluated against unequal benchmarks [221].
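One simple way to illustrate the citing-side idea is fractional citation counting, in which each citation is weighted by the inverse of the length of the citing paper's reference list. The sketch below shows this general scheme only; it is not the SNIP algorithm, and the function name and numbers are hypothetical.

```python
def fractional_citation_score(citing_reference_counts):
    """citing_reference_counts: for each citing paper, the number of references
    in its reference list. Each citation contributes 1 / (list length)."""
    return sum(1.0 / n for n in citing_reference_counts if n > 0)


# Paper A: cited 10 times, each time from a paper with 50 references
score_a = fractional_citation_score([50] * 10)
# Paper B: cited only twice, each time from a paper with 10 references
score_b = fractional_citation_score([10] * 2)
```

Both hypothetical papers obtain a score of approximately 0.2, although paper A has five times as many raw citations, because its citations come from a field with much longer reference lists. No predefined subject classification is needed for this weighting.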
Both of the normalization approaches have been empirically compared. Many authors have determined that the citing-side approach may outperform the traditional method of cited-side normalization. For instance, the citing-side subject normalization approach was determined as being the most objective and appropriate for field normalization, when compared to normalization that is based on predefined subject classification schemes, especially in the case of multidisciplinary journals [180,185,235]. Moreover, the source normalization approach was shown to be effective, even at the level of pairs of journals from the same field [234]. However, these conclusions have also been challenged [236]. On the other hand, performance evaluation of normalization approaches also depends on the study design [185,233]. Thus, the definitive conclusion on which of the two approaches performs better has not yet been reached.
The SNIP indicator, as implemented in Scopus, is normalized using the source (citing-side) normalization approach. This indicator measures the contextual impact of a journal, since the journal's subject field is defined as the collection of papers citing that journal. In this way, SNIP accounts for the differences in citation densities across fields, since the citations receive a higher value in the fields where citations are less common (cited references lists are shorter) than in the fields where citing is denser (the cited references lists are longer) [45,234]. Moreover, SNIP also takes into account the degree to which the DB covers the literature of the field, which avoids the systematic underestimation of journals from subject areas that are poorly covered by the DB [237,238].
The calculation of SNIP is carried out in two stages [234]. However, several problems causing certain anomalies were pointed out in the SNIP calculation, which led to a revision of the methodology and the implementation of several changes [235]. It should be noted that, in both SNIP versions, only articles, conference papers, and reviews are included in the calculation as both cited and citing documents. However, in the revised SNIP version, the citing universe is further narrowed by excluding the papers that do not meet certain criteria. This correction makes the indicator more robust but, on the other hand, less favorable for journals in certain fields, where less popular document types are often published. Both SNIP versions have been compared, but neither has been determined to be superior to the other. It was concluded that both SNIP versions are highly correlated and effectively correct for the field differences [235,237]. Yet, several problems in both the original and revised SNIP versions were also pointed out [239].

Journal Prestige Indicators
One of the limitations of the traditional journal impact indicators is that all of the citations are considered to be equal in weight. To account for the differences between citation values, a new family of prestige indicators (also referred to as influence measures) has emerged. These indicators account for the journal's scientific importance or prestige, as citations are weighted according to the status of the citing journal. The main idea is that a citation from a high-quality journal is worth more than a citation from an obscure journal. Therefore, these indicators make a distinction between journal popularity and prestige [45,238].
Prestige indicators are based on the relative frequency of a journal's occurrence in the citation network. The prestige of a journal is computed by a recursive algorithm that is based on the behavior of a typical researcher, who reads a random article, then selects a random article from its reference list, and so on. In this way, the researcher moves between journals in the citation network, and the frequency at which each journal is visited reflects its scientific importance in the network [240,241]. However, the more complex calculations required for the estimation of a journal's prestige make these indicators less transparent, harder to interpret, and difficult to replicate [175,221].
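The random-reader model described above can be approximated by a generic random walk over a journal citation matrix. The sketch below is a simplified, PageRank-style power iteration and not the actual ES or SJR algorithm; the damping value, the iteration count, and the three-journal matrix are assumptions made for illustration.

```python
def prestige_scores(citations, damping=0.85, iterations=100):
    """citations[i][j] = number of citations from journal i to journal j
    (diagonal assumed to be zero, i.e. no journal self-citations).
    Returns the long-run share of visits to each journal by a random reader."""
    n = len(citations)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # With probability (1 - damping) the reader restarts at a random journal.
        new = [(1.0 - damping) / n] * n
        for i, row in enumerate(citations):
            out = sum(row)
            for j in range(n):
                # Otherwise the reader follows a reference from journal i to j;
                # a journal without outgoing references sends the reader anywhere.
                p = row[j] / out if out else 1.0 / n
                new[j] += damping * scores[i] * p
        scores = new
    return scores


# Hypothetical network: journals 1 and 2 cite only journal 0,
# while journal 0 splits its references equally between them.
citations = [
    [0, 1, 1],
    [5, 0, 0],
    [5, 0, 0],
]
scores = prestige_scores(citations)
```

In this toy network, journal 0 is visited most often and therefore receives the highest score, while journals 1 and 2 are tied. The scores sum to 1, so they can be read as shares of the reader's time.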
The most well-known journal prestige indicator, ES, is implemented in WoS (JCR). ES was developed by Carl Bergstrom in 2007. Initially, it was intended to help librarians in making journal subscription decisions, as it measures the total scientific prestige of the journal provided by all of the articles published in that journal in a year [240]. However, ES has several major limitations, which led to the development of derived versions.
The first and the most obvious drawback of the ES is its numerical value. The ES values of all journals included in WoS CC sum to 100, and this total is divided among the journals according to their prestige. ES numerical values are, therefore, very small and inconvenient to work with. The normalized ES (nES) is simply a rescaling of ES, so that the value of an average journal is equal to 1.0. Additionally, unlike most of the second-generation indicators, ES (and its normalized counterpart nES) is affected by the total number of papers published by the journal, and ES values decrease even further when new journals are added to the WoS DB [45,221]. The size of the journal is accounted for in the AIS indicator, which measures the influence per article in the particular journal. AIS is derived from ES by multiplying it by 0.01 and then dividing by the journal's share of all papers published in the previous five years, providing an indicator with an easily comparable value (the mean AIS value is 1.0) [207]. Thus, AIS can be seen as a normalized five-year JIF value, and it can be more directly compared to JIF, as both reflect the average performance of a journal's articles [2,45,240]. It should be noted that, in all ES metrics, journal self-citations are explicitly excluded [2,26,221].
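The two rescalings can be illustrated with a short calculation. The sketch below assumes that the divisor in the AIS formula is expressed as the journal's share of all articles published in the five-year window (the reading under which an average journal obtains a value of 1.0); the function names and all numbers are hypothetical.

```python
def normalized_eigenfactor(es, n_journals):
    """Rescale ES so that the average journal scores 1.0.
    All ES values sum to 100, so the average ES is 100 / n_journals."""
    return es * n_journals / 100.0


def article_influence(es, journal_articles_5y, all_articles_5y):
    """AIS sketch: 0.01 * ES divided by the journal's share of all articles
    published in the previous five years (assumed form of the divisor)."""
    share = journal_articles_5y / all_articles_5y
    return 0.01 * es / share


# Hypothetical journal in a database of 10,000 journals and 5,000,000 articles
es = 0.02
nes = normalized_eigenfactor(es, n_journals=10_000)
ais = article_influence(es, journal_articles_5y=500, all_articles_5y=5_000_000)
```

In this made-up case, both nES and AIS come out at approximately 2.0: the journal holds twice the average share of total prestige and, per article, twice the influence expected from its share of published articles.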
The Scopus journal prestige indicator, SJR, was developed by the SCImago Research Group in Spain. The choice of Scopus as a data source for SJR development was based on the better coverage and representation of world science in comparison to WoS [241]. SJR works in a similar way to ES, but the calculated and normalized SJR values are adjusted to easy-to-use indicator values, where 1.0 indicates the value of an average journal. Therefore, SJR resembles AIS in both numerical value and meaning [221]. However, unlike in the calculation of AIS, the citation time window for SJR was set to three years, as it was shown to be the shortest time window sufficient for citation peaks to be reached in all subject fields in Scopus [241]. Journal self-citations in SJR are limited to 33%, aiming to restrict possible manipulative citation practices. Thus, unlike in the case of ES and AIS, the value of journal self-citations is not completely disregarded in SJR [45,221,238,241].
SJR was also revised once and presented as a SJR2 indicator, being currently implemented in Scopus. SJR2 (now known simply as SJR) explains not only the prestige of the citing journal, but also its thematic affinity to the cited journal. Thus, besides the scientific importance of the citing journal, SJR also takes the differences between disciplines into account, since greater value is given to the citations from the same field than to the citations from unrelated fields. Both versions of SJR were empirically compared, and it was concluded that the revised version more efficiently addresses cross-disciplinary differences in impact [242].

H-Index
H-index is a hybrid metric that is provided in the majority of bibliographic DBs and other data sites. The h-index was introduced by Hirsch in 2005 to quantify an individual's scientific research output [243]. Its numerical value denotes the number of top papers (h) in the collection of evaluated papers, each being cited at least h times. The indicator is robust, objective, simple, and easily calculated. However, the biggest advantage of h-index is that it combines productivity and impact in a single measure. Moreover, it can be applied at different levels: individual researchers, journals, institutions, or other paper collections. Therefore, although relatively new, the h-index rapidly became one of the most prevalent metrics in research evaluation practices [2,221,244].
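The definition translates directly into a few lines of code. The following minimal sketch uses a hypothetical function name and invented citation counts.

```python
def h_index(citation_counts):
    """Largest h such that h papers in the collection have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank   # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical publication lists
h_a = h_index([10, 8, 5, 4, 3])   # four papers with at least 4 citations each
h_b = h_index([100, 90, 80, 4])   # same h, far more total citations
```

Both lists yield an h-index of 4, although the second collection has received many more citations in total, which shows that the value does not track the actual citation counts of the top papers.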
However, as all other impact indicators, along with the aforementioned advantages, the h-index also has its own limitations. Firstly, its value can only increase. Additionally, as h-index is based only on highly cited publications, it is insensitive to the actual number of citations. Therefore, two journals or researchers with the same h-index can have a significantly different number of total citations. Moreover, h-index strongly depends on the total number of publications and their lifetime, which is directly related to the number of citations, which makes it disadvantageous for new journals and young researchers. Differences in co-authorship are also not considered. Yet, probably the most important disadvantage of h-index is that it is not normalized across the subject field and, thus, cannot be used for comparisons between different disciplines [2,46,244,245]. Additionally, the h-index does not account for self-citations with an argument that, while self-citations may increase the h-index value, their effect on the index value is much smaller than on the total citation count [26,243]. However, a theoretical study has shown that the h-index, as well as its variants, are susceptible to possible manipulations by self-citations [246]. Scopus provides an opportunity to select and view the h-index value without self-citations by using "Analyze author output" tool in the author's profile page. Meanwhile, this option is not available in WoS. Nevertheless, h-index can be a valuable tool for evaluations and comparisons, but only when used with an awareness of its limitations.
Aiming to overcome these limitations, approximately 50 variants of h-index were proposed and compared [246][247][248][249], but they are not implemented in the bibliographic DBs. Comparisons of the h-index with other impact indicators were also performed, usually concluding high or moderate correlation of h-index with other indicators [246], but mostly with ones measuring productivity rather than impact [221]. An overview of h-index related studies is provided by [250].
However, one limitation of the h-index, namely its high dependence on the data source, cannot be corrected by alternative versions, since the h-index calculation is only based on publications and citations covered by the particular DB. Thus, h-index values obtained from different data sources are almost always different, making it unclear which h-index may be the most reliable [244,251]. Apart from that, it should be noted that WoS alone presents two versions of h-index for authors. The one calculated from search results using the "Citation Reports" tool depends on the content that is available under the subscription terms, which determine not only the subscribed set of WoS CC indexes, but also the time frame of accessible content. Meanwhile, the h-index value provided in the WoS author's profile is calculated from all WoS CC content. Moreover, the "View Full Citation Report" option in the author's profile may also provide slightly different citation counts. Therefore, one can easily get confused as to which h-index from the WoS DB is valid [252].
Nevertheless, the h-index values obtained from WoS and Scopus were usually determined to be highly correlated, with the h-index values based on Scopus being higher than the values calculated from WoS data due to the wider coverage of the Scopus DB [13,85,251]. Some exceptions were also observed, but only in cases of academics with long scientific careers, which were mainly attributed to the lack of Scopus coverage of citations prior to 1996. However, the situation may have improved after the completion of the Scopus Cited Reference Expansion Program, which has extended the time frame of cited references in Scopus back to 1970 [89].

Recommendations for the Correct Choice of Journal Impact Indicators
Originally, journal impact indicators were created to help answer more basic questions among the academic society: for authors to decide, where to publish their research, for students and researchers to choose the most relevant literature and most valuable collaboration possibilities, and for librarians to choose which journals to subscribe [2,222]. However, nowadays, they are more commonly used as the fundamental tools in research evaluation practices. Because the correct choice and appropriate application of indicators is equally important for research evaluations, bibliometric analyses, and other tasks, the issue is clearly recognized in the scientometric literature, with recommendations and guidelines being basically provided in all related works.
Because of the huge variety of impact indicators available today, it is often difficult to choose the most suitable metric for the task, especially as many of the indicators may appear similar in their meaning and intended application purpose [221]. Unfortunately, the majority of stakeholders employing these metrics for various purposes are not the experts and, therefore, lack a clear understanding of metrics and how they should be appropriately used and, accordingly, often are unable to make the most adequate choice [2,4].
Journal level indicators can be differentiated according to their accountability for the size of the journal to size-dependent metrics, such as total citations, h-index, and ES; and, size-independent ones, including JIF, CS, SNIP, SJR, and AIS. Size-independent metrics represent the impact of a typical article in the journal and they are not affected by the differences in numbers of articles published in the journal. These indicators may be more suitable when choosing a journal for publishing. In the meantime, size-dependent metrics illustrate the overall impact of a journal and they may change in response to different numbers of published articles [26,253]. However, it should be kept in mind that they are designed only to assess the journal as a whole and cannot be used to assess individual articles in it. Accordingly, these indicators may be more useful for practices where a quality of a whole journal is important, such as journal subscription decisions [45,254].
Aiming to evaluate both the performance and suitability of the most prevalent bibliometric impact indicators, they have been continuously compared [253]. Generally, the most popular citation-based journal impact indicators were well correlated. For instance, JIF and CS showed a strong significant positive correlation [255]. A significant correlation between CS, h-index, and SJR was also determined [128]. JIF, SJR, SNIP, and ES were shown to be strongly correlating with the highest correlation rates being between SJR and JIF, and SJR and ES [51]. In the other study, the strongest correlations were observed between IF, five-years IF, and IPP, as well as between SJR and AIS, and between SNIP and IPP [221]. Similar results were also obtained in the other study [58]. However, while the ranking lists of journals based on different indicators tended to be similar, it was shown that the ranking positions of individual journals varied significantly, depending on the indicator used [48,221,242,256]. This illustrates why specific features and limitations of individual indicators are essential for the choice, since one indicator may be more suitable for one purpose, but less for another [2,4,5,238].
Basic journal impact indicators (JIF and CS) are size-independent and, thus, should not be affected by the journal size. However, both of the metrics also share several limitations. One of them is that citations are not weighted; that is, the indicators do not take the quality and prestige of the citing sources into account [257]. However, the most important limitation is that neither of these indicators is normalized by discipline and, hence, they cannot be used to directly compare journals from different subject fields [178]. Thus, normalized (SNIP) and/or prestige indicators (SJR, ES, and AIS) may be more suitable, as they were designed to address these limitations [2,45]. As a part of SNIP is calculated essentially in the same way as a three-year JIF [221,235,236], SNIP can be interpreted as a proxy of JIF that is corrected for the differences between subject fields [237]. In fact, journal rankings based on SNIP were shown to have the greatest cross-discipline stability when compared to the rankings based on SJR, IPP, h-index, and AIS indicators [58]. Therefore, SNIP is particularly useful in multidisciplinary research fields, and it is often suggested as the best alternative to JIF for cross-discipline comparisons [2,51,58]. SJR can also be applied for this purpose, since, in addition to the prestige of the citing journal, it also accounts for the thematic affinity of the journals [242]. Meanwhile, the five-year window applied in the calculation of all Eigenfactor metrics (ES and AIS) makes them particularly suitable for application in Social Sciences and Humanities, where citations accrue more slowly [45]. Nevertheless, JIF and CS may both still be valuable tools in determining a journal's impact within a particular subject field [2,256]. Table 2 summarizes core features of the most prevalent journal impact indicators provided in WoS and Scopus.
Although the JIF and other WoS journal metrics are the most widely used for a variety of purposes, their application is limited, not only due to the limitations of the metrics themselves, but also by their availability [45]. It should always be kept in mind that all of the indicators provided by WoS are only available through a subscription to the JCR tool. The only exception is journal Quartiles (Q), which can be viewed in the WoS web interface without an additional JCR subscription [196]. However, indicators are only calculated for journals that are included in JCR, covering only about two-thirds of journals indexed in WoS CC (SCIE and SSCI indexes). Thus, since journals belonging to A&HCI and ESCI are not included in JCR, they do not have JIF or other journal impact metrics, which makes these metrics inapplicable for Arts and Humanities evaluations [45,224]. Moreover, even after a journal is transferred from ESCI to the JCR indexes, the impact indicators for the journal are calculated only if the journal manages to stay in the JCR index for three consecutive years [175]. Thus, when evaluating the quality of a journal indexed in WoS CC, it is important to take into account the index in which the journal is included and when it was included, especially when subject-specific or regional journals are being evaluated. Additionally, even journals that were included in the most recent JCR lists may not have a JIF (only an Immediacy Index), but it is unclear how many such journals are in JCR, since the journal metrics are not provided in JCR journal lists. Therefore, WoS journal indicators cannot be applied for the evaluation of journals newly included in the WoS CC. Moreover, as JCR does not include books and conference material (with rare exceptions), these sources also cannot be evaluated by WoS journal metrics. Only conferences that were published in journals or book series may be indexed in JCR and receive JIF values [86].
Scopus does not separate newly emerged, regional, or newly accepted sources. CS is calculated for sources after the first full year of indexing in Scopus, in contrast to JIF, which is only calculated after three consecutive years of journal indexing in JCR. More importantly, CS, as well as other Scopus journal metrics, is calculated for all actively indexed serial sources within all disciplines, including book series, conference proceedings, and trade publications, if at least one document was published in the preceding three (currently, four) years [175,258]. However, Scopus only calculates CS for a minor part of indexed conference material sources. For instance, separate conferences and ones published as special journal issues or as part of book series do not receive CS metrics [86]. Apart from that, it was observed that many of the sources that are listed on the Scopus Source page do not have CS values and are marked "N/A", without a clear explanation of the underlying reasons for the absence of the metric's value. An empirical investigation has shown that many of the sources without a CS value were discontinued or newly included in the DB, although several cases that are hard to explain were also found [259]. On the other hand, a subsequent study that was performed with the same set of journals did not find any journals without an indicated CS value [255]. Yet, unlike the journal indicators provided by WoS, all of the Scopus metrics and the full information on indexed sources presented in their profiles are available free of charge, even without a DB subscription [51].
Nevertheless, no indicator is perfect, because they all have characteristic limitations. Thus, it is recommended to use more than one metric for the most reliable results [222].
In any case, it should be kept in mind that the use of publication and citation counts and, accordingly, the indicators that are derived from them, is only valid when they are applied at similar levels of aggregation and in the appropriate context [260]. Therefore, regardless of the metrics used, comparisons of researchers are only meaningful when comparing authors in a similar field and of a similar age or career stage. The same applies in evaluations of larger entities, such as research or academic institutions and countries, since they can only be compared to those that are similar in size [2,26]. Even when journals are being compared by the values of impact indicators, the differences in the size of the journals should also be taken into account [261]. However, generally, more reliable results are obtained at higher aggregation levels (e.g., institutions or countries) [178].
To summarize, in order to choose the most suitable metrics for any task, one should have clearly defined main objectives and a clearly defined application context for the task, and be fully aware of the indicators' designated purpose, main limitations, and appropriateness of their application [2,5,26,45,262]. In addition, the indicators themselves have to be accurate, robust, transparent, and unbiased, especially when non-bibliometricians intend to use them [221]. All of the journal impact indicators are valuable if appropriately applied [254].

Fundamental Concerns in Bibliometric Practices
The rise of technological advances and the availability of data, along with the growing number and importance of publications and the undeniable relationship between research, innovations, and economic development, have made the use of quantitative impact-based research assessment methods a daily routine when making various decisions within academic, economic, and political communities. Moreover, research evaluations are applied in almost all domains, regardless of their size: from assessing the achievements of scientific research in countries or academic institutions at the national or even global level, to assessing the career, competencies, and personal achievements of individual researchers. Academic evaluation is often conceived as a universal instrument without consideration of the context in which it is applied, which blurs the difference between quality and scientific impact and, most importantly, overlooks the real meaning and value it should provide to the scientific community [1,2,3,5,6,7,44,260,263].
The core element in all assessment practices is a quality of research, which is inevitably linked with the generated scientific output that is equalized to publications. With the growing amount of publications, it has become evident that publication counts alone cannot provide an accurate evaluation of research, institution, or researcher. The simple amount of publications does not reflect the scientific impact, since publication counts measure productivity, rather than quality [1,2]. Moreover, publications themselves are not equal in quality. Therefore, publication counts are only meaningful when used together with their quality measures.
However, research quality is a multidimensional concept and, thus, cannot be only assessed by quantitative measures. Despite this, the scientific impact of publications, which is determined by citations, is now generally viewed as an indicator of the quality of research, since citations are considered to be proof that the knowledge encoded in the publication was used and, therefore, made an impact. Citations may be appropriate in assessing the scientific impact of the research, but they do not show the impact of the research outside the scientific community [1,3,5,6,260,264,265].
On the other hand, citations themselves are not equal, primarily because citations may have very different meaning, which is determined by the plethora of reasons for citing [266][267][268]. For instance, only part of citations, as listed in publication's reference list, are dedicated to the main idea of the paper, while other sources are frequently only being mentioned as a simple recognition of other similar works, or as a persuasion on the scientific ground of the study [1,269]. Moreover, not all of the cited publications could be read at all by the authors [2]. On the other hand, not all articles read are cited, although they may have had a significant impact on the study [46]. Although rarely [3], citations can also be negative, raising the question about their impact to the quality of the cited document [45].
There are numerous factors affecting citation counts [270,271], which can be divided into (1) factors that are related to the paper, (2) factors related to the journal, and (3) factors related to the author [266,272]. The extent to which these factors can influence citations varies. Also, in many cases, the effect may be indirect, since most of these factors are interrelated. For instance, journal impact indicators influence citing behavior, firstly because impact factors increase the visibility of the paper based on an assumption that journals with higher impact indicators publish papers of a higher scientific value. In turn, papers that are published in high-quality journals are cited more frequently. Thus, citation-based impact indicators tend to create bias amongst authors not only when choosing a journal for publishing, but also when choosing which literature to cite [268,273,274,275].
It is often argued that the scientific impact of any evaluated entity cannot be determined solely by citations, due to the different citing behaviors and the resulting incomparable citation counts between disciplines [1][2][3][5,260,267]. Within the scientometric community, opinions on how far the aforementioned factors undermine the validity of citations as a measure of scientific impact fall into two theories. Normative theory is based on the general assumption that citations are credible proof of impact and that, while the aforementioned factors may influence the total number of citations, their effect is negligible. Meanwhile, from the constructivist point of view, all of the factors determining citing behavior should be taken into account when evaluating scientific impact by citations and, therefore, these factors weaken the validity of citations [266,276]. However, understanding the main reasons and driving forces of citing behavior, and the resulting citation distribution patterns, may help to better understand the value of citations when interpreting the results of citation analyses and scientific quality assessments.
Yet, apart from differences in citation culture between disciplines, the other obvious problem in citation-based evaluations and analyses is caused mainly by skewed citation distributions [178,274]. This phenomenon is well illustrated by the 80-20 rule, according to which the top 20% of papers usually receive 80% of total citations, while the remaining publications are cited very scarcely, if at all [2,272]. This should be kept in mind when interpreting citation-based impact metrics, as all of the most popular journal impact indicators are calculated as ratios of citation and publication counts, that is, they are based on arithmetic means. This principle of calculation is one of the main drawbacks of citation impact indicators and limits their reliability, since individual papers published in the same journal can differ greatly in their impact, as determined by the number of citations they received. Accordingly, the values of mean-based indicators can be highly distorted by the presence of a few highly cited publications [44,45,221,224][277][278][279]. On the other hand, it is a mistake to assume that articles with zero citations have no scientific value, because they first had to go through a peer-review process to be published [46].
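The distorting effect of a skewed distribution on a mean-based indicator can be illustrated with a minimal Python sketch. The citation counts below are invented purely for illustration and do not describe any real journal, and the helper name `top_share` is hypothetical; the sketch only contrasts the arithmetic mean with the median and computes the share of citations held by the most-cited 20% of papers.

```python
from statistics import mean, median

# Hypothetical citation counts for 20 papers in one journal (made-up data).
citations = [250, 90, 40, 20, 8, 6, 5, 4, 3, 3, 2, 2, 1, 1, 1, 0, 0, 0, 0, 0]


def top_share(counts, fraction=0.2):
    """Return the share of all citations received by the most-cited `fraction` of papers."""
    ranked = sorted(counts, reverse=True)
    k = max(1, round(len(ranked) * fraction))  # number of papers in the top group
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


mean_citations = mean(citations)      # what a mean-based indicator would report
median_citations = median(citations)  # the "typical" paper in the same set
share = top_share(citations)          # concentration of citations in the top 20%

print(f"mean={mean_citations:.1f}, median={median_citations:.1f}, top-20% share={share:.0%}")
```

With these invented counts, the mean is 21.8 citations per paper while the median is only 2.5, and the four most-cited papers hold roughly 92% of all citations, so the mean says little about a typical paper in the set.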
While there is still no common agreement on whether citations and derived impact indicators can be used as a sole measure of scientific impact, it is generally strongly recommended to use them along with additional methods, such as traditional peer-review [5,265,280,281]. Yet, peer-review has its own characteristic biases and limitations, which is one of the reasons why it is nowadays increasingly being replaced with bibliometric measures of impact [3,44,276]. The main reasons why metrics are favored over peer-review are: (1) they are cheaper and faster to apply than peer-review; (2) they are regarded as more trustworthy and objective, because it is believed that the use of indicators may help to avoid biases that are introduced by personal interest and authority relations; (3) they are perceived as more accessible and do not require the involvement of scientists or subject specialists [6].
Although bibliometric indicators were generally shown to be positively correlated with peer-review evaluations, it was also observed that the correlations between metrics and expert judgements vary greatly across disciplines, application contexts, and the metrics used [44,276,280,282]. On the other hand, some studies reported poor agreement between metrics and peer-review [283]. However, these conclusions cannot be extrapolated as a general tendency, because most of these comparisons had certain limitations and differed in their design [44,276]. Thus, the question of agreement between metrics and peer-review remains open for discussion.
However, the use of bibliometric indicators has now drifted far from their intended purposes. Today, they are applied as universal quality measures, based on the underlying assumption that a journal's quality rank reflects the quality of the researcher's publications and, accordingly, of the research overall. This type of narrow evaluation may lead to detrimental consequences, since all research evaluation practices inevitably affect the behavior of the evaluated parties. Accordingly, this approach creates a bias towards publishing only in journals with high impact values and decreases the recognition of fields that mainly publish in nationally oriented journals or prefer less popular channels for knowledge dissemination, such as books, since impact indicators are usually not calculated for these sources. At the same time, researchers are strongly encouraged to publish their study results in a timely manner in the best possible journals. Thus, this "publish or perish" attitude, on the one hand, stimulates the productivity of researchers but, on the other, creates a bias towards quantity, often overlooking quality [2,3][6][7][8][44,81,163,260,263][284][285][286][287][288].
Consequently, the vastly increased and imprudent reliance on bibliometric indicators, together with their widespread use in evaluation practice, has led to misinterpretations of the concept and significance of scientific impact itself and of which measures can be used to assess it properly. Hence, because publication data and citation metrics continue to play the most important role in research assessments worldwide, indicator-based research assessments also remain at the center of the global debate, not only about their validity, but also about their influence on research activities, scientific output, and its quality [1][5][6][7][44,260,265,289].
In response to the widespread misuse of impact indicators, the San Francisco Declaration on Research Assessment (DORA) was developed in 2012 and became a worldwide initiative strongly recommending against the inappropriate use of the impact factor in research assessments [290]. Furthermore, in 2016, the American Society for Microbiology (ASM) declared that JIFs would no longer be posted on the ASM journal websites or used in advertising [226]. In 2015, Nature published "The Leiden Manifesto for research metrics" [291], which set out ten principles of best practice in metrics-based research assessment. Meanwhile, "The Metric Tide: Report of the Independent Review of the Role of Metrics in Research Assessment and Management" took a deeper look at the potential uses and limitations of research metrics and indicators [262]. Additionally, public recommendations were released, such as the European Commission frameworks [292,293], encouraging the application of more sustainable evaluation methods. There were also other proposals for improving research assessment practices [289]. It should be noted that these recommendations do not entirely dismiss the value of bibliometric indicators in evaluating research performance; rather, they highlight the importance of using them responsibly, in line with their original purpose and intended meaning.
Apart from that, the reliability of any bibliographic analysis or evaluation based on publication and citation data depends not only on the metrics used, but also on the choice of bibliographic data source. The reason is that citation data and indicators are calculated only from the data covered by the particular DB, and no data source covers all relevant documents. This matters most for particular disciplines and source types. Studies covering specific disciplines or particular subject fields may appear less impactful on the global scale, but they may have great importance in the national context. These studies are also often targeted at the general public and, therefore, are often published in the national language, which narrows the potential audience and, consequently, reduces the likelihood of citation [87,96,260,265,294]. As a result, sources publishing papers with a national focus attract far fewer citations, are generally considered to be of inferior quality, and are usually not included in the main selective bibliographic DBs. Thus, the suitability of WoS and Scopus as data sources for evaluating national research is highly questionable. The same can be said about the use of WoS and Scopus data in Social Sciences & Humanities, since these disciplines are clearly underrepresented in the DBs [5,44,47,88,91,163,263]. In fact, some countries explicitly do not use WoS for assessments of national research, due to its lack of comprehensive coverage [81]. Thus, for more reliable results in these contexts, it is generally recommended to use several data sources, including thematic DBs, or to rely on national and/or institutional resources instead of selective international DBs [47,163,284,295].
The vast majority of authors who have studied or reviewed bibliographic DBs, citation-based indicators, and their application also emphasize the importance of choosing the right data source and metrics, as well as their adequate use in evaluating research and in other practices. However, in the author's view, a noticeable impact in solving these problems can only be achieved when these issues are recognized and resolved at all levels of academic, economic, and political society, including not only governors, policy makers, and university administrators, but also the researchers themselves.

Discussion
In today's technologically savvy and data-driven society, most evaluation methods are based on quantitative measures. Research evaluations are no exception and, today, they are performed as bibliometric analyses of the scientific output (publications) produced by the evaluated research unit. For this, two components are critical: a bibliometric data source and tools for interpreting quality (metrics). Both are provided by the major bibliographic DBs.
For a long time, WoS was the only comprehensive bibliographic data source available. However, the situation changed with the introduction of Scopus, which rapidly became a major competitor to WoS. Accordingly, comparisons of these data sources have become one of the major discussion themes in scientometrics, as the DBs can differ in many aspects that determine their suitability as appropriate and reliable data sources for various purposes. This was already highlighted more than a decade ago: "[...] for those who perform bibliometric analyses and comparisons of researchers, countries or institutions, the existence of these two major databases raises the important question of the comparability and stability of statistics obtained from these data sources" [296].
The extent and comprehensiveness of the covered content is the most important characteristic of a DB for obtaining reliable results. It is evident that both DBs offer wide coverage of the highest quality journals, along with additional analysis tools for publications and citations. WoS and Scopus both offer extensive coverage of the Natural Sciences, Medicine, Health Sciences, Engineering, and Technology and, thus, could be used in research evaluations of these disciplines. On the other hand, the coverage of the literature of certain disciplines or subject fields in selective bibliographic DBs should be evaluated with additional caution, since it is highly dependent on several other aspects of coverage, such as the indexed source and document types and the coverage of non-English language publications. Although both WoS and Scopus have made noticeable efforts to expand their coverage, especially during the last decade, even the most recent studies did not indicate any significant improvement in the coverage of books and conference proceedings, concluding that the coverage of these document types is still insufficient for reliable analyses or evaluations in disciplines where these source types are the most prevalent. The same can be said regarding the coverage of non-English publications and sources of regional importance. Therefore, the biases towards the overrepresentation of English language sources, the unequal representation of countries, and the underrepresentation of SSH literature still remain the main limitations of these data sources. Nevertheless, multiple studies have shown that Scopus offers wider coverage, both of publications and of citations, in all major disciplines and document types, as well as better representation of non-English and regional literature. Thus, Scopus might be a better choice for performing tasks within the context of Arts & Humanities and for tasks focused on more nationally oriented and novel research [44,87], especially when it comes to assessing the quality of sources in these contexts, since WoS does not provide impact metrics for these sources.
Meanwhile, coverage depth, particularly regarding citations, is generally better in WoS. However, in certain cases, the time frame of accessible citation data in WoS may be even shorter than in Scopus, due to content access limitations resulting from the time frame restrictions set in the subscription terms. The same restrictions also apply to publication data. Moreover, the indexes that are accessible through a WoS CC subscription may also vary. Thus, although the ability to tailor a WoS subscription gives institutions the opportunity to pay only for the most relevant content, these variations in WoS content availability make it very difficult to ensure the reproducibility of any analyses performed using WoS data [15]. Variable access to WoS content may also make an evaluation of the suitability of WoS for a particular task misleading when it is based on information provided by the DB owner. Therefore, official descriptive WoS information, as well as any results obtained from WoS, should be evaluated with caution. Apart from that, the ability to use the DBs as data sources for large-scale bibliographic analyses may also be hindered by certain data export and accessibility limitations [45]. In this respect, Scopus seems to provide better and easier access to the data.
Another advantage of Scopus is that it provides searchable and comprehensive profiles, along with unique identifiers, for all authors, institutions, and periodical sources. Regarding source profiles, WoS provides more detailed information, but the profiles are only available for journals, while, in Scopus, they are created for all indexed periodical sources. WoS also employs disambiguation systems for authors and institutions, but individual identifiers are not assigned. However, the incomplete coverage and insufficient precision and recall of the WoS and Scopus institutional disambiguation systems imply that neither of these systems can currently be fully relied on in evaluations of an institution's research performance [72]. Meanwhile, although the author disambiguation systems in both DBs were shown to be more accurate [121], split identities and other discrepancies still occur. Thus, the author information provided by the DBs should always be checked before it is applied in bibliometric analyses and research evaluation practices. On the other hand, disambiguation systems are being improved over time, which suggests that, in the near future, they might become sufficiently accurate for both bibliometric and other purposes. Even now, disambiguation systems are already an excellent tool for aiding both personal and institutional performance evaluation.
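To make the terms precision and recall concrete in the disambiguation context, the following Python sketch computes both for a single institution profile. It is only an illustration under stated assumptions: the publication identifiers are invented, the function name `precision_recall` is hypothetical, and the sketch does not reproduce the evaluation procedure of any cited study.

```python
def precision_recall(assigned, truth):
    """Compare the publications a DB assigns to a profile with those that truly belong to it.

    precision: fraction of assigned publications that are correct
    recall:    fraction of the true publications that were found
    """
    true_positives = len(assigned & truth)
    precision = true_positives / len(assigned) if assigned else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: publication IDs linked to one institution profile by the DB,
# versus a manually verified list for the same institution.
assigned_by_db = {"p1", "p2", "p3", "p4", "p5"}
verified = {"p1", "p2", "p3", "p6", "p7", "p8"}

precision, recall = precision_recall(assigned_by_db, verified)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this invented case, three of the five assigned publications are correct (precision 0.6) and three of the six verified publications were found (recall 0.5), so an evaluation based on the profile alone would both include foreign output and miss half of the institution's own.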
From the standpoint of the practical use of the DBs' web-interfaces, in the author's personal opinion, Scopus is also more convenient. Firstly, although both DBs use powerful and comprehensive search engines with additional refinement capabilities, Scopus is not divided into separate indexes and, thus, all searches are performed across all indexed content, without the differences in available search parameters between separate indexes that occur in WoS CC [42]. Secondly, most of the information is linked, which allows easy transition between different information types. Besides, most of the lists and other data can be opened in separate or pop-up windows and tabs. The author's opinion may be supported by the results of an empirical study showing that both novice and experienced users of the DBs made fewer mistakes while executing search tasks in Scopus and were more satisfied with its overall performance, compared to WoS [24].
Despite being globally acknowledged as the most comprehensive data sources, neither WoS nor Scopus is immune to errors in publication metadata. Generally, the distributions of errors in WoS and Scopus are very different, with some error types being more frequent in one DB and others in the other [62,66,79,95]. This might be at least partially explained by the different data uploading and curation mechanisms used in the DBs, as many errors are made by the authors and/or publishers and are uploaded into the DBs with the publications' metadata, while other mistakes are introduced by the algorithms employed in the DBs [62]. On the other hand, the accuracy of bibliometric DBs has improved significantly, since the DBs index new content more carefully in order to avoid errors and actively correct mistakes that are already present [66,67]. For instance, according to a Scopus representative, over the past few years Elsevier has made significant efforts to address all errors and inconsistencies occurring in Scopus [12]. However, judging by the most recent studies, errors of all types are still present in both DBs. Thus, there is still plenty of room for improvement. On the other hand, occasional mistakes in the DBs might be expected and, to a certain extent, justified, since the accurate extraction of metadata depends not only on the DBs' capabilities: mistakes may also be caused by various pre-existing errors and external factors. Moreover, because the error rates are not very high, they should not significantly affect the results of analyses, provided that they are properly taken into account.
The content coverage and quality of the DBs are constantly changing and improving, as is the convenience of their web-interfaces. In addition, the DBs' vendors are promoting user-oriented changes through cooperation with their customers. Elsevier encourages authors to supervise and maintain the integrity of their Scopus author profiles by providing free access to the profiles and the means to correct them. Additionally, Elsevier encourages Scopus users to report any observed discrepancies [171]. Meanwhile, Clarivate actively engages in personal communication with WoS customers and addresses the concerns expressed in their feedback.
Nowadays, increasing attention is focused on making science more efficient and fair by enabling free access to published research. Open Science initiatives (e.g., Plan S) aim to make science more transparent and reproducible. Thus, authors are encouraged (and often required) to make openly available not only their published research results, but also their research data [132]. Published data sets can be cited separately and, accordingly, included in research evaluation practices as additional publications [297]. Thus, as both WoS and Scopus index data documents, the comprehensiveness and quality of their coverage in the DBs will also gain great importance in the upcoming years.
However, because this work was written mainly with the aim of supporting a more informed choice of which data source to subscribe to, or which to use when both data sources are available by subscription, the coverage of freely available Open Access (OA) content was not elaborated on, as such content may be accessed from various other open platforms, search engines, or directly from publishers' websites [42]. On the other hand, although studies of the prevalence and impact of Open Access publishing are currently becoming one of the most popular topics in the scientometric literature [163,298,299], comparisons of OA content coverage in WoS and Scopus have not received significant attention. Yet, in the author's opinion, this feature is important enough to receive a more detailed evaluation. Moreover, the overall coverage of the DBs' sources can also change significantly due to the global shift towards openness in research and publishing practice. Thus, content coverage and quality comparisons will also remain highly relevant.
Another important goal of this work was to make the users of the DBs more familiar with the convenience of the DBs' web-interfaces. However, as mentioned before, web-interfaces are constantly being changed. Thus, the features described here may change soon, and the interfaces may no longer fully coincide with the ones described in this work. This applies in particular to WoS, since, during the preparation of this article, Clarivate announced major changes that are planned to be implemented in the WoS interface in the near future [300]. Most of the changes will be made to the layout of the web page, which should make the interface more modern and user-friendly. Additionally, there will be some changes in naming (e.g., Organization-Enhanced search will be renamed as "Affiliation" search) and refinement (e.g., the ability to sort the analysis results by publishers will be included). Some of the new features will be more similar to the ones currently implemented in Scopus (e.g., refinement of filters). The WoS vendor also promises improved search speed and access through mobile devices, as well as unified user profiles. These changes have just begun to be actively implemented, and the final transition to the new interface is planned for 2021. The new WoS interface, which is available for testing to evaluate its usability in practice, does not yet include all of the features. On the other hand, the array of basic functions should remain the same. Still, a detailed study of the upcoming changes, with a comparison to the current version, would be helpful in the future. The same can be said for Scopus, because this DB also constantly improves its web-interface. For instance, during the last couple of years, the layout of profile pages has been updated several times. However, it is important to note that all the evaluations of the DBs' features, functionalities, and performance discussed in this work are based on the current versions of the DBs.
In addition, both DBs owning companies also offer additional products and tools, in order to assist in online data analyses, such as InCites and Essential Science Indicators powered by WoS data, and SciVal and Pure powered by Scopus data. However, these products are only available with an additional subscription, so few WoS and Scopus subscribers have access to them. Consequently, relatively few studies describe the performance and usefulness of these tools (e.g., [173,301]). On the other hand, the lack of detailed explorations of these tools may be one of the reasons leading to the relatively low usage while they actually might be very helpful. Thus, a more detailed look at these tools may also be a relevant topic for future studies.

Conclusions
Although, during the last decade, there was a significant growth of available bibliographic data sources and metrics, the Web of Science (WoS) and Scopus databases (DBs) still remain the two major and most comprehensive sources of publication metadata and impact indicators. Therefore, they serve as the major tools for a variety of tasks: from journal and literature selection or personal career tracking to large-scale bibliometric analyses and research evaluation practices at all possible levels. However, because both DBs are subscription-based and expensive data sources, institutions often have to choose between them.
Despite the fact that the WoS and Scopus DBs have been extensively compared for more than 15 years, the scientometric community still has not reached a verdict on "which one is better". On the other hand, both DBs are constantly being improved due to the intense competition and the notable transfer of academic activities into a digital, internet-based environment. Consequently, nowadays they encompass so many features and functionalities that it is impossible to draw such a general conclusion, since one DB may be a better choice for one purpose, but less so for another. Thus, if an institution has access to both DBs, each member of the institution should be able to make a personal and well-informed decision regarding which one is more suitable for a particular task.
Despite the serious biases and limitations that both WoS and Scopus share, in the author's opinion, Scopus is better suited both for evaluating research results and for performing daily tasks, for several reasons. First, Scopus provides wider and more inclusive content coverage. Secondly, the availability of individual profiles for all authors, institutions, and serial sources, as well as the interlinked interface of the DB, makes Scopus more convenient for practical use. Thirdly, the implemented impact indicators perform as well as or even better than the metrics provided by WoS, are less susceptible to manipulation, and are available for all serial sources in all disciplines. Most importantly, however, Scopus is subscribed to as a single DB, without confusion or additional restrictions regarding content accessibility. Moreover, Scopus is more open to society, as it provides free access to author and source information, including metrics. On the other hand, WoS also has its own advantages. For instance, it may be more suitable for searching and analyzing Open Access resources at the publication level.
Generally, the suitability of a DB mainly depends on the objectives and application context of the particular task, including the required degree of selectivity and the level of aggregation. Nevertheless, academic institutions will be forced to subscribe to the WoS and Scopus DBs, or at least to one of them, as long as the metrics they provide remain the core elements in research evaluation and career assessment practices. Accordingly, an institution's choice of DB subscription is primarily determined by the metrics applied in national and institutional research evaluation policies. On the other hand, because publishing and evaluation trends, as well as the DBs themselves, are not constant, new insights into the DBs' suitability for particular assessments may, in turn, suggest some changes to these policies. Either way, changes in evaluation policies are necessary, since the widespread requirement to publish research results only in journals indexed in WoS and Scopus, and the fact that researchers' careers and salaries often depend on the number of such publications, inevitably affect researchers' behavior by redirecting their focus from quality towards quantity, which poses a threat to the overall quality of science.