Bibliometric Analysis: The Main Steps

Definition: Bibliometric analysis is a systematic study of scientific literature that identifies patterns, trends, and impact within a given field. Its major steps are collecting data from relevant databases, cleaning and refining the data, and applying bibliometric methods to the data in order to generate meaningful information. It is an increasingly popular and rigorous technique for examining and assessing large amounts of scientific data. This entry introduces bibliometric methodology and its main techniques, and provides step-by-step instructions for performing a bibliometric analysis. It also discusses when bibliometric analysis is a suitable alternative to a systematic literature review. The entry aims to be a useful resource for learning about the methods and approaches used in academic research based on bibliometric analysis.


Introduction or History
Bibliometrics has become a trend in academic research in recent years [1][2][3][4][5]. However, many young colleagues still lack the skills to conduct a start-to-end bibliometric analysis. Nor is bibliometrics merely a passing trend. The term was introduced in the 1930s by the Belgian documentalist Otlet [6] and was reinvented and popularized by Pritchard in 1969 [7]. In that same year, Nalimov proposed the term scientometrics [8,9]. Although the two fields differed somewhat in those years, the terms bibliometrics and scientometrics are now used as synonyms [7,9,10,11,12,13].
The popularity of bibliometrics reflects its applicability in handling vast amounts of scientific data and its significant contribution to research impact. Several factors account for this popularity, including the development, accessibility, and availability of bibliometric tools such as R and VOSviewer and of scientific databases such as Google Scholar, Scopus, and Web of Science. The cross-disciplinary spread of bibliometric methodology, from data science to operational research, has also played a significant role in its widespread adoption [14].
Academics employ bibliometric analysis for many purposes, including uncovering emerging trends in article and journal performance, collaboration patterns, and research constituents, and exploring the intellectual structure of a given domain within the existing literature [15][16][17][18]. The data central to bibliometric analysis are often extensive (hundreds, if not thousands, of records) and objective (for example, numbers of citations and publications, and occurrences of keywords and topics). At the same time, their interpretation often relies on both objective evaluations (for example, performance analysis) and subjective ones (for instance, thematic analysis) derived via well-informed methods and processes. Through rigorous efforts to make sense of vast unstructured data, bibliometrics, or similarly scientometrics, helps map and understand cumulative scientific knowledge and the evolutionary subtleties of well-established domains. Thus, well-conducted bibliometric research may provide strong groundwork in all fields.
Although bibliometric analysis has many advantages, it is still a relatively new tool in research, and its full potential remains untapped. Many bibliometric studies provide only a fragmented knowledge of an area because they rely on a small collection of data and approaches [16,19,20]. Notably, academics looking for a thorough yet readable resource on the approach and its application face a major hurdle, as there is no authoritative reference for bibliometric analysis. Although there are reputable manuals for systematic literature reviews [21,22], they fall short in their coverage of the bibliometric analysis technique.
This entry aims to accomplish two things: first, to present a thorough review of bibliometric techniques, and second, to present the main steps and instructions for carrying out a bibliometric analysis. It introduces bibliometric analysis for academics across all fields, including its principles, methods, processes, supporting details, and explanations. This entry makes significant contributions, as seen in Table 1 below.

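| Step | Description | Tools | Expected Outcome |
|---|---|---|---|
| 1. Define Research Objectives | Define the research questions and the scope of the study. | None required | Well-defined research questions and objectives. |
| 2. Literature Search and Data Collection | Collect relevant literature from databases such as Web of Science, Scopus, and Google Scholar. | EndNote, Zotero, Mendeley | A comprehensive dataset of relevant publications. |
| 3. Data Cleaning and Preprocessing | Clean and preprocess the data to ensure accuracy (e.g., removing duplicates and correcting author names). | R, Python, Excel or LibreOffice | A refined and accurate dataset ready for analysis. |
| 4. Selection of Bibliometric Techniques | Choose appropriate bibliometric techniques based on research objectives (e.g., co-citation analysis, co-word analysis, bibliographic coupling). | VOSviewer, CiteSpace | The techniques to be used in the subsequent data analysis. |
| 5. Data Analysis | Conduct the analysis using chosen techniques. | R, Python, VOSviewer, CiteSpace | Insights and patterns in the literature. |
| 6. Visualization | Visualize the results to aid interpretation and presentation. | Bibliometrix, VOSviewer | Graphical representations of the analysis results. |
| 7. Interpretation and Reporting | Interpret the results and prepare a report detailing the findings and their implications. | MS Word, LaTeX | A comprehensive report with insights and recommendations. |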

How to Conduct a Bibliometric Analysis

The Main Steps for Bibliometric Analysis
To conduct a thorough bibliometric analysis, first define your research objectives clearly so that the analysis addresses specific issues or questions and stays focused and relevant [23][24][25]. Search several databases, such as Scopus, Web of Science, or Google Scholar, to conduct a complete literature search, and use reference management tools such as EndNote, Zotero, or Mendeley to organize your data and create a comprehensive dataset in your preferred format (e.g., RIS). Afterward, ensure the accuracy of your data by removing duplicates, standardizing author names, and completing necessary metadata using tools such as R or Python. Then, select bibliometric methodologies that align with your study objectives, such as co-citation analysis, co-word analysis, or bibliographic coupling, and use software such as VOSviewer or CiteSpace. VOSviewer is software for creating and viewing bibliometric maps. CiteSpace is a tool for researchers, librarians, and analysts that supports in-depth exploration of the structure and dynamics of scientific literature across various fields [26][27][28][29][30]. These tools analyze and visualize networks of co-citations, co-authorships, keywords, and other bibliometric data. Their visualization techniques allow you to explore large quantities of bibliometric data efficiently and intuitively and to identify patterns and trends within the literature. Present your findings visually, using applications such as Bibliometrix or VOSviewer, to make them easier to understand. Bibliometrix is a valuable tool for researchers who require advanced methods to analyze and visualize the structural aspects of scientific literature. It aids in exploring research trends, author productivity, and collaboration and citation networks [31]. Finally, interpret the results and produce a comprehensive report detailing your approach, findings, and implications, using word processing software such as MS Word or LaTeX to ensure a professional format.
Table 1 delineates a systematic approach to conducting a bibliometric analysis, a proposed methodology for quantitatively examining scientific literature. This process encompasses seven critical steps, each contributing to gaining insight and identifying trends within a specific research domain.
In Step 1, the author should define the research objectives. This foundational phase does not require any specific tools or software but necessitates a thorough understanding of the research questions and the scope of the study. The primary expected outcome is a set of well-defined research questions and objectives that will guide the subsequent stages of the analysis.
In Step 2, the author should conduct the literature search and download the dataset. This phase collects relevant literature from established databases such as Web of Science, Scopus, and Google Scholar or, when no database covers the material, gathers raw data and builds a custom database. Tools such as EndNote, Mendeley, and Zotero are instrumental in organizing and managing these references. The anticipated result is a comprehensive dataset comprising relevant publications that form the basis of the bibliometric study.
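As an illustration of what the exported dataset looks like to a script, the following minimal sketch reads RIS-formatted text into Python dictionaries. It assumes the common RIS layout (a two-letter tag, two spaces, a hyphen, then the value, with `ER` closing each record); the function name `parse_ris` and the two sample records are hypothetical, and real exports may contain continuation lines or database-specific tags that this sketch ignores.

```python
from collections import defaultdict


def parse_ris(text):
    """Parse RIS-formatted text into a list of records (tag -> list of values)."""
    records = []
    current = defaultdict(list)
    for line in text.splitlines():
        # A field line looks like "AU  - Smith, J."; anything else is skipped
        # in this simplified sketch (blank lines, continuation lines).
        if len(line) < 5 or line[2:5] != "  -":
            continue
        tag, value = line[:2], line[5:].strip()
        if tag == "ER":  # end-of-record marker
            if current:
                records.append(dict(current))
            current = defaultdict(list)
        else:
            current[tag].append(value)
    return records


# Two made-up records, for illustration only.
SAMPLE = """TY  - JOUR
AU  - Smith, J.
AU  - Lee, K.
TI  - Mapping a research field
PY  - 2020
ER  - 
TY  - JOUR
AU  - Garcia, M.
TI  - Co-word analysis in practice
PY  - 2021
ER  - 
"""

records = parse_ris(SAMPLE)
for rec in records:
    print(rec["PY"][0], "; ".join(rec["AU"]), "-", rec["TI"][0])
```

Keeping every tag as a list preserves repeated fields such as multiple authors, which later steps (author counts, co-authorship) depend on.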
In Step 3, the author should clean the data and perform pre-processing. This step involves tasks such as removing duplicate entries and correcting inconsistencies in author names. Programming languages such as R and Python, or simpler tools such as Excel or LibreOffice, are typically employed to facilitate this process. The outcome is a refined and accurate dataset ready for detailed analysis.
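The cleaning tasks named here (removing duplicates and harmonizing author names) can be sketched in Python as follows. This is a simplified illustration, not a prescribed procedure: the record fields (`title`, `year`, `doi`), the helper names, and the sample records (including the DOI strings) are all hypothetical, and the matching rules are deliberately crude. In particular, reducing names to surname plus first initial can merge distinct authors, and a record with a DOI will not be matched against a copy that lacks one, so results should be checked by hand.

```python
import re
import unicodedata


def strip_accents(text):
    """Remove diacritics so that 'García' and 'Garcia' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_title(title):
    """Lowercase and drop punctuation so near-identical titles compare equal."""
    text = re.sub(r"[^a-z0-9 ]+", " ", strip_accents(title).lower())
    return " ".join(text.split())


def normalize_author(name):
    """Map 'Smith, John A.' and 'J. Smith' to the same key, 'smith j'."""
    name = strip_accents(name).strip()
    if not name:
        return ""
    if "," in name:
        family, given = [part.strip() for part in name.split(",", 1)]
    else:
        parts = name.split()
        family, given = parts[-1], " ".join(parts[:-1])
    return f"{family.lower()} {given[:1].lower()}".strip()


def deduplicate(records):
    """Keep the first record per DOI, or per (title, year) when no DOI exists."""
    seen, unique = set(), []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            key = ("doi", doi)
        else:
            key = ("title", normalize_title(rec.get("title", "")), rec.get("year"))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Made-up records, as they might arrive from two different databases.
raw = [
    {"title": "Mapping a Research Field", "year": 2020, "doi": "10.1000/xyz1"},
    {"title": "Mapping a research field.", "year": 2020, "doi": "10.1000/XYZ1"},
    {"title": "Co-Word Analysis in Practice", "year": 2021, "doi": ""},
    {"title": "co-word analysis in practice!", "year": 2021},
]
cleaned = deduplicate(raw)
print(len(raw), "records before cleaning,", len(cleaned), "after")
```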
In Step 4, the author should select the bibliometric technique. Techniques such as co-citation analysis, co-word analysis, and bibliographic coupling are considered during this stage. Software tools such as VOSviewer and CiteSpace assist in identifying the most suitable techniques. The expected outcome is the set of techniques that will be used for the subsequent data analysis.
In Step 5, the author should run the analysis. This stage uses tools such as R, Python, VOSviewer, and CiteSpace to reveal insights and patterns embedded within the body of literature, as mentioned before. The primary outcome is extracting meaningful insights and identifying trends and patterns in the research field.
In Step 6, the author should visualize the results. The visualization step aims to create graphical representations of the analysis results to aid their interpretation and presentation.
In Step 7, the author should interpret the results and report them. The report is typically drafted using software such as MS Word or LaTeX. The expected outcome is a detailed and insightful report that provides recommendations and highlights significant trends and patterns identified through the bibliometric analysis.
All steps in the bibliometric analysis process, separately and together, are integral to thoroughly understanding the research landscape. From defining clear research objectives, through the meticulous collection and cleaning of data and the selection and application of appropriate analytical techniques, to visualizing and reporting the findings, this structured approach ensures a rigorous and insightful exploration of the scientific literature. The use of specialized tools and software at various stages further enhances the accuracy and efficiency of the analysis, ultimately leading to a comprehensive and informative report.

Understanding Bibliometric Methodology
Bibliometric methodology is the application of quantitative approaches, such as author analysis, citation analysis, or keyword analysis, to bibliometric data [16,32,33,34]. The growth in bibliometric publications, driven by the increase in scientific research and the availability of large bibliographic datasets [4], is evidenced by an average of 1021 publications annually in the last decade [4]. Scientific databases such as Google Scholar, Scopus, and Web of Science, along with software such as R, Leximancer, and VOSviewer, facilitate the collection and analysis of extensive bibliometric data, boosting scholarly interest [4,31].

Comparison with Other Review Methods
Bibliometric analysis can be compared with meta-analysis and systematic literature reviews. Meta-analysis estimates the overall strength and direction of effects and the variance across studies [48], while systematic literature reviews organize and assess the existing literature using systematic procedures, often manually [21].
Meta-analysis, like bibliometric analysis, handles large volumes of literature and provides a nuanced summary of a field, although it may be affected by publication bias and study heterogeneity [48]. Systematic literature reviews, which tend to focus on narrower scopes, are better suited for confined or niche research areas and typically include fewer papers [21,49].
Although both meta-analysis and bibliometric analysis are quantitative, they differ in focus. Meta-analysis summarizes empirical evidence by examining relationships among variables, often serving as a tool for theory extension [48,50,51]. In contrast, bibliometric analysis is simpler and explores a field's bibliometric and intellectual structure by analyzing relationships among research constituents (e.g., authors, institutions, topics).
The choice among bibliometric analysis, meta-analysis, and systematic literature reviews depends on the review's goals and the literature's scope. These methods are complementary, each offering unique benefits to researchers. Table 2 provides a comparative overview of these methodologies to guide authors in selecting the appropriate review method.
Bibliometric analysis can be described as an attempt to manage large volumes of information by summarizing the trends and structural composition of a domain of scientific research. Such a review can be comprehensive with regard to co-authorship patterns, intellectual structure, and publication patterns because it presents different aspects of the literature in quantified form, for instance, as citation counts. The technique is especially appropriate for broad reviews with a large amount of data: it is well suited to mapping a research field; detecting the key publications, authors, and journals that have influenced the debate; and showing how research topics emerge and develop over time. Bibliometric analysis is far less suitable for a narrowly focused review or for small, contained datasets, because its quantitative nature does not do justice to the nuances of a narrowly defined research question. Its scope is therefore very wide, typically covering hundreds or even thousands of publications drawn from different databases to ensure comprehensive coverage of a field. In general, a bibliometric study relies on quantitative analysis using metrics such as citation counts and the h-index, although it can include qualitative elements, such as content analysis of keywords or themes based on their frequency.
Meta-analysis is a statistical synthesis of empirical evidence across studies, aimed at revealing new relationships and providing a high-level, theory-oriented synthesis of research findings. It pools results from a large number of studies to draw more valid conclusions. A key feature of meta-analysis is that it can summarize results across homogeneous studies, that is, studies that are similar in methodology, population, and context. It is appropriate when findings need to be aggregated to reach more robust conclusions. It is not the right technique when studies are heterogeneous, differing substantially in design or context, or when only a few high-quality studies are available, since variability among studies can severely weaken the validity of the meta-analytic findings. Meta-analyses can be broad or very specific, depending on the research question and on the availability of homogeneous studies whose data can be aggregated. The dataset may be large or just sufficiently inclusive; it combines several empirical studies that meet predefined inclusion criteria, and its quality and homogeneity are critical for the validity of the meta-analysis. The analysis is quantitative: statistical techniques combine the data from the studies to estimate effect sizes, which can be displayed graphically in forest plots.
A major objective of a systematic literature review (SLR) is to summarize and synthesize the findings of the existing literature in a structured and systematic manner, providing an overview of the state of research on a given topic, including gaps and inconsistencies. SLRs are especially valuable for small, specific reviews with manageable datasets, where well-defined research questions are analyzed in depth in order to synthesize existing knowledge comprehensively. SLRs are less effective for large, broad reviews or large datasets, because the required level of detail and exhaustiveness may not be practical for broad summaries of the literature. An SLR scopes a narrowly defined research area and examines its literature in depth for detailed insights. The datasets included in SLRs are small, with only a few studies and rigorous inclusion processes that safeguard the quality and relevance of the literature reviewed. Systematic literature reviews are qualitative in nature: they synthesize the findings of the selected studies, identify common themes, and produce a narrative synthesis of the literature.
Consequently, the three types of review serve distinct purposes and fit different research contexts. Bibliometric analysis offers a way to give general overviews of large datasets using both quantitative and qualitative data. Meta-analysis aggregates empirical evidence from mutually homogeneous studies through quantitative synthesis. Systematic literature reviews are detailed syntheses of very specific topics, using qualitative analysis. Knowing the strengths and limitations of each approach allows the most appropriate methodology to be chosen for a given research objective.

Main Methodologies and Explanations
Bibliometric analysis involves two main approaches: performance analysis and science mapping.

Performance Analysis
Assessing performance in research involves evaluating the impact of researchers, institutions, and countries using metrics such as total publications, author contributions, and citation-related indicators [52].
Publication metrics evaluate the quantity and collaborative aspects of research output, while citation metrics gauge the influence and impact of research through citation analysis [4,52].
Furthermore, combined metrics that consider citations and publications provide comprehensive insights into the characteristics of research output and its reception within the scholarly community. Table 3 summarizes those metrics.

| Metric | Description |
|---|---|
| Total publications | The total number of publications produced by a researcher, institution, or country. |
| Sole-authored publications | Publications written by a single author, indicating individual research contributions. |
| Co-authored publications | Publications written by multiple authors, reflecting collaborative research efforts. |
| Number of active years of publication | The number of years during which a researcher or institution has been actively publishing. |
| Productivity per active year of publication | The average number of publications produced per active year. |
| Number of contributing authors | The total number of unique authors contributing to a body of work. |
| Total citations | The total number of citations received by a researcher's or institution's publications. |
| Average citations | The average number of citations per publication, indicating the impact of the work. |
| Citations per cited publication | The average number of citations per publication that has been cited. |
| Number of cited publications | The total number of a researcher's or institution's publications that have been cited at least once. |
| Proportion of cited publications | The proportion of publications that have received citations out of the total number of publications. |
| Collaboration index | A measure of the extent and intensity of collaborative research efforts. |
| Collaboration coefficient | A coefficient indicating the degree of collaboration in research. |
| g-index | An index that considers both the number of publications and the number of citations per publication. |
| h-index | An index that quantifies both the productivity and citation impact of a researcher's publications. |
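To make these metrics concrete, the sketch below computes most of them for a small, made-up publication list. Several choices are assumptions rather than definitions taken from this entry: the h-index and g-index follow their usual textbook definitions (with g capped at the number of publications), "active years" is taken to mean the number of distinct publication years, and the collaboration coefficient uses one formulation found in the literature, namely 1 minus the mean of 1/(number of authors) per paper. The collaboration index is omitted because its definition varies between sources. The function expects a non-empty list in which every paper has at least one author.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites < rank:
            break
        h = rank
    return h


def g_index(citations):
    """Largest g such that the top g publications total at least g*g citations."""
    total, g = 0, 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


def performance_summary(papers):
    """papers: non-empty list of dicts with 'year', 'citations', 'authors'."""
    cites = [p["citations"] for p in papers]
    cited = [c for c in cites if c > 0]
    n = len(papers)
    active_years = len({p["year"] for p in papers})  # assumption: distinct years
    return {
        "total_publications": n,
        "sole_authored": sum(1 for p in papers if len(p["authors"]) == 1),
        "co_authored": sum(1 for p in papers if len(p["authors"]) > 1),
        "active_years": active_years,
        "productivity_per_active_year": n / active_years,
        "contributing_authors": len({a for p in papers for a in p["authors"]}),
        "total_citations": sum(cites),
        "average_citations": sum(cites) / n,
        "cited_publications": len(cited),
        "proportion_cited": len(cited) / n,
        "citations_per_cited_publication": sum(cited) / len(cited) if cited else 0.0,
        # Assumed formulation: 1 - mean(1 / number of authors per paper).
        "collaboration_coefficient": 1 - sum(1 / len(p["authors"]) for p in papers) / n,
        "h_index": h_index(cites),
        "g_index": g_index(cites),
    }


# Made-up publication list for illustration.
papers = [
    {"year": 2019, "citations": 10, "authors": ["A", "B"]},
    {"year": 2019, "citations": 4, "authors": ["A"]},
    {"year": 2020, "citations": 3, "authors": ["A", "C", "D"]},
    {"year": 2021, "citations": 0, "authors": ["B", "C"]},
]
summary = performance_summary(papers)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For this toy list the citation counts are 10, 4, 3, and 0, so three papers have at least three citations each (h = 3), while the top four papers together have 17 citations, which is at least 16 (g = 4).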

Science Mapping
Science mapping charts the structure of scientific research and its dynamics. It draws on several techniques: citation analysis to identify the most influential publications, co-citation analysis to understand relationships between cited works, bibliographic coupling to link related citing publications, co-word analysis to show relationships among topics, and co-authorship analysis to understand social interactions between authors [52,53]. Citation analysis studies how publications are interrelated through citations, with the objective of identifying key works and trends in a given area. Co-citation analysis identifies connections between cited documents and thereby reveals significant research themes in that area. Bibliographic coupling is similar but links documents that cite the same references, indicating similarity in subject matter. Co-word analysis identifies the simultaneous use of keywords within a record and serves to detect relationships between different research topics. Co-authorship analysis investigates the collaborative networks, both social and institutional, that form between researchers through scientific research [53,54]. Table 4 summarizes those analyses.

| Technique | Aspect | Explanation |
|---|---|---|
| Citation analysis | Relationships among publications | Examines how publications are related through citations, showing how knowledge is built over time. |
| Citation analysis | Most influential publications | Identifies publications that have had the most significant impact on a field, as evidenced by citation counts. |
| Co-citation analysis | Relationships among cited publications | Analyzes the frequency with which two documents are cited together, indicating their relatedness. |
| Co-citation analysis | Foundational themes | Identifies core themes and seminal works that form the basis of research in a particular field. |
| Co-word analysis | Existing or future relationships among topics | Analyzes the co-occurrence of keywords or terms within publications to identify relationships between topics. |
| Co-word analysis | Written content words | Focuses on the content of publications to uncover trends and patterns in research topics. |
| Bibliographic coupling | Relationships among citing publications | Examines how publications are linked by their references to the same documents, suggesting topical similarities. |
| Bibliographic coupling | Periodical or present themes | Analyzes current and emerging themes in research based on shared references. |
| Co-authorship analysis | Social interactions or relationships among authors | Studies the collaboration patterns among authors, highlighting social networks in research. |
| Co-authorship analysis | Authors and author affiliations, institutions, countries | Analyzes authors' affiliations to understand the geographic and institutional distribution of research collaboration. |
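The relational techniques summarized above share a simple counting core, which the following sketch illustrates on a hypothetical three-paper corpus. Co-citation, co-word, and co-authorship links are all obtained by counting how often two items (cited references, keywords, or authors) appear together in the same paper, while bibliographic coupling counts the references that two citing papers share. The paper identifiers, reference labels, keywords, and author names are invented, and dedicated tools such as VOSviewer or Bibliometrix additionally apply normalization, thresholds, and clustering that are not shown here.

```python
from collections import Counter
from itertools import combinations


def pair_counts(groups):
    """Count how often each unordered pair of items occurs in the same group."""
    counts = Counter()
    for items in groups:
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts


def bibliographic_coupling(references):
    """references: citing paper -> list of cited works.
    Returns the number of shared references for each pair of citing papers."""
    strength = {}
    for p, q in combinations(sorted(references), 2):
        shared = set(references[p]) & set(references[q])
        if shared:
            strength[(p, q)] = len(shared)
    return strength


# Hypothetical toy corpus.
papers = {
    "P1": {"refs": ["R1", "R2", "R3"],
           "keywords": ["bibliometrics", "citation analysis"],
           "authors": ["Ana", "Ben"]},
    "P2": {"refs": ["R1", "R2"],
           "keywords": ["bibliometrics", "science mapping"],
           "authors": ["Ben", "Chen"]},
    "P3": {"refs": ["R2", "R3", "R4"],
           "keywords": ["science mapping", "citation analysis", "bibliometrics"],
           "authors": ["Ana", "Ben"]},
}

# Co-citation: two references cited together by the same paper.
co_citation = pair_counts(p["refs"] for p in papers.values())
# Co-word: two keywords used together in the same paper.
co_word = pair_counts(p["keywords"] for p in papers.values())
# Co-authorship: two authors signing the same paper.
co_authorship = pair_counts(p["authors"] for p in papers.values())
# Bibliographic coupling: two citing papers sharing references.
coupling = bibliographic_coupling({pid: p["refs"] for pid, p in papers.items()})

print("co-citation:", dict(co_citation))
print("co-word:", dict(co_word))
print("co-authorship:", dict(co_authorship))
print("coupling:", coupling)
```

Each resulting dictionary is a weighted edge list, which is the form a network visualization tool would take as input.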

Limitations of Bibliometric Analysis
One important concern is the limited coverage of the literature. A dataset that does not cover all relevant literature may lead to biased findings, so the depth of the literature search is a critical factor in the outcome of a bibliometric analysis. A multi-pronged search strategy is therefore essential. Such a strategy draws literature from several databases, such as Web of Science, Scopus, and Google Scholar, since each has its own strengths in indexing standards. Incorporating grey literature, including conference proceedings, theses, and technical reports, can capture valuable research that is not indexed in traditional databases. An iterative search method, in which initial findings guide subsequent searches, helps uncover additional relevant studies that were missed at the preliminary stage. Furthermore, involving subject specialists in reviewing the search strategy and the retrieved literature makes the dataset more relevant and comprehensive. In this manner, the researcher ensures that relevant literature is not missed, and bias is minimized [55].
Another issue is technological bias arising from the software tools themselves. A tool may bias the results toward the outcomes its developers expected, which illustrates how constraints within a software program can produce unintended effects on the analysis. Mitigating this requires an understanding of the functions and capabilities of each tool selected. Integrating several tools, such as R, Python, VOSviewer, and CiteSpace, can offset the inherent limitations of any single tool and give a more balanced analysis. In addition, results obtained with different tools should be validated against each other to ensure that the analysis is robust and reliable. Keeping software tools updated so that they include the most recent algorithms and techniques also helps reduce technological bias. The methodologies and tools used, together with the limitations they may impose, should be documented transparently, both to ensure the integrity of the research steps and to make the findings reproducible [56][57][58].
Finally, it is necessary to acknowledge the shortcomings of bibliometric techniques themselves. Because they focus on quantitative features of the literature, such as citation counts or co-authorship patterns, and because the resulting research designs are descriptive in character, they do not capture many qualitative dimensions, such as the theoretical contribution and practical significance of the research. This can oversimplify what is, in reality, a complex body of work. One way to reduce this effect is to supplement a bibliometric analysis with qualitative methods, such as content analysis or expert interviews, to obtain a more detailed understanding of the literature. Integrating quantitative and qualitative insights enriches the analysis and broadens the perspective on the research domain. The final report should acknowledge its limitations and discuss how they may affect the interpretation of the results. Following a mixed-method design and being transparent about weaknesses boosts the quality and depth of bibliometric studies [5].

Key Questions for Performing Bibliometric Analysis
The authors should begin by clearly defining the specific research questions or problems they aim to address and outline the goals of the bibliometric analysis. They need to identify which databases, such as Web of Science, Scopus, and Google Scholar, will be used for the literature search and develop strategies to ensure the dataset is comprehensive and relevant, including the inclusion of grey literature. To manage duplicates and inconsistencies in author names, authors should use tools such as R and Python and establish clear criteria for including or excluding publications based on relevance and quality. They should determine which bibliometric techniques, such as co-citation analysis, co-word analysis, and bibliographic coupling, are most suitable for their research questions and objectives. Authors must select appropriate software tools, such as VOSviewer, CiteSpace, and Bibliometrix, to conduct the analysis and interpret the results effectively. They should choose visualization methods, such as graphs, network maps, and thematic clusters, that best represent the data, making the findings clearer and more impactful. Finally, authors need to interpret the results in the context of their research objectives and communicate their findings and implications effectively through comprehensive reports, using software such as MS Word or LaTeX.
The following table (Table 5) outlines the key questions authors should consider when conducting a comprehensive bibliometric analysis.

Table 5. Key questions for bibliometric analysis.

| Step | Guidelines | Questions to Consider |
|---|---|---|
| Define Research Objectives | Clearly outline the objectives of the bibliometric analysis. | What specific research questions or problems am I aiming to address? What are the goals of this analysis? |
| Literature Search and Data Collection | Collect relevant literature from reputable databases. | Which databases will I use for the search? How will I ensure a comprehensive and relevant dataset? |
| Data Cleaning and Preprocessing | Ensure the accuracy and consistency of the data. | How will I handle duplicates and inconsistent author names? What criteria will I use to include or exclude publications? |
| Selection of Bibliometric Techniques | Choose techniques that align with the research objectives. | Which bibliometric techniques are most suitable for my research questions? How do these techniques help me achieve my objectives? |
| Data Analysis | Conduct the analysis using the selected techniques. | What software tools will I use for the analysis? How will I interpret the results? |
| Visualization | Create visual representations of the data to aid interpretation. | What types of visualizations will best represent my data? How can these visualizations make my findings clearer and more impactful? |
| Interpretation and Reporting | Interpret the findings and prepare a comprehensive report. | What do the results mean in the context of my research objectives? How can I effectively communicate my findings and their implications? |

Conclusions
This entry concludes that bibliometric analysis is an important scientific method available to both established scientists and young researchers who aim to review a very large body of research. Bibliometric methodology is attracting increasing interest thanks to widely available and useful bibliometric software and databases. Such techniques are gaining importance with the rise of artificial intelligence and big data. This entry also explains in plain language what bibliometric analysis encompasses, including its methodologies, the techniques applied, and recent improvements in the analysis. Choosing the right method at each step of a bibliometric analysis is crucial, since it determines the data inputs and the results obtained.

Table 1. Main steps for bibliometric analysis.