Energy Efficiency in Cloud Computing: Exploring the Intellectual Structure of the Research Field and Its Research Fronts with Direct Citation Analysis

The aim of the study is to explore the intellectual structure of the field and fronts in research on energy efficiency in the context of cloud computing and thus to contribute to science mapping of the research field. The research process was driven by the following study questions: (1) what are the most influential publications in the research field? and (2) what are the research fronts in the research field? The method of direct citation analysis was employed in the research process. Data for analysis were obtained from the Scopus database and analyzed with the use of VOSviewer science mapping software. In response to the first question, we identified the most influential publications in the research field and analyzed their types (i.e., whether they are original research papers or rather “context” papers, e.g., survey or review papers, framework papers, challenges papers, and study papers). Moreover, a comparative analysis of the types of papers among the most cited “classical” publications and the “emerging stars” was conducted. In response to the second research question, we identified five research fronts concentrated around the issues of: virtual machine management (“VM”); task-focus, concerning data replication, task consolidation, and task scheduling (“task”); energy efficiency (“energy”); modeling and optimization (“model”); and energy efficiency in the networking context (“network”).


Introduction
The second decade of the 21st century saw a period of rapid growth of cloud computing, a new trend in communications and computing technology that had started only a few years earlier. Enabled by the constantly growing capacity and ubiquity of the internet and by progress in mass storage technology, it offers many advantages (and associated new challenges) to companies and users: a software-as-a-service approach with simple deployment of updates, advanced functionality delivered to low-resource end devices (IoT, the Internet of Things), large-scale data correlation and analysis, and more (see Avram [1], for example). However, the trend of moving functionality and data storage towards data centers also has a significant impact on the use of energy for computing. While user devices become increasingly mobile, with reduced energy consumption appropriate for battery use, the demand for data centers of varying size grows. (There are some exceptions among user devices, the most obvious being high-end gaming and small-scale high-performance computing, basically fast graphics-processing units (GPUs), which tend to consume energy at similar rates between generations, increasing computing power instead.) Even though server hardware itself becomes more and more energy efficient, the growing demand increases the energy consumption of data centers (see Avgerinou et al. [2]). This can be seen as a problem as well as an opportunity: the concentration of computation in large data centers enables new, innovative methods of optimizing energy use through energy-aware management of tasks, virtual machines, and data location, something not possible years ago, when most of the processing was handled by end devices. As a result, this decade has seen a lot of research activity in this area.
Lis et al. [3] note that research on energy efficiency in cloud computing has not yet been mapped bibliometrically, although the field has been developing dynamically in recent years and bibliometric reviews have become an increasingly popular approach to studying scientific output on cloud computing in general, on its particular aspects (e.g., security issues or customers' quality of experience), and on its contexts (e.g., the healthcare sector). Nor has this gap been filled by traditional literature reviews, which focus on narrowly defined aspects rather than covering the overall picture of the research field [3]. The aforementioned work by Lis and his associates attempted to map the conceptual structure of the field, with an emphasis on leading thematic areas and emerging topics. Nevertheless, the intellectual structure of the field was not covered in that study.
Therefore, the aim of the study is to explore the intellectual structure of the field and fronts in research on energy efficiency in the context of cloud computing, and thus to contribute to science mapping of the research field. Referring to Zupic and Čater [4], the following research questions were defined to operationalize the aforementioned aim of the study: (1) what are the most influential publications in the research field? and (2) what are the research fronts in the research field? The method of direct citation analysis was employed in the research process. Data for analysis were obtained from the Scopus database and analyzed with the use of VOSviewer science mapping software [5].
The remainder of the paper consists of three sections and conclusions. This introduction is followed by the methodology section explaining employed methods, sources of data, and the research sampling process. The main body includes two sections covering results presentation and analysis, as well as a discussion of research findings.

Materials and Methods
Science mapping is one of the branches of bibliometric studies. Science mapping methodology comprises five main methods employed to explore research fields, i.e., direct citation analysis, co-citation analysis, bibliographic coupling, co-author analysis, and co-word analysis [4]. Three of them, i.e., direct citation analysis, co-citation analysis, and bibliographic coupling, explore the relations among documents in a dataset and may be employed for exploring the intellectual structure of a field and the research fronts within it. For the purposes of this study, we employed the method of direct citation analysis. Citation analysis is based on the assumption that the more often a given publication is cited, the more significant its impact on the research field. In spite of its weaknesses, such as a bias towards 'core' authors (the so-called Matthew effect) and towards older publications, both resulting from the unequal distribution of citations, direct citation analysis is recognized as an effective method for analyzing the intellectual structure of a field [6] and identifying research fronts [6,7]. Compared with the two other methods recommended for mapping research fronts, i.e., co-citation analysis and bibliographic coupling, direct citation analysis is found to show slightly lower accuracy, but it has the advantage of clustering publications more evenly across the analyzed period of time. As observed by Boyack and Klavans [7] (p. 2391), "[i]n a longitudinal dataset where links are restricted to those within the set, bibliographic coupling is able to cluster very recent papers but clusters fewer of the very old papers, while co-citation clustering does the opposite – it clusters the older papers, but cannot cluster the most recent papers that have not yet been cited. Direct citation clusters documents more evenly across the time window, and tends to cluster a larger number of documents than either bibliographic coupling or co-citation processes".
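To make the difference between the three document-based methods concrete, the following minimal Python sketch derives each type of link from the same citation data. The four paper identifiers and the `cites` mapping are hypothetical and are not taken from the Scopus sample; the function names are ours, not part of any bibliometric tool.

```python
from itertools import combinations

# Hypothetical toy data: each paper maps to the set of papers it cites.
cites = {
    "A": set(),
    "B": {"A"},
    "C": {"A", "B"},
    "D": {"A", "B"},
}

def direct_citation_links(cites):
    """Direct citation: (citing, cited) pairs, restricted to papers in the dataset."""
    return {(src, dst) for src, refs in cites.items() for dst in refs if dst in cites}

def bibliographic_coupling(cites):
    """Bibliographic coupling: number of references shared by each pair of papers."""
    strength = {}
    for p, q in combinations(sorted(cites), 2):
        shared = len(cites[p] & cites[q])
        if shared:
            strength[(p, q)] = shared
    return strength

def co_citation(cites):
    """Co-citation: number of papers that cite both members of a pair."""
    strength = {}
    for refs in cites.values():
        for p, q in combinations(sorted(refs), 2):
            strength[(p, q)] = strength.get((p, q), 0) + 1
    return strength

print(sorted(direct_citation_links(cites)))
print(bibliographic_coupling(cites))
print(co_citation(cites))
```

In this toy dataset, the oldest papers A and B are linked only by co-citation, the newest papers C and D are linked most strongly by bibliographic coupling, and direct citation links touch all four papers, which mirrors the observation quoted above about how evenly each method covers the time window.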
Direct citation analysis is also found to be more effective in mapping research fronts than the two other methods taken into account (i.e., co-citation analysis and bibliographic coupling) [8]. As summarized in the study, "[d]irect citation, which could detect large and young emerging clusters earlier, shows the best performance in detecting a research front [...]. Additionally, in direct citation networks, the clustering coefficient was the largest, which suggests that the content similarity of papers connected by direct citations is the greatest and that direct citation networks have the least risk of missing emerging research domains because core papers are included in the largest component" [8] (p. 571).
The publications related to energy efficiency issues in cloud computing, indexed in the Scopus database, constituted the sample for analysis. On 5 September 2020, we searched for the logical conjunction of the phrases "cloud computing" AND "energ*" AND "efficien*" in the titles of publications indexed in Scopus. The truncation (stemming) technique (an asterisk after the roots of the searched words) was used to include all the variations of the words related to "energy" and "efficiency". To include all the relevant items, we limited neither the date of publication nor the subject area. We replicated the search criteria of the study by Lis and associates [3] in order to enable a comparison between the research fronts identified with direct citation analysis and the thematic areas discovered through co-word analysis. Consequently, 323 publications were retrieved and selected for further analysis. The process of citation analysis was supported by VOSviewer science mapping software [5]. We decided to employ VOSviewer due to its growing popularity within academia. The first study indexed in Scopus and employing VOSviewer for science mapping was published in 2011. In 2018, for the first time, more than 100 publications were indexed which included the phrase 'VOSviewer' in their titles, keywords, and abstracts. The number of such publications increased to 493 in 2020 and to 775 in 2021 (as of 3 October). VOSviewer is free-of-charge software aimed at creating, analyzing, and visualizing bibliometric maps. The authors of the software explain that "VOSviewer constructs a map on a co-occurrence matrix. The construction of a map is a process that consists of three steps. In the first step, a similarity matrix is calculated based on the co-occurrence matrix. In the second step, a map is constructed by applying the VOS mapping technique to the similarity matrix. Finally, in the third step, the map is translated, rotated, and reflected" [5] (p. 530).
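The effect of the truncated title search can be illustrated with a small Python sketch that emulates it locally with regular expressions. This is only an approximation under our own assumptions: the helper names are hypothetical, and Scopus's actual query syntax and matching rules are not reproduced here.

```python
import re

def truncation_to_regex(term):
    """Turn a search term with an optional trailing '*' into a case-insensitive regex."""
    if term.endswith("*"):
        # Truncated term: the root followed by any run of word characters.
        pattern = r"\b" + re.escape(term[:-1]) + r"\w*"
    else:
        # Exact phrase, delimited by word boundaries.
        pattern = r"\b" + re.escape(term) + r"\b"
    return re.compile(pattern, re.IGNORECASE)

# The three terms of the search described in the text.
TERMS = ["cloud computing", "energ*", "efficien*"]
PATTERNS = [truncation_to_regex(term) for term in TERMS]

def title_matches(title):
    """Return True when every term occurs in the title (logical AND)."""
    return all(pattern.search(title) for pattern in PATTERNS)

print(title_matches("A taxonomy and survey of energy-efficient data centers and cloud computing systems"))
print(title_matches("Cloud computing security issues"))
```

The first title matches because "energy" and "efficient" both extend the truncated roots; the second fails because it contains neither root.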
VOSviewer supports distance-based bibliometric maps. In this type of map, the distance between two items reflects the strength of the relation between them, i.e., the closer two items are located to each other, the stronger their relation [5]. A unified framework consisting of the VOS (i.e., visualization of similarities) layout technique and the VOS clustering technique [9,10] is employed by VOSviewer to create bibliometric maps. As claimed by Van Eck and associates, "in general, maps constructed using VOS provide a more satisfactory representation of a dataset than maps constructed using well-known multidimensional scaling approaches" [11] (p. 2405). In the process of analysis, the layout parameters of attraction and repulsion "influence the way in which items are located in a map by the VOS layout technique", while resolution "determines the level of detail of the clustering produced by the VOS clustering technique" [12] (pp. 21, 22). This means that an increase in the value of the resolution parameter results in an increase in the number of clusters. Following the recommendation of the authors of the VOSviewer software, in our analysis of research on energy efficiency in the context of cloud computing, we employed the default values of the attraction, repulsion, and resolution parameters. In order to avoid single items or very small clusters, we switched on the function of merging small clusters and set the minimum cluster size to 10 items. Detailed parameters are provided in Table 1.

Core Publications
Among the publications comprising the research sample (n = 323), there are 237 works which were cited at least once. Those items are the subject of direct citation analysis. The study "Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing" by Beloglazov and associates [13], with 1679 citations, is found to be the most influential paper in the research field. Other highly cited publications include: a review of methods and technologies employed for energy efficient cloud computing by Berl et al. [14] (443 citations), "A taxonomy and survey of energy-efficient data centers and cloud computing systems" by Beloglazov et al. [15] (427 citations), an investigation of energy consumption by mobile users of cloud computing services by Miettinen and Nurminen [16] (408 citations), and an analysis of heuristics optimizing the use of resources and energy consumption in cloud computing by Lee and Zomaya [17] (377 citations). The density visualization of the most prominent publications in the cloud computing energy efficiency research field, weighted by the number of citations, is displayed in Figure 1, and the top 20 core references in the research field are listed in Table 2.
In the item density maps, the color scale ranges from blue through green and yellow to red as the number of citations increases. The labels of publications display the name of the first author and the date of publication. Source: authors' own study. Data sourced from Scopus, analyzed with VOSviewer (5 September 2020).
As direct citation analysis naturally favors older publications, which have had more chances to be cited than more recent ones, we included the normalized number of citations in the analysis. In VOSviewer, "[t]he normalized number of citations of a document equals the number of citations of the document divided by the average number of citations of all documents published in the same year and included in the data that is provided to VOSviewer" [12] (p. 37). The normalization function is used to mitigate the bias of direct citation analysis towards the earliest documents at the expense of the most recent publications.
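The normalization quoted from the VOSviewer manual is simple enough to sketch in a few lines of Python. The function below is our own illustration of that definition, not VOSviewer code; the `sample` records are hypothetical, and returning 0.0 for a year whose papers have no citations at all is an assumption made only to avoid division by zero.

```python
from collections import defaultdict

def normalized_citations(records):
    """Map paper_id -> citations divided by the mean citations of same-year papers.

    records: iterable of (paper_id, publication_year, citation_count) tuples.
    """
    totals = defaultdict(int)   # sum of citations per publication year
    counts = defaultdict(int)   # number of papers per publication year
    for _, year, citations in records:
        totals[year] += citations
        counts[year] += 1

    scores = {}
    for paper_id, year, citations in records:
        mean = totals[year] / counts[year]
        # Assumption: a year with no citations at all yields a score of 0.0.
        scores[paper_id] = citations / mean if mean else 0.0
    return scores

# Hypothetical records, not taken from the Scopus sample.
sample = [
    ("p1", 2012, 300),
    ("p2", 2012, 100),
    ("p3", 2019, 30),
    ("p4", 2019, 10),
    ("p5", 2019, 5),
]
scores = normalized_citations(sample)
print(scores["p1"], scores["p3"])  # 1.5 2.0
```

In this toy data, the 2019 paper p3 has one tenth of the citations of the 2012 paper p1, yet it scores higher after normalization (2.0 against 1.5), which is the effect the normalized ranking relies on.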
Taking into account the normalized number of citations, some very recent publications are found among those having a strong impact on the development of the research field. First and foremost, the works by Kandavel and Kumaravel [29] and Liu et al. [25] are worth mentioning. The analysis "Offloading computation for efficient energy in mobile cloud computing" [29], published in 2019, has received 87 citations so far and scored the highest normalized number of citations (14.69) within the sample. The study "An energy efficient ant colony system for virtual machine placement in cloud computing" [25], issued in 2018, has been cited 101 times and is another of the most influential recent studies in the field, with a normalized number of citations equal to 13.29. The seminal paper by Beloglazov et al. [13], published in 2012, is the most cited paper in the field (1679 citations) and ranks third by the normalized number of citations (11.94). The density visualization of the items with the highest normalized number of citations is displayed in Figure 2, and the top 20 publications are listed in Table 3.
Identifying core publications with the highest normalized number of citations provides information about the most influential documents in the research field. By mitigating the natural bias of direct citation analysis towards earlier publications, the citation normalization technique enables researchers exploring the intellectual structure of the field to supplement classical and seminal works [13] with "emerging stars" of potentially high impact on the development of the field, e.g., [25,29].

Research Fronts
In order to identify research fronts within the field, the network visualization function of VOSviewer was employed. Publications cited at least five times were taken into account; 147 publications met this threshold. For them, the number of citation links was calculated and the documents with the largest number of links were chosen. The largest set of connected publications, comprising 67 items, was identified and categorized into five clusters (Figure 3), and then the composition of the clusters was explored (Table 4). The number of clusters was calculated automatically by the VOSviewer software according to the pre-defined analysis parameters (cf. Table 1). The clusters were ranked from those with the highest number of items to those with the lowest.
Network analysis of citation links among the analyzed publications indicates five clusters of connected publications focused on the following issues: (1) VM; (2) task; (3) energy; (4) model; and (5) network. In all cases, the focus is clear, although not every paper follows it. As can be expected in an analysis based on citations alone, the topics of individual papers can be quite different, but the leading issue of the cluster is clearly visible in most of its papers. Cluster 1 (marked in red in Figure 3) follows the topic of energy-aware resource allocation in cloud computing data centers, with a clear focus on virtual machine management, leading to the label "VM". Note that the adjacent clusters (especially 2 and 3) are strongly related thematically. Cluster 2 (green) is closely linked with the previous one. While some of the highly connected papers in the cluster explicitly consider virtual machine deployment, the core theme of cluster 1, the remaining papers in the cluster are more task-focused, concerning data replication, task consolidation, and task scheduling. This resulted in the label "task" assigned to the cluster.
Cluster 3 (blue) is closely linked to the previous two. While the works in this cluster often include scheduling or VM deployment themes typical of clusters 1 and 2, analysis of titles and abstracts shows a clear focus on energy efficiency, leading to the label "energy". Cluster 4 (yellow) is characterized by a strong theoretical bias, approaching the problem of energy efficiency through modeling and optimization, resulting in the label "model". Cluster 5 (purple) is significantly less connected with the other clusters, which is confirmed by the analysis of titles and abstracts in this group. The focus is clearly on energy efficiency in the networking context, leading to the label "network".
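The selection procedure described at the start of this section (a citation threshold followed by extraction of the largest set of connected publications) can be sketched in Python as follows. This is a simplified illustration under our own assumptions: citation links are treated as undirected, the data are hypothetical, and the VOS clustering step that splits the connected set into clusters is not reproduced.

```python
from collections import deque

def largest_connected_set(citation_counts, links, min_citations=5):
    """Return the largest connected set of papers meeting the citation threshold.

    citation_counts: dict paper_id -> number of citations.
    links: iterable of (citing, cited) pairs; treated as undirected here.
    """
    kept = {p for p, count in citation_counts.items() if count >= min_citations}

    # Adjacency restricted to papers that passed the threshold.
    neighbours = {p: set() for p in kept}
    for a, b in links:
        if a in kept and b in kept:
            neighbours[a].add(b)
            neighbours[b].add(a)

    best, seen = set(), set()
    for start in sorted(kept):
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best

# Hypothetical data, not taken from the Scopus sample.
counts = {"a": 10, "b": 7, "c": 5, "d": 2, "e": 6, "f": 9}
links = [("b", "a"), ("c", "b"), ("d", "a"), ("f", "e")]
print(sorted(largest_connected_set(counts, links)))  # ['a', 'b', 'c']
```

Paper "d" is dropped by the threshold even though it cites "a", and the pair "e"/"f" forms a smaller separate component, so only the largest connected set would be passed on to clustering.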

Discussion
The following discussion is focused on the results of analysis of the titles and abstracts of the papers in each of the identified clusters, moving on to more general analysis of the field.

Clusters as Research Streams
The clusters identified in the previous section can be viewed as streams of related research connected by citations. In this section, we present these research directions in more detail.
Cluster 1 (VM) has roots in the framework paper by Beloglazov et al., 2012 [13]. The first works of this stream (Borgetto and Stolf [43]; Horri et al. [31]; and Shidik and Ashari [52]) focus on virtual machine management: allocation, reallocation, and consolidation. In the same year, Djemame et al. [46] present a cloud architecture focused on energy efficiency and Tesfatsion et al. [55] present techniques involving CPU frequency scaling and core allocation. These early works set the theme for the rest of the cluster, as other works in this cluster expand on these issues.
In cluster 2 (task), the earliest works are by Ye et al. [65] and Lee and Zomaya [17]. The first one is similar to cluster 1 in its focus on virtual machine migration, while the second deals with task consolidation heuristics. Starting there, most of the works in the cluster deal with various mathematical approaches to task scheduling and resource allocation. Early works (2013-2016) propose various optimization or heuristic approaches, e.g., resource allocation as a constraint satisfaction problem in the works of Lin et al. [59,60] or as a linear programming problem with various selection algorithms in Kumar et al. [58]. Later papers experiment with other optimization approaches and metrics, e.g., an evolutionary algorithm in Ye et al. [66].
The earliest paper in cluster 3 (energy) is Beloglazov et al. [15], who review the causes of high energy consumption. The papers in this cluster mostly follow this theme (with a few outliers dealing with scheduling). The general topic of this cluster is underscored by two highly cited publications from 2015: Kaur and Chana [27] survey the existing energy efficiency techniques, presenting data centers as both major energy consumers and potential energy savers, while Mastelic et al. [20] analyze energy consumption of the infrastructure behind the cloud. Newer papers follow the topic with new eco-aware metrics and reviews of energy saving methods.
Cluster 4 (model) starts with an early review paper by Berl et al. [14] and with the works by Wang et al. [79,80] from 2012 on energy-efficient task scheduling models employing genetic algorithms. This sets the general direction of the works in this cluster; e.g., Demirci [75] provides a survey of works involving machine learning for cloud energy efficiency problems, while Tang et al. [76] use genetic algorithms to solve the energy consumption optimization problem. Various other models are discussed either as the core topic of the papers or as an aspect of a different core problem (e.g., in the design of a virus scanning service in Zhang et al. [79]).
Finally, cluster 5 (network) begins with Cao et al. [84] considering the VM allocation problem, but is better defined by the second oldest paper by Xiang et al. [91], which discusses the link selection and data transmission scheduling problems. The focus of this cluster is the cloud as a networked system, with various transmission problems to consider. For example, energy-efficient routing is a core problem discussed by Fallahpour et al. [86] and Jiang et al. [88], while Lu and Sun [89] discuss energy-efficient load balancing.

Thematic Separation of Clusters
As could be expected in the case of a connected set of publications, all of the clusters share a common theme, which appears as a leading issue in some papers regardless of the cluster. This central theme is the problem of energy-efficient deployment of virtual machines. The topic is central to cluster 1, but appears in all clusters, for example:

• Cluster 2: Zhou et al. [67] and Zhou et al. [68] explicitly address virtual machine deployment and virtual machine migration, respectively;
• Cluster 3: Fayyaz et al. [70] consider VM consolidation as a tool in energy-efficient resource scheduling. Note that this is the least clear example of the common theme;
• Cluster 4: Vakilinia et al. [77] propose a platform for virtual machine placement/migration;
• Cluster 5: Even though the cluster is most isolated from the other ones, Cao et al. [84] focus on energy-efficient allocation of virtual machines based on demand forecast.
Even outside of this common theme, the clusters are, in general, clearly linked thematically. The entire group can be described as a body of work regarding energy-efficient task and resource allocation in cloud environments, with the different clusters focusing on different aspects of the problem or handling it in a different way. The labels introduced in Section 3.2 correspond to those different approaches. The division between clusters 1 and 2 is unclear, but works in cluster 1 tend to focus more on VM placement, while, in cluster 2, task scheduling is more explicitly addressed. Clusters 3 and 4 seem to take more high-level approaches, with cluster 3 more explicitly focusing on the goal of energy efficiency and cluster 4 taking a structured modeling approach.
Cluster 5 is much more clearly separated from the remaining clusters, as its central focus is on networking: routing algorithms, QoS constraints, manycast, etc. These themes are not present in most of the papers in the other clusters. However, even this cluster is not thematically consistent, with several papers considering energy efficiency in general, without a clear networking focus.
The contrast between the relatively clear separation of the clusters in the citation graphs and mixed content of the clusters can probably be attributed to the backgrounds of the researchers; depending on the previous work and familiarity with different aspects of the topic (task scheduling, VM management, energy efficiency, networking, linear modeling/optimization, etc.), the selection of literature for very similar papers can show significant bias.

Key Publication Types in Clusters
Analysis of the clustering behavior of the selected connected group of papers allows certain observations on the role of different papers. Most of the papers in the group are regular; they cite common sources for research context and recent closely related research, while themselves being cited in the latter role. However, certain papers stand out either as heavily cited centers of the clusters, or as links between otherwise weakly connected clusters of sources.
The keystone paper for this entire group is clearly the one by Beloglazov et al. [13], i.e., the most influential paper in the entire sample in terms of citation count. According to the abstract, the paper proposes "an architectural framework and principles for energy-efficient cloud computing". Even though there are several older papers with high citation counts in this group, this one clearly ties together the entire body of work, directly citing or being cited by papers in each of the five identified clusters. Note that the keystone role does not require such a high citation count; it is defined by graph connectivity, so a paper with very few citations from outside the identified group could potentially play the same role as long as it is well connected within it.
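Since the keystone role is defined by connectivity rather than by citation count, it can be checked mechanically once the cluster assignment is known. The sketch below is one possible reading of that definition, under our own assumptions: a paper is a keystone candidate if its citation links (in either direction) reach every cluster. The data and function names are hypothetical.

```python
def clusters_reached(cluster_of, links):
    """Map each paper to the set of clusters it touches through citation links.

    cluster_of: dict paper_id -> cluster label.
    links: iterable of (citing, cited) pairs; direction is ignored.
    """
    # Every paper trivially reaches its own cluster.
    reached = {paper: {cluster} for paper, cluster in cluster_of.items()}
    for a, b in links:
        if a in cluster_of and b in cluster_of:
            reached[a].add(cluster_of[b])
            reached[b].add(cluster_of[a])
    return reached

def keystone_candidates(cluster_of, links):
    """Papers that cite, or are cited by, at least one paper in every cluster."""
    all_clusters = set(cluster_of.values())
    reached = clusters_reached(cluster_of, links)
    return sorted(p for p, clusters in reached.items() if clusters == all_clusters)

# Hypothetical data: "k" is linked to papers in all three clusters.
cluster_of = {"k": 1, "x": 1, "y": 2, "z": 3}
links = [("x", "k"), ("y", "k"), ("z", "k"), ("y", "x")]
print(keystone_candidates(cluster_of, links))  # ['k']
```

Note that the check uses no citation counts at all, which matches the remark that a modestly cited paper could play the same role if it is well connected within the group.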
Each of the identified clusters also includes central papers tying them together:
• Cluster 1 is focused directly around the above-mentioned keystone paper. Another highly connected paper in this cluster is Sharma et al. [...]
[...] [86]. The first paper is an investigation of key resource allocation challenges, while the second one is a research paper which proposes an energy-efficient manycast routing and spectrum assignment algorithm.
The above analysis confirms that, while direct citation analysis is effective at identifying clusters of related research, this is largely facilitated by high-quality survey/taxonomy or framework papers, which cite a collection of crucial older papers and are used in later papers as support for the importance of the addressed challenge, as a base/framework for the research, or simply as a high-quality wide research context. Research papers can sometimes take this role in smaller clusters, either by tying together related papers due to an extensive bibliography, or by achieving sufficient significance to be often cited as previous work.
[...] [83] stand out visually as a link between cluster 5 and the other clusters, an example of another important role: the role of connecting nodes. Such nodes can appear in two very different ways. In some cases, a connecting paper signifies an important step in the differentiation of the field, with further research having reduced incentive to cite the papers from other clusters, but identifying the branching one as a precursor. However, in this case, the paper is, in fact, relatively new, and newer than most papers in cluster 5. Furthermore, it is not yet very highly cited (6 citations). Instead of a research branch diverging from a common root, this is a case of thematically related research based on different sources being correctly identified and connected by a publication citing from both groups.

Core Publication Types
Analysis of core publications in the entire sample leads to conclusions similar to those for the selected clusters, but interesting observations can be made by comparing the results for normalized and non-normalized citation counts. For this purpose, it is useful to review the publications' abstracts and titles and estimate their character, as done in some cases in the previous section (cf. Tables 2 and 3). In general, we classify the papers into two groups. One group ("research") focuses on original research of the authors related to the core problems of the field, from proposed models to experimental verification of new approaches, algorithms, etc. The other group ("context") is an amalgamation of several types of papers: survey or review papers cataloguing existing research, framework papers proposing high-level architectures or methodologies, challenges papers identifying open questions in the field, and study papers either comparing existing solutions or proposing new tools for the field (experimental approaches, methodologies, etc.). Clearly, the individual subclasses in the second group may overlap; in fact, a single paper may fit more than one subclass. The boundary between the two groups is also not strict; e.g., there can and do exist papers that both provide new solutions and present them in the context of a wide set of existing ones, providing a useful comparative study.
The top 20 publications by citation count are generally older: 65% of them were published before 2016, only one in 2019, and none in 2020, as it is difficult to reach such high citation numbers so fast. There is a similar number of research and context papers in this group (9 vs. 11), with the context papers mostly occupying places in the upper half of the list. The context papers on the list tend to be older, with the two newest being from 2016, while relatively new research papers are present. This is consistent with intuition; as new research requires context (surveys, reviews, methodologies, etc.), papers of this type accumulate a lot of citations, and the ones already highly cited tend to gather more. On the other hand, only especially successful research papers gain comparable numbers of citations.
Analyzing the top 20 list for normalized citation counts gives a very different outcome. Normalization changes the order significantly. Some context papers are replaced with research ones (65% of the papers on this list are research papers). Furthermore, this list is, as expected, far less biased towards old papers. In fact, 11 of the 20 papers were published in 2016 or later. For research papers this trend is even reversed, with almost 70% of the research papers on the list published in 2016 or later. The prevalence of context papers in the upper half of the non-normalized list is not present here; instead, the top of the list seems to be more accessible to newer papers, especially in the case of research papers.
This result suggests that the two groups follow very different dynamics. Good context papers tend to gain citations over a long period of time, as their usefulness for current research remains significant. This leads to very high citation counts in older papers. On the other hand, good research papers tend to gather most of their citations early, through immediate follow-up research or state-of-the-art reviews which focus on the latest results. This results in high normalized citation counts for new papers, which tend to decrease over time. In summary, the citation analysis results seem to agree with the intuition which allows us to expect a group of "context"-type papers and only some key "research"-type papers to form the core of literature of any field.

Comparison of Bibliometric Methods
As noted earlier, the sample analyzed in this paper is the same as the one used in Lis et al. (2020) [3], although citation counts have been updated. This allows us to compare the results of different bibliometric methods applied to the same data. Lis et al. (2020) [3] adopted a multifaceted approach, utilizing a combination of bibliometric descriptive studies (research profiling), science mapping (keyword co-occurrence analysis), and literature reviews (systematic literature review), but notably omitting direct citation analysis. In their study, "network analysis of high-frequency keywords indicates the four following thematic clusters within the research field focused on the studies of energy efficiency in cloud computing systems: (1) virtualization; (2) power; (3) scheduling; and (4) offloading" ([3], p. 17). While the previous sections show that direct citation analysis does produce interesting results correlated with the actual content of the papers in the sample, it remains an interesting question whether the results of such a content-agnostic approach are similar to those of more content-focused approaches. Table 5 shows publications from clusters identified in Section 3 that also appear in the top-cited lists for each cluster in Lis et al. (2020) [3]. Note that the nature of cluster membership differs between the two cases: in this work, each reference can only appear in one cluster, while, in Lis et al., the clustering is performed on keywords, which means that each paper can belong to more than one cluster. surprising, as we have selected the largest connected group in the sample. A large connected group of publications can be expected to focus on well-known topics. New trends are more likely to occur in smaller, less connected groups, as the network of citations is still being built.
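The notion of the largest connected group in a direct citation network can be illustrated with a short sketch. It treats each citation as an undirected link and finds the largest set of publications that are connected through such links. This is only an illustration under these assumptions, not the implementation used by the software employed in the study; the function name and the toy citation pairs are hypothetical.

```python
from collections import defaultdict, deque


def largest_connected_group(citations):
    """Return the largest set of papers linked by direct citations.

    `citations` is an iterable of (citing_paper, cited_paper) pairs.
    Assumption: direction is ignored when testing connectivity.
    """
    neighbours = defaultdict(set)
    for citing, cited in citations:
        neighbours[citing].add(cited)
        neighbours[cited].add(citing)

    seen = set()
    best = set()
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


# Hypothetical toy network: one group of four papers and one isolated pair.
links = [("P3", "P1"), ("P4", "P1"), ("P4", "P2"), ("P6", "P5")]
core = largest_connected_group(links)
```

In this toy network, the pair "P5" and "P6" falls outside the largest group, in the same way that small, weakly connected groups of recent publications may be left out of the analyzed core.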

Conclusions
The study aimed to answer research questions concerning the intellectual structure of the cloud computing energy efficiency research field and the research fronts within the field. In response to the first question, we identified the most influential publications in the research field and analyzed their types (i.e., whether they are original research papers or rather "context" papers, e.g., survey or review papers, framework papers, challenges papers, and study papers). Moreover, a comparative analysis of the types of papers among the most cited "classical" publications and "emerging stars" was conducted. In response to the second research question, we identified five research fronts concentrated around the issues of virtual machine management ("VM"); task-focus, concerning data replication, task consolidation, and task scheduling ("task"); energy efficiency ("energy"); modelling and optimization ("model"); and energy efficiency in the networking context ("network"). Moreover, we compared and contrasted mappings of the cloud computing energy efficiency research field produced with different science mapping methods, i.e., direct citation analysis employed in our study and co-word (keyword co-occurrence) analysis used by Lis et al. [3].
The contribution of our study is mainly of a theoretical character. By identifying core references in the field, including both the seminal works and "emerging stars", we provide scholars designing and conducting research in cloud computing energy efficiency with reading guidelines. Moreover, the findings related to the distribution of research papers and "context" papers among core references may be a useful observation for scientometrics, which certainly requires further exploration and validation in other research fields. Identification of research fronts brings another "added value" to the research field, as it offers a kind of "map" of the research landscape to other scholars and helps them navigate the variety of research themes. Direct citation analysis is also shown to be a tool for tracing the development of the field when the papers are analyzed in chronological order. This can be used to identify current trends and find related works regardless of the keywords proposed by the authors. To the best of our knowledge, our work is the first application of this method to the field of energy efficiency in cloud computing.
When discussing the findings of the study, the limitations of the research process should be taken into account. Firstly, using only one database as a source of bibliometric data should be considered a weakness. Although we considered sourcing data from some other databases, due to technical limitations of the employed software, we were not able to implement our plan. VOSviewer creates maps based on bibliographic data retrieved from Web of Science, Scopus, Dimensions, PubMed, RIS, Crossref JSON, or Crossref API. Nevertheless, analyzing files retrieved from more than one database is not possible in the same session. Thus, we decided to source data from Scopus, which is listed among the largest databases and is commonly valued for indexing high-quality publications [93][94][95][96]. Certainly, Scopus has some vulnerabilities, e.g., its bias towards output written in English [97], which we are aware of. Thus, replication of the study with the use of data retrieved from other sources, e.g., those more open to publications in languages other than English, is worth considering. Secondly, direct citation analysis was the only method employed to identify research fronts in the field, which inhibits triangulation of research. Consequently, we recommend exploring the research field of cloud computing energy efficiency with other science mapping methods, e.g., co-citation analysis [98] and bibliographic coupling [99], which are also recognized as effective solutions for mapping research fronts [7].