Overview of Blockchain Oracle Research

Abstract: Whereas the use of distributed ledger technologies was previously limited to cryptocurrencies, other sectors, such as healthcare, supply chain, and finance, can now benefit from them because of Bitcoin scripts and smart contracts. However, these applications rely on oracles to fetch data from the real world, and oracles cannot reproduce the trustless environment provided by blockchain networks. Despite their crucial role, academic research on blockchain oracles is still in its infancy, with few contributions and a heterogeneous approach. This study undertakes a bibliometric analysis that highlights the institutions and authors actively contributing to the oracle literature. The state of the art, research themes, research directions, and converging studies in blockchain oracle research are also examined to discuss, on the one hand, current advancements in the field and, on the other hand, areas that require more investigation. The results show that although worldwide collaboration is still lacking, various authors and institutions have been working in similar directions.


Introduction
"Although oracles play a critical role . . . the underlying mechanics of oracles are vague and unexplored" [1]. A preliminary study on Decentralized Finance (DeFi) oracles from the University of Singapore shows that despite the massive amount of money managed by oracles on DeFi platforms, their functions and roles are still widely neglected. Despite the plethora of papers involving blockchains, less than 15% consider oracles, and an even smaller percentage further investigated related issues [2]. The subject of blockchain oracles is critical because the entire concept of blockchain applications revolves around the idea of decentralization and trustless transactions. Those pillars, however, are undermined when, gathering real-world data, blockchain applications rely on centralized and trusted third parties. This issue, either addressed as an oracle problem [3] or an oracle paradox [4], makes the community of blockchain enthusiasts quite skeptical about real-world applications [5]. Proposing a robust blockchain application against the oracle problem requires the redaction and discussion of the so-called "trust model", a document or scheme that broadly explains how data are fetched by oracles in a decentralized and trustless manner [6][7][8][9]. A robust trust model should first include information concerning how data collected by oracles are validated before being pushed into the smart contract. Second, it should specify how the security and unforgeability of data are ensured from the time they are collected to the moment they are permanently stored on the ledger. Third, it should outline the incentive mechanism implemented to prevent collusion or the deliberate tampering of data feeds for selfish purposes [9][10][11]. Defining and adopting a robust trust model is not only essential for a blockchain application to work properly but is also often considered the key to mass adoption [12]. 
However, academic contributions concerning oracles or those discussing a detailed "trust model" [2] remain scarce. On the one hand, proposing a real-world blockchain application without analyzing the oracle's role in depth poses serious doubts about the feasibility and genuineness of the underlying project [13]. On the other hand, proposals with a detailed trust model would greatly help researchers and practitioners analyze oracle-related features and issues and reproduce successful projects respectively [14].
Therefore, knowing which institutions are actively undertaking research on blockchain oracles and which ones are already implementing them in real-world applications is interesting and important. Scholarly interest in blockchains has resulted in some literature reviews on this topic, but none has yet undertaken research through a bibliometric analysis on blockchain oracles [15][16][17]. A bibliometric analysis aims to identify how the body of knowledge on blockchain oracles has evolved in the last few years in terms of the leading publication outlets, the geographical distribution of research communities, the density of collaboration, and methodological approaches. Unlike classic literature reviews, a bibliometric analysis provides a quantitative and structural overview of the investigated scientific field, reducing the chances of subjective biases [18]. The advantages of undertaking this type of study are the representation of a phenomenon in a formal and objective way, ensuring the robustness and reproducibility of results. A bibliometric analysis is also meant to guide scholars who are interested in undertaking research in that sector to understand the research gaps, methodologies used, and appropriate outlets for publication. To ensure the significance, usability, robustness, and replicability of the research study, this paper will follow a standard bibliometric approach that has been used in several studies across different disciplines [19][20][21][22][23]. The methodology will be extensively explained so that any individual can reproduce every passage, regardless of their expertise. The data extracted will be motivated by the associated meaning and will be presented with the aid of figures and tables. Following prior bibliometric analyses in other sectors, the collected sample will be organized based on the categories and sub-categories of the topics [22,24]. In this study, three areas will be investigated. 
First, an overview of the most productive institutions (in terms of papers published), the most cited authors, and the most common publication outlets will be provided. Prospective authors will then have a better overview of the venues that support research in this domain. Second, ongoing studies will be further investigated to identify common streams of research, themes, and research directions to encourage cooperation and progress in the field. Third, by discussing the reviewed literature, we will highlight areas that require further investigation. The following are the objectives of the study: Objective (1) Identify the most cited authors and the most productive institutions focused on the subject of the study; Objective (2) Identify research themes, directions, and converging studies to promote cooperation and progress; Objective (3) Highlight the areas that require further investigation. We consider this study necessary given the massive resonance of blockchain-related research and the slight growth in oracle-related investigations [14,15]. The contributions provided in this study will help researchers and entrepreneurs know which institutions are actively involved in a specific real-world blockchain application, how oracles are implemented, and which aspects the academic studies are focusing on. Discussing the key findings of the reviewed papers can also help other academics improve the quality and speed of research in related fields [2,25]. In contrast to other bibliometric analyses in the field of blockchains, this study focuses on oracles, a specific aspect of the technology that particularly affects real-world applications. Specific bibliometric analyses on cryptocurrencies and blockchains in healthcare or supply chains already exist, but to the best of the author's knowledge, there are no studies focused on oracles yet.
To better understand the value and contribution of this paper, we should point out that real-world blockchains to which this study refers are applications other than cryptocurrencies, such as healthcare, supply chain, DeFi, and resource management. Therefore, specific studies on blockchain characteristics, ecosystems, and cryptocurrencies are not considered in this paper because they are not directly related to blockchain-oracle ecosystems. Furthermore, a certain degree of subjectivity, especially in the selected categories, cannot be excluded despite the rigorous research design. Given the absence of prior studies, a predetermined framework was also not available to build upon. Given the scarcity of data and the increasing academic interest in the subject, the data presented in this study may also face early obsolescence. This paper is organized as follows: Section 2 covers the literature background, and Section 3 outlines the methodology used. Section 4 summarizes the results, and Section 5 reviews the literature, identifying common themes, research directions, and converging studies. Section 6 discusses the review results and identifies areas that need further investigation. Section 7 concludes the paper by providing suggestions for further research.

Literature Background
The power of Bitcoin lies not only in its decentralized features but also in its programmability. Experts, such as Antonopoulos, refer to it as "programmable money" [26]. Just by using "scripts" and without the intervention of third parties, premade "agreements", such as timelocks, Pay-to-Script-Hash, and multi-signatures, can be executed on transactions [27]. However, with Vitalik Buterin's introduction of the Ethereum virtual machine and smart contracts, blockchains became more developer-friendly and could be easily programmed for applications beyond the simple exchange of cryptocurrencies [5]. Nonetheless, the Ethereum blockchain needs to be a closed ecosystem operating on data that are already on the blockchain to reproduce Bitcoin's trustless and deterministic setting [5]. This condition is necessary to ensure that all the required data for smart contracts are publicly verifiable and auditable by all nodes [5,28]. Without data coming from the external world, the range of possible automated contracts would have been extremely limited [29]. Therefore, a means to deliver extrinsic data to the blockchain was needed to broaden the use of smart contracts [3,30,31]. This means is called an oracle. The oracle is an entire ecosystem that permits the collection of external data and their transfer to and insertion into the decentralized application [32,33]. As displayed in Figure 1, the oracle ecosystem usually comprises the following three parts.
Data Source: This is the source from which the data are collected and stored. It may or may not eventually be used by a decentralized application. The data source can be a Web Application Programming Interface (API), a sensor, or a human with knowledge of a specific fact or event [34].
Communication Channel: This is usually referred to as a "node". It collects data from the data source and delivers them to a smart contract so that the latter can be executed. Sometimes, oracle nodes coincide with blockchain nodes, but this is not always the case [29,35].
Smart Contract: This contains the code that establishes how the collected data can be managed. Usually, it has prespecified quality criteria for data to be accepted or rejected. If necessary, it may also perform computations to deliver the appropriate data to the contract [36,37].
Depending on how these three parts are organized and interact with each other, multiple types of oracles can be designed [12]. These three parts of an oracle are not always separate from each other, as the same entity may sometimes cover two or three roles at once. A human, for example, can serve as a data source and communicate the data directly to a smart contract [38]. In practice, having more than one entity that covers the role of data source/node is possible and desirable. Relying on multiple entities is, in fact, crucial to ensure the execution of smart contracts, especially when one or more data sources/nodes are malfunctioning or offline [39].
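As a rough illustration of the three parts described above, the following Python sketch models a data source, a node, and a contract with acceptance criteria. All class names, thresholds, and readings are hypothetical and are not taken from any of the reviewed papers; the median aggregation is merely one possible choice of on-contract computation.

```python
from dataclasses import dataclass
from statistics import median
from typing import Callable, List, Optional


@dataclass
class DataSource:
    """A Web API, sensor, or human reporter, modelled as a callable (hypothetical)."""
    name: str
    read: Callable[[], Optional[float]]  # returns None when the source is offline


class OracleNode:
    """Communication channel: polls the sources and forwards usable readings."""

    def __init__(self, sources: List[DataSource]):
        self.sources = sources

    def collect(self) -> List[float]:
        readings = [source.read() for source in self.sources]
        return [r for r in readings if r is not None]


class SmartContract:
    """Holds assumed quality criteria and a computation on the accepted data."""

    def __init__(self, min_reports: int, max_spread: float):
        self.min_reports = min_reports  # assumed criterion: enough live sources
        self.max_spread = max_spread    # assumed criterion: sources must agree
        self.value: Optional[float] = None

    def submit(self, readings: List[float]) -> bool:
        if len(readings) < self.min_reports:
            return False
        if max(readings) - min(readings) > self.max_spread:
            return False
        self.value = median(readings)   # computation performed before storing
        return True


# Illustrative readings only; one source is offline.
sources = [
    DataSource("api", lambda: 20.1),
    DataSource("sensor", lambda: 19.9),
    DataSource("offline_sensor", lambda: None),
]
contract = SmartContract(min_reports=2, max_spread=1.0)
accepted = contract.submit(OracleNode(sources).collect())
```

In this sketch the contract still executes although one source is offline, which mirrors the argument for relying on more than one data source/node.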
The above-described oracle ecosystem is typical of blockchains that support smart contracts (e.g., Ethereum and Tron). Oracles are implemented differently on blockchains such as Bitcoin, where smart contracts (apart from a few scripts) are unavailable. In that case, oracles are usually implemented through M-of-N (e.g., three out of five) multi-signature wallets, which require more than one signature to broadcast a transaction [40]. Therefore, the owner of a key plays the role of an oracle and executes the transaction when a certain condition is met, so the oracle covers both the role of the node and that of the data source. Consider, for example, an agreement that sets a payment upon the delivery of a parcel (Figure 2). A multi-signature wallet must be set up in which one of the keys is entrusted to a third party that performs the role of an oracle. When the buyer acquires the product, she signs the transaction with her key. However, given that the second signature has not been inserted, the transaction remains on hold. When the parcel is delivered, the entity in control of the oracle key signs the transaction, allowing for its successful execution. Evidently, the choice of the entity that possesses the oracle key plays a crucial role in those types of ecosystems [3].
This is a trivial example of an oracle solution on the Bitcoin blockchain applied to traceability; however, the most common use cases belong to the finance/gambling field [41].
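The parcel-delivery agreement described above can be sketched as a toy M-of-N escrow in Python. This is only a simulation of the signing logic: it involves no real cryptography or Bitcoin script, and the 2-of-3 arrangement (buyer, seller, oracle) as well as every name in the code are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MultiSigEscrow:
    """Toy M-of-N escrow: the payment is released once `required` distinct
    authorised key holders have signed. No real cryptography is involved."""
    authorised: frozenset
    required: int
    signatures: set = field(default_factory=set)

    def sign(self, key_holder: str) -> None:
        if key_holder not in self.authorised:
            raise ValueError(f"{key_holder} holds no key for this wallet")
        self.signatures.add(key_holder)  # signing twice counts only once

    @property
    def released(self) -> bool:
        return len(self.signatures) >= self.required


# Assumed 2-of-3 setup: buyer, seller, and a third party acting as the oracle.
escrow = MultiSigEscrow(
    authorised=frozenset({"buyer", "seller", "oracle"}),
    required=2,
)
escrow.sign("buyer")              # the buyer signs when acquiring the product
on_hold = not escrow.released     # the second signature is still missing
escrow.sign("oracle")             # the oracle signs once the parcel is delivered
```

The sketch makes the trust assumption visible: whoever controls the oracle key decides when the second signature is added, which is why the choice of that entity matters.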
A thorough explanation of all oracle types is beyond the scope of this study; however, further information can be found in dedicated papers and web articles listed in references [10,31,40,41]. Given that oracle ecosystems operate differently from blockchains, characteristics such as immutability, transparency, and trustless execution are not ensured [42]. This discrepancy in attributes implies that when blockchain-based applications need data from the external world, the characteristics of oracles must be taken into serious consideration. If the data source is unreliable, the node is not trusted (or private), and the smart contract is poorly audited, the fact that an application runs on the blockchain is practically irrelevant [3,14,43]. When it depends on third parties, blockchain technology alone cannot solve centralization, trust, and security issues.
This condition, widely explained by blockchain experts such as Andreas Antonopoulos and Paul Sztorc [44,45] and labeled by Dalovindj [41] as "the oracle problem", must be considered at the time of integrating blockchain with applications in the area of the supply chain, healthcare, and academic credentials. Various consequences may be faced, depending on the faulty oracle part and the application type [14,38,46]. In the healthcare sector, the presence of oracles constitutes another possible source of data breach, exposing patient records to theft or manipulation [47]. In the DeFi sector, the dependency on oracles would expose decentralized applications that rely on centralized or insecure data sources to risk millions of dollars of invested capital [46,48].
In the traceability sector, blockchain technology has been proposed principally on the basis of a misconception: because the origin and movement of a cryptocurrency on the blockchain can be traced in a secure and trustless manner, the same can supposedly be done with a tangible asset, such as food, clothes, and medicine [44]. Because the dependency on oracles makes it unlikely that real-world applications can reproduce the same level of tracking accuracy, only a few traceability projects show some robustness against that issue [8,42]. Lately, with Non-Fungible Tokens (NFTs) and stablecoin technology, the blockchain-based traceability of tangible products is also following another path [49][50][51]. Rather than directly tracking a real product with blockchains, companies are instead creating a representation of it on the blockchain (an NFT) to guarantee genuineness and ownership.
Because of the oracle problem, numerous critiques and concerns also arise for other blockchain applications, such as intellectual property rights management, e-government, and resource management [30,52,53,54].
For these applications to run genuinely decentralized and trustless, oracle ecosystems should be structured to ensure the same characteristics as blockchains. However, unlike blockchain technology, which has a history and development of nearly thirty years (considering the work of Haber and Stornetta [55] as its precursor), oracle ecosystems are relatively newer and unexplored spaces with few actors and limited literature [2]. This is the gap in which this study finds its legitimacy. It aims to shed light on academic contributions concerning blockchain oracles and promote cooperation and progress.

Methodology
An appropriate methodology had to be chosen to fulfill the purpose of this study, and an in-depth description of the steps followed had to be provided to ensure the reproducibility of the results. A bibliometric analysis was perceived as the appropriate method for reaching the goals of this research because its standardized and systematic approach ensures the reproducibility of results [19,56]. Building on prior bibliometric analyses [57,58], the methodology description covers, in order, database selection, inclusion and exclusion criteria, and data extraction variables. Regarding data collection, the intention was to include as many articles as possible, as long as they were academic in nature. Therefore, gray literature, such as whitepapers, opinion posts, and news, was not considered in this research. Preprints, although not peer-reviewed, were included because they are written by academics for submission to academic journals. Other non-peer-reviewed materials, such as opinion posts, are instead not meant to follow an academic path. Following Butticè and Ughetto [24] and Martinez-Climent et al. [56], the selected databases were Scopus and Web of Science (WoS), but Google Scholar was also queried. As the analysis also comprises preprints and unpublished manuscripts, limiting the research to Scopus and WoS would not have been a coherent choice. Including a third database also increased the chance of retrieving other relevant articles. For the three databases, the search was conducted on 2 March 2022. When "blockchain" and "oracle" were used as keywords in the TITLE-ABS-KEY field of the Scopus database, 312 articles were identified. In the WoS database, two strings were used in the "Topic" field so that articles containing the word "oracles" were also identified. The search returned 143 results.
The Google Scholar database was queried using the same keywords as those used on the Scopus database, but the queries returned more than 10,000 entries because of its structural differences from Scopus and WoS. For that reason, and due to saturation of results, the author decided to stop the search on Page 35 (350 entries, at ten results per page). Table 1 summarizes the queried databases, along with the selected search strings. Appropriate exclusion criteria were adopted to narrow down the data sample, with the aim of balancing inclusiveness with relevance. However, no restrictions based on language or timeframe were applied because of the nascency of the topic and the research goal. Given that the goal was to gather all relevant information about oracle research, related authors, and institutions, adding a time or language restriction would not have been a coherent choice. First, the abstract and introduction were read to identify and exclude evidently off-topic papers. Many documents had entered the sample because they mentioned "random oracles" or "test oracles", which, despite a similar name, are not the oracles that this study investigates. Other papers had entered the sample because they mentioned Oracle, the company, which, although involved in some blockchain projects, is again unrelated to the oracles discussed in this study. After following these steps, 163, 69, and 189 articles were removed from the Scopus, WoS, and Google Scholar samples, respectively. Given that gray literature was also retrieved from Google Scholar, 7 other articles were removed from that sample because they were neither written by academics nor published in academic venues. After the duplicates were removed, the three samples were merged, obtaining a nonredundant sample of 282 entries.
With the steps mentioned above, the obtained sample was composed of papers that included the "oracle" keyword and specifically referred to the communication channels between the blockchain and the real world. However, the aim of this paper was to present the portion of literature that not only mentioned the oracles or explained their use but also offered a direct contribution to the oracle literature. Therefore, to further refine the results, all PDF articles were downloaded and inspected one by one with a word processor. All occurrences of the word "oracle" were contextualized and analyzed. The criterion was that if oracles were mentioned in the introduction or literature review but did not constitute a central part of the analysis, the article was not included in the sample. To better explain this research step, the table in Appendix A provides a list of the research and inclusion criteria. With this criterion, nearly half of the sample (120 papers) was discarded. Therefore, the final selection was reduced to 162 entries. In summary, because of these research steps, articles that not only mentioned blockchain oracles but also discussed their role and contributed to their development were retrieved. Table 2 broadly summarizes the methodology followed.
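The selection funnel described above can be re-traced with a few lines of Python. The inputs are the counts reported in the text; the per-database remainders and the number of duplicates are derived here by subtraction and are not figures stated in the paper, so they should be read as implied values only.

```python
# Counts reported in the Methodology text.
retrieved = {"Scopus": 312, "WoS": 143, "Google Scholar": 350}
off_topic = {"Scopus": 163, "WoS": 69, "Google Scholar": 189}
gray_literature = {"Google Scholar": 7}

# Entries left in each database sample before merging (derived).
remaining = {
    db: retrieved[db] - off_topic[db] - gray_literature.get(db, 0)
    for db in retrieved
}

merged_unique = 282   # reported size of the merged, nonredundant sample
not_central = 120     # reported papers in which oracles were not central

# Derived values: duplicates implied by the merge, and the final sample size.
implied_duplicates = sum(remaining.values()) - merged_unique
final_sample = merged_unique - not_central

print(remaining)
print(implied_duplicates, final_sample)
```

The final value agrees with the 162 entries reported for the final selection.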

Data Extraction
Appropriate extraction variables (displayed in Table 3) were identified to extract as much information as possible from the selected sample. As this is probably the first bibliometric analysis on blockchain oracles, building upon prior research on the same topic was impossible. However, given that the aim of bibliometric analyses is relatively homogeneous, extraction variables could be taken from similar papers investigating other literature domains [24,56,59]. First, the "year of publication" was considered to place the literature within a specific timeframe, whereas the "element type" shows the most usual outlet for retrieved publications. "Authors", "institutions", and "countries" of provenance geographically contextualize the paper sample, highlighting the contributors to the academic advancements in the sector. Citations and keywords were used to analyze metrics. Finally, as in Butticè and Ughetto [24], articles were further divided based on their specific fields of analysis. This categorization of papers serves to investigate whether there are streams of literature to which researchers are contributing more and others that require more attention. Although it may constitute a bias, in line with prior research, articles were associated with only one field category to avoid double entries [24]. First, two main categories were identified, mainly to distinguish between studies concerning oracles themselves and oracles applied to other sectors.
Second, the papers were divided to further differentiate them based on their specific fields of analysis. Although inspired by related research, category selection embodies a certain degree of subjectivity. Therefore, a description of these categories, starting with the main ones, is provided hereafter.
Oracle Theory (OT): Under this category, papers specifically focused on blockchain oracles, either from a theoretical or a practical point of view, were included.
Oracle Applied (OA): This category included papers that focused on real-world applications, such as healthcare, finance, and business process management, and also provided a detailed analysis of the role of oracles in these fields with theoretical or experimental approaches.
The main categories were further divided into sub-categories. Hereafter, those that belong to OT are listed as follows.
Architecture: With an empirical or theoretical approach, papers in this category performed analyses on the oracle framework to improve technical aspects, highlight current challenges, and identify new avenues for research. Unlike proposals or OA papers, this group includes works that have investigated existing oracle schemes that are not directly applied to a specific sector.
Proposal: These papers propose new oracle frameworks that may be implemented in real-world applications. These may still be at a conceptual or prototype stage.
Oracle Problem: These articles focused on aspects related to the trustworthiness of oracles and their limits to decentralization. Whereas all papers should outline trustworthy oracle environments, the papers in this category focused on the involved actors' incentives to cheat and the consequences of a deviation on the underlying applications.
Sub-categories belonging to OA, such as healthcare and energy, are intuitive, but those that require clarification are described hereafter.
Data Management: Articles concerning the transfer of data from the real world to the blockchain pertain to the main category of OT. In this field, articles that analyzed access data management for reputation, privacy, or GDPR purposes were considered. Cloud-computing-related research was filed under its own category, given that it mainly concerned data elaboration.
Finance: In this category, articles that involved oracles applied in financial applications and those that explored timeliness and gas usage of transactions were grouped. Those concerning asset management on blockchains were also included.
IoT: This category comprised papers investigating oracles as efficient IoT systems but did not refer to a specific real-world application. A paper concerning IoT in the supply chain, for example, would instead be inserted into the "supply chain and traceability" category.
Business Process Management: This category included works that proposed blockchain integration in business processes, clearly identifying the role of oracles. Although the supply chain is part of the business processes, articles specifically investigating this field were filed under their own categories.
Artificial Intelligence: Papers filed under this group concerned research toward the integration of blockchain technology into existing AI tech through the use of oracles or AI to improve oracle efficiency and reliability.
Transport: This category included papers investigating blockchain integration into intelligent vehicle development and the transport industry in general. Research on IoT devices/sensors specifically implemented in the transport field was also filed in this category.
Supply Chain and Traceability: Papers investigating the benefit of integrating blockchains in the local or global supply chain belong to this category. Moreover, studies that concerned the traceability of physical products or documents were also included. Works investigating the traceability of financial assets (e.g., stocks or crypto) were included instead in the finance field.
Only the first author was taken into consideration to extract the country and institution provenance of the paper. Considering all the authors would have created a bias toward articles with a higher number of authors. We were aware that this choice may eventually affect the final results, but any other option would have yielded the same results. Regarding the authors' affiliation, the choice was to take the one declared in the last published paper to avoid the problem of double affiliation. With this criterion, some affiliations may have changed by the time the paper was published. Finally, citations were taken from Google Scholar because it was the only database in which all the papers in the sample could be retrieved. We were aware that prior studies cited in this paper utilized ad hoc programs, such as VOSviewer, for the elaboration of the result graphs. However, considering the extremely limited size of the retrieved sample, Excel tables and charts were considered to be much more intuitive. Furthermore, considering preprints from Google Scholar, software such as Bibliometrix could not be implemented. Therefore, a non-automated analysis was perceived as the most reasonable option.

Results
In this section of the paper, the results of the bibliometric analysis are reported. With a quantitative approach, the status and trends of the literature on blockchain oracles are shown. The analysis first covers the time and space of the research and then focuses on the outlets, authors, and field of analysis.

Number of Publications Per Year
The first academic papers considering blockchain oracles appeared in 2016 and were equally distributed among the categories "oracle theory (OT)" and "oracle applied (OA)" [60,61]. As Figure 3 shows, interest in the topic remained low until 2018. Until 2019, the number of papers concerning OT was slightly higher than the number discussing OA. The increase and the shift in the trend can be observed from 2019, with 2020 having four times more publications than 2018 and 2021 having more than double the number of publications of 2019. Moreover, the number of papers regarding OA started to exceed that of OT by 2021. Although the 2022 sample concerns only the first two months, the imbalance in the number of publications appears to be confirmed. These data reveal that the topic has gained more impact and attention among academics, probably because of the further development of blockchain-related platforms. However, in absolute terms, the overall numbers remain low, with a peak of 62 publications in 2021 and only 162 publications in all six years of academic production. These numbers show that this is still a niche subject.

Productivity Rate by Geographical Distribution
Tables 4 and 5 present the distribution of papers by country and continent, respectively. We can observe that the continents with the highest productivity are Europe and Asia, with more than 70% of total paper production. Asia, however, appears to be more focused on OA than Europe, which, although it has practically the same number of OA contributions, presents a balance between the two main categories. Concerning countries, the situation partially reflects what is observed with continents. The most productive countries are China and Italy, followed by the USA and Canada. Those four countries alone accounted for more than 44% of total publications. Concerning fields, countries appear to be sufficiently balanced, except for the UAE, which is more focused on OA, whereas Australia, the USA, and Austria mostly contribute to OT research.

Publications by Outlets and Publishers
As Figure 4 shows, the majority of papers published in this field are journal articles (73) and conference papers (60). A smaller portion consists of book sections (20) and preprints (9). These data contrast with previous blockchain technology reviews, which showed that the number of conference contributions was four times that of journal publications [2,16]. This finding supports the idea that there seems to be no dedicated conference venue on blockchain oracles. Table 6 and Figure 5 show the distribution of papers by journal and publisher, respectively. We observed that the largest share of papers (61) is published in IEEE outlets and venues, whereas 25, 15, and 13 papers are published in Springer, Elsevier, and MDPI, respectively. However, if we consider only journal publications, the weight of the contributions would slightly change, given that 43 IEEE documents were conference papers, and of 25 Springer entries, 20 were book sections. Then, excluding non-journal publications, we would have IEEE with 18 publications, followed by Elsevier with 15, MDPI with 13, and Springer with 5. This information is particularly insightful when considering Table 7, which shows that only four journals published more than two papers on the subject. Conference venues and book sections, except for two venues, contributed no more than one document.
As shown in Table 6, the journals that published the most contributions are IEEE Access and Future Generation Computer Systems, both with eight contributions. Among the other venues, the only notable one is Business Process Management: Blockchain and Robotic Process Automation Forum, which contributed five book chapters. Table 7 provides an overview of the paper types determined by fields based on the main categories and sub-categories indicated in Section 3.1. It emerges that more than half of the papers (103, or 63%) are empirical, 23% are theoretical, and 14% are reviews. At the general level, the majority of academic research on oracles is of an empirical nature. Nevertheless, these data still need to be distinguished by field of research.

Article Type, Fields, and Keywords
Concerning division by category, despite the higher number of sub-categories, the total number of papers belonging to OT (75) is slightly below those on OA (87). This is understandable, considering that oracles are still in their early-stage development, and a heterogeneity of views on how they should function and operate still exists. Although the majority of articles are still empirical, they are well balanced with theoretical and review types for the "architecture" and "oracle problem" sub-categories.
The second thing that emerges is that proposals are mainly of empirical/experimental nature, which bodes well for the birth of oracle frameworks in cooperation among or fully developed by academic institutions.
Because "oracle applied (OA)" is ideally a more practical area than OT, the imbalance between empirical and theoretical papers (except for BPM, AI, and transport) is understandable. Furthermore, the smaller category size explains why only seven review papers were retrieved. By analyzing sub-categories, we can observe that some areas have fewer contributions than others. The finance sector leads with 22 contributions, followed by data management (14) and IoT (12). Given the higher advancement level of blockchain applications in these sectors and the empirical nature of academic contributions, it is also understandable that other sectors, such as healthcare, transport, and energy, have fewer than five contributions.
Keywords are also an important parameter to consider when evaluating a sample. A total of 650 keywords were extracted from the sample, which means an average of 3.9 per article. While some articles had six or more keywords, others (mainly preprints) had none. After duplicates and plurals were removed, 307 unique keywords were found. To avoid biases with the research strings used, however, we excluded keywords such as "blockchain" and "oracle(s)" from the analysis. Keywords composed of multiple words (e.g., Smart Contract) were treated as single keywords, and those containing banned keywords, such as "price-oracles", were not excluded. The choice to keep keywords that contain the two banned words rests on the idea that, while those keywords alone are common to all papers, composite keywords, such as centralized oracle or blockchain interoperability, are specific to particular sectors, which will benefit from homogeneous keyword usage. Plurals were also merged with singular forms (e.g., contract/contracts). Figure 6 shows the word cloud made with all the keywords in the sample. Notably, the most frequently used keywords are smart contract and Ethereum, with 67 and 21 occurrences, respectively. Whereas the keyword smart contract says very little about our sample, the recurrence of "Ethereum" surely reflects the most common study environment for oracles, which appears to be the Ethereum network. Other keywords used are internet of things (8), consensus (7), and cryptocurrencies (5), whereas others have a lower occurrence rate. Interestingly, of the entire sample of 307 keywords, the majority (250) occurred only once. Keywords were also divided into categories to achieve a good data breakdown.
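The keyword-cleaning steps described above (merging plurals, dropping the two search-string terms when they stand alone, keeping composite keywords, and counting occurrences) can be illustrated with a minimal Python sketch. It is not the procedure actually used for this study, which was carried out manually; the function names, the naive plural rule, and the sample data are assumptions made purely for illustration.

```python
from collections import Counter

# Assumption: the two search-string terms are excluded only when they stand alone.
BANNED = {"blockchain", "oracle"}


def normalize(keyword: str) -> str:
    """Lowercase a keyword and merge a simple plural on its last word (naive rule)."""
    words = keyword.strip().lower().split()
    if not words:
        return ""
    last = words[-1]
    if last.endswith("ies") and len(last) > 4:
        last = last[:-3] + "y"          # cryptocurrencies -> cryptocurrency
    elif last.endswith("s") and len(last) > 3 and not last.endswith(("ss", "us", "is")):
        last = last[:-1]                # contracts -> contract, consensus unchanged
    words[-1] = last
    return " ".join(words)


def keyword_counts(papers: list[list[str]]) -> Counter:
    """Count in how many papers each normalized, non-banned keyword appears."""
    counts: Counter = Counter()
    for keywords in papers:
        for kw in {normalize(k) for k in keywords}:
            if kw and kw not in BANNED:
                counts[kw] += 1
    return counts


# Hypothetical sample of three papers.
sample_papers = [
    ["Blockchain", "Oracles", "Smart Contracts", "Ethereum"],
    ["smart contract", "price-oracles", "Ethereum"],
    ["Consensus", "blockchain interoperability"],
]
counts = keyword_counts(sample_papers)
singletons = sum(1 for c in counts.values() if c == 1)
print(counts.most_common(2), singletons)
```

Composite keywords such as "price-oracles" survive the filter because only exact matches with the banned terms are dropped, which mirrors the choice described in the text.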
After the most common keywords (e.g., Ethereum and Smart Contract) were excluded, excessive heterogeneity was still apparent, even after dividing them by categories. Composite keywords, such as "business process monitoring" and "business process management", were merged (e.g., business process) for consistency. In Table 8, keywords with higher occurrences divided by categories are listed. These data are useful for indexing purposes and for research to be easily retrieved by the appropriate audience. The "transport" category was excluded from the table because of excessive heterogeneity.
Figure 6. Keywords word cloud.

Contribution by Author/Institution and Metric
The most cited papers, authors, and contributing institutions are displayed in Tables 9-11, respectively. Building on prior bibliometric analyses [62][63][64][65], the papers were ordered in terms of citations; therefore, the ten papers displayed in Table 10 are the most cited ones. However, institutions were ordered in terms of the papers produced. The list was not limited to ten but is restricted to those who provided at least three contributions. The most cited authors were selected with a mixed approach. Ordering authors by citation would have resulted in a biased list because of papers with many coauthors and citations. Therefore, to be inserted into the list, one requirement is that the author has to have produced at least two publications and has to be the first author for at least one of them. The requirement of at least two publications is to avoid the insertion of authors who have randomly contributed to a related paper. Then, assuming that the first author is the lead or the most contributing author, having first authored a paper appears to also be a necessary requirement. However, to also provide visibility to coauthors, Appendix B shows a list of coauthors who contributed to at least three papers. We were aware that a higher number of papers produced or a higher number of citations would not necessarily imply a higher impact or contribution in the field of oracle research. Such a claim would require a thorough study of academic contributions to the development of successful oracle applications, which is beyond the scope of a bibliometric analysis. In this research, a parameter, such as citations or produced papers, will correspond to a notable interest in the produced research of an author or a major effort from the institution to investigate the related field. The retrieved parameters do not reflect or question, in any case, the quality of an author's or institution's publication.   
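The inclusion rule for the author list (at least two publications in the sample, and first authorship of at least one of them) can be expressed as a short filter. The sketch below is only an illustration of that rule; the function name and the placeholder author names are hypothetical, and the actual selection in this study was performed manually.

```python
from collections import Counter


def eligible_authors(papers: list[list[str]]) -> set[str]:
    """Return authors with at least two papers who are first author of at least one.

    Each paper is given as its ordered author list; the first entry is the first author.
    """
    total: Counter = Counter()
    first: Counter = Counter()
    for authors in papers:
        if not authors:
            continue
        first[authors[0]] += 1
        for name in set(authors):
            total[name] += 1
    return {name for name in total if total[name] >= 2 and first[name] >= 1}


# Hypothetical sample: Author B has two papers but never appears as first author.
sample_author_lists = [
    ["Author A", "Author B"],
    ["Author A", "Author C"],
    ["Author D", "Author B"],
    ["Author E"],
]
print(sorted(eligible_authors(sample_author_lists)))
```

Citation counts would then be used only to order the authors who pass this filter.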
As explained, information gathered with the above-mentioned approaches is provided in separate tables for clarity, but they should be discussed together to better grasp the meaning of the data.
The most cited author is Xu Xiwei (908 citations) from the University of New South Wales (UNSW) CSIRO-DATA61. She co-authored the two most-cited papers and four of the top ten. She started contributing to the subject in 2016, and given that her last paper on the topic was published in 2021, she appears to still be investigating the subject. All the included papers published by the UNSW are first-authored by her, except for one by Lo Sing Kuang, who is also among the most cited authors (116 citations). UNSW ranks third among the most productive institutions, with research mainly focused on oracles' architectures. The second most cited author is John Adler from the University of Toronto, who authored the third most cited paper (133 citations). At the University of Toronto, Merlini Marco is also among the most cited authors, and this institution is particularly focused on investigating decentralized oracle mechanisms. The sixth and seventh most cited authors are Omar Ilhaam A. and Al-Breiki Hamda from Khalifa University, with 107 and 97 citations, respectively. From the same university are also Battah Ammar and Madine Mohammad Moussa, who are among the most contributing authors but with fewer citations (50 and 49, respectively). Notably, Khalifa University is the most productive institution in the field, with 13 documents produced, of which 3 were among the ten most cited and 4 were among the top twenty. Observing the coauthorship, apart from the four most cited first authors, many other authors from the same university also participated in the research studies. Among them, Muhammad Habib Ur Rehman and Davor Svetinovic are the most cited, with 144 citations each. These findings provide an idea of institutions that are heavily investing in this sector. Furthermore, Khalifa University contributed at least one paper to every oracle application category (except for business process management (BPM) and energy).
In addition to offering contributions to the healthcare and data management fields, its researchers also produced research to address the oracle problem. The University of Verona, which ranks second by the number of articles produced, is also focused on addressing the oracle problem.
However, publications from this institution are relatively recent and are not among the top-cited publications. From the same country (Italy), the University of Insubria is also among the most productive institutions, and two authors, Carminati Barbara and Rondanini Christian, are among the most cited (50 citations each). Studies from this university and its researchers were mainly concerned with OA, such as IoT in business processes. Another notable institution is the University of Ljubljana, whose contributions focus on cloud/fog computing and the oracle problem. The fifth most cited paper [66] and the fourth most contributing author, Petar Kochovsky (115 citations), also belong to this institution.
Among the most productive institutions, five other institutions emerged whose researchers were also among the most impactful ones. These institutions include Beijing University, Technische Universität Berlin, the University of Potsdam, and the Institut national de la recherche scientifique (INRS) of Montreal. Beniiche Abdeljalil, from INRS of Montreal, is the most cited in this group (44 citations), and his main contributions focused on OT. Finally, from Technische Universität Berlin is Ingo Weber, who, although not the first author of any of the papers in the sample, has coauthored some of the most cited ones (303 total citations).

Converging Studies, Research Themes, and Research Directions
This section of the paper is dedicated to reviewing and discussing the collected studies, with the aim of extracting critical features concerning the related fields and the research direction. The objective is to understand which aspects of oracles have been investigated, which methods are used, and what results have been generated to highlight emerging research trends. Furthermore, by comparing research papers, converging studies are highlighted to promote cooperation between institutions. Appendix C also provides a complete list of papers sorted by institutions and categories to better understand the research distribution.

Oracle Theory
Subjects pertaining to OT and the oracle architecture comprise many different studies. Given that oracles are a niche area of investigation, they are not officially classified by type, and their characteristics have yet to be defined. A group of studies has been dedicated to investigating common patterns that emerge from oracle architectures, with the aim of classification and improvement [67][68][69][70][71]. Pasdar et al. [67] differentiated reputation-based and voting-based oracles, explaining how each design provides the answer to the smart contract. Muhlberger et al. [68] instead distinguished oracles between inbound and outbound, depending on the direction of the data flow (push and pull). Inbound oracles provide data to the blockchain, whereas outbound ones transfer data from the blockchain to the real world. Specific examples are also given of blockchain applications in which data are pushed or pulled into the smart contract. Xu et al. and Mammadzada et al. built a framework to select the most appropriate oracle design (in terms of security and data management) according to different blockchain applications [32,69,71].
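To make the distinction between voting-based and reputation-based designs more concrete, the following toy Python sketch contrasts the two ways of deriving a single answer from several oracle reports. It is not taken from Pasdar et al. [67] or any other cited work; the function names, the reputation scores, and the sample reports are illustrative assumptions only.

```python
from collections import Counter, defaultdict


def voting_answer(reports: dict[str, str]) -> str:
    """Voting-based sketch: the value reported by most oracles wins."""
    return Counter(reports.values()).most_common(1)[0][0]


def reputation_answer(reports: dict[str, str], reputation: dict[str, float]) -> str:
    """Reputation-based sketch: each report is weighted by its oracle's reputation."""
    weights: defaultdict = defaultdict(float)
    for oracle, value in reports.items():
        weights[value] += reputation.get(oracle, 0.0)  # unknown oracles carry no weight
    return max(weights, key=weights.get)


# Hypothetical reports from three oracles answering the same query.
sample_reports = {"oracle_a": "rain", "oracle_b": "rain", "oracle_c": "sun"}
# Hypothetical reputation scores in [0, 1].
sample_reputation = {"oracle_a": 0.1, "oracle_b": 0.2, "oracle_c": 0.9}

print(voting_answer(sample_reports))
print(reputation_answer(sample_reports, sample_reputation))
```

With these sample values the two rules disagree: the majority reports "rain", while the single high-reputation oracle outweighs the other two and yields "sun". This is the kind of design trade-off that the classification studies above aim to make explicit.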
Other works from the University of Colorado [72], Jiamusi University [73], and Hong Kong University [74] focused on the security and privacy challenges of oracle-based smart contracts. These studies mainly examined how to identify and prevent oracle malfunction (integrity), guarantee that data collected is exploited solely by the smart contract (confidentiality), and prevent downtime or censorship attempts (availability). A group of works from Montana State University [75], University of Sfax [76], and the University of Cagliari [77] focused instead on oracle fees and gas-price oracle malfunctions. The work of Montana State University investigated the reasons that led to gas price oracle failures, and the study from the University of Cagliari outlined the failure rate of gas price oracles with an empirical approach. The paper from the University of Sfax compared different gas-pricing techniques with the aim of improving oracle reliability.
Another central subject in OT is the oracle problem, for which many contributions were retrieved. A group of papers focused on explaining the oracle problem, whereas others focused more on empirically investigating the subject to overcome the issue. Two papers from the University of Ljubljana and Max-Planck Institute introduced the oracle problem from a legal point of view [29,30]. In these papers, the role of oracles as legal actors and their responsibility as trusted entities were investigated. A similar discussion can also be found in Mezquita et al. [78], which, however, focused on the legal audit of smart contracts. A thorough discussion of the audit of smart contracts in light of the oracle problem can instead be found in two works by Mark Sheldon D. [13,79], which first introduced the problem of auditing contracts and then offered insights for future auditors to perform the task better.
Other papers from the University of Verona and Khalifa University focused on investigating trust models and the consequences of having untrusted oracles in various sectors, such as Intellectual Property Rights (IPRs), DeFi, and supply chain [6,9]. Considering the amount of money managed by DeFi platforms, the financial implications are alarming [14,46]. Singapore University has also confirmed this result with research focused on investigating the reliability of oracles in DeFi applications [1]. Finally, studies from the Chiba University of Technology and the University of Dallas explored, with empirical data, the incentives of oracles to cheat or fail to transmit information [80,81]. The focus of these studies has mainly concerned the issue of how trust can be built or undermined in digital economies and how collective intelligence helps prevent selfish individuals from performing disruptive actions in the community.
The last subject of OT pertains to oracle proposals. Proposals concern elaboration from scratch or improvements of oracle trust models, such as the one discussed by Al-Breiki et al. [6]. However, because of excessive heterogeneity, finding research themes and research directions was not feasible for this category. Furthermore, proposals were retrieved in a balanced distribution among institutions in various countries, apart from being heterogeneous. Therefore, a considerable convergence of studies among institutions could not be retrieved. Table 12 summarizes the content of the paragraph.

Oracle Applied
Oracle-applied research is focused on various sectors. As expected, because of the resonance and hype that cryptocurrencies attract, finance applications constitute the widest sample. Although multiple institutions have investigated the subject, they show similarities in their focus. Studies from Concordia University, the University of Houston, and the University of Singapore focused on the very role of DeFi oracles: how they work, how they are designed, and how they interact with the underlying blockchain. Existing oracle types are also compared (in terms of efficiency), analyzing how data are retrieved, aggregated, and pushed into the blockchain [1,48]. Kaleem and Shi [82] also provided an overview of the percentage of DeFi oracle calls over total oracle calls. They discovered that almost 75% of Chainlink oracle calls are from Synthetix, a derivative-based DeFi project. Other studies from the University of Verona and Delhi Technological University focus on known threats to DeFi oracles. Although some, such as technical malfunctions or Sybil attacks, are efficiently spotted and addressed, others, such as frontrunning or flash loans, are still difficult to prevent and sometimes even to spot [46,83]. Three studies from Concordia University, Delhi University, and Delft University of Technology focused on the role of the oracle as a means to manipulate the market, showing the possible risks connected with its use and misuse [48,83,84]. Whereas the first two have a more theoretical slant, the third one investigates, with an empirical approach, how arbitrageurs exploit oracle vulnerabilities.
Empirical research from the Oxford-Hainan Blockchain Research Institute, Singapore University of Technology and Design, and Delft University of Technology further contributed to this field of study. The first proposed BLOCKEYE, a device able to hunt attacks on DeFi and oracle manipulations, for which the research team had already presented some experimental results [85]. Using primary data, the second showed the deviance rates of four oracle services to shed light on oracle reliability and possible malfunctions [1]. Finally, the third investigated how arbitrageurs' activities can influence or manipulate price oracle data feeds [84]. Another group of studies discussed how oracles intervene according to a specific financial application (e.g., loans, trading platforms, trust services) [46,86]; however, apart from two papers from Khalifa University and the University of Clermont Auvergne, which both investigated e-auctions, the rest had heterogeneous aims. Both studies on e-auctions had an experimental approach and proposed a new auction service based on the Ethereum blockchain, specifying the role of oracles and how to overcome possible security issues [87,88]. Three studies also investigated the role of oracles in cross-chain asset transfers. Whereas the study from the University of Lisbon provides an overview of different cross-chain techniques, the work from the University of Verona discusses the utilities of the transferred tokens based on their provenance [51,89]. Another study from Beihang University proposes PracticalAgentChain, an intermediary between the data oracle and provenance blockchains (e.g., Bitcoin and Ethereum) [90]. The system works as a reputation-based trading pool and utilizes Town Crier for reliable oracle service.
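As a rough illustration of what a deviance rate for a price oracle could look like, the sketch below counts the share of reported prices that differ from a reference price by more than a relative tolerance. This is not the metric defined in [1], whose exact formulation is not reproduced here; the function name, the 1% tolerance, and the sample prices are assumptions chosen for illustration, and reference prices are assumed to be strictly positive.

```python
def deviation_rate(reported: list[float], reference: list[float], tolerance: float = 0.01) -> float:
    """Share of reports whose relative distance from the reference exceeds `tolerance`.

    Assumes both lists are aligned in time and that reference prices are positive.
    """
    if len(reported) != len(reference) or not reported:
        raise ValueError("reported and reference must be non-empty and of equal length")
    deviating = sum(
        1 for r, ref in zip(reported, reference) if abs(r - ref) / ref > tolerance
    )
    return deviating / len(reported)


# Hypothetical feed of four observations against a constant reference price.
sample_reported = [100.0, 101.5, 98.0, 100.2]
sample_reference = [100.0, 100.0, 100.0, 100.0]
print(deviation_rate(sample_reported, sample_reference))
```

Comparing such a rate across several oracle services, over the same time window and against the same reference, is one simple way to contrast their reliability.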
Business process management research is mainly oriented toward the ability of blockchains to monitor business processes efficiently. Di Ciccio et al. [91] provided an overview of how business process monitoring with blockchain can be achieved and discussed the challenges eventually faced. An extensive description of oracle implications is provided, first by discussing how oracles should be synchronized (time management) to avoid delayed reports. Second, the reliability of oracles is discussed to ensure that the data are not manipulated. Third, the flexibility of oracles should be guaranteed so that the smart contract can select the best data source according to the monitored event. Fourth, blockchain data should be aligned with real-world data so that the event sequence is not misrepresented. Concerning the timeliness and alignment of oracles, a group of works by the University of Potsdam [92][93][94] proposed a "deferred choice pattern", given that the time of transaction is not known in advance. Their model involves an extended oracle architecture to make all historical process data available (history oracle), sensitive to any unexpected change (publish-subscribe oracle), and preserve privacy and data efficiency. The last is achieved by performing part of the computation and variable evaluation off-chain (conditional oracle variants). More focused on the privacy of business process execution, works from the University of Insubria have proposed an encryption mechanism to ensure data confidentiality, even in the presence of an untrusted oracle. The works also verify the consequences of data encryption on smart contracts and transaction overhead [95,96].
As for the supply chain and traceability fields, the works seem heterogeneous, although eight entries were retrieved. Construction, fashion, and food supply chains were investigated, as well as the traceability of vehicles and COVID-19 infections [97][98][99]. Sanchez-Gomez et al. proposed that, for a blockchain traceability solution to work, it must operate on a dedicated layer/network, with traceability data separated from the blockchain data in which a reliable data verification mechanism should be implemented. Oracles and external APIs, in their design, play a critical role [100]. Following this approach, Moudoud et al. [99] proposed collecting traceability data on a cloud, where a network of trusted oracles (recognized by the signature) approves the most reliable information. Concerning the reliability of oracles, a study by Marbouh et al. [101] proposed rules to evaluate an oracle's reputation in tracing COVID-19 cases. Thresholds also determine the oracles' inclusion or exclusion from the trusted network.
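The idea of using a reputation score with a threshold to decide which oracles belong to the trusted network can be sketched as follows. This is not the rule set proposed by Marbouh et al. [101]; the reputation formula (share of correct reports), the 0.8 threshold, and all names are hypothetical and serve only to show the inclusion/exclusion mechanism.

```python
from dataclasses import dataclass


@dataclass
class OracleRecord:
    """Track record of one oracle (hypothetical structure)."""
    name: str
    correct: int = 0
    total: int = 0

    @property
    def reputation(self) -> float:
        # Assumption: reputation is the share of reports later judged correct.
        return self.correct / self.total if self.total else 0.0


def trusted_network(records: list[OracleRecord], threshold: float = 0.8) -> list[str]:
    """Return the names of oracles whose reputation meets the threshold."""
    return [r.name for r in records if r.reputation >= threshold]


# Hypothetical track records; an oracle with no history starts outside the network.
sample_records = [
    OracleRecord("oracle_1", correct=9, total=10),
    OracleRecord("oracle_2", correct=5, total=10),
    OracleRecord("oracle_3"),
]
print(trusted_network(sample_records))
```

Re-evaluating the list after each reporting round would let an oracle enter or leave the trusted network as its track record changes.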
Focusing on the construction supply chain, Lu et al. [97] proposed the use of smart construction objects (SCOs) as blockchain oracles given their intrinsic characteristics. SCOs are, in fact, able to sense the surrounding environment and efficiently communicate the information acquired. Lastly, they have the autonomy to respond to certain situations based on predefined rules. Victor and Zickau [98] proposed network operator companies as tracking oracles given the massive presence of cellular radio towers. Finally, Powell et al. [102] discussed the issue of attaching a physical product to the blockchain in order for it to be traced by an oracle. Studies from the University of Verona have also investigated this specific aspect [8,103]. All of these studies support the idea that a known oracle identity is fundamental to achieving this task.
For healthcare, only four papers were retrieved, all focused on the security and access control of patients' records [104,105]. Madine et al. [105] proposed a decentralized reputation-governed trusted oracle network for patient records to promote competition among oracles and ensure quick and reliable data transmission. However, they also proposed that, because of the sensitive nature of data transmitted, oracles should be approved by a regulatory agency. In a subsequent study, in which they proposed a system of tokens to ensure patients' control of their medical records, they also debated the necessity of having a second oracle type for time-based trigger events [104]. Research by Goncalves et al. [106] focused on the same objective but proposed a specific oracle solution with the Chainlink oracle provider and Ethereum blockchain.
Seven entries regarding applications in AI were retrieved. The central focus was to exploit automation and oracles to guarantee trust in data gathering and processing. As in the original idea of the software oracle problem, the objective was to reduce external parties' intervention in automated procedures. Studies from Toulouse University and INRS Montreal investigated AI-based oracles to provide non-forged results [107,108]. While the first aimed to complete the automation of the oracle ecosystem, the second proposed a more hybrid system between humans and machines. In particular, Beniiche et al. [107] demonstrated that the presence of a third party, a human or social robot, plays an important role in a blockchain-enabled trust game. The works of El Fezzazi et al. [109] and Richard et al. [110] aimed to exploit blockchain oracle features to improve machine learning processes and predictive models to reduce dependency on third-party data feeds. Both offered a theoretical overview of the blockchain implementation outcome at the concept stage.
The IoT sector has 12 publications, and the main issue of investigation is the problem faced while ensuring that the data gathered by IoT devices are trustworthy and private. Gordon [111] and Vari-Kakas et al. [112] outlined the problem of secure and trustworthy data provenance within IoT systems. The first focuses on the problem of the authentication of IoT oracles on blockchains to ensure that data are submitted only by trusted oracles. It proposes that oracles submit their addresses along with data so that blockchain applications can easily verify data provenance. The second is focused on the statistical probability for an IoT oracle to deliver reliable data to the blockchain. In response to this issue, Shi et al. [113] proposed a secure and lightweight triple-trusted architecture to guarantee the unforgeability of data collected by trusted oracles. In their research, however, the premise is that oracles are trustworthy in the first place. By contrast, contributions from Khalifa University and Insubria University approached the confidentiality of the IoT. The first proposes implementing cloud computing and different access privileges to guarantee against unwanted data leakage. The second proposes an encryption model in which IoT and related data are only accessible by the intended users [7,114]. Whereas the first is more oriented toward the technical feasibility of IoT-based blockchain data gathering, Moudoud et al. [115] proposed an ad hoc blockchain architecture based on sharding and a peer-to-peer oracle network in order to manage IoT devices. Although at an early stage, the prototype already shows some experimental results.
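The provenance check attributed to Gordon [111], in which an oracle submits its address together with the data so that the application can accept only trusted sources, can be illustrated with a small in-memory model. The sketch below is an assumption-laden simplification: the class and method names are hypothetical, addresses are plain strings, and no signature verification is performed, whereas on a real blockchain the sender address would be authenticated by the transaction signature.

```python
class SensorFeed:
    """Toy model of an application that accepts readings only from registered oracles."""

    def __init__(self, trusted_addresses: set[str]) -> None:
        self.trusted = set(trusted_addresses)
        self.readings: list[tuple[str, float]] = []

    def submit(self, oracle_address: str, value: float) -> bool:
        """Store the reading if the submitting address is trusted; otherwise reject it."""
        if oracle_address not in self.trusted:
            return False
        self.readings.append((oracle_address, value))
        return True


# Hypothetical addresses and temperature readings.
feed = SensorFeed({"0xA1", "0xB2"})
accepted = feed.submit("0xA1", 21.5)
rejected = feed.submit("0xZZ", 99.9)
print(accepted, rejected, feed.readings)
```

Such a check establishes where a reading came from, but it says nothing about whether the registered oracle itself reports truthfully, which is the premise questioned in the surrounding discussion.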
As for cloud computing, only five studies were retrieved. Two were published at the University of Ljubljana and focused on how oracles can enhance trust and efficiency in a cloud computing platform. Defining the drivers of trust, a trust management scheme is proposed to show how a trusted data flow can be achieved between application components (e.g., camera, fog node, and cloud storage) [66]. In subsequent work, the research is extended, showing how oracles can increase scalability and cost-efficiency in federated edge-to-cloud computing environments by allowing transactions to be executed off-chain [116]. Tao and Hafid [117] also proposed introducing a computing oracle to reduce on-chain network usage. In line with these studies, Taghavi et al. [118] proposed oracles as a monitoring service for service-level-agreement violations in cloud environments. Utilizing a Stackelberg differential game, they also investigated the perfect balance between quality verification requests and monitoring prices.
A consistent group of papers applied oracles for data management. Comuzzi et al. [119] investigated how oracles impact data quality in terms of the timeliness, costs, and availability of data. They showed that querying an external oracle service increases availability, but it also increases costs. Battah et al. [120] proposed a reputation system to reward better-performing oracles to improve accessibility and costs, eventually increasing data quality. In a subsequent study, the authors better specified the drivers to discover trusted and better-performing oracles, also showing simulation results [121].
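To make the idea of a reputation system for oracles concrete, the following is a minimal sketch under stated assumptions. It is not the scheme of Battah et al. [120,121]: the median-agreement rule, the `tolerance` and `step` parameters, the default score of 0.5, and the oracle names are all hypothetical choices made for illustration.

```python
from statistics import median


def update_reputation(reputation, reports, tolerance=0.05, step=0.1):
    """Return new scores: oracles reporting near the median move toward 1,
    the others toward 0 (hypothetical rule, chosen for illustration)."""
    reference = median(reports.values())
    updated = dict(reputation)
    for oracle, value in reports.items():
        agrees = abs(value - reference) <= tolerance * abs(reference)
        target = 1.0 if agrees else 0.0
        old = updated.get(oracle, 0.5)  # unknown oracles start at 0.5
        updated[oracle] = old + step * (target - old)
    return updated


reputation = {"oracle_a": 0.5, "oracle_b": 0.5, "oracle_c": 0.5}
reports = {"oracle_a": 100.0, "oracle_b": 101.0, "oracle_c": 150.0}

reputation = update_reputation(reputation, reports)
# Pick the highest-scoring oracle for the next query (ties go to the first).
best = max(reputation, key=reputation.get)
```

In this run the outlying report from `oracle_c` lowers its score while the two agreeing oracles gain, so later queries would favor the better-performing oracles.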
Other authors have focused on data communication between blockchains. Mitra et al. [122] proposed DE-PEG, a modification of the PEG algorithm said to reduce the cost of data availability oracles and thought to also prevent stalling attacks. Gao et al. [123] explored active communication between blockchains through oracles, specifying which type of data should be transmitted. However, they assumed that the oracle is always trusted and therefore did not provide a scheme to prevent oracle data manipulation. Finally, Ouyang [124] proposed HBRO, an oracle system that enables communication between permissioned and permissionless blockchains concerning digital rights management (DRM). Once DRMs are elaborated on the permissioned chain, they are securely transmitted through the data oracle to a permissionless blockchain with a notary mechanism.
Works in the transport sector have focused on the security, privacy [125,126], and efficient identification of vehicles [127], as well as data processing for intensive transport environments, such as commercial waterways [128]. Whereas the works from Khalifa University and Guangxi University discussed the implementation of Chainlink for autonomous vehicle test-case repositories and identification, the works from other universities proposed their own oracle design for efficient data transmission in the transport industry.
Finally, three entries were retrieved for the energy sector. All three were published between late 2021 and early 2022. The shared vision aims at decentralizing the energy market, but with a different focus. Antal et al. [129] proposed an energy flexibility token to incentivize renewable energy production at the local level. Zeiselmair et al. [130] implemented a decentralized oracle system and zkSNARKs to improve renewable energy certificate allocation. Lastly, Weixian et al. [131] investigated efficient oracle designs to guarantee secure and unforgeable data transmission between actors involved in the energy market. An overview of the above-mentioned results can be found in Table 13.

Energy | Energy market management | Incentivize renewable energy production; improve privacy in the energy market | Technical University of Cluj-Napoca [129]; Technical University of Munich [130]

6. Discussion
The literature review yielded insights into the themes covered by the existing studies and the areas that require more attention. Although a group of studies have tried to classify oracles, the very concept of oracle is still not clearly defined. Oracles are generally identified as data-feeding ecosystems, which can come in various forms or structures, but if this is the purpose of oracles, then anything that provides data for a blockchain is an oracle. Therefore, rollups or bridges should also be considered oracles. We could argue, for example, that the Lightning Network is an oracle for the Bitcoin network, but then we may also have oracles on the Lightning Network. Therefore, the boundaries of what can be defined as an oracle should be clearly drawn.
The classification of oracles is also quite heterogeneous, and it often reduces the clarity of the presented research. In addition to software and hardware, we also have reverse, centralized, decentralized, computation, consensus, and voting-based oracles. Often, they refer to the same type with different names or to different types with the same name (e.g., decentralized and consensus-based oracles). Similarly, the issue of having trusted entities in trustless environments, often referred to as the "oracle problem", has also been labeled in some research as the "oracle paradox". Likewise, the coexistence of inbound and outbound oracles is referred to equally as the "dual-oracle problem" or "dual simplex communication". Therefore, efforts should be made in this regard so that research can build on a common oracle taxonomy.
Concerning the investigation of oracle trust models, this still seems to be a niche field, and despite the few contributions, the very concept of trust lacks a broader discussion and clarification. There is apparent indecision about whether an oracle should be trustless or "provably honest", two concepts that are, in theory, not the same.
Oracle proposals are heterogeneous, but almost all focus on decentralized or consensus-based oracle systems. Centralized oracles, such as the ones proposed in [124,132,133], are also worthy of investigation. For certain data types that are not in the public domain and where data sources are limited, a centralized and secure data channel would be more appropriate than a slow, expensive, and probably less secure decentralized oracle type. A trusted oracle is indeed a single point of failure, but it is undoubtedly more efficient as long as it resists attacks and behaves honestly. Therefore, more research is expected and needed in that direction. Furthermore, despite the plethora of emerging oracle solutions on the market, the academic trend is either to propose a new oracle solution or to utilize only the best-known or most advertised ones (e.g., Chainlink or Provable). Therefore, an exploration of emerging oracle solutions by academic researchers is expected and required.
Another issue that emerged in this study is that studies concerning oracle solutions for DRM, such as the one by Ouyang [124], are limited. As the original idea of blockchain proposed by Haber and Stornetta [55] was to authenticate digital documents, more attention to this sector was also expected in blockchain oracle research.
Finally, concerning a well-known issue discussed in 2018 by Song [134] (that is, how to link a physical object to the blockchain), an effective method to fill this gap has not been developed despite the plethora of research and proposals. This considerable limitation greatly undermines the feasibility of blockchain-based proposals and applications, especially for the traceability sector (but also in healthcare or BPM). Therefore, more than speculating on the hypothetical advantages of having a blockchain-based tracing system, a considerable effort should be made to understand whether a physical product can be "attached" to the blockchain in the first place. Building on the above-mentioned limitations, Table 14 suggests some research themes along with their expected/desired outcomes.

Table 14. Research themes and expected/desired outcomes.
Investigate robust links between physical products and the blockchain | Improve any blockchain-based application involving physical products (e.g., traceability)
Investigate emerging oracle solutions | Promote active collaborations between academia and oracle providers

Conclusions
This paper undertook a bibliometric analysis of published studies about blockchain oracles. The aim was to display publication trends along with preferred outlets and publishers. The most-cited papers and authors and the most contributing institutions are also shown. After the selected literature was reviewed, emerging themes, research directions, and converging studies were discussed to promote innovation and cooperation between institutions.
The obtained results show that, within seven years of academic production, only 162 papers (including non-peer-reviewed ones) were retrieved from scholarly databases. This result supports the view that blockchain oracles are still a widely neglected subject, despite their crucial importance. The review also reveals heterogeneity in the oracle literature; therefore, major efforts are required to find a widely accepted oracle taxonomy. Furthermore, the limited selection of oracles suggests the need for more active collaboration between practitioners and academia. Finally, more theoretical work is required on the underlying trust concept that identifies oracles as "trusted" ones.
The findings of this research are useful for academics, students, and practitioners. Offering an overview of institutions investigating a specific field, this study can promote cooperation between existing or entering research teams in the blockchain oracle domain. This, in turn, can constitute a reference for entrepreneurs undertaking blockchain-based projects. Students and other academics can then utilize a resource on the state-of-the-art knowledge of related fields and investigate emerging gaps (e.g., missing resource management contributions) or create other research by building on existing studies. This paper also has limitations given the scarcity of retrieved material that determined low numbers in absolute terms in all the tables and figures. As specified in Section 3, a degree of subjectivity in the presented results cannot be excluded. Whereas previous studies inspired the method and bibliometric research, the author had to select them arbitrarily. Subjectivity can also be found in the sample classification, given that the division of topics into categories and sub-categories had to be performed manually. Again, the author wishes to reiterate that the data provided in this paper should not, in any case, be interpreted as a quality evaluation of the cited works. Because of the selection criteria, some works or authors may have been excluded inadvertently. Further studies can build on this bibliometric analysis to investigate the trust models adopted and presented in the published literature and the preferred oracle applications for academic investigations.
Funding: This research was funded by "Emma Gianesini Fund", managed by UniCredit Foundation.

Data Availability Statement:
Almost all the data generated or analyzed during this study are included in this published article. Unreported data are available upon request.

Conflicts of Interest:
The author declares no conflict of interest.

Table A1. Relevant contribution example.

Paper Title | Oracle Contribution | Reference
The limits of smart contracts | Provides an analysis of the role of oracle from a legal point of view | [30]
LoC-a new financial loan management system based on smart contracts | Discusses how oracles can be implemented to ensure data privacy in loan management | [135]
A pattern collection for blockchain-based applications | Describes different oracle types and how to recognize the most suitable one according to the needs | [69]
On the characterization of blockchain consensus under incentives | Compares blockchain consensus and oracle consensus under specific incentive mechanisms | [136]
Distributed network slicing management using blockchains in e-health environments | Shows the implementation of a decentralized oracle solution for the management of patient records | [106]
Blockchain for COVID-19: review, opportunities, and a trusted tracking system | Outlines a means to recognize a trusted oracle network for tracking purposes | [101]
To chain or not to chain: a reinforcement learning approach for blockchain-enabled IoT monitoring applications | Presents a blueprint of a private network in which oracle contracts improve their efficiency according to data collected by IoT sensors | [137]
Blockchain as a platform for secure inter-organizational business processes | Discusses oracle data correctness and confidentiality in business process management | [95]

Appendix C

[123,194]
Chiba Institute of Technology [195]
Kaunas University of Technology [196]
Khalifa University [120,121,197]
Rennes University [198]
Shenzhen Technology University [199]
UNIST-South Korea [ [128]