Open Data for Open Innovation: An Analysis of Literature Characteristics

In this paper, we review some characteristics of the literature that studies the uses and applications of open data for open innovation. Three research questions are proposed about both topics: (1) What journals, conferences and authors have published papers about the use of open data for open innovation? (2) What knowledge areas have been analysed in research on open data for open innovation? and (3) What are the methodological characteristics of the papers on open data for open innovation? To answer the first question, we use a descriptive analysis to identify the relevant journals and authors. To address the second question, we identify the knowledge areas of the studies about open data for open innovation. Finally, we analyse the methodological characteristics of the literature (type of study, analytical techniques, sources of information and geographical area). Our results show that the applications of open data for open innovation are interesting but their multidisciplinary nature makes the context complex and diverse, opening up many future avenues for research. To develop a future research agenda, we propose a theoretical model and some research questions to analyse the open data impact process for open innovation.


Introduction
Since the beginning of the 2000s, the use of the term "open" has increased exponentially [1]. In 2003, Chesbrough proposed a new paradigm of innovation [2,3]. For this author, open innovation constitutes a model where firms use both external and internal resources and commercialize both external and internal ideas/technologies [2]. Open innovation is defined as "The use of purposive inflows and outflows of knowledge to accelerate internal innovation, and expand the markets for external use of innovation, respectively. Open innovation is a paradigm that assumes that firms can and should use external ideas as well as internal ideas, and internal and external paths to market, as the firms look to advance their technology" [4]. The open data concept alludes to "data that anyone can access, use, and share. Governments, businesses and individuals can use open data to bring about social, economic, and environmental benefits" [5]. Its annual economic impact is notable: open data could potentially generate 900 billion dollars in the global economy [6], with a European Union market share increase of 36.9% between 2016 and 2020 [7]. Open data offer the potential for reuse, which produces new, innovative services for citizens and society in general [8,9]. Likewise, open data initiatives have an impact on aspects such as citizen engagement, transparency and innovation in the public sector [10].
Open data can thus be a source of innovation. Some authors highlight the interest of understanding, in the context of open data and smart cities, how data-driven innovation is performed and how it creates social and economic value for society [9,11]. Considering the interest in studying innovation in the context of open data and the importance of the openness phenomenon, we examine the possibility of using open data for open innovation. We searched for articles that offer state-of-the-art ideas on that theme but found no literature reviews that join open data and open innovation. We therefore searched for literature reviews on each theme separately to gauge the interest in studying the two terms together. We found literature reviews about open data using different methodologies and temporal scopes [12][13][14][15]. Other studies analyse the literature on open innovation, combining several methodologies and temporal scopes, with 2017 being the last year analysed in the most recent articles [16][17][18][19][20][21][22][23][24][25]. Finally, we found that some of these studies have identified interest in the relationship between the terms "open data" and "open innovation" [12,14,17].
In that context, open data offers access to internal and external data that come mainly from public organisations. Governments and public agencies are releasing their data and want open data to be used to solve problems and to create and improve products and services. However, access to open data in itself does not produce innovation [26]. New services created from open data, mainly software applications, can be produced using a process known as open innovation, defined as "the opening of the innovation process to knowledge from outside the innovating organisation" [27] (p. 2), in which diverse agents such as citizens, companies, public entities, or academia collaborate to co-create these new services [28]. Thus, it is necessary to know how to implement open innovation using open data. A first stage in developing that idea is to review the previous literature. In

Methods
Search Protocol
The Web of Science (WoS) and Scopus databases were used to perform the literature review, since they are the most relevant databases in academia. While WoS included 20,000 indexed journals, Scopus included 21,950 [29].
The search protocol used is:
- Final number of documents: 55.

Descriptive Analysis
Figure 1 presents the number of documents per year for the combination of the two topics studied. The first publications are from 2012 (4), and a certain growth can be seen from 2014 to 2017, with the highest number of documents appearing in 2015 (13) and 2017 (13). Table 1 shows the details of the documents identified in our analysis: authors, year of publication, title, citations in WoS and Scopus, and type of paper (articles: 28; conference papers: 27).
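As an illustration of how the counts behind a figure and table of this kind could be tallied, the following minimal sketch counts documents per year and per type. The records are hypothetical placeholders, not the 55 documents of the review, and the function name `tally` is an assumption introduced only for this example.

```python
from collections import Counter

# Hypothetical records: (publication year, document type).
# "A" = article, "CP" = conference paper, following the legend of Table 1.
records = [
    (2012, "CP"), (2012, "A"), (2014, "CP"),
    (2015, "A"), (2015, "CP"), (2017, "A"),
]


def tally(records):
    """Return (documents per year, documents per type) as Counters."""
    per_year = Counter(year for year, _ in records)
    per_type = Counter(doc_type for _, doc_type in records)
    return per_year, per_type


per_year, per_type = tally(records)

# Print the yearly series in chronological order.
for year in sorted(per_year):
    print(year, per_year[year])

# Print the split between articles and conference papers.
for doc_type in sorted(per_type):
    print(doc_type, per_type[doc_type])
```

With the real list of documents in place of the placeholder records, the yearly series would correspond to the data plotted in Figure 1 and the type split to the article and conference paper totals.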

Journals and Conferences
Tables 2-4 present the analysis of the documents according to type: articles or conference papers. Regarding articles (Tables 2 and 3), the journals "Information Polity" and "Government Information Quarterly" stand out with three and two articles, respectively. Regarding conference papers (Table 4), the book series "Lecture Notes in Computer Science" stands out with three documents. The other journals and sources have only one document each. We have analysed the subject areas and categories of the Journal Citation Reports (JCR) and Scimago Journal and Country Rank (SJR) (Tables 2-4). Most indicate a link with knowledge areas such as Information Technology and Computer Science and its offshoots. A review of the Computer Science subject area indicates the prevalence of the Computer Science Applications, Computer Networks and Communications, and Information Systems categories. Also prevalent are knowledge areas such as Public Administration and Government within the Social Sciences subject area, with a significant variety of associated categories: Sociology and Political Science, and Library and Information Sciences stand out, among others. Furthermore, knowledge areas such as Systems Engineering, Electronic Engineering or Electrical Engineering, among others (included in the Engineering subject area), have a significant presence. The Technology and Innovation Management knowledge area also appears, mainly linked with the subject areas of Business, Management and Accounting, and Decision Sciences. Medicine, Molecular Medicine, Pharmacology, and Chemoinformatics have a minor presence. Finally, we must mention the knowledge area of Museology, under the subject area of Arts & Humanities. Among the journals ranked by JCR, eight are in the first or second quartile; among those ranked by SJR, 20 are in the first or second quartile for the last available year (2017).

Authors
Table 5 presents the most productive authors by affiliation and knowledge area. Several authors from the Nagoya Institute of Technology's Graduate School of Engineering (Nagoya, Japan) stand out with three publications each in the knowledge area of Computer Science: Tossavainen, Shiramatsu, Ozono and Shintani. Their publications focus on the use of web applications to promote collaboration between different interest groups (individuals or organisations) for the purpose of solving public and social problems [68,72,73].

Table 5. Author/affiliation/knowledge area/documents.
Authors that focus on this topic belong to three knowledge areas: Computer Science, Information Technology and Economics (Table 5). Some authors such as Yoshida, Lee and Choi belong to two knowledge areas: Economics (with a focus on open innovation research) and Information Technology or Computer Science (with a focus on open data research). The affiliations of the top authors are Japanese (six), Swedish (four), Spanish (three), Finnish (three), Korean (three) and Taiwanese (three).

Studied Themes by Knowledge Area
We analysed the knowledge areas considering the SJR subject areas and categories. In the Information Technology and Computer Science knowledge areas, topics such as the development of open innovation processes through web platforms are the most commonly studied [72,73]; other topics include the impact of the use of open government data to improve or produce new products and services, as well as the open innovation processes derived from the use of these data [71]. This last topic has also been addressed in knowledge areas such as Public Administration, along with other topics such as open data, transparency, civic engagement, and public sector innovation [10].
In the knowledge areas of Systems Engineering, Electronic Engineering and Electrical Engineering, the most prevalent topic is the development of systems that offer a service to users and enlist their collaboration to improve the product, thus involving various stakeholders in a co-creation process [56]. For Technology Management and Innovation, topics addressed include the management of technology innovation processes in organisations [32], or the phenomena of co-creation and innovation promotion [75].
In the knowledge areas of Medicine, Molecular Medicine, Pharmacology and Chemoinformatics, the positive impact of open data and open innovation on drug discovery and development processes is analysed [77,79]. Lastly, in Museology, the impetus of open data and open innovation in museums, libraries and archives is discussed [63].

Methodological Characteristics of the Documents
To perform a more in-depth literature review, this section presents an analysis of the methodological characteristics of the documents studied, such as the type of study, the analytical techniques, the sources of information and the geographical area.
Analysing the type of documents indicates that 65.5% (36) are empirical and approximately 34.5% (19) are theoretical. Several aspects of the empirical documents have been analysed, such as the type of study (Table 6), analytical techniques used (Table 7), and sources of information (Table 8).
Tables 6 and 7 show that 61% (22) of the empirical documents are exclusively qualitative studies using the analytical technique of case study. On the other hand, six documents (approximately 17%) are exclusively quantitative, using analytical techniques such as the varimax rotation method, correlation coefficients, Cronbach's alpha coefficient, regression analysis, structural equation modelling, and descriptive statistics. Furthermore, seven documents (approximately 19.4%) use a combination of quantitative and qualitative techniques. All are case studies with various types of descriptive statistics, except for one by Smith & Sandberg, 2018 [69], which combines a case study with a cross-tabulation matrix. If we consider all the studies that are exclusively quantitative or that combine quantitative and qualitative techniques, 13 documents are found (36% of the empirical documents). Eight of these are cross-sectional studies for the same period, and five are longitudinal studies.
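The percentages reported above can be reproduced from the stated counts. The following is a minimal sketch; the helper name `share` and the variable names are assumptions introduced for illustration, while the counts are those given in the text.

```python
def share(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


TOTAL = 55        # documents in the review
EMPIRICAL = 36    # empirical documents
THEORETICAL = 19  # theoretical documents

# Counts among the empirical documents, as stated in the text.
study_types = {"qualitative": 22, "quantitative": 6, "mixed": 7}

# Shares over all documents.
print("empirical:", share(EMPIRICAL, TOTAL))
print("theoretical:", share(THEORETICAL, TOTAL))

# Shares over the empirical documents only.
for name, count in study_types.items():
    print(name, share(count, EMPIRICAL))

# Exclusively quantitative plus mixed studies.
with_quantitative = study_types["quantitative"] + study_types["mixed"]
print("quantitative or mixed:", with_quantitative, share(with_quantitative, EMPIRICAL))
```

Note that the denominator changes between the two groups of figures: the empirical and theoretical shares are computed over all 55 documents, whereas the study-type shares are computed over the 36 empirical documents.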

The most prevalent analytical technique used is the case study, identified in 28 documents (77.8% of the empirical studies), followed by descriptive statistics found in nine documents (25% of the empirical studies) (Table 6).
Table 8 presents the information sources used in the empirical studies. Most (28, or 77.8%) of the documents analysed have a secondary source; 16 documents (44.4%) have only one source; and 8 documents (22.2%) have three or more sources. Primary sources are found in 18 (50%) of the empirical studies; 8 (22.2%) have a single primary source and 9 (25%) have two primary sources.
Table 9 shows that 60% of the documents (33) correspond to a single geographic area, while 18.2% (10) correspond to several geographic areas. Approximately 21.8% (12) of the documents do not indicate any geographic scope. The geographic areas represented are widely scattered, although approximately 53% (29) of those that indicate a geographic area are analyses conducted in Europe.
Table 8. Sources of information/author/s, year.

Table 9. Geographical area/author/s, year.

Discussion
After In this context, we have developed a theoretical model, which includes some dimensions of previous models of open data and open innovation. On the one hand, following Abella et al. (2019) [80], we have used the open data impact process and the reuser categories of open data. The model presents a process with four phases: 1. Candidate data; 2. Published data; 3. Reused data; and 4. Impact; and proposes a classification of data reusers in three groups: (1) primary open data source (public organizations and other related organizations that publish open data); (2) direct reusers (social and professional); and (3) end users (social, citizen, professional and academic). On the other hand, following Gassmann and Enkel (2004) [81] and Nerone, Canciglieri Junior, Steiner and Young (2014) [82], we have considered two types of open innovation: inbound (to insource external ideas and technologies to enhance products' values) and outbound (to outsource internal resources for refining, exploiting and bringing them to market). We also consider the two types together, or coupled (a combination of the inbound and outbound processes). Our model is the first theoretical proposal for the study of the use of open data for open innovation (Table 10).

Conclusions
There is growing interest, in both academic and professional settings, in studying innovation from the perspective of openness [83] and the reuse of open data [80]. One of the main effects of this reuse is the possibility of innovating and creating new businesses or developing new products or services for citizens [8,9]. These two concepts are therefore closely related, and their joint study needs to be deepened in the academic context in order to guide managers in taking advantage of open data and open innovation.
Literature reviews are very useful for establishing the state of the art on a topic. We have found some literature reviews about open data or open innovation, but there are still no studies that jointly analyse both topics. This paper tries to cover this gap in the literature by formulating three research questions. To this aim, we have carried out a search of the papers that include open data and open innovation research. We have identified just 55 documents. Many of them reflect research in its initial stages, since they are conference papers. It seems reasonable to say that the joint study of these two topics is emerging and that several documents have not yet been published but are being presented in various academic and professional forums.
To answer the first research question, two analyses have been carried out. Firstly, we have identified the main journals and conferences that publish papers on these topics. The results show that the documents are published in journals of different knowledge areas. Computer Science, Engineering and Public Administration analyse the issue of open data. Other knowledge areas focus on open innovation, such as Business, Management and Accounting, or on the practical applications of using open data to perform open innovation, as is the case of applications or examples of its use in knowledge areas related to Health Sciences, Engineering or Museology. Secondly, the paper identifies the authors that have published on these issues. Productivity per author is still low (a maximum of three articles), which confirms that this line of research is in its initial stages. The authors are related to knowledge areas such as Computer Science, Information Technology and Economics. If we consider their affiliation, authors from Japanese, Korean or Taiwanese universities stand out. There is also a presence of European researchers (Spanish, Finnish and Swedish) among the top authors.
To answer the second research question, knowledge areas are analysed. The main conclusion is the multidisciplinary character of this topic. The most prominent knowledge areas are Information Technology and Computer Science. In other areas, such as Public Administration, Business and Management, and Medicine, papers focus on aspects more related to management issues and the application of open data to open innovation.
Regarding the third research question, it is observed that although it is an emerging topic, most of the papers (65.5%) are empirical. This result highlights the need to carry out more theoretical studies that help lay the theoretical foundations to jointly study these two issues. Moreover, most of the empirical papers are qualitative (61%), which is consistent with the state of development of the research line. The most used technique is the case study. This methodology helps to understand, solve or improve a professional world procedure [84] and is appropriate when the phenomenon investigated is exploratory and descriptive and when primary information is available. As the literature is not conclusive, it is necessary to carry out an in-depth and qualitative analysis on the topic. In this sense, it is observed that 50% of the articles analysed use primary information sources and some combine primary and secondary sources. The case method also allows
This paper presents some theoretical and practical implications. The paper analyses the main aspects of the previous literature that has combined the terms open data and open innovation: journals, conferences, authors, knowledge areas and methodological characteristics. Our results are useful for researchers who are starting to research this topic because they identify existing gaps and propose new research questions. In addition, "open innovation can help to identify opportunities for entrepreneurs" [87] (p. 2). In that sense, the paper can be useful as a starting point for agents such as citizens, companies or public institutions that want to carry out an open innovation activity such as the creation of digital applications and services through the reuse of open data.
Finally, the paper has some limitations. Other techniques could also be used to complete the descriptive analysis, such as bibliometric techniques (bibliographic coupling, co-citation analysis or co-author analysis), which would provide additional information and alternative approaches to describe the state of the art of this topic.
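To make two of the bibliometric techniques mentioned above concrete, the following minimal sketch computes bibliographic coupling (references shared by two citing documents) and co-citation (documents that cite two references together). The citation data, document labels and function names are hypothetical and serve only as an illustration; they are not drawn from the documents of this review.

```python
from collections import Counter
from itertools import combinations

# Hypothetical citation data: citing document -> set of cited references.
citations = {
    "doc_a": {"ref_1", "ref_2", "ref_3"},
    "doc_b": {"ref_2", "ref_3"},
    "doc_c": {"ref_3", "ref_4"},
}


def bibliographic_coupling(citations):
    """Number of shared references for each pair of citing documents."""
    return {
        (a, b): len(citations[a] & citations[b])
        for a, b in combinations(sorted(citations), 2)
    }


def co_citation(citations):
    """Number of documents that cite both references of each pair."""
    counts = Counter()
    for refs in citations.values():
        # Every pair of references in one reference list is co-cited once.
        counts.update(combinations(sorted(refs), 2))
    return counts


coupling = bibliographic_coupling(citations)
cocited = co_citation(citations)

for pair, strength in coupling.items():
    print("coupling", pair, strength)
for pair, strength in cocited.most_common():
    print("co-citation", pair, strength)
```

Bibliographic coupling relates the citing documents to one another, while co-citation relates the cited references, so the two measures describe the structure of a literature from complementary directions.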
, giving rise to concepts such as open data, open innovation, open medical records system, open science, open knowledge, and open education, among others.
(p. 1). In that sense, open data is an external source that can be used for generating open innovation, and open innovations can create open data.
this paper, we have analysed the characteristics of the previous literature that has related open data with open innovation. We propose three research questions: (1) What journals, conferences and authors have published papers about the use of open data for open innovation? (2) What knowledge areas have been analysed in research on open data for open innovation? and (3) What are the methodological characteristics of the papers on open data for open innovation? To answer the first question, we use a descriptive analysis to identify the relevant journals and authors. To address the second question, we identify the knowledge areas of the studies about open data for open innovation. Finally, we analyse the methodological characteristics of the literature (type of study, analytical techniques, sources of information and geographical area). After answering these three questions, we will be better able to (a) identify who is who in that research line; (b) show the opportunities to implement open innovation to the agents of the open data ecosystem and (c) orient the new research about the use of open data for open innovation.

Figure 1. Number of documents per year.
analysing the characteristics of previous literature that jointly analyses open data and open innovation, we discuss the different knowledge areas focused on this topic. We observe that open data and open innovation studies address the topic from different perspectives. While open data has been analysed under the Computer Science, Engineering and Public Administration disciplines, open innovation has been developed in the Management and Innovation subjects. Subsequently, we develop these arguments according to the knowledge areas identified in our analysis. Knowledge areas such as Information Technology and Computer Science help to understand what the data must be like (characteristics, quality, etc.) and the format in which data have to be published for performing open innovation. Additionally, we think that it is necessary to deepen the study of the data publishing mediums (platforms, websites, etc.) and their utility for performing open innovation. On the other hand, it is interesting to know how the data can be reused for performing open innovation. Literature focused on the Public Administration area offers a framework which allows us to analyse the ecosystem of reusers and the products and services that can be obtained under the open innovation paradigm. Regarding the Management and Innovation subjects, previous literature shows theoretical open innovation models that can be adapted for studying the use of open data for performing open innovation. More empirical studies that develop applications about this topic are necessary. In some knowledge areas such as Systems Engineering, Electronic Engineering, Electrical Engineering, Medicine, Molecular Medicine, Pharmacology and Chemoinformatics, and Museology, the case study methodology is very frequent. These papers offer cases or examples of open innovation activities obtained from open data. In our descriptive analysis, we have found no documents about the state of the art of open data and open innovation jointly. Even though the previous literature focuses on the study of some specific aspects in different knowledge areas, there are no papers that develop theoretical frameworks that help to understand the use of open data for generating open innovation.
applying the inductive method to put forward propositions or theoretical hypotheses based on practical experience and examples of the application of open data use to open innovation. Finally, results show that the studies have been carried out in different geographical areas. This shows the global reach of these issues, which, besides being applicable in different areas of knowledge, are also applicable in different geographical areas.
The joint analysis of open data and open innovation can be studied considering three dimensions: (1) the main phases of the open data process, (2) the types of open innovation that can be developed with open data, and (3) the ecosystems of reusers that are the agents that make the open innovation possible. In that sense, we have proposed a theoretical model to analyse the open data impact process for open innovation. This model can be a guide to future research and help us to present some future research lines and questions. Future research can analyse the following questions for each phase of our theoretical model (Table 10). Phase 1: How does outbound open innovation select the candidate open data? What is the role of public administrations in the selection of open data for outbound open innovation? What effect do the open data policies of each country have on the opportunities to perform open innovation by both public and private institutions? How can the FAIR principles for scientific data (findable, accessible, interoperable and reusable) [85] be adapted to the context of open data for open innovation? Phase 2: How does outbound open innovation publish open data? What is the role of public administrations in the publication of open data for outbound open innovation? How can models developed for innovation in open science such as the European Open Science Cloud (EOSC) [86] be adapted to the context of open data for open innovation? Phase 3: What forms of open data reuse are more suitable for open innovation? What is the inbound innovation of each reuser like? Phase 4: What economic and social effects does the use of open data for open innovation have? What is the social, economic and technological impact of each type of open innovation? What is the social, economic and technological impact for each reuser?
In addition, future research is necessary to develop theoretical and practical applications and examples from a holistic perspective considering all the aspects included in our theoretical model. In that sense, other research questions have been raised by our study. What topics have been the most studied? What are the theories that can be applied to study this phenomenon? What opportunities for open innovation do open data offer? What are the barriers when using open data for open innovation?
Shiramatsu, Tossavainen, Ozono & Shintani, 2015 [68] (CP) Towards Continuous Collaboration on Civic Tech Projects: Use Cases of a Goal Sharing System Based on Linked Open Data 3
Smith & Sandberg, 2018 [69] (A) Barriers to Innovating with Open Government Data: Exploring Experiences across Service Phases and User Types 0
Smith & Seward, 2017 [1] (A) Openness as Social Praxis -
A: Article; CP: Conference paper.

Table 2. Articles: journal/ranking and category JCR.
Note: NA: not available.

Table 3. Articles: journal/ranking, subject area and category SJR.

Table 4. Conference papers: source/ranking, subject area and category SJR *.
NA: not available. * Note: Information about conference papers ranking and categories JCR is not available.

Table 6. Type of study/author/s, year.

Table 10. Theoretical model: Open data impact process for open innovation.