From Spatial Data Infrastructures to Data Spaces—A Technological Perspective on the Evolution of European SDIs

Abstract: The availability of timely, accessible and well-documented data plays a central role in the process of digital transformation in our societies and businesses. Considering this, the European Commission has established an ambitious agenda that aims to leverage the favourable technological and political context and build a society that is empowered by data-driven innovation. Within this context, geospatial data remains critically important for many businesses and public services. The process of establishing Spatial Data Infrastructures (SDIs) in response to the legal provisions of the European Union INSPIRE Directive has a long history. While INSPIRE focuses mainly on 'unlocking' data from the public sector, there is a need to address emerging technological trends and to consider the role of other actors such as the private sector and citizen science initiatives. The objective of this paper, given these boundary conditions, is twofold. Firstly, we position SDI-related developments in Europe within the broader context of the current political and technological landscape. In doing so, we pay particular attention to relevant technological developments and emerging trends that we see as enablers for the evolution of European SDIs. Secondly, we propose a high-level concept of a pan-European (geo)data space with a 10-year horizon in mind. We do this by considering today's technology while trying to adopt an evolutionary approach, with developments that are incremental to contemporary SDIs.


Introduction
Almost 13 years after the adoption of the INSPIRE Directive [1], aimed at supporting the European Union's (EU) environmental policies, stakeholders have come a long way in making use of one of the world's largest coordinated efforts for establishing a Spatial Data Infrastructure (SDI). Currently, more than 150,000 datasets are documented and increasingly made available (i.e., discoverable, viewable and downloadable) within the pan-European INSPIRE infrastructure. With the overall deadline for the full implementation of INSPIRE set for the end of 2020 [2], and in light of the recently announced European data strategy [3], it is the appropriate time to take stock of the state of play, analyse benefits and issues, assess future challenges and opportunities, and outline possible strategies for the evolution of Europe's SDI. The main starting point of this reflection is that, compared to 13 years ago, today we are facing a substantially different technological landscape. Alternative data sources such as digital sensors, Earth Observation platforms and citizen contributions are challenging the role of the public sector as the main producer and owner of geospatial content.
In addition, the private sector is playing an increasingly important role in the creation, storage and maintenance of data (including personal data), but also in the extraction of value from existing data through the application of sophisticated, often proprietary, algorithms. Under these external pressures, public authorities are left with no choice but to adapt and redefine their role in order to cope with this rapidly changing context.
The objective of this paper, given the boundary conditions described above, is twofold. Firstly, we position SDI-related developments in Europe within the broader context of the current political and technological landscape. In doing so, we pay particular attention to relevant technological developments and emerging trends that we see as enablers for the evolution of European and global SDIs. Secondly, we propose a high-level concept of a pan-European data space with a 10-year horizon in mind. We do this by considering today's technology while trying to adopt an evolutionary approach, with developments that are incremental to contemporary SDIs.

Materials and Methods
As far as the study approach is concerned, the work presented here stems from the authors' long-term engagement with topics related to digital geospatial data interoperability and standardisation. In preparing the manuscript we used a wide variety of materials, ranging from legislative acts and software repositories to standards specifications and academic literature. We have referenced these materials thoroughly wherever necessary, which has led to an unusually high number of cited references. Given the specificity of the topic, it is important to highlight that this paper is not to be seen as a traditional research output with the application of a clear methodology which yields results, but rather as a discussion paper which addresses a multitude of interdependent aspects ranging from technological enablers to legal frameworks.
Following this brief introduction, the remainder of the paper is structured as follows. Section 2 provides an overview of the legal and organisational context around geospatial data in Europe, with a primary focus on the INSPIRE Directive. This Section also provides a brief overview of the availability of spatial data from public sector sources in Europe. With the same European perspective in mind, we then outline the most prominent technological enablers for the evolution of SDIs in Section 3. The most substantial part of the work, where we present our scenario for the evolution of Europe's SDIs post-2020, is provided in Section 4. Finally, our concluding remarks and discussion points are included in Section 5. Structured in this manner, the article consecutively covers all four dimensions (legal, organisational, technological, semantic) defined by the European Interoperability Framework [4]. Given the predominantly technological scope of the article, we have paid particular attention to emerging ICT trends, which are therefore covered separately. Clearly, the emphasis in all five sections is on Europe, as we anchor our work within the EU legislative and organisational ecosystem. However, considerable portions of Section 3, but also parts of Section 4, would be relevant to a different geographic context with minor adjustments. In a broader context, we see Europe, due primarily to its complexity, as a miniaturised representation of the world. From this point of view, the work presented here is also important for developments at the global level.

Spatial Data Infrastructures
There is no single definition of the term Spatial Data Infrastructure. In accordance with the INSPIRE Directive [1], an SDI means 'metadata, spatial data sets and spatial data services, network services and technologies, agreements on sharing, access and use, and coordination and monitoring mechanisms, processes and procedures, established, operated or made available in accordance with the Directive'. Similarly, Bernard [5] defines SDIs as 'frameworks of policies, institutional arrangements, technologies, data, and people that enable the effective sharing and use of geographic information'. Both definitions cover the multiple facets that the term encompasses, ranging from (i) the legal and political setting, to (ii) the organisational aspects, and (iii) the technological enablers that make the sharing and use of geographic data possible. The notion of SDIs emerged more than 20 years ago, and the concepts around the topic have been evolving in response to technological and organisational developments. A comprehensive overview of the historic development of SDIs is provided by Schade et al. [6]. Despite this long history, it is important to emphasise that, from a temporal perspective, SDIs are neither to be seen as 'frozen', nor bound to a particular set of technologies or standards. Innovations are taken on board, sometimes more slowly than in mainstream ICT, in order to better address the requirements of users, thus ensuring that the infrastructures remain fit for purpose [7]. Within the context of this paper, it is important to also highlight that SDIs are not isolated from the rest of the data landscape, as the geospatial dimension is a powerful integrator for data which might only implicitly be related to geographic space [8]. With the adoption of the INSPIRE Directive [1], the EU embarked on an ambitious initiative for enabling access to and reuse of geospatial data and information across all levels of government and across borders.
INSPIRE quickly proved itself to be a de facto pioneer and innovator of the European Interoperability Framework [4] and the Digital Single Market priority initiative for building a European data economy benefiting the overall economy and society [9,10]. At the EU level the geospatial dimension, which is relevant to INSPIRE, is very important for the provision of multiple public services. This is reflected in the strong political commitment expressed in the 2017 Tallinn Ministerial Declaration on eGovernment [11], where EU Member States and European Free Trade Association (EFTA) countries pledged to adhere to the vision outlined in the European eGovernment Action Plan 2016–2020 [12]. Furthermore, the newly released European Data Strategy [3] establishes an ambitious agenda that aims to leverage the favourable technological and political context and empower EU citizens, businesses and public authorities through a data-agile approach which (i) is in line with European values, and (ii) reflects the needs of a multitude of actors.
Looking beyond the environmental domain, digital spatial data is an important resource for economic growth, competitiveness, innovation, job creation and societal progress (see e.g., [13][14][15]). The re-use of Public Sector Information (PSI) can contribute to the growth of the European economy and the development of Artificial Intelligence (AI), and can play an important role in addressing a multitude of societal challenges. The Open Data Directive [16], which revises the PSI Directive [17], entered into force in July 2019. The new Directive aims to address the issues detected during its 2018 evaluation that hamper the full exploitation of the potential of public sector information for the European economy and society. It also encourages access to and re-use of public and publicly funded data, and recognises INSPIRE as a good practice. In addition, the European Commission has defined a series of key principles to be considered so as to make data sharing a success for all parties involved, in Business-to-Business (B2B) and Business-to-Government (B2G) situations [18].
The Open Data Directive also makes explicit mention of the increasingly huge amount of digital data (including not only 'data' in the conventional sense but also the associated algorithms, tools and workflows) produced in research and holding a high potential for re-use for societal benefit. In line with the good data management practices expressed by the FAIR (Findability, Accessibility, Interoperability, Reusability) guiding principles [19], in 2016 the European Commission set up an Expert Group to investigate how the FAIR paradigm can be turned into reality [20]. The Group's mandate has recently ended with the release of a dedicated Report and Action Plan focused on the actions needed, in terms of research culture and technological infrastructure, to ensure that all digital outputs of research are made FAIR [21]. Achieving FAIR research data at the EU level would be a necessary condition to establish the European Open Science Cloud (EOSC), a trusted environment for sharing and analysing data from all publicly funded research that should become operational within the next few years [22].
In addition, the Open Data Directive introduces the concept of 'high-value datasets'. These are datasets holding the potential to (i) generate significant socio-economic or environmental benefits and innovative services, (ii) benefit a high number of users, in particular SMEs, (iii) assist in generating revenues, and (iv) be combined with other datasets. Given this, the Directive requires that such datasets are available free of charge, are provided via Application Programming Interfaces (APIs) and, where relevant, as a bulk download, and are machine-readable [16]. The Directive does not include the specific list of high-value datasets, which is expected in the future, but only their thematic categories, one of which is 'Geospatial'.
Further relevant legal and policy instruments (see e.g., [26]) include the future of the Common Agricultural Policy [27], in particular in relation to data sharing, and the EU space programmes Copernicus [28], Galileo [29] and EGNOS [30], in accordance with the proposed new regulation for the EU space programme.
Finally, the amount of available geospatial data is increasing on a daily basis thanks to the implementation of the legal instruments described above, not to mention the data generated by crowdsourced initiatives and technological drivers such as the Internet of Things (IoT). Clearly, the main impact on data reuse comes from the specific policies applied to each data source. INSPIRE supports open government principles and open data initiatives but does not specify a common data policy [1], which results in datasets published by EU Member States being available under a heterogeneous set of licenses (see Subsection 2.2.2). The General Data Protection Regulation (GDPR, [31]), which entered into force in May 2018, is also impacting the European geospatial data landscape. For example, it is not difficult to envisage cases where the combination of different geospatial datasets exposes the locations of individuals, meaning that the combined data must be treated as personal information. In addition, while the GDPR is conceptualised and put in place for good reasons, there might be cases where it is used as an 'excuse' for public sector authorities not to share their data.

Availability of INSPIRE Data
All the resources (datasets and corresponding metadata) shared by Member States and EFTA countries which fall under the scope of INSPIRE can be accessed through the INSPIRE Geoportal [32]. This represents the entry point to the whole INSPIRE infrastructure and allows users to search and discover datasets based on their metadata and then visualise or download them [33]. Datasets can be accessed using two different viewers: the first focused on so-called priority datasets (i.e., specific datasets used for environmental reporting [34]), the second focused on datasets belonging to each of INSPIRE's 34 cross-sectoral categories, named data themes [35]. An overview of the total number of datasets published by each EU Member State or EFTA country is also offered (see Figure 1). The three numbers associated with each country indicate, respectively: the number of datasets for which a metadata record exists, the number of datasets for which an INSPIRE View Service exists, and the number of datasets for which an INSPIRE Download Service exists. In line with a Service-Oriented Architecture (SOA) approach based on the Open Geospatial Consortium (OGC) standards for geospatial interoperability, these services make the datasets viewable and downloadable in a standardised manner.

New Data Sources
Back in 2005, before the launch of Google Maps, geospatial solutions formed a niche segment empowering a relatively low number of experts in the know. Similarly, before the emergence of popular citizen-driven or Volunteered Geographic Information (VGI, [36]) initiatives such as Wikimapia [37] and OpenStreetMap [38], geospatial data production and management were an almost exclusive prerogative of the public sector and its contractors. Not only did the involvement of companies such as Google bring geospatial technology to the masses but, today, data are increasingly being produced by citizens and enterprises [39]. In addition, the rise of the IoT and newly deployed satellite Earth Observation initiatives such as the EU's Copernicus are contributing to the exponential growth of the volumes of data available. The velocity and veracity of data sources have also intensified. This trend has led to the emergence of the well-known concept of big data [40,41]. Figure 2 shows the relative popularity of the terms 'big data' and 'IoT' that shape the technological bounds of SDIs. Within this dynamic context, the business model of the public sector, which is traditionally empowered through some kind of legal act to collect, maintain and distribute data on behalf of citizens, is under threat. The public sector perspective in response to the above-mentioned pressures argues that (i) governmental data is quality-assured and official and therefore the only trustworthy source, and (ii) the mandate and use of governmental data is required by law. That is certainly true in many cases; however, there is already early evidence (see e.g., [42][43][44][45]) that the private sector and/or citizens can outperform the public sector and produce better quality data faster, with a higher rate of update and for only a fraction of the price. This leaves governmental bodies in a situation where their exclusivity would only be protected by their own legal acts.
In the remainder of this section we focus on the novelties in terms of geospatial data generated by (i) the IoT, (ii) crowdsourcing initiatives, (iii) newly deployed Earth Observation satellite systems, and (iv) research. We provide a European perspective on those novelties whenever possible.

Internet of Things (IoT)
A growing variety of devices connected to the Internet are capable of producing increasing volumes of data [46]. Both the number of devices and the volumes of data they produce are only expected to grow. At the same time, the prices of hardware are dropping quickly, which provides unprecedented opportunities for densifying existing monitoring networks and collecting data with a precision and spatial resolution that were unthinkable only a decade ago. The rise of the IoT has a direct influence not only on the data landscape but also on the way in which geospatial analysis is performed in environmental informatics [47,48]. A discussion of the advantages and challenges related to the uptake of the IoT for geospatial applications is provided in [49].

Crowdsourced Geographic Information and Citizen Science
Geospatial data produced by citizens was termed Volunteered Geographic Information (VGI) in the seminal paper by Goodchild [36], after which a plethora of other terms have been created, overall summarised by the umbrella expression crowdsourced geographic information [50]. The most popular among such initiatives is OpenStreetMap (OSM), a crowdsourced vector database of geospatial objects with global extent, available under the open access Open Database License (ODbL) [51]. To date, OSM has attracted millions of data contributors and data users thanks to, among other things, the simplicity of its data model, based on only three different object types [52] complemented by a flat list of any number of key-value pairs [53]. This makes it extremely easy to query, access and consume OSM data in mainstream GIS software, as well as to build third-party applications on top of it. OSM is also a notable example of public sector adaptation to the current data landscape described in Subsection 3.1, since there are cases where public sector data is blended into OSM while keeping track of data provenance [54]. Nonetheless, such examples are still isolated, and the different possible business models and their practical implications are yet to be investigated [55]. As mentioned above, a huge body of literature exists on the quality of VGI, and OSM in particular [56], which attests that, despite the non-professional nature of contributors and the traditional absence of metadata, crowdsourced geospatial data can be used as a valid alternative, complement or update to governmental and private data sources.
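The simplicity of this data model can be conveyed in a few lines of code. The sketch below is our own illustration of the three object types and their flat tag lists; the class definitions and example tags are hypothetical and do not belong to any official OSM library.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# OSM's three object types: nodes (points), ways (ordered lists of
# node references) and relations (groups of other objects). Every
# object may carry a flat dictionary of free-form key-value tags.

@dataclass
class Node:
    id: int
    lat: float
    lon: float
    tags: Dict[str, str] = field(default_factory=dict)

@dataclass
class Way:
    id: int
    node_refs: List[int]                 # ordered references to Node ids
    tags: Dict[str, str] = field(default_factory=dict)

@dataclass
class Relation:
    id: int
    members: List[Tuple[str, int, str]]  # (type, id, role) triples
    tags: Dict[str, str] = field(default_factory=dict)

# A building footprint: four corner nodes joined into a closed way,
# described entirely by tags rather than a rigid, theme-specific schema.
corners = [Node(i, 45.0 + i * 1e-5, 9.0 + i * 1e-5) for i in range(1, 5)]
footprint = Way(100, [n.id for n in corners] + [corners[0].id],
                tags={"building": "yes", "name": "Town Hall"})
site = Relation(200, [("way", footprint.id, "outer")],
                tags={"type": "multipolygon"})

def is_closed(way: Way) -> bool:
    """A way whose first and last node coincide encloses an area."""
    return way.node_refs[0] == way.node_refs[-1]

print(is_closed(footprint), footprint.tags["building"])  # True yes
```

Any thematic nuance, from building type to opening hours, is expressed through additional tags rather than new object types, which is precisely what keeps the barrier to entry so low.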
Crowdsourced geographic information also features a strong intersection with Citizen Science (CS), a term describing citizens' participation in scientific activities [57]. In contrast to OSM, where contributors do not necessarily focus on a particular thematic area, in CS the objectives relate to answering a particular research question. An example is the monitoring of biological invasions [58], where (i) citizens are engaged in the collection of raw data on invasive alien species through a dedicated smartphone app, (ii) the data is validated and fused with other sources such as research findings and authoritative datasets, and (iii) the citizen-generated data is used for policy-making.

Data from Earth Observation Platforms
Satellite remote sensing has a longstanding tradition in providing raw observation data in support of many geospatial applications. In a domain currently characterised by the presence of big industry players such as Maxar and Planet, some recent developments have occurred which are revolutionising the way in which we observe our environment. Among those (i) affordable small satellites [59], (ii) low-cost unmanned aerial vehicles (UAV, [60]) and (iii) the European Copernicus Sentinel missions [28] play a prominent role.
Low-cost small satellites provide opportunities for the mass deployment and establishment of dense Earth Observation constellations at a fraction of the price of traditional (military, government and private) systems, while UAV platforms have established themselves as valuable data sources for a completely new wave of disciplines and applications.
Copernicus data products are by definition free of charge and provided under an open license [61]. Copernicus is still under development, but it is nonetheless already changing well-established processes, for example within the context of the EU's Common Agricultural Policy, where huge amounts of spatio-temporal observation data are streamlining and optimising the monitoring of agricultural subsidies. When fully operational, Copernicus will generate more than 25 PB of data per year, making it the largest environmental satellite system in history [62].

Private Sector Data
Private companies in multiple domains and at multiple scales, ranging from small and medium enterprises to global corporations, nowadays hold significant volumes of data. Collected, stored and used sometimes without the awareness of those who contributed it, data in the private realm are an asset of critical importance for the success of a growing number of commercial endeavours. Examples of privately held data are heterogeneous, and it is beyond the scope of this paper to provide a comprehensive overview.
One of many possible examples illustrating the huge potential of privately-owned geospatial data is the data collected by mobile phone operators [63]. Figure 3 shows the density of cellular network subscribers assigned to a particular cell. Even if not equivalent to population, such data is of relatively high spatial resolution. More importantly, however, the temporal resolution of the data is very high. In the particular example provided in Figure 3, new data are available every 15 minutes. Compared to population censuses, which are held every 10 years and require substantial organisation and resources, this is an exceptionally high update rate. Clearly, mobile phone data is not a substitute for census results, but interesting opportunities arise if public and private data are combined, considering the advantages and disadvantages of both sources. However, when considering commercial data as a source of information for the public good, the emerging issues relating to privacy and ethics must be considered [64]. The reuse of private-sector data is at present difficult, as companies in most cases contribute it on a voluntary basis. Clearly, considering that data are an important asset of companies, they are very often not willing to share them with the rest of the world. In the absence of a clear regulatory framework requiring the private sector across multiple domains to contribute their data, societal benefits from their possible reuse unfortunately remain limited. Additionally, if private data are to be shared, specific measures for their anonymisation must be put in place, while ensuring that the companies which contribute them retain their competitiveness. In this complex setting, it is still to be understood how to best reuse private sector data in a way that is, to the maximum extent possible, beneficial for all stakeholders.

Open Research Data
In parallel to the legislative framework and the related initiatives discussed in Subsection 2.2.1, increased attention has been placed on making research digital objects FAIR. The main output of this has been the creation of multiple FAIR repositories, either specific to particular disciplines or institutions, or global and general-purpose ones; the latter include, for example, FigShare, Dataverse, Zenodo and DataHub, which welcome a huge variety of object types and do not impose strict restrictions on their metadata. The open science approach, based on the FAIR principles and the underlying values of openness, transparency, reliability and collaboration, has stimulated a growing interest among the scientific community in the concept of research reproducibility. Evidence of this includes the increasing number of journals providing open access publication options, as well as journals specifically focused on making data and/or software discoverable, accessible and citable, for example MDPI's Data [65], Elsevier's Data in Brief [66] and Springer's Open Geospatial Data, Software and Standards [67]. As a notable example in the geospatial domain, AGILE (the Association of Geographic Information Laboratories in Europe) has recently published the Reproducible Paper Guidelines [68], which require a section on data and software availability in all papers submitted to the AGILE conferences.

New Architectures
The emergence of new technologies, alternative data sources and increased user demand has led to the establishment of completely new architectures that provide flexible and scalable solutions for accessing and consuming data. The IoT, described in Subsection 3.1.1, even if fragmented from an infrastructural point of view, contributes enormous volumes of spatio-temporal data. In the following subsections we present and discuss several disruptive approaches that are already redefining the way in which geospatial data is shared and used. It is very difficult, if at all possible, to provide a synthesised view of all such architectural approaches. The common denominator between them is their data-centrism, that is, the view that data is an asset and that all technologies and approaches should ensure that access to it is as efficient and easy as possible.

Simple APIs
Application Programming Interfaces (APIs) provide an opportunity for developers to easily create value-added products. APIs can hide the complexity of upstream infrastructures and offer a set of well-defined and documented methods for data utilisation and processing across various components. The service interfaces in traditional SDIs (e.g., WMS, WFS, WCS and SOS) are well known and supported by client applications. Depending on how we define the term 'API', they can be considered as data-access APIs providing standardised access to geospatial data. Modern web-based APIs go one step further as they (i) provide a simple approach to data processing and management functionalities, (ii) possibly offer different encodings of the payload, (iii) can easily be integrated into different tools, and (iv) can facilitate the discovery of data through mainstream search engines such as Google and Bing. The adoption of a RESTful architecture further simplifies the access to the functionality offered by the API while minimising the bandwidth usage.
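As a minimal illustration of this style of API, the sketch below composes a RESTful request for features and parses a GeoJSON payload of the kind such APIs return. The base URL, collection name and attribute are hypothetical placeholders, not a real service, and the response is mocked rather than fetched over the network.

```python
import json
from urllib.parse import urlencode

# Hypothetical RESTful features endpoint; base URL is a placeholder.
BASE = "https://example.org/ogcapi"

def items_url(collection: str, bbox=None, limit=10) -> str:
    """Compose a /collections/{id}/items request as plain HTTP GET
    parameters: no XML envelope, bookmarkable, and crawlable by
    mainstream search engines."""
    params = {"limit": limit, "f": "json"}
    if bbox:
        params["bbox"] = ",".join(str(c) for c in bbox)
    return f"{BASE}/collections/{collection}/items?{urlencode(params)}"

url = items_url("addresses", bbox=(9.0, 45.0, 9.1, 45.1), limit=5)

# The payload is ordinary GeoJSON, directly consumable by mainstream
# clients; here we parse a minimal mocked response instead of a server.
response = json.loads("""{
  "type": "FeatureCollection",
  "features": [{"type": "Feature",
                "geometry": {"type": "Point", "coordinates": [9.05, 45.05]},
                "properties": {"street": "Via Roma"}}],
  "links": [{"rel": "next",
             "href": "https://example.org/ogcapi/collections/addresses/items?offset=5"}]
}""")
streets = [f["properties"]["street"] for f in response["features"]]
print(streets)  # ['Via Roma']
```

Note how paging is expressed through hypermedia links in the payload itself, so a client can follow the 'next' link without any out-of-band knowledge of the server.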
Two recent developments by the OGC, namely 'OGC API - Features' [69] and the SensorThings API [70], provide standardised APIs for modern access to spatial and observation data. However, multiple questions arise on how to implement such APIs, alone or in combination with existing services, for example how to introduce APIs while leveraging the investments already made in service interfaces such as WFS and WMS. Finally, the frequently used OpenAPI specification [71] allows APIs to be documented in a vendor-independent, portable and open manner, and also fully integrates a testing client within the API documentation.
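To give a flavour of the second standard, the SensorThings API exposes its entity model (Things, Datastreams, Observations) through OData-style query options, so that related entities can be retrieved in a single GET. The sketch below assembles such a request; the service root is a hypothetical placeholder, not a real deployment.

```python
from urllib.parse import quote

# Hypothetical SensorThings API v1.1 service root (placeholder URL).
ROOT = "https://example.org/sta/v1.1"

def latest_observations(thing_id: int, top: int = 10) -> str:
    """Navigate Thing -> Datastreams -> Observations with OData-style
    query options ($expand, $orderby, $top) in one request, instead of
    issuing separate calls per entity as in classic SOA services."""
    opts = (f"$expand=Datastreams($expand=Observations("
            f"$orderby=phenomenonTime desc;$top={top}))")
    return f"{ROOT}/Things({thing_id})?{quote(opts, safe='$=;()')}"

print(latest_observations(42, top=3))
```

A single round trip thus returns a Thing together with its datastreams and their most recent observations, which matters for the chatty, high-frequency workloads typical of the IoT.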

Data Streaming
The rise of the IoT has led to the emergence and spread of the data streaming phenomenon. A stream provides a sequence of digitally-encoded signals with a certain frequency and payload that are transmitted and/or received. With data streaming there is often no need to store the streamed data persistently; for example, the data might only make sense under certain circumstances and when put in the right context. Such an approach is in strong contrast with the traditional architecture of an SDI, where web services for data access are built on top of a data storage that is infrequently (when compared to the IoT) updated.
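This pattern can be sketched as follows: readings are aggregated on the fly within a fixed-size window and then discarded, so nothing is persisted beyond the current window. The simulated feed and function names are our own illustration; in practice the stream would come from, for example, an MQTT or WebSocket subscription.

```python
from collections import deque
from statistics import mean

def moving_average(stream, window=4):
    """Process an unbounded observation stream with a fixed-size
    buffer: each reading contributes to an aggregate and is then
    dropped, in contrast to the store-then-serve model of an SDI."""
    buf = deque(maxlen=window)       # old readings fall out automatically
    for value in stream:
        buf.append(value)
        if len(buf) == window:
            yield round(mean(buf), 2)

# Simulated sensor feed (e.g., air temperature in degrees Celsius).
readings = iter([20.0, 20.4, 20.8, 20.4, 26.0, 26.4])
print(list(moving_average(readings)))  # [20.4, 21.9, 23.4]
```

The memory footprint is bounded by the window size regardless of how long the stream runs, which is exactly what makes such processing viable on constrained IoT infrastructure.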

Eventing and Asynchronous Data Transactions
The exchange of data in traditional SDIs is handled in a service-oriented architecture following a request/response pattern, that is, through data polling. From a technological point of view this approach is mature and well supported by client applications. However, data polling leads to the generation of excessive traffic, and is not necessarily well-suited for data-intensive use cases, or when data is needed only as the result of the occurrence of a particular event, for example when a threshold value is reached or when new data is made available. That is why Rieke et al. [72] recommend the establishment of event-driven SDIs. From our perspective, this should be achieved in an evolutionary manner, that is, by complementing, not substituting, existing approaches. In this way users would have a choice and could use the solution tailored to their particular needs. Such an approach is feasible from a technological point of view, as the emergence of cloud-based solutions can address user demand in a flexible and scalable manner.
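The publish/subscribe pattern underlying such event-driven approaches can be sketched as follows. The class name and the flood-level scenario are our own illustration, not part of any specific standard: subscribers register a callback once and generate no traffic at all until the event of interest occurs.

```python
from typing import Callable, List

class ThresholdNotifier:
    """Push-based alternative to polling: interested parties are
    notified only when a measurement crosses a threshold, instead of
    repeatedly requesting the latest value."""

    def __init__(self, threshold: float):
        self.threshold = threshold
        self.subscribers: List[Callable[[float], None]] = []

    def subscribe(self, callback: Callable[[float], None]) -> None:
        self.subscribers.append(callback)

    def publish(self, value: float) -> None:
        # No messages are sent unless the event condition is met.
        if value > self.threshold:
            for notify in self.subscribers:
                notify(value)

alerts = []
river_gauge = ThresholdNotifier(threshold=4.5)   # flood level, metres
river_gauge.subscribe(lambda v: alerts.append(f"water level {v} m"))

for reading in [3.9, 4.2, 4.8, 5.1]:             # incoming observations
    river_gauge.publish(reading)

print(alerts)  # ['water level 4.8 m', 'water level 5.1 m']
```

Four observations arrive but only two notifications are delivered, illustrating how eventing eliminates the excessive traffic that polling the same endpoint four times would have produced.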

Edge, Fog and Cloud Computing
The increased diversity of architectural approaches that we are facing is inevitably impacting SDIs. Whereas in the past data was processed and made available on some sort of centralised server, this now comprises only one of several frequently used architectural approaches, described in detail in [49]. Depending on the concrete use case, computing can also take place (i) at the network edge (e.g., on sensor devices), (ii) in the fog (e.g., on network gateways), or (iii) in the cloud. This novelty has led to the emergence of data spaces of varying complexity that occur at multiple levels, such as the local or the regional. While the majority of data today is processed in the cloud (80% according to a recent study by the International Data Corporation [73]), it is expected that the rapid growth of the IoT will reverse this, and the majority of processing will therefore take place close to the network edge. Today, network latency poses a limitation on the uptake of edge and fog computing, which will be overcome by the fifth generation wireless technology for digital cellular networks (5G). Despite the different architectural settings of edge, fog and cloud computing, a shared characteristic of the three approaches is the coupling of data and algorithms.
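A simple way to see why processing close to the edge pays off is data reduction: raw high-frequency readings are summarised on the device and only the summaries cross the bandwidth- and latency-limited network to the cloud. The sketch below is purely illustrative; all names are our own and do not refer to a real edge framework.

```python
from statistics import mean

def edge_summarise(raw, batch=60):
    """Edge-side pre-processing: collapse each batch of high-frequency
    readings into one summary record, so only a fraction of the raw
    volume is ever transmitted to the cloud."""
    for i in range(0, len(raw) - len(raw) % batch, batch):
        chunk = raw[i:i + batch]
        yield {"min": min(chunk), "max": max(chunk),
               "mean": round(mean(chunk), 2), "n": len(chunk)}

# One simulated hour of 1 Hz sensor readings: 3600 raw values become
# 60 cloud-bound records, a 60-fold reduction in network traffic.
raw = [20 + (i % 10) for i in range(3600)]
summaries = list(edge_summarise(raw, batch=60))
print(len(raw), "->", len(summaries))  # 3600 -> 60
```

Which summaries are computed, and at which tier (edge, fog or cloud), is exactly the kind of deployment decision that depends on the concrete use case discussed above.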

Grassroots Standardisation
In addition to the technological trends described above, a new kind of standardisation initiative is emerging. Historically, standardisation bodies often approached the studied phenomena holistically, in their attempt to represent multiple, sometimes very complex, phenomena. This approach slowed down standardisation, resulted in the adoption of overly complex standards, which in turn sometimes led to very few implementations, and posed a burden on both users and data providers. Standardisation for the needs of SDIs inherited such complexity from its multiple standardisation building blocks. INSPIRE is no exception, and that is why it is considered by some as too complex. Recently, understanding the limitations introduced by this top-down approach, but also taking advantage of the availability of tools for the co-creation of content such as wikis and git, a new breed of standards is rapidly being established. In contrast to the above-mentioned approach, these standards are co-created online by multiple actors in an agile and iterative manner. Capturing all possible use cases is no longer the primary objective of the standardisation activity. On the contrary, tailoring the developments to the needs of users, striving for simplicity and lowering the entry barrier are the primary objectives. Good examples of such a lightweight approach are the 'OGC API - Features' and SensorThings API standards [69,70]. Both standards have huge potential for modernising SDIs and are already considered as possible INSPIRE Download Services [74,75].

Growing Ecosystem of Software Tools
From our perspective, the increased availability and diversity of data sources goes hand in hand with a growing versatility of software tools (e.g., open source, proprietary, mixed, standalone, bundled). This in turn is leading to an increased number of different approaches for processing, analysing and visualising data. Clearly, having more options at their disposal empowers data users, and is a precondition for the reuse of existing data. Moreover, from a European perspective aligned with the FAIR principles and centred around the concepts defined in the European Data Strategy [3], we assume that open source software is well suited to utilising the different data sources described in Section 3.1.

Europe's SDI Post-2020
It is appropriate at this stage, before analysing the possibilities for the evolution of Europe's SDI with a 10-year horizon in mind, to ask ourselves whether it is at all feasible to continue with such developments. Are Spatial Data Infrastructures still needed, or does the contemporary socio-economic and technological context make them obsolete? The debate around this question, even though this manuscript focuses on the European context, is relevant to the development of SDIs at the national level, as well as for the Global Earth Observation System of Systems (GEOSS).
Firstly, the exponential growth of data, combined with the increased heterogeneity of sources, only reinforces the need for a sound data infrastructure that meets the requirements of users. At the same time, the rich ecosystem of tools and technologies that can be used together with different data confronts users with too many choices. The sole fact that there are more data and better tools than ever before does not inherently mean that they are ready for use. The discoverability, accessibility and harmonisation of data [19] are therefore more important than ever. Secondly, data, even in abundant quantities, make very little sense in the absence of contextualisation that defines their access and reuse conditions. A standardised and simple licensing framework is therefore needed, and contemporary SDIs provide the legislative foundation for the standardisation of licenses. Thirdly, the investment already made in the establishment of SDIs, even if very difficult to quantify, is substantial. It would be costly and cumbersome to discontinue the developments already made. What has been developed so far should instead be seen as an asset, and clever ways of retaining and improving those assets should be devised. Lastly, a community of data providers and early adopters already exists around the topic of SDIs, which we see as an important precondition for sustaining and gradually improving the availability and usability of the data assets that are made available.

From Data Infrastructures to Data Spaces
Even if all the technological aspects mentioned above are properly addressed and access to data is ensured, the risk remains that the resulting infrastructure is too provider-centric. The assumption, characteristic of contemporary SDI development, that once data are exposed online through interoperable services all the benefits will automatically follow does not hold. Two aspects should be addressed in order to avoid a provider-centric development with limited uptake by users. Firstly, all actors should become first-class citizens who participate in the co-design and co-creation of technological solutions. This means that users and providers, but also those in between such as data integrators, should be considered in all stages related to the creation, management and use of the data. Such an inclusive approach would, of course, include an infrastructure component, but would in addition reflect the actual needs of stakeholders in an inclusive and sustainable, yet flexible manner. Secondly, it is worth considering that the notion of 'spatial' is not special anymore (see Section 3.1 for an explanation of why). Therefore, SDIs as we know them should, whenever possible, benefit from approaches and technologies characteristic of mainstream IT instead of inventing their own siloed approaches. Only by resolving those two high-level challenges can we ensure that SDIs dissolve into the emerging data spaces, defined by the European Commission as 'a seamless digital area with the scale that will enable the development of new products and services based on data' [18].
In what follows in this section, we provide our views on the future of Europe's data landscape as a transition from existing Spatial Data Infrastructures into data spaces. From our perspective, the only viable way forward in this respect is an evolutionary approach in which novelties are incremental to already existing developments. If a geospatial data space is to be established with sustainability in mind, that is, planning with a horizon of approximately a decade, we need to consider technological developments most of which did not exist when INSPIRE was conceptualised around 2007. In such an exercise, given the pace at which technology is evolving, we are surely going to miss certain developments. That is why, in order to avoid speculation, we have no choice but to root our considerations in a particular set of existing technologies. As a starting point, Figure 4 shows a hypothetical architecture of a data space that considers the technological enablers covered in Section 3. In preparing it we deliberately reused the original INSPIRE architecture, which is also described in [8]. The figure, even though it captures most of the technological trends covered in Section 3, could easily become obsolete with the emergence of a new technological trend. With this in mind, our ambition in compiling it is not to be comprehensive, but rather to illustrate the interplay between different interconnected actors within an increasingly rich ecosystem of technologies and approaches.
Below, we focus on the principles that should be adopted to ensure a future-proof evolution of contemporary SDIs into data spaces. We cover both organisational and technological principles, thus following the logic defined by the European Interoperability Framework [4]. While the former will mostly impact those involved in the design and conceptualisation of data spaces, the latter will address their technical implementation and related stakeholders. However, none of these principles exists on its own; they are highly interdependent and, to a certain extent, overlap with one another.

Organisational Principles
From a governance perspective, it is clear already that there is no single approach capable of fitting the high diversity of existing data-related practices. Here we propose an organisational approach that is flexible, open and inclusive, thus providing the opportunity to better address the data-related challenges and requirements of heterogeneous stakeholders outlined in the previous sections.

Co-Design by Default
The development of Europe's SDIs, and the INSPIRE infrastructure in particular, pioneered mechanisms for the co-creation and co-design of consensus-based technical arrangements. However, this approach (i) largely represented the perspective of those who participated in the drafting, that is, data providers, and (ii) took too long for the arrangements to be endorsed and adopted. As a consequence, other stakeholders such as software vendors and users were not fully included in the process. Nowadays, mainstream tools such as GitHub and Bitbucket provide excellent opportunities for building on top of the already established partnerships and for speeding up the process of co-creation and endorsement of technical specifications. In this context, the overall approach to SDIs should remain neither top-down nor bottom-up, but shared across multiple levels.
In addition, the design of data spaces should embrace in advance the fact that technologies are changing rapidly and will continue to do so, that is, new disruptive technologies will soon appear, while others will quickly become obsolete. That is why a rigid process that ties all developments to a particular standard or technology makes very little sense. Instead, an agile approach should be endorsed which respects the overall principles and concepts but can easily be tailored to the needs of different stakeholders in a flexible and inclusive manner.

Simple Licensing Framework
If the licensing conditions for a dataset are unknown, or too difficult to understand, for example due to a customised license, users are very likely to avoid using the content and look for substitutes. That is why it is essential that well-established and easily understandable licenses, such as those from Creative Commons [76], become the norm. Apart from choosing the right licenses, a comprehensive licensing framework should be developed and implemented in practice.

Technical and Semantic Principles
Whatever technical decisions are made, it is critical to ensure that datasets are made available and are easily reusable together with other sources with as little prior knowledge as possible. In addition, duplication of data between different sources should ideally be avoided.

Data Structures Based on Spatial Features
The harmonised data models, including those defined in INSPIRE, represent a consensus-based perspective on how the world should be modelled. They contain valuable domain expertise and are an asset that should be preserved and tailored to newly emerging trends. Despite new technological developments, the need for structured data with clearly defined semantics across multiple domains remains [6]. A simple yet extensible data structure based on INSPIRE semantics [77] and features (spatial objects), instead of the 34 INSPIRE data themes, would be well supported by client applications and libraries. In addition, having the spatial feature as the basic building block would unambiguously address the outstanding question of whether data belong to INSPIRE or not. Such a structural approach would certainly be far from capturing all possible use cases in a particular domain; however, in many cases it would satisfy the needs of most users. In addition, a simple model based on features can easily be transformed with ready-to-use tools into more sophisticated data models. Such an approach would therefore not go against the rich semantics created during the conceptualisation of INSPIRE.
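A minimal sketch of such a feature-based structure could look as follows. The class name, the registry URI and the attribute values are hypothetical; the point is that a stable identifier, a geometry, a feature type drawn from a shared vocabulary and an open set of properties are enough to remain both simple and extensible.

```python
from dataclasses import dataclass, field

@dataclass
class SpatialFeature:
    """A hypothetical minimal spatial object: id, type, geometry, attributes."""
    id: str
    feature_type: str          # e.g., a spatial object type from an INSPIRE theme
    geometry: dict             # GeoJSON-style geometry
    properties: dict = field(default_factory=dict)  # open key-value attributes

    def to_geojson(self) -> dict:
        # Serialise to a plain GeoJSON Feature; the feature type travels as an
        # ordinary property, so generic clients need no theme-specific schema.
        return {
            "type": "Feature",
            "id": self.id,
            "geometry": self.geometry,
            "properties": {"featureType": self.feature_type, **self.properties},
        }

# Illustrative instance; the identifier and URI are not real registry entries.
building = SpatialFeature(
    id="bu.building.42",
    feature_type="http://inspire.ec.europa.eu/featureconcept/Building",
    geometry={"type": "Point", "coordinates": [14.5, 46.0]},
    properties={"currentUse": "residential"},
)
```

Because the structure is flat, it can later be mapped onto richer, theme-specific models by transformation tools without any loss of the original semantics carried in the feature type reference.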
A good starting point for implementing the approach described above would be to extract the spatial object types defined in the Unified Modelling Language (UML) models for each INSPIRE theme [78]. The UML models reflect varying levels of abstraction, combined with different levels of maturity. That is why the exclusion of objects that are not instantiable will be challenging. Nonetheless, we believe that the change of approach from data themes to features would still provide a good starting point.
An existing solution that is to a large extent similar to what is described above is used by OSM. Its native data structure is based on simple geometries (nodes, ways and relations) and key-value pairs (tags) [53]. A similar path is followed by the widely used Darwin Core Archive (DwC-A) biodiversity informatics data standard [79]. DwC-A describes species occurrences and sampling campaigns based on a flat data structure and a set of terms that are made available through Uniform Resource Identifiers (URIs). Both initiatives document their encoding approach on wiki pages through clear examples and, in the case of OSM, pictures of spatial objects. The latter would, for example, be a nice addition to the currently existing INSPIRE Registry.
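How close the OSM model is to a flat feature structure can be sketched with a simple mapping. The node below is an illustrative, simplified representation (real OSM elements carry additional metadata such as version and changeset), and the tag values are made up.

```python
# A simplified OSM-style node: a point geometry plus free-form key-value tags.
# The id, coordinates and tags are illustrative, not real OSM data.
osm_node = {
    "id": 240109189,
    "lat": 52.517,
    "lon": 13.389,
    "tags": {"amenity": "drinking_water", "name": "Brunnen"},
}

def osm_node_to_feature(node: dict) -> dict:
    """Map an OSM-style node onto a flat GeoJSON Feature: the tags become
    properties unchanged, so no theme-specific schema is required."""
    return {
        "type": "Feature",
        "id": f"node/{node['id']}",
        "geometry": {"type": "Point", "coordinates": [node["lon"], node["lat"]]},
        "properties": dict(node["tags"]),
    }

feature = osm_node_to_feature(osm_node)
```

The same pattern applies to DwC-A records: a flat set of term-value pairs translates directly into feature properties, which is exactly what makes such structures easy to combine with other sources.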

Encoding-Agnostic Approach
Enforcing a particular data encoding, such as the Geography Markup Language (GML) [80] or GeoJSON [81], has advantages that relate to (i) increased interoperability, and (ii) anchoring SDI developments to a particular international standard (e.g., from the OGC or ISO). However, this approach has serious drawbacks, as (i) a single encoding can easily become obsolete with the rapid technological evolution, (ii) the choice for users is limited, and (iii) issues arise if the chosen encoding is not well supported by client applications.
The notion of dataset distributions as defined in the Data Catalog Vocabulary (DCAT) [82] might be endorsed in order to provide different representations of a dataset, thus increasing the flexibility in making the data available to stakeholders with different needs. In accordance with the DCAT specification [82], through distributions a dataset may be made available in multiple serialisations that differ in various ways, including natural language, media type or format, schematic organisation, temporal and spatial resolution, level of detail, or profiles.
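The idea can be sketched as a DCAT dataset record with two distributions that differ only in media type, expressed here as a JSON-LD-style dictionary. The URIs, titles and endpoints are hypothetical.

```python
# One dataset, several distributions differing only in media type.
# All URIs and titles below are illustrative.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.eu/dataset/hydrography",
    "@type": "dcat:Dataset",
    "dct:title": "Hydrography - watercourses",
    "dcat:distribution": [
        {
            "@type": "dcat:Distribution",
            "dcat:mediaType": "application/geo+json",
            "dcat:accessURL": "https://example.eu/collections/watercourses/items",
        },
        {
            "@type": "dcat:Distribution",
            "dcat:mediaType": "application/gml+xml",
            "dcat:accessURL": "https://example.eu/download/watercourses.gml",
        },
    ],
}

# A client picks the serialisation it supports; the dataset-level metadata
# stays untouched when new encodings are added or old ones retired.
supported = {"application/geo+json"}
chosen = [d for d in dataset["dcat:distribution"] if d["dcat:mediaType"] in supported]
```

Adding a future encoding then amounts to appending one more distribution entry, which is what makes the approach encoding-agnostic in practice.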

Mainstreaming IT-Related Developments
As already outlined, it is important to maintain what is special about spatial data while embracing mainstream IT principles. With regard to access to data, the use of APIs offers many opportunities for providing easy access to data and enlarging the user base beyond the typical geospatial expert. Not only would APIs simplify access to data, but they would also help to better understand user demand. In addition, the use of fine-grained, loosely coupled services, together with lightweight protocols, would help improve the usability of geospatial data. With the emergence of containerisation technology such as Docker, data infrastructure components are no longer strongly coupled with a particular architectural setting. Instead, microservices, data and processing can easily be migrated from physical servers to virtual (including cloud-based) ones.
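A fine-grained, loosely coupled service of this kind can be sketched with nothing but standard library building blocks: a tiny read-only endpoint serving a feature collection over HTTP, consumed by a generic client. The path, feature and attribute values are illustrative, and a production microservice would of course use a proper web framework behind a container runtime.

```python
import http.server
import json
import threading
import urllib.request

# Illustrative dataset held by the service; identifiers and names are made up.
FEATURES = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "id": "hy.watercourse.1",
        "geometry": {"type": "LineString",
                     "coordinates": [[4.3, 50.8], [4.4, 50.9]]},
        "properties": {"name": "Senne"},
    }],
}

class ItemsHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # A single, narrowly scoped resource: the items of one collection.
        if self.path == "/collections/watercourses/items":
            body = json.dumps(FEATURES).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/geo+json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the example output quiet

# Bind to an ephemeral port and serve from a background thread.
server = http.server.HTTPServer(("127.0.0.1", 0), ItemsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/collections/watercourses/items"
with urllib.request.urlopen(url) as response:
    collection = json.loads(response.read())
server.shutdown()
```

Because the service exposes one resource over plain HTTP and JSON, it can be packaged into a container image and moved between physical, virtual or cloud hosts without any change to its clients.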

Conclusions
The ecosystem of tools and technologies for producing data is mature, and the creation of data is easier than ever. In addition, the time, cost and effort associated with the collection of new data is constantly decreasing [48]. This rapidly creates new demand and new expectations (e.g., drastically increased requirements for use cases such as self-driving cars and AI). Within this setting, the political and legislative context in Europe, combined with the pace of technological developments and the diversification of data sources, creates excellent conditions for the evolution of SDIs and their fusion into data spaces that are open to mainstream IT developments and are fit for the needs of all stakeholders. Considering the pace of contemporary technological developments, it is very difficult, if at all possible, to predict what technologies will look like 10 years from now. However, from our perspective, the envisaged agile approach to the establishment of geospatial data spaces would ensure that data, as well as software tools, are desired by users and therefore remain fit for purpose.
In describing the envisaged process of establishing European data spaces we covered several interrelated aspects. Section 2 provided an overview of the contemporary technological and legislative landscape that defines the bounding conditions for the evolution of contemporary SDIs. Section 3 then outlined the main technological enablers for that evolution. Finally, in Section 4 we provided our perspective on the main organisational and technical principles for the fusion of SDIs into data spaces at multiple levels of governance, ranging from the local to the European.
As already outlined, in writing this paper we adopted a purely technological perspective. A wider view is needed that looks deeper into emerging data spaces. Below, we outline several outstanding issues which fall outside the scope of this paper but require further attention.
• Being able to understand the requirements of users in terms of content, data encodings, semantics and granularity is critically important. Harmonising and making available data that is of little interest to users, while not focusing on those datasets that are highly desired, makes little sense. A mantra often heard at geospatial conferences is to 'follow the user'; however, a concrete, pragmatic approach for doing so is still missing. Within the emerging data spaces we see an opportunity for a more prominent role of users, but also of data intermediaries that help bridge the gap between the provision and use of the data. In addition, the rise of AI, if combined with data mined from web access logs and other sources, might provide valuable insights for resolving this issue. Once the data are used, it is essential that the concrete feedback of users is fed back into the data space, thus improving it and ensuring its sustainability.
• The governance of data in an increasingly complex setting is challenging. How to make the most out of scattered resources while retaining sovereignty from big software platforms remains to be addressed. Whatever the possible solution, the public sector has an important role to play. In addition, the organisational approaches for the establishment of operational data sharing arrangements, reflecting appropriate incentives that would guarantee stakeholder participation, are to be investigated. Numerous data-related novelties are successful at the urban and regional levels, yet how to scale and spread data-driven innovation is still not fully understood. While there is evidence of the success of data-related developments at the urban level, the higher the territorial level, the weaker the evidence for success, as is the sustainability of good practices.
• Regulatory aspects that would ensure that all involved actors, such as the private sector and citizens, can contribute to, but also benefit from, the emerging data spaces should be clarified.
• Finally, it is important to highlight that multiple data spaces can occur and co-exist at different territorial levels, which in turn would require the elaboration of organisational and technological approaches for coupling the different data spaces in a loose and flexible manner. Well-documented APIs can play a central role in this setting.
Author Contributions: Alexander Kotsev worked on all sections of the paper, helped build the overall story line and coordinated the input with all co-authors. Marco Minghini wrote substantial parts of the text on emerging technological trends, and provided input to most of the remaining sections. Robert Tomas scoped the approach for structuring data based on features, and contributed to most sections of the paper. Vlado Cetl and Michael Lutz provided the parts related to the European Union context, political agenda and the state-of-the-art of the INSPIRE implementation. All co-authors contributed to the section dedicated to the vision for Europe's SDI post-2020, and the discussion and conclusions. All authors have read and agreed to the published version of the manuscript.