Modeling Data Sovereignty in Public Cloud—A Comparison of Existing Solutions

Galij, Stanisław; Pawlak, Grzegorz; Grzyb, Sławomir

doi:10.3390/app142310803


by Stanisław Galij ¹, Grzegorz Pawlak ¹,* and Sławomir Grzyb ²

¹ Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland
² Faculty of Electrical Engineering, West Pomeranian University of Technology in Szczecin, 70-310 Szczecin, Poland
* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(23), 10803; https://doi.org/10.3390/app142310803

Submission received: 27 September 2024 / Revised: 11 November 2024 / Accepted: 15 November 2024 / Published: 21 November 2024

(This article belongs to the Section Computing and Artificial Intelligence)


Abstract

Data sovereignty has emerged as a critical concern for enterprises, cloud service providers (hyperscalers), end-users, and policymakers at both national and international levels. In response, cloud-based distributed computing models have been proposed as frameworks to enforce data sovereignty requirements. This study aims to evaluate and enhance data sovereignty practices within public cloud environments. Through a comprehensive literature review, we analyze existing reference architectures and solutions that address data sovereignty, identifying the technological and economic constraints they impose, such as increased computational costs associated with specific frameworks and cryptographic measures. To address these challenges, we propose an abstract data sovereignty model designed to aid system designers and architects in developing compliant cloud-based systems. Additionally, we conduct computational experiments assessing the performance of the IDS connector, a key data sovereignty tool, deployed on the Google Cloud Platform and Microsoft Azure. Results reveal that while the geographic location of the software significantly impacts performance, the choice of cloud platform minimally influences the IDS connector’s efficiency. These findings offer insights into optimizing data sovereignty strategies for cloud solutions, with implications for future system design and policy development.

Keywords: public cloud; data sovereignty; data spaces; cloud computing; data governance; IDS connector; Dataspace Connector; Google Cloud Platform; Microsoft Azure

1. Introduction

Data sovereignty within public cloud infrastructure is a complex, multi-layered issue that requires thorough analysis and an understanding of its diverse components. This study builds on a systematic literature review to establish a foundational understanding for developing a data sovereignty model specific to public cloud environments. By examining reference architectures proposed by non-profit organizations, we aim to highlight the state of current frameworks and identify gaps in existing approaches. Additionally, we report on computational experiments designed to test a specific data sovereignty solution in practice.

Digital sovereignty, encompassing both data sovereignty and technological sovereignty, refers to the ability of organizations to control their digital assets independently. Data sovereignty, as a component of digital sovereignty, specifically addresses the control an organization maintains over its data, particularly in the context of cross-jurisdictional cloud services. Key drivers include industry-specific regulations, such as the Digital Operational Resilience Act (DORA) for the EU’s financial sector, as well as broader international data protection frameworks like the General Data Protection Regulation (GDPR). Beyond regulatory compliance, organizations increasingly recognize the necessity of safeguarding their digital assets to maintain autonomy and security in an interconnected digital landscape.

Achieving data sovereignty in public cloud environments requires the implementation of specific sovereignty controls, ranging from contractual obligations to enhanced monitoring of data operations. Although monitoring alone cannot secure data, it can provide accountability by tracking data usage and penalizing misuse. Effective data sovereignty measures are grounded in robust identity and access management (IAM) systems, which enforce sovereignty through secure authentication processes [1]. Cryptographic solutions, such as bring your own encryption (BYOE), also play a crucial role in enabling secure and compliant data handling in public cloud contexts.

Several approaches have been proposed to address data sovereignty within public cloud environments. Sovereign cloud offerings from providers such as Atos exemplify solutions that provide varied levels of sovereignty based on customer needs, spanning from privacy-shielding technologies for lower-security requirements to fully isolated cloud instances for sensitive data [2]. Other solutions, such as data spaces [3], enable standardized data ecosystems that facilitate secure data sharing across multiple parties, promoting data sovereignty through controlled access and exchange. Emerging Web 3.0 technologies also show promise, offering decentralized identity management and secure data exchange, which align with data sovereignty objectives [1].

Public cloud providers have developed managed services that extend their cloud platforms to on-premises environments, thereby broadening data sovereignty options. Notable examples include Google Distributed Cloud Edge, Amazon AWS Outposts, and Microsoft Azure Stack, each enabling organizations to integrate public cloud capabilities with their local infrastructure for enhanced control and sovereignty.

The systematic literature review conducted for this study reveals a notable gap in the current body of knowledge: existing models are often too closely tied to specific technologies, lacking a generalizable framework that describes the functional layers of data sovereignty applicable across various cloud platforms. To address this gap, we propose an abstract data sovereignty model that organizes the essential layers and functional components independently of technology, making it adaptable for a range of sovereign solutions within public cloud environments or extended to edge computing.

Beyond theoretical modeling, we designed and executed computational experiments to validate the practical application of this framework. Specifically, we tested the performance of the International Data Spaces Association (IDSA) data sovereignty solution across different cloud platforms, providing empirical evidence to support our model’s applicability and evaluating its effectiveness in real-world scenarios.

2. Literature Review

As a foundation for this study, a systematic literature review was conducted using a Web of Science query with the keywords “Data Sovereignty” AND “Public Cloud,” which returned 11 items. Additionally, we included a 2023 Systematization of Knowledge (SoK) paper on data sovereignty from the IEEE Symposium on Security and Privacy [2] and information from the International Data Spaces Association [3]. The references were also expanded based on the reviewer’s recommendations: [4,5,6,7,8].

The literature review reveals three primary perspectives on data sovereignty within the public cloud context: (1) legal aspects, particularly international regulations; (2) strategies for managing public and government data; and (3) technical solutions, ranging from experimental to mature production-ready tools.

(1) Legal frameworks and their limitations:

The legal dimension of data sovereignty in the public cloud spans various jurisdictions, each with distinct regulatory approaches. Recent studies highlight several challenges:

−: International regulations and data transfers:
Studies focusing on China [9] illustrate regional complexities by comparing overseas data transfer laws in China, the EU, and the USA. However, these frameworks are limited by jurisdictional constraints, creating challenges for multinational organizations that require seamless data access across borders. The lack of global standardization can result in compliance risks and enforcement difficulties.
−: Regional data localization laws:
For instance, the EU’s ongoing efforts to govern data through projects aimed at cloud federation and standardization [10] underscore regional attempts to manage sovereignty. Yet, these efforts reveal gaps, as they may inadvertently restrict cross-border data flow, complicating data accessibility in globally distributed environments. Similarly, data localization laws in Russia [11] enforce strict national boundaries but face criticism for impeding economic exchange.
−: Historical context of privacy and sovereignty laws:
The Data Privacy Matrix (DPM) developed by New Zealand researchers [12] categorizes data privacy laws across major cloud-hosting nations. This resource reflects the inconsistency in data privacy and sovereignty laws globally, which can lead to conflicting legal obligations for cloud service providers and their clients. High-profile events like Edward Snowden’s 2013 revelations about the PRISM program have led to increased scrutiny of US-based cloud providers, raising sovereignty concerns for data stored within the USA.

The limitations in these legal frameworks demonstrate the need for cohesive, enforceable international standards that address the complexities of data sovereignty. Existing models are insufficient for managing the nuances of data jurisdiction in cloud environments, particularly as they often lack clear provisions for cross-border data governance.

(2) Governance frameworks and public data management:

Research on governance frameworks reveals various national approaches to managing public and government data in the cloud:

−: Government cloud strategies and sovereignty:
In South Africa, studies explore government cloud migration readiness [13]. Challenges arise when attempting to balance sovereignty with efficiency, as on-premises systems are often outdated but shifting to the cloud risks foreign data control. Other studies [14] analyze the government data strategies of the USA, UK, and Australia, where regulations like the USA PATRIOT Act permit extensive access to data by authorities. This raises significant concerns over data control and sovereignty when using foreign cloud providers.
−: Cloud sovereignty projects in the EU:
European countries, through initiatives like Gaia-X [15], have launched sovereign cloud projects aimed at creating an ecosystem where data can be securely shared. However, these projects face obstacles in balancing openness and control, and the infrastructure is still in early stages, lacking a comprehensive governance model that supports consistent data sovereignty practices across borders.

These studies reveal that governance strategies are often reactive, addressing data sovereignty concerns only after challenges emerge. This reactive approach limits the development of proactive, flexible governance models that can adapt to changing cloud environments and multinational data-sharing needs.

(3) Technical frameworks and practical tools for data sovereignty:

The technical dimension of data sovereignty involves practical solutions to safeguard data in public cloud environments:

−: Data encryption and access control solutions:
Recent studies introduce proxy re-encryption technologies, such as ID-based proxy re-encryption (IBPRE) [16], designed to streamline data sharing without exposing decrypted information. This technology provides a layer of sovereignty but is constrained by computational requirements and the need for additional key management infrastructure.
−: Data partitioning and redundancy:
Systems like ARGUS [17], which partition encrypted data across multiple storage providers, add redundancy and security layers to cloud storage. However, while ARGUS enhances security, it does not fully address concerns over the control of data by third-party providers, nor does it provide robust geolocation assurances, which are critical in certain data sovereignty contexts.
−: Blockchain-based governance for cloud resources:
Platforms like CloudAgora [18] leverage the blockchain to create a distributed marketplace for cloud resources, supporting user control and transparency. However, the reliance on the blockchain brings scalability challenges and may not be practical for environments with high data volumes and rapid access needs.
−: Systematization of Knowledge (SoK) in data sovereignty:
The comprehensive SoK article [1] categorizes data sovereignty mechanisms into layers like decentralized identity, access control, and policy-compliant computation, offering a conceptual foundation for Web 3.0 applications. Although promising, the interoperability limitations in Web 3.0 solutions, such as Decentralized Identifier (DID) systems, highlight gaps that limit their immediate application in large-scale public cloud settings.

These studies indicate that while technological advancements offer promising solutions, the tools are often experimental, resource intensive, and not yet suited for widespread deployment in public cloud environments. Further research is needed to enhance scalability and efficiency, providing stronger guarantees for data sovereignty in a global context.

The reviewed literature underscores significant gaps in existing frameworks for data sovereignty in the cloud. Legal frameworks lack consistent international standards, governance strategies are fragmented and reactive, and technical solutions remain limited in scalability and practicality. Current models are insufficient to address the unique challenges presented by data sovereignty in public cloud environments, where data jurisdiction issues intersect with complex legal, governance, and technical demands.

This study focuses on addressing gaps within technical frameworks by proposing an abstract, layered model for data sovereignty in public cloud environments.

3. Materials and Methods

In this study, a comparison of reference architectures was conducted using expert judgment and analytical modeling methods.

This paper selects and describes three mature, comprehensive solutions that are “production-ready” and already in use across multiple use cases.

As a result, an abstract data sovereignty model was developed. Subsequently, relevant experiments were designed to assess the performance of the Dataspace Connector software when hosted on two public cloud providers: Google Cloud Platform and Microsoft Azure. The present research represents the initial phase of this experimental study, with subsequent phases planned to incorporate the Amazon Web Services platform and Edge hosting.

The performance tests of the Dataspace Connector software (version 8.0.2) were executed using the Python script provided in Appendix A. For a detailed description of the testing environment, refer to Section 4.3, and consult Appendix A for specifics of the testing script. The resulting data were collected as CSV text files (output from the testing script) and stored in a GitHub repository for future reference. The results of experiments from this paper can be used to select the most efficient hosting option for the IDS connector. The research also serves as proof of concept that the Dataspace Connector can be hosted on Google Cloud Platform and Microsoft Azure cloud platforms.

4. Models and Computational Experiments

In this section, we present a comparison of three reference architectures, followed by the introduction of an abstract data sovereignty model. Finally, we describe the experiments we conducted.

4.1. Reference Architecture Comparison and Description

During our research, we selected three non-profit organizations that develop reference architecture models for services supporting data sovereignty. These organizations are as follows:

−: International Data Spaces Association
−: Gaia-X
−: FIWARE

Each of these organizations has a slightly different goal:

−: IDSA creates international standards for data spaces;
−: Gaia-X is focused on data infrastructure for Europe;
−: FIWARE creates a framework for smart solutions.

The above organizations have been selected for comparison because they contribute significantly to research and development in the fields of data spaces and data sovereignty.

4.1.1. International Data Spaces (IDSA)

The Reference Architecture Model (RAM) [19] developed by IDSA, currently in version 4.2 (released about one year before the time of writing), describes the general concepts and functionality needed to create a secure network of trusted data. It takes a technology-agnostic approach and describes components from three perspectives (security, certification, and governance) and within five layers (business, functional, informational, process, and system).

Figure 1 shows the interaction of the functional components. The main component of the architecture, which provides access to data spaces, is the IDS connector; it is used to provide or consume data endpoints. Data and metadata assets are exchanged between connectors. Applications from the App Store can be downloaded and executed on connectors. After successful contract negotiation, using the Clearing House service, the data can be provided by the data provider's connector and processed by an application executed on the data consumer's connector.

4.1.2. Gaia-X

By contrast, the Gaia-X organization [20] is building a federated, secure data infrastructure for Europe, based on the Trust Framework described in its architecture document [21].

The Gaia-X organization describes its reference architecture (Figure 2) using three planes: trust, management, and user. The ecosystem consists of federated services that allow data exchange between providers and consumers. Trust is ensured by verifiable credentials, which are part of the self-descriptions published by participants. Self-descriptions attached to service offerings can also contain usage policies, which are described in a machine-readable form such as ODRL. Usage policies consist of rules that restrict data usage after access to data assets has been granted. This concept is shared with the IDSA reference architecture.
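
To make the notion of a machine-readable usage policy more concrete, the following minimal Python sketch expresses an ODRL-style policy as a dictionary and checks a single time-limited permission. The policy and asset identifiers, the cut-off date, and the `is_permitted` helper are hypothetical and serve only as an illustration; the right operand is simplified to a plain ISO 8601 string, and a real deployment would rely on a full ODRL engine and on the policy classes defined by Gaia-X or IDS.

```python
from datetime import datetime, timezone

# Hypothetical ODRL-style policy: the asset may be used until the end of 2025.
# Identifiers and dates are illustrative assumptions, not taken from Gaia-X.
policy = {
    "@context": "http://www.w3.org/ns/odrl.jsonld",
    "@type": "Set",
    "uid": "https://example.org/policy/1",
    "permission": [{
        "target": "https://example.org/asset/42",
        "action": "use",
        "constraint": [{
            "leftOperand": "dateTime",
            "operator": "lt",
            "rightOperand": "2026-01-01T00:00:00+00:00",
        }],
    }],
}


def is_permitted(policy, target, action, now):
    """Return True if some permission covers (target, action) at time `now`.

    Only the `dateTime lt` constraint is understood; any other constraint
    makes the permission fail closed.
    """
    for perm in policy.get("permission", []):
        if perm["target"] != target or perm["action"] != action:
            continue
        satisfied = True
        for c in perm.get("constraint", []):
            if c["leftOperand"] == "dateTime" and c["operator"] == "lt":
                limit = datetime.fromisoformat(c["rightOperand"])
                satisfied = satisfied and now < limit
            else:
                satisfied = False  # unknown constraint: deny
        if satisfied:
            return True
    return False


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(is_permitted(policy, "https://example.org/asset/42", "use", now))
```

The check is evaluated after access has been granted, which mirrors the idea of usage control (as opposed to access control) described above.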

4.1.3. FIWARE

The third reference architecture (Figure 3) was designed by FIWARE, an organization focused on building smart solutions that gather data from many different sources, such as IoT devices or third-party systems like weather services. FIWARE is developing a curated framework of open-source platform components to accelerate the development of smart solutions [22,23].

Hundreds of use cases have already been implemented using the FIWARE framework in domains such as industry, agrifood, and cities. The central component is the context broker, which exposes the NGSI API. This standard API allows users to read or modify data, for example, from IoT devices such as sensors, using standard HTTP methods. The underlying concept is that smart solutions communicate via a standard API using a shared data space.
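
As an illustration of reading and modifying context data with standard HTTP methods, the sketch below builds (without sending) two requests against NGSI-v2-style paths. The broker address, the sensor identifier, and the helper names `read_entity` and `update_attribute` are assumptions made for this example; an actual deployment may use NGSI-LD paths and different entity types.

```python
import json
import urllib.request

# Assumed address of a context broker; 1026 is the usual default port of the
# Orion Context Broker, but any deployment may differ.
BROKER = "http://localhost:1026"


def read_entity(entity_id):
    """Build (but do not send) an NGSI-v2 request that reads one entity."""
    return urllib.request.Request(f"{BROKER}/v2/entities/{entity_id}", method="GET")


def update_attribute(entity_id, name, value, attr_type="Number"):
    """Build (but do not send) an NGSI-v2 request that updates one attribute."""
    body = json.dumps({name: {"value": value, "type": attr_type}}).encode("utf-8")
    return urllib.request.Request(
        f"{BROKER}/v2/entities/{entity_id}/attrs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


# Hypothetical sensor identifier used only for illustration.
sensor = "urn:ngsi-ld:TemperatureSensor:001"
get_req = read_entity(sensor)
patch_req = update_attribute(sensor, "temperature", 21.5)
print(get_req.get_method(), get_req.full_url)
print(patch_req.get_method(), patch_req.full_url, patch_req.data.decode())
```

Passing either request object to `urllib.request.urlopen` would send it to a running broker; the sketch stops short of that so it can be run without any infrastructure.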

Additionally, IDSA and the FIWARE Foundation cooperate on implementing the IDS reference architecture with FIWARE technologies, with the FIWARE Context Broker component placed inside the IDS connector.

4.2. Data Sovereignty Model

In this section, we present a data sovereignty model (Figure 4) generalized from various reference architectures and implementations, e.g., the three architectures presented above as well as the model presented in [1]. Our contribution is the analysis and generalization of these models: we group common concepts and divide functionality into corresponding layers. The data sovereignty model is divided into four logical layers: hardware, platform, data protocols, and distributed applications. These distinct layers group the technologies and functions that enable data sovereignty.

The lowest layer comprises heterogeneous hardware components, including servers and devices of various sizes and physical locations. This foundational layer provides essential computational, storage, and network resources that support all upper layers. The second layer, known as the trusted infrastructure, is designed to implement trusted platform components that enforce policies, manage certificates, and handle other security measures. It establishes a secure foundation for the data exchange and application layers by integrating robust security features. The subsequent layer encompasses data exchange protocols, such as ODRL [24], NGSI-LD [25], and IDS usage control language, which enable standardized data exchange. This layer integrates international standards and common data models that are interoperable across diverse solutions and systems. Finally, the top layer consists of application frameworks, such as Hadoop, Spark, and Flink, which support distributed processing with data sovereignty enforcement. This layer represents the highest level of the stack, where the complete business logic of applications is implemented. Connectors span the top three layers, as they implement functionalities that belong to the platform, data protocols, and application layers.

Connectors are a critical component of the data sovereignty model, as they integrate the top three layers and host essential security and privacy features. An example of an implementation of the abstract connector component is the IDS connector. In the following section, we discuss computational experiments conducted to evaluate the performance of the connector.
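
The layered structure can also be written down as a small data model, which may help when mapping concrete technologies onto the layers. The following Python sketch is only one possible encoding: the `Layer` and `Component` types, the helper functions, and the assignment of example components to layers are illustrative assumptions based on the description above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Layer(IntEnum):
    """The four logical layers of the model, ordered bottom to top."""
    HARDWARE = 1
    PLATFORM = 2  # trusted infrastructure
    DATA_PROTOCOLS = 3
    DISTRIBUTED_APPLICATIONS = 4


@dataclass(frozen=True)
class Component:
    name: str
    layers: frozenset  # the Layer values this component implements


# Example components named in the text; the mapping is illustrative.
COMPONENTS = [
    Component("server", frozenset({Layer.HARDWARE})),
    Component("certificate management", frozenset({Layer.PLATFORM})),
    Component("ODRL", frozenset({Layer.DATA_PROTOCOLS})),
    Component("NGSI-LD", frozenset({Layer.DATA_PROTOCOLS})),
    Component("Spark", frozenset({Layer.DISTRIBUTED_APPLICATIONS})),
    Component(
        "IDS connector",
        frozenset({Layer.PLATFORM, Layer.DATA_PROTOCOLS, Layer.DISTRIBUTED_APPLICATIONS}),
    ),
]


def components_in(layer):
    """Names of all components that implement functionality in `layer`."""
    return [c.name for c in COMPONENTS if layer in c.layers]


def cross_layer_components():
    """Names of components that span more than one layer (e.g., connectors)."""
    return [c.name for c in COMPONENTS if len(c.layers) > 1]


for layer in Layer:
    print(layer.name, "->", components_in(layer))
print("cross-layer:", cross_layer_components())
```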

4.3. Experiments

The purpose of the experiment was to understand the impact of running Dataspace Connector software (version 8.0.2) on various public cloud platforms hosted in different geographic locations.

As discovered during the systematic literature review, designing and implementing data sovereignty in public cloud environments requires additional software layers, which increase computational complexity and cause additional performance impacts. This paper presents research results related to the performance of accessing data spaces designed for data sharing and exchange. We decided to perform an experiment comparing the performance of the Dataspace Connector.

The Dataspace Connector is an open-source implementation of the IDS connector [26]. A comprehensive list of IDS connector implementations is published monthly by the IDSA organization and is available on their website [27]. Currently, this list includes 38 implementations. The Dataspace Connector was selected for experimentation due to its high maturity level and open-source licensing. In the subsequent phase of experiments, we plan to include additional implementations of IDS connector components.

In this phase, we hosted the Dataspace Connector in two different public cloud environments: the Google Cloud Platform and Microsoft Azure. We selected two Dataspace Connector API endpoints (out of a total of 187) that create or retrieve catalog entities:

GET /api/catalogs - Get a list of base resources with pagination.
POST /api/catalogs - Create a base resource.

All available Dataspace Connector API endpoints are listed in Appendix B.

The Dataspace Connector was run in a Docker runtime. HTTPS requests were sent to the endpoints hosted on the virtual servers, and the response times were measured.

We used three virtual machines with similar configurations, hosted in European and US regions.

The detailed configuration was as follows:

VM 1
○ Platform: GCP Compute Engine
○ VM size: 2 vCPU, 1 core, 8 GB memory
○ Operating System: Debian GNU/Linux 12 (bookworm)
○ Region: Europe Central 2 (Warsaw, Poland)

VM 2
○ Platform: GCP Compute Engine
○ VM size: 2 vCPU, 1 core, 8 GB memory
○ Operating System: Debian GNU/Linux 12 (bookworm)
○ Region: US Central 1 (Council Bluffs, Iowa)

VM 3
○ Platform: Azure Virtual Machine
○ VM size: 2 vCPU, 1 core, 8 GB memory
○ Operating System: Debian 11 “Bullseye”, x64 Gen2
○ Region: Europe, Poland Central (Warsaw)

Test results are stored in a GitHub repository [28]. Table 1 presents the average response times for GET and POST HTTPS requests sent to endpoints remotely and locally (from the same virtual machine). Tests have been repeated using local and public IPs.

To create catalog items, we used POST https://endpoint/api/catalogs requests, and to read catalog items, we used GET https://endpoint/api/catalogs requests with an initial catalog population of 100 catalog entries.

Figure 5 visualizes the response times of individual HTTP GET requests for the three virtual machines.

Figure 6 visualizes the response times of individual HTTP POST requests for the three virtual machines, and Figure 7 visualizes a statistical analysis of the response times.
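
For readers who want to reproduce this kind of analysis from the collected response times, the sketch below computes basic descriptive statistics per series. It is an assumption about what such an analysis might include, since the exact statistics shown in Figure 7 are not specified here; the series names and sample values are placeholders and are not measurements from this study.

```python
import statistics


def summarize(samples_ms):
    """Return basic descriptive statistics for response times in milliseconds."""
    ordered = sorted(samples_ms)
    return {
        "n": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "mean": statistics.mean(ordered),
        "max": ordered[-1],
        # Sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }


# Placeholder samples for two hypothetical series; NOT measured values.
series = {
    "vm-a GET": [100.0, 110.0, 90.0, 105.0, 95.0],
    "vm-b GET": [700.0, 690.0, 720.0, 710.0, 680.0],
}
for name, samples in series.items():
    s = summarize(samples)
    print(f"{name}: median={s['median']:.1f} ms, mean={s['mean']:.1f} ms, stdev={s['stdev']:.1f} ms")
```

Reporting the median alongside the mean is a deliberate choice here: network measurements in public clouds often contain occasional outliers, and the median is less sensitive to them.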

5. Discussion

Modeling data sovereignty architecture in public clouds is a complex task that involves analyzing existing models, taking into account available technologies, and identifying gaps in the abstract models and existing technologies. In this research, the technologies have been categorized into functional layers, which facilitates comparison and supports architects when designing new systems that enforce data sovereignty requirements. During the research, it was discovered that quite mature models already exist, many of which have been implemented as open-source software as well as commercial solutions. Data sovereignty solutions are often based on cryptographic methods to provide confidentiality, data exchange protocols to define data usage policies, and secure platforms to host the required infrastructure. The three reference architectures described in this research, designed by three organizations, have slightly different goals. However, they reuse common components, implemented as open-source software, and these organizations cooperate to ensure interoperability between different data spaces.

The model developed in this research builds upon the framework presented in [1]. It is more generic, as it does not rely on any specific technology, and offers greater granularity. Unlike the previous model, which compares complete solutions, this research focuses on the individual software components. The experiments conducted in this research aimed to prove that accessing the IDS data space is cloud agnostic, i.e., running gateway software (Dataspace Connector v. 8.0.2) is independent of cloud hosting infrastructure (e.g., Google Cloud Platform or Microsoft Azure), and the performance of the solution is acceptable across all tested deployments. Observations indicate that POST HTTP requests, in general, were faster than GET requests, likely due to the internal implementation specifics of the Dataspace Connector. Similar response times were measured for all virtual machines (VMs) when tests were run locally, averaging around 100 ms. When using a public IP address for local tests, the response times were slightly higher, by approximately 10 ms. Among the cloud platforms tested, Azure had the lowest response times for GET requests, while the Google Cloud Platform (GCP) had the lowest response times for POST requests. The longest response time was observed for a GCP VM hosted in the United States, at approximately 700 ms.

The experimental results are consistent with expectations. However, further work is needed to extend the experiments to a third major public cloud provider, Amazon Web Services.

6. Conclusions and Future Research Directions

The abstract data sovereignty model introduced in this paper aids in designing components that support data spaces by following a layered approach, which separates the purposes of different independent software building blocks. The experiments confirmed that the most significant factor affecting the performance of the Dataspace Connector is the geographical location of the virtual server (see Figures 5 and 6), rather than the cloud platform hosting it. Additionally, only minimal differences in the performance of GET and POST HTTPS requests were measured.

The experiments conducted in this research are limited to one implementation of the IDS connector, two HTTP endpoints, and two cloud service platforms. Future research steps could involve running the Dataspace Connector on edge devices, utilizing public cloud serverless managed services (e.g., GCP Cloud Run or Azure Container Apps), and extending the test environment to include Amazon Web Services (AWS). Additional API endpoints that perform actual data access could also be included in the tests, allowing the time required for policy enforcement to be measured. We observed that the main impact on connector performance was related to geographical location, so future research could extend the list of locations. Since cloud providers and edge devices may have different network connections, it is important to compare different connector hosting options.

The comprehensive set of experiments conducted on three major hyperscale cloud service providers, along with edge hosting options and testing several implementations of the connector service, will provide a more thorough understanding of the performance of data sovereignty solutions in the public cloud. It should be noted that conducting experiments in public cloud environments introduces additional complexity, which may impact results—particularly when measuring response times—due to factors such as network variability. To mitigate this risk and ensure reliable results, it is recommended to repeat tests at different times and locations. This approach also highlights a challenge for future experiments, as cloud providers host data centers in diverse geographical locations. However, future experiments can be designed to remotely initiate test scripts across various combinations of target virtual servers and test-triggering locations, enabling more robust and representative data collection.
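
One way to plan such a campaign is to enumerate every combination of target server, test-triggering location, time slot, and HTTP method up front. The Python sketch below does this with `itertools.product`; the target labels loosely follow the regions used in this study, while the trigger locations, time slots, and the `build_test_matrix` helper are hypothetical.

```python
from itertools import product

# Hypothetical labels; only the target platforms and regions come from the study.
TARGETS = ["gcp-europe-central2", "gcp-us-central1", "azure-poland-central"]
TRIGGER_LOCATIONS = ["client-eu", "client-us"]
TIME_SLOTS = ["morning", "evening"]
METHODS = ["GET", "POST"]


def build_test_matrix(targets, triggers, slots, methods):
    """Return every combination of target, trigger location, time slot and method."""
    return [
        {"target": t, "trigger": g, "slot": s, "method": m}
        for t, g, s, m in product(targets, triggers, slots, methods)
    ]


matrix = build_test_matrix(TARGETS, TRIGGER_LOCATIONS, TIME_SLOTS, METHODS)
print(len(matrix), "test runs planned")
print(matrix[0])
```

Each entry of the matrix could then be handed to a remote runner that starts the test script at the given location and time, so that the collected response times cover all combinations evenly.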

This study has two practical implications for organizations deciding how to implement data sovereignty solutions in public cloud environments. First, production-ready, off-the-shelf solutions are available, such as those developed by IDSA, FIWARE, and Gaia-X. Second, the abstract data sovereignty model introduced in this paper provides a framework for constructing customized solutions, allowing organizations to build systems from distinct components assembled within defined functional layers. As discussed in the literature review, different data sovereignty requirements necessitate distinct sovereignty solutions, which may range from privacy-shielding services on public cloud platforms to fully isolated private cloud solutions. Generally, practitioners and policymakers involved in data governance decisions should begin with an in-depth analysis of their specific requirements. Based on this analysis, they can then assess whether any of the existing solutions presented in this paper meets all criteria, or whether a completely new system must be designed using the abstract data sovereignty model outlined in Section 4.

Author Contributions

Conceptualization, S.G. (Stanisław Galij) and G.P.; methodology, G.P.; software, S.G. (Stanisław Galij); validation, G.P. and S.G. (Sławomir Grzyb); formal analysis, G.P.; investigation, S.G. (Stanisław Galij); resources, S.G. (Stanisław Galij); data curation, S.G. (Stanisław Galij); writing (original draft preparation), S.G. (Stanisław Galij); writing (review and editing), S.G. (Stanisław Galij), G.P. and S.G. (Sławomir Grzyb); visualization, S.G. (Stanisław Galij); supervision, G.P. and S.G. (Sławomir Grzyb); project administration, G.P.; funding acquisition, S.G. (Stanisław Galij). All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by MEiN Program ‘Implementation Doctorate’, agreement No. (PP) RU00023881 dated: 30.09.2022 and statutory funds of Poznan University of Technology and West Pomeranian University of Technology in Szczecin.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

This appendix lists the Python script used for performance testing of the Dataspace Connector software:

import datetime
import os
import statistics
import sys

import requests

# Connector endpoints under test; <IP-address> is a placeholder.
gcp_eu_url = "https://<IP-address>:80/api/catalogs"
gcp_us_url = "https://<IP-address>:80/api/catalogs"
azure_url = "https://<IP-address>:80/api/catalogs"
local_url = "https://<IP-address>:8080/api/catalogs"

# Silence stderr so that the warnings raised by verify=False do not clutter the output.
sys.stderr = open(os.devnull, "w")

basic = requests.auth.HTTPBasicAuth("***", "***")
payload = {"title": "Test-catalog", "description": "IDS test catalog"}
results = []  # response times in milliseconds
for i in range(10):
    x = datetime.datetime.now()

    # GET
    response = requests.get(azure_url, verify=False, auth=basic)

    # POST
    # response = requests.post(gcp_us_url, verify=False, auth=basic, json=payload)

    y = datetime.datetime.now()
    # total_seconds() covers the whole interval; .microseconds alone drops whole seconds.
    elapsed_ms = (y - x).total_seconds() * 1000
    print(i, "- response time: ", elapsed_ms, "[ms]")
    results.append(elapsed_ms)

# Print average test time [ms]:
print("Average time: ", statistics.mean(results), "[ms]")
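
The listing above times each request with wall-clock timestamps and reports only the mean. A minimal variant, sketched here on the assumption that a monotonic clock and a measure of spread are also of interest, wraps the measurement in a reusable function. The names `measure` and `summarize` and the stand-in workload are illustrative and not part of the original experiment; in practice the callable would wrap `requests.get` or `requests.post` against the connector URL.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure(call: Callable[[], object], repeats: int = 10) -> List[float]:
    """Run `call` `repeats` times and return each duration in milliseconds."""
    durations_ms = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, unaffected by clock adjustments
        call()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


def summarize(durations_ms: List[float]) -> Dict[str, float]:
    """Return mean, median, and sample standard deviation in milliseconds."""
    return {
        "mean": statistics.mean(durations_ms),
        "median": statistics.median(durations_ms),
        "stdev": statistics.stdev(durations_ms) if len(durations_ms) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Stand-in workload (hypothetical): sleeps 5 ms instead of calling the connector.
    samples = measure(lambda: time.sleep(0.005), repeats=5)
    print(summarize(samples))
```

Reporting the median alongside the mean may help when a single slow request, for example the first one that opens the connection, would otherwise dominate a short series.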

Appendix B

The appendix contains the Dataspace Connector API specifications. For more details, refer to the full OpenAPI specification available in the GitHub repository [29].

Catalogs—Endpoints for operations on catalogs.

GET /api/catalogs/{id} Get a base resource by id.

PUT /api/catalogs/{id} Update a base resource by id.

DELETE /api/catalogs/{id} Delete a base resource by id.

GET /api/catalogs/{id}/offers Get all children of a base resource with pagination.

PUT /api/catalogs/{id}/offers Replace the children of a base resource.

POST /api/catalogs/{id}/offers Add a list of children to a base resource.

DELETE /api/catalogs/{id}/offers Remove a list of children from a base resource.

GET /api/catalogs Get a list of base resources with pagination.

POST /api/catalogs Create a base resource.
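
As an illustration of how the catalog endpoints above are addressed, the following sketch prepares, without sending, a POST and a GET request against `/api/catalogs` using only the Python standard library. The helper name `build_request`, the host `connector.example`, and the credentials are hypothetical; the payload fields are the ones used by the test script in Appendix A, and the use of basic authentication follows that script as well.

```python
import base64
import json
import urllib.request
from typing import Optional


def build_request(base_url: str, method: str, path: str,
                  user: str, password: str,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Prepare (but do not send) an HTTP request with basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {"Authorization": f"Basic {token}"}
    data = None
    if body is not None:
        # JSON-encode the body and declare its content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


if __name__ == "__main__":
    base = "https://connector.example:8080"  # hypothetical host
    create = build_request(base, "POST", "/api/catalogs", "user", "secret",
                           {"title": "Test-catalog", "description": "IDS test catalog"})
    listing = build_request(base, "GET", "/api/catalogs", "user", "secret")
    print(create.get_method(), create.full_url)
    print(listing.get_method(), listing.full_url)
    # urllib.request.urlopen(create) would send the request to a running connector.
```

Keeping request construction separate from sending makes it possible to inspect the method, URL, and body before any traffic reaches the connector.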

Subscriptions—Endpoints for operations on subscriptions.

GET /api/subscriptions/{id} Get a base resource by id.

PUT /api/subscriptions/{id} Update a base resource by id.

DELETE /api/subscriptions/{id} Delete a base resource by id.

GET /api/subscriptions Get a list of base resources with pagination.

POST /api/subscriptions Create a base resource.

GET /api/subscriptions/owning Get all subscriptions owned by this connector.

Representations—Endpoints for operations on representations.

GET /api/representations/{id} Get a base resource by id.

PUT /api/representations/{id} Update a base resource by id.

DELETE /api/representations/{id} Delete a base resource by id.

GET /api/representations/{id}/requests Get all children of a base resource with pagination.

PUT /api/representations/{id}/requests Replace the children of a base resource.

POST /api/representations/{id}/requests Add a list of children to a base resource.

DELETE /api/representations/{id}/requests Remove a list of children from a base resource.

GET /api/representations/{id}/offers Get all children of a base resource with pagination.

PUT /api/representations/{id}/offers Replace the children of a base resource.

POST /api/representations/{id}/offers Add a list of children to a base resource.

DELETE /api/representations/{id}/offers Remove a list of children from a base resource.

GET /api/representations/{id}/artifacts Get all children of a base resource with pagination.

PUT /api/representations/{id}/artifacts Replace the children of a base resource.

POST /api/representations/{id}/artifacts Add a list of children to a base resource.

DELETE /api/representations/{id}/artifacts Remove a list of children from a base resource.

GET /api/representations Get a list of base resources with pagination.

POST /api/representations Create a base resource.

GET /api/representations/{id}/subscriptions Get all children of a base resource with pagination.

Apps—Endpoints for app handling.

PUT /api/apps/{id}/actions Actions on apps.

GET /api/apps Get a list of base resources with pagination.

GET /api/apps/{id} Get a base resource by id.

DELETE /api/apps/{id} Delete a base resource by id.

GET /api/apps/{id}/endpoints Get all children of a base resource with pagination.

GET /api/apps/{id}/appstore Get appstore by app id.

Daps—Endpoints for operations on daps.

GET /api/daps/{id} Get a base resource by id.

PUT /api/daps/{id} Update a base resource by id.

DELETE /api/daps/{id} Delete a base resource by id.

GET /api/daps Get a list of base resources with pagination.

POST /api/daps Create a base resource.

Artifacts—Endpoints for operations on artifacts.

GET /api/artifacts/{id} Get a base resource by id.

PUT /api/artifacts/{id} Update a base resource by id.

DELETE /api/artifacts/{id} Delete a base resource by id.

GET /api/artifacts/{id}/representations Get all children of a base resource with pagination.

PUT /api/artifacts/{id}/representations Replace the children of a base resource.

POST /api/artifacts/{id}/representations Add a list of children to a base resource.

DELETE /api/artifacts/{id}/representations Remove a list of children from a base resource.

PUT /api/artifacts/{id}/data

POST /api/artifacts/{id}/data Get data by artifact id with query input.

GET /api/artifacts Get a list of base resources with pagination.

POST /api/artifacts Create a base resource.

GET /api/artifacts/{id}/subscriptions Get all children of a base resource with pagination.

GET /api/artifacts/{id}/route Get route associated with artifact by id.

GET /api/artifacts/{id}/data/** Get data by artifact id with query input.

GET /api/artifacts/{id}/agreements Get all children of a base resource with pagination.

_Messaging—Endpoints for invoke sending messages.

PUT /api/notify Notify all subscribers.

POST /api/ids/unsubscribe Send an IDS request message for unsubscribe from an element.

POST /api/ids/subscribe Send an IDS request message for subscribing to (meta)data updates.

POST /api/ids/search Perform full-text search.

POST /api/ids/resource/update Send an IDS ResourceUpdateMessage.

POST /api/ids/resource/unavailable Send an IDS ResourceUnavailableMessage.

POST /api/ids/query Send an IDS QueryMessage.

POST /api/ids/description Send an IDS DescriptionRequestMessage to query metadata.

POST /api/ids/contract Send an IDS ContractRequestMessage to start the contract negotiation.

POST /api/ids/connector/update Send an IDS ConnectorUpdateMessage.

POST /api/ids/connector/unavailable Send an IDS ConnectorUnavailableMessage.

POST /api/ids/app Download an IDS app from an IDS AppStore.

Endpoints—Endpoints for operations on endpoints.

GET /api/endpoints/{id} Get a base resource by id.

PUT /api/endpoints/{id} Update a base resource by id.

DELETE /api/endpoints/{id} Delete a base resource by id.

PUT /api/endpoints/{id}/datasource/{dataSourceId} Creates start endpoint for a route.

GET /api/endpoints Get a list of base resources with pagination.

POST /api/endpoints Create a base resource.

_Connector—Endpoints for general information.

GET /api Entrypoint for REST resources.

GET /api/connector Get the private IDS self-description.

GET / Get the public IDS self-description.

Rules—Endpoints for operations on rules.

GET /api/rules/{id} Get a base resource by id.

PUT /api/rules/{id} Update a base resource by id.

DELETE /api/rules/{id} Delete a base resource by id.

GET /api/rules/{id}/contracts Get all children of a base resource with pagination.

PUT /api/rules/{id}/contracts Replace the children of a base resource.

POST /api/rules/{id}/contracts Add a list of children to a base resource.

DELETE /api/rules/{id}/contracts Remove a list of children from a base resource.

GET /api/rules Get a list of base resources with pagination.

POST /api/rules Create a base resource.

Offered Resources—Endpoints for operations on offered resources.

GET /api/offers/{id} Get a base resource by id.

PUT /api/offers/{id} Update a base resource by id.

DELETE /api/offers/{id} Delete a base resource by id.

GET /api/offers/{id}/representations Get all children of a base resource with pagination.

PUT /api/offers/{id}/representations Replace the children of a base resource.

POST /api/offers/{id}/representations Add a list of children to a base resource.

DELETE /api/offers/{id}/representations Remove a list of children from a base resource.

GET /api/offers/{id}/contracts Get all children of a base resource with pagination.

PUT /api/offers/{id}/contracts Replace the children of a base resource.

POST /api/offers/{id}/contracts Add a list of children to a base resource.

DELETE /api/offers/{id}/contracts Remove a list of children from a base resource.

GET /api/offers/{id}/catalogs Get all children of a base resource with pagination.

PUT /api/offers/{id}/catalogs Replace the children of a base resource.

POST /api/offers/{id}/catalogs Add a list of children to a base resource.

DELETE /api/offers/{id}/catalogs Remove a list of children from a base resource.

GET /api/offers Get a list of base resources with pagination.

POST /api/offers Create a base resource.

GET /api/offers/{id}/subscriptions Get all children of a base resource with pagination.

GET /api/offers/{id}/brokers Get all children of a base resource with pagination.

Brokers—Endpoints for operations on brokers.

GET /api/brokers/{id} Get a base resource by id.

PUT /api/brokers/{id} Update a base resource by id.

DELETE /api/brokers/{id} Delete a base resource by id.

GET /api/brokers Get a list of base resources with pagination.

POST /api/brokers Create a base resource.

GET /api/brokers/{id}/offers Get all children of a base resource with pagination.

Contracts—Endpoints for operations on contracts (+ agreements).

GET /api/contracts/{id} Get a base resource by id.

PUT /api/contracts/{id} Update a base resource by id.

DELETE /api/contracts/{id} Delete a base resource by id.

GET /api/contracts/{id}/rules Get all children of a base resource with pagination.

PUT /api/contracts/{id}/rules Replace the children of a base resource.

POST /api/contracts/{id}/rules Add a list of children to a base resource.

DELETE /api/contracts/{id}/rules Remove a list of children from a base resource.

GET /api/contracts/{id}/offers Get all children of a base resource with pagination.

PUT /api/contracts/{id}/offers Replace the children of a base resource.

POST /api/contracts/{id}/offers Add a list of children to a base resource.

DELETE /api/contracts/{id}/offers Remove a list of children from a base resource.

GET /api/contracts Get a list of base resources with pagination.

POST /api/contracts Create a base resource.

GET /api/contracts/{id}/requests Get all children of a base resource with pagination.

_Utils

POST /api/examples/validation Get the policy pattern represented by a given JSON string.

POST /api/examples/policy Get an example policy for a given policy pattern.

GET /api/utils/enums Get a list of enums by value name.

App Stores—Endpoints for app store handling.

GET /api/appstores/{id} Get a base resource by id.

PUT /api/appstores/{id} Update a base resource by id.

DELETE /api/appstores/{id} Delete a base resource by id.

GET /api/appstores Get a list of base resources with pagination.

POST /api/appstores Create a base resource.

GET /api/appstores/{id}/apps Get all children of a base resource with pagination.

_Configurations—Endpoints for operations on configurations.

GET /api/configurations/{id} Get a base resource by id.

PUT /api/configurations/{id} Update a base resource by id.

DELETE /api/configurations/{id} Delete a base resource by id.

PUT /api/configurations/{id}/active Update current configuration.

GET /api/configuration/pattern Get pattern validation status.

PUT /api/configuration/pattern Allow unsupported patterns.

GET /api/configuration/negotiation Get contract negotiation status.

PUT /api/configuration/negotiation Set contract negotiation status.

GET /api/configurations Get a list of base resources with pagination.

POST /api/configurations Create a base resource.

GET /api/configurations/active Get current configuration.

Requested Resources—Endpoints for operations on requested resources.

GET /api/requests/{id} Get a base resource by id.

PUT /api/requests/{id} Update a base resource by id.

DELETE /api/requests/{id} Delete a base resource by id.

GET /api/requests/{id}/representations Get all children of a base resource with pagination.

PUT /api/requests/{id}/representations Replace the children of a base resource.

POST /api/requests/{id}/representations Add a list of children to a base resource.

DELETE /api/requests/{id}/representations Remove a list of children from a base resource.

GET /api/requests/{id}/catalogs Get all children of a base resource with pagination.

PUT /api/requests/{id}/catalogs Replace the children of a base resource.

POST /api/requests/{id}/catalogs Add a list of children to a base resource.

DELETE /api/requests/{id}/catalogs Remove a list of children from a base resource.

GET /api/requests Get a list of base resources with pagination.

GET /api/requests/{id}/subscriptions Get all children of a base resource with pagination.

GET /api/requests/{id}/contracts Get all children of a base resource with pagination.

Data Sources—Endpoints for operations on data sources/sinks.

GET /api/datasources/{id} Get a base resource by id.

PUT /api/datasources/{id} Update a base resource by id.

DELETE /api/datasources/{id} Delete a base resource by id.

GET /api/datasources Get a list of base resources with pagination.

POST /api/datasources Create a base resource.

Routes—Endpoints for operations on routes.

GET /api/routes/{id} Get a base resource by id.

PUT /api/routes/{id} Update a base resource by id.

DELETE /api/routes/{id} Delete a base resource by id.

GET /api/routes/{id}/steps Get all children of a base resource with pagination.

PUT /api/routes/{id}/steps Replace the children of a base resource.

POST /api/routes/{id}/steps Add a list of children to a base resource.

DELETE /api/routes/{id}/steps Remove a list of children from a base resource.

PUT /api/routes/{id}/endpoint/start Creates the start endpoint for a route.

DELETE /api/routes/{id}/endpoint/start Deletes the start endpoint of a route.

PUT /api/routes/{id}/endpoint/end Creates the last endpoint for the route.

DELETE /api/routes/{id}/endpoint/end Deletes the start endpoint of the route.

GET /api/routes Get a list of base resources with pagination.

POST /api/routes Create a base resource.

GET /api/routes/{id}/output Returns the output of the route.

Agreements—Endpoints for contract agreement handling.

GET /api/agreements Get a list of base resources with pagination.

GET /api/agreements/{id} Get a base resource by id.

GET /api/agreements/{id}/artifacts Get all children of a base resource with pagination.

Routes (Apache Camel)—Endpoints for dynamically managing Camel routes.

POST /api/camel/routes Add a route to the Camel context.

POST /api/beans Add a bean to the application context.

GET /api/camel/routes/error Get new route related errors.

DELETE /api/camel/routes/{routeId} Delete a route from the Camel context.

DELETE /api/beans/{beanId} Remove a bean from the application context.

References

1. Ernstberger, J.; Lauinger, J.; Elsheimy, F.; Zhou, L.; Steinhorst, S.; Canetti, R.; Miller, A.; Gervais, A.; Song, D. SoK: Data Sovereignty. In Proceedings of the 2023 IEEE 8th European Symposium on Security and Privacy (EuroS&P), Delft, The Netherlands, 3–7 July 2023.
2. Atos Sovereign Cloud Offering. Available online: https://atos.net/en/portfolio/create-the-sovereign-public-foundations-for-the-digital-era (accessed on 25 August 2024).
3. International Data Spaces Global. Available online: https://github.com/International-Data-Spaces-Association (accessed on 5 December 2023).
4. Hummel, P.; Braun, M.; Augsberg, S. Sovereignty and Data Sharing. ITU J. Future Evol. Technol. 2018, 1–10.
5. Yaodong, T.; Yang, S. Comparative Study on Data Sovereignty Guarantee Technology. EasyChair Preprint 8965. 2022. Available online: https://easychair.org/publications/preprint/hcFH (accessed on 28 October 2024).
6. Merlec, M.M.; Hoh, P. Blockchain-Based Decentralized Storage Systems for Sustainable Data Self-Sovereignty: A Comparative Study. Sustainability 2024, 16, 7671.
7. Aruna, M.G.; Hasan, M.K.; Islam, S.; Mohan, K.G.; Sharan, P.; Hassan, R. Cloud to cloud data migration using self sovereign identity for 5G and beyond. Clust. Comput. 2022, 25, 2317–2331.
8. Dordevic, D. Data Sovereignty Provision in Cloud-and-Blockchain-Integrated IoT Data Trading. Master’s Thesis, University of Zurich, Zurich, Switzerland, September 2020.
9. Ziyi, X. International Law Protection of Cross-Border Transmission of Personal Information Based on Cloud Computing and Big Data. Mob. Inf. Syst. 2022, 2022, 1–9.
10. Renda, A. Making the digital economy “fit for Europe”. Eur. Law J. 2020, 26, 345–354.
11. Savelyev, A. Russia’s new personal data localization regulations: A step forward or a self-imposed sanction? Comput. Law Secur. Rev. 2016, 32, 128–145.
12. Scoon, C.; Ko, R.K.L. The Data Privacy Matrix Project: Towards a Global Alignment of Data Privacy Laws. In Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, Tianjin, China, 23–26 August 2016; pp. 1998–2005.
13. Shibambu, A. Migration of government records from on-premises to cloud computing storage in South Africa. S. Afr. J. Libr. Inf. Sci. 2022, 88, 1–11.
14. Irion, K. Government Cloud Computing and National Data Sovereignty. Policy Internet 2012, 4, 40–71.
15. Mitchell, A.D.; Samlidis, T. Cloud services and government digital sovereignty in Australia and beyond. J. Law Inf. Technol. 2021, 29, 364–394.
16. Kim, W.B.; Seo, D.; Kim, D.; Lee, I.Y. Group Delegated ID-Based Proxy Reencryption for the Enterprise IoT-Cloud Storage Environment. Wirel. Commun. Mob. Comput. 2021, 2021, 7641389.
17. Resende, J.S.; Martins, R.; Antunes, L. Enforcing Privacy and Security in Public Cloud Storage. In Proceedings of the 2018 16th Annual Conference on Privacy, Security and Trust (PST), Belfast, Ireland, 28–30 August 2018.
18. Bakogiannis, T.; Mytilinis, I.; Doka, K.; Goumas, G. Building Ad-Hoc Clouds with CloudAgora. In Proceedings of the 2019 38th Symposium on Reliable Distributed Systems (SRDS), Lyon, France, 1–4 October 2019.
19. IDS RAM 4. Available online: https://github.com/International-Data-Spaces-Association/IDS-RAM_4_0 (accessed on 18 June 2024).
20. GAIA-X AISBL. Available online: https://gaia-x.eu (accessed on 5 December 2023).
21. Gaia-X Architecture Document. Available online: https://gitlab.com/gaia-x/technical-committee/architecture-working-group/architecture-document (accessed on 18 June 2024).
22. FIWARE for Smart Cities and Territories. Available online: https://www.fiware.org/wp-content/uploads/Smart-Cities-Brochure-FIWARE.pdf (accessed on 18 June 2024).
23. Otto, B.; ten Hompel, M.; Wrobel, S. Designing Data Spaces: The Ecosystem Approach to Competitive Advantage; Springer Nature: Cham, Switzerland, 2022.
24. Open Digital Rights Language (ODRL). Available online: https://www.w3.org/TR/odrl-model/ (accessed on 25 August 2024).
25. NGSI-LD (Next Generation Service Interface with Linked Data). Available online: https://www.etsi.org/deliver/etsi_gs/CIM/001_099/009/01.05.01_60/gs_CIM009v010501p.pdf (accessed on 25 August 2024).
26. Dataspace Connector. Available online: https://github.com/International-Data-Spaces-Association/DataspaceConnector (accessed on 19 June 2024).
27. Data Connector Report. Available online: https://internationaldataspaces.org/data-connector-report (accessed on 30 October 2024).
28. Data Sovereignty Test Results. Available online: https://github.com/OneCloudDesignAuthority/data-sovereignty/tree/development/Experiments/Results (accessed on 19 June 2024).
29. Dataspace Connector API Specification. Available online: https://github.com/International-Data-Spaces-Association/DataspaceConnector/blob/main/openapi.yaml (accessed on 22 August 2024).

Figure 1. IDSA reference architecture (Source: internationaldataspaces.org).

Figure 2. Gaia-X reference architecture (Source: gaia-x.eu).

Figure 3. FIWARE reference architecture (Source: fiware.org).

Figure 4. Data sovereignty model.

Figure 5. Response times for Dataspace Connector GET requests.

Figure 6. Response times for Dataspace Connector POST requests.

Figure 7. Comparison of response times for Dataspace Connector.

Table 1. Dataspace Connector response time, (*) average time for 100 requests.

Environment	GET * [ms]	POST * [ms]
Azure EU	211	187
Azure EU-local test (public IP)	105	104
Azure EU-local test (local IP)	99	98
GCP EU	190	207
GCP EU-local test (public IP)	154	134
GCP EU-local test (local IP)	133	115
GCP US	757	635
GCP US-local test (public IP)	125	116
GCP US-local test (local IP)	113	103
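
One way to read Table 1 is to subtract the local-IP measurement from the remote measurement for each environment. The sketch below does this arithmetic on the table's values. Treating the difference as the share of response time added by the network path is an interpretive assumption, and the names `REMOTE`, `LOCAL_IP`, and `network_overhead` are illustrative.

```python
# Average response times from Table 1 in milliseconds: (GET, POST).
REMOTE = {"Azure EU": (211, 187), "GCP EU": (190, 207), "GCP US": (757, 635)}
LOCAL_IP = {"Azure EU": (99, 98), "GCP EU": (133, 115), "GCP US": (113, 103)}


def network_overhead(remote, local):
    """Subtract the local-IP time from the remote time for each environment."""
    return {
        env: (remote[env][0] - local[env][0], remote[env][1] - local[env][1])
        for env in remote
    }


if __name__ == "__main__":
    for env, (get_ms, post_ms) in network_overhead(REMOTE, LOCAL_IP).items():
        print(f"{env}: GET +{get_ms} ms, POST +{post_ms} ms")
```

For GCP US the difference is 644 ms for GET and 532 ms for POST, against 57 to 112 ms for the two EU deployments, which matches the observation that geographic location affects response time far more than the choice of platform.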

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

