Applied Sciences
  • Article
  • Open Access

27 September 2021

An Empirical Study of a Trustworthy Cloud Common Data Model Using Decentralized Identifiers

1 Division of Computer Engineering, Baekseok University, Cheonan 31065, Korea
2 Department of Electronic Engineering, Soongsil University, Seoul 06978, Korea
3 Department of Software Science, Dankook University, Yongin 16891, Korea
* Author to whom correspondence should be addressed.
This article belongs to the Collection The Development and Application of Fuzzy Logic

Abstract

The conventional cloud Common Data Model (CDM) uses a centralized method of user identification and credential management. This approach limits interoperability through closed identity management and risks identity leakage, so a decentralized solution is needed. In this paper, we propose a DID (Decentralized Identifier)-based cloud CDM that allows researchers to authenticate their identity, securely store medical research information, and access the CDM reliably. The proposed service model provides the researcher's credential in the process of creating and accessing CDM data in the designed secure cloud. The model is built on a DID-based user-centric identification system to support enrolled researchers in a cloud CDM environment involving multiple hospitals and laboratories. The prototype extends the encrypted CDM delivery method using DID and provides an identification system by restricting the use cases of CDM data to researchers registered in the cloud CDM. The prototype, built as an agent-based proof of concept (PoC), is used to enhance security when researchers use ophthalmic CDM data. To this end, the CDM ID schema and ID definition are described by issuing IDs to CDM providers and CDM agents and by restricting the IDs of the researchers who are CDM users. The proposed method provides a framework for integrated and efficient management of data access control policies. It provides strong security and ensures both the integrity and availability of CDM data.

1. Introduction

Today, the key issue of medical services is moving forward from treatment to prevention and management of diseases [1]. Medical institutions and companies have been promoting technology development in related fields to provide services based on artificial intelligence and big data technology using medical data [2,3,4]. Clinical studies based on patient data from numerous hospitals can provide more meaningful results. However, since each hospital uses a different structure of Hospital Information System (HIS), the need for a CDM is recognized for systematic data management and integrated research [5]. CDM is a data structure defined to efficiently utilize hospitals’ data. It is composed based on international standard terms and has different components depending on the purpose. Through CDM, various data structures and meanings for each institution are converted to have the same structure and meaning, and various difficulties caused by different data structures between institutions can be solved when conducting multi-institutional joint research.
However, despite the advantage of efficient data management, CDM still faces problems such as regulation and the protection of personal information, owing to the fundamental characteristics of medical data. In the existing CDM, identity management methods have mainly been isolated, centralized, or federated. These methods have limitations in interoperability due to closed identity management, identifier (ID) leakage, and dependence on external ID management entities [6,7,8]. In cloud CDM, it is necessary to design a secure cloud on a permissioned blockchain in which access control for authorized and registered researchers is established [9].
To use the CDM data, the researcher's request for access permission and Institutional Review Board (IRB) approval are required in the data supervision process, and the results of the process are maintained in the blockchain. When various hospitals and research institutes take part in cloud CDM, an access control system is required to prove the researcher's permission to participate in the research as well as the interoperability of the participating institutions' systems. In the operating organization of cloud CDM, a stepwise qualification process is required according to the roles of CDM provider, CDM consumer, researcher, and IRB. In the cloud CDM environment, verifiable identities are essential to handle CDM data securely and to ensure that the system supports the reliable and tamper-evident nature of the subject's identity. Decentralized identity allows the development of independent digital identities rooted in a distributed ledger [10,11]. It also helps build applications on a solid digital foundation of trust by enabling the verifiable credentials model. For identity, verifiable credentials are derived from a registry.
Due to restrictions of the domestic medical law, sharing medical information outside medical institutions is restricted in the domestic environment, except when patients request their own records. Because the data management system is fragmented and centralized, the exchange and use of medical information are limited and information management is insufficient, which makes cooperative research difficult [12,13].
One of the important points about data sharing in this regulatory context is the IRB. For clinical studies of medical data, researchers must comply with the conditions set forth in the Research Participation Regulations. In the cloud CDM environment, a special requirement arises because the researcher's affiliated institution may differ from the CDM data provider. To solve this problem, the researcher must obtain permission to participate in the research from the IRB of the institution that provides the clinical information and controls the conduct of the research.
This paper describes the application of decentralized identifiers (DID) to prove user identity in the cloud CDM environment. DIDs’ transactions are configured using Hyperledger Indy, and CDM subjects are configured as agents based on Hyperledger Aries [14,15] to evaluate the behavior of CDM use cases. Here, we design and prototype a DID-based user-centric identification system to support the research of registered researchers in the cloud CDM environment involving multiple hospitals and research institutions. The prototype is an extension of the delivery method of encrypted CDM using DID and provides the identification system by limiting the use case of the CDM data of the researcher registered in the cloud CDM. The prototype constructed for agent-based PoC (proof of concept) is utilized for enhanced security of researcher use of ophthalmic CDM data. In this paper, the CDM identity schema and its definition are described by limiting the identity of main entities.
This proposed method aspires to provide a unified and efficient data access control policy management framework. It provides strong security and ensures both the integrity and the availability of CDM data. It aims to build upon and improve existing data governance processes between different organizations, translating the information sharing policies they already apply in their current operational interactions into electronically enforceable rules embedded in credentials.
The main contributions of our work can be summarized as follows:
  • DID-based user-centric identification is the first approach that lets researchers in the cloud CDM prove their identity autonomously with a verified proof, without a third party acting as a central authority.
  • We propose a service model that extends the basic DID model to solve the structural problem that external researchers find it difficult to participate in hospital research requiring IRB approval.
  • We validate user access control by applying the DID service model to the secure data transfer process between hospitals in Korea.
  • Our service model provides high interoperability by operating the identity proof prototype in a standard messaging environment based on DIDComm.

3. The Extended Identity Management Scheme for Cloud CDM

3.1. The Cloud CDM Model

In order to collect and integrate clinical data of multiple hospitals, it is required to solve the heterogeneity of data structure and format, differences in quality and quantity of data, technical limitations of interoperability, and security issues. CDM should support the linking of common analysis codes for electronic medical record (EMR) resource linkage to support integrated data analysis of research institutions, without leaking sensitive personal information.
Data extracted from EMRs tend to be stored in different relational database schemas. Figure 2 illustrates the conventional concept of CDM and its operation scheme derived from several sources of EMR in hospitals.
Figure 2. Conventional concept of the Common Data Model (CDM) and operation scheme.
The cloud CDM reference model shown in Figure 3 is a partial result from the previous works and consists of several CDM providers and CDM consumers participating [10]. Using this presented reference model, clinical researchers can isolate and securely distribute CDM data.
Figure 3. Concept of the Secure-Cloud Common Data Model (SC-CDM).
  • Cryptography is used to protect information: hash values and encryption with symmetric and asymmetric keys maintain the management of large-capacity CDMs.
  • A distributed ledger is used to provide data integrity and share information through a CDM signature.
  • In the process of data creation and use, the distributed ledger guarantees data integrity, and the transparently signed CDM can be accessed.
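The integrity role of the hash value can be illustrated with a minimal sketch. It assumes the ledger records a SHA-256 digest of the CDM export; the entry layout, the `cdm_id` field, and the function names are hypothetical and not taken from the paper's implementation.

```python
import hashlib
import hmac


def cdm_digest(cdm_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of a serialized CDM export."""
    return hashlib.sha256(cdm_bytes).hexdigest()


def make_ledger_entry(cdm_id: str, cdm_bytes: bytes) -> dict:
    """Build the (hypothetical) record that would be written to the ledger."""
    return {"cdm_id": cdm_id, "sha256": cdm_digest(cdm_bytes)}


def verify_cdm(entry: dict, cdm_bytes: bytes) -> bool:
    """Check that a retrieved CDM still matches the digest stored on the ledger."""
    # compare_digest avoids leaking where the two strings first differ
    return hmac.compare_digest(entry["sha256"], cdm_digest(cdm_bytes))


entry = make_ledger_entry("cdm-001", b"person_id,visit_date\n1,2021-01-01\n")
print(verify_cdm(entry, b"person_id,visit_date\n1,2021-01-01\n"))  # unchanged data
print(verify_cdm(entry, b"person_id,visit_date\n1,2021-01-02\n"))  # tampered data
```

Only the digest is stored on the ledger in this sketch, so a large CDM can stay in off-chain storage while any later modification is still detectable.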

3.2. The Operation Scheme for Trustworthiness in CDM Cloud

In this paper, we focus on how to guarantee trustworthiness among the entities in cloud CDM using DID. Hence, this model does not consider authentication and authorization based on in-person and group verification of cloud CDM. In cloud CDM, it is necessary to design a secure cloud on a permissioned blockchain in which access control for authorized and registered researchers is established. To use the CDM data, the researcher's request for access permission and IRB approval are required in the data supervision process, and the results of the process are maintained in the blockchain.
The following shows the process for uploading the CDM derived from the researcher’s query in Figure 4:
Figure 4. The overall concept of authentication and authorization in cloud CDM.
  • A researcher registered in a medical institution, Hospital B, sends a query to the EMR DB managed by Hospital A.
  • The researcher asks the trust manager of Hospital A to place the CDM built from the query result in the cloud CDM.
  • The trust manager of Hospital A obtains the IRB's approval for the requested EMR data, using the credential that identifies the researcher.
  • The trust manager in Hospital A builds the approved EMR data into CDM data and its metadata, associates them with encryption keys, and encrypts the CDM for distribution to a repository in the cloud CDM.
  • The trust manager in Hospital A uploads the encrypted data to the cloud CDM.
We assume the requirements of authentication and authorization as the research background. Authentication is the basic process of verifying that the entities (researcher, IRB, CDM provider, CDM consumer) are who they claim to be before allowing access. In the context of cloud CDM, authorization determines the entitlement of an entity to perform tasks within the system. A user's authentication and authorization are initially activated by an identity provider (the IRB), and CDM data are provided to the person granted permission by the IRB.

3.3. The Basic DID Model for Cloud CDM

In the basic model, the identity information that the information subject needs in order to receive the desired service from the verification agency is issued by the personalization agency and submitted by the subject. To ensure the validity of the issued identity information, the certificate of the personalization agency is stored in a verifiable data registry. The verification agency that receives the proof of identity verifies the proof against the registry and provides the service. A credential is an attestation of qualification, competence, or authority issued to an entity (e.g., an individual or organization) by a third party with a relevant or de facto authority or assumed competence to do so.
If research involves human subjects or is regulated by the Food and Drug Administration (FDA), it requires review and approval from an institutional review board (IRB) or the Human Subjects Office. It is the responsibility of all faculty and students to obtain IRB approval or Exempt determination before initiating any human subjects research projects.
Hence, the IRB uses a public DID that is published globally. The IRB plays the role of verifiable credential issuer. Since the researcher, as holder of the credential, may present it to anyone, the identity of the issuer (via the public DID) must be part of what the verifier learns from the presentation. The verifier can then investigate, as necessary, to decide whether to trust the issuer. The public DID of the IRB is put on a blockchain so that it can be globally resolved. It is also used to establish secure, point-to-point messaging channels between the agents of the participants. In a verifiable credential, the DID serves as the identifier of the IRB as the issuer in cloud CDM.
IRB (the issuer) DID is used to uniquely identify the issuer and is resolved to obtain a public key related to the DID. That public key is then used to verify that the data in the verifiable credential did indeed come from the issuer. This public DID ensures that the verifier knows who issued the credential a holder presents.
Figure 5 shows the basic DID model for cloud CDM. Node A represents a CDM provider, and Node B represents a CDM consumer. Two trust managers located in the service broker act as agents of the CDM provider operating in Node A and the CDM consumer operating in Node B to deliver the CDM in a trusted manner (represented as CDM A→B in Figure 3). The verifiers may not fully trust the researcher without a verifiable credential (VC) and may want to share only a subset of the data or respond with data retrieved from a particular query. They might also want to share different subsets of data with the researcher. The grant of access may also need to be revoked, updated, or set to expire.
Figure 5. The DID-based trust model for cloud CDM.
In cloud CDM, credentials need to be issued and verified through the following application use cases:
  • A researcher belongs to a group of researchers working on the specific subject he or she wants to study, is assigned the role of research participant through IRB approval, and is registered. Through the IRB, the researcher is issued a certificate of research participation.
  • CDM users apply for proof of access service when creating CDM data, encrypting the generated CDM data, and placing it in distributed storage.
  • CDM users apply for access service verification for decryption and distributed storage of CDM data in the process of accessing the created CDM data.
The following operating environment is assumed:
  • For CDM use, researchers are registered with the CDM provider or user organization. It is assumed that, through the registration process, the mutual trust relationship among the organizations participating in cloud CDM can be established and managed through the certificate authority (CA).
  • IRB approval documents are used as proof of qualification for CDM provision and use (users who have received credentials from the IRB use DID to prove their identity).
  • The researcher is provided with the ID of the CDM provider through the approval of the IRB.
  • The CDM provider decides to provide the CDM through verification of the researcher’s identity certificate. After qualification verification, the CDM provider performs encryption and distributed storage of CDM data.
  • CDM users access the encrypted and distributed CDM data through verification of the researcher’s identity certificate via the CDM consumer.
  • The researcher’s research participation certificate maintains the research period as an attribute and allows access to CDM services and data limited to the valid period.
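The last assumption, that the research period is carried as an attribute and bounds access, can be sketched as a simple validity check. The class and field names below are hypothetical; the paper does not specify how the period is encoded in the certificate.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ParticipationCertificate:
    """Hypothetical view of a research participation certificate."""
    researcher_did: str
    irb_no: str
    valid_from: date   # first day of the approved research period
    valid_until: date  # last day of the approved research period


def is_access_allowed(cert: ParticipationCertificate, on_day: date) -> bool:
    """Allow CDM access only while the approved research period is running."""
    return cert.valid_from <= on_day <= cert.valid_until


cert = ParticipationCertificate(
    researcher_did="did:example:researcher",  # placeholder DID
    irb_no="IRB-0000",                        # placeholder IRB number
    valid_from=date(2021, 1, 1),
    valid_until=date(2021, 12, 31),
)
print(is_access_allowed(cert, date(2021, 6, 1)))
print(is_access_allowed(cert, date(2022, 1, 1)))
```

In a deployed system this check would be combined with credential verification and revocation status; the sketch covers only the period attribute.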
The overall process of issuing and verifying credential when handling CDM in use-cases is shown in Figure 6.
Figure 6. The process of issuing and verifying credential when handling CDM.

3.4. Credential Definition of Identity

Self-sovereign identity consists of an identifier and identifier data. In cloud CDM, identifiers use DID, and identifier data consists of several attribute information. The main attribute information for identity consists of personal information, credentials, and verifiable presentation. A legal entity’s identity (i.e., an individual or an organization) can be represented using a set of attributes associated with the entity (such as name and role). The identity of the CDM providing and consuming institutions and the participant of these institutions is expressed in various attribute information. Identity management provides the functions for maintaining the identity data and their access control. IRB identity is defined based on its schema. The identity certificate is issued by the IRB provider. Figure 7 is the schema definition for CDM identity stored in Indy DLT.
Figure 7. Schema definition for CDM identity issued by IRB.
The following shows the schema defined for the issued CDM identity stored in the DLT. It shows that the schema was created by the IRB through the credential definition ID.
  • Schema ID: T8j4DNmf7Us8tTzpvoK6No:2:IRB schema:51.1.53
  • Cred def ID: T8j4DNmf7Us8tTzpvoK6No:3:CL:38:irb.agent.IRB_schema
  • Type: CRED_DEF
  • Reference: 38
  • Signature type: CL
  • Tag: irb.agent.IRB_schema
  • Attributes: affiliation, approved_date, gcp, irb_no, master_secret, name, role, timestamp
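The identifiers listed above are colon-separated, and a small parser makes their parts explicit. The field layout is read off the listed values (issuer DID, a marker of 2 for a schema or 3 for a credential definition, then name and version or signature type, schema reference, and tag) and should be treated as an assumption rather than a normative definition; the type and function names are hypothetical.

```python
from typing import NamedTuple


class SchemaId(NamedTuple):
    issuer_did: str
    name: str
    version: str


class CredDefId(NamedTuple):
    issuer_did: str
    signature_type: str
    schema_ref: int
    tag: str


def parse_schema_id(value: str) -> SchemaId:
    """Split '<did>:2:<name>:<version>' into its parts."""
    did, marker, name, version = value.split(":")
    if marker != "2":
        raise ValueError(f"not a schema ID: {value!r}")
    return SchemaId(did, name, version)


def parse_cred_def_id(value: str) -> CredDefId:
    """Split '<did>:3:<signature type>:<schema ref>:<tag>' into its parts."""
    did, marker, sig_type, ref, tag = value.split(":", 4)
    if marker != "3":
        raise ValueError(f"not a credential definition ID: {value!r}")
    return CredDefId(did, sig_type, int(ref), tag)


schema = parse_schema_id("T8j4DNmf7Us8tTzpvoK6No:2:IRB schema:51.1.53")
cred_def = parse_cred_def_id("T8j4DNmf7Us8tTzpvoK6No:3:CL:38:irb.agent.IRB_schema")
# Both identifiers start with the same DID, which ties the definition to the IRB.
print(schema.issuer_did == cred_def.issuer_did, cred_def.schema_ref)
```

The shared DID prefix and the reference number 38 are what link the credential definition back to the schema created by the IRB.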
After the IRB agent starts up, the researcher agent establishes a trust channel with the IRB agent, and then the IRB performs DID exchange with the researcher. Algorithm 1 describes the steps for establishing a connection between these agents.
Algorithm 1 Establishing Trusted Connections
1: Researcher agent exchanges DIDs with the IRB agent to establish a DIDComm channel.
2: IRB offers an audited researcher credential over this channel.
3: Researcher accepts and stores the credential in their wallet.
† Audited researcher credential is specified by IRB.

3.5. Issuing IRB Credential

With a connection to the researcher's agent established, the IRB issuer can interact with that agent. It might ask for a presentation to confirm the identity of the researcher. Eventually, it will reach the point of needing to issue a credential to the researcher. To do that, the controller passes to the framework the type of the credential, the data for the claims, and the connection identifier for the researcher, and the framework (for the most part) takes care of issuing the credential for the given research subject. Note that after the credential is offered to the researcher, the response might not come back for hours. This is not an issue; the issuer framework simply waits. Once the credential is issued, an identifier for the credential is given back to the controller, which stores it with the rest of the information it keeps on the researcher. To issue an Indy credential, the simplest instance of the protocol has three steps:
  • The issuer sends the holder an offer message.
  • The holder responds with a request message.
  • The issuer completes the exchange by sending the holder an issue message containing the verifiable credential.
The access policy defines programmatically the requirements for authorization to access CDM. The access policy defines these rules based on the CDM, user/group assignments, and ownership assignments. The IRB credential represents the access policy of CDM. Algorithm 2 describes the steps for issuing credential, and the detailed issuing flow is as follows.
  • The holder sends a proposal to the issuer (issuer receives proposal). When the holder starts by sending a proposal, it uses the /issue-credential-2.0/send-proposal endpoint.
  • The issuer sends an offer to the holder based on the proposal (holder receives offer). The issuer receives the proposal and can respond with an offer using the /issue-credential-2.0/records/{id}/send-offer endpoint. After this offer, the flow continues with the holder responding with a request.
  • The holder sends a request to the issuer (issuer receives request). If the holder automatically accepts offers and turns them into requests, the issuing of credentials is completely automated. If the holder instead approves each offer explicitly, privacy improves because the user controls when and with whom information is shared.
  • The issuer sends credentials to the holder (holder receives credentials). The issue credential protocol is used to enable an issuer to provide a holder with a verifiable credential. In this protocol:
    • There are two participants (issuer, holder).
    • There are four message types (propose, offer, request, and issue).
    • There are four states (proposed, offered, requested, and issued).
  • The holder stores credentials and sends acknowledgement to the issuer. Verifiable credentials are issued to the user and stored in his/her digital wallet, and the user decides when and where to use them.
  • The issuer receives acknowledgement.
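The four message types and four states described above can be read as a small state machine. The following is a simplified sketch, not the framework's implementation: the `START` state and the transition table are assumptions added so that both the proposal-first flow and the three-step offer-first flow can be expressed.

```python
from enum import Enum


class State(Enum):
    START = "start"          # assumption: initial state before any message
    PROPOSED = "proposed"
    OFFERED = "offered"
    REQUESTED = "requested"
    ISSUED = "issued"


# (current state, message type) -> next state
TRANSITIONS = {
    (State.START, "propose"): State.PROPOSED,    # holder opens with a proposal
    (State.START, "offer"): State.OFFERED,       # issuer opens with an offer
    (State.PROPOSED, "offer"): State.OFFERED,
    (State.OFFERED, "request"): State.REQUESTED,
    (State.REQUESTED, "issue"): State.ISSUED,
}


def advance(state: State, message: str) -> State:
    """Apply one protocol message, rejecting anything out of order."""
    try:
        return TRANSITIONS[(state, message)]
    except KeyError:
        raise ValueError(f"message {message!r} not allowed in state {state.value!r}")


def run(messages) -> State:
    """Replay a sequence of message types from the initial state."""
    state = State.START
    for message in messages:
        state = advance(state, message)
    return state


print(run(["propose", "offer", "request", "issue"]).value)  # full flow
print(run(["offer", "request", "issue"]).value)             # simplest three-step flow
```

Modeling the exchange this way makes out-of-order messages, such as an issue message before any request, fail explicitly instead of being silently accepted.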
Algorithm 2 Issuing credential
1: for each Researcher agent do
2:   Initiate DID Exchange with the CDM provider agent to establish a DIDComm channel.
3:   Researcher agent delivers the selected CDM to the CDM provider agent via the DIDComm channel.
4:   CDM provider offers a Verified CDM token credential over DIDComm.
5:   Researcher agent accepts and stores the credential.
6:   CDM provider encrypts the CDM and delivers the cipher CDM to the CDM consumer agent with the IRB number approved by the IRB.
7: end for
† The CDM is derived from the EMR of the CDM provider.
† Verified CDM token credential is specified by the CDM provider.

3.6. Proving the Credential

Privacy is important when dealing with CDM. Entities using DIDs can present only selected portions of their credentials. This expression of a subset of one's credential is called a credential presentation. Specifically, the presentation refers to the verifiable data received by a verifier. Instead of typing in a name, address, and government ID, the researcher provides a presentation of that information from verifiable credentials issued by the IRB, an authority trusted by the verifiers (the CDM provider and the CDM consumer). The verifiers can automatically accept the claims in the presentation (if they trust the issuer) without any further checking.
Instead of being obtained directly from the issuer (the IRB), the issuer's data come from the holder (the researcher), and the cryptographic material needed to verify the authenticity of the data comes from the distributed ledger. This reduces the number of integrations that have to be implemented between issuers and verifiers. A researcher can be issued a professional accreditation credential from the relevant authority (e.g., the College of Physicians and Surgeons), and the claims can be verified (and trusted) by medical facilities in real time.
Should the doctor lose his or her accreditation, the credential can be revoked, which would be immediately in effect. This would hold true for any credentialed profession, be it lawyers, engineers, nurses, tradespeople, real estate agents, and so on.

4. Implementation

4.1. Experimental Setup

In this section, the design of the experiments is introduced. Detailed information on our hardware and software configurations is given in Table 2. The von-network and the agents run in containers controlled by a Docker engine. Each container runs as a lightweight virtual machine.
Table 2. Hardware and software configuration.
Hyperledger Indy node management is permissioned. It has its own ledger and stores/reads public information in the distributed ledger that is reliably elected. The nodes communicate to agree (reach consensus) on what transactions should be written and in what order. To start Hyperledger Indy nodes, a von-network is used. It is a portable development-level Hyperledger Indy node network with a ledger browser. The von-network serves as a Hyperledger Indy public ledger sandbox instance. In this work, it runs locally in Docker.
Figure 8 shows the von-network with four nodes for identity management in cloud CDM. The von-webserver has a web interface that allows you to browse the transactions in the blockchain.
Figure 8. The von-network running in cloud CDM.
Before issuing a credential, a credential definition as well as its schema needs to be created. Both the schema and the credential definition are recorded on a von-network. Hyperledger Aries Cloud Agent Python (ACA-Py) is a foundation for building a verifiable credential (VC) ecosystem [35]. It operates in the second (DIDComm Peer to Peer Protocol) and third (Data Exchange Protocols) layers of the Trust Over IP framework using DIDComm messaging and Hyperledger Aries protocols in Figure 9.
Figure 9. Trust over IP framework [36].
A business logic controller is written for the given use case, and the controller uses the ACA-Py library based on AIP (Aries Interop Profile) 2.0. AIP 2.0 protocols are used for issuing, verifying, and holding VCs that work with a Hyperledger Indy distributed ledger. The von-network is used with a credential format named AnonCreds (Anonymous Credentials), which is a concrete implementation of zero-knowledge proof (ZKP) support.
A ZKP is a cryptographic method, and its use in blockchain appears promising in cases where existing blockchain technologies can adapt a ZKP to address specific business requirements focusing on data privacy [37]. It proves attributes of an entity (a person, organization, or thing) without exposing a correlatable identifier of that entity. Claims from verifiable credentials can be selectively disclosed, meaning that just some data elements from credentials, even across credentials, can (and should) be provided in a single presentation. By providing them in a single presentation, the verifier knows that all the credentials were issued to the same entity.
Four agents are developed: the researcher, IRB, provider, and consumer. The agents are written in Python using the ACA-Py library. An agent that receives a message from another entity posts a webhook internally over HTTP, allowing the controller to respond appropriately. Note that this can include requesting the agent to send further messages in reply. More details are given in Table 3.
Table 3. Participating entities and their endpoints.
ACA-Py can also notify its controller when an event has occurred. It supports webhooks that deliver an immediate update of what happened. Requests and responses between controllers configured through ACA-Py are transmitted as HTTP requests, and webhook notifications are delivered as a result of processing. A webhook is an asynchronous HTTP callback triggered by an event. It is a simple server-to-server mechanism for reporting that a specific event occurred on a server. The server on which the event occurred sends an HTTP POST request to a URL provided by the receiving server.
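A controller-side webhook receiver can be sketched with the Python standard library alone. The sketch assumes the agent posts JSON to a path ending in `/topic/<topic>/`; the handler logic, the function names, and the port are hypothetical and do not reproduce the controllers used in the experiments.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook(path: str, body: bytes):
    """Extract the topic from an assumed '.../topic/<topic>/' path and decode the JSON body."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2 or parts[-2] != "topic":
        raise ValueError(f"unexpected webhook path: {path!r}")
    payload = json.loads(body) if body else {}
    return parts[-1], payload


def handle_event(topic: str, payload: dict) -> str:
    """Placeholder business logic: summarize the event the agent reported."""
    return f"{topic}: state={payload.get('state')}"


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            topic, payload = parse_webhook(self.path, body)
        except ValueError:  # bad path or malformed JSON
            self.send_response(400)
            self.end_headers()
            return
        print(handle_event(topic, payload))
        self.send_response(200)
        self.end_headers()


def serve(port: int) -> None:
    """Run the receiver until interrupted; the port is chosen by the deployer."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()


print(handle_event(*parse_webhook("/webhooks/topic/connections/", b'{"state": "active"}')))
```

Calling `serve` with the port configured as the agent's webhook URL would start the receiver; keeping the parsing in a separate function lets the event handling be tested without a running server.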
In this paper, each of the cloud CDM subjects operates its own agent acting as a peer, and transactions between peers are maintained in a distributed ledger. Agent-to-agent communication is based on the DIDComm specification to support bilateral communication through a trusted channel.

4.2. Experimental Result

The simulation environment setup starts with the registration of a researcher entity named Alice with the IRB. To establish the connection between the IRB and Alice, the IRB advertises invitation data, Alice delivers the invitation message to the IRB, and the IRB responds with the accept message associated with the invitation. For peer-to-peer communication, Aries Interop Profile (AIP) 2.0 is used. AIP is used to establish a connection between agents, exchange identity certificates, and transmit data through command delivery. After the identity is verified, the user's CDM data credential is processed.
After processing the registration information, the IRB sends a unique connection invitation message to Alice, as represented in Figure 10. The connection request message is used to communicate the DID document of the invitee (Alice) to the inviter (IRB). The @type attribute is a required string value that denotes that the received message is a connection request. After receiving the connection request, IRB evaluates the provided DID and DID Doc according to the DID Method Spec.
Figure 10. JSON format of IRB invitation attribute.
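The shape of a connection request, with its required @type attribute and the invitee's DID and DID document, can be sketched as follows. The field names are modeled on the Aries connection protocol (RFC 0160) and are an assumption here; the JSON shown in the paper's figures is not reproduced, and the DID and DID document values are placeholders.

```python
import uuid

# Message type URI as used by the Aries connection protocol (assumed here).
CONNECTION_REQUEST_TYPE = "https://didcomm.org/connections/1.0/request"


def build_connection_request(label: str, did: str, did_doc: dict) -> dict:
    """Build a request carrying the invitee's DID and DID document to the inviter."""
    return {
        "@id": str(uuid.uuid4()),
        "@type": CONNECTION_REQUEST_TYPE,
        "label": label,
        "connection": {"DID": did, "DIDDoc": did_doc},
    }


def is_connection_request(message: dict) -> bool:
    """Check the required @type value and the presence of the DID and DID document."""
    if message.get("@type") != CONNECTION_REQUEST_TYPE:
        return False
    connection = message.get("connection")
    return isinstance(connection, dict) and "DID" in connection and "DIDDoc" in connection


request = build_connection_request(
    label="Alice",
    did="did:example:alice",          # placeholder DID
    did_doc={"id": "did:example:alice"},  # placeholder DID document
)
print(is_connection_request(request))
```

The inviter would run a check like `is_connection_request` before evaluating the provided DID and DID document according to the DID method specification.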
When the IRB and researcher agents want to connect with each other, they establish a connection through DIDComm, a series of messages that go back and forth to establish a connection and exchange information. In Figure 11, connection_id is used to send a message between the two agents.
Figure 11. JSON format of the accepted message associated with the invitation.
In answer to the connection invitation, the IRB issues and offers the researcher a VC, represented in Figure 12 (a segment of the issued credential), to be used to prove his/her identity when connecting to the CDM provider. The VC is issued according to its schema definition in Figure 7. The credential is stored in the wallet of the researcher. The credential is generated based on IRB records including IRB number, name, affiliation, the status of GCP, etc. GCP stands for good clinical practice. This means that the clinical studies using CDM satisfy the clinical trial management criteria through the IRB.
Figure 12. Researcher VC offered by the IRB upon registration.
Similarly, using the CDM provider VC schema, the same setup is performed for the CDM provider. The aim is to identify the CDM providers: the IRB issues and provides the CDM provider with a VC to allow the researcher to verify the CDM provider. Upon receiving a CDM access request, the CDM provider requires the researcher to present a valid verifiable credential (issued by the IRB) containing GCP in the allowed status, as shown in Figure 13. In response, the researcher presents a valid VC whose allowed GCP grants the researcher permission to access CDM data at that CDM provider. As shown in Figure 13, the result of the process is handled by the researcher. The proof from the researcher is validated by the CDM provider. Using AnonCreds, the validation process is based on the GCP attribute in the VC.
Figure 13. The result of the process the proof by using ZKP.
Figure 14 shows a proof, which is part of the credential issued by the IRB, provided by the researcher to the IRB and the CDM provider to show that the researcher is qualified. The IRB verifies the qualification with the ZKP method based on the properties of the provided proof. Using the proof, the IRB grants the qualification to request CDM data when the attribute value of GCP is greater than 0.
Figure 14. The proof.
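The condition "GCP greater than 0" corresponds to a predicate in a proof request. The sketch below builds such a request in the Indy-style layout (`requested_attributes`, `requested_predicates`, `p_type`, `p_value`) and evaluates the predicate in the clear. The request name, the referent keys, and the plain evaluation are assumptions for illustration: with AnonCreds the verifier checks a zero-knowledge proof and never sees the raw GCP value.

```python
import operator

# Comparison operators available for predicates in this sketch.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def build_proof_request(cred_def_id: str) -> dict:
    """Ask for the IRB number in the clear and a predicate proof that gcp > 0."""
    restriction = [{"cred_def_id": cred_def_id}]
    return {
        "name": "cdm-access-proof",  # hypothetical request name
        "version": "1.0",
        "requested_attributes": {
            "irb_no_attr": {"name": "irb_no", "restrictions": restriction},
        },
        "requested_predicates": {
            "gcp_pred": {
                "name": "gcp",
                "p_type": ">",
                "p_value": 0,
                "restrictions": restriction,
            },
        },
    }


def predicates_hold(proof_request: dict, attributes: dict) -> bool:
    """Plain (non-ZKP) evaluation of every requested predicate, for illustration only."""
    for pred in proof_request["requested_predicates"].values():
        value = int(attributes[pred["name"]])
        if not OPS[pred["p_type"]](value, pred["p_value"]):
            return False
    return True


request = build_proof_request("T8j4DNmf7Us8tTzpvoK6No:3:CL:38:irb.agent.IRB_schema")
print(predicates_hold(request, {"gcp": "1"}))  # qualified researcher
print(predicates_hold(request, {"gcp": "0"}))  # not qualified
```

Restricting the request to the IRB's credential definition is what ties the GCP claim to a credential actually issued by the IRB.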

4.3. Discussion

Privacy, security, and usability: healthcare data are sensitive by nature and need maximum protection against data breaches and privacy disclosure when exchanged, especially once third-party medical services are allowed to interact with the system. Medical data formats for joint use, such as CDM, have been developed for the participation of multiple hospitals and research institutions, and a stronger protection method without security vulnerabilities is needed. To further improve usability, this paper conducts reliable cloud CDM research using blockchain-based DID.
CDM is generally constructed, operated, and used only in the computer network within an individual hospital, so it is maintained at the same security level as general medical information. However, information leakage may occur in multi-institutional combined research because the systems and regulations that assign responsibility for information security and prepare countermeasures are insufficient. In addition, although CDM is mainly built on a cloud basis, no clear solution or system has been built to secure the conversion and de-identification of personal information in the hospital information system. Instead, CDM relies on a business procedure in which the programmer and system manager who perform the conversion confirm or pledge not to leak personal information, which is a very weak structure. Therefore, the use of clinical information in hospitals usually requires the consent of the patient, who is the data subject, and approval by the IRB. In addition, there is a restriction that researchers must use medical data only inside the hospital.
Figure 15 shows the flow of access control in cloud CDM. When a manager with authority sends a plaintext inquiry to the CDM (① ②), the access control list and CDM data are transmitted to the cloud CDM (③). The cloud CDM performs the following detailed steps and then sends the encrypted request result and ACL. The user, data approval range, period of use, etc., are subject to IRB review, and if approved (④ ⑤ ⑥), the user finally performs the analysis with the CDM result value (⑦). During a series of processes, data are encrypted, and unauthorized users’ access is blocked so that the contents cannot be checked.
Figure 15. Flow of access control in cloud CDM. ① Search in trust manager ② Request data from the hospital where the data are available ③ Request for CDM data, attachment of access control list ④ Request result, ACL ⑤ IRB approval (user, data approval range, period of use) ⑥ Approval notice ⑦ User analysis.
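The approval step in this flow (user, data approval range, period of use) can be illustrated with a small access check. This is only a sketch under assumptions: the class and function names, the use of a DID string to identify the user, and the representation of the data approval range as a set of table names are all hypothetical and are not taken from the prototype.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IrbApproval:
    """Hypothetical record of what the IRB approved for one researcher."""
    researcher_did: str          # approved user
    approved_tables: frozenset   # data approval range
    valid_from: date             # start of the period of use
    valid_until: date            # end of the period of use (inclusive)


def is_request_allowed(approval: IrbApproval, researcher_did: str,
                       table: str, on_day: date) -> bool:
    """Return True only if user, data range, and period of use all match."""
    return (
        approval.researcher_did == researcher_did
        and table in approval.approved_tables
        and approval.valid_from <= on_day <= approval.valid_until
    )


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    approval = IrbApproval(
        researcher_did="did:example:researcher1",
        approved_tables=frozenset({"condition_occurrence"}),
        valid_from=date(2021, 1, 1),
        valid_until=date(2021, 12, 31),
    )
    print(is_request_allowed(approval, "did:example:researcher1",
                             "condition_occurrence", date(2021, 6, 1)))
```

A check of this kind would sit alongside, not replace, the encryption of the request result described above: it only decides whether a request falls within what the IRB approved.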

5. Conclusions

Some organizations that deal with sensitive information, including those that analyze CDM in public health research, require a certain level of privacy and security. CDM for data sharing and utilization among medical institutions requires access to a wide range of patient medical information, which is used for disease research and customized medical care. CDM data are intrinsically highly sensitive and need maximum protection against data breaches and privacy disclosure when data are exchanged. The cloud CDM provides interoperability for the participation of multiple hospitals and serves as an information base for customized and user-centered healthcare. However, personal medical information must be managed reliably, safely, and transparently.
The proposed cloud CDM applies DID and blockchain technology for secure access control when a researcher accesses it. The proposed service model is used to provide the credential of the researcher in the process of creating and accessing the CDM data of the designed secure cloud CDM. It does not consider interaction with existing systems for establishing the initial trustworthiness of the entities participating in the cloud CDM; rather, it shows that DID can be used as a method for identification.
The prototype is an extension of the delivery of encrypted CDM using DID and provides identification by limiting the use cases of CDM data to researchers registered in the cloud CDM. The proposed method aims to provide a unified and efficient framework for managing data access control policies. The designed model was verified by applying it to the ophthalmic CDM data of domestic hospitals. It provides strong security and ensures both the integrity and the availability of CDM data.

Author Contributions

Conceptualization, Y.B.P. and Y.K.; methodology, Y.B.P.; software, Y.K.; validation, Y.K., Y.B.P. and J.C.; formal analysis, Y.K.; investigation, J.C.; resources, Y.B.P.; data curation, Y.B.P.; writing—original draft preparation, Y.K.; writing—review and editing, J.C.; visualization, Y.K.; supervision, J.C.; project administration, J.C.; funding acquisition, J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Korea Environmental Industry & Technology Institute (KEITI), grant number RE202101551, and the APC was funded by the Ministry of Environment (ME).

Institutional Review Board Statement

Not applicable.

Acknowledgments

This work was supported by a Korea Environmental Industry & Technology Institute (KEITI) grant funded by the Korea government (Ministry of Environment), Project No. RE202101551, the development of IoT-based technology for collecting and managing big data on environmental hazards and health effects.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shivade, C.; Raghavan, P.; Fosler-Lussier, E.; Embi, P.J.; Elhadad, N.; Johnson, S.B.; Lai, A.M. A review of approaches to identifying patient phenotype cohorts using electronic health records. J. Am. Med. Inform. Assoc. 2014, 21, 221–230. [Google Scholar] [CrossRef] [PubMed]
  2. Ferreira, J.C.; Ferreira da Silva, C.; Martins, J.P. Roaming service for electric vehicle charging using blockchain-based digital identity. Energies 2021, 14, 1686. [Google Scholar] [CrossRef]
  3. Liu, B.; Yuan, X.-T.; Yu, Y.; Liu, Q.; Metaxas, D. Decentralized Robust Subspace Clustering. Proc. AAAI Conf. Artif. Intell. 2016, 30, 3539–3545. Available online: https://ojs.aaai.org/index.php/AAAI/article/view/10473 (accessed on 1 July 2021).
  4. Xia, S.; Zheng, S.; Wang, G.; Gao, X.; Wang, B. Granular ball sampling for noisy label classification or imbalanced classification. IEEE Trans. Neural Netw. Learn. Syst. 2021. [Google Scholar] [CrossRef]
  5. You, S.C.; Lee, S.; Cho, S.Y.; Park, H.; Jung, S.; Cho, J.; Yoon, D.; Park, R.W. Conversion of National Health Insurance Service-National Sample Cohort (NHIS-NSC) database into observational medical outcomes partnership-common data model (OMOP-CDM). Stud. Health Technol. Inf. 2017, 245, 467–470. [Google Scholar]
  6. Chadwick, D.W. Federated identity management. Foundations of security analysis and design v. Lect. Notes Comput. Sci. 2009, 5705, 96–120. [Google Scholar]
  7. Jayaraman, I.; Mohammed, M. Secure Privacy Conserving Provable Data Possession (SPC-PDP) framework. Inf. Syst. E-Bus. Manag. 2019, 1–27. [Google Scholar] [CrossRef]
  8. Xiong, L.; Li, F.G.; Zeng, S.K.; Peng, T.; Liu, Z.C. A Blockchain-based privacy-awareness authentication scheme with efficient revocation for multi-server architectures. IEEE Access 2019, 7, 125840–125853. [Google Scholar] [CrossRef]
  9. Cho, J.H.; Kang, Y.; Park, Y.B. Secure delivery scheme of common data model for decentralized cloud platforms. Appl. Sci. 2020, 10, 7134. [Google Scholar] [CrossRef]
  10. Pãnescu, A.T.; Manta, V. Smart contracts for research data rights management over the ethereum blockchain network. Sci. Technol. Libr. 2018, 37, 235–245. [Google Scholar] [CrossRef]
  11. Androulaki, E.; Barger, A.; Bortnikov, V.; Cachin, C.; Christidis, K.; de Caro, A.; Enyeart, D.; Ferris, C.; Laventman, G.; Manevich, Y.; et al. Hyperledger fabric: A distributed operating system for permissioned blockchains. In Proceedings of the Thirteenth EuroSys Conference, Porto, Portugal, 23–26 April 2018; ACM: New York, NY, USA, 2018; p. 30. [Google Scholar]
  12. Dagher, G.G.; Mohler, J.; Milojkovic, M.; Marella, P.B. Ancile: Privacy-preserving framework for access control and interoperability of electronic health records using blockchain technology. Sustain. Cities Soc. 2018, 39, 283–297. [Google Scholar] [CrossRef]
  13. Silberschatz, A.; Korth, H.F.; Sudarshan, S. Database System Concepts; McGraw-Hill: New York, NY, USA, 1997. [Google Scholar]
  14. Hyperledger/Aries-Cloudagent-Python. Available online: https://github.com/hyperledger/aries-cloudagent-python (accessed on 1 April 2021).
  15. Reed, D.; Sporny, M.; Longley, D.; Allen, C.; Grant, R.; Sabadell, M. Decentralized Identifiers (DIDs) v1.0—Core Architecture, Data Model, and Representations. IT Security and Privacy—A Framework for Identity Management (ISO/IEC 24760-1). Available online: https://www.w3.org/TR/did-core/ (accessed on 1 March 2021).
  16. Blumenthal, D.; Tavenner, M. The “meaningful use” regulation for electronic health records. N. Engl. J. Med. 2010, 363, 501–504. [Google Scholar] [CrossRef]
  17. Jensen, P.B.; Jensen, L.J.; Brunak, S. Mining electronic health records: Towards better research applications and clinical care. Nat. Rev. Genet. 2012, 13, 395–405. [Google Scholar] [CrossRef] [PubMed]
  18. Glicksberg, B.S.; Oskotsky, B.; Giangreco, N.; Thangaraj, P.M.; Rudrapatna, V.; Datta, D.; Butte, A.J. ROMOP: A light-weight R package for interfacing with OMOP-formatted electronic health record data. JAMIA Open 2019, 2, 10–14. [Google Scholar] [CrossRef] [PubMed]
  19. Reps, J.M.; Schuemie, M.J.; Suchard, M.A.; Ryan, P.B.; Rijnbeek, P.R. Design and implementation of a standardized framework to generate and evaluate patient-level prediction models using observational healthcare data. J. Am. Med. Inf. Assoc. 2018, 25, 969–975. [Google Scholar] [CrossRef] [PubMed]
  20. Voss, E.A.; Makadia, R.; Matcho, A.; Ma, Q.; Knoll, C.; Schuemie, M.; Ryan, P.B. Feasibility and utility of applications of the common data model to multiple, disparate observational health databases. J. Am. Med. Inf. Assoc. 2015, 22, 553–564. [Google Scholar] [CrossRef] [Green Version]
  21. Garza, M.; Del Fiol, G.; Tenenbaum, J.; Walden, A.; Zozus, M.N. Evaluating common data models for use with a longitudinal community registry. J. Biomed. Inform. 2016, 64, 333–341. [Google Scholar] [CrossRef]
  22. Hripcsak, G.; Duke, J.D.; Shah, N.H.; Reich, C.G.; Huser, V.; Schuemie, M.J.; Ryan, P.B. Observational Health Data Sciences and Informatics (OHDSI): Opportunities for observational researchers. Stud. Health Technol. Inform. 2015, 216, 574. [Google Scholar]
  23. Yoon, D.; Ahn, E.K.; Park, M.Y.; Cho, S.Y.; Ryan, P.; Schuemie, M.J.; Park, R.W. Conversion and data quality assessment of electronic health record data at a Korean tertiary teaching hospital to a common data model for distributed network research. Healthc. Inform. Res. 2016, 22, 54–58. [Google Scholar] [CrossRef] [PubMed]
  24. Nakamoto, S. Bitcoin: A peer-to-peer electronic cash system. Decent. Bus. Rev. 2008, 21260–21268. [Google Scholar]
  25. Alamri, B.; Javed, I.T.; Margaria, T. A GDPR-compliant framework for IoT-based personal health records using blockchain. In Proceedings of the 2021 11th IFIP International Conference on New Technologies, Mobility and Security (NTMS), Paris, France, 19–21 April 2021; pp. 1–5. [Google Scholar]
  26. Simply Vital Health. Available online: https://www.simplyvitalhealth.com/ (accessed on 29 December 2018).
  27. Roehrs, A.; da Costa, C.A.; da Rosa Righi, R. OmniPHR: A distributed architecture model to integrate personal health records. J. Biomed. Inform. 2017, 71, 70–81. [Google Scholar] [CrossRef]
  28. Landau, S.; Le van Gong, H.; Wilton, R. Achieving privacy in a federated identity management system. In Financial Cryptography and Data; Dingledine, R., Golle, P., Eds.; Security 2009. Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2009; Volume 5628. [Google Scholar] [CrossRef]
  29. Allen, C. The Path to Self-Sovereign Identity. Life with Alacrity. Available online: http://www.lifewithalacrity.com/2016/04/the-path-to-self-soverereign-identity.html (accessed on 1 July 2021).
  30. Hardjono, T.; Pentland, A. Verifiable anonymous identities and access control in permissioned blockchains. arXiv 2019, arXiv:1903.04584. [Google Scholar]
  31. Shrestha, A.K.; Vassileva, J. Blockchain-based research data sharing framework for incentivizing the data owners. In Proceedings of the International Conference on Blockchain, Seattle, WA, USA, 25–30 June 2018; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2018; Volume 10974, pp. 259–266. [Google Scholar]
  32. Augot, D.; Chabanne, H.; Chenevier, T.; George, W.; Lambert, L.; Augot, D.; Chabanne, H.; Chenevier, T.; George, W.; Lambert, L. A user-centric system for verified identities on the Bitcoin blockchain. In Data Privacy Management, Cryptocurrencies and Blockchain Technology; Springer: Oslo, Norway, 2017; Volume 10436, pp. 390–407. [Google Scholar]
  33. Halpin, H. NEXTLEAP: Decentralizing identity with privacy for secure messaging. In Proceedings of the 12th International Conference on Availability, Reliability and Security, Reggio Calabria, Italy, 29 August–1 September 2017; pp. 1–10. [Google Scholar]
  34. Babkin, S.; Epishkina, A. Authentication protocols based on one-time passwords. In Proceedings of the 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), Saint Petersburg, Russia, 28–31 January 2019; pp. 1794–1798. [Google Scholar]
  35. Zhang, R.; Xue, R.; Liu, L. Security and privacy on blockchain. ACM Comput. Surv. 2019, 52, 1–34. [Google Scholar] [CrossRef] [Green Version]
  36. Taking the Sovrin Foundation to a Higher Level: Introducing SSI as a Universal Service. Available online: https://sovrin.org/taking-the-sovrin-foundation-to-a-higher-level-introducing-ssi-as-a-universal-service/ (accessed on 10 August 2020).
  37. Meralli, S. Privacy-preserving analytics for the securitization market: A zero-knowledge distributed ledger technology application. Financ. Innov. 2020, 6, 1–20. [Google Scholar] [CrossRef] [Green Version]