Privacy-Protection Scheme of a Credit-Investigation System Based on Blockchain

In response to the rapid growth of credit-investigation data, data redundancy among credit-investigation agencies, privacy leakages of credit-investigation data subjects, and data security risks have been reported. This study proposes a privacy-protection scheme for a credit-investigation system based on blockchain technology, which realizes the secure sharing of credit-investigation data among multiple entities such as credit-investigation users, credit-investigation agencies, and cloud service providers. This scheme is based on blockchain technology to solve the problem of islanding of credit-investigation data and is based on zero-knowledge-proof technology, which works by submitting a proof to the smart contract to achieve anonymous identity authentication, ensuring that the identity privacy of credit-investigation users is not disclosed; this scheme is also based on searchable-symmetric-encryption technology to realize the retrieval of the ciphertext of the credit-investigation data. A security analysis showed that this scheme guarantees the confidentiality, the availability, the tamper-proofability, and the ciphertext searchability of credit-investigation data, as well as the fairness and anonymity of identity authentication in the credit-investigation data query. An efficiency analysis showed that, compared with similar identity-authentication schemes, the proof key of this scheme is smaller, and the verification time is shorter. Compared with similar ciphertext-retrieval schemes, the time for this scheme to generate indexes and trapdoors and return search results is significantly shorter.


Introduction
The credit system is one of the symbols that constructs economic and social progress and development, and credit has gradually become everyone's second ID card [1]. Credit investigation reflects a person's credit status, and it has stepped into all aspects of our life. A credit-investigation agency provides convenient credit-inquiry services to users by legally collecting and processing the credit information of users from the credit investigation system.
With the development of Internet finance, we have entered the era of data, and credit data have shown explosive growth. The traditional credit-investigation system takes a few credit-investigation agencies as the main body, and its stored credit-investigation information is seriously unable to meet the needs of users. There is an island phenomenon among credit-investigation systems, and the format of their credit-investigation information is also different. These conditions make it impossible to share credit-investigation information. In addition, the traditional credit-investigation system relies on a centralized server, which means that once it is attacked by hackers, the entire system will fall down and cannot work. Credit-investigation information is sensitive and private, so the credit-investigation system needs security and privacy protection.
Blockchain and smart contracts can solve the problem of information silos and the risks brought by the centralized server in the traditional credit-investigation system. In the current credit-investigation system, financial institutions provide credit-investigation user inquiries and related credit services. Credit-investigation agencies act as creditinvestigation data providers, and they encrypt the data and upload them to the cloud. However, this multi-entity system will lead to complex interactions, which is not conducive to the sharing of credit-investigation information. Blockchain technology has the characteristics of traceability, privacy protection, and avoidance of single points of failure as a secure, distributed ledger based on cryptography. A smart contract is programmable code that cannot be tampered with running on the blockchain, which greatly expands the application fields of the blockchain. Smart contracts are deeply integrated with credit investigation, and they help to realize the safe sharing of credit data and to reduce the cost of credit-data collection.
The identity authentication of the traditional credit-investigation system will lead to a certain risk of identity information leakage. For example, the method of entering the account and password for identity authentication and login in the credit-investigation system will cause a user's identity information to be intercepted by the adversary, thereby jeopardizing the security of the credit-investigation system. The method of identity authentication through biometrics such as face and fingerprint has the advantages of not being forgotten or lost and is easy to use anytime and anywhere. However, because the biological characteristics remain unchanged for many years and accompany each individual throughout his life, its security cannot be guaranteed when there are loopholes or when the system database is attacked [2]. Although the blockchain provides pseudonyms that have nothing to do with the information of credit-investigation users, it cannot fully realize the privacy of users' identity. Therefore, the current credit-investigation system has an urgent need to solve the security problem of identity authentication.
The data in the credit-investigation system are sensitive, so malicious cloud service providers will spy on and tamper with user's data. To protect the privacy of the data, we need to encrypt them before uploading them to the cloud servers. Although the ciphertext can ensure the security of credit-investigation data, it consumes a significant amount of bandwidth resources when the ciphertext is searched. Therefore, we need to search the ciphertext and return the search results to the user, while ensuring the privacy and security of the data to the greatest extent.
Zero-knowledge-proof technology [3] can ensure the security and privacy of creditinvestigation users in the identity-authentication phase, and it can protect their identities from eavesdropping and acquisition by malicious adversaries. zkSNARKs [4], as a zeroknowledge-proof application tool, allows credit users to prove to the system that the "thesis" is correct by describing a particular "thesis," but the judgment will not reveal any valid information. Users submit their own identity information by using zero-knowledge-proof technology; then, the smart contract will verify the user's identity. Once the verification is passed, the user will be anonymously authenticated, and the blockchain will record that the user's address is legal.
Searchable-symmetric-encryption technology can improve the search efficiency of cloud service providers and can realize data searches in ciphertext data. In addition, this technology guarantees data security and prevents attacks from malicious cloud service providers and other adversaries.
This scheme combines blockchain, zero-knowledge-proof, and searchable-symmetricencryption technology to develop a securer credit information-sharing scheme.

Related Works
Blockchain is a horizontal and connected technology, which can promote interconnection among various industries and fields and the development of other technologies [5,6]. Xu et al. [7] proposed a social-credit-investigation system based on blockchain technology, which enabled smart contracts for identity verification and authorization.
Li et al. [8] proposed a reputation blockchain ecosystem to implement autonomous credit and established a credit-evaluation model. Zhang et al. [9] designed a personal-creditinvestigation-information-sharing-platform framework based on blockchain 3.0 architecture and implemented a credit-blacklist-sharing mechanism. Zhu [10] proposed an identityauthentication and intelligent-credit-reporting method based on blockchain technology, which realized multi-dimensional-identity-security authentication and a distributed ledger of credit ratings.
Faisca and Rogado [11] proposed an end-to-end-identity-authentication mechanism based on the JSON web token and blockchain technology. The token can use "claims" to encode personal-cloud-and customer-related information in a secure way. Cui et al. [12] proposed a blockchain-based multi-WSN authentication scheme for the Internet of Things to achieve mutual authentication of node identities in various communication scenarios. Abbasi and Khan [13] proposed a VeidBlock1 scheme, which generates verifiable identities by following a reliable authentication process. All identities created by VeidBlock are verifiable and anonymous, so this scheme protects the user's privacy during the verification and authentication phases. Zhang et al. [14] stored users' identities on the blockchain and stored the encrypted personal information outside the blockchain; however, the user's identity information was directly exposed on the blockchain, and malicious attacks can easily steal the user's identity. Zhou et al. [15] proposed an improved key-distribution solution (blockchain with identity-based encryption (BIBE)). BIBE separates the nodes in the chain to complete a user's identity verification and private-key protection. Mikula [16] proposed an identity and access-management system using blockchain technology to support the identity verification and authorization of entities in the digital system. Gabay et al. [17] proposed a scheme based on blockchain technology and smart contracts, and they achieved privacy protection authentication through a zero-knowledge=proof method based on tokens and the Pederson commitment. Wan et al. [18] proposed zk-DASNARK and realized the data-feedback-feed scheme of zero-knowledge proof based on zk-DASNARK to ensure the privacy and authenticity of smart contracts.
Li et al. [19] proposed a blind-signature scheme suitable for a blockchain system against quantum attacks, and this scheme improved the security and privacy of the blockchain. Li et al. [20] proposed a searchable-symmetric-encryption scheme based on blockchain technology, which not only improved the efficiency of data retrieval but also ensured the fairness of both parties' transactions. Gao et al. [21] proposed an attribute-based encryption scheme based on blockchain technology to achieve trusted-access control of data while ensuring an access strategy and attribute privacy. Agyekum et al. [22] proposed a proxy re-encryption method to ensure data-sharing security in a cloud environment.

Contribution
The goal of this study was to provide a privacy-protection scheme for a creditinvestigation system based on blockchain technology to ensure the secure sharing of data between the credit-investigation user and the credit-investigation agency. The main contributions are summarized as follows.
This study proposed a privacy-protection scheme for the credit-investigation system based on blockchain technology. The information-silo problem caused by the centralizedserver method was solved by using the decentralization and non-tamperable characteristics of the blockchain, and the credit-investigation data was protected from being tampered with by malicious cloud service providers.
This scheme adopted identity-authentication technology based on zero-knowledge proof-credit-investigation users can prove that they are legal users without revealing any private identity information. This scheme also used a searchable-symmetric-encryption technology, which ensures the secure storage and the efficient searching of creditinvestigation data.
Compared with some existing schemes, this scheme achieved better privacy protection. Compared with zk-DASNARK, the cost of this scheme in the identity-authentication phase is affordable. The cost of this scheme in the searchable-encryption process is significantly lower than other schemes.

Blockchain and Smart Contracts
Blockchain, as the underlying technology of Bitcoin [23], is a distributed ledger based on cryptography security with convenient validation and tamper-free features. Blockchain can also be considered as a list sorted by some sort of consensus, which is accessible to any node. A smart contract [24] is a program running on the blockchain network and is a collection of code and data (status). Through the realization of calculation and storage through smart contracts, a large number of applications on the Ethereum blockchain have become a reality. As a segment of code that can be uploaded and executed, it has tamperfree and fully distributed properties, and developers efficiently develop blockchain-related applications by deploying smart contracts.

Zero-Knowledge Proof
Zero-knowledge proof [3] implies that the prover can convince the verifier that a certain statement is correct through interaction with the verifier, but the verifier knows nothing about the statement otherwise. Zero-knowledge proof has three characteristics: completeness, soundness, and a state of zero knowledge.
(1) Completeness. If the prover does have the answer to a certain argument, he can definitely find a method to prove to the verifier that the data he holds are correct.
(2) Soundness. If the prover does not have the answer to a certain conclusion at all, he cannot (or can only with a very low probability) convince the verifier that the purported answer in his hands is correct.
(3) Zero knowledge. The verifier only knows that a certain assertion is correct or wrong, but he knows nothing about the assertion.

zkSNARKs
The acronym zkSNARKs [4] stands for zero-knowledge succinct non-interactive argument of knowledge, which is a form of zero-knowledge proof that is more concise and applicable in non-interactive environments. This logic was first proposed in 2012, and zkSNARKs can be implemented in the blockchain environment. In the case of blockchain transactions using zkSNARKs, the validity of the transaction can be transmitted to nodes other than the sending and receiving nodes without exposing information such as the receiver, the sender, and the transfer amount. The content of zkSNARKs is mainly divided into two parts: one part is to convert the problem to be proved into a circuit, and the other part is to convert the circuit into a verifiable mathematical polynomial problem.

Searchable-Symmetric Encryption
The traditional searchable-symmetric-encryption algorithm [25] has three participant entities: the data owner, the server, and the user. The data owner has a document set D = (D 1 , D 2 , · · · , D n ), and the cipher text set C = (C 1 , C 2 , · · · , C n ) is generated by the key K. In addition, an encrypted index I is also generated. Then, he sends C, I to the server. The data owner and the user share the key K. When the user wants to search for a document containing the keyword W, the key K is used to generate a search trapdoor t i for the keyword W, and the trapdoor is sent to the server. The server uses a search algorithm to search for ciphertext C through t i and I. Finally, the user decrypts locally to obtain the plaintext D.

System Model
The privacy-protection scheme of the blockchain-based credit investigation system includes five entities including credit-investigation agencies, credit-investigation users, the blockchain, cloud service providers, and financial institutions.
Regarding the credit-investigation agency, we assumed that the credit-investigation agencies (CI A) are completely trustworthy. The credit-investigation agencies (CI A) have a large number of user-credit-reporting reports and can safely own, control, and conditionally provide credit reporting of users' personal-credit information, in addition to obtaining relevant fees in the process.
The credit-investigation user (u i ) is a typical data consumer. He needs to pass identity authentication and entrust financial institutions to inquire about the relevant creditinvestigation information.
The smart contract in the blockchain (BC) can verify the identity of the credit-investigation users, and the blockchain network connects to other physical nodes. The blockchain stores the hash digest of the ciphertext of the credit-investigation user's information to ensure that the data can be traced and that they cannot be tampered with.
We assumed that the cloud service provider (CSP) is not trustworthy. It stores the ciphertext of the user's credit-investigation information and returns the ciphertext that meets the requirements based on the user's trapdoor information. In addition, it receives credit-investigation information but may dishonestly perform the tasks assigned in the system.
Financial institutions include commercial banks, etc. We assumed that financial institutions (FI) are completely trusted. In this scheme, financial institutions provide credit-investigation information inquiry and credit services to credit-investigation users, but they cannot authorize other credit-investigation users to query services. Definition 1. This scheme is composed of the algorithm eight-tuple (KeyGen, Enc, Setup, Prove, Authenticate, Trapdoor, Search, and Dec), part of which is based on a searchable-symmetricencryption algorithm [26]. The formal description is as follows: KeyGen(1 k ) → K. The algorithm is a probabilistic key-generation algorithm. It takes a security parameter k as input and outputs a key array K.
Enc(K, D i ) → (I, c w i ). The algorithm is an encryption algorithm. It takes a plaintext D i and a key K as input and outputs the index I and the ciphertext c w i .
Setup(1 λ , C) → (EK i , VK i ). This algorithm is a key-generation algorithm of zkSNARKs. It takes a security parameter λ and a circuit C as input and outputs a proof key EK i and a verification key VK i .
Prove(EK i , id i , sign(id i ), Timestamp) → π. This algorithm is a proof algorithm of zk-SNARKs. It takes a proof key EK i , a user's identity information id i , identity-information signature sign(id i ), and timestamp Timestamp as input and outputs a proof π.
Authenticate(VK i , π) → (true/ f alse). This algorithm is a verification algorithm of zk-SNARKs. It takes as input a verification key VK i and a proof π to verify whether the verification is successful.
Trpdoor(K, w i ) → t i . This algorithm is a trapdoor-generation algorithm executed by the financial institution FI that accepts the credit-investigation user's entrustment. It takes a key K and the search keyword w i , and it outputs the trapdoor t i .
Search(I, t i ) → c w i . The algorithm is a ciphertext search algorithm executed by the cloud service provider CSP that accepts the credit-investigation user's entrustment. It takes a search index I and a trapdoor t i as input, and it outputs the ciphertext c w i that meets the requirements.
Dec(K, c w i ) → D i . The algorithm is a decryption algorithm executed by the financial institution FI that accepts the credit-investigation user's entrustment. It takes the secret key K and ciphertext c w i as input and outputs the plaintext D i .
This scheme is based on blockchain technology, smart-contract technology [27], searchable-symmetric-encryption technology, and zkSNARKs to realize the identity authentication of credit-investigation users and the secure sharing of credit data in the credit-investigation system.
To facilitate understanding, in our plan, we instantiated a blockchain-based creditinvestigation-system plan, as shown in Figure 1. Credit-investigation agencies CI A have a large number of credit-investigation users' credit investigation information D = (D 1 , D 2 , · · · , D n ).
In the data-encryption stage, the credit-investigation agency first generates key K according to algorithm Keygen and generates ciphertext c w i and index I through algorithm Enc, which are stored by the cloud service provider CSP. Then, the key K is encrypted according to the public key of the credit-investigation user to obtain K , and the ciphertext c w i is hashed to obtain the hash value h w i before being saved on the blockchain.
At the registration stage, the credit-investigation user u i needs to register with the credit-investigation agency in advance. In addition, credit-investigation users need to entrust financial institutions to perform data-query services.
In the zero-knowledge key-generation stage, the credit-investigation agency must design and develop a circuit that conforms to the user's identity, and it must generate a proof key EK i and a verification key VK i through the Setup algorithm. Finally, the proof key is distributed to credit-investigation users, and the verification key is distributed to the verification contract.
In the proof-generation stage, when the credit-investigation user u i passes the identity authentication without revealing his specific identity information, he needs to enter his identity information in the Prove algorithm to generate a zero-knowledge proof π conforming to the circuit, and he needs to submit it to the verification contract.
In the identity-authentication phase, the verification contract verifies that π and the proof key are correct through the Authenticate algorithm, and the ciphertext K is sent to the credit-investigation user and the financial institution; additionally, the identity of the credit-investigation user u i is legal. Otherwise, the identity authentication cannot be passed.
In acquiring the trapdoor phase, the credit-investigation user receives K , decrypts it to obtain the key K, and then returns to the financial institution through the secure channel. Financial institutions use the Trapdoor algorithm to generate trapdoor t i and send it to cloud service providers.
At the stage of obtaining the ciphertext, the cloud service provider uses the Search algorithm to return the corresponding ciphertext c w i to the financial institution.
In the decryption stage, the financial institution decrypts the ciphertext through the Enc algorithm to obtain the credit-investigation information D i , returns it to the relevant credit investigation user u i , and provides relevant credit services.

Detail Scheme
To facilitate the understanding of these notations, the notations used in this article are shown in Table 1.

System Initialization
Set p to be a large prime number and G p to be the only subgroup of Z * p , and then select generator g ∈ G p , random number r ∈ Z p , security parameter k and hash function H : {0, 1} * → Z p . Suppose the public key of the credit-investigation user u i is PK i , and the private key is SK i .
Credit-investigation users, financial institutions, and credit-investigation agencies participate in the blockchain and obtain the blockchain address. These entities act as nodes to jointly build an alliance blockchain together. All nodes that maintain the blockchain network can upload data to the blockchain only through the practical-byzantine-faulttolerance (PBFT) consensus mechanism. Zero-knowledge proof I Index

Data Encryption
The credit-investigation agency CI A selects the security parameter K and uses the KeyGen(1 k ) → K algorithm to randomly generate the key K = (K 1,i , K 2,i , K 3,i , K 4,i ) for credit-investigation user u i .
We chose a pseudo-random function f and two pseudo-random permutations ϕ and φ. We scanned the credit data set D = {D 1 , ..., D n } and generated the keyword set δ(D) of the document. For each different keyword w i ∈ δ(D) , we generated the corresponding dictionary sequence table D(w i ) and set the global counter ctr = 1.
CI A has a large number of credit-investigation data sets D = {D 1 , ..., D n } and uses the Enc(K 4,i , D i ) algorithm to generate ciphertext c w i and index I, where 1 ≤ i ≤ n. The Enc algorithm uses the AES symmetric-encryption algorithm. Index I is composed of array A and lookup table T.

Create an Array
Array A contains a linked list L i including node N i,j , set id(D i,j ) to be the jth identifier of D(w i ) and generates key K i,j ← KeyGen(1 k ). Node N i,j = id(D i,j )||K i,j ||ϕ K 1 (ctr + 1) was created, where 1 ≤ j ≤ |D(w i )| − 1. Node N i,j was encrypted to get A[ϕ K 1 (ctr)] = Enc K i,j (N i,j ) . Then, the last node N i,j = id(D i,|D(w i )| )||0 k ||NULL was created, and the node was encrypted to get A[ϕ K 1 (ctr)] = Enc K i,|D(w i )|−1 (N i,|D(w i )| ). Finally, all the nodes N i,j of the linked list L i were stored in the array A in a random order.

Create a Query Table
We generated the query table T[π K 3 (w i )] = addr A (N i,1 )||K i,1 ⊕ f K 2 with the keyword w i ∈ δ(D). Among them, the query table was composed of a two-tuple < address, value >, value presents the position addr A (N i,1 ) of the array A and the decryption key K i,1 of the node in L i , and < address, value > presents the address f K 2 (w i ) of T.
The credit-investigation agency CI A uploads the ciphertext c w i and index I to the cloud service provider CSP, and it uploads the ciphertext digest h w i of c w i , the ciphertext K i , and the ciphertext of the searchable-symmetric-encryption key to the blockchain.

Registration
We assumed that the user also needs to register with the credit-investigation agency CI A in advance. Normally, a valid voucher or identification is required. Once the creditinvestigation user u i passes the identity authentication, as a legitimate user, he will not repeatedly submit his own specific identity information during the scheduling or decryption stage, and he will remain anonymous. After registration, the credit-investigation user u i sends an entrusted service request and related keyword w i to the financial institution.

Zero-Knowledge Key Generation
After the credit-investigation user completes the registration with the creditinvestigation agency CI A, CI A needs to pre-design and develop a domain-specific language (DSL)-program that meets the zero-knowledge proof to generate circuit C 1 and C 2 . Then, the circuit C 1 is composed of a calculation circuit, and the circuit C 2 is composed of a calculation circuit and a Sha256 circuit. In the circuit C 1 , the credit-investigation user's identity information set < id 1 , · · · , id n >, identity information signature set < sign(id 1 ), · · · , sign(id n ) >, and the timestamp < Timestamp > are input into the compute circuit for calculation; then, the calculated value h is output to verify the authenticity and availability of the data. The structure of circuit C 1 is shown in Figure 2.
In the circuit C 2 , the credit-investigation user's identity information set < id 1 , · · · , id n > and information-signature set < sign(id 1 ), · · · , sign(id n ) > are input to the compute circuit; both the timestamp < Timestamp > and the results obtained from the compute circuit are input into the Sha256 circuit and output the calculated hash value h. This value verifies the authenticity and availability of the identity-information set, the corresponding identityinformation-signature set, and the timestamp of the credit-investigation user. The structure of circuit C 2 is shown in Figure 3. Take the acquired security parameter λ and one of the C 1 and C 2 as input, then the proof key EK i and the verification key VK i are output. The credit-investigation agency CI A sends EK i to the credit-investigation user, then it creates a verification contract and sends VK i to the verification contract. This verification contract is public on the blockchain network and is used to verify whether the identity of the credit-investigation user u i is legal or not.

Generate the Proof
There are two types of input: public input and private input. The personal-identity information id i and sign(id i ) of the credit-investigation user u i are referred to as private input, and the timestamp is referred to as public input to prevent potential replay attacks and man-in-the-middle attacks. u i must input the correct id i , sign(id i ), Timestamp, and EK i to generate a credible zero-knowledge proof π . This process is performed outside the blockchain and will not be written into the blockchain.

Identity Authentication
The credit-investigation user u i submits the zero-knowledge proof π to the verification contract for anonymous identity authentication. The smart contract verifies the zeroknowledge proof π and the verification key VK i without the participation of a third party. If the user-identity authentication is correct, the smart contract sends u i 's key ciphertext K i to u i and the financial institution, then it determines that the credit-investigation user u i 's identity is legal, otherwise it will record that the user is illegal and cannot proceed to the next-step operation. The record of all identity authentication performed by the smart contract is stored on the blockchain, and the process will only reveal the address information of the user u i but will not reveal any identity information about u i .

Obtain the Trapdoor
The credit-investigation user u i uses his private key SK i to decrypt K i to obtain K 4,i and then sends K i and K 4,i to the financial institution through a secure channel. Then, the K i sent by u i has a time limit to ensure that he is the credit-investigation user authenticated by the verification contract just now. If the financial institution succeeds in the verification, then a trapdoor t i is generated according to the key K 4,i and the corresponding keyword w i . Finally, the financial institution submits t i and credit-investigation-information-inquiry fees to the cloud service provider.

Obtain the Ciphertext
After paying the inquiry fee, the cloud service provider receives the trapdoor t i sent by the financial institution FI, retrieves the corresponding ciphertext c w i through the Search algorithm, and finally returns the ciphertext to FI.
In order to verify that the cloud service provider CSP has not tampered with the integrity and availability of the credit investigation ciphertext data and ciphertext data in the transmission process, the ciphertext c w i needs to be hashed to obtain Hash(c w i ). The result obtained by Hash(c w i ) with the ciphertext digest h w i in the blockchain is compared. If h w i == Hash(c w i ) , it proves that the cloud service provider CSP has not tampered with the ciphertext data.

Decrypt the ciphertext
The financial institution decrypts the ciphertext with the key K 4,i to obtain the creditinvestigation information D i and sends it to the user u i through a secure channel. In addition, the financial institution provides related credit services to u i according to the credit-investigation report.

Performance Analysis
In this section, we perform a performance analysis of this scheme in two aspects.
(1) Since the cloud service providers set up in this scheme are malicious and dishonest, they will analyze and speculate or even tamper with the data of credit-investigation users. Malicious nodes in the blockchain will steal the data of users and impersonate other users. In view of the above situation, we should analyze whether the scheme meets the security and privacy-protection requirements based on the blockchain credit-investigation system.
(2) We first compared the characteristics of the credit-investigation system with the scheme [9] and the scheme [10], and then we analyzed the efficiency of the identityauthentication phase and compared it with zk-DASNARK [18]. Finally, we compared the cost of the searchable encryption phase with similar ciphertext-searchable schemes [20].

Security Analysis
Theorem 1. This scheme can realize the confidentiality of credit-investigation data.
Proof. Credit-investigation data are all searchable and symmetrically encrypted. Only credit-investigation users u i with successful authentication can obtain the key K 4,i . Other users cannot get the key, and even if they have keywords w i and trapdoor-generation algorithms, they cannot generate a trapdoor t i . Moreover, t i is encrypted so that a malicious cloud service provider cannot decrypt or infer the credit-investigation ciphertext from the index without the key. The theorem is proved.

Theorem 2.
This scheme can realize the availability of credit-investigation data.
Proof. In this scheme, only successfully authenticated entities can obtain credit-investigation information. Specifically, credit-investigation users u i can generate a fully credible zeroknowledge proof π based on their identity information and can submit it to the verification contract of the blockchain. If the identity verified u i by the verification contract is legal, u i can obtain the key. Finally, u i entrusts financial institutions to send a trapdoor to cloud service providers to obtain and decrypt ciphertext c w i .Therefore, this scheme guarantees the availability of credit-investigation data. The theorem is proved. Theorem 3. This scheme can realize the tamper-proofability and the traceability of creditinvestigation data.
Proof. The Merkle tree of the blockchain ensures the tamper-proofability of creditinvestigation ciphertext. When a malicious cloud service provider CSP wants to tamper with credit-investigation ciphertext, the ciphertext cannot correspond to the ciphertext summary stored in the blockchain. Once the credit-investigation data are recorded by the blockchain, this information can be queried and traced. The theorem is proved. Theorem 4. This scheme can realize the ciphertext retrievability of credit-investigation data.
Proof. In this scheme, the financial institution needs to obtain authorization from the authenticated credit user and then obtain the relevant key and generate the relevant trapdoor. The cloud service provider retrieves the credit-investigation ciphertext. The theorem is proved. Theorem 5. This scheme can realize the fairness and ciphertext retrievability between creditinvestigation users and cloud service providers.
In traditional searchable encryption schemes, it is assumed that the cloud server will honestly perform the retrieval task and return the corresponding results. However, the cloud server may be malicious, meaning it does not return the search result or it return the wrong search result after receiving the retrieval task submitted by the user, causing the user to not receive the corresponding service.
Proof. In this scheme, credit-investigation users can only obtain the ciphertext from the cloud service providers CSP by successfully delivering the credit-investigation fee. When CSP performs a retrieval task and returns search results to financial institutions, if the returned result is incorrect or the corresponding result is empty, then the behavior of CSP maliciously returning wrong results or blanks will be recorded on the blockchain. Therefore, CSP will return the correct retrieval results. In addition, financial institutions can retrieve the corresponding credit ciphertext through the trapdoor. The theorem is proved.
Theorem 6. This scheme can realize the anonymity and authentication of identity in a creditinvestigation data query.
Proof. Credit-investigation user u i conducts anonymous identity authentication through zero-knowledge-proof technology, and other irrelevant nodes only know that the user's identity is legal. This scheme uses timestamps Timestamp and id i signatures as input to prevent potential replay attacks. The theorem is proved.

Efficiency Analysis
This scheme was systematically compared with the blockchain-based creditinvestigation system and traditional credit investigation, as shown in Table 2. Compared with the blockchain-based credit-investigation system [9], this scheme can achieve identity anonymity and verifiability, as well as ciphertext retrievability. Compared with the blockchain-based credit investigation system [10], this scheme also enables ciphertext retrievability and identity anonymity.
We used a computer with an Intel Core i7-8750H CPU @ 2.2 GHz with 16 GB memory for efficiency analysis. Zokrates [28] is an all-in-one tool to apply zkSNARKs to the blockchain. We used Zokrates and the smart contract on the Ethereum test network Rinkeby for identity-authentication experiments.
We analyzed the efficiency of 8 inputs and 16 inputs on circuit A and 16 inputs on circuit B. We mainly analyzed the number of constraints, and the time consumed for compilation in the zero-knowledge-proof phase, key generation, proof generation, and authentication, as shown in Table 3. In addition, we also analyzed the size of the proof key, the verification key, and the size of the proof, as shown in Table 4. We repeated each operation 50 times and took the average.  Table 3 that the verification time was 0.006 s. Even if the number of constraints keeps increasing, the time consumed by verification does not increase, which greatly improves the scalability of the scheme. The time consumed by other phases increases with the number of constraints. Moreover, the number of constraints is related to the circuits generated by zkSNARKs, and circuit C 2 is more complex than circuit C 1 . The greater the number of constraints, the higher the security of identity authentication. In addition, we used the identity information id i and its signature as private input, and the timestamp Timestamp as public input to prevent potential replay attacks. It can be seen from Table 4 that the size of the proof key, the verification key, and the proof increased with the number of inputs. Specifically, the maximum verification key was 4.0 KB, the maximum proof key was 6.4 MB, and the maximum space occupied by the proof was no more than 1.9 KB, which were all in an acceptable range.
This scheme was compared with zk-DASNARK [18], as shown in Table 5. Since both proof schemes are based on Sha256 circuits, we considered using the circuit C 2 . The space occupied by this scheme to generate the verification key was almost equal to that of zk-DASNARK, while the time consumed to generate the proof was not much different from that of zk-DASNARK. Although this scheme consumed about twice as much time as zk-DASNARK in the key generation phase, zk-DASNARK consumed six times as much time as this scheme in the verification phase. Moreover, the key size generated by zk-DASNARK was eight times larger than that of this scheme. We implemented a searchable-symmetric-encryption process using Python 3.8 and compared it with another searchable-symmetric scheme based on blockchain technology [20]. We repeated each operation 50 times and took the average time. Figure 4 shows the time to generate indexes for both schemes, with keywords ranging from 10,000 to 50,000. With the increase in the number of keywords, the index time generated by the two schemes also increased, and the time of this scheme was generally shorter than that of scheme [20]. We analyzed the efficiency of the process of generating trapdoors and returning search results. The keywords ranged from 10,000 to 50,000. As shown in Figure 5, the time consumed by this scheme was basically about 1 ms. Because the scheme [20] needs to generate search transactions through a smart contract, the time consumed by this scheme to generate trapdoors and return search results was significantly less than that of scheme [20]. We took the compile operation entered by 16-Sha256 as the basic unit and calculated the approximate ratio of other operations to the compile operation. Since there are five different input numbers in the searchable-symmetric-encryption process, we set the number of keywords to 30,000. The specific data are shown in Table 6.

Conclusions
This study proposed a security and privacy-protection scheme for the credit-investigation system based on blockchain technology by combining technologies such as zero-knowledge proof, searchable-symmetric encryption, blockchain, and smart contracts. In terms of security, this scheme can guarantee the confidentiality, availability, tamper-proofability, and ciphertext retrievability of credit-investigation data, as well as the fairness and anonymity of identity authentication in the inquiry of credit-investigation data. In terms of efficiency, compared with similar identity-authentication schemes, in addition to the increase in the key generation time by about 225%, the proof key of this scheme was reduced by about 87.6%, and the time consumed for verification was reduced by about 82.9%. Compared with similar ciphertext-retrieval schemes, the index generation time of this scheme was reduced by about 66.7%, and the time for generating trapdoors and returning search results was reduced by about 98.7%. In the next work, we will focus on improving the efficiency of zero-knowledge proof.