Open Access Paper
2 February 2023
Research on blockchain construction in medical data privacy protection
Tianhui Li
Proceedings Volume 12462, Third International Symposium on Computer Engineering and Intelligent Communications (ISCEIC 2022); 124621B (2023) https://doi.org/10.1117/12.2661046
Event: International Symposium on Computer Engineering and Intelligent Communications (ISCEIC 2022), 2022, Xi'an, China
Abstract
The informatization and intelligent transformation of the medical industry have attracted growing attention and advanced medical informatization, but problems remain, such as inconsistent data and ineffective protection of patients' privacy. Focusing on privacy protection in the storage and sharing of medical data, this paper first uses CiteSpace visual knowledge-map analysis to identify the current research hotspots in the field of medical informatization. It then addresses the access barriers among medical institutions, the difficulty of sharing and circulating information, and the protection of patients' private data by designing a medical data privacy protection scheme based on blockchain technology. Analysis of the practical results indicates that the scheme achieves safe and reliable storage and sharing while effectively protecting the privacy of patients' medical data.

1.

INTRODUCTION

Blockchain technology can help solve the pain points of the medical industry. Personal medical data is highly private and important. According to Tencent smart security, medical information on the US internet black market is valued at 10 times that of credit card information. However, the valuable data of the medical industry faces both internal and external threats. The encryption algorithms used in the blockchain can effectively protect the privacy of patients: even if data is leaked, attackers obtain only encrypted, anonymized information, which poses no threat to the privacy of patients.

First, this paper uses CiteSpace visual knowledge-map analysis to understand the current research hotspots in the field of medical information. Blockchain is characterized by decentralization, a distributed ledger, security, reliability and tamper resistance, and can ensure the authenticity and reliability of stored medical data. This paper discusses the access barriers among medical institutions, the sharing and circulation of information, and the protection of patients’ private data, and on this basis designs a medical data privacy protection scheme based on blockchain technology. The framework consists mainly of a data layer, network layer, consensus layer, access control layer and application layer. Analysis of the practical results indicates that the scheme achieves safe and reliable storage and sharing while effectively protecting the privacy of patients’ medical data [1].

2.

CORE TECHNOLOGY COMPONENTS

A blockchain system consists of three parts: basic technological components, core application components and supporting facilities. The basic technological components include the communication layer, the storage layer, the security mechanism layer and the consensus mechanism layer.

For the communication layer, a blockchain usually adopts peer-to-peer (P2P) technology to organize its network nodes. Each node performs routing, discovers new nodes and broadcasts data via multicast.

For the storage layer, blockchain data is held in memory in a blockchain data structure during operation and is ultimately persisted in a database. Larger files can instead be stored in a file system outside the blockchain, with only their digest (digital fingerprint) saved on the blockchain so that their integrity can be verified.
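The off-chain storage pattern described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `OffChainStore` class, the `ledger` list and the choice of SHA-256 as the digest are assumptions made for the example.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


class OffChainStore:
    """Hypothetical stand-in for a file system outside the blockchain."""

    def __init__(self):
        self._files = {}

    def put(self, data: bytes) -> str:
        """Store the file and return its digest, which is what goes on chain."""
        key = digest(data)
        self._files[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._files[key]


def verify(store: OffChainStore, on_chain_digest: str) -> bool:
    """Recompute the digest of the stored file and compare it with the on-chain value."""
    return digest(store.get(on_chain_digest)) == on_chain_digest


store = OffChainStore()
ledger = []  # assumed stand-in for the digests recorded on the blockchain
ledger.append(store.put(b"large medical image bytes"))
print(verify(store, ledger[0]))
```

Only the short digest is replicated to every node; any later modification of the off-chain file makes the recomputed digest differ from the recorded one.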

For the security mechanism layer, the blockchain system encrypts data and protects privacy through various cryptographic techniques. For public blockchains and other blockchain systems involving financial applications, robust and reliable security algorithms are a fundamental requirement; they must meet state-level security requirements while remaining efficient.

For the consensus mechanism layer, the strategy by which all nodes in the blockchain system reach agreement should be selected flexibly according to the type of system and the application scenario.

2.1

Core application component

Building on the basic technological components, the core application components provide functions for specific blockchain application scenarios: digital assets can be issued programmatically, and on-chain assets can be operated flexibly through smart contracts written in the supported scripting languages.

2.2

Supporting facilities

As a typical distributed system, a blockchain needs supporting development and testing tools and environments in the research and development stage. In the production stage, an operations and maintenance system and the associated operations management functions must be established. At the deployment level, a single server can be deployed as one node in the blockchain network; alternatively, a cluster of multiple servers can join the network as a single node. The latter improves the stability and throughput of the node and is better suited to consensus mechanisms that place high demands on node availability.

A standard blockchain project should include at least three layers: the data layer, the network layer and the consensus layer. The application layer, contract layer and incentive layer are optional.

2.2.1

Data layer

The data layer has three characteristics: tamper resistance, full backup and full equality (of data, authority and code). These characteristics rely on the chain structure, in which each block contains the previous block's hash (prev hash), a nonce and the transactions (tx), as shown in Figure 1:

Figure 1.

Chain structure


2.2.2

Data structure of data layer

The data layer is organized into blocks, each consisting of a block header and a block body. The fields of the block header are listed in Table 1.

Table 1.

Data structure of data layer

Size      Field                Description
4 bytes   Version              Version number used to track protocol updates
32 bytes  Previous block hash  Hash of the previous block in the blockchain
32 bytes  Merkle root          Hash of the root of the Merkle tree of the block's transactions
4 bytes   Timestamp            Approximate creation time of this block
4 bytes   Difficulty target    Difficulty target for the proof-of-work algorithm
4 bytes   Nonce                Counter used by the proof-of-work algorithm

A block is the container data structure that aggregates the transactions recorded in the public ledger (the blockchain); it consists of a block header and a block body. The header and body together are at most 1 MB. The block header is 80 bytes, while the block body holds about 2000 transactions averaging at least 250 bytes each. A block with all its transactions (2000 × 250 = 500,000 bytes) is therefore thousands of times larger than its header, about 6250 times under these figures.

The block hash is the digital fingerprint obtained by hashing the block header twice with the SHA-256 algorithm. It is unique, computable and storable. It is not included in the data structure of the block itself, but can be stored in a separate database table. A block can also be identified by its position in the chain, known as the block height, which nodes in the Bitcoin network determine dynamically. Unlike the block hash, the height is not necessarily unique for a short time, because competing blocks may temporarily share the same height.

A full Bitcoin node keeps a local copy of the blockchain [3] starting from the genesis block, and this local copy [4] is continuously updated as new blocks extend the chain. Upon receiving incoming blocks from the network, the node validates them and then links them to the existing blockchain. To link a new block, the node reads the “parent block hash value” field of the block to find its parent block.
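As a sketch of how the six header fields of Table 1 add up to 80 bytes and how the block hash is derived from them, the example below packs a header with Python's `struct` module and applies SHA-256 twice. The function names and the sample field values are assumptions for illustration; the little-endian layout follows the Bitcoin convention.

```python
import hashlib
import struct


def pack_header(version, prev_hash, merkle_root, timestamp, bits, nonce):
    """Serialize the header fields into 80 bytes (4 + 32 + 32 + 4 + 4 + 4)."""
    return struct.pack("<I32s32sIII", version, prev_hash, merkle_root,
                       timestamp, bits, nonce)


def block_hash(header: bytes) -> bytes:
    """Hash the serialized header twice with SHA-256."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


# Sample values only: all-zero hashes, an example Unix time and difficulty encoding.
header = pack_header(1, bytes(32), bytes(32), 1675296000, 0x1D00FFFF, 0)
print(len(header), block_hash(header).hex())
```

Because the hash is computed from the header alone, it does not need to be stored inside the block and can be recomputed by any node that holds the header.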

2.2.3

The timestamp of the data layer

The timestamp of the data layer is taken from the local clock of the node that creates the block. Blocks are generated about every 10 minutes on average. A block is accepted only if its timestamp is greater than the median timestamp of the previous 11 blocks and no more than 2 hours ahead of the network-adjusted time; otherwise the block is rejected.
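The acceptance rule for block timestamps can be sketched as a small check. The function name and parameters are hypothetical; the window size is left as a parameter, and its default of 11 follows Bitcoin's reference rule, which is an assumption about the system described here.

```python
from statistics import median

MAX_FUTURE_DRIFT = 2 * 60 * 60  # two hours, in seconds


def timestamp_is_valid(block_time, previous_times, network_time, window=11):
    """Accept a block time only if it is later than the median of the most
    recent `window` block times and at most two hours ahead of network time."""
    recent = previous_times[-window:]
    if recent and block_time <= median(recent):
        return False
    return block_time <= network_time + MAX_FUTURE_DRIFT


previous = list(range(100, 1200, 100))  # eleven earlier block times: 100 .. 1100
print(timestamp_is_valid(601, previous, network_time=1000))
```

The lower bound prevents a block from being dated before most of its recent predecessors, and the upper bound limits how far a node's local clock may run ahead.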

2.2.4

Transaction records of data layer

Figure 2 shows the transaction records of the data layer. In the Bitcoin system, the private key, public key and wallet address of the data layer are based on the elliptic curve digital signature algorithm. The private key is a 32-byte random number, and the public key is computed from the private key. The Bitcoin address is derived from the public key through a series of hash and encoding algorithms, so the address can be understood as a digest (hash) of the public key.

Figure 2.

Transaction records of data layer


The broadcast block transaction data includes three parts: the original data, the signature and the transferor’s public key. The original data includes the transfer amount and the other party’s wallet address. The signature is produced by signing the original data with the transferor’s private key. The public key is visible to the whole network; it is used to encrypt data and to verify data signed with the corresponding private key.

The asymmetric encryption algorithm of the data layer uses the public key and the private key in the processes of “encryption” and “decryption”, respectively. The data layer combines asymmetric encryption, the elliptic curve encryption algorithm and the Merkle tree [2]. A Merkle tree is a binary hash tree used to summarize large data sets and check their integrity quickly. The tree is kept balanced.
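A minimal sketch of how a Merkle root can be computed from a list of transactions is given below. It assumes the Bitcoin conventions of double SHA-256 and of duplicating the last hash when a level has an odd number of entries, which is one way of keeping the tree balanced; the function names are illustrative.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin block and transaction hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(transactions):
    """Hash the transactions pairwise, level by level, until one hash remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256d(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash to balance the level
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


print(merkle_root([b"tx-a", b"tx-b", b"tx-c"]).hex())
```

Changing any single transaction changes its leaf hash and therefore every hash on the path up to the root.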

In the blockchain [5], each block summarizes all of its transactions in a Merkle tree, as shown in Formulas 1 and 2:

[Formulas 1 and 2: equation images not available]

Proving that a specific transaction exists in a block of n transactions requires only about log2(n) hash computations. With 16 transactions, a proof needs 4 hashes plus the Merkle root; with 65,535 transactions, only 16 hashes are required, as illustrated in Figure 3.

Figure 3.

The broadcast block transaction data


As shown in Figure 3, suppose we want to verify that transaction K, with hash H(K), is included in the block. The verifying server A holds only the block header, not the block body, but it can communicate with a full node, server B, to obtain the hashes of the Merkle tree that it needs. The Merkle path consists of H(L), H(IJ), H(MNOP) and H(ABCDEFGH). Using these four hashes, server A performs four hash operations, computing H(KL), H(IJKL), H(IJKLMNOP) and finally the root, and compares the result with the Merkle root in the block header. This check relies on the properties of hash functions: the same input always produces the same digest; any minor change to the input produces a completely different digest; the input cannot be inferred from the digest; and the digest is short.
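The verification of transaction K can be reproduced with a small sketch. The 16 single-letter transactions A to P, the helper names and the use of double SHA-256 are assumptions for illustration; the path returned for K contains the hashes H(L), H(IJ), H(MNOP) and H(ABCDEFGH), in that order.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def build_levels(leaves):
    """Return every level of the tree, from the leaf hashes up to the root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        current = list(levels[-1])
        if len(current) % 2 == 1:
            current.append(current[-1])  # pad odd levels by duplicating the last hash
            levels[-1] = current
        levels.append([sha256d(current[i] + current[i + 1])
                       for i in range(0, len(current), 2)])
    return levels


def merkle_path(levels, index):
    """Collect the sibling hash at each level, with a flag for its side."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return path


def verify(leaf, path, root):
    """Recompute the root from a leaf hash and its Merkle path."""
    current = leaf
    for sibling, sibling_is_left in path:
        current = sha256d(sibling + current) if sibling_is_left else sha256d(current + sibling)
    return current == root


txs = [bytes([c]) for c in b"ABCDEFGHIJKLMNOP"]
leaves = [sha256d(tx) for tx in txs]
levels = build_levels(leaves)
root = levels[-1][0]
k = txs.index(b"K")
path = merkle_path(levels, k)
print(len(path), verify(leaves[k], path, root))
```

The verifier needs only the leaf hash, the four path hashes and the root from the block header, not the other fifteen transactions.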

2.3

Data dissemination and verification in network layer

In the network layer, new transactions are broadcast to the entire network.

Each node collects the transactions it receives into a block and tries to find a sufficiently difficult proof of work for that block (mining). When a node finds a proof of work (that is, the condition for packing a block is satisfied), it broadcasts the newly packed block to the whole network. Other nodes recognize the block as valid only if all the transactions it contains are valid and have not appeared before. Nodes express their acceptance of the block by working on a new block that extends the chain after it, using the hash of the accepted block as the previous hash of the new block.
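The search for a proof of work can be sketched as a loop over nonces. This is a simplified illustration: the difficulty is expressed as a number of leading zero bits rather than Bitcoin's compact target encoding, and the header prefix is a placeholder byte string; both are assumptions.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_prefix: bytes, difficulty_bits: int, max_nonce: int = 2 ** 32):
    """Try nonces until the double SHA-256 of the header falls below the target.

    Returns (nonce, digest), or None if no nonce in the range satisfies the target.
    """
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        digest = sha256d(header_prefix + nonce.to_bytes(4, "little"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest
    return None


prefix = b"prev-hash|merkle-root|"  # placeholder for the other header fields
result = mine(prefix, difficulty_bits=12)
print(result)
```

Finding the nonce takes many attempts on average, while any other node can check the result with a single double hash; the digest of the accepted block then becomes the previous hash of the next block.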

2.3.1

P2P technology of network layer

P2P networking technology was applied early on in P2P download software such as BitTorrent (BT). Built on this technology, the blockchain [6] has an automatic networking capability and supports TCP, UDP and other communication protocols.

2.4

Consensus layer of blockchain model

The consensus layer encapsulates the consensus algorithms used by the network nodes. The consensus mechanism is the foundation of the blockchain [7]: it determines how blocks are generated and how accounting decisions are made, and thus affects the security and reliability of the entire system.

2.4.1

Data desensitization module

Data desensitization refers to the use of masking, generalization, encryption and other techniques to remove identifiable sensitive information from the original data while retaining the data characteristics needed in a given usage environment [6]. This scheme first applies a data desensitization strategy to protect sensitive information in medical data.
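Masking and generalization can be illustrated with a short sketch. The record fields, the masking widths and the ten-year age bands are hypothetical choices for the example and are not taken from the scheme itself.

```python
def mask(value: str, keep_start: int = 1, keep_end: int = 0, fill: str = "*") -> str:
    """Replace the middle of a string with a fill character."""
    hidden = len(value) - keep_start - keep_end
    if hidden <= 0:
        return fill * len(value)
    return value[:keep_start] + fill * hidden + value[len(value) - keep_end:]


def generalize_age(age: int, width: int = 10) -> str:
    """Replace an exact age with the band that contains it."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def desensitize(record: dict) -> dict:
    """Mask direct identifiers and generalize the age; keep the clinical field."""
    return {
        "name": mask(record["name"], keep_start=1),
        "phone": mask(record["phone"], keep_start=3, keep_end=4),
        "age": generalize_age(record["age"]),
        "diagnosis": record["diagnosis"],
    }


record = {"name": "Alice", "phone": "13812345678", "age": 47, "diagnosis": "hypertension"}
print(desensitize(record))
```

The diagnosis is kept unchanged so that the record remains useful for analysis, while the fields that identify the patient are reduced to a form that no longer points to an individual.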

2.4.2

Digital signature module

SM9 is an identity-based cryptographic algorithm constructed from bilinear pairings on elliptic curves. It mainly comprises a digital signature algorithm, a public key encryption algorithm, and others. This scheme uses the SM9 identity-based algorithm in the digital signature module to perform the digital signature operation. The algorithm generates public-private key pairs based on the user’s ID. In this mode, the scheme no longer relies on the certificate store and CA of the traditional PKI system to issue and verify certificates. Using the SM9 algorithm can therefore greatly reduce the computing and storage overhead, which suits the application requirements of medical scenarios.

2.4.3

Data encryption and decryption module

The scheme designed in this paper first signs the medical data with the SM9 identity-based national cryptographic algorithm; signature verification then establishes that the data source is safe and reliable. Because the blockchain system [8] is open and transparent, data stored on the blockchain can be viewed by all nodes in the network. Storing the data directly as plaintext would therefore not address the privacy concerns about medical data. For this reason, the medical data and the related personal private information are encrypted before storage, so that they can be kept safely on the blockchain in the form of ciphertext.

2.4.4

Implementation of digital signature module

This scheme adopts the SM9 identity-based algorithm as its digital signature algorithm. Digital signatures are used here to ensure that medical data has a true and reliable source. Sharing medical data is meaningful only when the source is reliable, so digital signature technology is essential to this scheme. In this scenario, doctors generate a public-private key pair from their own identity and use their private key to digitally sign the medical data containing patient information and diagnosis and treatment information. The data receiver verifies the signature with the doctor’s public key. Verified medical data can be regarded as coming from a reliable source and can then be encrypted and stored as files in the IPFS distributed file system.

2.4.5

Implementation of data encryption and decryption module

In this scheme, the IPFS distributed file system is adopted as auxiliary storage for medical records. Since the data stored on the blockchain [9] is open and transparent, multiple nodes can participate in the backup. Suppose data were stored in the distributed file storage system as plaintext and the hash value returned by IPFS were stored in the blockchain network: once the blockchain data is synchronized to other nodes, the data address becomes open and transparent. If an illegitimate user retrieves the corresponding information through the hash, the patient's privacy will inevitably be disclosed. Therefore, medical data must be encrypted before it is stored in the distributed storage system.

2.4.6

Implementation of distributed storage module for medical data

The consensus mechanism is a major component of blockchain technology [10], and the choice of its specific implementation needs careful consideration. In this experiment, the delegated proof of stake (DPoS) algorithm is selected. A certain number of proxy nodes are elected by voting, and these nodes obtain the accounting right for blocks. In the experimental tests, the system is preset with 10 nodes to meet the experimental needs, using port numbers 8881 to 8890 as the node identifiers. Through the voting mechanism, the three nodes with the most votes are selected as accounting nodes, which are responsible for uploading medical data and generating blocks. The other nodes are responsible for consensus verification, and all election results are subject to consensus verification.
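The election step of this experiment can be sketched as follows. The ten node identifiers and the three seats come from the text; the stake-weighted votes, the tie-breaking by node identifier and the round-robin production order are assumptions added to make the example concrete.

```python
NODES = list(range(8881, 8891))  # ten nodes identified by port number


def elect_delegates(votes, candidates, seats=3):
    """Tally stake-weighted votes and return the top `seats` candidates.

    Ties are broken by the lower node identifier (an assumption of this sketch).
    """
    tally = {candidate: 0 for candidate in candidates}
    for _voter, candidate, stake in votes:
        if candidate in tally:
            tally[candidate] += stake
    ranked = sorted(candidates, key=lambda c: (-tally[c], c))
    return ranked[:seats]


def block_producer(delegates, height):
    """Pick the accounting node for a block height in round-robin order."""
    return delegates[height % len(delegates)]


# Hypothetical votes: (voter, candidate, stake)
votes = [
    (8881, 8883, 5), (8882, 8883, 2), (8884, 8887, 6),
    (8885, 8890, 4), (8886, 8887, 1), (8888, 8881, 3),
]
delegates = elect_delegates(votes, NODES)
print(delegates, block_producer(delegates, height=4))
```

The nodes that are not elected take no part in block production in this sketch; in the scheme they would verify the blocks and the election result.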

3.

CONCLUSION

This paper uses CiteSpace software to visually analyze the relevant literature and finds that privacy problems in the storage and sharing of health data are particularly prominent. There are two reasons. First, most medical institutions currently store patients’ medical data with third-party storage providers, so patients lose ownership of their data and bear the risk of medical privacy disclosure. Second, most patient medical data is exposed in the system as plaintext during sharing and faces multiple threats such as internal theft and external attacks. Since the concept of Bitcoin emerged in 2008, the underlying blockchain technology has, after years of development, gradually been applied widely in scenarios beyond the financial field. Given the privacy problems in the storage and sharing of medical data, the medical field is a promising application scenario.

Blockchain technology is considered a feasible solution. Its tamper resistance and trustless operation can ensure the authenticity and reliability of the stored medical data. The decentralized point-to-point transmission mode can remove the access barriers between medical institutions and allow information to be shared and circulated while security is ensured. At the same time, the cryptography-based privacy protection method can effectively prevent patients' personal medical data from being leaked.

Therefore, the third part of this paper focuses on the privacy of medical data during storage and sharing and constructs the overall blockchain framework. It comprises a data layer based on hashing algorithms, the Merkle tree, timestamps and a public medical dataset; a network layer that uses a TCP-based P2P network protocol to realize the decentralized mode; a consensus layer with an improved DPoS consensus mechanism; and an access control layer based on the SM9 identity-based algorithm and proxy re-encryption technology. Together these form a complete blockchain-based solution for medical data privacy protection. According to the requirements of the solution, the technical modules for data desensitization, digital signature, data encryption and distributed storage are designed in detail, and the algorithms are selected.

Finally, following the overall design of the scheme, this paper carries out experimental analysis of each module, including digital signature, medical data encryption and decryption, IPFS distributed file storage and the DPoS consensus mechanism. The verification demonstrates the feasibility and advantages of the proposed scheme.

REFERENCES

[1] 

Marwan, M., Kartit, A., & Ouahmane, H., “A framework to secure medical image storage in cloud computing environment,” Journal of Electronic Commerce in Organizations (JECO), 16(1), 1–16 (2018). https://doi.org/10.4018/JECO

[2]

Hu, R., & Yan, W. Q., “Design and implementation of visual blockchain with Merkle tree,” Handbook of Research on Multimedia Cyber Security, 282–295, IGI Global (2020). https://doi.org/10.4018/AISPE

[3]

Bodkhe, U., Tanwar, S., Parekh, K., Khanpara, P., Tyagi, S., Kumar, N., & Alazab, M., “Blockchain for industry 4.0: A comprehensive review,” IEEE Access, 8, 79764–79800 (2020). https://doi.org/10.1109/Access.6287639

[4]

Jiang, S., Cao, J., Wu, H., Yang, Y., Ma, M., & He, J., “BlocHIE: a blockchain-based platform for healthcare information exchange,” in 2018 IEEE International Conference on Smart Computing (SMARTCOMP), 49–56 (2018).

[5]

Swan, M., “Blockchain thinking: The brain as a decentralized autonomous corporation [commentary],” IEEE Technology and Society Magazine, 34(4), 41–52 (2015). https://doi.org/10.1109/MTS.2015.2494358

[6]

Xia, Q. I., Sifah, E. B., Asamoah, K. O., Gao, J., Du, X., & Guizani, M., “MeDShare: Trust-less medical data sharing among cloud service providers via blockchain,” IEEE Access, 5, 14757–14767 (2017). https://doi.org/10.1109/ACCESS.2017.2730843

[7]

Azaria, A., Ekblaw, A., Vieira, T., & Lippman, A., “MedRec: Using blockchain for medical data access and permission management,” in 2016 2nd International Conference on Open and Big Data (OBD), 25–30 (2016).

[8]

Xu, J., Xue, K., Li, S., Tian, H., Hong, J., Hong, P., & Yu, N., “Healthchain: A blockchain-based privacy preserving scheme for large-scale health data,” IEEE Internet of Things Journal, 6(5), 8770–8781 (2019). https://doi.org/10.1109/JIoT.6488907

[9]

Kumar, R., & Tripathi, R., “Traceability of counterfeit medicine supply chain through Blockchain,” in 2019 11th International Conference on Communication Systems & Networks (COMSNETS), 568–570 (2019).

[10]

Hu, R., & Yan, W. Q., “Design and implementation of visual blockchain with Merkle tree,” Handbook of Research on Multimedia Cyber Security, 282–295, IGI Global (2020). https://doi.org/10.4018/AISPE
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Tianhui Li "Research on blockchain construction in medical data privacy protection", Proc. SPIE 12462, Third International Symposium on Computer Engineering and Intelligent Communications (ISCEIC 2022), 124621B (2 February 2023); https://doi.org/10.1117/12.2661046
KEYWORDS
Data storage

Computer security

Data backup

Medical research

Distributed computing

Symmetric-key encryption

Analytical research
